Contents should include checksum and filename in standard format #127

Open
ctubbsii opened this issue Oct 2, 2021 · 9 comments
@ctubbsii

ctubbsii commented Oct 2, 2021

This plugin should support writing files in a standard format, for easier verification. Standard tools have a convenient -c option to verify a checksum file, but this doesn't work with the checksums created by this plugin, because they are not in a standard format.

There are two standard file formats for use with checksum files:

  1. The GNU coreutils format used by sha512sum on GNU/Linux distributions (see man sha512sum). This outputs in the format <checksum><space><spaceInTextModeOrAsteriskInBinaryMode><filename><newline>, repeated for each file whose checksum is contained in the file (in this case, there would only be one file's checksum).
  2. The BSD format used by equivalent UNIX tools in BSD/Unix distributions, which is also supported by GNU coreutils with the --tag option (see man sha512sum). This outputs in the format <ALGNAME><space><lparen><filename><rparen><space><equal><space><checksum><newline> for each checksum.

Both of these standard formats are also supported by the shasum executable backed by the commonly used Digest::SHA perl module.
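As a rough sketch of the two layouts described above, a bare digest can be turned into either format with plain printf. The helper names gnu_line and bsd_line are made up for illustration and are not part of coreutils, Digest::SHA, or this plugin:

```shell
# Hypothetical helpers, not part of any tool mentioned in this thread.

# gnu_line DIGEST FILENAME [MODE]
# MODE is "text" (default, second separator is a space) or
# "binary" (second separator is an asterisk).
gnu_line() {
    marker=' '
    if [ "${3:-text}" = binary ]; then
        marker='*'
    fi
    printf '%s %s%s\n' "$1" "$marker" "$2"
}

# bsd_line ALGNAME FILENAME DIGEST
# ALGNAME is the tag as it appears in the file, e.g. SHA512.
bsd_line() {
    printf '%s (%s) = %s\n' "$1" "$2" "$3"
}
```

For example, `gnu_line abc123 pom.xml` prints `abc123  pom.xml`, and `bsd_line SHA512 pom.xml abc123` prints `SHA512 (pom.xml) = abc123`. The digest here is a placeholder, not a real hash.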

Here are some examples (using tee to output the content of the checksum file as it is written):

$ shasum -a 512 pom.xml | tee pom.xml.sha512 # using Digest::SHA to create GNU format
b004deb83fa29a7d5b6f43141f8df9f84571ba6e8800ac9a72c5c40ccbdfca2c7866568de697e163fe8c2be7ae1d2ec8b3e907a1b902e5a8ca4bbb7360a9131d  pom.xml
$ shasum -c pom.xml.sha512  # using Digest::SHA to verify GNU format
pom.xml: OK
$ shasum -a 512 --tag pom.xml | tee pom.xml.sha512  # using Digest::SHA to create BSD format
SHA512 (pom.xml) = b004deb83fa29a7d5b6f43141f8df9f84571ba6e8800ac9a72c5c40ccbdfca2c7866568de697e163fe8c2be7ae1d2ec8b3e907a1b902e5a8ca4bbb7360a9131d
$ shasum -c pom.xml.sha512  # using Digest::SHA to verify BSD format
pom.xml: OK
$ sha512sum pom.xml | tee pom.xml.sha512 # using GNU coreutils to create GNU format
b004deb83fa29a7d5b6f43141f8df9f84571ba6e8800ac9a72c5c40ccbdfca2c7866568de697e163fe8c2be7ae1d2ec8b3e907a1b902e5a8ca4bbb7360a9131d  pom.xml
$ sha512sum -c pom.xml.sha512  # using GNU coreutils to verify GNU format
pom.xml: OK
$ sha512sum --tag pom.xml | tee pom.xml.sha512  # using GNU coreutils to create BSD format
SHA512 (pom.xml) = b004deb83fa29a7d5b6f43141f8df9f84571ba6e8800ac9a72c5c40ccbdfca2c7866568de697e163fe8c2be7ae1d2ec8b3e907a1b902e5a8ca4bbb7360a9131d
$ sha512sum -c pom.xml.sha512  # using GNU coreutils to verify BSD format
pom.xml: OK

I didn't show an example with the -b binary flag for the GNU format examples, but I strongly recommend using BSD format anyway, which always uses binary mode when generating and verifying checksums.

For me, using this plugin is a downgrade because the file formats it emits are not easily verified with standard tools. If it output in a standard format (preferably the BSD format, because it shows the algorithm used explicitly, which will be important as SHA3 becomes more common, and always uses binary mode), this plugin would be far more useful.

@remkop

remkop commented Dec 21, 2021

I am also looking to use shasum to verify releases with a command like this:

find . -type f -name "*.sha512" -exec shasum -c {} -a 512 \;

Looking at the ArtifactsMojo class, line 128 and OneHashPerFileTarget line 145, the GNU format seems to be already supported.

To switch this on, add <appendFilename>true</appendFilename> to the configuration.

Example usage:

<plugin>
<groupId>net.nicoulaj.maven.plugins</groupId>
<artifactId>checksum-maven-plugin</artifactId>
<version>1.11</version>
<executions>
<execution>
    <id>calculate-checksums</id>
    <goals>
        <goal>files</goal>
    </goals>
    <!-- execute prior to maven-gpg-plugin:sign due to https://github.com/nicoulaj/checksum-maven-plugin/issues/112 -->
    <phase>post-integration-test</phase>
    <configuration>
        <appendFilename>true</appendFilename> <!-- ADD THIS LINE TO THE CONFIGURATION -->
        <algorithms>
            <algorithm>SHA-256</algorithm>
            <algorithm>SHA-512</algorithm>
        </algorithms>
        <!-- https://maven.apache.org/apache-resource-bundles/#source-release-assembly-descriptor -->
        <fileSets>
            <fileSet>
                <directory>${project.build.directory}</directory>
                <includes>
                    <include>${myproject}-${project.version}-src.zip</include>
                    <include>${myproject}-${project.version}-src.tar.gz</include>
                    <include>${myproject}-${project.version}-bin.zip</include>
                    <include>${myproject}-${project.version}-bin.tar.gz</include>
                </includes>
            </fileSet>
        </fileSets>
        <csvSummary>false</csvSummary>
    </configuration>
</execution>
</executions>
</plugin>

@michael-o
Contributor

As far as I remember, OpenSSL produces BSD-style as well.

@michael-o
Contributor

michael-o commented Jan 6, 2022

I think the best approach would be to drop appendFilename altogether and introduce an outputFormat with an interpolator that supports three placeholders:

  • ${digest}
  • ${algorithm}
  • ${filename}

along with two symbolic names for the standard formats:

  • GNU
  • BSD
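To make the proposal concrete, here is a minimal sketch of how such an interpolator might behave. Nothing like this exists in the plugin as far as this thread shows: the function name render_checksum and the expansions chosen for the symbolic names GNU and BSD are assumptions, based only on the formats described earlier in the issue.

```shell
# Hypothetical sketch of the proposed outputFormat interpolation.
# render_checksum TEMPLATE DIGEST ALGORITHM FILENAME
# TEMPLATE is either a symbolic name (GNU, BSD) or a custom template
# containing ${digest}, ${algorithm}, and ${filename}.
render_checksum() {
    case "$1" in
        GNU) tpl='${digest}  ${filename}' ;;
        BSD) tpl='${algorithm} (${filename}) = ${digest}' ;;
        *)   tpl=$1 ;;
    esac
    # Values are passed through the environment so awk does not
    # reinterpret backslashes, and replaced literally (no regex).
    TPL=$tpl DIGEST=$2 ALGORITHM=$3 FILENAME=$4 awk '
        # Replace every literal occurrence of key in s with val.
        function subst(s, key, val,    out, i) {
            out = ""
            while ((i = index(s, key)) > 0) {
                out = out substr(s, 1, i - 1) val
                s = substr(s, i + length(key))
            }
            return out s
        }
        BEGIN {
            line = ENVIRON["TPL"]
            line = subst(line, "${digest}", ENVIRON["DIGEST"])
            line = subst(line, "${algorithm}", ENVIRON["ALGORITHM"])
            # The filename goes last so its contents are never rescanned.
            line = subst(line, "${filename}", ENVIRON["FILENAME"])
            print line
        }'
}
```

With a placeholder digest, `render_checksum BSD abc123 SHA512 pom.xml` prints `SHA512 (pom.xml) = abc123`. A custom template such as `'${filename}:${digest}'` goes through the same path.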

@ctubbsii
Author

ctubbsii commented Jan 6, 2022

Also, keep in mind GNU has two formats, one for text mode input (${digest}<space><space>${filename}) and one for binary mode input (${digest}<space><star '*' literal>${filename}) (although, in practice, they are equivalent on GNU systems).
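A small sketch of how a reader of GNU-format lines could tell the two variants apart, going only by the layout described above. The function name gnu_line_mode is hypothetical:

```shell
# Hypothetical helper: report which GNU variant a checksum line uses.
# gnu_line_mode LINE
# Prints "text" or "binary"; returns 1 if the line matches neither layout.
gnu_line_mode() {
    rest=${1#* }                      # drop the digest and the first space
    [ "$rest" != "$1" ] || return 1   # no space at all: not a GNU line
    case "$rest" in
        ' '*) echo text ;;            # digest, space, space, filename
        '*'*) echo binary ;;          # digest, space, asterisk, filename
        *)    return 1 ;;
    esac
}
```

For example, `gnu_line_mode 'abc123 *pom.xml'` prints `binary`.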

@michael-o
Contributor

Maybe I am being completely stupid, but I really don't understand the purpose of text mode at all. All of those message digests operate on bytes. What am I missing?

@bondolo
Collaborator

bondolo commented Jan 6, 2022

The intention of text mode was that the checksum would normalize line endings for text files: \n, \r, and \r\n would all hash the same, so files that differed only in line endings would have the same hash value. Today it is a rarely useful feature.
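To illustrate the idea (not how any particular tool implements text mode), here is a sketch that normalizes line endings by hand before hashing. It assumes sha256sum is on the PATH and only handles CRLF to LF by deleting carriage returns. The function name normalized_sha256 is made up:

```shell
# Hypothetical illustration of hashing with normalized line endings.
# Reads stdin, deletes every carriage return (so CRLF becomes LF),
# and prints only the hex digest. A lone CR is deleted, not converted.
normalized_sha256() {
    tr -d '\r' | sha256sum | cut -d ' ' -f 1
}
```

With this, `printf 'a\r\nb\r\n' | normalized_sha256` and `printf 'a\nb\n' | normalized_sha256` print the same digest, while hashing the raw bytes of the two inputs gives different digests.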

@michael-o
Contributor

Here are the relevant algorithm-to-name mappings for the BSD format: https://github.com/freebsd/freebsd-src/blob/78beb051a2661b873342162b1ec0ad55b4e27261/sbin/md5/md5.c#L122-L156
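For the algorithm names used in the plugin configuration earlier in this thread (SHA-256, SHA-512), a lookup along these lines might be enough. The function name bsd_tag is hypothetical and the table is only a partial subset, so the linked source remains the reference for the full list:

```shell
# Hypothetical, partial mapping from Java-style algorithm names
# (as used in the <algorithm> elements above) to BSD-style tags.
# bsd_tag ALGORITHM
bsd_tag() {
    case "$1" in
        MD5)     echo MD5 ;;
        SHA-1)   echo SHA1 ;;
        SHA-224) echo SHA224 ;;
        SHA-256) echo SHA256 ;;
        SHA-384) echo SHA384 ;;
        SHA-512) echo SHA512 ;;
        *)
            echo "no known BSD tag for: $1" >&2
            return 1
            ;;
    esac
}
```

For example, `bsd_tag SHA-512` prints `SHA512`, matching the tag in the BSD-format examples above.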

@michael-o
Contributor

I think this is something we really want to have for all Maven-based ASF releases.

@rgoers

rgoers commented Feb 21, 2022

As ugly as this is, I was able to work around this problem with:

      <plugin>
        <artifactId>maven-antrun-plugin</artifactId>
        <version>3.0.0</version>
        <executions>
          <execution>
            <phase>post-integration-test</phase>
            <configuration>
              <target>
                <property name="spaces" value="  "/>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-src.zip.sha256" append="yes">${spaces}apache-log4j-${project.version}-src.zip</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-src.zip.sha512" append="yes">${spaces}apache-log4j-${project.version}-src.zip</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-src.tar.gz.sha256" append="yes">${spaces}apache-log4j-${project.version}-src.tar.gz</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-src.tar.gz.sha512" append="yes">${spaces}apache-log4j-${project.version}-src.tar.gz</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-bin.zip.sha256" append="yes">${spaces}apache-log4j-${project.version}-bin.zip</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-bin.zip.sha512" append="yes">${spaces}apache-log4j-${project.version}-bin.zip</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-bin.tar.gz.sha256" append="yes">${spaces}apache-log4j-${project.version}-bin.tar.gz</concat>
                <concat destfile="${project.build.directory}/apache-log4j-${project.version}-bin.tar.gz.sha512" append="yes">${spaces}apache-log4j-${project.version}-bin.tar.gz</concat>
              </target>
            </configuration>
            <goals>
              <goal>run</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
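Outside of Maven, the same append step could be sketched in shell. This assumes, as the workaround above implies, that each checksum file holds only the bare digest with no trailing newline. The function name append_filenames is made up, and like the antrun version it is not idempotent (running it twice appends the name twice):

```shell
# Hypothetical shell equivalent of the antrun workaround above.
# append_filenames DIR
# For every *.sha256 / *.sha512 file in DIR, append two spaces and the
# artifact name so the file reads as a GNU-format checksum line.
append_filenames() {
    for sum in "$1"/*.sha256 "$1"/*.sha512; do
        [ -f "$sum" ] || continue          # skip unmatched glob patterns
        artifact=$(basename "${sum%.*}")   # strip the .sha256/.sha512 suffix
        printf '  %s\n' "$artifact" >> "$sum"
    done
}
```

For example, `append_filenames target` would process the checksum files directly under target.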
