-
Notifications
You must be signed in to change notification settings - Fork 13.8k
[FLINK-20455] Add check for jar file contents #14296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit 1a10256 (Thu Dec 03 09:42:08 UTC 2020) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. DetailsThe Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
zentol
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll need a whitelist of sorts for jars that do not require a LICENSE file, i.e. all those that we don't deploy (like examples). Or we just bundle the license in all jars 🤷
| if (ze.getName().equals("META-INF/NOTICE")) { | ||
| metaInfNoticeSeen = true; | ||
| } | ||
| if (ze.getName().equals("META-INF/LICENSE")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should also reject META-INF/LICENSE.txt
| try (ZipInputStream zis = new ZipInputStream(new FileInputStream(file.toFile()))) { | ||
| ZipEntry ze; | ||
| while ((ze = zis.getNextEntry()) != null) { | ||
| if (ze.getName().equals("LICENSE")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe instead check that no root file contains the word "license"?
| ZipEntry ze; | ||
| while ((ze = zis.getNextEntry()) != null) { | ||
| if (ze.getName().equals("LICENSE")) { | ||
| LOG.error("Jar file {} contains a LICENSE file in the root folder", file); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's explicitly give instructions on what to do here; validate whether the license is required, manually adding it to our licensing setup, and adding an exclusion to the shade plugin.
1a10256 to
0aa70d7
Compare
|
I pushed a new version, with more checks, which addresses all issues found by the extended tool. |
zentol
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a test that ensures all the wrong things are actually picked up. Wouldn't want us to accidentally break these safeguards unknowingly.
|
|
||
| package org.apache.flink.tools.ci.licensecheck; | ||
|
|
||
| import org.apache.logging.log4j.core.util.IOUtils; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't add hard dependencies on log4j
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, damn, that was not intentional ;)
| boolean metaInfNoticeSeen = false; | ||
| boolean metaInfLicenseSeen = false; | ||
| int classFiles = 0; | ||
|
|
||
| List<String> errors = new ArrayList<>(); | ||
| try (ZipInputStream zis = new ZipInputStream(new FileInputStream(file.toFile()))) { | ||
| ZipEntry ze; | ||
| while ((ze = zis.getNextEntry()) != null) { | ||
| final String name = ze.getName(); | ||
| /** | ||
| * There must be a LICENSE file and NOTICE file in the META-INF directory. | ||
| * LICENSE or NOTICE files found outside of the META-INF directory are most likely shading mistakes (we are including the files from other dependencies, thus providing an invalid LICENSE file) | ||
| * | ||
| * In such a case, we recommend updating the shading exclusions, and adding the license file to META-INF/licenses. | ||
| */ | ||
| if (!name.contains("META-INF") // license files in META-INF are expected | ||
| && !name.contains(".class") // some class files contain LICENSE in their name | ||
| && !name.contains(".ftl") // a false positive in flink-python | ||
| && !name.contains("web/3rdpartylicenses.txt") // a false positive in flink-runtime-web | ||
| && !name.endsWith("/") // ignore directories, e.g. "license/" | ||
| && name.toUpperCase().contains("LICENSE")) { | ||
| errors.add("Jar file " + file + " contains a LICENSE file in an unexpected location: " + name); | ||
| } | ||
| if (!name.contains("META-INF") && !name.contains(".class") && name.toUpperCase().contains("NOTICE")) { | ||
| errors.add("Jar file " + file + " contains a NOTICE file in an unexpected location: " + name); | ||
| } | ||
|
|
||
| if (name.equals("META-INF/NOTICE")) { | ||
| String noticeContents = IOUtils.toString(new InputStreamReader(zis)); | ||
| if (!noticeContents.toLowerCase().contains("flink") || !noticeContents.contains("The Apache Software Foundation")) { | ||
| errors.add("The notice file in " + file + ":META-INF/NOTICE does not contain the expected entries"); | ||
| } | ||
| metaInfNoticeSeen = true; | ||
| } | ||
| if (name.equals("META-INF/LICENSE")) { | ||
| String licenseContents = IOUtils.toString(new InputStreamReader(zis)); | ||
| if (!licenseContents.contains("Apache License") || !licenseContents.contains("Version 2.0, January 2004")) { | ||
| errors.add("The notice file in " + file + ":META-INF/LICENSE does not contain the expected entries"); | ||
| } | ||
| metaInfLicenseSeen = true; | ||
| } | ||
| if (name.endsWith(".class")) { | ||
| classFiles++; | ||
| } | ||
| zis.closeEntry(); | ||
| } | ||
| } | ||
|
|
||
| // Empty test jars (with no class files) are skipped | ||
| if (classFiles == 0 && file.toString().endsWith("-tests.jar")) { | ||
| return 0; | ||
| } | ||
| for (String error: errors) { | ||
| LOG.warn(error); | ||
| } | ||
| int severeIssues = errors.size(); | ||
|
|
||
| if (!metaInfLicenseSeen) { | ||
| LOG.warn("Missing META-INF/LICENSE in {}", file); | ||
| severeIssues++; | ||
| } | ||
|
|
||
| if (!metaInfNoticeSeen) { | ||
| LOG.warn("Missing META-INF/NOTICE in {}", file); | ||
| severeIssues++; | ||
| } | ||
| return severeIssues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| boolean metaInfNoticeSeen = false; | |
| boolean metaInfLicenseSeen = false; | |
| int classFiles = 0; | |
| List<String> errors = new ArrayList<>(); | |
| try (ZipInputStream zis = new ZipInputStream(new FileInputStream(file.toFile()))) { | |
| ZipEntry ze; | |
| while ((ze = zis.getNextEntry()) != null) { | |
| final String name = ze.getName(); | |
| /** | |
| * There must be a LICENSE file and NOTICE file in the META-INF directory. | |
| * LICENSE or NOTICE files found outside of the META-INF directory are most likely shading mistakes (we are including the files from other dependencies, thus providing an invalid LICENSE file) | |
| * | |
| * In such a case, we recommend updating the shading exclusions, and adding the license file to META-INF/licenses. | |
| */ | |
| if (!name.contains("META-INF") // license files in META-INF are expected | |
| && !name.contains(".class") // some class files contain LICENSE in their name | |
| && !name.contains(".ftl") // a false positive in flink-python | |
| && !name.contains("web/3rdpartylicenses.txt") // a false positive in flink-runtime-web | |
| && !name.endsWith("/") // ignore directories, e.g. "license/" | |
| && name.toUpperCase().contains("LICENSE")) { | |
| errors.add("Jar file " + file + " contains a LICENSE file in an unexpected location: " + name); | |
| } | |
| if (!name.contains("META-INF") && !name.contains(".class") && name.toUpperCase().contains("NOTICE")) { | |
| errors.add("Jar file " + file + " contains a NOTICE file in an unexpected location: " + name); | |
| } | |
| if (name.equals("META-INF/NOTICE")) { | |
| String noticeContents = IOUtils.toString(new InputStreamReader(zis)); | |
| if (!noticeContents.toLowerCase().contains("flink") || !noticeContents.contains("The Apache Software Foundation")) { | |
| errors.add("The notice file in " + file + ":META-INF/NOTICE does not contain the expected entries"); | |
| } | |
| metaInfNoticeSeen = true; | |
| } | |
| if (name.equals("META-INF/LICENSE")) { | |
| String licenseContents = IOUtils.toString(new InputStreamReader(zis)); | |
| if (!licenseContents.contains("Apache License") || !licenseContents.contains("Version 2.0, January 2004")) { | |
| errors.add("The notice file in " + file + ":META-INF/LICENSE does not contain the expected entries"); | |
| } | |
| metaInfLicenseSeen = true; | |
| } | |
| if (name.endsWith(".class")) { | |
| classFiles++; | |
| } | |
| zis.closeEntry(); | |
| } | |
| } | |
| // Empty test jars (with no class files) are skipped | |
| if (classFiles == 0 && file.toString().endsWith("-tests.jar")) { | |
| return 0; | |
| } | |
| for (String error: errors) { | |
| LOG.warn(error); | |
| } | |
| int severeIssues = errors.size(); | |
| if (!metaInfLicenseSeen) { | |
| LOG.warn("Missing META-INF/LICENSE in {}", file); | |
| severeIssues++; | |
| } | |
| if (!metaInfNoticeSeen) { | |
| LOG.warn("Missing META-INF/NOTICE in {}", file); | |
| severeIssues++; | |
| } | |
| return severeIssues; | |
| int numSevereIssues = 0; | |
| try (final FileSystem fileSystem = FileSystems.newFileSystem(new URI("jar:file", uri.getHost(), uri.getPath(), uri.getFragment()), Collections.emptyMap())) { | |
| // test jars (with no class files) are ignored | |
| if (file.getFileName().toString().endsWith("-tests.jar")) { | |
| for (Path root : fileSystem.getRootDirectories()) { | |
| try (Stream<Path> files = Files.walk(root)) { | |
| long numClassFiles = files | |
| .filter(path -> path.getFileName().toString().endsWith(".class")) | |
| .count(); | |
| if (numClassFiles == 0) { | |
| return 0; | |
| } | |
| } | |
| } | |
| } | |
| // check notice file existence and contents | |
| final Path noticeFile = fileSystem.getPath("META-INF", "NOTICE"); | |
| if (Files.exists(noticeFile)) { | |
| final String noticeFileContents = Files.readString(noticeFile); | |
| if (!noticeFileContents.toLowerCase().contains("flink") || !noticeFileContents.contains("The Apache Software Foundation")) { | |
| LOG.error("The notify file in {} does not contain the expected entries.", file); | |
| numSevereIssues++; | |
| } | |
| } else { | |
| LOG.error("Missing META-INF/NOTICE in {}", file); | |
| numSevereIssues++; | |
| } | |
| // check license file existence and contents | |
| final Path licenseFile = fileSystem.getPath("META-INF", "LICENSE"); | |
| if (Files.exists(licenseFile)) { | |
| final String licenseFileContents = Files.readString(licenseFile); | |
| if (!licenseFileContents.contains("Apache License") || !licenseFileContents.contains("Version 2.0, January 2004")) { | |
| LOG.error("The license file in {} does not contain the expected entries.", file); | |
| numSevereIssues++; | |
| } | |
| } else { | |
| LOG.error("Missing META-INF/NOTICE in {}", file); | |
| numSevereIssues++; | |
| } | |
| for (Path root : fileSystem.getRootDirectories()) { | |
| try (Stream<Path> files = Files.walk(root)) { | |
| /* | |
| * LICENSE or NOTICE files found outside of the META-INF directory are most likely shading mistakes (we are including the files from other dependencies, thus providing an invalid LICENSE file) | |
| * | |
| * <p>In such a case, we recommend updating the shading exclusions, and adding the license file to META-INF/licenses. | |
| */ | |
| final List<String> filesWithIssues = files | |
| .filter(path -> !path.equals(root)) | |
| .filter(path -> path.getFileName().toString().toLowerCase().contains("license")) | |
| .filter(path -> !Files.isDirectory(path)) // ignore directories, e.g. "license/" | |
| .filter(path -> !path.getFileName().toString().endsWith(".class")) // some class files contain LICENSE in their name | |
| .filter(path -> !path.getFileName().toString().endsWith(".ftl")) // a false positive in flink-python | |
| .map(Path::toString) | |
| .filter(path -> !path.contains("META-INF")) // license files in META-INF are expected | |
| .filter(path -> !path.endsWith("web/3rdpartylicenses.txt")) // a false positive in flink-runtime-web | |
| .collect(Collectors.toList()); | |
| numSevereIssues += filesWithIssues.size(); | |
| for (String fileWithIssue : filesWithIssues) { | |
| LOG.error("Jar file {} contains a LICENSE file in an unexpected location: {}", file, fileWithIssue); | |
| } | |
| } | |
| } | |
| } | |
| return numSevereIssues; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seem to like these overly complex loops, which you dislike ;) (understandably)
Your proposal seems to require Java 11, and it fails with a NPE. But since you are taking over, I'll leave it to you to adjust it.
|
How about we merge the licensing issues fixes right away (because they do look correct to me), and I'll take care of writing the test tomorrow? |
|
Manually merged. |
What is the purpose of the change
During the release validation, we noticed several jar files containing LICENSE files in their root.
This change adds some basic checks to validate the jar files.
NOTICE: The check currently fails because the examples lack LICENSE files.
Brief change log