-
Notifications
You must be signed in to change notification settings - Fork 24.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix unknown licenses #31223
Fix unknown licenses #31223
Conversation
The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth. The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish, users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown, these will be forbidden in the future. Therefore, with this change and earlier changes are left having no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.
Pinging @elastic/es-core-infra |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm, with one minor comment about a log line, feel free to not address or address. dealer choice.
checkFile(depName, jarName, licenses, 'LICENSE') | ||
checkFile(depName, jarName, notices, 'NOTICE') | ||
final String dependencyName = getDependencyName(mappings, depName) | ||
logger.info("Mapped dependency name ${depName} to ${dependencyName} for license check") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we have 2 logger.info's one line apart? can we put both of them into 1 line given the previous line is not terribly informative and this line always prints right after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed 297e33c.
The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth. The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish, users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown, these will be forbidden in the future. Therefore, with this change and earlier changes are left having no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.
The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth. The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish, users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown, these will be forbidden in the future. Therefore, with this change and earlier changes are left having no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.
The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth. The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish, users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown, these will be forbidden in the future. Therefore, with this change and earlier changes are left having no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.
* 6.x: Move default location of dependencies report (#31228) Remove dependencies report task dependencies (#31227) Add recognition of MPL 2.0 (#31226) Fix unknown licenses (#31223) Fully encapsulate LocalCheckpointTracker inside of the engine (#31213) Remove version from license file name for GCS SDK (#31221) Remove DocumentFieldMappers#simpleMatchToFullName. (#31041) [DOCS] Removes 6.3.1 release notes [DOCS] Splits release notes by major version Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018) Remove extraneous references to 'tokenized' in the mapper code. (#31010) SQL: Make a single JDBC driver jar (#31012) QA: Fix rolling restart tests some more Allow to trim all ops above a certain seq# with a term lower than X high level REST api: cancel task (#30745) Mute TokenBackwardsCompatibilityIT.testMixedCluster Mute WatchBackwardsCompatibilityIT.testWatcherRestart Enhance license detection for various licenses (#31198) [DOCS] Add note about long-lived idle connections (#30990) Add high-level client methods that accept RequestOptions (#31069) Remove RestGetAllMappingsAction (#31129) Move RestGetSettingsAction to RestToXContentListener (#31101) Move number of language analyzers to analysis-common module (#31143) flush job to ensure all results have been written (#31187)
* master: Move default location of dependencies report (#31228) Remove dependencies report task dependencies (#31227) Add recognition of MPL 2.0 (#31226) Fix unknown licenses (#31223) Remove version from license file name for GCS SDK (#31221) Fully encapsulate LocalCheckpointTracker inside of the engine (#31213) [DOCS] Added 'fail_on_unsupported_field' param to MLT. Closes #28008 (#31160) Add licenses for transport-nio (#31218) Remove DocumentFieldMappers#simpleMatchToFullName. (#31041) Allow to trim all ops above a certain seq# with a term lower than X, post backport fix (#31211) Compliant SAML Response destination check (#31175) Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (#31018) Remove extraneous references to 'tokenized' in the mapper code. (#31010) Allow to trim all ops above a certain seq# with a term lower than X (#30176) SQL: Make a single JDBC driver jar (#31012) Enhance license detection for various licenses (#31198) [DOCS] Add note about long-lived idle connections (#30990) Move number of language analyzers to analysis-common module (#31143) Default max concurrent search req. numNodes * 5 (#31171) flush job to ensure all results have been written (#31187)
…ecker * elastic/master: (309 commits) [test] add fix for rare virtualbox error (elastic#31212) Move default location of dependencies report (elastic#31228) Remove dependencies report task dependencies (elastic#31227) Add recognition of MPL 2.0 (elastic#31226) Fix unknown licenses (elastic#31223) Remove version from license file name for GCS SDK (elastic#31221) Fully encapsulate LocalCheckpointTracker inside of the engine (elastic#31213) [DOCS] Added 'fail_on_unsupported_field' param to MLT. Closes elastic#28008 (elastic#31160) Add licenses for transport-nio (elastic#31218) Remove DocumentFieldMappers#simpleMatchToFullName. (elastic#31041) Allow to trim all ops above a certain seq# with a term lower than X, post backport fix (elastic#31211) Compliant SAML Response destination check (elastic#31175) Remove DocumentFieldMappers#smartNameFieldMapper, as it is no longer needed. (elastic#31018) Remove extraneous references to 'tokenized' in the mapper code. (elastic#31010) Allow to trim all ops above a certain seq# with a term lower than X (elastic#30176) SQL: Make a single JDBC driver jar (elastic#31012) Enhance license detection for various licenses (elastic#31198) [DOCS] Add note about long-lived idle connections (elastic#30990) Move number of language analyzers to analysis-common module (elastic#31143) Default max concurrent search req. numNodes * 5 (elastic#31171) ...
The goal of this commit is to address unknown licenses when producing the dependencies info report. We have two different checks that we run on licenses. The first check is whether or not we have stashed a copy of the license text for a dependency in the repository. The second is to map every dependency to a license type (e.g., BSD 3-clause). The problem here is that the way we were handling licenses in the second check differs from how we handle licenses in the first check. The first check works by finding a license file with the name of the artifact followed by the text -LICENSE.txt. Yet in some cases we allow mapping an artifact name to another name used to check for the license (e.g., we map lucene-.* to lucene, and opensaml-.* to shibboleth. The second check understood the first way of looking for a license file but not the second way. So in this commit we teach the second check about the mappings from artifact names to license names. We do this by copying the configuration from the dependencyLicenses task to the dependenciesInfo task and then reusing the code from the first check in the second check. There were some other challenges here though. For example, dependenciesInfo was checking too many dependencies. For now, we should only be checking direct dependencies and leaving transitive dependencies from another org.elasticsearch artifact to that artifact (we want to do this differently in a follow-up). We also want to disable dependenciesInfo for projects that we do not publish, users only care about licenses they might be exposed to if they use our assembled products. With all of the changes in this commit we have eliminated all unknown licenses. A follow-up will enforce that when we add a new dependency it does not get mapped to unknown, these will be forbidden in the future. Therefore, with this change and earlier changes are left having no unknown licenses and two custom licenses; custom here means it does not map to an SPDX license type. Those two licenses are xz and ldapsdk. A future change will not allow additional custom licenses unless they are explicitly whitelisted. This ensures that if a new dependency is added it is mapped to an SPDX license or mapped to custom because it does not have an SPDX license.