Remove MetadataType from core package object and normalize JSON metadataType values #1983
Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
Semantic diff for reviewers between the v11 and v12 JSON schemas BEFORE the struct renames

Code: the Python code that generated this list

```python
import json
import difflib

original_schema = "schema/json/schema-11.0.1.json"
new_schema = "schema/json/schema-12.0.0.json"

# {old-type-name: new-type-name}
type_def_mapping = {
    "AlpmMetadata": "arch-alpm-db-record",
    "ApkMetadata": "alpine-apk-db-record",
    "BinaryMetadata": "binary-signature",
    "CocoapodsMetadata": "cocoa-podfile-lock",
    "ConanLockMetadata": "c-conan-lock",
    "ConanMetadata": "c-conan",
    "DartPubMetadata": "dart-pubspec-lock",
    "DotnetPortableExecutableMetadata": "dotnet-portable-executable",
    "DotnetDepsMetadata": "dotnet-deps",
    "DpkgMetadata": "debian-dpkg-db-record",
    "GemMetadata": "ruby-gemspec",
    "GolangBinMetadata": "go-module-binary-buildinfo",
    "GolangModMetadata": "go-module",
    "HackageMetadata": "haskell-hackage-stack",
    "JavaMetadata": "java-archive",
    "KbPackageMetadata": "microsoft-kb-patch",
    "LinuxKernelMetadata": "linux-kernel-archive",
    "LinuxKernelModuleMetadata": "linux-kernel-module",
    "MixLockMetadata": "elixir-mix-lock",
    "NixStoreMetadata": "nix-store",
    "NpmPackageJSONMetadata": "javascript-npm-package",
    "NpmPackageLockJSONMetadata": "javascript-npm-package-lock",
    "PhpComposerJSONMetadata": "php-composer-lock",
    "PortageMetadata": "gentoo-portage-db-record",
    "PythonPackageMetadata": "python-package",
    "PythonPipfileLockMetadata": "python-pipfile-lock",
    "PythonRequirementsMetadata": "python-pip-requirements",
    "RebarLockMetadata": "erlang-rebar-lock",
    "RDescriptionFileMetadata": "r-description",
    "RpmdbFileRecord": "rpm-file-record",
    "RpmMetadata": "redhat-rpm-db-record",
    "RpmdbMetadata": "redhat-rpm-db-record",
    "RpmDBMetadata": "redhat-rpm-db-record",
    "RpmArchiveMetadata": "redhat-rpm-archive",
    "SwiftPackageManagerMetadata": "swift-package-manager-lock",
    "CargoPackageMetadata": "rust-cargo-lock"
}


def main():
    original_type_definitions = extract_type_definitions(original_schema)
    new_type_definitions = extract_type_definitions(new_schema)

    new_names_diffed = set()
    names_with_same_content = set()

    for definition_name, old_def in original_type_definitions.items():
        # fall back to the original name when no rename is mapped
        new_name = get_new_name(definition_name)
        if not new_name:
            new_name = definition_name

        new_def = new_type_definitions.get(new_name, "")
        if not new_def:
            print("Missing definition in new schema: {}".format(definition_name))
            continue

        new_names_diffed.add(new_name)

        # diff the definitions (lineterm="" because the input lines carry no
        # trailing newline and the output is joined with "\n" below)
        diff = difflib.unified_diff(
            old_def.splitlines(),
            new_def.splitlines(),
            fromfile=original_schema,
            tofile=new_schema,
            lineterm="",
        )
        diff = "\n".join(diff)

        if diff:
            print("Diff for {}".format(definition_name))
            print(diff)
            print()
        else:
            names_with_same_content.add(definition_name)

    # for all new names not processed, print a warning
    for definition_name, new_def in new_type_definitions.items():
        if definition_name not in new_names_diffed:
            print("Missing equivalent definition in original schema: {}".format(definition_name))

    print(f"Definitions with same content: {len(names_with_same_content)}")
    for name in sorted(names_with_same_content):
        print(" -", name)


def extract_type_definitions(schema_file_path) -> dict[str, str]:
    # load every "$defs" entry as a stable, pretty-printed JSON string
    with open(schema_file_path, "r") as schema_file:
        schema = json.load(schema_file)

    definitions = schema.get("$defs", {})
    type_definitions = {}
    for definition_name, definition in definitions.items():
        # if definition_name in type_def_mapping:
        #     new_name = to_camel_case(type_def_mapping[definition_name])
        #     # print("Renaming {} to {}".format(definition_name, new_name))
        #     definition_name = new_name
        # # else:
        # #     print("No mapping for {}".format(definition_name))
        type_definitions[definition_name] = json.dumps(definition, indent=2, sort_keys=True)
    return type_definitions


def get_new_name(name: str) -> str | None:
    # returns None when the definition was not renamed
    if name in type_def_mapping:
        return to_camel_case(type_def_mapping[name])
    return None


def to_camel_case(s: str) -> str:
    s = s.replace("-", "_")
    return ''.join(x.capitalize() or '_' for x in s.split('_'))


if __name__ == "__main__":
    main()
```
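To make the name lookup in the script concrete, here is a minimal sketch of what its `to_camel_case` helper produces for a few of the mapped names. The helper and the three mapping entries are copied from the script; the `samples` dictionary and the loop are only for illustration. That the v12 schema keys its definitions by these converted names is what the script assumes, not something verified here.

```python
def to_camel_case(s: str) -> str:
    """Same conversion as in the script: dash-separated name to upper camel case."""
    s = s.replace("-", "_")
    return "".join(x.capitalize() or "_" for x in s.split("_"))


# A few entries copied from type_def_mapping (illustrative subset).
samples = {
    "AlpmMetadata": "arch-alpm-db-record",
    "JavaMetadata": "java-archive",
    "RpmdbMetadata": "redhat-rpm-db-record",
}

for old_name, new_type in samples.items():
    # prints e.g. "AlpmMetadata -> ArchAlpmDbRecord"
    print(f"{old_name} -> {to_camel_case(new_type)}")
```

Note that `str.capitalize()` lowercases everything after the first character of each segment, so a segment such as `db` becomes `Db`, not `DB`.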
I don't see any blocking issues, but left a suggestion about defining the type-to-name mappings for JSON.
Removing approval so this doesn't accidentally get merged until we're ready for it
@wagoodman I read through everything here and have no notes or comments; the change makes sense IMO. Feel free to merge when you think syft is ready for the major schema bump.
there have been enough changes to warrant a review on the new change set
e4a4303 to 1d867ac
🎉
…ataType values (anchore#1983)

* [wip]
* distinct the package metadata functions
* remove metadata type from package core model
* incorporate review feedback for names
* add RPM archive metadata and split parser helpers
* clarify the python package metadata type
* rename the KB metadata type
* break hackage and composer types by use case
* linting fix
* fix encoding and decoding for syft-json and cyclonedx
* bump json schema to 11
* update cyclonedx-json snapshots
* update cyclonedx-xml snapshots
* update spdx-json snapshots
* update spdx-tv snapshots
* update syft-json snapshots
* correct metadata type in stack yaml parser test
* fix bom-ref redactor for cyclonedx-xml
* add tests for legacy package metadata names
* regenerate json schema v11
* fix legacy HackageMetadataType reflect type value check
* fix linting
* packagemetadata discovery should account for type shadowing
* fix linting
* fix cli tests
* bump json schema version to v12
* update json schema to incorporate changes from main
* add syft-json legacy config option
* add tests around v11-v12 json decoding
* add docs for SYFT_JSON_LEGACY
* rename structs to be compliant with new naming scheme

---------

Signed-off-by: Alex Goodman <wagoodman@users.noreply.github.com>
This PR:
* Removes pkg.Package.MetadataType from the core package model, keeping it as a concern for the syftjson format.
* Adds SYFT_FORMAT_JSON_LEGACY=<bool> (defaulting to false) to the syft application config. This lets users fall back to the old JSON metadata type names (and other soon-to-be-breaking changes) to get to a pre-1.0 state of the JSON output.
* Renames the SYFT_TEMPLATE configuration to SYFT_FORMAT_TEMPLATE to be consistent with future format-related configurations.
* Renames pkg.*Metadata structs to be consistent with the metadata type names (they do not always match exactly).

Doing this necessarily breaks the JSON schema, so it has been rev'd to v12 in this PR.
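The legacy fallback described above can be pictured with a small sketch. This is not syft's implementation (syft is written in Go); it only illustrates the idea of switching between old and new metadata type names based on a boolean setting. The helper names `parse_bool` and `metadata_type_name` are hypothetical, the accepted boolean spellings are an assumption, and the three name pairs are a subset taken from the mapping in the semantic-diff script posted on this PR.

```python
import os
from typing import Optional

# Subset of old -> new metadata type names, taken from the PR's diff script.
LEGACY_TO_NEW = {
    "ApkMetadata": "alpine-apk-db-record",
    "DpkgMetadata": "debian-dpkg-db-record",
    "JavaMetadata": "java-archive",
}
NEW_TO_LEGACY = {new: old for old, new in LEGACY_TO_NEW.items()}


def parse_bool(value: Optional[str], default: bool = False) -> bool:
    """Hypothetical parser; the spellings syft actually accepts may differ."""
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def metadata_type_name(name: str, legacy: bool) -> str:
    """Hypothetical helper: pick the metadataType value to emit.

    Unknown names pass through unchanged in either mode.
    """
    if legacy:
        return NEW_TO_LEGACY.get(name, name)
    return LEGACY_TO_NEW.get(name, name)


# Unset means False, matching the default stated in the PR description.
legacy = parse_bool(os.environ.get("SYFT_FORMAT_JSON_LEGACY"))
print(metadata_type_name("java-archive", legacy))
```

With the variable unset, the new-style name is printed; setting `SYFT_FORMAT_JSON_LEGACY=true` in this sketch switches the output to the old name.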
The downstream grype PR has been drafted: anchore/grype#1423
For a semantic diff of the v11.0.1 vs v12 JSON schema see #1983 (comment).
Fixes #1844
Fixes #1735