@danielframpton
Contributor

This is a significant change to Rust component detection: it uses solely the Cargo.lock file to detect component usage (i.e., it no longer processes Cargo.toml).

As discussed in #116, there are several issues with the existing Cargo.toml processing that this change sidesteps by reporting all crates from Cargo.lock.

This change also explicitly includes the package registry in the identifier for components. I have not investigated what implications this may have for downstream tools, but because Cargo supports multiple registries, the package name and version number are insufficient to identify a package (there may be multiple registries, or a local package with the same name as one published in a registry).
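
To illustrate why name and version alone can be ambiguous, here is a minimal sketch of a component identity that also carries the source. The `CargoPackageId` record, the `IdentityDemo` helper, and the `example.com` registry URL are hypothetical names made up for this illustration; they are not the types used in this PR.

```csharp
using System.Collections.Generic;

// Hypothetical identity type for illustration only; not the type used in this PR.
public sealed record CargoPackageId(string Name, string Version, string Source);

public static class IdentityDemo
{
    // Counts distinct components when the source is part of the identity.
    public static int CountDistinct(IEnumerable<CargoPackageId> packages)
    {
        // Records compare by value, so the set keeps one entry per (name, version, source).
        return new HashSet<CargoPackageId>(packages).Count;
    }
}
```

With this shape, two packages that share a name and version but come from different registries remain separate entries, whereas keying on name and version alone would merge them.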

In general, relying solely on the Cargo.lock file should strictly increase the number of crates reported. I did observe some reductions, which were due to this change no longer reporting some local/path crates as components (because they had both a version and a path specified), but I believe reporting them was unintentional/unexpected.

There was also significant duplication in the code between v1 and v2 detectors, but in rewriting the logic to process the lock file I was able to use the same logic for both formats.

I expect this to need a careful review and some changes before landing, but I wanted to send out the PR to make some progress on getting these issues resolved.

Fixes #116

@danielframpton danielframpton requested a review from a team as a code owner May 8, 2022 23:14
Contributor

@arlosi arlosi left a comment

I'm not an expert in CG, but I think this makes sense to get Cargo support to stop missing dependencies as mentioned in #116 - even if it means we give up dev-dependency detection for now.

@cobya cobya added the detector:pip The pip detector label May 16, 2022
@jcfiorenzano jcfiorenzano added detector:rust The Rust Cargo detector and removed detector:pip The pip detector labels May 27, 2022
@JamieMagee JamieMagee requested review from JamieMagee and cobya June 15, 2022 17:09
@danielframpton danielframpton force-pushed the cargo-fixes branch 2 times, most recently from ad8b3f2 to 42ba5f2 Compare June 16, 2022 19:04
@juvazq

juvazq commented Jul 29, 2022

Hey @arlosi and @danielframpton, is there anything else blocking this PR? It would be nice to have component detection in good shape for Rust. Thanks!

@arlosi
Contributor

arlosi commented Jul 29, 2022

It needs review from the maintainers. Neither @danielframpton nor I can approve it.

@juvazq

juvazq commented Aug 1, 2022

Thanks @arlosi,

@jcfiorenzano @cobya do you think this can be merged? Is anything missing before it can be approved?

@danielframpton
Contributor Author

I have simplified this PR to focus on the source of dependencies (lock vs. toml), deferring the question of multiple registries, to see if we can land the highest-priority part of this change.

Comment on lines 50 to 57
if (packagesByName.TryGetValue(cargoPackage.name, out var packageList))
{
    if (packageList.Any(p => p.package.Equals(cargoPackage)))
    {
        // Ignore duplicate packages
        continue;
    }
}
Member

Should there be an else here? Or can these two conditionals be combined?
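
One way the two conditionals could be combined is sketched below. This is an illustration, not the code that was merged: `CargoPackage` here is a hypothetical two-field stand-in for the PR's type, and the surrounding `Deduplicate` method is invented so the snippet compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the PR's CargoPackage type (the real one has more fields).
public sealed record CargoPackage(string name, string version);

public static class DuplicateFilter
{
    // Returns the input packages with exact duplicates removed, grouping by name.
    public static List<CargoPackage> Deduplicate(IEnumerable<CargoPackage> packages)
    {
        var packagesByName = new Dictionary<string, List<CargoPackage>>();
        var result = new List<CargoPackage>();

        foreach (var cargoPackage in packages)
        {
            // Combined check: skip only when the name is known AND an equal package exists.
            if (packagesByName.TryGetValue(cargoPackage.name, out var packageList)
                && packageList.Any(p => p.Equals(cargoPackage)))
            {
                // Ignore duplicate packages
                continue;
            }

            // packageList is null when TryGetValue failed, so create the list on first use.
            if (packageList == null)
            {
                packageList = new List<CargoPackage>();
                packagesByName.Add(cargoPackage.name, packageList);
            }

            packageList.Add(cargoPackage);
            result.Add(cargoPackage);
        }

        return result;
    }
}
```

Because `TryGetValue` is always evaluated as the left operand of `&&`, `packageList` is definitely assigned after the `if`, so the follow-up code can reuse it without a second lookup.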

{
    try
    {
        // Extract the informationfrom the dependency (name with optional version and source)
Member

Suggested change
- // Extract the informationfrom the dependency (name with optional version and source)
+ // Extract the information from the dependency (name with optional version and source)

Comment on lines 213 to 225
    @"^([^ ]+)(?: ([^ ]+))?(?: \(([^()]*)\))?$",
    RegexOptions.Compiled);

private const int PackageNameGroup = 1;
private const int VersionGroup = 2;
private const int SourceGroup = 3;

private static bool ParseDependency(string dependency, out string packageName, out string version, out string source)
{
    var match = DependencyFormatRegex.Match(dependency);
    packageName = match.Groups[PackageNameGroup].Success ? match.Groups[PackageNameGroup].Value : null;
    version = match.Groups[VersionGroup].Success ? match.Groups[VersionGroup].Value : null;
    source = match.Groups[SourceGroup].Success ? match.Groups[SourceGroup].Value : null;
Member

nit: named groups [1]

Suggested change
-     @"^([^ ]+)(?: ([^ ]+))?(?: \(([^()]*)\))?$",
-     RegexOptions.Compiled);
- private const int PackageNameGroup = 1;
- private const int VersionGroup = 2;
- private const int SourceGroup = 3;
- private static bool ParseDependency(string dependency, out string packageName, out string version, out string source)
- {
-     var match = DependencyFormatRegex.Match(dependency);
-     packageName = match.Groups[PackageNameGroup].Success ? match.Groups[PackageNameGroup].Value : null;
-     version = match.Groups[VersionGroup].Success ? match.Groups[VersionGroup].Value : null;
-     source = match.Groups[SourceGroup].Success ? match.Groups[SourceGroup].Value : null;
+     @"^(?<packageName>[^ ]+)(?: (?<version>[^ ]+))?(?: \((?<source>[^()]*)\))?$",
+     RegexOptions.Compiled);
+ private static bool ParseDependency(string dependency, out string packageName, out string version, out string source)
+ {
+     var match = DependencyFormatRegex.Match(dependency);
+     packageName = match.Groups["packageName"].Success ? match.Groups["packageName"].Value : null;
+     version = match.Groups["version"].Success ? match.Groups["version"].Value : null;
+     source = match.Groups["source"].Success ? match.Groups["source"].Value : null;

Footnotes

  1. https://docs.microsoft.com/en-us/dotnet/standard/base-types/grouping-constructs-in-regular-expressions#named_matched_subexpression
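
For reference, the named-group variant can be assembled into a small standalone sketch. The pattern and the three assignments are taken from the suggestion above; the enclosing `DependencyParser` class, the public visibility, and the `return match.Success;` line are assumptions, since the rest of the method is not shown in this thread.

```csharp
using System.Text.RegularExpressions;

// Hypothetical wrapper class so the snippet compiles on its own.
public static class DependencyParser
{
    // Pattern from the suggestion: name, optional version, optional "(source)".
    private static readonly Regex DependencyFormatRegex = new Regex(
        @"^(?<packageName>[^ ]+)(?: (?<version>[^ ]+))?(?: \((?<source>[^()]*)\))?$",
        RegexOptions.Compiled);

    public static bool ParseDependency(string dependency, out string packageName, out string version, out string source)
    {
        var match = DependencyFormatRegex.Match(dependency);
        packageName = match.Groups["packageName"].Success ? match.Groups["packageName"].Value : null;
        version = match.Groups["version"].Success ? match.Groups["version"].Value : null;
        source = match.Groups["source"].Success ? match.Groups["source"].Value : null;

        // Assumption: the original return statement is not visible in the review excerpt.
        return match.Success;
    }
}
```

Named groups remove the need for the three index constants and keep the group lookups readable if the pattern is later reordered.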

@JamieMagee
Member

@danielframpton are the verification tests for rust still sufficient, or do they need updating as well?

Member

@JamieMagee JamieMagee left a comment

Approved pending comments

@JamieMagee
Member

Snapshot tests are expected to fail due to changes in the detector.

@JamieMagee
Member

@danielframpton Do you have time to address the comments and resolve conflicts?

if (!IsLocalPackage(package) && !seenAsDependency.Contains(package))
{
    var detectedComponent = new DetectedComponent(component);
    singleFileComponentRecorder.RegisterUsage(detectedComponent, isExplicitReferencedDependency: true);
Collaborator

What happened to dev dependency detection? Is that being considered in these updates, if so, how will those be treated now? Users won't like to start picking up dependencies as regular build time dependencies when they used to be dev dependencies.

Contributor Author

This does not report dev dependencies separately (i.e., they are included alongside other dependencies), as that information is not currently included in the lock file. Unfortunately, there isn't an obvious middle ground here between having something that is simple and robust (lock file processing) and replicating the full functionality of cargo.

This change puts us in the position of not missing dependencies, with the option to (in the future) introduce a tool that uses cargo to get a more accurate picture (either directly or by consuming SBOMs generated from cargo).

@katshup

katshup commented Sep 12, 2022

@danielframpton Do we believe this PR is on track for September?

Review feedback

Revert inclusion of registry in package identity for the initial change

PR feedback and adding a multiple registry test (to validate that it doesn't cause problems)
@JamieMagee
Member

@danielframpton thanks for deconflicting this PR ❤️

I just wanted to double check with you that you're good with this being merged?

@danielframpton
Contributor Author

Thanks @JamieMagee!

I am. I have responded to all the feedback and updated across the code analysis fixes. Let me know if there is anything I have missed.

@JamieMagee JamieMagee enabled auto-merge (squash) October 5, 2022 17:01
@JamieMagee JamieMagee merged commit c7c4ce8 into microsoft:main Oct 5, 2022
@github-actions

github-actions bot commented Oct 5, 2022

👋 Hi! It looks like you modified some files in the Detectors folder.
You may need to bump the detector versions if any of the following scenarios apply:

  • The detector detects more or fewer components than before
  • The detector generates different parent/child graph relationships than before
  • The detector generates different devDependencies values than before

If none of the above scenarios apply, feel free to ignore this comment 🙂

Labels

detector:rust The Rust Cargo detector version:minor

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Rust component detection fails to detect all Rust crate usage

8 participants