
analysisTargets doesn't seem to scale well #28

Closed
zlandau opened this issue Sep 1, 2015 · 4 comments

@zlandau commented Sep 1, 2015

It sounds like analysisTargets is meant to specify every file that was examined (and also to include hashes of those files). That seems unnecessarily expensive for very large codebases, especially ones that are under revision control. I imagine a common scenario would be to run the analysis over all files in a specific codebase, in which case something like an overall file-tree version (such as a commit hash when under revision control) would suffice.

Perhaps analysisTargets could allow different methods of specifying which files the analysis was run on, with a listing of all files being just one of the options?
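
To make the cost concern concrete, here is a purely illustrative sketch of the two approaches. Apart from analysisTargets itself, the property names below are hypothetical and are not taken from the spec.

```jsonc
// Exhaustive listing: one entry (plus hash) per analyzed file,
// which grows linearly with the size of the codebase.
{
  "analysisTargets": [
    { "uri": "src/app/main.c", "hashes": [{ "algorithm": "sha-256", "value": "..." }] },
    { "uri": "src/app/util.c", "hashes": [{ "algorithm": "sha-256", "value": "..." }] }
  ]
}
```

```jsonc
// Alternative: identify the whole analyzed tree once, e.g. by commit hash.
// "repositoryUri" and "revisionId" are hypothetical names used for illustration.
{
  "analysisTargets": {
    "repositoryUri": "https://example.com/org/repo",
    "revisionId": "9fceb02d0ae598e95dc970b74767f19372d61af8"
  }
}
```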

@codecurmudgeon commented Sep 1, 2015
Yes, for example a project name or an SCM tag.


@michaelcfanning (Contributor)

This makes sense.

@ghost added this to the future milestone Mar 7, 2016
@ghost commented Mar 7, 2016

@zlandau, @codecurmudgeon: I'm going to come back and make this comment clearer. Right now it's a placeholder for me.

#107 says that direct producers will express result locations as simple URIs, and that they should provide more information about the files referenced by those URIs in the "physicalLocationInfo" property of "runLog". physicalLocationInfo is not required; it is only needed for files that are mentioned in results.

We'll recommend they fill out metadata for every file mentioned in a result, but they might not be able to.

In a future "compliance" profile, we'll examine the problem of exhaustively identifying all analysis targets.

@ghost commented Apr 27, 2016

Good news! It is no longer necessary to enumerate every analysis target. The run.files property need only mention the files in which results are produced; in fact, it doesn't even have to mention all of those.
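
For illustration, here is a sketch of what such a run might carry (the property names inside the entries and the result shape are assumptions): results appear in two files, run.files mentions only one of them, and even that entry is optional.

```jsonc
{
  "files": {
    // Only files mentioned in results need an entry here, and even those
    // entries may be omitted: src/app/lexer.c has a result below but no entry.
    "src/app/parser.c": {
      "hashes": [{ "algorithm": "sha-256", "value": "..." }]
    }
  },
  "results": [
    { "locations": [{ "uri": "src/app/parser.c" }], "message": "Possible null dereference." },
    { "locations": [{ "uri": "src/app/lexer.c" }], "message": "Unreachable code detected." }
  ]
}
```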

@ghost closed this as completed Apr 27, 2016