Add support for on demand addon_pipeline in /api/collect endpoints #393
Signed-off-by: Keshav Priyadarshi <git@keshav.space>
@keshav-space I just have a question about when the symbol collection pipelines would be run
# These are the list of supported addon pipelines to run when we scan a Package for
# indexing.
SUPPORTED_ADDON_PIPELINES = (
Can you give me an example of when these pipelines would be run when indexing a package? I see how the default pipelines would be run, but where would we add the symbol collecting pipelines when we index a package?
@JonoYang
Suppose we receive this request: /api/collect/?purl=pkg:npm/foo@1.2.3&addon_pipelines=collect_symbols.
First, the CollectPackageSerializer validates whether collect_symbols is a valid pipeline by checking it against SUPPORTED_ADDON_PIPELINES. Then the package is added to the scan queue using add_package_to_scan_queue, with the pipeline argument being DEFAULT_PIPELINES + ('collect_symbols',).
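The flow described in this comment can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the code from the PR: the contents of `DEFAULT_PIPELINES` are placeholder names, `SUPPORTED_ADDON_PIPELINES` lists only the one pipeline named in this thread, and the helpers `validate_addon_pipelines` and `pipelines_for_scan` are hypothetical stand-ins for what `CollectPackageSerializer` and `add_package_to_scan_queue` do.

```python
# Placeholder values; the real tuples live in the purldb codebase.
DEFAULT_PIPELINES = ("default_pipeline_a", "default_pipeline_b")
# Only the addon pipeline named in this thread; the real tuple may hold more.
SUPPORTED_ADDON_PIPELINES = ("collect_symbols",)


def validate_addon_pipelines(requested):
    """Return the requested names as a tuple, or raise ValueError
    if any name is not in SUPPORTED_ADDON_PIPELINES (hypothetical helper)."""
    invalid = [name for name in requested if name not in SUPPORTED_ADDON_PIPELINES]
    if invalid:
        raise ValueError(f"Unsupported addon pipelines: {', '.join(invalid)}")
    return tuple(requested)


def pipelines_for_scan(addon_pipelines=()):
    """Combine the default pipelines with the validated addon pipelines.
    With no addon pipelines, only the defaults are returned (hypothetical helper)."""
    return DEFAULT_PIPELINES + validate_addon_pipelines(addon_pipelines)
```

Note the trailing comma in a one-element tuple such as `('collect_symbols',)`: without it, Python treats the parentheses as grouping and the concatenation with a tuple fails.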
@keshav-space I see, do we currently use this anywhere? Otherwise, I think the code looks good unless there's something you want to change.
@JonoYang No more changes from my side; it's ready to merge.
The /api/collect endpoint now supports an optional list of addon_pipelines to run on the package. Example GET request:
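A minimal sketch of how a client might build such a GET URL, modeled on the request quoted in the review discussion. The host name is a placeholder, `build_collect_url` is a hypothetical helper, and repeating the `addon_pipelines` parameter for multiple pipelines is an assumption about how the endpoint reads a list.

```python
from urllib.parse import urlencode


def build_collect_url(base_url, purl, addon_pipelines=()):
    """Build a /api/collect/ URL for one purl (hypothetical helper).

    Each addon pipeline is sent as its own addon_pipelines parameter;
    this encoding of the list is an assumption.
    """
    params = [("purl", purl)]
    params.extend(("addon_pipelines", name) for name in addon_pipelines)
    # Keep ":", "/" and "@" unescaped so the purl stays readable in the URL.
    query = urlencode(params, safe=":/@")
    return f"{base_url}/api/collect/?{query}"


# Placeholder host; substitute the address of an actual PurlDB instance.
url = build_collect_url(
    "https://purldb.example.org",
    "pkg:npm/foo@1.2.3",
    ["collect_symbols"],
)
```

Omitting `addon_pipelines` yields a URL with only the `purl` parameter, which corresponds to running just the default pipelines.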
In the bulk indexing /api/collect/index_packages/ endpoint, each package now also supports an optional list of addon_pipelines to run on the package. Example POST request:
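A rough sketch of how a per-package request body could be assembled. The body layout is an assumption and is not confirmed by this thread: the top-level `packages` key, the `purl` field name, and the helper `build_index_packages_payload` are all hypothetical. Only the idea that each package carries its own optional `addon_pipelines` list comes from the description above.

```python
import json


def build_index_packages_payload(packages):
    """Serialize (purl, addon_pipelines) pairs into a JSON body (hypothetical helper).

    The "packages" wrapper and field names are assumed, not taken from the PR.
    """
    entries = []
    for purl, addon_pipelines in packages:
        entry = {"purl": purl}
        # Leave the key out when no addon pipelines are requested,
        # so only the default pipelines would apply to that package.
        if addon_pipelines:
            entry["addon_pipelines"] = list(addon_pipelines)
        entries.append(entry)
    return json.dumps({"packages": entries})


payload = build_index_packages_payload(
    [
        ("pkg:npm/foo@1.2.3", ["collect_symbols"]),
        ("pkg:npm/bar@2.0.0", []),
    ]
)
```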
fixes PurlDB: make symbols and strings indexing a separate, on demand option #376
Note
If no addon_pipeline is provided, then only the default pipelines will be run on the package.
https://github.com/nexB/purldb/blob/32ea6e810c3c371e24318440f079a7453f1c5d4a/minecode/model_utils.py#L30-L33