-
Notifications
You must be signed in to change notification settings - Fork 235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Generate multiple nb products from a single task #708
Conversation
Thanks for working on this! I think you're on the right track. I'd suggest collapsing the This is how things work right now: # simplest case, product is a path
tasks:
- source: notebook.ipynb
product: report.html A bit more elaborated, multiple products # multiple products
tasks:
- source: notebook.ipynb
product:
nb: report.html
# no need to specify nb_product_key since it's "nb" by default Now, if the user wants to output multiple reports: tasks:
- source: notebook.ipynb
nb_product_key: [nb_ipynb, nb_html, nb_pdf]
product:
nb_ipynb: report.ipynb
nb_html: report.html
nb_pdf: report.pdf If the user wants to customize tasks:
- source: notebook.ipynb
nb_product_key: [nb_ipynb, nb_html, nb_pdf]
nbconvert_exporter_name:
# user webpdf for nb_pdf, the rest should use the default one based on the extension
nb_pdf: webpdf
product:
nb_ipynb: report.ipynb
nb_html: report.html
nb_pdf: report.pdf Edge cases:
This is the cleanest API I can think of. thoughts? |
Yes this makes sense. ploomber/src/ploomber/tasks/notebook.py Line 522 in 343a831
nb_product_key variable instead of multiple variables.So default would be nb_product_key='nb' , however in the case where user specifies multiple nb products this value will be a list instead of nb like nb_product_key = ['nb_ipynb', 'nb_html', 'nb_pdf']
The only issue seems to be that user needs to specify the keys multiple times : once in the list and again as the dict keys |
Also, the other edge case can be |
correct
I agree. it's kind of redundant, but since users will only create lists of 2-3 elements, I think it's better to be explicit.
good catch. there has to be some validation here. if nb_product_key is a list, then nbconvert_exporter_name must be a dictionary another validation is that all values in nb_product_key must appear in product |
@edublancas I have added the functionalities discussed above , and the corresponding test cases. Do let me know if changes are required. Thanks! |
thanks a lot! just a heads up that I'm traveling to PyCon, so it may take me a few days to review this |
Sure! No issues :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is great! thanks so much for your work! just minor comments
src/ploomber/tasks/notebook.py
Outdated
self.multiple_nb_params = { | ||
'valid_keys': ['nb_ipynb', 'nb_html', 'nb_pdf'], | ||
'ipynb_key': 'nb_ipynb' | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be defined by the user, for example, a user might call this: [notebook, report_html, report_pdf]
, then the validation should be done on those user-submitted values. Looks like this code is looking for a nb_ipynb
key.
For example, I may wanna do this:
- source: fit.py
nb_product_key: [notebook, nb_html, nb_pdf]
product:
notebook: output/report.ipynb
nb_html: output/report.html
nb_pdf: output/report.pdf
model: output/model.pickle
Or may not even want the ipynb file:
- source: fit.py
nb_product_key: [report_web, report_document]
product:
report_web: output/report.html
report_document: output/report.pdf
model: output/model.pickle
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@edublancas
Made the requested changes. Do let me know if anything looks incorrect
* Added fixture * Formatting * merged fixture * minor change
great work @neelasha23! one thing before we merge. can you add an example here? ploomber/src/ploomber/tasks/notebook.py Line 491 in bd134fe
please add a sample pipeline.yaml with a few comments showing how to create multiple products from the same script. something like:
|
Yes I will add it |
For changes related to : #673
@edublancas Need to understand how the field
nbconvert_exporter_name
works in case where user specifies multiple nb products. Which nb product will it be mapped to? Or should thenbconvert_exporter_name
also be a dict where each item corresponds to the respective nb product?