File client user guide #1060
Conversation
Looks better already, just make sure the CI is passing before review.
I created the branch from the latest
Am I right to assume that these docs mostly gather docs we already had in other places? (That's fine, I'm just wondering where this came from and whether we should remove the other references to prevent duplication.)
doc/user-guide/file_clients.rst (outdated)
The file clients only upload products generated by the pipeline. If you want to work with an external dataset which is in the cloud please follow the below approach:

* SQL script that runs a CREATE TABLE statement to copy a public dataset.
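For reference, the approach that bullet describes can be sketched like this (a minimal sketch using Python's built-in sqlite3 with a stand-in source table, since the actual warehouse and public dataset aren't specified in the doc):

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Stand-in for a public dataset that already lives in the database;
# in practice this table would exist in the warehouse already.
con.execute("CREATE TABLE public_dataset (id INTEGER, value TEXT)")
con.execute("INSERT INTO public_dataset VALUES (1, 'a'), (2, 'b')")

# The approach the bullet describes: a SQL script whose CREATE TABLE
# statement copies the public dataset into a table the pipeline owns.
con.execute("CREATE TABLE my_copy AS SELECT * FROM public_dataset")

rows = con.execute("SELECT * FROM my_copy ORDER BY id").fetchall()
```

In a real pipeline the `CREATE TABLE ... AS SELECT` statement would live in a `.sql` task file and run against the actual warehouse rather than an in-memory database.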
I'm a bit confused about this section. I think this is documenting what to do when the dataset we want to work with is in a different place, right? But I'm not following how that connects to this list.
This was mentioned in the original issue: "Another important concept to explain is how to bring external datasets that live in the cloud already, since file clients only upload products generated by the pipeline."
Ok, so let's remove these three points since they're confusing and replace them with:

The file clients only upload products generated by the pipeline. If you want to work with an external dataset, you should download such a dataset in the pipeline task that uses it as input. If you need help, contact us (add link to our slack).
Yeah, I saw that in an earlier CI run. We can ignore it.
Update: I found out why the CI failed and fixed it. Please rebase.
Commits added from 0b3c315 to 02cf8c2:

* remove email onboarding test
* Added an Algolia Crawler Section to doc; Update contributing.md
* add `--backend` to report command; update `--backend` help msg; Add initial report command tests; add backend arg to `dag.plot`; try using `plot.choose_backend(backend)`; add `dot -c` to win CIs; update help msg and remove backend in cli msg; remove backend assertion; address feedback
* (…loomber#825): added and tested issue ploomber#805; fix repo issues based on Ido's feedback; pass test; add psycopg2-binary back; fix bugs to pass test; delete unused files; deleted fileclient/.gitignore; removed the commented debug; deleted the blank line at end of file; no message; fixed bugs with format checked; add one test; checked format with flake8; removed one test; fixed mock called twice error; fixed test_error_unknown_example; fixed format issues; solved format issues; fixed CI errors; removed debug code; avoid creation of multiple example managers and add more test cases; add `_suggest_example` and revert arg order change; refactor: move file read to constructor; use try-except to bypass exceptions; print error instead of raising exception; Update newline breaks (Co-authored-by: shuyang <94rain@msn.com>; Co-authored-by: shuyang <21193371+94rain@users.noreply.github.com>)
* handle exception in `jupyter.manager.load_dag`; Add `test_load_dag_exception_handling` test; ensure `log.exception.call_args_list` is expected
* (ploomber#920): Fixed confusing error when passing something.html as source (ploomber#646); failing test fixed; error message fixed; `_looks_like_file_name` check added after import fails; typo fixed, test fixed (Co-authored-by: yafim <ifim.vo@gmail.com>)
* (…mber#927): start_method added to parallel executor; tests added to executor with valid start_method values
* Adds update to cloud notebooks; Updates hooks for pipelines api; Updates pipelines api user guide (Co-authored-by: yafim <ifim.vo@gmail.com>)
* Update License to Copyright 2022-Present Ploomber Inc.
@neelasha23 Just a comment: I think that instead of rebasing you're doing a merge of the repo, which pulls the commits listed above into this PR. Please avoid merging like that so this PR stays consistent with the rest of the PRs.
Describe your changes
Issue ticket number and link
Closes #701
Checklist before requesting a review