Add an example in the documentation about nodes with generator functions #2170

Closed
Tracked by #2239
idanov opened this issue Jan 3, 2023 · 0 comments · Fixed by #2302
Assignees
Labels
Component: Documentation 📄 Issue/PR for markdown and API documentation Issue: Feature Request New feature or improvement to existing feature

Comments

Member

idanov commented Jan 3, 2023

Description

After introducing the ability to wrap generator functions in Kedro in #2161, we should add an example to the docs showing how this can be leveraged to process large datasets in chunks. The example could repurpose the split_dataset function from the pandas-iris starter to process data chunk-wise: https://github.com/kedro-org/kedro-starters/blob/main/pandas-iris/%7B%7B%20cookiecutter.repo_name%20%7D%7D/src/%7B%7B%20cookiecutter.python_package%20%7D%7D/nodes.py#L13
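To make the idea concrete, here is a rough sketch of what a chunk-wise node function might look like. It is plain Python with no Kedro imports; `split_chunks`, the `train`/`test` output names, and `train_fraction` are made up for illustration and are not the starter's actual split_dataset. It assumes the input dataset hands the node an iterable of chunks, and that Kedro consumes the generator and saves each yielded output as it arrives (my reading of #2161, to be confirmed during manual testing).

```python
from typing import Dict, Iterable, Iterator, List

# A row is a plain mapping of column name to value (stand-in for a DataFrame row).
Row = Dict[str, float]


def split_chunks(
    chunks: Iterable[List[Row]], train_fraction: float = 0.8
) -> Iterator[Dict[str, List[Row]]]:
    """Hypothetical generator node: split every incoming chunk into train/test.

    Only one chunk is held in memory at a time, because each result is
    yielded before the next chunk is pulled from the input iterable.
    """
    for chunk in chunks:
        # Position-based split within the chunk (no shuffling in this sketch).
        cut = int(len(chunk) * train_fraction)
        yield {"train": chunk[:cut], "test": chunk[cut:]}
```

Calling `list(split_chunks(chunks))` on two chunks returns two dictionaries, one per chunk, each with a `train` and a `test` list. A real example would more likely receive pandas DataFrame chunks, but the control flow would be the same.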

In the example, we should also implement a custom DataSet that saves the data in append-or-create mode (`a+`), so that each chunk is appended to the output instead of overwriting the previous one.
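A minimal sketch of such a dataset, using only the standard library so it runs without Kedro installed. `AppendCSVDataSet` is a hypothetical name; the `_load`, `_save` and `_describe` methods mirror the ones `kedro.io.AbstractDataSet` expects, so in a real project the class would subclass it. Writing the header only when the file is new or empty is my own assumption about how the append case should behave.

```python
import csv
from pathlib import Path
from typing import Any, Dict, List


class AppendCSVDataSet:
    """Hypothetical dataset that appends rows to a CSV file.

    In a Kedro project this would subclass kedro.io.AbstractDataSet;
    it is kept standalone here so the sketch has no dependencies.
    """

    def __init__(self, filepath: str) -> None:
        self._filepath = Path(filepath)

    def _save(self, rows: List[Dict[str, Any]]) -> None:
        if not rows:
            return  # nothing to append for an empty chunk
        # Write the header only when the file is missing or still empty.
        write_header = (
            not self._filepath.exists() or self._filepath.stat().st_size == 0
        )
        # "a+" creates the file if needed and otherwise appends to it.
        with open(self._filepath, "a+", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            if write_header:
                writer.writeheader()
            writer.writerows(rows)

    def _load(self) -> List[Dict[str, str]]:
        # csv.DictReader returns every value as a string.
        with open(self._filepath, newline="") as f:
            return list(csv.DictReader(f))

    def _describe(self) -> Dict[str, Any]:
        return {"filepath": str(self._filepath)}
```

Because the file is never truncated, re-running the pipeline would append to the output of the previous run, so the docs example should probably mention deleting the target file between runs.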

We need to make sure that the example works correctly and use the opportunity to perform manual testing of the functionality.


3 participants