Add export of Pipeline YAML config #1003

Merged
merged 13 commits into from
Apr 30, 2021

Conversation

oryx1729
Contributor

@oryx1729 oryx1729 commented Apr 27, 2021

Implementation

This PR adds an export_to_yaml() method for the Pipeline class to create a YAML configuration for a loaded Pipeline instance.

Under the hood, the BaseComponent saves the init parameters used to create a Component. export_to_yaml() then puts these parameters together to create the complete Pipeline config.
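As a rough illustration of this mechanism (not the code in this PR), the sketch below records init parameters on each component and assembles them into one config dict. SketchComponent, save_init_params, export_config, and the config layout are all hypothetical names chosen for the example.

```python
class SketchComponent:
    """Hypothetical stand-in for BaseComponent."""

    def save_init_params(self, **params):
        # Keep a copy of the arguments the component was created with.
        self.init_params = dict(params)


class SketchRetriever(SketchComponent):
    def __init__(self, top_k=10):
        self.save_init_params(top_k=top_k)
        self.top_k = top_k


class SketchReader(SketchComponent):
    def __init__(self, model_name, use_gpu=False):
        # Defaults are recorded too, so the export is explicit about them.
        self.save_init_params(model_name=model_name, use_gpu=use_gpu)
        self.model = f"<loaded {model_name}>"


def export_config(components):
    """Build a config dict from a mapping of name -> component instance."""
    return {
        "components": [
            {"name": name, "type": type(comp).__name__, "params": comp.init_params}
            for name, comp in components.items()
        ]
    }


config = export_config(
    {"Retriever": SketchRetriever(top_k=5), "Reader": SketchReader("some-model")}
)
```

A dict like this could then be serialized with something like PyYAML's yaml.safe_dump(config); that step is left out here to keep the sketch dependency-free.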

Limitations

With this implementation, only the init parameters are captured. Hence, any updates made to a Component after it has been initialized are not reflected in the YAML.
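A minimal sketch of what this limitation looks like in practice, assuming a hypothetical SketchRetriever that stores its init parameters in an init_params attribute (neither name comes from this PR):

```python
class SketchRetriever:
    """Hypothetical component that records its init parameters once."""

    def __init__(self, top_k=10):
        self.init_params = {"top_k": top_k}  # captured at construction time
        self.top_k = top_k


retriever = SketchRetriever(top_k=10)
retriever.top_k = 3  # updated after initialization

# An export built from the recorded parameters still sees the original value.
exported = dict(retriever.init_params)
```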

Alternate approach

An alternative approach was also considered: generating the YAML by taking a snapshot of the Components' instance attributes when export_to_yaml() is called. With this approach, the export iterates over all instance attributes of each Component and keeps the ones that are parameters of the Component's __init__(). However, this may not work when an init parameter such as embedding_model: str gets loaded as self.embedding_model: Inferencer. Additionally, some init parameters, like verify_certs in ElasticsearchDocumentStore, are never saved as instance attributes.
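To make the two failure modes concrete, here is a small sketch of the snapshot idea using inspect.signature. FakeInferencer, SketchStore, SketchRetriever, and snapshot_params are hypothetical stand-ins, not Haystack classes.

```python
import inspect


class FakeInferencer:
    """Hypothetical stand-in for a loaded model object."""

    def __init__(self, name):
        self.name = name


class SketchStore:
    def __init__(self, host="localhost", verify_certs=False):
        self.host = host
        # verify_certs would only be used while connecting; it is never stored.


class SketchRetriever:
    def __init__(self, embedding_model, top_k=10):
        # The string parameter is replaced by a loaded object.
        self.embedding_model = FakeInferencer(embedding_model)
        self.top_k = top_k


def snapshot_params(component):
    """Keep the instance attributes whose names match __init__ parameters."""
    init_names = [
        name
        for name in inspect.signature(type(component).__init__).parameters
        if name != "self"
    ]
    attrs = vars(component)
    return {name: attrs[name] for name in init_names if name in attrs}


store_params = snapshot_params(SketchStore())
retriever_params = snapshot_params(SketchRetriever("some-model"))
```

Here store_params has no verify_certs entry at all, and retriever_params["embedding_model"] is a FakeInferencer rather than the original string, so neither snapshot could be written back out as the original init arguments.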


Resolves #741.

@oryx1729 oryx1729 changed the title WIP: Add export of Pipeline YAML config Add export of Pipeline YAML config Apr 28, 2021
@oryx1729 oryx1729 requested a review from tholor April 28, 2021 10:20
Member

@tholor tholor left a comment

Nice work! I added a few comments about renaming, documentation, and storing default values.

haystack/pipeline.py (outdated, resolved)
haystack/pipeline.py (outdated, resolved)
haystack/pipeline.py (resolved)
haystack/pipeline.py (resolved)
haystack/schema.py (outdated, resolved)
haystack/schema.py (outdated, resolved)
@oryx1729 oryx1729 requested a review from tholor April 29, 2021 14:16
Member

@tholor tholor left a comment

LGTM!

@oryx1729 oryx1729 merged commit 99990e7 into master Apr 30, 2021
@oryx1729 oryx1729 deleted the export-pipeline-yaml branch April 30, 2021 10:23
Successfully merging this pull request may close these issues.

Saving Pipeline to YAML
2 participants