Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pipeline copy method #91

Merged
merged 10 commits into from
Jan 9, 2024
Merged

Add pipeline copy method #91

merged 10 commits into from
Jan 9, 2024

Conversation

nvaytet
Copy link
Member

@nvaytet nvaytet commented Jan 5, 2024

This makes it possible to make a copy of an existing pipeline.
One can then, for example modify the parameters of the new pipeline.

@nvaytet nvaytet marked this pull request as ready for review January 8, 2024 08:41
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing tests that modifying a or b after the copy does not affect the other pipeline.

@@ -322,6 +322,9 @@ def __init__(
params:
Dictionary of concrete values to provide for types.
"""
self._copy_providers = list(providers or [])
self._copy_params: Dict[type, Any] = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why those are needed? Can't you just make copies of self._providers, etc.?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No because self._providers contains all the original providers, plus a whole bunch of lambda functions that were made from the parameters, which then fail if you try to construct a new pipeline with them.

Also, going through the sub-providers is a bit of a mess, especially with generics. So I thought just saving the constructor params was far easier. But maybe @SimonHeybrock has a better idea?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are the lambdas failing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ValueError: Provider <function Pipeline.__setitem__.<locals>.<lambda> at 0x7f492825f130> lacks type-hint for return value

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lambda is raising that? That is really odd.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear, I get the error from the lambdas when I do:

Pipeline(existing_pipeline._providers.values())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making a copy of the dicts internally works fine.

@SimonHeybrock
Copy link
Member

SimonHeybrock commented Jan 9, 2024

Looking at Pipeline:

        self._providers: Dict[Key, Provider] = {}
        self._subproviders: Dict[type, Dict[Tuple[Key | TypeVar, ...], Provider]] = {}
        self._param_tables: Dict[Key, ParamTable] = {}
        self._param_name_to_table_key: Dict[Key, Key] = {}

Why can't we just copy these four dicts, and copy the nested dicts in _subproviders?

@nvaytet
Copy link
Member Author

nvaytet commented Jan 9, 2024

and copy the nested dicts in _subproviders

Can you have more than one level of nesting?

@SimonHeybrock
Copy link
Member

and copy the nested dicts in _subproviders

Can you have more than one level of nesting?

No.

@@ -794,3 +794,14 @@ def bind_and_call(
if not return_tuple:
return results[0]
return results

def copy(self) -> Pipeline:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also implement __copy__?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, can do.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we actually need a custom copy implementation? Or would this be enough with the default __copy__?

def copy(self) -> Pipeline:
    from copy import copy
    return copy(self)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to copy the (nested) dicts ourselves, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right! It's deep as far as the dicts are concerned but not the dict elements.

@nvaytet nvaytet merged commit 7c2d6a4 into main Jan 9, 2024
7 checks passed
@nvaytet nvaytet deleted the add-pipeline-copy-method branch January 9, 2024 14:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants