how to use on_start functions with arguments #98

Closed
alpae opened this issue Jan 2, 2023 · 2 comments

alpae commented Jan 2, 2023

Hi @cgarciae

I'm trying to use an on_start function that takes an extra argument. From the code in Stage.run, it looks as if you planned to allow additional arguments besides worker_info, but I don't see a way to actually pass them:

def run(self) -> tp.Iterable:

    worker_info = WorkerInfo(index=0)

    on_start_args: tp.List[str] = (
        pypeln_utils.function_args(self.on_start) if self.on_start else []
    )
    on_done_args: tp.List[str] = (
        pypeln_utils.function_args(self.on_done) if self.on_done else []
    )

    if self.on_start is not None:
        on_start_kwargs = dict(worker_info=worker_info)
        kwargs = self.on_start(
            **{
                key: value
                for key, value in on_start_kwargs.items()
                if key in on_start_args
            }
        )

It seems you check for additional arguments, but on_start_kwargs is hard-coded to contain only worker_info. Any suggestions on how to solve this?

Thanks Adrian

cgarciae (Owner) commented Jan 2, 2023

Hey @alpae. That code only checks, in a very general way, whether the user's function requests the worker_info argument. What other arguments do you plan on using? If it's a custom object, maybe you can pass it to on_start via a closure? E.g.:

def get_on_start(my_object):
    def on_start():
        # use my_object here
        ...
    return on_start

on_start = get_on_start(my_object)
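
For illustration, here is a minimal sketch of how the name-based filtering and the closure fit together. It does not import pypeln: `call_with_requested` is a hypothetical helper that uses `inspect.signature` in place of `pypeln_utils.function_args`, and a plain dict stands in for `WorkerInfo(index=0)`. It mirrors the excerpt from the issue, not necessarily the library's full behaviour.

```python
import inspect


def call_with_requested(fn, available):
    """Call fn with only the entries of `available` that it names as parameters."""
    requested = inspect.signature(fn).parameters
    return fn(**{key: value for key, value in available.items() if key in requested})


def get_on_start(my_object):
    def on_start(worker_info):
        # my_object comes from the enclosing scope, worker_info from the caller.
        return {"my_object": my_object, "index": worker_info["index"]}
    return on_start


def get_on_start_no_info(my_object):
    def on_start():
        # This variant does not request worker_info, so it is simply not passed.
        return {"my_object": my_object}
    return on_start


# Stand-in for the hard-coded dict(worker_info=WorkerInfo(index=0)).
available = {"worker_info": {"index": 0}}

with_info = call_with_requested(get_on_start("config"), available)
without_info = call_with_requested(get_on_start_no_info("config"), available)
```

Either way the extra object never has to appear in the function's signature, which is why the hard-coded on_start_kwargs is not a limitation here.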

alpae (Author) commented Jan 4, 2023

Thanks, yes, I think that should work. I also found another way, using a lambda function, that worked for me. In my case the argument was simply the path to an HDF5 file that should be loaded in the subprocess.
Thanks for the cool library, btw.
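
The lambda approach mentioned above might look roughly like the sketch below. `open_store` and the file name `data.h5` are hypothetical stand-ins (no HDF5 library is used), and `functools.partial` is shown next to the lambda as an equivalent way to bind the path.

```python
import functools


def open_store(path):
    # Hypothetical stand-in for opening the HDF5 file.
    # It returns a dict because the Stage.run excerpt assigns the
    # result of on_start to a variable named `kwargs`.
    return {"store": f"opened:{path}"}


path = "data.h5"  # hypothetical file name

# Both callables take no arguments, so nothing has to be passed in by the caller.
on_start_lambda = lambda: open_store(path)
on_start_partial = functools.partial(open_store, path)

lambda_result = on_start_lambda()
partial_result = on_start_partial()
```

One possible reason to prefer `functools.partial`: a partial of a module-level function can be pickled, whereas the standard `pickle` module cannot pickle a lambda, which may matter if the worker processes are created with the spawn start method.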

alpae closed this as completed Jan 4, 2023