Job results to be passed across pipeline #7

Closed
jfunez opened this issue Nov 15, 2017 · 4 comments

Comments

@jfunez

jfunez commented Nov 15, 2017

Hi, one thing I want to see in this awesome lib is the ability to pass data or job results across the pipeline.

As Wikipedia defines a pipeline:

In computing, a pipeline is a set of data processing elements connected in series, where the output of one element is the input of the next one.

Maybe I could help with this feature if you agree...

@csurfer
Owner

csurfer commented Nov 17, 2017

@jfunez: You can do that now too, with Python's queue module (https://docs.python.org/3.6/library/queue.html). Queues in Python are thread-safe, so you can push items into a queue from one thread and pull them from another, and in this way achieve communication between threads.
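
A minimal sketch of that idea, using only the standard library's queue and threading modules rather than this project's own API. The names producer, consumer, SENTINEL and run_example are hypothetical and chosen only for illustration:

```python
import queue
import threading

# Hypothetical end-of-stream marker so the consumer knows when to stop.
SENTINEL = object()


def producer(out_q):
    """Push a few sample values into the queue, then signal completion."""
    for item in range(5):
        out_q.put(item * 10)
    out_q.put(SENTINEL)


def consumer(in_q, results):
    """Pull values until the sentinel arrives, processing each one."""
    while True:
        item = in_q.get()  # blocks until an item is available
        if item is SENTINEL:
            break
        results.append(item + 1)


def run_example():
    q = queue.Queue()  # thread-safe FIFO queue
    results = []
    t1 = threading.Thread(target=producer, args=(q,))
    t2 = threading.Thread(target=consumer, args=(q, results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return results


if __name__ == "__main__":
    print(run_example())
```

The sentinel is one common way to tell the consumer that no more data is coming; other conventions would work as well.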

Does this answer your question or did you have anything else in mind?

@jfunez
Author

jfunez commented Nov 27, 2017

@csurfer not exactly what I had in mind... Maybe this example will help explain:

In a simple pipeline with three jobs:

J1 -> J2 -> J3
  1. In J1, I would collect the data from an external source (DB, API, etc.)
  2. In J2, I would run a data manipulation process on the data returned from J1, and J2 would then return the transformed data
  3. In J3, I would persist the result if the data given by J2 has some specific value

I know it's possible to achieve this with a queue, as you said, but maybe a Job could return an iterator/generator that is passed to the next Job in the pipeline. The latter could then make a decision without all the data sitting in a single "global" queue. You know, the output of one element is the input of the next one.
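
The J1 -> J2 -> J3 idea can be sketched with plain Python generators. This is only an illustration of the proposal, not an existing feature of the library; collect, transform, persist and run_pipeline are hypothetical names, and the hard-coded list stands in for a real DB or API call:

```python
def collect():
    """J1: stand-in for reading from an external source (sample data only)."""
    yield from [1, 2, 3, 4, 5, 6]


def transform(records):
    """J2: lazily transform each record produced by J1."""
    for record in records:
        yield record * record


def persist(records, store, threshold=10):
    """J3: keep a record only if it meets a condition (here: above threshold)."""
    for record in records:
        if record > threshold:
            store.append(record)  # a list stands in for real persistence
    return store


def run_pipeline():
    # The output of each stage is passed directly as the input of the next,
    # one record at a time, with no shared global queue.
    return persist(transform(collect()), store=[])


if __name__ == "__main__":
    print(run_pipeline())
```

Because each stage is a generator, J3 can make its decision per record as soon as J2 yields it, without the full dataset being materialized first.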

HTH

@ndemou

ndemou commented Mar 8, 2018

@jfunez I love PyFunctional for working exactly that way: seq(generator).map(...).map(...)...

@jfunez
Author

jfunez commented Mar 8, 2018

Hi @ndemou
PyFunctional looks very interesting!!! Thanks for the tip, I'll take a deeper look for sure!

csurfer closed this as completed Sep 11, 2021