Run the pipeline synchronously? #49

OmarJay1 · 2020-05-23T17:48:51Z

Hi, I've been stepping through the data acquisition code, which is hard to do when there are multiple processes running. Is there a way to make the processes run sequentially instead of asynchronously? I can comment stuff out and add breakpoints, but a command line switch or something similar would be cleaner.

It looks like the --only argument might do something like that, but I haven't been able to get it to work.

Thanks.

owahltinez · 2020-05-23T20:06:08Z

Some pipelines simply can't do processing synchronously (e.g. weather would take over a day to process) but you can now run the pipelines within a chain one at a time by passing the --process-count 1 option to the run.py script. Again, keep in mind that each pipeline might still be running multiple threads / processes.
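To illustrate why a process count of 1 helps when stepping through with a debugger, here is a minimal sketch. It is not the project's code: `run_chain` and `fake_pipeline` are hypothetical names, and a thread pool stands in for whatever worker pool the real script uses.

```python
from multiprocessing.pool import ThreadPool
from typing import Callable, List


def fake_pipeline(name: str) -> str:
    # Hypothetical stand-in for a real pipeline; it only reports that it ran.
    return f"{name}: done"


def run_chain(
    names: List[str],
    process_count: int,
    pipeline: Callable[[str], str] = fake_pipeline,
) -> List[str]:
    if process_count <= 1:
        # Plain loop in the calling thread: pipelines run one at a time,
        # so breakpoints are hit in a predictable order.
        return [pipeline(name) for name in names]
    # Otherwise fan the work out to a pool (threads here, for simplicity).
    with ThreadPool(process_count) as pool:
        return pool.map(pipeline, names)


results = run_chain(["epidemiology", "demographics", "weather"], process_count=1)
print(results)
```

As noted above, a pipeline run this way may still start its own threads or processes internally; the sketch only covers the outer loop over the chain.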

--only and --exclude are used to choose which pipelines within the chain are run. If you run a single pipeline, then multiprocessing is disabled automatically. To use them, pass the names of the output tables separated by commas: run.py --only epidemiology,demographics or run.py --exclude weather
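A rough sketch of how comma-separated --only and --exclude values could be turned into a list of pipelines to run. This is an assumption about the general approach, not the script's actual implementation: `ALL_PIPELINES` and `select_pipelines` are hypothetical names, and the table list contains only the names mentioned in this thread.

```python
import argparse
from typing import List, Optional

# Hypothetical list of output table names, used only for this example.
ALL_PIPELINES = ["epidemiology", "demographics", "weather"]


def select_pipelines(
    all_names: List[str], only: Optional[str], exclude: Optional[str]
) -> List[str]:
    selected = list(all_names)
    if only:
        # Keep only the tables named in the comma-separated list.
        wanted = only.split(",")
        selected = [name for name in selected if name in wanted]
    if exclude:
        # Drop the tables named in the comma-separated list.
        unwanted = exclude.split(",")
        selected = [name for name in selected if name not in unwanted]
    return selected


parser = argparse.ArgumentParser()
parser.add_argument("--only", type=str, default=None)
parser.add_argument("--exclude", type=str, default=None)

# Equivalent to: <script> --only epidemiology,demographics
args = parser.parse_args(["--only", "epidemiology,demographics"])
print(select_pipelines(ALL_PIPELINES, args.only, args.exclude))
```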

OmarJay1 · 2020-05-24T19:52:31Z

Thanks. I'm assuming that run.py is now called update.py?

Also, I noticed that a previous version I'm using captures more epidemiology data than the most recent. I can say more once I understand more how the code works.

Thank you.

owahltinez · 2020-05-24T21:54:54Z

> Thanks. I'm assuming that run.py is now called update.py?

Yes, sorry for the change -- the project is currently under active development but I don't expect that update.py will change again soon.

> Also, I noticed that a previous version I'm using captures more epidemiology data than the most recent.

Can you please share a few example data points (key + date)?

owahltinez · 2020-05-29T21:08:26Z

@OmarJay1 did you get a chance to collect a few examples of data points that are missing compared to the previous version?

OmarJay1 · 2020-05-30T12:34:26Z

I didn't get a chance to look closely, but the overall size was back to what it was before. Thanks.

OmarJay1 closed this as completed May 30, 2020
