Create a taxicab demo that runs in "Continuous Training" mode? #421

robertlugg · 2019-08-02T23:02:59Z

One key distinction is that between one-off and continuous pipelines. One-off pipelines are initiated by engineers to produce ML models “on demand”. In contrast, continuous pipelines are “always on”: they ingest new data and produce newly updated models continuously.

The Chicago taxicab example appears to be an "on-demand" pipeline. Any change to the data directory (changing a row in the *.csv or adding a new *.csv file) triggers a complete re-run of all the rows of all the *.csv files. From what I see, the current demo can't be run as a continuous pipeline.

Could you either correct me, explain how it might be done, or create a version of the taxicab demo which can run in continuous mode? I expect that it would watch for new *.csv files or changes to existing *.csv files and update the output tfrecord files of the CsvExampleGen, processing only the changed rows or added files rather than every row again.
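To make the request concrete, here is a minimal sketch of the file-level half of that behavior: detecting which *.csv files are new or changed since the last run. It uses only the Python standard library and no TFX APIs. The names `fingerprint`, `find_new_or_changed`, and the JSON state file are assumptions made up for illustration, and row-level change detection is not covered.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def fingerprint(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large CSV files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_new_or_changed(data_dir: Path, state_file: Path) -> List[Path]:
    """List CSV files that are new or modified since the previous call.

    The state file (a hypothetical JSON map of file name to digest) is
    rewritten on every call so the next call only reports later changes.
    """
    seen = json.loads(state_file.read_text()) if state_file.exists() else {}
    current = {p.name: fingerprint(p) for p in sorted(data_dir.glob("*.csv"))}
    changed = [data_dir / name for name, digest in current.items()
               if seen.get(name) != digest]
    state_file.write_text(json.dumps(current, indent=2))
    return changed
```

Only the files returned by `find_new_or_changed` would need to be converted again; how that result would be fed into CsvExampleGen is exactly the open question here.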

1025KB · 2019-08-03T05:17:21Z

We don't support continuous training in TFX OSS yet. What you can do for now is trigger the pipeline periodically to mimic continuous mode.

There are similar feature requests [1][2]. Stay tuned, it's on our radar!
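The periodic-trigger workaround mentioned above could be sketched roughly as follows. This is an illustration only, not a TFX feature: `run_pipeline` is a hypothetical stand-in for whatever launches one full pipeline run, and in practice a scheduler such as cron would likely replace this loop.

```python
import time
from typing import Callable


def run_periodically(run_pipeline: Callable[[], None],
                     interval_seconds: float,
                     max_runs: int) -> int:
    """Call run_pipeline once per interval, max_runs times in total.

    Returns the number of completed runs. run_pipeline is a hypothetical
    callable that performs one complete (non-incremental) pipeline run.
    """
    completed = 0
    for _ in range(max_runs):
        started = time.monotonic()
        run_pipeline()
        completed += 1
        if completed == max_runs:
            break
        # Sleep only for what is left of the interval after the run itself.
        remaining = interval_seconds - (time.monotonic() - started)
        if remaining > 0:
            time.sleep(remaining)
    return completed
```

Each trigger still reprocesses all of the input data, so this mimics the cadence of a continuous pipeline but not its incremental behavior.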

robertlugg · 2019-09-03T22:42:44Z

Hi @1025KB, I was curious whether you have made any progress. Or could you give me a conceptual understanding of how this could be done? For instance, the taxicab demo passes a directory into CsvExampleGen. Would that be changed to take in individual files? Or individual lines?

On the idea of line-by-line processing: if each line operation takes a large amount of time, could I also distribute that work across the data? Would I just make up my own URI scheme?

singhniraj08 · 2023-05-05T10:36:58Z

@robertlugg,

Since a similar feature request, #210, is already in progress, could you please close this issue and follow and +1 that thread for updates? Thanks.

github-actions · 2023-05-13T01:51:54Z

This issue has been marked stale because it has no recent activity since 7 days. It will be closed if no further activity occurs. Thank you.

github-actions · 2023-05-20T01:52:21Z

This issue was closed due to lack of activity after being marked stale for past 7 days.

1025KB added the type:feature label Aug 3, 2019

iprapas mentioned this issue Mar 25, 2020

Status of continuous pipelines #1540

Closed

singhniraj08 self-assigned this May 5, 2023

singhniraj08 added the stat:awaiting response label May 5, 2023

github-actions bot added the stale label May 13, 2023

github-actions bot closed this as completed May 20, 2023
