Train model from streams #7351
Use stream to train model
I am currently working on a prediction task with variable-length sequences (many-to-many), which involves saving many files in a folder. Some new requirements call for building datasets from streams instead (i.e., others pass me streams, and I use those streams to train the model without saving files locally).
How do I build a dataset and store it in HDFS without saving local files? Alex said he would support this.
Thanks a lot.
Thanks for the issue. Functionality to support this is being added here: #7340
I will provide an example (in the form of a unit test) before that is merged.
Update: functionality is done, but PR is not yet merged. It should get merged later today or tomorrow at the latest.
For opening streams from HDFS, you can use this as a reference:
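Separately from the reference mentioned above, here is a minimal JDK-only sketch of the idea in the question: parsing variable-length sequences directly from an `InputStream`, so nothing is written to local disk. The class name `StreamSequenceReader`, its method names, and the CSV layout (one time step per line) are hypothetical and are not part of the DL4J or DataVec API. With HDFS, the stream would typically come from Hadoop's `FileSystem.open(Path)`; the demo below uses in-memory streams instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not a DL4J class): reads sequences from streams, never from local files.
public class StreamSequenceReader {

    // Reads one sequence. Assumed layout: each non-empty line is one time step
    // of comma-separated numeric values. The stream is closed when done.
    public static List<double[]> readSequence(InputStream in) throws IOException {
        List<double[]> steps = new ArrayList<>();
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // skip blank lines
                }
                String[] parts = line.split(",");
                double[] values = new double[parts.length];
                for (int i = 0; i < parts.length; i++) {
                    values[i] = Double.parseDouble(parts[i].trim());
                }
                steps.add(values);
            }
        }
        return steps;
    }

    // Reads one sequence per stream; sequences may have different lengths.
    public static List<List<double[]>> readAll(List<InputStream> streams) throws IOException {
        List<List<double[]>> sequences = new ArrayList<>();
        for (InputStream in : streams) {
            sequences.add(readSequence(in));
        }
        return sequences;
    }

    private static InputStream inMemory(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // In-memory streams stand in for streams handed over by another system.
        List<InputStream> streams = new ArrayList<>();
        streams.add(inMemory("1.0,2.0\n3.0,4.0\n"));
        streams.add(inMemory("5.0,6.0\n7.0,8.0\n9.0,10.0\n"));

        List<List<double[]>> sequences = readAll(streams);
        for (int i = 0; i < sequences.size(); i++) {
            System.out.println("sequence " + i + " has " + sequences.get(i).size() + " time steps");
        }
    }
}
```

The parsed sequences would still need to be converted into whatever dataset or iterator type the training code expects; that step depends on the functionality added in the PR referenced above and is not shown here.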