Nice, similar to what WordSeq produces. The histogram-based length selection could be useful to have. Feel free to contribute :)
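For reference, here is a minimal sketch of what histogram-based length selection could look like. It assumes the idea is to pick the shortest sequence length that still covers a chosen fraction of the texts. The function name `select_seq_length`, the `coverage` parameter, and the whitespace tokenization are all hypothetical, and this is not taken from wordbatch or from the implementation discussed here.

```python
from collections import Counter


def select_seq_length(texts, coverage=0.95):
    """Return the smallest token length that covers `coverage` of the texts."""
    if not texts:
        raise ValueError("texts must not be empty")
    # Histogram: token count -> number of texts with that count.
    # Whitespace splitting is an assumption; swap in a real tokenizer as needed.
    histogram = Counter(len(text.split()) for text in texts)
    total = sum(histogram.values())
    covered = 0
    # Walk the histogram from short to long until enough texts are covered.
    for length in sorted(histogram):
        covered += histogram[length]
        if covered / total >= coverage:
            return length
    return max(histogram)


texts = ["a b", "a b c", "a b c d", "a " * 50]
print(select_seq_length(texts, coverage=0.75))  # prints 4
```

Texts longer than the selected length would be truncated and shorter ones padded, so a single very long outlier does not dictate the padding length for the whole dataset.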
Note that WB is still a work in progress, and a few bugs have surfaced with the Mercari Kaggle competition. There are some missing bits as well, such as pickling of extractors written in Cython. The API should be revised so it's as compatible with sklearn as possible, and the documentation needs improving.
There's also been a lot of work on preprocessing in the DL libraries since I made the first version available. Keras and others now have dataset and dataloader packages. WB only has the wordbatch pipeline for text data, and nothing yet for munging data from different formats (.csv, word), or for writing features out to formats such as LibSVM feature files.
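To illustrate the LibSVM output mentioned above, here is a rough sketch of a writer for that format, where each line is `label index:value ...` with ascending feature indices. The helper names `to_libsvm_line` and `write_libsvm` and the dict-based row representation are hypothetical and not part of WB. For scipy sparse matrices, scikit-learn's `sklearn.datasets.dump_svmlight_file` already covers this.

```python
def to_libsvm_line(label, features):
    """Format one example as 'label index:value ...' with ascending indices.

    `features` maps a 1-based feature index to its value (an assumed
    representation); zero values are skipped because the format is sparse.
    """
    pairs = [f"{idx}:{val!r}" for idx, val in sorted(features.items()) if val != 0]
    return " ".join([repr(label)] + pairs)


def write_libsvm(path, labels, rows):
    """Write one LibSVM-formatted line per (label, feature dict) pair."""
    with open(path, "w", encoding="utf-8") as f:
        for label, features in zip(labels, rows):
            f.write(to_libsvm_line(label, features) + "\n")


print(to_libsvm_line(1, {3: 0.5, 1: 2.0}))  # prints: 1 1:2.0 3:0.5
```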
I developed this because at the time there was nothing else available. However, I really like your API, so I'm going to try to use it in my next blog post.
Cheers!