Awesome Library #5

hamelsmu · 2018-02-03T01:48:14Z

I developed this because at the time there was nothing. However, I really like your API, so I'm going to try to use it in my next blog post.

Cheers!

anttttti · 2018-02-03T08:54:37Z

Nice, similar to what WordSeq produces. The histogram-based length selection could be useful to have. Feel free to contribute :)

Note that WB is still a work in progress, and a few bugs have surfaced now with the Mercari Kaggle competition. Some bits are missing as well, such as pickling extractors written in Cython. The API should be revised so it's as compatible with sklearn as possible, and the documentation should be improved.

There's also been a lot of work on the DL libraries for preprocessing since I made the first version available. Keras and others have dataset and dataloader packages. WB only has the wordbatch pipeline for text data, and nothing yet for data munging from different formats (.csv, word) or for output of features to formats such as LibSVM feature files.

hamelsmu · 2018-02-03T08:57:04Z

No need for the caveats. The first step in open source is putting it out there. Don’t be afraid or feel you need to explain! (I work at GitHub)

hamelsmu closed this as completed Feb 7, 2018

