Awesome Library #5

Closed
hamelsmu opened this issue Feb 3, 2018 · 2 comments

Comments


hamelsmu commented Feb 3, 2018

I developed this because at the time there was nothing. However, I really like your API, so I'm going to try to use it in my next blog post.

Cheers!

Owner

anttttti commented Feb 3, 2018

Nice, similar to what WordSeq produces. The histogram-based length selection could be useful to have. Feel free to contribute :)
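For readers unfamiliar with the idea, here is a minimal sketch of what histogram-based length selection could look like: build a histogram of document lengths and pick the smallest length that covers a chosen fraction of the documents. The names `select_max_length`, `pad_or_truncate`, and the `coverage` parameter are hypothetical and are not part of either library's API.

```python
from collections import Counter


def select_max_length(token_lists, coverage=0.95):
    """Return the smallest length L such that at least `coverage`
    of the documents have len(tokens) <= L.

    Hypothetical helper for illustration; not taken from any library.
    """
    if not token_lists:
        raise ValueError("need at least one document")
    if not 0.0 < coverage <= 1.0:
        raise ValueError("coverage must be in (0, 1]")

    # Histogram: document length -> number of documents with that length.
    histogram = Counter(len(tokens) for tokens in token_lists)
    total = len(token_lists)

    covered = 0
    for length in sorted(histogram):
        covered += histogram[length]
        if covered / total >= coverage:
            return length
    return max(histogram)  # guard against floating-point edge cases


def pad_or_truncate(tokens, max_len, pad_token="<pad>"):
    """Cut `tokens` to `max_len` items, or right-pad it with `pad_token`."""
    return tokens[:max_len] + [pad_token] * (max_len - len(tokens))


docs = [["a"], ["a", "b"], ["c", "d"], ["a", "b", "c"], ["x"] * 10]
max_len = select_max_length(docs, coverage=0.8)
fixed = [pad_or_truncate(doc, max_len) for doc in docs]
```

Choosing the cutoff from the cumulative histogram rather than from the longest document keeps a handful of outliers from inflating the padded length of every sequence.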

Note that WB is still a work in progress, and a few bugs have surfaced with the Mercari Kaggle competition. Some bits are missing as well, such as pickling of extractors written in Cython. The API should be revised to be as compatible with sklearn as possible, and the documentation improved.

There's also been a lot of work on preprocessing in the DL libraries since I made the first version available. Keras and others have dataset and dataloader packages. WB only has the wordbatch pipeline for text data, and nothing yet for data munging from different formats (.csv, word), or for output of features to formats such as LibSVM feature files.
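As a rough illustration of the LibSVM feature file format mentioned above, each line holds a label followed by ascending `index:value` pairs for the non-zero features. The sketch below assumes features arrive as a dict keyed by 0-based column index and writes 1-based indices, which is the common convention; `to_libsvm_line` is a hypothetical helper, not something WB provides.

```python
def to_libsvm_line(label, features):
    """Format one example as a LibSVM-style line.

    `features` maps a 0-based column index to a numeric value.
    Zero values are skipped and indices are written 1-based, ascending.
    Hypothetical helper for illustration only.
    """
    parts = [str(label)]
    for index in sorted(features):
        value = features[index]
        if value != 0:
            parts.append(f"{index + 1}:{value:g}")
    return " ".join(parts)


rows = [(1, {3: 0.5, 0: 2.0, 2: 0}), (0, {1: 1.0})]
lines = [to_libsvm_line(label, feats) for label, feats in rows]
```

For feature matrices already held as NumPy or SciPy arrays, scikit-learn's `dump_svmlight_file` writes the same format.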

Author

hamelsmu commented Feb 3, 2018

No need for the caveats. The first step in open source is putting it out there. Don’t be afraid or feel you need to explain! (I work at GitHub)

hamelsmu closed this as completed Feb 7, 2018