Allow contributors to help maintain #18

Closed
tsturzl opened this issue Nov 13, 2019 · 6 comments

Comments

@tsturzl
Collaborator

tsturzl commented Nov 13, 2019

Hey @christophertrml ,

I've contributed a few times to the project, and I'd really like to continue to do so. I don't see a ton of options for NLP in Rust. I'd really like to make the library more uniform and, overall, more flexible. Currently the tf-idf works but isn't easy to use in a meaningful way. The Naive Bayes classifier has been in need of some work for a while now, and a logistic regression classifier has been a near-term goal since I first learned of the project. If you give me permissions for the repository, I could help handle issues and PRs, expand the feature set, and improve the project's existing features.

Let me know what you think,

Travis Sturzl

@tsturzl
Collaborator Author

tsturzl commented Nov 13, 2019

Some things I'd like to do:

  1. Close out all the standing PRs and issues
  2. Support a variety of edge cases for stemming
  3. Support Serde as an optional crate feature for most of the features (classifier, soundex, and tf-idf)
  4. Tackle the near-term goals
  5. Add support for basic topic modeling
  6. Support parallel operations as an optional crate feature
  7. Start porting features from Python's NLTK

@a2xchip

a2xchip commented Nov 29, 2019

@tsturzl Ability to save tfidf and classifier out of the box would be good to have too))

@lexi-sh
Owner

lexi-sh commented Nov 30, 2019

Thanks for bumping this @a2xchip , there were a couple of days a couple weeks ago where my e-mail was lost and I think this original PR was one of those that got lost.

Unfortunately my interests and responsibilities have shifted such that this project isn't a high priority for me, but I would love if you would continue where I left off since I agree that there's a large gap in NLP in Rust and I started this project precisely because I do believe Rust is an excellent bed for ML in the future.

Travis, I'd be happy to make you a contributor to the project. The vision you've laid out here is basically what I was imagining years ago when I started. A direct competitor to NLTK is basically right in line.

@tsturzl
Collaborator Author

tsturzl commented Dec 3, 2019

@a2xchip serde support would effectively give a means to serialize and deserialize tf-idf and classifiers in many different formats. I think this would suffice for your needs. This will probably be the first thing I implement. If you're not familiar with serde I definitely recommend you take a look, very useful crate.
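
For readers unfamiliar with the pattern, here is a minimal sketch of what "Serde as an optional crate feature" could look like. The `TermCounts` type, its methods, and the `serde` feature name are hypothetical and are not taken from this crate; the sketch assumes a `Cargo.toml` entry such as `serde = { version = "1", features = ["derive"], optional = true }`. Without the feature enabled, the Serde derives are compiled out and the type builds with no extra dependency.

```rust
// Hypothetical sketch: `TermCounts` and the `serde` feature name are
// illustrative assumptions, not this crate's actual API.
use std::collections::{HashMap, HashSet};

// Only pulled in when the crate is built with `--features serde`.
#[cfg(feature = "serde")]
use serde::{Deserialize, Serialize};

/// Per-term document frequencies: the kind of state a tf-idf model
/// would need to save and reload.
#[derive(Debug, Clone, Default, PartialEq)]
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct TermCounts {
    doc_count: usize,
    doc_freq: HashMap<String, usize>,
}

impl TermCounts {
    pub fn new() -> Self {
        Self::default()
    }

    /// Record one document. Each distinct (lowercased) term increments
    /// its document frequency once, however often it occurs in the text.
    pub fn add_document(&mut self, text: &str) {
        self.doc_count += 1;
        let mut seen = HashSet::new();
        for term in text.split_whitespace().map(|t| t.to_lowercase()) {
            if seen.insert(term.clone()) {
                *self.doc_freq.entry(term).or_insert(0) += 1;
            }
        }
    }

    /// Number of recorded documents that contain `term`.
    pub fn doc_freq(&self, term: &str) -> usize {
        self.doc_freq.get(&term.to_lowercase()).copied().unwrap_or(0)
    }

    /// Total number of recorded documents.
    pub fn doc_count(&self) -> usize {
        self.doc_count
    }
}
```

With the feature enabled, a caller could hand a `TermCounts` value to any Serde format crate (for example `serde_json::to_string(&counts)`), which is what would cover saving and loading in many formats without the library committing to one.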

@tsturzl
Collaborator Author

tsturzl commented Dec 3, 2019

@christophertrml Thanks Christopher. I'll probably start slow at first, but this will be a perfect place to implement a lot of the things I use in my various side projects in a manner that's useful to others.

@tsturzl tsturzl closed this as completed Dec 3, 2019
@a2xchip

a2xchip commented Dec 3, 2019

@tsturzl Sure, I know serde and use it myself)))
