Added removeDocument and retrain #36

merged 1 commit into from Nov 1, 2013



mde commented May 2, 2012

I wanted the ability to remove something from a category, so I added the removeDocument method. However, it looks like train is both incremental and additive-only, so the most straightforward way to do it without a lot of rewriting seemed to be adding a retrain method that wipes the slate and starts over.
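To make the remove-then-retrain idea concrete, here is a minimal, self-contained sketch. It is not the code from this PR or the library's actual implementation: `TinyClassifier`, its fields, and `countFor` are hypothetical names invented for illustration. It only assumes what is described above, namely that train is incremental and additive-only, so a removal needs a full rebuild.

```javascript
// Hypothetical toy classifier for illustration only; not the library's code.
class TinyClassifier {
  constructor() {
    this.docs = [];          // raw training documents: { tokens, label }
    this.counts = new Map(); // label -> Map(token -> count), built by train()
    this.trainedUpTo = 0;    // index of the first document not yet counted
  }

  static tokenize(text) {
    return text.toLowerCase().split(/\s+/).filter(Boolean);
  }

  addDocument(text, label) {
    this.docs.push({ tokens: TinyClassifier.tokenize(text), label });
  }

  // Removes the first stored document matching text and label.
  // Counts already built by train() are NOT adjusted, so call retrain() afterwards.
  removeDocument(text, label) {
    const key = TinyClassifier.tokenize(text).join(' ');
    const index = this.docs.findIndex(
      (doc) => doc.label === label && doc.tokens.join(' ') === key
    );
    if (index === -1) return false;
    this.docs.splice(index, 1);
    return true;
  }

  // Incremental and additive-only: counts only documents added since the last call.
  train() {
    for (let i = this.trainedUpTo; i < this.docs.length; i++) {
      const { tokens, label } = this.docs[i];
      if (!this.counts.has(label)) this.counts.set(label, new Map());
      const perLabel = this.counts.get(label);
      for (const token of tokens) {
        perLabel.set(token, (perLabel.get(token) || 0) + 1);
      }
    }
    this.trainedUpTo = this.docs.length;
  }

  // Wipes the slate and rebuilds every count from the remaining documents.
  retrain() {
    this.counts = new Map();
    this.trainedUpTo = 0;
    this.train();
  }

  countFor(token, label) {
    const perLabel = this.counts.get(label);
    return perLabel ? perLabel.get(token) || 0 : 0;
  }
}

const demo = new TinyClassifier();
demo.addDocument('great fun', 'good');
demo.addDocument('awful mess', 'bad');
demo.train();
demo.removeDocument('awful mess', 'bad');
demo.retrain();
```

In this sketch, rebuilding from the stored documents sidesteps the question of how to subtract a removed document's contribution from shared feature counts, at the cost of recounting everything on each removal.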

The only thing I'm really dubious about is the ramifications of removing a text-item from features, in the case where the same thing exists in documents with multiple classifiers.

If this is completely crazy, I'd appreciate feedback on a better way to approach adding this feature.

I've also added a test for this -- the test for good/bad equality seems a little brittle, but as long as the classifications format doesn't change, it should work correctly. I am a little curious how the classify method returns a value in the case where categorizations are all equal. Does it just pick the first one it finds? Is this desirable behavior? If something can't be reasonably categorized, would returning null be too weird?
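As one possible answer to the tie question, the sketch below returns null when the top two scores are equal. The `pickLabel` helper and the `epsilon` parameter are hypothetical, and the `{ label, value }` shape of each classification entry is an assumption about the classifications format rather than something confirmed in this thread.

```javascript
// Hypothetical helper: picks the best label, or null when there is no clear winner.
// Assumes each entry looks like { label: string, value: number }.
function pickLabel(classifications, epsilon = 1e-12) {
  if (classifications.length === 0) return null;

  // Sort a copy so the caller's array is left untouched; highest score first.
  const sorted = [...classifications].sort((a, b) => b.value - a.value);

  // If the two best scores are (nearly) equal, refuse to pick one arbitrarily.
  if (sorted.length > 1 && Math.abs(sorted[0].value - sorted[1].value) <= epsilon) {
    return null;
  }
  return sorted[0].label;
}

const tied = pickLabel([
  { label: 'good', value: 0.5 },
  { label: 'bad', value: 0.5 },
]);
const clear = pickLabel([
  { label: 'bad', value: 0.3 },
  { label: 'good', value: 0.7 },
]);
```

Here `tied` is null and `clear` is 'good'. Whether null is the right signal for callers, as opposed to returning the first label found, is a design choice this sketch does not settle.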

Thanks for the work on this. I plan to use this in a Hack Day project at Yammer. :)


chrisumbel commented May 2, 2012

sounds reasonable so far. i'll look this over within the next 48 hours or so as i'm a bit backed up now. thanks for the contribution regardless!

One year later


chrisumbel commented Oct 8, 2013

Yeah, I'm looking for someone to take over the operations here as I don't have time to attend to issues much these days. In the meantime I'll try to have a look within the coming days.

I'll have to review the logic and resolve conflicts so it won't be super quick.

Sorry again.

@chrisumbel chrisumbel merged commit 91b9387 into NaturalNode:master Nov 1, 2013
