-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Initial TD-IDF work #14
Conversation
Bump ? |
Thanks, will review in the next days. Please run |
Oh ofcourse ! Will run and update pull request ! On 20 Aug 2016 09:30, "Jake Brukhman" notifications@github.com wrote:
|
seen int // docs seen | ||
datas map[Class]*classData | ||
tfIdf bool | ||
didConvertTDIDF bool //we can't classify a TD-IDF classifier if we haven't yet |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to didConvertTfIdf
; fix comment, should read: TF-IDF
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add space after //
.
Thanks for your patch. Added some comments. Please add some tests. |
@jbrukh No problem. Will add in the next few days ! |
@@ -37,7 +43,7 @@ See the GoPkgDoc documentation [here](https://godoc.org/github.com/jbrukh/bayesi | |||
- Statistics. | |||
|
|||
------------ | |||
### Example | |||
### Example 1 (plain no td-idf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change to "tf-idf".
@@ -142,7 +149,7 @@ func (d *classData) getWordsProb(words []string) (prob float64) { | |||
// NewClassifier returns a new classifier. The classes provided |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update docs.
func (c *Classifier) ConvertTermsFreqToTFIDF() { | ||
|
||
if c.didConvertTfIdf { | ||
panic("Error:TfIdf-Mode:Cumulative counts - can only call this once. Reset and relearn.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change message to "Cannot call ConvertTermsFreqToTfIdf more than once. Reset and relearn to reconvert."
Thanks for the work. Mostly looks good. Please resolve conflicts with Thank you! |
Cool did resolve some on local branch. Having some difficulties ! Going to On 24 Aug 2016 21:31, "Jake Brukhman" notifications@github.com wrote:
|
Added in TD-IDF support to use instead of plain word counts ! Works very well ! Lots of extra accuracy.
See updated README for usage.