Bias term? #60

Closed
usptact opened this issue Feb 5, 2016 · 4 comments
Comments

usptact commented Feb 5, 2016

I sometimes use CRFSuite for document classification. All the features of a document go on a single line, with the label as the first token, as the data format requires.
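For illustration, a minimal sketch of that layout, assuming CRFSuite's tab-separated text format in which each line is one item (label first, then attributes) and a blank line ends a sequence, so each document becomes a sequence of length one. The helper names and the `w=` attribute prefix are hypothetical, chosen only for this example.

```python
def escape_attr(name: str) -> str:
    # ':' separates an attribute name from its value in the text format,
    # so escape backslashes and colons that appear inside a name.
    return name.replace("\\", "\\\\").replace(":", "\\:")


def document_to_item(label: str, tokens: list[str]) -> str:
    # One line per document: the label, then one attribute per token.
    attrs = [escape_attr(f"w={tok.lower()}") for tok in tokens]
    return "\t".join([label] + attrs)


def to_crfsuite_text(docs: list[tuple[str, list[str]]]) -> str:
    # The blank line after each item closes the one-item sequence.
    return "".join(document_to_item(lab, toks) + "\n\n" for lab, toks in docs)


if __name__ == "__main__":
    sample = [("sports", ["Great", "match"]), ("finance", ["Rates", "rise"])]
    print(to_crfsuite_text(sample), end="")
```

The resulting text could be saved to a file and passed to the trainer; how the attributes are derived from the document is up to the feature design.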

In the classic logistic regression setup, one fits the model by finding the parameters: a coefficient matrix theta (number of features x number of classes) and a bias term. CRFSuite gives the coefficient matrix but no bias term. Is the bias necessary for classification?

After all, according to some seminal papers on sequence analysis, a CRF is just a generalization of logistic regression to sequences.
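To spell out that connection, here is a sketch assuming a first-order CRF whose score is a sum of state-feature and transition-feature weights. The symbols $w$, $b$, and $x_0$ are notation introduced for this example, not CRFSuite's. For a sequence of length one there are no transitions, so only the state features remain and the model takes the form of multinomial logistic regression without a bias:

```latex
p(y \mid x) = \frac{\exp\left(\sum_{k} w_{y,k}\, x_k\right)}
                   {\sum_{y'} \exp\left(\sum_{k} w_{y',k}\, x_k\right)}
```

Adding a constant feature $x_0 = 1$ to every example contributes a per-class term that plays the role of the bias $b_y$:

```latex
\sum_{k} w_{y,k}\, x_k + w_{y,0}\, x_0
  = \sum_{k} w_{y,k}\, x_k + b_y ,
\qquad b_y = w_{y,0}, \quad x_0 = 1
```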

Thanks

Contributor

kmike commented Apr 9, 2016

@usptact you can add a feature that is 1 for all training examples; that'd be a bias feature.
One difference is that the bias is usually not regularized, whereas this feature will be regularized like any other feature. If that's important, you can use a value like 100 instead of 1; the effect will be similar to not regularizing the bias.
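A rough sketch of why the larger value helps, assuming an L2 penalty of the form `l2_coeff * weight**2` and CRFSuite's `name:value` syntax for scaled attributes. The function names, the attribute name `bias`, and the parameter `l2_coeff` are hypothetical, used only for this illustration. If the constant feature has value `scale`, a weight of `b / scale` yields the same effective bias `b`, so the penalty paid for that bias shrinks by a factor of `scale**2`.

```python
def with_bias(attrs, scale=1.0):
    """Return a copy of attrs with a constant 'bias' attribute appended."""
    return list(attrs) + [f"bias:{scale:g}"]


def l2_cost_of_bias(effective_bias, scale, l2_coeff=1.0):
    """L2 penalty paid to realize a given effective bias.

    The model's contribution is weight * scale, so the weight needed
    for a given effective bias is effective_bias / scale.
    """
    weight = effective_bias / scale
    return l2_coeff * weight ** 2


if __name__ == "__main__":
    print(with_bias(["w=great", "w=match"], scale=100))
    for scale in (1.0, 100.0):
        print(scale, l2_cost_of_bias(2.0, scale))
```

The same constant attribute, with the same value, would also need to be added to every example at prediction time, since the learned weight only acts when the feature is present.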

Contributor

jlerouge commented Apr 9, 2016

@usptact Yes, the bias is important, especially if your classes are unbalanced. Mikhail has already explained how to add it.

Apart from your initial question, I wonder what the point is of using a CRF to classify a single document. You could use other frameworks designed more specifically for this purpose (neural networks, SVMs...).

Author

usptact commented Apr 10, 2016

@kmike @jlerouge Thank you both for the input! I now have a better understanding of the role of the bias term. The reason I am doing this is that CRFSuite is now embedded in our NLP pipeline. We train many CRF models, primarily for NER but also for Chinese segmentation and domain classification (classifying short documents). The advantage is that the same tool is used everywhere, and it is very fast to train.

Author

usptact commented Apr 10, 2016

@jlerouge Another reason for using linear logistic regression is that it is perfectly sufficient for the task. We found very little gain (if any) from non-linear techniques for short text classification. Short texts here are documents of 3-10 words.

usptact closed this as completed Jun 7, 2016