Bias term? #60

usptact · 2016-02-05T19:01:27Z

Sometimes I use CRFSuite for document classification. All the features for a document are simply put on a single line, where the label is the first token on that line, as defined by the format.

In the classic logistic regression setup, one fits the model by finding the parameters: theta (a matrix of size number of features x number of classes) and a bias term. CRFSuite gives the former matrix of coefficients but no bias term. Is a bias term necessary for classification?

After all, according to some seminal papers on sequence analysis, a CRF is just a generalization of logistic regression to sequences.
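
To make the comparison concrete, here is a sketch of the two models side by side. It assumes a one-item sequence (one document per sequence), so that transition features do not come into play; the symbols $x_k$, $\theta_{k,y}$ and $b_y$ are notation chosen here for illustration, not taken from the CRFSuite documentation.

$$
p_{\text{CRF}}(y \mid x) = \frac{\exp\left(\sum_{k=1}^{K} \theta_{k,y}\, x_k\right)}{\sum_{y'} \exp\left(\sum_{k=1}^{K} \theta_{k,y'}\, x_k\right)},
\qquad
p_{\text{LR}}(y \mid x) = \frac{\exp\left(b_y + \sum_{k=1}^{K} \theta_{k,y}\, x_k\right)}{\sum_{y'} \exp\left(b_{y'} + \sum_{k=1}^{K} \theta_{k,y'}\, x_k\right)}
$$

Under that assumption the only difference is the per-class offset $b_y$. Writing $b_y = \theta_{0,y}\, x_0$ with a constant input $x_0 = 1$ turns the second form into the first with one extra feature.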

Thanks

kmike · 2016-04-09T06:09:35Z

@usptact you can add a feature which is 1 for all training examples; that'd be a bias feature.
One difference is that the bias is usually not regularized, whereas this feature will be regularized like any other feature. If that matters, you can use a value like 100 instead of 1; the effect will be similar to not regularizing the bias.
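
A minimal sketch of what this could look like when generating training data, assuming the tab-separated CRFsuite text format in which the label comes first and each attribute may carry a scaling value after a colon. The helper names (`escape`, `to_crfsuite_item`, `write_documents`) and the attribute name `bias` are hypothetical choices for illustration, not part of CRFsuite.

```python
def escape(name):
    """Escape backslashes and colons, since ':' separates an attribute
    name from its scaling value in the assumed data format."""
    return name.replace("\\", "\\\\").replace(":", "\\:")


def to_crfsuite_item(label, features, bias_value=1.0):
    """Build one tab-separated item line: label, then name:value attributes.

    A constant attribute called 'bias' (a hypothetical name) is appended to
    every item so that the model can learn a per-class offset for it.
    """
    fields = [label]
    for name, value in features.items():
        fields.append(f"{escape(name)}:{value}")
    # Same attribute, same value, on every single item.
    fields.append(f"bias:{bias_value}")
    return "\t".join(fields)


def write_documents(docs, bias_value=1.0):
    """Serialize (label, features) pairs as one-item sequences.

    Each document is its own sequence, so every item line is followed
    by an empty line that ends the sequence.
    """
    return "".join(
        to_crfsuite_item(label, feats, bias_value) + "\n\n"
        for label, feats in docs
    )


if __name__ == "__main__":
    docs = [
        ("sports", {"w=goal": 1.0, "w=team": 1.0}),
        ("finance", {"w=stock": 1.0}),
    ]
    # bias_value=100.0 follows the suggestion above for weakening
    # the effect of regularization on the bias weight.
    print(write_documents(docs, bias_value=100.0), end="")
```

The reason a larger constant helps: if the attribute value is c, a weight w contributes w * c to the score, so reaching a given offset b only needs w = b / c. With an L2 penalty the cost of that weight is then divided by c squared, which is why a value like 100 behaves almost like an unregularized bias. The same value has to be used at prediction time, and the name `bias` must not collide with a real feature.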

jlerouge · 2016-04-09T06:52:31Z

@usptact Yes, the bias is important, especially if you have unbalanced classes. Mikhail already told you how to do it.

Independently of your initial question, I wonder what the point is of using a CRF to classify a single document. You could use other frameworks designed more specifically for this purpose (neural networks, SVMs...).

usptact · 2016-04-10T06:25:08Z

@kmike @jlerouge Thank you guys for the input! I now understand the role of the bias term better. The reason I am doing this is that CRFSuite is now an embedded piece of software in our NLP pipeline. We train many CRF models, primarily for NER but also for Chinese segmentation and domain classification (classifying short documents). The advantage is that the same tool is used everywhere and it is very fast to train.

usptact · 2016-04-10T06:27:05Z

@jlerouge Another reason for using linear logistic regression is that it is perfectly sufficient for the task. We found very little gain (if any) from non-linear techniques for short text classification. Short texts here are documents of 3-10 words.

usptact closed this as completed Jun 7, 2016
