add a parameter to select the N best features #8

Closed
kosloot opened this issue May 16, 2018 · 4 comments

@kosloot
Contributor

kosloot commented May 16, 2018

For tasks with a lot of features, it might be handy to let Timbl select the top N features, based on their current weights, and use only those to build the tree. All the other features would then get an implicit -mI.

E.g. a --cutoff 1000 would select the 1000 'best ranked' features, assuming more than 1000 are available :)

Comments welcome.

@antalvdb
Member

No comments, other than that it's a good idea.

@kosloot
Contributor Author

kosloot commented May 21, 2018

A small note: the feature weights are at first calculated using ALL available features.
After choosing the N best, the weights have to be recalculated for only those N.
Otherwise surprises might happen.
For instance: when restoring an IB1 tree from a file, Timbl calculates the weights based on the info in the file (which also does NOT take the ignored features into account).
These weights should match those of the tree prior to storing.

So the scheme is:

  • calculate the weights
  • select the N best
  • recalculate the weights, with a -mI:k for all features NOT in the N best.
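
The three steps above can be sketched in a few lines of Python. This is only an illustration, not Timbl's code: `select_n_best` and `info_gain_weights` are hypothetical names, and plain information gain is used as a stand-in for whatever feature weighting Timbl is configured with.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def info_gain_weights(rows, labels, active):
    """Stand-in weighting (an assumption): information gain per active feature."""
    base = entropy(labels)
    weights = {}
    for f in active:
        # Group the class labels by the value this feature takes.
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[f], []).append(label)
        remainder = sum(len(part) / len(labels) * entropy(part)
                        for part in by_value.values())
        weights[f] = base - remainder
    return weights


def select_n_best(rows, labels, n, weigh=info_gain_weights):
    """Return (kept, ignored, weights) following the three-step scheme."""
    all_features = list(range(len(rows[0])))
    # Step 1: calculate the weights over ALL features.
    first_pass = weigh(rows, labels, all_features)
    # Step 2: select the N best (ties broken by feature index).
    ranked = sorted(all_features, key=lambda f: (-first_pass[f], f))
    kept = sorted(ranked[:n])
    ignored = sorted(ranked[n:])
    # Step 3: recalculate the weights with all other features ignored.
    final = weigh(rows, labels, kept)
    return kept, ignored, final


rows = [
    ("a", "x", "p"),
    ("a", "y", "q"),
    ("b", "x", "p"),
    ("b", "y", "q"),
]
labels = ["yes", "yes", "no", "no"]
kept, ignored, weights = select_n_best(rows, labels, n=1)
print(kept, ignored, weights)
```

With this stand-in the second pass returns the same values as the first, because information gain is computed per feature. The step is kept to mirror the scheme, since a real weighting pass may depend on which features are ignored.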

kosloot removed the question label Jun 4, 2018
@kosloot
Contributor Author

kosloot commented Jun 4, 2018

A '--limit' option is implemented now.
Simple tests succeeded, but more thorough tests are welcome.

@kosloot
Contributor Author

kosloot commented Nov 28, 2018

No feedback yet. I assume this is working.

kosloot closed this as completed Nov 28, 2018