textrecipes tuning parameters #16
Comments
|
Could you change
Sure. We have qualitative parameters in other models too. |
Done. Changed to *_times.
Then I have these additions.
|
Would a whole step be |
|
I've thought about the issue of including a step or not. We could add an |
|
That would be great! |
|
Should |
|
Mainly |
|
Mind if I default |
|
That should be fine. |
|
Take a look at this commit and let me know if the default ranges (or anything else) should be changed. |
|
Looks good. |
|
Gak. I think that we need large numbers instead of
> max_times
Maximum Token Frequency (quantitative)
Range: [1, Inf]
> grid_random(max_times, size = 5)
Error in min(unlist(object$range)):max(unlist(object$range)) :
  result would be too long a vector
What should we put in? We could do:
> .Machine$integer.max
[1] 2147483647
> library(dials)
> max_times
Maximum Token Frequency (quantitative)
Range: [1, 2147483647]
> grid_random(max_times, size = 5)
# A tibble: 5 x 1
max_times
<int>
1 1024987753
2 2080355927
3 1342632065
4 48813909
5 85432412
Maybe something smaller like |
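For anyone following along, here is a rough base R sketch of why an `Inf` upper bound trips the error above while a finite cap does not. It is not how `grid_random()` is implemented; `range_inf`, `upper_cap`, and `draws` are made-up names, and `sample.int()` only stands in for whatever sampler the package actually uses.

```r
# Hypothetical stand-in for the printed range [1, Inf].
range_inf <- list(lower = 1, upper = Inf)

# Materializing lower:upper cannot work with an infinite upper bound;
# capture the error message instead of stopping.
seq_error <- tryCatch(
  range_inf$lower:range_inf$upper,
  error = function(e) conditionMessage(e)
)
print(seq_error)

# With a finite integer cap, values can be drawn directly,
# without ever building the full sequence in memory.
upper_cap <- .Machine$integer.max
set.seed(1)
draws <- sample.int(upper_cap, size = 5)
print(draws)
```

A smaller cap would keep the sampled values in a more realistic range for token counts; the mechanics stay the same.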
|
So in essence |
|
merged PR |
In accordance with tidymodels/textrecipes#14, here are my thoughts on what should be tunable.
- step_texthash:
  - num_terms integer.
- step_tf:
  - weight numeric.
- step_tokenfilter:
  - max numeric.
  - min numeric.
  - max_tokens integer.

Question.
Would something like weight_scheme in step_tf be tunable, given that it takes one of several character values (method names)?
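To make the question concrete, a character-valued option can be treated as one more column in a tuning grid. Below is a minimal base R sketch, not the dials API. The `weight_scheme` values are what I believe `step_tf` accepts and should be checked against its documentation, and the `weight` candidates are invented for illustration.

```r
# Assumed option values for weight_scheme; verify against the step_tf docs.
weight_scheme_values <- c("raw count", "binary", "term frequency",
                          "log normalization", "double normalization")

# Hypothetical candidate values for the numeric weight argument.
weight_values <- c(0.25, 0.5, 0.75)

# Regular grid: every combination of the qualitative and numeric values.
tune_grid <- expand.grid(
  weight_scheme = weight_scheme_values,
  weight = weight_values,
  stringsAsFactors = FALSE
)
print(nrow(tune_grid))
print(head(tune_grid))
```

The only difference from a quantitative parameter is that the candidates come from a fixed set of labels rather than a numeric range.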