Question / suggestion to use multiple n-grams to get more features #76

Open
iibarant opened this issue Nov 23, 2021 · 0 comments

Hi @Bergvca and @ParticularMiner,

Hope you are doing well.

I'm working on the same project again and have a question / suggestion: would it be possible to use multiple n-gram sizes to get more features? Currently there is a single setting, ngram_size: "The amount of characters in each n-gram. Default is 3."

What if ngram_size could be given as a list like [2, 3, 4], so that the feature vector combines the components for n-grams of size 2, 3, and 4?
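As a rough illustration of the idea, here is a minimal sketch in plain Python. The helper name `multi_ngrams` and its `ngram_sizes` parameter are hypothetical and not part of string_grouper; the sketch only shows how character n-grams of several sizes could be pooled into one feature list, assuming no other preprocessing of the string.

```python
from collections import Counter


def multi_ngrams(text, ngram_sizes=(2, 3, 4)):
    """Return the character n-grams of `text` for every size in `ngram_sizes`.

    Hypothetical helper for illustration only; not string_grouper's API.
    """
    grams = []
    for n in ngram_sizes:
        # Slide a window of width n over the string; a string shorter
        # than n contributes no n-grams of that size.
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


# Each distinct n-gram would become one vector component.
features = Counter(multi_ngrams("hello"))
```

A callable like this could in principle be passed to scikit-learn's `TfidfVectorizer` through its `analyzer` parameter. Whether string_grouper could accept it this way is an assumption that the maintainers would need to confirm.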

What do you think?

By the way, the string_grouper approach is really good in terms of speed and efficiency. Great work!

Thank you,
iibarant
