Unnecessary blank lines found in stopwords.txt of SmartChineseAnalyzer #12291
Comments
Good catch! Could you submit a PR to fix that?
In general, I'd suggest figuring out whether we should change the stopword file parser to strip
I think the stoplist loader already ignores comment lines but does not ignore empty lines! The darned empty string rears its head at us again...
Hi @tang-hi, I can submit a PR to fix this issue. How about skipping blank lines before
and removing the blank lines from stopwords.txt? What do you think?
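As a rough illustration of the "skip blank lines while loading" idea, here is a minimal standalone sketch in plain Java. It is not Lucene's loader: the class `StopwordLoaderSketch`, its `load` method, and the `//` comment marker used in the usage note are assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical standalone sketch; this is NOT Lucene's word list loader.
final class StopwordLoaderSketch {

    private StopwordLoaderSketch() {}

    /**
     * Reads one stop word per line. Each line is trimmed with String.trim();
     * lines that are empty after trimming, or that start with commentPrefix,
     * are skipped. commentPrefix is assumed to be non-empty.
     */
    static Set<String> load(Reader source, String commentPrefix) {
        Set<String> words = new LinkedHashSet<>();
        // BufferedReader.lines() reports I/O failures as UncheckedIOException.
        new BufferedReader(source).lines()
            .map(String::trim)
            .filter(line -> !line.isEmpty())                 // skip blank lines
            .filter(line -> !line.startsWith(commentPrefix)) // skip comment lines
            .forEach(words::add);
        return words;
    }
}
```

For example, `StopwordLoaderSketch.load(new java.io.StringReader("// header\nfoo\n\nbar\n"), "//")` yields a set containing only `foo` and `bar`, with no empty-string entry. Filtering in the loader protects against blank lines in any stopword file, while cleaning stopwords.txt fixes only this one file, so doing both seems reasonable.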
Also here:
Description
Hi team,
This issue is a spin-off from the java-user list thread.
The stopwords.txt of SmartChineseAnalyzer contains two blank lines, at L56 and L58. As a result, `SmartChineseAnalyzer.getDefaultStopSet()` will produce an empty-string stop word, but it makes no sense to have an empty string as a stop word. Maybe we can improve it?
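To locate blank lines like these before cleaning up a word list, a small scan can help. The following is a sketch in plain Java under stated assumptions: `BlankLineFinder` and `blankLineNumbers` are hypothetical names for this example and are not part of Lucene.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of Lucene.
final class BlankLineFinder {

    private BlankLineFinder() {}

    /** Returns the 1-based numbers of lines that are empty or whitespace-only. */
    static List<Integer> blankLineNumbers(Reader source) {
        // BufferedReader.lines() reports I/O failures as UncheckedIOException.
        List<String> lines = new BufferedReader(source).lines()
            .collect(Collectors.toList());
        List<Integer> blanks = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            if (lines.get(i).trim().isEmpty()) {
                blanks.add(i + 1); // convert the 0-based index to a line number
            }
        }
        return blanks;
    }
}
```

Passing a reader over a word list file returns the line numbers to inspect; for the input `"a\n\nb\n  \nc"` the result is `[2, 4]`.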