Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TermGenerator do not stop stemmed term. #173

Conversation

gauravaror
Copy link
Member

It seems that the TermGenerator? (with STOP_ALL and STEM_ALL strategy) do not stop stemmed term.

For example, with french stemmer and "le" in the stopwords, le term "lea" will be stemmed to "le" and "le" term will be added to the document.

Looking into termgenerator_internal.cc (method index_text), it seems that the stopper in never called on the stem, whatever the flags are.

We are using version 1.4.2 but the stopper is never called on the stem even on git master.

Fixed the issue by changing code and added a test case to fix this issue.

@gauravaror
Copy link
Member Author

Ah! It picked up old perl pod commits aswell i thought i created branch from the master but seems like i branched from another place.

Shall i recreate the branch?

@gauravaror
Copy link
Member Author

Opened a new pull request with only this change

#174

@gauravaror gauravaror closed this Jul 26, 2017
@ojwb
Copy link
Contributor

ojwb commented Jul 28, 2017

There's really no need to open a new PR in this situation - you can just rebase the branch onto master and then force push it to update the existing PR.

Also, you really should look over the diff that's shown before confirming you want to create the PR.

We rightly moan at the GSoC students for this sort of thing!

@gauravaror
Copy link
Member Author

The two extra commit was not in master but in some other branch. I was not sure how would rebasing the branch onto master will fix that. Since it will still have those two extra commits from other PR i created.

I think i rebased the branch onto master.

The only way I saw was to revert those commits. Which was unnecessarily 4 commits added two earlier and two to revert that commit's from the branch. Hence decided to make the new PR.

yeah! my bad. I think I was bit distracted while creating the PR and thought those two commits from different PR was merged and forgot why to wonder why they were here. Will try to not have this issue furtheron. Sorry

@gauravaror gauravaror deleted the 750_termgenerator_do_not_stop_stemmed_terms branch July 28, 2017 07:11
@jaylett
Copy link
Contributor

jaylett commented Jul 28, 2017

@samuelharden if you'd done an interactive rebase onto master, you could have dropped the surplus commits then force pushed onto the PR branch.

@gauravaror
Copy link
Member Author

oops true!!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants