Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Close #26771: beider_morse phonetic encoder failure when languageset unspecified #26848

Closed
wants to merge 1 commit into from

Conversation

bkazez
Copy link

@bkazez bkazez commented Oct 1, 2017

This closes #26771. Based on jpountz's analysis: check languageset is neither null nor the empty array.

The previous behaviour was for an empty languageset array to cause the original tokens to be returned. Now, an empty languageset array will function the same as a non-existent languageset and trigger automatic language detection. This ensures that beider_morse phonetic encoder always returns a phonetic encoding of SOME sort, and that it matches the documented behavior.

@karmi
Copy link
Contributor

karmi commented Oct 1, 2017

Hi @bkazez, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in yout Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@bkazez
Copy link
Author

bkazez commented Oct 1, 2017

I added the other email address to my github profile. This should fix the CLA check failure.

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bkazez thanks for opening this PR, the fix looks good at first looks, but can you also add a unit test that checks the expected behaviour? Some of the other TokenFilterFactoryTests in org.lasticsearch.index.analysis are probably a good start.

@cbuescher cbuescher self-assigned this Oct 2, 2017
@cbuescher cbuescher added :Search/Analysis How text is split into tokens v7.0.0 labels Oct 2, 2017
@cbuescher
Copy link
Member

@bkazez also regarding the failing CLA check, I was able to see your Github user and the associated email in our CLA database, but its not the email adress you used in the commit, so can you re-check that your Github profile contains the email adress you signed the CLA with or change the commit email adress accordingly?

Copy link
Member

@cbuescher cbuescher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bkazez the CLA check looks good now, if you could add a unit test for this change we can merge it. If you have any questions please let me know.

@cbuescher
Copy link
Member

@bkazez are you still interested to work on this? I think adding a simple unit test is enought to get this in.

@bkazez
Copy link
Author

bkazez commented Oct 21, 2017 via email

@cbuescher
Copy link
Member

Hi @bkazez, thanks for letting us know. I hope you mind me closing this PR then and open a new PR. In case you have time to look at the tests that I added there, you could verify that it fixes your issue reported in #26771.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Search/Analysis How text is split into tokens v7.0.0-beta1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Beider_morse phonetic encoder silently fails when languageset not specified
6 participants