Added normalization option to Query Rescorer #9535
Thanks for the PR. We'll take a look and get back to you. @s1monw could you review this please? |
… has a size of 0). Added test for this scenario
Apologies that it has taken so long to look at this. Got lost in the interwebs. Would you be interested in updating the PR against master? If so, please could I also ask you to sign the CLA? http://www.elasticsearch.org/contributor-agreement/ thanks |
Not a problem; it's a pretty minor thing compared with 2.0. I can work on updating the PR against what's current on master. As for the CLA, I'm not sure why that didn't get updated. I re-signed the Adobe document on Feb 15th 2016: "Individual Contributor License Agreement between elasticsearch BV and Ross Lieberman is Signed and Filed!" |
Found it. You used a different email address, but I've added your github address now. thanks |
@s1monw I think this one still needs review now that the CLA has been signed |
I'll see if I'm able to knock it out today. I haven't worked with On Wed, Apr 6, 2016 at 5:14 PM, Lee Hinman notifications@github.com wrote:
# Conflicts: # src/main/java/org/elasticsearch/search/rescore/QueryRescorer.java # src/main/java/org/elasticsearch/search/rescore/RescoreBuilder.java # src/test/java/org/elasticsearch/search/rescore/QueryRescorerTests.java
There has been some massive restructuring of the rescorer (for the better). I'm looking to apply my changes, but first I wanted to get the new master branch to compile, which seems impossible right now given the incompatibilities between Gradle and IntelliJ. I just spent the last 4 hours trying to get IntelliJ to build on my Mac and found that the cause was JAVA_HOME not being picked up from my profile. With that resolved, I'm now getting thousands of "import reference not found" errors throughout the project, even though the referenced classes are clearly part of the solution: I can go to the folders and find them manually. I'm going to try Eclipse tomorrow to see if that resolves this. |
Hmm...I'm not sure what this is about? I've never had issues with this but I really want to help you! What happened here?
Are you using IntelliJ 2016.1? Sadly, it's broken. If you're a die-hard IntelliJ user, your best bet is to continue to use IntelliJ 15 if you want to work with Elasticsearch. |
That solves my problem! Thanks Jaden, I just downloaded everything today On Fri, Apr 8, 2016 at 7:46 PM, Jason Tedor notifications@github.com
Replaced the default Lucene rescore logic with a version that normalizes the calculated scores when a new boolean option, normalize, is set. Removing the default logic also cuts down on the number of loops the rescorer performs. Added a new ScoreMode, Replace, which takes the rescored calculation and replaces the original score with the new value.
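For anyone following along, the idea in this commit could look roughly like the sketch below. It is an illustration only, not the code from this PR: it assumes min-max scaling into [0, 1] (the thread only says that minimum and maximum values are gathered), and the class `RescoreNormalizationSketch` and its methods `normalize` and `replace` are hypothetical names that do not exist in Elasticsearch or Lucene.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the implementation from this pull request. */
final class RescoreNormalizationSketch {

    /** Rescales scores into [0, 1] using min-max normalization (an assumed scheme). */
    static float[] normalize(float[] scores) {
        // An empty result set has nothing to normalize.
        if (scores.length == 0) {
            return new float[0];
        }
        // Track min and max in a single loop instead of separate min/max scans.
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (float s : scores) {
            min = Math.min(min, s);
            max = Math.max(max, s);
        }
        float range = max - min;
        float[] out = new float[scores.length];
        for (int i = 0; i < scores.length; i++) {
            // If every score is equal the range is 0; map all to 1 to avoid dividing by zero.
            out[i] = range == 0f ? 1f : (scores[i] - min) / range;
        }
        return out;
    }

    /** Assumed meaning of a "Replace" score mode: discard the original score. */
    static float replace(float originalScore, float rescoreScore) {
        return rescoreScore;
    }

    public static void main(String[] args) {
        float[] rescored = {2.0f, 10.0f, 6.0f};
        System.out.println(Arrays.toString(normalize(rescored)));
    }
}
```

Mapping equal scores to 1 is an arbitrary choice made for the sketch; the PR may handle that case differently.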
Sorry about mistyping your name, @jasontedor. It was a long, frustrating 12-hour work day on Friday, and I was heading out the door when your reply came in; it did help me. It turns out my problem was that I had to open the Gradle window and hit the refresh button, even though I had checked the auto-import option when I imported the project after running the "gradle import" command. After that I ran into the JarHell issue with ant-javafx.jar, but once that was removed everything worked (in case anyone else hits those issues, this is how I resolved them). @clintongormley @s1monw I updated the normalization code to the current master branch. It is mostly the same as before, although I cleaned up some of the code since I had the opportunity (e.g. instead of calling Collections.min() and Collections.max(), I grab the values during the initial loop to reduce extra lookups). You may also want @cbuescher to take a look, since he was the last to touch this section when he abstracted some of the QueryRescorer logic back in January. A few notes:
Let me know if there's anything else I can do or any changes you'd like to see. |
Hi @RossLieberman, we have found your signature in our records, but it seems you signed with a different e-mail than the one used in your Git commit. Can you please add both of these e-mails to your GitHub profile (they can be hidden), so we can match your e-mails to your GitHub profile? |
@karmi I re-added those email addresses to my GitHub Account. Let me know if that resolves the issue. |
Hi @RossLieberman, it looks like some commits were committed with a different address than the one you signed with, have a look: https://github.com/RossLieberman/elasticsearch/commit/da5d6437d34138023ae7050cf5edd69846a53ecf.patch It is generally enough to add those e-mails to your GitHub profile (they can be hidden, it's just a way for GitHub to match your profile with commits), and the CLA checker is happy. Alternatively, you can commit with |
Thanks @karmi, I didn't realize the email address being checked was the one in the commits, not just the one tied to the pull request on my GitHub account and the CLA. Yesterday I added several aliases (+elasticsearch, +github) from my former company (hence the email address change), and I also signed a new CLA with the email address used in the commit and added that address to GitHub. Hopefully this resolves things. If it still doesn't, I'll try going down the road of |
@RossLieberman, the check is now passing, so your approach has worked -- thanks!! |
@s1monw is this still applicable? If so, I think it still needs a review |
@RossLieberman Are you still interested in getting this PR in? It has merge conflicts that need to be addressed. |
Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually? |
I subscribed to this issue a long time ago because I think it is useful. I'm not currently using the rescorer, but I might be soon. If @RossLieberman does not get back and the community thinks it is useful, I can help with the conflicts (and perhaps address other issues originally mentioned). |
@brusic given that this hasn't seen updates in quite a while, it sounds like @RossLieberman isn't going to get back to it. I'm going to close this issue; if you want to pick it up, I think opening a new PR would be the best thing to do. |
Still have no immediate need for reacting, but I would not mind seeing what the merge conflicts are. It has been a long time since I have contributed anything, but I do have something lined up (now that an upstream fix has been made). |
My rescored results varied so much from my original searches that I needed a way to normalize the data so that the rescoring wouldn't overpower the TF-IDF scores.
I also removed a comment from the previous commit: // TODO: shouldn't this be up to the ScoreMode? I.e., we should just invoke ScoreMode.combine, passing 0.0f for the secondary score
This should probably be up to the developer, since score modes such as Average or Multiply will treat a non-match as 0 and could behave differently than expected.
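To make the concern concrete, here is a small sketch of what happens when a non-matching document is passed through a combine step with 0.0f as its secondary score. The enum `ScoreModeSketch` and its `combine` method are hypothetical stand-ins written for this illustration; they are not Elasticsearch's actual score mode type, and the formulas are the obvious ones assumed from the mode names.

```java
/** Hypothetical stand-in for a rescorer score mode; not Elasticsearch's actual enum. */
enum ScoreModeSketch {
    TOTAL {
        @Override
        float combine(float primary, float secondary) {
            return primary + secondary;
        }
    },
    AVG {
        @Override
        float combine(float primary, float secondary) {
            return (primary + secondary) / 2f;
        }
    },
    MULTIPLY {
        @Override
        float combine(float primary, float secondary) {
            return primary * secondary;
        }
    };

    /** Combines the original (primary) score with the rescore (secondary) score. */
    abstract float combine(float primary, float secondary);
}

final class NonMatchSketch {
    public static void main(String[] args) {
        float original = 4.0f;
        // Assumption: a document that the rescore query does not match gets 0.0f.
        float nonMatch = 0.0f;
        for (ScoreModeSketch mode : ScoreModeSketch.values()) {
            System.out.println(mode + " -> " + mode.combine(original, nonMatch));
        }
    }
}
```

Under these assumed formulas, a sum leaves the original score alone, an average halves it, and a product wipes it out entirely, which is why blindly passing 0.0f for non-matches may surprise the caller.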
If you are interested, let me know of any feedback or changes; I'd be more than happy to make them.