Notes and code for the London & Montreal Lucene Hackdays 2018

london-hackday-2018

Notes and code for the London Lucene Hackday 2018 (see below for notes on the Montreal hackday, which followed a week later).

Task 1: How can we build a command-line application for inspecting Lucene indexes? Some of these may be very large and use versions of Lucene from 4.x upwards. Considering various existing projects such as:

Results: The team reported that they had managed to get Marple reading indexes from Lucene versions 4, 5, 6, 7 and 8. They also added a 'Query' tab to the UI so that queries can be sent to the index under inspection. Some of the team will continue this work on a branch after the hackday, and we expect it to be merged back into the main development branch at some point.

Task 2: Review Alessandro's various Lucene and Solr JIRA tickets:

Reference for the Existing Bugs tracks:

Results: The team were taken through the process of reporting bugs and creating patches on the Lucene/Solr JIRA repository, and looked in depth at one of the issues.

Task 3: Review the issue 'Different Solr replicas give different result positions' from https://github.com/flaxsearch/london-hackday-2016 (item 3). What's the current state of play with this?

Results: The team managed to reproduce the issue with indexes as small as 50 documents. Different numbers of deleted documents in each replica produced different term statistics, and therefore a different result ordering.
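
The effect can be illustrated with a minimal sketch. It is not the team's reproduction and not Lucene code: it assumes a BM25-style idf formula, and the class name, document frequencies and document counts are made up for illustration. Deleted documents keep contributing to term statistics until their segments are merged, so a replica that still holds three deleted documents containing term "x" sees a higher document frequency for "x" than a replica without them.

```java
// Hypothetical sketch: the BM25-style idf formula applied by hand to made-up numbers.
final class IdfDriftSketch {

    // idf = ln(1 + (docCount - docFreq + 0.5) / (docFreq + 0.5))
    static double idf(long docFreq, long docCount) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // Document A matches only term "x", document B matches only term "y".
    // Term frequency and length normalisation are ignored to keep the sketch small.
    static String topHit(long dfX, long dfY, long docCount) {
        return idf(dfX, docCount) > idf(dfY, docCount) ? "A" : "B";
    }

    public static void main(String[] args) {
        // Replica 1: 50 documents, no deletions pending.
        System.out.println("replica 1 top hit: " + topHit(5, 6, 50));
        // Replica 2: same live documents plus 3 unmerged deleted documents containing "x".
        System.out.println("replica 2 top hit: " + topHit(8, 6, 53));
    }
}
```

With these assumed numbers the two replicas disagree about which document ranks first, even though their live documents are identical.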

René Kriegler comments: "There is an additional issue with floating point operation precision across JVMs. There is actually no guarantee in Java that a math calculation using numbers of type float will result in exactly the same floating point number across JVM instances (even for the same JVM version). We rarely run into issues here, but I’ve seen a Solr custom plugin having problems when calculating scores.

I think there is nothing that prevents the issue in standard Lucene similarity calculations - we are probably just lucky that scores are normally discrete enough. To avoid the issue we would have to use the strictfp (https://www.codejava.net/java-core/the-java-language/java-keyword-strictfp) keyword to mark up components involved in the calculation."
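
As a rough sketch of what marking up a component with strictfp could look like, the hypothetical helper below applies the modifier to a scoring method. The class, method and parameter names are assumptions for illustration and are not part of Lucene or Solr. Note that from Java 17 onwards all floating-point evaluation is strict by default, so the keyword only changes behaviour on older JVMs.

```java
// Hypothetical scoring helper; not part of Lucene or Solr.
final class StrictScoreSketch {

    // strictfp requests strict IEEE 754 evaluation for every float and double
    // operation in this method. On Java 17 and later this is already the default.
    static strictfp float score(float termFreq, float idf, float boost) {
        return termFreq * idf * boost;
    }

    public static void main(String[] args) {
        System.out.println(score(3.0f, 2.5f, 1.2f));
    }
}
```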

montreal-hackday-2018

Thanks to all who attended the Hackdays, our kind hosts Mimecast & Netgovern and our dinner/lunch sponsors Elastic, OneMoreCloud and SearchStax.
