Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Solr6 template upgrade #1504

Merged
merged 55 commits into from May 24, 2017
Merged

Conversation

RabidCicada
Copy link
Contributor

@RabidCicada RabidCicada commented Apr 19, 2017

This is the Major upgrade addressing #1502 and started in #1488 to handle Solr6 (specifically 6.4 and 6.5). It adds new command parameters to build_solr_schema. The two new options allow you to specify a directory to put the config files in, and to automatically reload the core. This involved putting a new HAYSTACK_CONNECTIONS variable in place 'ADMIN_URL'.

One major change was that the template got renamed to schema.xml in the template directory instead of solr,xml

Along the way spelling suggestions format parsing was updated and handled (though the legacy style of returning only one suggestion was maintained -- we now have the plumbing to supply multiple). Upgrades were made to the tests to test spelling suggestion without collation to improve code coverage.

Upgrades were also made to handle django 11 by fixing the context passing to be native dicts in the main code base and the tests.

PEP8 changes were made, though not exhaustively.

Updated the Travis config to only use Trusty images as the others didn't have access to Oracle8 which was also required for newer Solr.
Expanded test matrix to handle Django 1.11

TBD:
Upgrade haystack to handle multiple spelling suggestions for Solr
Something about ExtendedCollation from @acdha's other comments
Expand matrix to handle multiple Solr versions (don't know what a reasonable anchor version is)
probably many more:)

…lr spinup to create a default core using predefined config in server/confdir
…check to select requestHandler and fixed parsing changes needed in core on our side
…han one 'suggestion' but keep current behavior. Update schema definition to get rid of _text_
…pt to try to ping correct location. Adding ping handler
…pt kwargs (search mlt etc). get_spelling_suggestions() will need to be updated
@acdha
Copy link
Contributor

acdha commented Apr 29, 2017 via email

@RabidCicada
Copy link
Contributor Author

Alrighty. I think we are in a good place now. I removed all but our two per-configured files (solrconfig.xml and schema.xml), and we generate from solr. then overlay our changed files.

@RabidCicada
Copy link
Contributor Author

Let me know if there's anything else. I believe everything is in order.

@RabidCicada
Copy link
Contributor Author

Eh? :)

raise CommandError('Could not write {}: {}'.format(solrconfig_path, exc))

if reload_core:
core = settings.HAYSTACK_CONNECTIONS['solr']['URL'].rsplit('/', 1)[-1]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line throws an exception if the core is labeled as 'default' instead of 'solr' in HAYSTACK_CONNECTIONS. Is this the desired behavior?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's fixed in def141c

raise ImproperlyConfigured("'URL' to the core must be specified in the HAYSTACK_CONNECTIONS settings for the backend")
if 'ADMIN_URL' not in settings.HAYSTACK_CONNECTIONS[using]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I can commit on this branch, so I wanted to point out the minor syntax error here!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks - that was correct when authored but it looks like I screwed up the merge

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, just a sloppy split staging those lines. I corrected the history in 03209da so both commits are valid standalone but the end state of the tree is unchanged

@RabidCicada
Copy link
Contributor Author

Thanks guys for the input and editing. I just got out from under a giant project so I'll have more free time soon. Please let me know if there's any more work needed here.

@ahoho
Copy link

ahoho commented May 23, 2017

I migrated our project from the fork on #1315 to this one and haven't encountered any issues beyond the minor one I mentioned above (now fixed). I'll be doing more extensive testing this week and next, and will report any problems I find here.

One thing worth noting is that #1315 had a SolrAdmin class that used the REST API for schema updates (instead of writing to /conf/schema.xml). It was handy, but I don't think it's a necessary feature.

This was referenced May 24, 2017
@acdha acdha merged commit 038dd5f into django-haystack:master May 24, 2017
@ahoho
Copy link

ahoho commented Jun 16, 2017

I'm catching some errors that I believe are related to the new managed-schema and solrconfig.xml. Searches with more complex query syntax return incorrect results, even on the Solr admin page.

In the past, the following two queries were equivalent
bodytext:(+jones -goldberg -fred -"casey jones" -kennedy)
bodytext:(+jones -(goldberg fred "casey jones" kennedy))

When I use DebugQuery, I'm seeing what might be the problem, but I'm not sure.

In the old version, it returns

"rawquerystring":"bodytext:(+jones -(goldberg fred \"casey jones\" kennedy))",
"querystring":"bodytext:(+jones -(goldberg fred \"casey jones\" kennedy))",
"parsedquery":"+bodytext:jone -(bodytext:goldberg bodytext:fred PhraseQuery(bodytext:\"casei jone\") bodytext:kennedi)",
"parsedquery_toString":"+bodytext:jone -(bodytext:goldberg bodytext:fred bodytext:\"casei jone\" bodytext:kennedi)",

And the new version returns

"rawquerystring":"bodytext:(+jones -(goldberg fred \"casey jones\" kennedy))",
"querystring":"bodytext:(+jones -(goldberg fred \"casey jones\" kennedy))",
"parsedquery":"+bodytext:jone -(+bodytext:goldberg +bodytext:fred +PhraseQuery(bodytext:\"casei jone\") +bodytext:kennedi)",
"parsedquery_toString":"+bodytext:jone -(+bodytext:goldberg +bodytext:fred +bodytext:\"casei jone\" +bodytext:kennedi)",

Note the added + inside the parenthesis. I'm not familiar enough with Solr to determine what's causing this issue. Thoughts?

<defaultSearchField>{{ content_field_name }}</defaultSearchField>

<!-- SolrQueryParser configuration: defaultOperator="AND|OR" -->
<solrQueryParser defaultOperator="{{ default_operator }}"/>
Copy link

@ahoho ahoho Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is what's causing the problem. Without it, the error I described does not occur.

<str name="fmap.content">{{ content_field_name }}</str>
</lst>
</requestHandler>

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The analysis tool in the Solr admin also doesn't work because some RequestHanders were missing.

  <requestHandler name="/analysis/field"
                  startup="lazy"
                  class="solr.FieldAnalysisRequestHandler" />
  <requestHandler name="/analysis/document"
                  class="solr.DocumentAnalysisRequestHandler"
                  startup="lazy" />

@RabidCicada
Copy link
Contributor Author

I'll look at this today a little bit. Thanks for the messages.

@SalahAdDin
Copy link

Good news for Oscar and for me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants