Skip to content

change solrpy library to pysolr#2352

Merged
wardi merged 9 commits into
ckan:masterfrom
joetsoi:solrpy-is-dead-long-live-pysolr
May 5, 2016
Merged

change solrpy library to pysolr#2352
wardi merged 9 commits into
ckan:masterfrom
joetsoi:solrpy-is-dead-long-live-pysolr

Conversation

@joetsoi

@joetsoi joetsoi commented Mar 17, 2015

Copy link
Copy Markdown
Contributor

solrpy is unmaintained and by some handwaving not actually realistic benchmarks, slower, even the author of solrpy has said use something else like pysolr

Leonardo Santagada  30/08/2014

I don't think solrpy supports collection api. Talking about not supporting stuff... 
I'm one of the authors of solrpy and I'm thinking why don't we just put a link to pysolr 
and declare this library deprecated? Has someone looked at pysolr and can comment 
if there is any advantage that we still have over it?

https://groups.google.com/forum/#!searchin/solrpy/pysolr/solrpy/OPuV-jxvtRU/ZyHJ5mONIPoJ

from what I recall, solrpy does more xml transformation where as pysolr just requests the json.
pysolr also uses requests instead of handling the underlying connection itself, so our code gets to be a bit cleaner as well.

We do lose the basic http auth config options for solr as pysolr does not support this, I don't mind as this was actually undocumented on solrpy if I recall anyway)

(also because i'm doing this https://github.com/joetsoi/fedora-rpms-for-ckan)

@joetsoi joetsoi force-pushed the solrpy-is-dead-long-live-pysolr branch from c239c3b to 2f52a83 Compare March 17, 2015 11:38
@joetsoi

joetsoi commented Mar 17, 2015

Copy link
Copy Markdown
Contributor Author

(needs to have the legacy tests / new tests stuff fixed.)

@amercader

Copy link
Copy Markdown
Member

@joetsoi This looks good, what's left to do with the tests?

It seems like basic auth could be reimplmemented if necessary, but I agree that we shouldn't worry too much about it for now

@joetsoi

joetsoi commented Apr 9, 2015

Copy link
Copy Markdown
Contributor Author

I meant I need to fix the branch for the new_tests being moved to tests and tests being moved to legacy tests.

Comment thread ckan/lib/search/common.py

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joetsoi can you briefly explain why this is needed?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I recall correctly

  • solrpy requests xml from solr and parses and allows you to search in two different ways
    search (returns datetime types) and raw_search (returns text)
  • pysolr parses the json (which does not provide a datetime type) and returns that in its search results

Since this is ckan, we like being difficult so we obviously use both raw and non raw searches in solrpy in differeent places and expect the results from solr to be a datetime and text other times.

So I had to add the decoding of dates to pysolr for the functions expecting datetime objects and some for when it wasn't

if i recall I lifted the datetime regex from pysolr (https://github.com/toastdriven/pysolr/blob/master/pysolr.py#L68) I think I should probably import it instead.

@wardi

wardi commented Nov 13, 2015

Copy link
Copy Markdown
Contributor

@amercader how do you feel about merging this for 2.5?

@rossjones

Copy link
Copy Markdown
Contributor

Although this is a pretty old PR, I think updating to maintained deps is probably a good thing. Should we unassign this and see if anyone is willing to volunteer to push this through?

@amercader

Copy link
Copy Markdown
Member

IIRC the only thing that hold me back was the dates magic going on on this PR. But if that makes sense to someone else who has time to look into it, let's go for it

@joetsoi

joetsoi commented Feb 12, 2016

Copy link
Copy Markdown
Contributor Author

If you want me to update this to master I can
On 12 Feb 2016 11:32 am, "Adrià Mercader" notifications@github.com wrote:

IIRC the only thing that hold me back was the dates magic going on on this
PR. But if that makes sense to someone else who has time to look into it,
let's go for it


Reply to this email directly or view it on GitHub
#2352 (comment).

@rossjones

Copy link
Copy Markdown
Contributor

@joetsoi That was be amazing, thank you.

@wardi

wardi commented Apr 24, 2016

Copy link
Copy Markdown
Contributor

@TkTech since you're interested ckan/ideas#174 (comment) , would you like to provide an update for this PR so I can merge?

@joetsoi joetsoi force-pushed the solrpy-is-dead-long-live-pysolr branch from ba595ba to 786ba55 Compare April 24, 2016 22:44
Comment thread ckan/lib/search/common.py
def solr_datetime_decoder(d):
for k, v in d.items():
if isinstance(v, basestring):
possible_datetime = re.search(pysolr.DATETIME_REGEX, v)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed so that we are importing the regex instead of copy pasting it.

pysolr currently always provides a string as the first argument to
SolrError exceptions. An additional check has been added to future proof
the code.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants