Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

change solrpy library to pysolr #2352

Merged
merged 9 commits into from May 5, 2016

Conversation

@joetsoi
Copy link
Contributor

@joetsoi joetsoi commented Mar 17, 2015

solrpy is unmaintained and by some handwaving not actually realistic benchmarks, slower, even the author of solrpy has said use something else like pysolr

Leonardo Santagada  30/08/2014

I don't think solrpy supports collection api. Talking about not supporting stuff... 
I'm one of the authors of solrpy and I'm thinking why don't we just put a link to pysolr 
and declare this library deprecated? Has someone looked at pysolr and can comment 
if there is any advantage that we still have over it?

https://groups.google.com/forum/#!searchin/solrpy/pysolr/solrpy/OPuV-jxvtRU/ZyHJ5mONIPoJ

from what I recall, solrpy does more xml transformation where as pysolr just requests the json.
pysolr also uses requests instead of handling the underlying connection itself, so our code gets to be a bit cleaner as well.

We do lose the basic http auth config options for solr as pysolr does not support this, I don't mind as this was actually undocumented on solrpy if I recall anyway)

(also because i'm doing this https://github.com/joetsoi/fedora-rpms-for-ckan)

@joetsoi joetsoi force-pushed the solrpy-is-dead-long-live-pysolr branch from c239c3b to 2f52a83 Mar 17, 2015
@joetsoi
Copy link
Contributor Author

@joetsoi joetsoi commented Mar 17, 2015

(needs to have the legacy tests / new tests stuff fixed.)

@amercader
Copy link
Member

@amercader amercader commented Apr 8, 2015

@joetsoi This looks good, what's left to do with the tests?

It seems like basic auth could be reimplmemented if necessary, but I agree that we shouldn't worry too much about it for now

@joetsoi
Copy link
Contributor Author

@joetsoi joetsoi commented Apr 9, 2015

I meant I need to fix the branch for the new_tests being moved to tests and tests being moved to legacy tests.

if solr_user is not None and solr_password is not None:
return SolrConnection(solr_url, http_user=solr_user,
http_pass=solr_password)
if decode_dates:
Copy link
Member

@amercader amercader Jul 31, 2015

@joetsoi can you briefly explain why this is needed?

Copy link
Contributor Author

@joetsoi joetsoi Aug 3, 2015

If I recall correctly

  • solrpy requests xml from solr and parses and allows you to search in two different ways
    search (returns datetime types) and raw_search (returns text)
  • pysolr parses the json (which does not provide a datetime type) and returns that in its search results

Since this is ckan, we like being difficult so we obviously use both raw and non raw searches in solrpy in differeent places and expect the results from solr to be a datetime and text other times.

So I had to add the decoding of dates to pysolr for the functions expecting datetime objects and some for when it wasn't

if i recall I lifted the datetime regex from pysolr (https://github.com/toastdriven/pysolr/blob/master/pysolr.py#L68) I think I should probably import it instead.

@wardi
Copy link
Contributor

@wardi wardi commented Nov 13, 2015

@amercader how do you feel about merging this for 2.5?

@rossjones
Copy link
Collaborator

@rossjones rossjones commented Feb 12, 2016

Although this is a pretty old PR, I think updating to maintained deps is probably a good thing. Should we unassign this and see if anyone is willing to volunteer to push this through?

@amercader
Copy link
Member

@amercader amercader commented Feb 12, 2016

IIRC the only thing that hold me back was the dates magic going on on this PR. But if that makes sense to someone else who has time to look into it, let's go for it

@joetsoi
Copy link
Contributor Author

@joetsoi joetsoi commented Feb 12, 2016

If you want me to update this to master I can
On 12 Feb 2016 11:32 am, "Adrià Mercader" notifications@github.com wrote:

IIRC the only thing that hold me back was the dates magic going on on this
PR. But if that makes sense to someone else who has time to look into it,
let's go for it


Reply to this email directly or view it on GitHub
#2352 (comment).

@rossjones
Copy link
Collaborator

@rossjones rossjones commented Feb 12, 2016

@joetsoi That was be amazing, thank you.

@wardi
Copy link
Contributor

@wardi wardi commented Apr 24, 2016

@TkTech since you're interested ckan/ideas#174 (comment) , would you like to provide an update for this PR so I can merge?

joetsoi added a commit to joetsoi/ckan that referenced this issue Apr 24, 2016
@joetsoi joetsoi force-pushed the solrpy-is-dead-long-live-pysolr branch from ba595ba to 786ba55 Apr 24, 2016
def solr_datetime_decoder(d):
for k, v in d.items():
if isinstance(v, basestring):
possible_datetime = re.search(pysolr.DATETIME_REGEX, v)
Copy link
Contributor Author

@joetsoi joetsoi Apr 24, 2016

Changed so that we are importing the regex instead of copy pasting it.

pysolr currently always provides a string as the first argument to
SolrError exceptions. An additional check has been added to future proof
the code.
@wardi wardi merged commit 595b2d5 into ckan:master May 5, 2016
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants