@xSAVIKx
Contributor

@xSAVIKx xSAVIKx commented May 14, 2017

I've updated the current Python client to work with Search API v3.

What was done:

  • updated the overall client classes to work with the v3 API - can close Update Python client to Search API v3 [$100] #31
  • removed the unused Query class
  • updated the test cases to actually use Betamax and make requests to the API (previously the cassettes were only stored in the repo and never used)
  • ensured Post data always has the same default values

xSAVIKx added 5 commits May 14, 2017 00:22
* removed not used Query
* simplified client and parser
* restructured exceptions
* updated tests
ensured parsed Post have same values after properties were set

added minimal valid result test assertions

added valid links result test
Member

@roback roback left a comment

Left a comment regarding the removal of the Query class.

from urllib import urlencode


class Query(object):
Member

We would still like to have the Query class, just as we have in the Ruby implementation, even though the URL query parameters have been removed in the new version of the API.
Keeping things 100% compatible with the old version isn't necessary, but we would still like to have the Query class for the following reasons:

  • Without the Query#start_time and Query#end_time helper methods, the user has to think about formatting (and possibly quoting) the date correctly. Having the helpers also makes it easier for us to give good error messages when an invalid date is set in the query.
  • The start_time and end_time helpers also make it easier to paginate through multiple pages of results using start-time, since users don't have to rebuild the query string themselves with a new date each time. (Pagination is explained here. The find_all_posts_mentioning_github.py example also demonstrates this.)

In the Ruby client we just append start-date/end-date to the search query string before making a request, instead of sending them as URL parameters as we did before.
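
To illustrate the idea, here is a minimal sketch of what such a helper could look like in Python. It is not the client's actual code: the class body, the DATE_FORMAT value, and the error messages are assumptions made for this example, and time zone handling is left out. Only the names Query, start_time, end_time and build_query_string come from this discussion.

```python
import datetime


class Query(object):
    """Hypothetical sketch of a query helper, not the client's real class."""

    # Assumed timestamp format; check the API documentation for the real one.
    DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"

    def __init__(self, search_query=""):
        self.search_query = search_query
        self.start_time = None
        self.end_time = None

    def _format(self, name, value):
        # Fail early with a readable message instead of sending a bad query.
        if not isinstance(value, datetime.datetime):
            raise ValueError("%s must be a datetime.datetime, got %r" % (name, value))
        return value.strftime(self.DATE_FORMAT)

    def build_query_string(self):
        if not self.search_query.strip():
            raise ValueError("search_query must not be empty")
        parts = [self.search_query]
        # The dates are appended to the query text itself, not sent as URL parameters.
        if self.start_time is not None:
            parts.append("start-date:" + self._format("start_time", self.start_time))
        if self.end_time is not None:
            parts.append("end-date:" + self._format("end_time", self.end_time))
        return " ".join(parts)


query = Query("github")
query.start_time = datetime.datetime(2015, 2, 23, 15, 18, 13)
print(query.build_query_string())  # github start-date:2015-02-23T15:18:13
```

With a helper like this, paginating only requires assigning a new start_time before each request; the query text never has to be rebuilt by hand.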

@xSAVIKx
Contributor Author

xSAVIKx commented May 15, 2017 via email

@dentarg
Contributor

dentarg commented May 16, 2017

I'm sorry if this sounds harsh. We do appreciate the work you have put in here @xSAVIKx, and hope you will continue to work with us in the future, if you want to.

> Firstly, I'd say, that it would be better, if you had documented any
> requirements earlier.

I agree that it would have been better. But when requirements are lacking, I think a big part of software development is working out what they are before starting work. Maybe we should have stated more clearly that we wanted the same functionality as the Ruby client.

> But, anyway, OK, I'll get query back, but maybe it should be QueryHelper?

Great. I will let @roback answer if it should be just Query or QueryHelper. I haven't looked too carefully.

> Actually, I'd say that the best option would be to implement full-featured
> DSL that will correspond to the search language, but for sure it will take
> much more time to implement.

Yes, you are right, that would be the best. We have talked internally about wanting a query builder of some sort in all of our clients. Actually, just yesterday, I mentioned that it would be great if we opened issues on our repos about that, so outside collaborators can get a feel for what we are thinking. We understand, though, that this is out of scope for this PR.

> Also, I think, you would like the same to be applied to the JAVA client?

Yes

@roback
Member

roback commented May 16, 2017

> Great. I will let @roback answer if it should be just Query or QueryHelper. I haven't looked too carefully.

For now I think just bringing the Query object back would be best, in order to keep the number of changes to a minimum and to keep it somewhat compatible with the previous version. In the Ruby client we just append start-date: and end-date: to the search query string instead of sending them as query parameters as before (see the diff in the Ruby client).

No need to deprecate things as we did with Query#language for example in Ruby (see Query class diff in Ruby client), because I'm not sure it's possible to do easily in Python.

> Actually, I'd say that the best option would be to implement full-featured
> DSL that will correspond to the search language, but for sure it will take
> much more time to implement.

Agreed, but as @dentarg stated above, that is a separate issue which we can resolve at another time. We have the same problem in some of our other API clients as well.

@xSAVIKx
Contributor Author

xSAVIKx commented May 16, 2017

@dentarg @roback

OK, I'll probably revert my changes to the Query and just refactor the functionality.

Regards,
Yuri.

added additional @deprecated notations to client in order to provide backward-compatibility.

rewritten examples to use Query as an alternative to raw `q` usage
@xSAVIKx
Contributor Author

xSAVIKx commented May 16, 2017

@dentarg @roback Please review the latest commit.

I've rewritten Query a bit and used the deprecation library (https://github.com/briancurtin/deprecation) in order to mark things as deprecated in Python.

There are some limitations to this approach (Python ignores DeprecationWarning by default), but IMO it is the best form of in-code documentation.
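
As a rough illustration of the DeprecationWarning caveat, the sketch below uses only the standard library warnings module rather than the deprecation package mentioned above. The deprecated decorator and the choice of pattern as a deprecated alias for search_query are assumptions made for this example, not the PR's actual code.

```python
import functools
import warnings


def deprecated(message):
    """Hypothetical decorator: emit a DeprecationWarning on every call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # stacklevel=2 attributes the warning to the caller, not to wrapper.
            warnings.warn(message, DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorator


class Query(object):
    def __init__(self):
        self.search_query = ""

    @property
    @deprecated("pattern is deprecated, use search_query instead")
    def pattern(self):
        return self.search_query


query = Query()
query.search_query = "spotify"

# Depending on the active warning filters the warning may be dropped silently,
# so this block opts in explicitly and records what was emitted.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always", DeprecationWarning)
    value = query.pattern

print(value, [str(w.message) for w in caught])
```

Users who want to see such warnings while running their own code can also start the interpreter with `python -W default`.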

If everything is OK, I'll implement the same changes in the Java client.

Regards,
Yuri.

Member

@roback roback left a comment

Mostly smaller fixes that were forgotten when reintroducing the Query class, plus comments on some other stuff.

self.search_query = value

@property
@deprecation.deprecated(deprecated_in="3.0.0", removed_in="4.0.0", current_version=twingly_search.__version__,
Member

If you check the Ruby client we only deprecated #language and #pattern and kept the rest of the methods in the Query class as is.

self._end_date = None

@property
def start_date(self):
Member

No need for start_date and end_date methods, you can just keep the start_time and end_time methods that are already defined below (also no need to deprecate them, see the Ruby client).

Contributor Author

@roback Maybe you should also deprecate them in the Ruby client?

It's just quite weird for me, as a developer, that I'm setting a "time" when I'm actually setting both a time and a date. And since it will be converted to start-date: [value], it should probably just follow the documentation as is, so that it's easier to match what the Client does with the search query language.

Member

One of the reasons we built the clients was to get around some of the quirks in the search language. start-date (in the search language) sounds like you only set the date, when you can in fact set both date and time. The reason we named it start_time in the client is that it better represents what it actually is for.

query.start_time = datetime.datetime.utcnow() - datetime.timedelta(hours=1)
results = query.execute()
q = '"hello world" tspan:24h'
results = client.execute_query(q)
Member

This example should use query.search_query and query.execute() again.

Member

@roback roback May 18, 2017

while True:
result = self.query.execute()
query_string = self.q.build_query_string()
result = self.client.execute_query(query_string)
Member

These lines can be reverted, just use result = self.query.execute() again.

Contributor Author

IMO Query and Client are too tightly coupled with each other.
I'd prefer to separate them so that Query doesn't depend on Client anymore.

It's quite strange that a Query can execute itself, which is why I've also marked Query.execute() as deprecated.

Member

Yes, we have started to realise that as well :) (it's the same problem in a couple of other clients).

What if we do something along the lines of https://github.com/twingly/twingly-search-api-python/pull/32/files/1e3a4f6b18b4b043e227c6c4020d827720f1104f#r117175215 (let the client call q.build_query_string() if the argument to client.execute_query(q) is of type Query that is). That way the user doesn't have to call query.build_query_string() themselves.

Contributor Author

@roback OK, I agree that it'd be better.

return Query(self)

def execute_query(self, query):
def execute_query(self, q):
Member

You can switch back to using the Query object here as it was before.

Contributor Author

There is no method overloading in Python :-(
I'd prefer having two methods, one taking a Query object and one taking a string.

What do you think about the following signature:

def execute_query(self, q=None, query_string=''):

Exactly one of the arguments should be supplied. If both are supplied, we should prefer query_string.

Member

Would checking the type of the argument work?

Something like the following (it's in Ruby but you get the point :)):

def execute_query(query)
  query_string = query.is_a?(Query) ? query.build_query_string : query
  response_body = self._get_response(query_string).content
  # ...
end

Contributor Author

@xSAVIKx xSAVIKx May 18, 2017

@roback it will!

I've forgotten about simple type checking as I mostly work with Java :-)

I'll try to update the sources today, but I can't guarantee that I'll have enough time.

Thanks,
Iurii.
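
For reference, the type check discussed in this thread could look roughly like the following in Python. This is only a sketch: the Query and Client classes here are stripped-down stand-ins, and _get_response returns a placeholder string instead of performing an HTTP request.

```python
class Query(object):
    """Stand-in for the client's Query class (hypothetical, for illustration)."""

    def __init__(self, search_query):
        self.search_query = search_query

    def build_query_string(self):
        return self.search_query


class Client(object):
    def execute_query(self, q):
        # Accept either a Query object or a plain query string.
        query_string = q.build_query_string() if isinstance(q, Query) else q
        return self._get_response(query_string)

    def _get_response(self, query_string):
        # Placeholder: a real client would send the request to the API here.
        return "GET q=" + query_string


client = Client()
print(client.execute_query("hello"))
print(client.execute_query(Query("hello")))
```

Both calls end up in the same code path, so the user never has to call build_query_string() themselves.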

raise TwinglySearchErrorException(error)


class TwinglySearchErrorException(TwinglySearchException):
Member

Why was this class added? Wouldn't it work with just having TwinglySearchException as the base class for the other exceptions?

I also noticed that the exception class hierarchy has changed a bit below. Any particular reason?

Contributor Author

It was added as an abstraction for exceptions that carry the Error from the response, in order to give developers a way to obtain the error object.

I've also added an additional Client abstraction in order to separate errors by their error codes, just as is done in the documentation.
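
A simplified sketch of the kind of hierarchy described here follows. Only TwinglySearchException and TwinglySearchErrorException are names from the PR; the two subclasses, the exception_for helper, the dict-shaped error object, and the "leading digit decides the class" rule are assumptions for illustration, and the error codes are made up.

```python
class TwinglySearchException(Exception):
    """Base class for every error raised by the client."""


class TwinglySearchErrorException(TwinglySearchException):
    """Raised for API responses that carry an error; exposes it as .error."""

    def __init__(self, error):
        message = "%s (code %s)" % (error["message"], error["code"])
        super(TwinglySearchErrorException, self).__init__(message)
        self.error = error


class ClientErrorException(TwinglySearchErrorException):
    """Hypothetical subclass: the request was at fault."""


class ServerErrorException(TwinglySearchErrorException):
    """Hypothetical subclass: the API was at fault."""


def exception_for(error):
    """Pick an exception class from the error code (assumed scheme)."""
    if str(error["code"]).startswith("4"):
        return ClientErrorException(error)
    return ServerErrorException(error)


try:
    raise exception_for({"code": 40001, "message": "made-up example"})
except TwinglySearchErrorException as exc:
    # Callers can catch the common base and still inspect the error object.
    print(type(exc).__name__, exc.error["code"])
```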


def _get_response(self, query):
response = self._session.get(query.url())
if 200 <= response.status_code < 300:
Member

👍 for moving this into the error class instead.

query.start_time = datetime.datetime(2015, 2, 23, 15, 18, 13)

result = query.execute()
result = client.execute_query(q)
Member

This example doesn't work anymore (it should use the Query class).

Contributor Author

It does work, since the execute_query argument is a str.

Member

👍 Yes, you're right. I commented on this before actually checking the Query class.

removed non-needed fields from Query class
updated Client to accept both Query and String
updated unit tests
updated examples Query usage
@xSAVIKx
Contributor Author

xSAVIKx commented May 18, 2017

@roback @dentarg Please check the updates.

Member

@roback roback left a comment

🎉 I have made a few requests against the API and the responses and errors look good and work as intended :)

I'll merge this now.

@roback
Member

roback commented May 19, 2017

Oops, I forgot to check Travis :) Do you have any idea what the following error is about, @xSAVIKx? It only happens on Python 3.x.

ERROR: test_execute_query_with_invalid_api_key (tests.test_client.ClientTest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/travis/build/twingly/twingly-search-api-python/tests/test_client.py", line 45, in test_execute_query_with_invalid_api_key
    c.execute_query(q)
  File "/home/travis/build/twingly/twingly-search-api-python/twingly_search/client.py", line 74, in execute_query
    query_string = self._get_query_string(q)
  File "/home/travis/build/twingly/twingly-search-api-python/twingly_search/client.py", line 80, in _get_query_string
    if self._is_string(q):
  File "/home/travis/build/twingly/twingly-search-api-python/twingly_search/client.py", line 87, in _is_string
    return isinstance(q, types.StringTypes)
AttributeError: module 'types' has no attribute 'StringTypes'

@xSAVIKx
Contributor Author

xSAVIKx commented May 19, 2017

@roback I've fixed it.
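
For context on the traceback above: types.StringTypes only exists on Python 2, which is why the check fails on Python 3.x. The thread doesn't show how it was fixed, so the following is just one common way to write a string check that runs on both versions; the names string_types and is_string are made up for this sketch.

```python
try:
    # Python 2: basestring covers both str and unicode.
    string_types = (basestring,)
except NameError:
    # Python 3: basestring is gone, str is the only text type.
    string_types = (str,)


def is_string(value):
    """Return True if value is a text string on the running Python version."""
    return isinstance(value, string_types)


print(is_string("hello"), is_string(42))
```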

@roback
Member

roback commented May 19, 2017

👍 Thanks. Now it's merge time :)

@roback roback merged commit 6cbcf81 into twingly:master May 19, 2017
@xSAVIKx
Contributor Author

xSAVIKx commented May 19, 2017

@roback Great :-) Would you also review the Java PR?

@roback
Member

roback commented May 19, 2017

> @roback Great :-) Would you also review the Java PR?

Yes, I was about to do that :). I can't promise that I'll have time to review everything today, though.

roback added a commit that referenced this pull request May 22, 2017
The version was updated to 3.0.0 in #32, but since we don't use
the same version number in the client as the API it was changed
to the correct version (2.0.0) in this commit.
Development

Successfully merging this pull request may close these issues.

Update Python client to Search API v3 [$100]

3 participants