-
Notifications
You must be signed in to change notification settings - Fork 1
Update Python client to Search API v3 #32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* removed not used Query * simplified client and parser * restructured exceptions * updated tests
ensured parsed Post have same values after properties were set added minimal valid result test assertions added valid links result test
roback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment regarding the removal of the Query class.
| from urllib import urlencode | ||
|
|
||
|
|
||
| class Query(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We would like to still have the query class, just as we have in the Ruby implementation, even though the URL query parameters has been removed in the new version of the API.
Keeping things compatible with the old version to 100% isn't necessary, but we would still like to have the Query class for the following reasons:
- Without having the
Query#start_timeandQuery#end_timehelper methods, the user has to think about formatting (and possibly quoting) the date correctly. This makes it easier for us to give good error messages to the user when an invalid date is set in the query. - The
start_timeandend_timehelpers also makes it easier to paginate through multiple pages of results usingstart-timewithout having to rebuild the query string themselves with a new date each time. (pagination is explained here. Thefind_all_posts_mentioning_github.pyexample is also demonstrating this.)
In the Ruby client we just append start-date/end-date to the search query string before making a request, instead of sending them as URL parameters as we did before.
|
Firstly, I'd say, that it would be better, if you had documented any
requirements earlier.
But, anyway, OK, I'll get query back, but maybe it should be QueryHelper?
Actually, I'd say that the best option would be to implement full-featured
DSL that will correspond to the search language, but for sure it will take
much more time to implement.
Also, I think, you would like the same to be applied to the JAVA client?
On Mon, May 15, 2017, 18:00 Mattias Roback ***@***.***> wrote:
***@***.**** requested changes on this pull request.
Left a comment regarding the removal of the Query class.
------------------------------
In twingly_search/query.py
<#32 (comment)>
:
> @@ -1,120 +0,0 @@
-import datetime
-from pytz import utc
-
-from twingly_search.errors import TwinglyQueryException
-
-try:
- # python 3
- from urllib.parse import urlencode
-except ImportError:
- from urllib import urlencode
-
-
-class Query(object):
We would like to still have the query class, just as we have in the Ruby
implementation, even though the URL query parameters has been removed in
the new version of the API.
Keeping things compatible with the old version to 100% isn't necessary,
but we would still like to have the Query class for the following reasons:
- Without having the Query#start_time and Query#end_time helper
methods, the user has to think about formatting (and possibly quoting) the
date correctly. This makes it easier for us to give good error messages to
the user when an invalid date is set in the query.
- The start_time and end_time helpers also makes it easier to paginate
through multiple pages of results using start-time without having to
rebuild the query string themselves with a new date each time. (pagination
is explained here
<https://developer.twingly.com/resources/search/#pagination>. The
find_all_posts_mentioning_github.py example is also demonstrating
this.)
In the Ruby client we just append start-date/end-date to the search query
string
<https://github.com/twingly/twingly-search-api-ruby/blob/5ac4516a803776efec18f78e3c60feedf83d28dc/lib/twingly/search/query.rb#L81-L82>
before making a request, instead of sending them as URL parameters as we
did before.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#32 (review)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AFaaqUMkpn1OgP1LvOul3KQlMKiOpW5tks5r6Gh0gaJpZM4NaVb_>
.
--
Iurii Sergiichuk
Software Developer
TeamDev
mobile: +38 067 5 777 244 <+380675777244>
email: iurii.sergiichuk@gmail.com
[image: facebook.png] <https://www.facebook.com/iurii.sergiichuk>
[image: linked_in.png] <https://ua.linkedin.com/in/iuriisergiichuk>
[image: git_hub.png] <https://github.com/xSAVIKx>
|
|
I'm sorry of this will sound harsh, we do appreciate the work you have put in here @xSAVIKx, and hope you will continue to work with us in the future, if you want to.
I agree that it would have been better. But in case of lack of requirements, I think a big part of software development is to work out what the requirements are, before starting work. Maybe we should have stated more clearly that we wanted the same functionality as the Ruby client.
Great. I will let @roback answer if it should be just
Yes, you are right, that would be the best. We have talked about it internally, that we would like to see a query builder of some sort in all of our clients. Actually, just yesterday, I mentioned that it would be great if we open up issues on our repos about that, so outside collaborators can get a feel of what we are thinking. We understand though that this is out of scope for this PR.
Yes |
For now I think just bringing the No need to deprecate things as we did with
Agreed, but as @dentarg stated above, that is a separate issue which we can resolve at another time. We have the same problem in some of our other API clients as well. |
added additional @deprecated notations to client in order to provide backward-compatibility. rewritten examples to use Query as an alternative to raw `q` usage
|
@dentarg @roback Please review the latest commit. I've rewritten Query a bit and used [deprecation][https://github.com/briancurtin/deprecation] in order to use deprecations in Python. Anyway there are some limitations to its functionality (Python ignores DeprecationWarning by default), but IMO such way is the best in-code documentation. If everything will be OK, I'll implement same changes in Java client. Regards, |
roback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly smaller fixes that was forgotten when reintroducing the Query class + comments on some other stuff.
twingly_search/query.py
Outdated
| self.search_query = value | ||
|
|
||
| @property | ||
| @deprecation.deprecated(deprecated_in="3.0.0", removed_in="4.0.0", current_version=twingly_search.__version__, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you check the Ruby client we only deprecated #language and #pattern and kept the rest of the methods in the Query class as is.
twingly_search/query.py
Outdated
| self._end_date = None | ||
|
|
||
| @property | ||
| def start_date(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for start_date and end_date methods, you can just keep the start_time and end_time methods that is already defined below (also no need to deprecate them, see ruby client).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roback Maybe you have to deprecate them also in Ruby client?
It's just quite weird for me, as a developer, that I'm going to set "time", but do actually set time and date. And as it will be converted to start-date: [value] probably it should just follow the documentation as is, so that it would be easier to match what Client do with the search query language.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of the reasons we built the clients was to get around some of the quirks in the search language. start-date (in the search language) sounds like you only set the date, when you can in fact set both date and time. The reason we named it start_time in the client is that it better represents what it actually is for.
| query.start_time = datetime.datetime.utcnow() - datetime.timedelta(hours=1) | ||
| results = query.execute() | ||
| q = '"hello world" tspan:24h' | ||
| results = client.execute_query(q) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example should use query.search_query and query.execute() again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignore the above comment, check the conversation at https://github.com/twingly/twingly-search-api-python/pull/32/files/1e3a4f6b18b4b043e227c6c4020d827720f1104f#r117046935 :)
| while True: | ||
| result = self.query.execute() | ||
| query_string = self.q.build_query_string() | ||
| result = self.client.execute_query(query_string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines can be reverted, just use result = self.query.execute() again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO Query and Client are too loosely coupled with each other.
I'd prefer to separate them so that Query doesn't depend on Client anymore.
It's quite strange when Query can execute itself that's why I've also marker Query.execute() as deprecated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we have started to realised that as well :) (it's the same problem in a couple of other clients).
What if we do something along the lines of https://github.com/twingly/twingly-search-api-python/pull/32/files/1e3a4f6b18b4b043e227c6c4020d827720f1104f#r117175215 (let the client call q.build_query_string() if the argument to client.execute_query(q) is of type Query that is). That way the user doesn't have to call query.build_query_string() themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roback OK, I do agree, that it'd be better.
| return Query(self) | ||
|
|
||
| def execute_query(self, query): | ||
| def execute_query(self, q): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can switch back to using the Query object here as it was before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no method overloading in Python :-(
I'd prefer having 2 methods, where q parameter is either Query object or String.
What do you think about following syntax:
def execute_query(self, q=None, query_string='')Where one of the arguments should be supplied. If both are supplied, we should prefer query_string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would checking the type of the argument work?
Something like the following (it's in Ruby but you get the point :)):
def execute_query(query)
query_string = query.is_a?(Query) ? query.build_query_string : query
response_body = self._get_response(query_string).content
# ...There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@roback it will!
I've forgotten about simple type checking as I mostly work with Java :-)
I'll try to update sources today, but do not guarantee that will have enough time.
Thanks,
Iurii.
| raise TwinglySearchErrorException(error) | ||
|
|
||
|
|
||
| class TwinglySearchErrorException(TwinglySearchException): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this class added? Wouldn't it work with just having TwinglySearchException as the base class for the other exceptions?
I also noticed that the exception class hierarchy has changed a bit below. Any particular reason?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was added as an abstraction for exceptions that have Error from the response in order to provide developers way to obtain error object.
I've added additional Client abstraction in order to separate errors by the error codes, just as it is done in the documentation.
|
|
||
| def _get_response(self, query): | ||
| response = self._session.get(query.url()) | ||
| if 200 <= response.status_code < 300: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for moving this into the error class instead.
| query.start_time = datetime.datetime(2015, 2, 23, 15, 18, 13) | ||
|
|
||
| result = query.execute() | ||
| result = client.execute_query(q) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example doesn't work anymore (it should use the Query class).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does work, as execute_query argument is str
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Yes you're right. I commented this before actually checking the Query class.
removed non-needed fields from Query class updated Client to accept both Query and String updated unit tests updated examples Query usage
roback
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🎉 I have made a few requests against the API and the responses and errors looks good and work as intended :)
I'll merge this now.
|
Oops, I forgot to check travis :) Do you have any idea what the following error is about @xSAVIKx? It only happens in Python 3.x. |
|
@roback I've fixed it. |
|
👍 Thanks. Now its merge time :) |
|
@roback Great :-) Would you review Java PR also? |
Yes, I was about to do that :). I can't promise that I have the time to review everything today though. |
The version was updated to 3.0.0 in #32, but since we don't use the same version number in the client as the API it was changed to the correct version (2.0.0) in this commit.
I've updated current Python client to work with Search API v3
What were done: