Implement a means of getting more than 1000 data points #131
… retrieval of more than 1000 data points. This implementation is subject to an existing bug in the pagination link header in the API that will break when and if that bug is fixed.
Hi - thank you for the PR, asking for a few changes.
Adafruit_IO/client.py
Outdated
params = {'limit': max_results} if max_results else None
data = []
path = "feeds/{0}/data".format(feed)
while True:
Could you change this from while True to a non-blocking loop?
I understand your distaste for while True. I implemented it this way in case the user wants all data points; in that case the loop keeps going until it runs out of data to retrieve. To remove the while True, I see at least two options. Either way would require an additional query to get the number of data points on a feed, using the feeds/{feed_key}/details query.
- Disallow None for max_results. If we do this, we should expose the number of data points in a feed so that the user can request all data without having to guess how many points there are. Right now the Feed object doesn't include this.
- Allow None as max_results, but internally query the feed details to get the total number of data points on the feed.
I can see pros/cons either way. Let me know if you have a preference.
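For illustration, here is a minimal sketch of what a bounded loop could look like once the number of points to fetch is known. It is not the code in this PR: fetch_page(limit, offset) is a hypothetical stand-in for the HTTP request, and offset-based paging is a simplification chosen so the example runs without a network connection.

```python
import math

API_PAGE_LIMIT = 1000  # assumed maximum number of points returned per page


def collect(fetch_page, max_results, page_limit=API_PAGE_LIMIT):
    """Gather up to max_results points using a bounded number of requests.

    fetch_page(limit, offset) is a hypothetical callable that returns a list.
    """
    data = []
    # The page count is known up front, so the loop cannot run forever.
    pages = math.ceil(max_results / page_limit)
    for _ in range(pages):
        wanted = min(max_results - len(data), page_limit)
        page = fetch_page(wanted, len(data))
        data.extend(page)
        if len(page) < wanted:
            break  # the feed ran out of data before max_results was reached
    return data


# Stand-in for the web API: a feed holding 2500 points.
FEED = list(range(2500))


def fake_fetch(limit, offset):
    return FEED[offset:offset + limit]


subset = collect(fake_fetch, 1200)      # two requests: 1000 points, then 200
everything = collect(fake_fetch, 5000)  # stops early once the feed is exhausted
```

The early break keeps the request count low when the caller asks for more points than the feed holds.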
As a side note, it might be worth thinking about implementing more details for Feed anyway. Maybe I'll open another issue for discussing this.
Adafruit_IO/client.py
Outdated
  return Data.from_dict(self._get(path))

- def data(self, feed, data_id=None):
+ def data(self, feed, data_id=None, max_results=API_PAGE_LIMIT):
Does a call to this function return the same value as the master branch?
Yes, this is backwards compatible. If max_results isn't provided (legacy code wouldn't provide it), up to 1000 points will be returned, as before.
Adafruit_IO/client.py
Outdated
from .errors import RequestError, ThrottlingError
from .model import Data, Feed, Group

API_PAGE_LIMIT = 1000
Could you comment what this is?
Currently, the web API returns a maximum of 1000 data points per page. I could have hard-coded this into the function definition for data, but I prefer to have constants like this defined up front. That way, if the API were to change for whatever reason, the Python library would only need to be updated in this one location.
…ne (caller is requesting all data).
I made the requested change based on option two from my response. If max_results is given as None, I query the feed count and use that as max_results. I'm not 100% sure that query will always work, but it was the only way I could find to get the number of records in a feed. If there's a cleaner way, please let me know.
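A rough sketch of that fallback, for readers following along. The shape of the details response used here (details -> data -> count) is an assumption made for the example, not something confirmed in this thread, and fetch_feed_details is a hypothetical callable standing in for the feeds/{feed_key}/details query.

```python
def resolve_max_results(max_results, fetch_feed_details):
    """Return the number of points to request.

    When max_results is None the caller wants everything, so the total is
    looked up from the feed details instead of looping until data runs out.
    """
    if max_results is not None:
        return max_results
    details = fetch_feed_details()
    # ASSUMPTION: the count lives at details["details"]["data"]["count"].
    return int(details["details"]["data"]["count"])


def fake_details():
    # Stand-in for the HTTP response of the details query.
    return {"key": "temperature", "details": {"data": {"count": 2500}}}


all_points = resolve_max_results(None, fake_details)  # falls back to the feed count
explicit = resolve_max_results(100, fake_details)     # explicit limit is kept as-is
```

The details query is only issued when max_results is None, so callers that pass an explicit limit pay no extra request.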
Apologies, I inadvertently deleted this branch, which closed the PR. I restored the branch and I am re-opening.
@lcmcninch Hi - could you resolve the conflicting files and I'll look at merging this in.
# Conflicts: # tests/test_client.py
@brentru No problem, conflict resolved. Thanks for looking at this!
I think we should drop the default from 1000 records to something much lower. In fact, I think we may change the server to default to a lower limit as well. I'd like to think 100 records is a better default limit.
@jwcooper I'm happy to make the change. Do you want me to reduce it to 100?
I lowered the default limit to 100.
Any update on getting this merged?
This PR resolves #108, at least partially. The end result is that the data method can use the limit query parameter and link header to return any number of data points (or all data points) of a feed. Current functionality returns all data up to 1000 points and doesn't provide the option to either decrease or increase that limit.
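To make the link-header mechanism concrete, here is a small sketch of following rel="next" links until enough points are collected. It is an illustration under stated assumptions, not the PR's implementation: get(url) is a hypothetical callable returning a (points, link_header) pair, and the page names are placeholders rather than real API URLs. With the requests library, the parsed links are also available as response.links.

```python
import re

# Matches one entry of a Link header, e.g. <https://example/page2>; rel="next"
_LINK_RE = re.compile(r'<([^>]*)>\s*;\s*rel="?([^",;]+)"?')


def next_url(link_header):
    """Return the rel="next" URL from a Link header, or None if absent."""
    if not link_header:
        return None
    for url, rel in _LINK_RE.findall(link_header):
        if rel == "next":
            return url
    return None


def fetch_all(get, first_url, max_results):
    """Follow next links until max_results points are collected or data ends."""
    data, url = [], first_url
    while url is not None and len(data) < max_results:
        points, link_header = get(url)
        if not points:
            break  # empty page: nothing left to retrieve
        data.extend(points)
        url = next_url(link_header)
    return data[:max_results]


# Stand-in for the web API: three pages chained by Link headers.
PAGES = {
    "page1": ([1, 2, 3], '<page2>; rel="next"'),
    "page2": ([4, 5, 6], '<page3>; rel="next", <page1>; rel="first"'),
    "page3": ([7], None),
}

first_five = fetch_all(PAGES.__getitem__, "page1", 5)
all_seven = fetch_all(PAGES.__getitem__, "page1", 100)
```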
Scope of change:
Limitations: