
Import historical data. #2

Merged
merged 5 commits into from
Dec 19, 2014

Conversation

grokcode
Contributor

@brad So here is what I have for importing historical data. You can see in the view code where it should be imported, but having that call there was breaking tests. (Looks like a Celery issue.)

I tested everything manually with your Misfit credentials, and it's looking good. No unit tests yet because mocking all of those API calls looks kind of painful, and I ran out of time to do it today.

@coveralls

Coverage Status

Coverage decreased (-12.88%) when pulling d13f4e9 on import_historical into 551476b on master.

chunks.append((s, e))
s = e + datetime.timedelta(days=1)
e = s + datetime.timedelta(days=days_in_chunk)
return chunks
Member


Good call on the chunkifying! I didn't realize it was necessary.
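The quoted lines above are the tail of the chunking helper under review. A self-contained sketch of the idea (the function name `chunkify_dates` and the 30-day default are assumptions based on this thread, not necessarily the exact code in the PR) might look like:

```python
import datetime


def chunkify_dates(start, end, days_in_chunk=30):
    """Split the inclusive range [start, end] into (start, end) pairs
    spanning at most days_in_chunk days each, since the Misfit summary
    API limits how many days can be requested in one call."""
    chunks = []
    s = start
    e = start + datetime.timedelta(days=days_in_chunk)
    while s <= end:
        chunks.append((s, e))
        s = e + datetime.timedelta(days=1)
        e = s + datetime.timedelta(days=days_in_chunk)
    return chunks
```

Each chunk starts the day after the previous one ended, so no date is requested twice by the importer itself.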

@brad
Member

brad commented Dec 19, 2014

@grokcode This looks good to me so far. One thing to be aware of that may bite us and may need to be taken into account here: in my testing of the summary API, end_date didn't look inclusive. For example, I tested with start date 2014-12-15 and end date 2014-12-18, and it didn't include my summary from 2014-12-18. I changed the end date to 2014-12-19 and then it gave it to me. Maybe it's dependent on the time of day. I'll be interested to see what it looks like in the morning.

@grokcode
Contributor Author

@brad Thanks for the heads up about the dates. I've done some more poking around and interestingly enough, sometimes end_date is inclusive, and other times it is not. I'm seeing the same behavior as you on those test cases, but I think the reason nothing was returned on the 18th is because there weren't any steps for that day. Now why changing the end_date to the 19th is giving a summary with no steps...I have no idea. Most of the time nothing is returned when there aren't any steps.

Testing start_date=datetime.date(2014, 12, 13), end_date=datetime.date(2014, 12, 16) is giving data for the 13th, 14th, 15th, and 16th, all of which have steps.

@brad
Member

brad commented Dec 19, 2014

@grokcode That's really weird that you managed to get the end date to be inclusive. You must be using my old FB account credentials by the way, because my new account has steps for the 18th and 19th. Still though, when I try this morning to get data from the 17th to the 19th, it doesn't return data for the 19th unless I use the 20th as the end date. For the time being, I simply added a day to the end date (6478958) when processing notifications. I feel a little dirty doing it, but I can't think of any problem it could cause.

Regarding the notification processing, I suppose we need to chunkify the dates there too right? It seems unlikely, but the person could go more than 30 days without syncing and then we might try and pull more than 30 days of summary at once.
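The "add a day to the end date" workaround described above, sketched with a hypothetical helper name (the actual change lives in commit 6478958):

```python
import datetime


def pad_end_date(start_date, end_date):
    """Work around the Misfit summary API sometimes treating end_date
    as exclusive by requesting one extra day. At worst the response
    includes a summary that is already stored, which can be filtered
    out before saving, so the padding should be harmless."""
    return start_date, end_date + datetime.timedelta(days=1)
```

For example, a notification covering the 17th through the 19th would be requested as the 17th through the 20th.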

data = cc_to_underscore_keys(summary.data)
data['user_id'] = uid
obj_list.append(cls(**data))
cls.objects.bulk_create(obj_list)
Member


@grokcode Love that you used bulk_create on these.
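The snippet above relies on a `cc_to_underscore_keys` helper to map the API's camelCase field names onto snake_case Django model fields before `bulk_create` inserts every row in a single query. One plausible implementation of that helper (the actual code in the repo may differ) is:

```python
import re


def cc_to_underscore(name):
    """Convert a camelCase identifier to snake_case, e.g. stepCount -> step_count."""
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()


def cc_to_underscore_keys(d):
    """Return a copy of d with camelCase keys converted to snake_case,
    so Misfit's JSON field names line up with Django model field names."""
    return {cc_to_underscore(k): v for k, v in d.items()}
```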

@grokcode
Contributor Author

@brad I'm getting steps in the summary data for the 18th and 19th now. I'm pretty sure that I don't have your FB credentials. 😟 Maybe it didn't sync until recently?

But look at this: http://pastebin.com/qYVy7H9s
In the first case it is exclusive, in the second case inclusive. So I will look into this a bit more because during the historical import, I don't want to be missing data or trying to save identical models. I agree that adding an extra day when processing notifications is a good way to avoid the problem altogether and not harmful.

Also, for notification processing, I think you are right, we need to chunkify there too.
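One lightweight way to avoid saving identical models when the padded range overlaps already-imported days (names hypothetical; with Django this check could also be done against the database):

```python
def summaries_to_save(summaries, existing_dates):
    """Drop summaries whose date is already stored, so the padded
    end_date request and the API's inconsistent end_date handling
    can't produce duplicate rows."""
    return [s for s in summaries if s['date'] not in existing_dates]
```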

@brad
Member

brad commented Dec 19, 2014

@grokcode You're right, that data is not from my FB account. I looked at the pastebin and that is really bizarre. This is starting to look like it's simply a bug on Misfit's side. Maybe they have an internal timestamp for when they created the summary record, and they're filtering by that rather than the 'date' the summary is about.
I'm thinking about going a few days (the weekend) without syncing to test this theory.

@coveralls

Coverage Status

Coverage decreased (-3.83%) when pulling 17ace59 on import_historical into 551476b on master.

@grokcode
Contributor Author

@brad What do you think about reviewing and merging this, and then doing a bugfix for the date range issue once we have a better idea of what is happening there?

@coveralls

Coverage Status

Coverage decreased (-3.78%) when pulling aa262a5 on import_historical into 551476b on master.

@brad
Member

brad commented Dec 19, 2014

Tested end to end and it works flawlessly 🍻 merging

brad added a commit that referenced this pull request Dec 19, 2014
@brad brad merged commit c9855ed into master Dec 19, 2014
@brad brad deleted the import_historical branch December 19, 2014 23:35