Add new libraries to daily data collection #9
2e0cb2e to f2282d7
  totals = base.sum()
  totals.name = 'total'
- base = pd.concat([base, totals], ignore_index=True)
+ base = pd.concat([base, totals.to_frame().T], ignore_index=True)
why add a transpose?
This fixes a bug that I introduced in the previous change. totals is a Series that used to be 'appended' to the frame; now that we are using pd.concat, it has to be converted to a one-row frame first, otherwise it is not added as a single row. I referred to this
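For anyone reading along, here is a minimal sketch of the difference, assuming a made-up two-column frame (the column names and values are illustrative only, not taken from the real download data):

```python
import pandas as pd

# Hypothetical stand-in for `base`; the real frame has many more rows and columns.
base = pd.DataFrame({"sdv": [60, 93], "ctgan": [10, 20]})

totals = base.sum()
totals.name = "total"

# Passing the Series directly: pandas aligns it as a column, so the result
# gains one row per entry in `totals` plus an extra column, not a totals row.
wrong = pd.concat([base, totals], ignore_index=True)

# Converting to a one-row frame first: the Series index becomes the columns,
# so the totals land as a single row under the existing columns.
right = pd.concat([base, totals.to_frame().T], ignore_index=True)

print(wrong.shape)
print(right)
```

With the transpose, the last row of `right` holds the column sums and no extra column is created.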
This is the new output for total:
base.reset_index().iloc[::-1]['total']
82 2011571.0
81 6536.0
80 65395.0
79 10876.0
78 57403.0
...
4 435.0
3 569.0
2 187.0
1 93.0
0 60.0
Name: total, Length: 83, dtype: float64

* Add additional libraries to the download analytics collection
* Fix daily build
* Use fixed httplib2 version
* Improve way of inserting columns
* Update metrics.py
* Update metrics.py
Resolves #6
CU-86b40fw92
Here is a completed run from this branch: https://github.com/datacebo/download-analytics/actions/runs/13683203905
I realized that the data collected for the newly added libraries only covers the current month. To make it consistent with SDV, I will run a heavier query for the newly added libraries so that they are in sync with sdv's data (basically forcing the query to cover the newly added libraries since SDV was first released).
Updated results can be found here: https://drive.google.com/drive/u/1/folders/10QHbqyvptmZX4yhu2Y38YJbVHqINRr0n