Run scraper with --stats once per week #2

Closed
simonw opened this issue Jun 7, 2022 · 5 comments
Labels
enhancement New feature or request

Comments


simonw commented Jun 7, 2022

This is so I don't get lots of tiny diffs because of page view and download counts incrementing all the time.

I built the script with this in mind: it only writes the stats information out, as separate files, if you include --stats:

    if save_stats:
        if domain not in stats_files:
            stats_files[domain] = (root / "{}.stats.json".format(domain)).open("w")
        stats_files[domain].write(json.dumps(stats) + "\n")
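
For anyone reading along, here is a minimal, self-contained sketch of how that snippet might sit inside a loop. The `write_records` function, the `records` iterable of `(domain, stats)` pairs and the sample data are all hypothetical, made up for illustration; only the three lines inside `if save_stats:` come from the script.

```python
import json
import tempfile
from pathlib import Path


def write_records(root, records, save_stats):
    """Write one JSON line of stats per record into per-domain files.

    `records` is a hypothetical iterable of (domain, stats) pairs; the real
    script's data source is not shown in this issue.
    Returns the sorted list of domains that had a stats file written.
    """
    stats_files = {}
    try:
        for domain, stats in records:
            if save_stats:
                # Open each domain's file lazily, the first time it is seen.
                if domain not in stats_files:
                    stats_files[domain] = (root / "{}.stats.json".format(domain)).open("w")
                stats_files[domain].write(json.dumps(stats) + "\n")
    finally:
        # Close every file handle that was opened, even on error.
        for fp in stats_files.values():
            fp.close()
    return sorted(stats_files)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    records = [
        ("example.org", {"id": "a", "page_views": 10}),
        ("example.org", {"id": "b", "page_views": 3}),
        ("example.com", {"id": "c", "page_views": 7}),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        print(write_records(root, records, save_stats=False))
        print(write_records(root, records, save_stats=True))
```

Without `save_stats` no stats file is ever opened, so the frequently changing counts never reach the repository and cannot produce diffs.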

simonw added the enhancement label Jun 7, 2022

simonw commented Jun 7, 2022

simonw added a commit that referenced this issue Jun 8, 2022

simonw commented Jun 8, 2022

Note that with this change the action no longer scrapes on a commit - it only scrapes on workflow_dispatch or when the schedules trigger.
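
The trigger logic can be modelled roughly as below. This is a sketch, not the actual workflow file: the `scraper_args` function and both cron strings are hypothetical, and it assumes a manual `workflow_dispatch` run also passes `--stats`, which matches the later comments about a manual run populating the stats files.

```python
def scraper_args(event_name, schedule=None, weekly_cron="0 0 * * 0"):
    """Return the scraper's extra CLI arguments for a trigger.

    Returns None when the scraper should not run at all.
    `weekly_cron` is a made-up value; the real schedules are not shown here.
    """
    if event_name == "workflow_dispatch":
        # Assumption: manual runs include stats.
        return ["--stats"]
    if event_name == "schedule":
        # Only the weekly schedule writes stats; other schedules skip them.
        return ["--stats"] if schedule == weekly_cron else []
    # Any other event, such as a push (commit), no longer scrapes.
    return None


if __name__ == "__main__":
    for event, cron in [
        ("push", None),
        ("workflow_dispatch", None),
        ("schedule", "0 0 * * 0"),
        ("schedule", "0 */6 * * *"),
    ]:
        print(event, cron, scraper_args(event, cron))
```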


simonw commented Jun 8, 2022

Running now with workflow_dispatch which should populate the stats files for the first time.


simonw commented Jun 8, 2022

Yup, that added the stats files: 1a09c87

simonw closed this as completed Jun 8, 2022

simonw commented Jun 8, 2022

I manually ran it again to check I got some diffs and I did: 2060f38#diff-6835345cbfec8fbf1dfeaee6534859a57591cf163fae958e06defcc40f87b969
