
Do we want to track changes to package locations etc? #8

Closed
stephlocke opened this issue Oct 9, 2018 · 11 comments

@stephlocke
Contributor

Why: For Science!
How: Google Big Query or something as part of the build
Pros: Insight into CRAN
Cons: Could be judgey

@maelle
Member

maelle commented Oct 9, 2018

Technically, would the build on Travis update a Google BigQuery database that the take_snapshot function would then access, or would the BigQuery data be for archiving only? In either case, it'd be cool for each build not to query CRAN twice (once for the database and once for the dashboard). 🤔

@stephlocke
Contributor Author

BigQuery for archiving only, though it could also be a decent source if we wanted to read the dashboard from it.

@maelle
Member

maelle commented Oct 9, 2018

@stephlocke why BigQuery, btw?

@stephlocke
Contributor Author

It seemed like a cheap storage utility for this sort of simple accumulating data, plus it has some native ML capabilities built in, so we'd get to play 😉

@maelle
Member

maelle commented Oct 10, 2018

So we need to run code every hour to create the snapshot (with a bit more info, cf. #9) and send it to a BigQuery project. Probably with bigrquery + DBI.
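A minimal sketch of that hourly upload, assuming a BigQuery project named "cransays" with a dataset "history" (both names invented here) and assuming take_snapshot() returns a data frame of the current queue:

```r
library(DBI)

# bigrquery supplies the DBI driver; project and dataset names are assumptions
con <- dbConnect(
  bigrquery::bigquery(),
  project = "cransays",   # hypothetical GCP project name
  dataset = "history"     # hypothetical dataset name
)

# take one snapshot of the CRAN incoming queue (function mentioned above)
snapshot <- cransays::take_snapshot()

# append = TRUE accumulates rows in the table rather than overwriting it
dbWriteTable(con, "snapshots", snapshot, append = TRUE)

dbDisconnect(con)
```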

@maelle
Member

maelle commented Jan 6, 2020

The data is in the commit history of gh-pages at the moment, I suppose.

@stephlocke
Contributor Author

If dashboard.Rmd could append the data to a CSV as the start of our data-capture mechanism, that'd be good.

@maelle
Member

maelle commented Jan 6, 2020

So, two tasks here:

  • Use GH commits to retrieve past data.

  • Add the appending of a CSV to the data capture.
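The appending task could look something like this in dashboard.Rmd (a sketch; the file name "submission-history.csv" and the column layout are assumptions):

```r
# Hypothetical snapshot row; in practice this would come from the CRAN queue data
snapshot <- data.frame(
  snapshot_time = Sys.time(),
  package       = "somepackage",
  stringsAsFactors = FALSE
)

history_file <- "submission-history.csv"

# Write the header only the first time; afterwards, append rows without it
write.table(
  snapshot,
  file      = history_file,
  sep       = ",",
  row.names = FALSE,
  col.names = !file.exists(history_file),
  append    = file.exists(history_file)
)
```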

@stephlocke
Contributor Author

I'd focus on the append first. Looking through git versions of an HTML file (I believe we'd have to look at the compiled HTML file?) sounds a lot harder, and it'd be better to get fresh data capturing sooner. Backfilling is a nice-to-have!

@maelle
Member

maelle commented Jan 27, 2020

The workflow is rough at the moment: https://github.com/lockedata/cransays/blob/master/.github/workflows/master.yml (branch: https://github.com/lockedata/cransays/tree/history)

  • how do I create an orphan branch and then add CSV files to it without adding the rest? @stephlocke

  • add the actual submission time to the CSV.

Once the workflow is improved, add it to cron.yml.
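For the orphan-branch question, a sketch of one way to do it (demonstrated in a throwaway repo so nothing real is touched; the CSV name is invented):

```shell
# Set up a disposable demo repo with one tracked file on the default branch
cd "$(mktemp -d)"
git init -q .
git config user.email "ci@example.com" && git config user.name "ci"
echo "dashboard" > dashboard.Rmd
git add . && git commit -qm "main content"

# Create an orphan branch: it starts with no parent commits, but the index
# and worktree still carry the files from the previous branch
git checkout -q --orphan history

# Unstage everything inherited from the old branch, then remove the
# now-untracked files from the worktree (careful: clean -fd deletes files)
git rm -rfq --cached .
git clean -fdq

# Add only the CSV and commit it as the branch's first commit
echo "snapshot_time,package" > history.csv
git add history.csv
git commit -qm "start CSV history"

git ls-files   # only history.csv is tracked on this branch
```

From there, pushing the branch (`git push origin history`) would publish a history branch containing nothing but the CSV.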

@hadley
Contributor

hadley commented Sep 12, 2020

I think this was done in aa0ab6b

@maelle maelle closed this as completed Sep 14, 2020