Log of disappeared modules (github) #17
I am drafting the new data model, and this is the part I came up with that should be relevant to this issue: crai/crai/lib/Crai/Database.rakumod, lines 122 to 139 in 596b83f.
Every time the cron job runs (every hour), it will write down which archives it found on CPAN and GitHub. Then we can use SQL to query the difference between any two runs. A query like the following returns the names found in run ?1 but not in run ?2:

SELECT
    archives.meta_name
FROM
    encounters
INNER JOIN archives
    ON archives.url = encounters.archive_url
WHERE
    encounters.run_when = ?1
EXCEPT
SELECT
    archives.meta_name
FROM
    encounters
INNER JOIN archives
    ON archives.url = encounters.archive_url
WHERE
    encounters.run_when = ?2

If you want, we can even make it send you an email, post it in IRC, create a GH issue, or something similar.
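To make the direction of the difference concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database from Python. The schema is an assumption: it contains only the columns the query mentions, with guessed types, and is not the actual crai data model. The URLs, names, and run identifiers are made-up sample data.

```python
import sqlite3

# Hypothetical minimal schema: only the columns the query refers to.
# The column types are guesses, not the real crai schema.
SCHEMA = """
CREATE TABLE archives (url TEXT PRIMARY KEY, meta_name TEXT);
CREATE TABLE encounters (run_when INTEGER, archive_url TEXT);
"""

# The query from the comment, unchanged apart from whitespace.
DIFF_QUERY = """
SELECT archives.meta_name
FROM encounters
INNER JOIN archives ON archives.url = encounters.archive_url
WHERE encounters.run_when = ?1
EXCEPT
SELECT archives.meta_name
FROM encounters
INNER JOIN archives ON archives.url = encounters.archive_url
WHERE encounters.run_when = ?2
"""


def disappeared(conn, earlier_run, later_run):
    """Return names seen in earlier_run but not in later_run, sorted."""
    rows = conn.execute(DIFF_QUERY, (earlier_run, later_run)).fetchall()
    return sorted(name for (name,) in rows)


def demo():
    """Build sample data where 'Bar' is seen in run 1 but not in run 2."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO archives VALUES (?, ?)", [
        ("https://example.org/a.tar.gz", "Foo"),
        ("https://example.org/b.tar.gz", "Bar"),
    ])
    conn.executemany("INSERT INTO encounters VALUES (?, ?)", [
        (1, "https://example.org/a.tar.gz"),
        (1, "https://example.org/b.tar.gz"),
        (2, "https://example.org/a.tar.gz"),
    ])
    return conn, disappeared(conn, 1, 2)
```

Passing the earlier run as the first parameter lists what disappeared. Swapping the two arguments would list what newly appeared instead.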
You can now see how many archives it found on each run: https://crai.foldr.nl/runs. This can easily be extended to display which distributions it found and to compare that to other runs.
We will need to ignore runs that are outliers in terms of the number of archives encountered. It’s more likely that CPAN or GitHub was down, or that the cron job crashed, than that hundreds of packages were suddenly deleted. I don’t know much about statistics, so I will have to learn that first, which is fun!
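One possible starting point for spotting such runs, offered only as a sketch and not as what crai does, is the modified z-score built on the median absolute deviation (MAD). It stays stable when a few runs are wildly off. The function name, the 3.5 cutoff (a commonly quoted default for this score), and the sample counts below are all assumptions for illustration.

```python
from statistics import median


def outlier_runs(counts, threshold=3.5):
    """Return indices of runs whose archive count looks anomalous.

    counts: archive counts per run, in run order (hypothetical input).
    Uses the modified z-score 0.6745 * |x - median| / MAD.
    """
    if not counts:
        return []
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    if mad == 0:
        # Crude fallback: at least half of the counts equal the median,
        # so flag every run that differs from it.
        return [i for i, c in enumerate(counts) if c != med]
    return [
        i for i, c in enumerate(counts)
        if 0.6745 * abs(c - med) / mad > threshold
    ]


# Made-up counts: the fifth run found far fewer archives than the rest.
sample = [1500, 1502, 1501, 1503, 900, 1504]
suspicious = outlier_runs(sample)
```

Because the number of archives grows over time, computing the median over a recent window of runs instead of the whole history would probably fit better. This sketch also flags unusually high counts, which may not matter for this issue.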
Sometimes people decide to remove GitHub repos or to delete their modules. This is fine, but it's a pain for everyone when something depends on the deleted code. In the past I restored some modules by re-creating them in https://github.com/raku-community-modules from the git repos that zef stores locally, but I got lucky because I had actually installed these modules in the past. Now that crai provides tarballs it is less of an issue, but it'd be great to know when a module is deleted so that we can react more quickly. This should also make release management just a little bit less painful.
I think it'd be nice to have a simple log with timestamps and names of modules that no longer have accessible git repos.
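As a rough illustration of what such a log could look like, the sketch below compares consecutive runs and emits one line per name that was present in one run and missing from the next. The input shape (Unix timestamp plus a set of names), the function name, and the line format are all assumptions. It tracks names seen per run, not whether a git repo is still reachable.

```python
from datetime import datetime, timezone


def disappearance_log(runs):
    """Build log lines for names that vanish between consecutive runs.

    runs: iterable of (unix_timestamp, set_of_names) pairs, in any order.
    Each line carries the UTC time of the run where the name was first missing.
    """
    ordered = sorted(runs, key=lambda run: run[0])
    lines = []
    for (_, before), (when, after) in zip(ordered, ordered[1:]):
        stamp = datetime.fromtimestamp(when, tz=timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        )
        for name in sorted(before - after):
            lines.append(f"{stamp} {name}")
    return lines


# Made-up data: "Bar" is missing from the second hourly run.
example = disappearance_log([(0, {"Foo", "Bar"}), (3600, {"Foo"})])
```

Runs judged unreliable (for example, because an upstream source was down) would need to be dropped from the input first, otherwise every module would be logged as gone.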