
QA 2.0 (to go with Archiver 2.0) #24

Merged on Nov 20, 2015 (159 commits)


davidread
Contributor

This brings in substantial improvements from the data.gov.uk team. It is intended to work with Archiver 2.0 (ckan/ckanext-archiver#15).

Key changes are similar to the archiver:

  • results of the QA are now stored in a dedicated QA table, rather than in the TaskStatus objects (which were awful to query) and ResourceExtras (which make it look like your dataset changed every time you archive; if you're not careful, this change triggers another archival and you get an infinite loop)
  • the Openness report is done using the neat ckanext-report infrastructure, generated nightly, rather than generated on the fly in ckanext-qa templates. This avoids the big back-end load when you click on a report. Now the report is available to not just admins but the public too (let's be open about broken links - it helps to add pressure to getting them fixed)
  • lots more tests
  • rather than the data.gov.uk team working on their own fork, they contribute back to the whole. Whilst some teams have been using the root fork of ckanext-archiver, no-one has taken much responsibility for it and it is not well maintained.
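To illustrate the first point, here is a minimal sketch of what a dedicated results table buys you. It uses the standard-library `sqlite3` module rather than CKAN's model layer, and the table name, column names and helper functions are all hypothetical, not the schema this PR actually adds. The idea is only that each resource has one row that is overwritten on each QA run, so the resource itself is never touched and the results can be read back with a single query.

```python
import sqlite3

# Hypothetical schema: one row per resource holding its latest QA result.
# This is an illustration only, not the table defined by ckanext-qa.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE qa (
        resource_id    TEXT PRIMARY KEY,
        package_id     TEXT NOT NULL,
        openness_score INTEGER,
        reason         TEXT,
        format         TEXT,
        updated        TEXT
    )
    """
)


def save_qa_result(conn, resource_id, package_id, score, reason, fmt, updated):
    """Insert or overwrite the QA row for a resource.

    Only the qa table is written, so the resource record is not modified
    and nothing looks like a dataset change.
    """
    conn.execute(
        "INSERT OR REPLACE INTO qa VALUES (?, ?, ?, ?, ?, ?)",
        (resource_id, package_id, score, reason, fmt, updated),
    )
    conn.commit()


def scores_for_package(conn, package_id):
    """Return (resource_id, openness_score) pairs for one package in one query."""
    rows = conn.execute(
        "SELECT resource_id, openness_score FROM qa "
        "WHERE package_id = ? ORDER BY resource_id",
        (package_id,),
    )
    return rows.fetchall()
```

Running QA twice on the same resource leaves a single, updated row, which is what makes the nightly report cheap to generate.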

QA specific:

  • file format detection (rather than just looking at resource.format)
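As a rough illustration of detecting a format from content rather than trusting `resource.format`, the sketch below checks a few well-known leading byte signatures and only falls back to the declared format when nothing matches. The function name `sniff_format` and the small set of formats are assumptions made for this example; the sniffer in this PR is more thorough and is not reproduced here.

```python
def sniff_format(first_bytes, declared_format=None):
    """Guess a file format from its leading bytes.

    Hypothetical helper for illustration. Falls back to the declared
    format (upper-cased) only when no signature is recognised.
    """
    if first_bytes.startswith(b"%PDF-"):
        return "PDF"
    # ZIP containers (which also wrap formats such as XLSX and ODS).
    if first_bytes.startswith(b"PK\x03\x04"):
        return "ZIP"
    stripped = first_bytes.lstrip()
    if stripped.startswith(b"<?xml"):
        return "XML"
    # Heuristic only: JSON documents usually open with an object or array.
    if stripped.startswith((b"{", b"[")):
        return "JSON"
    return declared_format.upper() if declared_format else None
```

For example, a resource declared as CSV whose content starts with `%PDF-` would be reported as PDF, which is the kind of mismatch that looking only at the declared format would miss.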

To do:

  • Merge changes from okfn master
  • Change report HTML from Genshi to Jinja

threeaims and others added 30 commits August 8, 2012 14:49
…ts back live on DGU. Queries now execute fast enough so that Varnish does not time out before they return and templates are better formatted.

* Set filename of the all broken links CSV file to all_broken_links.csv (it was set to the built-in function id() before)
* Changed certain SQLAlchemy queries followed by multiple task_status queries into single SQL statements
* All reports only return data for active packages and resources
* Capitalized the openness score descriptions
* Tweaked all the main templates so that long links don't break column widths
* Made the "License not open" check come after the other checks, because we prefer to know if links are genuinely broken, rather than that information being suppressed by the license. Only working, non-openly licensed links now get the "License not open" message
* Noticed that requests library sometimes fails to get head requests after a 301 redirect, reporting bad port. Can't fix this but have added appropriate log messages
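The reordering of the "License not open" check can be sketched roughly as follows. The function name, its parameters, the zero score for both failure cases and the success message are assumptions for illustration; only the ordering (broken link first, license second) and the "License not open" message come from the commit message above.

```python
def score_link(is_broken, broken_reason, license_is_open, format_score):
    """Return (openness_score, reason) for one resource link.

    Hypothetical sketch: the broken-link check runs before the license
    check, so a closed license cannot hide the fact that a link is broken.
    """
    if is_broken:
        return 0, broken_reason
    # Only working links reach this point.
    if not license_is_open:
        return 0, "License not open"
    return format_score, "Link works and license is open"
```

With this ordering, a broken link under a closed license is reported with its real failure reason rather than with the license message.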
…o not have every page filtered for insertion of qa.css etc. Much improved logging in tasks.py.
…st time link worked. Excluded "Bad content type" from broken links report. Improved "reason" field to include more detail of why the score was given, instead of a description of the score which is useless. Added many data types, based on DGU data formats. Fixed up test_tasks.py
…e we have a better list of formats that we are interested in. Removing unfinished attempt to correlate mime-types with extension. Promote mime-type as most important thing - holds publishers to account more. Broken datasets report - fixed rows getting jumbled, updated dates being wrong, hours and minutes mixed up.
…CKAN now alert bulk/priority queue as appropriate. Can now specify group (instead of package) in update command.
…ow defaults to correct queue when you specify a group. Cope with groups with deleted datasets in (!). Broken link reports now cope when no results. File sniffer now copes with unicode in filenames. Sniffer has more logging when it succeeds. Tests fixed up.
… score, since solr was getting overloaded.
davidread pushed a commit that referenced this pull request Nov 20, 2015
QA 2.0 (to go with Archiver 2.0)
@davidread davidread merged commit c1c3d6c into ckan:master Nov 20, 2015
@davidread davidread deleted the for-2.0-release branch November 20, 2015 18:34
HristijanVilos pushed a commit to keitaroinc/ckanext-qa that referenced this pull request Jul 18, 2022
6 participants