
Semantic data.gov.uk (0.8.5)

This project is a fork of datagouvfr-rdf, adapted to the British Open Data portal metadata (data.gov.uk).

You can run SPARQL queries against the endpoint here.
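As an illustration, here is a minimal Python sketch of how a query could be sent to a SPARQL endpoint over the standard SPARQL protocol (a GET request with a `query` parameter). The endpoint URL is a hypothetical local Fuseki address, not the project's real endpoint, and the query assumes that datasets are typed `dcat:Dataset` with a `dcterms:title`, which may not match the actual data model.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical endpoint (Fuseki's usual /<dataset>/sparql layout); replace with the real one.
ENDPOINT = "http://localhost:3030/datagovuk/sparql"

# Assumption: datasets are typed dcat:Dataset and carry a dcterms:title.
QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
SELECT ?dataset ?title WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?title .
} LIMIT 10
"""


def build_query_url(endpoint, query):
    """Build a SPARQL protocol GET URL with the query URL-encoded."""
    return endpoint + "?" + urlencode({"query": query.strip()})


def fetch_results(endpoint, query):
    """Send the query and return the result bindings (requires a running endpoint)."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `fetch_results(ENDPOINT, QUERY)` against a running repository should return a list of bindings, one per matching dataset.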

The script is fully functional; it is not an alpha or beta release.

Update script

build.xml is an Apache Ant script that runs the following tasks:

  1. Downloading the latest metadata dumps from data.gov.uk (CSV)
  2. Cleaning the data dumps (empty lines, spaces in CSV headers, etc.)
  3. Converting the CSV into RDF (using TARQL)
  4. Uploading the RDF to a repository
  5. Converting text identifiers into URIs for better linking across the data
  6. Integrating the output of beheader into the graph (soon)
  7. Adding some metadata about the resulting data set (DCAT, VoID, PROV)
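To make steps 2 and 5 more concrete, here is a minimal Python sketch. It is not the project's implementation (the real work is done by the Ant targets and TARQL queries). The base URI, the function names, and the choice to replace header spaces with underscores are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical base URI; the project's real URI scheme is not shown here.
BASE_URI = "http://example.org/datasets/"


def clean_csv(raw):
    """Step 2 (sketch): drop empty rows and remove spaces from header names."""
    rows = [
        row
        for row in csv.reader(io.StringIO(raw, newline=""))
        if any(cell.strip() for cell in row)  # skip empty and blank-only rows
    ]
    if not rows:
        return ""
    # Assumption: trim header names and replace inner spaces with underscores.
    rows[0] = [name.strip().replace(" ", "_") for name in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


def identifier_to_uri(identifier):
    """Step 5 (sketch): turn a text identifier into a URI under BASE_URI."""
    # Trim surrounding whitespace, then percent-encode everything unsafe.
    return BASE_URI + quote(identifier.strip(), safe="")
```

For example, `clean_csv("Dataset Name, Publisher \n\nfoo,bar\n")` yields a two-line CSV with the header `Dataset_Name,Publisher`.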

This script is run every night to update the RDF metadata.

The data model can be seen here.

Requirements

  • Apache Ant, with [ANT INSTALL]/bin directory added to your PATH environment variable
  • cURL, with [CURL INSTALL] directory added to your PATH environment variable
  • TARQL by Richard Cyganiak (@cygri), with [TARQL INSTALL] directory added to your PATH environment variable
  • An RDF repository. Apache Fuseki is a good choice, but there are plenty of alternatives.

Configuration

  • Copy upload_template.properties and rename the copy to upload.properties
  • Open it and fill in the values. As-is, the configuration assumes that your repository requires a user:password combination
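Before running the script, it can help to check that no value in upload.properties was left empty. The Python sketch below shows one way to do that. The key names in the usage example are hypothetical (the actual keys are the ones in upload_template.properties), and the parser handles only simple `key=value` lines and comments, not the full Java properties syntax.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def missing_values(props):
    """Return the sorted keys whose value is still empty."""
    return sorted(key for key, value in props.items() if not value)
```

For instance, with the hypothetical content `repository.url=http://localhost:3030/ds` followed by `repository.user=`, `missing_values(parse_properties(text))` reports `repository.user` as unfilled.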

Run it

  • If the requirements are met, run ant in the datagovuk-rdf root folder.
  • If you have already run the process and just want to reload the data into the triple store, run ant quick.

Next steps

  • Tell me!

Contact

I would love to read your feedback/comments/suggestions!

If you have a Github account, you can create an issue.

Otherwise, you can reach me:

Change log

0.8.5
  • Fixed malformed URLs by trimming trailing space before upload
0.8.4
  • Detection of machine-readable resources (dgfr:machineReadable)
0.8.3
  • Added backup-repository and load-backup targets to enable the management of the repository as a service
  • Added data integration from beheader
0.8.2
  • Fixed dcat:downloadUrl
0.8.1
  • Fixed missing directories (csv and rdf)

0.8.0

  • Adapted scripts and queries to data.gov.uk setup (#1)

Pre-fork change log

0.7.0

  • Added properties dgfr:responseStatusCode, dgfr:responseTime and dgfr:availabilityCheckedOn to the ontology and API configuration
  • Added direct link between organizations and published distributions (see the result in the data model)
  • Added a view for unavailable resources in the API (https://www.data.maudry.com/fr/resources/unavailable)
  • Icons for boolean values (true/false) are clearer now

0.6.0

0.5.0

  • Availability and unavailability count at dataset and organization levels
0.4.3
  • Made SPARQL endpoint configuration more flexible
0.4.2
  • Fixed errors in ontology
0.4.1
  • Disabled archiving of RDF due to disk space. Will enable again when I have a clearer archiving strategy.

0.4.0

  • Calculation of popularity points for all objects, and aggregate sums on organisations and datasets
  • Integration of the data collected by beheader (availability of the distributions, content type, content length)
0.3.3
  • Enabled ETL with previously downloaded data to have CasanovaLD up quicker
0.3.2
  • Not much...
0.3.1

0.3.0

  • The RDF data is now loaded in a single atomic transaction in the repository
  • Switch from Dydra (http://dydra.com) to a local Apache Fuseki instance
  • Added organizations and reuses data, with all identifiers turned into URIs for full linking
0.2.1
  • That was a lame name. Say hi to CasanovaLD!
  • Improved documentation

0.2.0

  • The data.gouv.fr explorer app, with somewhat documented APIs, is live!
  • URIs have changed to match the domain of the app
  • Added dgfr:visits and dcterms:keywords (as comma-separated list, meh) in the data
0.1.5
0.1.4
  • Fixed missing properties (mismatch at conversion stage). Still no tags
0.1.3
  • Fixed RDF dataset modification date
0.1.2
  • Fixed resources that have spaces in their URLs (url-encode)
  • Added dgfr:slug for datasets
0.1.1
  • Configured upload and update of VoID and PROV metadata (in default graph)
  • Enabled scheduled task to update data every day

0.1.0

  • Script to download/clean/convert/publish data.gouv.fr dataset metadata
  • Basic documentation
