No description, website, or topics provided.
Branch: master
Clone or download
Pull request Compare This branch is even with jonlazaro:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
bizkaisense_sparql
bizkaisense_sql
rdf_generators
slides
sql_generators
.gitignore
README.md

README.md

EuskalSense (Formerly BizkaiSense)

Bizkaisense is an effort on the publication of enviromental Open Data from Basque Country as Linked Open Data. In this project we scrap, store, semantize and visualize air and drinking water quality measures.

At this moment the scripts scrap information from more than 70 air quality stations, more than 800 water sources and generate more than 9,000,000 observations.

This repository includes the scripts, the Django website in which we visualize the data, and the D2R script that makes the conversion from SQL to RDF.

In the first steps of the project we built some scripts that generate directly RDF data. These data was stored in a RDF store, and the Django website was entirely based on SPARQL queries over that RDF store. The performance of the website wasn't acceptable, as the response time of the RDF store to complex SPARQL queries was too high, due to the high volume of stored data. So we decided to modify the project and build it on top of a traditional SQL database. The RDF data is now generated with D2R (http://d2rq.org/d2r-server), and the mapping file is included on the repository. In a last update of the project we have improved the performance of the data generation scripts using Hadoop. We have developed some Pig scripts that make the same work as those in Python, but based on distributed computing, decreasing notably the data generation time.

Data included in this repository:

Scripts that get air and drinking water quality measures of Basque Country from Open Data Euskadi (ODE):

  • The folder sql_generators/pig contains the Pig script that gets measures from ODE using Hadoop, and stores them in a SQL database.
  • The folder sql_generators contains the Python scripts that get measures from ODE and store them in a SQL database.
  • The folder rdf_generators contains the Python scripts that get measures from ODE and create RDF triples based on SSN ontology.

Django websites:

  • The folder bizkaisense_sql contains the SQL-based website.
  • The folder bizkaisense_sparql contains the SPARQL-based website (low performance with high volume of data).

D2R mapping:

  • The "bizkaisense_sql/bizkaisense.ttl" file contains the mapping used for the conversion from SQL to RDF.

Slides:

  • The slides folder contains a presentation explaining the details of the project (spanish).