Add concepts derived from wikidata to nwbib-spatial #85
Comments
In a slide for the NWBib meeting I added an example of what it could/should look like after transformation to SKOS:
<http://purl.org/lobid/nwbib-spatial#n9Q7924>
a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel "Regierungsbezirk Arnsberg"@de ;
foaf:focus <http://www.wikidata.org/entity/Q7924> ;
skos:broader <http://purl.org/lobid/nwbib-spatial#n9> ;
skos:narrower <http://purl.org/lobid/nwbib-spatial#9Q1295> ;
skos:notation "9Q7924" .
<http://purl.org/lobid/nwbib-spatial#9Q1295>
a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel "Dortmund"@de ;
foaf:focus <http://www.wikidata.org/entity/Q1295> ;
skos:broader <http://purl.org/lobid/nwbib-spatial#n9Q7924> ;
skos:narrower .... ;
skos:notation "9Q1295" .
<http://purl.org/lobid/nwbib-spatial#n9>
a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel "Regierungsbezirke, Kreise, Orte. Euregio"@de ;
skos:narrower <http://purl.org/lobid/nwbib-spatial#n9Q7924> ;
skos:notation "9" . |
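For illustration, the pattern in the example above can be sketched in Python. This is only a sketch under assumptions: the function name `concept_turtle` is hypothetical, and the URI pattern (`#n` followed by the parent notation and the QID) is read off the slide example, not a settled decision.

```python
SCHEME = "http://purl.org/lobid/nwbib-spatial"


def concept_turtle(parent_notation, qid, label, scheme=SCHEME):
    """Render one Wikidata-derived concept as Turtle (hypothetical helper).

    Assumes the notation is the parent notation followed by the QID and
    that the concept URI is '#n' plus that notation, as in the example.
    """
    notation = f"{parent_notation}{qid}"
    # Escape backslashes and double quotes so the label stays a valid literal.
    safe_label = label.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        f"<{scheme}#n{notation}>",
        "    a skos:Concept ;",
        f"    skos:inScheme <{scheme}> ;",
        f'    skos:prefLabel "{safe_label}"@de ;',
        f"    foaf:focus <http://www.wikidata.org/entity/{qid}> ;",
        f"    skos:broader <{scheme}#n{parent_notation}> ;",
        f'    skos:notation "{notation}" .',
    ])


if __name__ == "__main__":
    print(concept_turtle("9", "Q7924", "Regierungsbezirk Arnsberg"))
```

The sketch leaves out `skos:narrower`, since those links could be derived from the `skos:broader` statements of the child concepts.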
I started playing with SPARQL CONSTRUCT to create the file. I had several problems running the query via curl. In the end I found out that the query runs once it contains no tabs. Here is the result of the current query (Q1295/Dortmund, as in the example above):
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix wd: <http://www.wikidata.org/entity/> .
wd:Q1295 a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel "Dortmund"@de ;
foaf:focus wd:Q1295 ;
skos:notation "9http://www.wikidata.org/entity/Q1295" ;
skos:broader wd:Q7924 .
It already looks quite good. Here is the query for curl:
curl -H "Accept: text/turtle" -G "https://query.wikidata.org/sparql" --data-urlencode query='
CONSTRUCT {
?item a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel ?itemLabel ;
foaf:focus ?item ;
skos:notation ?notation ;
skos:broader ?broader .
}
WHERE {
{
{ ?item wdt:P131* wd:Q1198 . }
UNION
{ ?item p:P131 [ ps:P131 wd:Q1198 ] . }
{ ?item wdt:P131 ?broader . }
{ ?item p:P31 [ ps:P31 wd:Q829277 ] . } # Regierungsbezirk in NRW
UNION
{ ?item p:P31 [ ps:P31 wd:Q106658 ] . } # Landkreis in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q5283531 ] . } # Landkreis in Preußen
UNION
{ ?item p:P31 [ ps:P31 wd:Q262166 ] . } # Gemeinde in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q22865 ] . } # kreisfreie Stadt in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q253019 ]. } # Ortsteil
UNION
{ ?item p:P31 [ ps:P31 wd:Q2983893 ]. } # Stadtteil
UNION
{ ?item p:P31 [ ps:P31 wd:Q42744322 ]. } # Stadtgemeinde Deutschlands
UNION
{ ?item p:P31 [ ps:P31 wd:Q134626 ]. } # Kreisstadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q448801 ]. } # Große Kreisstadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q1548518 ]. } # Große kreisangehörige Stadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q54935786 ]. } # Mittlere kreisangehörige Stadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q1852178 ] . } # Stadtteil von Düsseldorf
UNION
{ ?item p:P31 [ ps:P31 wd:Q15632166 ] . } # Stadtteil von Köln
UNION
{ ?item wdt:P31/wdt:P279* wd:Q3146899 . } # Diözese der katholischen Kirche
UNION
{ ?item p:P361 [ps:P361 wd:Q1380992 ] . } # Teil der ev. Kirche im Rheinland
UNION
{ ?item p:P361 [ ps:P361 wd:Q1381014 ] . } # Teil der ev. Kirche Westfalen
UNION
{ ?item p:P31 [ ps:P31 wd:Q1780389 ] . } # Kommunalverband der besonderen Art (currently only "Städteregion Aachen")
UNION
{ ?item wdt:P31/wdt:P279* wd:Q4286337 . } # Stadtbezirk; comment out for Geocache
}
FILTER (?item != wd:Q1787449 && ?item != wd:Q16500124 && ?item != wd:Q1465811 && ?item != wd:Q1787449
&& ?item != wd:Q16832627 && ?item != wd:Q1113210 && ?item != wd:Q19288281 && ?item != wd:Q1662807
&& ?item != wd:Q1351319 ) # filter out former districts (Altkreise) that share a name with current districts (Neukreise)
BIND(CONCAT("9", STR(?item)) AS ?notation)
SERVICE wikibase:label { bd:serviceParam wikibase:language "de" }
}' |
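As a rough Python equivalent of the curl call, the following sketch replaces tabs with spaces before URL-encoding the query. The thread only reports that queries containing tabs failed via curl; why they fail is not established here, and `build_request_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"


def build_request_url(query):
    """Return a GET URL for the endpoint with tabs in the query replaced."""
    # Tabs are swapped for single spaces, following the observation above.
    cleaned = query.replace("\t", " ")
    return f"{ENDPOINT}?{urlencode({'query': cleaned})}"


if __name__ == "__main__":
    sample = "CONSTRUCT {\n\t?item a skos:Concept .\n} WHERE { ?item wdt:P131 wd:Q1198 . }"
    print(build_request_url(sample))
```

The resulting URL would still need to be requested with the `Accept: text/turtle` header, as in the curl command.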
There are some that have a statement |
Oops, this is not correct, as there are also clerical regions (Dekanate, Kirchenkreise, etc.). We should just remove them from the query, resulting in this one:
curl -H "Accept: text/turtle" -G "https://query.wikidata.org/sparql" --data-urlencode query='
CONSTRUCT {
?item a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel ?itemLabel ;
foaf:focus ?item ;
skos:notation ?notation ;
skos:broader ?broader .
}
WHERE {
{
{ ?item wdt:P131* wd:Q1198 . }
UNION
{ ?item p:P131 [ ps:P131 wd:Q1198 ] . }
{ ?item wdt:P131 ?broader . }
{ ?item p:P31 [ ps:P31 wd:Q829277 ] . } # Regierungsbezirk in NRW
UNION
{ ?item p:P31 [ ps:P31 wd:Q106658 ] . } # Landkreis in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q5283531 ] . } # Landkreis in Preußen
UNION
{ ?item p:P31 [ ps:P31 wd:Q262166 ] . } # Gemeinde in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q22865 ] . } # kreisfreie Stadt in Deutschland
UNION
{ ?item p:P31 [ ps:P31 wd:Q253019 ]. } # Ortsteil
UNION
{ ?item p:P31 [ ps:P31 wd:Q2983893 ]. } # Stadtteil
UNION
{ ?item p:P31 [ ps:P31 wd:Q42744322 ]. } # Stadtgemeinde Deutschlands
UNION
{ ?item p:P31 [ ps:P31 wd:Q134626 ]. } # Kreisstadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q448801 ]. } # Große Kreisstadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q1548518 ]. } # Große kreisangehörige Stadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q54935786 ]. } # Mittlere kreisangehörige Stadt
UNION
{ ?item p:P31 [ ps:P31 wd:Q1852178 ] . } # Stadtteil von Düsseldorf
UNION
{ ?item p:P31 [ ps:P31 wd:Q15632166 ] . } # Stadtteil von Köln
UNION
{ ?item p:P31 [ ps:P31 wd:Q1780389 ] . } # Kommunalverband der besonderen Art (currently only "Städteregion Aachen")
UNION
{ ?item wdt:P31/wdt:P279* wd:Q4286337 . } # Stadtbezirk; comment out for Geocache
}
FILTER (?item != wd:Q1787449 && ?item != wd:Q16500124 && ?item != wd:Q1465811 && ?item != wd:Q1787449
&& ?item != wd:Q16832627 && ?item != wd:Q1113210 && ?item != wd:Q19288281 && ?item != wd:Q1662807
&& ?item != wd:Q1351319 ) # filter out former districts (Altkreise) that share a name with current districts (Neukreise)
BIND(CONCAT("9", STR(?item)) AS ?notation)
SERVICE wikibase:label { bd:serviceParam wikibase:language "de" }
}' |
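Since the list of instance-of classes keeps changing (here the church-related patterns were dropped), one option would be to generate the UNION block from a plain list of QIDs. A minimal sketch, assuming a hypothetical helper `type_union` and covering only the `p:P31 [ ps:P31 ... ]` patterns:

```python
def type_union(qids, var="?item"):
    """Build the UNION block of instance-of patterns for the given class QIDs."""
    patterns = [f"{{ {var} p:P31 [ ps:P31 wd:{qid} ] . }}" for qid in qids]
    return "\nUNION\n".join(patterns)


if __name__ == "__main__":
    # Two of the classes used in the query above.
    print(type_union(["Q829277", "Q106658"]))
```

Removing a class then becomes a one-line change in the list instead of an edit inside the query text.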
Finally, I managed to create the whole SKOS file via SPARQL:
$ curl -H "Accept: text/turtle" -G "https://query.wikidata.org/sparql" --data-urlencode query='
CONSTRUCT {
?lobidURI a skos:Concept ;
skos:inScheme <http://purl.org/lobid/nwbib-spatial> ;
skos:prefLabel ?wikidataURILabel ;
foaf:focus ?wikidataURI ;
skos:notation ?QID ;
skos:broader ?broaderURI .
}
WHERE {
{
{ ?wikidataURI wdt:P131* wd:Q1198 . }
UNION
{ ?wikidataURI p:P131 [ ps:P131 wd:Q1198 ] . }
{ ?wikidataURI p:P31 [ ps:P31 wd:Q829277 ] . } # Regierungsbezirk in NRW
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q106658 ] . } # Landkreis in Deutschland
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q5283531 ] . } # Landkreis in Preußen
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q262166 ] . } # Gemeinde in Deutschland
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q22865 ] . } # kreisfreie Stadt in Deutschland
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q253019 ]. } # Ortsteil
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q2983893 ]. } # Stadtteil
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q42744322 ]. } # Stadtgemeinde Deutschlands
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q134626 ]. } # Kreisstadt
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q448801 ]. } # Große Kreisstadt
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q1548518 ]. } # Große kreisangehörige Stadt
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q54935786 ]. } # Mittlere kreisangehörige Stadt
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q1852178 ] . } # Stadtteil von Düsseldorf
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q15632166 ] . } # Stadtteil von Köln
UNION
{ ?wikidataURI p:P31 [ ps:P31 wd:Q1780389 ] . } # Kommunalverband der besonderen Art (currently only "Städteregion Aachen")
UNION
{ ?wikidataURI wdt:P31/wdt:P279* wd:Q4286337 . } # Stadtbezirk; comment out for Geocache
OPTIONAL { ?wikidataURI wdt:P131 ?broader . }
}
# FILTER (?wikidataURI in (wd:Q1295))
FILTER (?wikidataURI != wd:Q1787449 && ?wikidataURI != wd:Q16500124 && ?wikidataURI != wd:Q1465811 && ?wikidataURI != wd:Q1787449
&& ?wikidataURI != wd:Q16832627 && ?wikidataURI != wd:Q1113210 && ?wikidataURI != wd:Q19288281 && ?wikidataURI != wd:Q1662807
&& ?wikidataURI != wd:Q1351319 ) # filter out former districts (Altkreise) that share a name with current districts (Neukreise)
BIND (STRAFTER (STR(?wikidataURI),"entity/") AS ?QID)
BIND (STRAFTER (STR(?broader),"entity/") AS ?broaderQID)
BIND (URI(CONCAT ("http://purl.org/lobid/nwbib-spatial#", ?QID)) AS ?lobidURI)
BIND (URI(CONCAT ("http://purl.org/lobid/nwbib-spatial#", ?broaderQID)) AS ?broaderURI)
SERVICE wikibase:label { bd:serviceParam wikibase:language "de" }
}'
There is some post-processing to do anyway:
@fsteeg, will you do 1.) and 2.) and then add the result to the SKOS file? I will then see what to do regarding 3.). |
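The BIND steps in the query above (STRAFTER, then CONCAT into a URI) can be mirrored outside SPARQL, for example when post-processing. A minimal sketch; the helper names `str_after`, `lobid_uri` and `broader_uri` are hypothetical:

```python
SCHEME = "http://purl.org/lobid/nwbib-spatial"


def str_after(value, marker):
    """Mimic SPARQL STRAFTER: text after the first marker, or '' if absent."""
    _, found, rest = value.partition(marker)
    return rest if found else ""


def lobid_uri(wikidata_uri):
    """Derive the nwbib-spatial URI from a Wikidata entity URI."""
    qid = str_after(wikidata_uri, "entity/")
    return f"{SCHEME}#{qid}"


def broader_uri(broader):
    """Return the broader concept URI, or None when ?broader was unbound."""
    return lobid_uri(broader) if broader else None


if __name__ == "__main__":
    print(lobid_uri("http://www.wikidata.org/entity/Q1295"))
    print(broader_uri(None))
```

The `None` case corresponds to the OPTIONAL pattern: without a P131 value, no `skos:broader` triple is produced for that item.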
The SPARQL query from #85 (comment) is fine, but the SPARQL endpoint does not seem to finish the CONSTRUCT for every entity. Take, for example, Q2362403, which only has one triple in the resulting Turtle: but when I run the same query with a filter on only this resource (
One solution might be to do this in steps or to use the LDF endpoint... |
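One way to read "in steps" is to collect the QIDs first and then run the CONSTRUCT once per batch, restricted by a VALUES clause. This is only a sketch of that reading: the helper names and the batch size are assumptions, and the LDF endpoint option is not covered.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def values_clause(qids, var="?wikidataURI"):
    """Build a SPARQL VALUES clause restricting `var` to the given QIDs."""
    return f"VALUES {var} {{ " + " ".join(f"wd:{qid}" for qid in qids) + " }"


if __name__ == "__main__":
    qids = ["Q1295", "Q2103", "Q2362403"]
    # A batch size of 2 is arbitrary and only keeps the demo output short.
    for batch in chunked(qids, 2):
        print(values_clause(batch))
```

Each generated clause would be placed inside the WHERE block of the CONSTRUCT query, so that every request only has to build triples for a small set of entities.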
Data from https://nwbib.de/spatial, manually added ConceptScheme See #85
As discussed offline, I generated a SKOS file from the current data at https://nwbib.de/spatial: 660c094 (Raw file at https://raw.githubusercontent.com/hbz/lobid-vocabs/660c0949dee6ec900d3ac058f023c629920d907c/nwbib/nwbib-spatial.ttl) We have the number of hits in NWBib at that point, so if that makes sense, we can add them to the file. |
Looks good except for one thing:
No, those numbers don't make sense in the SKOS file. |
Required to use the org.apache.jena.vocabulary package See hbz/lobid-vocabs#85
Fixed foaf:focus values and generated ConceptScheme data from the RDF model: |
I just noticed that the end date is also part of the prefLabel, e.g.:
This is technically not correct. We will have to think about how to handle this. One option is to use |
As discussed on the mailing list today, we should take care of identifying "Stadtbezirke" via the label when generating the SKOS file. I will add this as a task to the original issue: "Add suffix " (Stadtbezirk)" to the label when |
Use new P6814 query, AGS or KS as notation, tweak prefLabel See #85
Regenerated after fixing an issue: https://raw.githubusercontent.com/hbz/lobid-vocabs/43f85d1fc0fbd7052ef8af646eed5a9f53293b0c/nwbib/nwbib-spatial.ttl But noticed a problem: many entries now have multiple |
Yes, I mentioned this yesterday. The solution is to discard the |
Otherwise the file looks fine except for one error in the
nwbib-spatial:Q2103 a skos:Concept ;
skos:broader nwbib-spatial:Q7924 ;
skos:inScheme <https://nwbib.de/spatial> ;
skos:notation "05911000" ;
skos:prefLabel "Bochum"@de ;
foaf:focus nwbib-spatial:Q2103 .
should become
nwbib-spatial:Q2103 a skos:Concept ;
skos:broader nwbib-spatial:Q7924 ;
skos:inScheme <https://nwbib.de/spatial> ;
skos:notation "05911000" ;
skos:prefLabel "Bochum"@de ;
foaf:focus wd:Q2103 .
Another thing: The
lobid-vocabs/nwbib/nwbib-spatial.ttl Lines 54 to 60 in 7e54138
|
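The `foaf:focus` correction shown above (pointing to `wd:` instead of `nwbib-spatial:`) could also be applied to an already generated file as a text-level post-processing step. A hypothetical sketch, assuming the `wd:` prefix is declared in the file and that only prefixed names of the form `nwbib-spatial:Q...` need changing:

```python
import re

# Matches 'foaf:focus nwbib-spatial:Q123' and captures the QID part.
FOCUS_PATTERN = re.compile(r"(foaf:focus\s+)nwbib-spatial:(Q\d+)")


def fix_focus(turtle):
    """Point foaf:focus at the Wikidata entity instead of the concept itself."""
    return FOCUS_PATTERN.sub(r"\1wd:\2", turtle)


if __name__ == "__main__":
    broken = "skos:broader nwbib-spatial:Q7924 ;\nfoaf:focus nwbib-spatial:Q2103 ."
    print(fix_focus(broken))
```

Other properties such as `skos:broader` are left untouched, since they are meant to link concepts within the scheme.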
Latest version (no multiple Remaining TODO in this issue: retain |
Latest version including original |
Looks good. I think we are done with this issue. +1 |
In the first comment here, there's a TODO:
Is that (still) relevant? |
If true, enrich SKOS file using Wikidata and non-90s-qids.json See: hbz/lobid-vocabs#85 hbz/lobid-vocabs#86
I don't think so. If I remember correctly, adding this created some other problems (e.g. double mention of "Stadtbezirk" in some labels). We won't implement this and will pick it up if editors ask for it again. |
Similar to the process from hbz/nwbib#397. Currently a skos:Concept in nwbib-spatial has the following information (example):
lobid-vocabs/nwbib/nwbib-spatial.ttl Lines 95 to 100 in aab7725
We will have to add a link to the Wikidata entity the concept is derived from (using the property http://xmlns.com/foaf/0.1/focus).
Open questions:
What URIs to use for Wikidata-derived concepts? Options:
http://www.wikidata.org/entity/Q884315
http://purl.org/lobid/nwbib-spatial#nQ884315 or http://purl.org/lobid/nwbib-spatial#Q884315
http://purl.org/lobid/nwbib-spatial#24Q884315
Add suffix " (Stadtbezirk)" to the label when ?item wdt:P31/wdt:P279* wd:Q4286337 and "Stadtbezirk" is not already part of the label
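The suffix rule in this TODO is small enough to sketch directly. The function name and the boolean flag are hypothetical; the flag stands for the result of the `wdt:P31/wdt:P279* wd:Q4286337` check, which would have to come from the query.

```python
SUFFIX = " (Stadtbezirk)"


def label_with_suffix(label, is_stadtbezirk):
    """Append ' (Stadtbezirk)' unless the label already mentions it."""
    if is_stadtbezirk and "Stadtbezirk" not in label:
        return label + SUFFIX
    return label


if __name__ == "__main__":
    print(label_with_suffix("Innenstadt", True))
    print(label_with_suffix("Stadtbezirk 1", True))
    print(label_with_suffix("Dortmund", False))
```

The substring check is meant to avoid a double mention of "Stadtbezirk" in labels that already contain the word.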