Gets all outdoor gyms in this project from Wikidata and checks whether each one is connected to an OSM object, using an [API](https://osm.wikidata.link/tagged)

* The project: [github salgo60/ProjectOutdoorGyms](https://github.com/salgo60/ProjectOutdoorGyms)
* This [notebook](https://github.com/salgo60/ProjectOutdoorGyms/blob/main/Jupyter/OSM_Wikidata_OutdoorGyms.ipynb)

* API [Wikidata to OpenStreetMap](https://osm.wikidata.link/tagged)
  * e.g. [https://osm.wikidata.link/tagged/api/item/Q106708773](https://osm.wikidata.link/tagged/api/item/Q106708773)

* Another tool [osm.wikidata.link](https://osm.wikidata.link/search)
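
As a rough sketch of how the per-item endpoint is used in this notebook: the loop further down builds one URL per QID and reads `data["osm"][0]["id"]` from the decoded JSON. The helper names `item_url` and `first_osm_id` and the sample payloads are hypothetical and only illustrate that assumed response shape; no other fields of the API are assumed.

```python
API_BASE = "https://osm.wikidata.link/tagged/api/item/"


def item_url(qid):
    """Build the per-item API URL for a Wikidata QID such as 'Q106708773'."""
    return API_BASE + qid


def first_osm_id(payload):
    """Return the id of the first OSM object in a decoded response, or '' if none.

    Assumes the shape this notebook relies on: {"osm": [{"id": ...}, ...]}.
    """
    try:
        return payload["osm"][0]["id"]
    except (KeyError, IndexError, TypeError):
        return ""


# Hypothetical payloads for illustration only (not real API responses).
linked = {"osm": [{"id": 762972071}]}
unlinked = {"osm": []}

print(item_url("Q106708773"))
print(first_osm_id(linked))
print(first_osm_id(unlinked) == "")
```

Returning an empty string for a missing link matches the `osmid == ""` filter used later in the notebook.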
  
Status:

| Date | Outdoor gyms in Wikidata | Not linked to OSM |
| ------------- |:-------------:| -----:|
| 20220529 | 1464 | 1217 |
| 20211119 | 1439 | 1121 |
| 20210917 | 1434 | 1116 |
| 20210823 | 1415 | 1105 |
| 20210717 | 1193 | 1024 |
| 20210712 | 1131 | 979 |
| 20210706 | 809 | 706 |
| 20210614 | 216 | 203 |
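
The table gives the total number of gyms and the number without a linked OSM object, so the number and share of linked gyms follow by subtraction. A minimal sketch, using three rows copied from the table; the function name `linked_share` is made up for this example.

```python
# (date, gyms in Wikidata, gyms without an OSM link), copied from the status table
status = [
    ("20220529", 1464, 1217),
    ("20211119", 1439, 1121),
    ("20210614", 216, 203),
]


def linked_share(total, missing):
    """Return (number linked, percentage linked) for one status row."""
    linked = total - missing
    return linked, 100.0 * linked / total


for date, total, missing in status:
    linked, pct = linked_share(total, missing)
    print(f"{date}: {linked} of {total} linked ({pct:.1f}%)")
```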



In [1]:
from datetime import datetime
start_time  = datetime.now()
print("Last run: ", start_time)

Last run:  2022-05-29 09:24:24.446576


In [2]:
import pandas as pd


In [3]:
#
# pip install sparqlwrapper
# https://rdflib.github.io/sparqlwrapper/

import sys,json
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint_url = "https://query.wikidata.org/sparql"
 
# https://w.wiki/3Uni
queryGym = """SELECT (REPLACE(STR(?node), ".*Q", "Q") AS ?qid) ?nodeLabel WHERE {
  VALUES ?nodeProj {wd:Q107186275}
  ?node wdt:P6104 ?nodeProj.
  minus   { ?node wikibase:propertyType ?type} # not properties

  SERVICE wikibase:label { bd:serviceParam wikibase:language "sv,en". }
}
ORDER BY (?nodeLabel)"""


def get_sparql_dataframe(endpoint_url, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    user_agent = "salgo60/%s.%s" % (sys.version_info[0], sys.version_info[1])
 
    sparql = SPARQLWrapper(endpoint_url, agent=user_agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    result = sparql.query()

    processed_results = json.load(result.response)
    cols = processed_results['head']['vars']

    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)

    return pd.DataFrame(out, columns=cols)

WDGym = get_sparql_dataframe(endpoint_url, queryGym)
WDGym["Source"] = "WD"     
WDGym.shape

(1464, 3)
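
The query turns each entity URI into a bare QID with `REPLACE(STR(?node), ".*Q", "Q")`. The same transformation can be sketched in Python with `re.sub`; the function name `qid_from_uri` is hypothetical and not part of the notebook.

```python
import re


def qid_from_uri(uri):
    """Mirror REPLACE(STR(?node), ".*Q", "Q"): drop everything before the last 'Q'."""
    return re.sub(r".*Q", "Q", uri)


print(qid_from_uri("http://www.wikidata.org/entity/Q107206033"))  # Q107206033
```

Because `.*` is greedy, the match runs up to the last `Q` in the string, which for a Wikidata entity URI is the start of the QID.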

In [4]:
WDGym.head()

Unnamed: 0,qid,nodeLabel,Source
0,Q107206033,Q107206033,WD
1,Q107206129,Q107206129,WD
2,Q107315418,Q107315418,WD
3,Q107326096,Q107326096,WD
4,Q107327233,Q107327233,WD


In [5]:
import urllib3, json
from tqdm import tqdm
http = urllib3.PoolManager()

listGym = []
for _, row in tqdm(WDGym.iterrows(), total=WDGym.shape[0]):
    url = "https://osm.wikidata.link/tagged/api/item/" + row["qid"]

    new_item = dict()
    new_item['wikidata'] = row["qid"]
    # Reset per item so a failed request never reuses the previous response
    data = {}
    try:
        r = http.request('GET', url)
        data = json.loads(r.data.decode('utf-8'))
    except (urllib3.exceptions.HTTPError, ValueError) as e:
        # Network error or a body that is not valid JSON
        print("request failed:", url, e)
    try:
        osmid = data["osm"][0]["id"]
    except (KeyError, IndexError, TypeError):
        # No OSM object is tagged with this Wikidata item
        osmid = ""
    new_item['osmid'] = osmid
    listGym.append(new_item)
print(len(listGym))

100%|██████████| 1464/1464 [02:19<00:00, 10.51it/s]

1464





In [6]:
OSMtot = pd.DataFrame(listGym,
                  columns=['wikidata','osmid'])
OSMtot.shape


(1464, 2)

In [7]:
pd.set_option('max_colwidth', 400)
OSMtot.head(10)

Unnamed: 0,wikidata,osmid
0,Q107206033,
1,Q107206129,
2,Q107315418,
3,Q107326096,
4,Q107327233,
5,Q107343382,762972071.0
6,Q107361377,
7,Q107364893,
8,Q107369355,
9,Q107392918,


In [8]:
#OSMempty = OSMtot.osmid.notnull()
OSMtot[(OSMtot['osmid']=="")].shape

(1217, 2)

In [9]:
OSMEmpty =OSMtot[(OSMtot['osmid']=="")]

In [10]:
OSMEmpty.shape

(1217, 2)

In [11]:
OSMEmpty.to_csv("WD - OSM Outdoor gym missing.csv")

OSMEmpty.head()

Unnamed: 0,wikidata,osmid
0,Q107206033,
1,Q107206129,
2,Q107315418,
3,Q107326096,
4,Q107327233,


In [12]:
OSMConnected=OSMtot[(OSMtot['osmid']!="")]
OSMConnected.to_csv("WD - OSM Outdoor gym.csv")
OSMConnected.head()

Unnamed: 0,wikidata,osmid
5,Q107343382,762972071
53,Q107393812,256690485
78,Q107438369,573927718
116,Q107449448,712108573
119,Q96105887,813642780
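
The same split into connected and missing items can also be sketched without pandas, directly on a list of records shaped like `listGym`. The function name `split_by_link` is hypothetical; the sample records reuse values shown in the outputs above.

```python
def split_by_link(items):
    """Split records into (connected, missing) by whether 'osmid' is non-empty."""
    connected = [it for it in items if it["osmid"] != ""]
    missing = [it for it in items if it["osmid"] == ""]
    return connected, missing


# Small sample shaped like listGym; values taken from the outputs above.
sample = [
    {"wikidata": "Q107206033", "osmid": ""},
    {"wikidata": "Q107343382", "osmid": 762972071},
    {"wikidata": "Q107361377", "osmid": ""},
]
connected, missing = split_by_link(sample)
print(len(connected), len(missing))
```

Every record lands in exactly one of the two lists, so the two lengths always add up to the input length, as 1217 missing plus the connected rows add up to 1464 in this run.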


In [13]:
print("*", start_time.strftime("%Y%m%d"), "Outdoor gym", WDGym.shape[0], "not linked to OSM", OSMEmpty.shape[0])


* 20220529 Outdoor gym 1464 not linked to OSM 1217


Generate a Markdown table row for the Status table above, e.g.
`| 20211119 | 1439 | 1121 |`


In [14]:
print("|",start_time.strftime("%Y%m%d"),"|", \
      WDGym.shape[0],"|",OSMEmpty.wikidata.nunique(),"|")


| 20220529 | 1464 | 1217 |
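
A small sketch of the same row formatting as a reusable function, assuming the three-column layout of the Status table (date, total gyms, gyms without an OSM link). The name `status_row` is made up for this example.

```python
from datetime import datetime


def status_row(run_time, total, missing):
    """Format one Markdown row for the status table: | date | total | missing |."""
    return f"| {run_time.strftime('%Y%m%d')} | {total} | {missing} |"


print(status_row(datetime(2022, 5, 29, 9, 24, 24), 1464, 1217))
# | 20220529 | 1464 | 1217 |
```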


In [15]:
end = datetime.now()
print("Ended: ", end) 
print('Time elapsed (hh:mm:ss.ms) {}'.format(datetime.now() - start_time))

Ended:  2022-05-29 09:26:48.461423
Time elapsed (hh:mm:ss.ms) 0:02:24.016275
