Gets all Swedish bathing waters in this project that have a P9616 identifier in Wikidata and checks whether each one exists on Eionet as a WaterBody.

https://dd.eionet.europa.eu/vocabularyconcept/wise/WaterBody

* The project: [GitHub salgo60/Svenskabadplatser](https://github.com/salgo60/Svenskabadplatser)
  * European bathing waters: [GitHub](https://github.com/salgo60/EuropeanBathingWater/blob/main/README.md) / [Wikidata](https://www.wikidata.org/wiki/Wikidata:WikiProject_European_Bath_Waters)
* This [notebook](https://github.com/salgo60/Svenskabadplatser/blob/main/Jupyter/Eionet%20Data%20Dictionary.ipynb)

**See also** [github salgo60/EuropeanBathingWater](https://github.com/salgo60/EuropeanBathingWater/blob/main/Jupyter/Eionet%20Data%20Dictionary.ipynb)


Status:  



| Date | Total | Ok | Error |
| ------------- |:-------------:|:-------------:|:-------------:|
| 20210626 | 2654 | 833 | 1821 |
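
To put the Status row in proportion, the counts can be turned into shares. This is a small illustrative calculation using only the figures from the table; the variable names are made up for the example.

```python
# Figures copied from the 20210626 row of the Status table.
total, ok, error = 2654, 833, 1821

# Share of identifiers that resolved on Eionet (Ok) and that did not (Error).
ok_share = round(100 * ok / total, 1)
error_share = round(100 * error / total, 1)

print(f"Ok: {ok_share}%  Error: {error_share}%")
```

On this run roughly one in three identifiers resolved.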


In [1]:
from datetime import datetime
start_time  = datetime.now()
print("Last run: ", start_time)

In [2]:
import pandas as pd


In [3]:
#
# pip install sparqlwrapper
# https://rdflib.github.io/sparqlwrapper/

import sys,json
from SPARQLWrapper import SPARQLWrapper, JSON

endpoint_url = "https://query.wikidata.org/sparql"
 
# get Swedish baths 
# https://w.wiki/3YdP
queryBath = """SELECT DISTINCT  (REPLACE(STR(?nodebath), ".*Q", "Q") AS ?wikidata) ?nodebath ?bathwateridentifier
(URI(CONCAT("https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.",
       str(?bathwateridentifier))) AS ?eionet)  {
      ?nodebath wdt:P9616 ?bathwateridentifier .
      ?nodebath wdt:P17 wd:Q34 .
}
"""


def get_sparql_dataframe(endpoint_url, query):
    """
    Helper function to convert SPARQL results into a Pandas data frame.
    """
    user_agent = "salgo60/%s.%s" % (sys.version_info[0], sys.version_info[1])
 
    sparql = SPARQLWrapper(endpoint_url, agent=user_agent)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    result = sparql.query()

    processed_results = json.load(result.response)
    cols = processed_results['head']['vars']

    out = []
    for row in processed_results['results']['bindings']:
        item = []
        for c in cols:
            item.append(row.get(c, {}).get('value'))
        out.append(item)

    return pd.DataFrame(out, columns=cols)

WDBath = get_sparql_dataframe(endpoint_url, queryBath)
WDBath.shape

(2654, 4)
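
The conversion step inside `get_sparql_dataframe` can be tried without calling the endpoint. The sketch below applies the same `row.get(c, {}).get('value')` logic to a hand-written payload in the SPARQL JSON results shape. The function name `bindings_to_rows` and the sample payload are assumptions made for illustration; the sample deliberately leaves out `eionet` to show that an unbound variable becomes `None`.

```python
def bindings_to_rows(payload):
    """Flatten a SPARQL JSON result into (column names, list of rows)."""
    cols = payload["head"]["vars"]
    rows = [
        # A variable missing from a binding yields None instead of a KeyError.
        [binding.get(col, {}).get("value") for col in cols]
        for binding in payload["results"]["bindings"]
    ]
    return cols, rows


# Hypothetical payload: values taken from the first row of the notebook output,
# with the "eionet" variable left unbound on purpose.
sample_payload = {
    "head": {"vars": ["wikidata", "bathwateridentifier", "eionet"]},
    "results": {
        "bindings": [
            {
                "wikidata": {"type": "literal", "value": "Q106707050"},
                "bathwateridentifier": {"type": "literal", "value": "SE0411082000000171"},
            }
        ]
    },
}

cols, rows = bindings_to_rows(sample_payload)
print(cols)
print(rows)
```

The returned pair can be passed straight to `pd.DataFrame(rows, columns=cols)`, which is what the helper in the notebook does.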

In [4]:
WDBath.head()

Unnamed: 0,wikidata,nodebath,bathwateridentifier,eionet
0,Q106707050,http://www.wikidata.org/entity/Q106707050,SE0411082000000171,https://dd.eionet.europa.eu/vocabularyconcept/...
1,Q106707054,http://www.wikidata.org/entity/Q106707054,SE0411082000000175,https://dd.eionet.europa.eu/vocabularyconcept/...
2,Q106707055,http://www.wikidata.org/entity/Q106707055,SE0411082000000176,https://dd.eionet.europa.eu/vocabularyconcept/...
3,Q106707056,http://www.wikidata.org/entity/Q106707056,SE0A21430000000177,https://dd.eionet.europa.eu/vocabularyconcept/...
4,Q106707057,http://www.wikidata.org/entity/Q106707057,SE0A11381000000181,https://dd.eionet.europa.eu/vocabularyconcept/...
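
The `wikidata` and `eionet` columns are both derived inside the SPARQL query, and the `eionet` values are truncated in the output above. As a sketch, the same two derivations can be reproduced in Python; the function names and the `EIONET_PREFIX` constant are invented for this example, while the prefix string and the regular expression are copied from the query.

```python
import re

# Prefix copied from the CONCAT(...) call in the SPARQL query.
EIONET_PREFIX = (
    "https://dd.eionet.europa.eu/vocabularyconcept/wise/"
    "WFDProtectedArea/euProtectedAreaCode."
)


def qid_from_entity_uri(uri):
    """Mirror REPLACE(STR(?nodebath), ".*Q", "Q"): keep only the Q-number."""
    return re.sub(r".*Q", "Q", uri)


def eionet_url(identifier):
    """Mirror the CONCAT in the query: prefix plus bathing water identifier."""
    return EIONET_PREFIX + identifier


qid = qid_from_entity_uri("http://www.wikidata.org/entity/Q106707057")
url = eionet_url("SE0A11381000000181")
print(qid)
print(url)
```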


In [5]:
import urllib3, json
from tqdm import tqdm
http = urllib3.PoolManager()
urlHav = "https://badplatsen.havochvatten.se/badplatsen/api/detail/" 

listBath = []
for WD, row in tqdm(WDBath.iterrows(), total=WDBath.shape[0]):
    url = row["eionet"] 
    
    new_item = dict()
    new_item['wikidata'] = row["wikidata"] 
    #print(url)
    try:
        r = http.request('GET', url) 
        new_item['status'] = r.status
        if  r.status == 404:
            #check API for reason
            try:
                urlHavBath = urlHav + row["bathwateridentifier"]
                rHav = http.request('GET',urlHavBath , 
                                    headers={'Content-Type': 'application/json'})
                rHavData = json.loads(rHav.data.decode('utf-8'))  
                #for key, value in rHavData.items() :
                    #print ("\t\t",key, value)
                new_item['euType'] = rHavData["euType"]
                new_item['euMotive'] = rHavData["euMotive"]
                new_item['NotEuMotive'] = rHavData["NotEuMotive"]
                
            except Exception as e:
                print ("Hav except ", e, urlHavBath, " WD:",row["wikidata"] )

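    except Exception as e:
        # The Eionet request itself failed, so there is no response to read a status from
        print ("Eionet except ", e, url, " WD:",row["wikidata"] )
        new_item['status'] = None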
    new_item['eionet'] = url 
    new_item['bathwateridentifier'] = row["bathwateridentifier"] 
#    new_item['country'] = row["country"] 
    
    listBath.append(new_item)
print (len(listBath))

100%|██████████| 2654/2654 [02:33<00:00, 17.34it/s]

2654
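
The loop above does two things per bathing water: it asks Eionet for the concept URL and, only on a 404, asks the Havs- och vattenmyndigheten API for a reason. The sketch below isolates that decision logic so it can be exercised without network access. It is not the notebook's code: `check_bath` and the injected callables `get_status` and `get_hav_detail` are hypothetical stand-ins for the `urllib3` calls, and the fake responses are invented.

```python
PREFIX = (
    "https://dd.eionet.europa.eu/vocabularyconcept/wise/"
    "WFDProtectedArea/euProtectedAreaCode."
)


def check_bath(identifier, url, get_status, get_hav_detail):
    """Return a result dict for one bathing water.

    get_status(url) returns an HTTP status code or raises on network failure.
    get_hav_detail(identifier) returns the parsed detail record as a dict.
    """
    item = {"bathwateridentifier": identifier, "eionet": url, "status": None}
    try:
        item["status"] = get_status(url)
    except Exception as exc:
        # No response at all: keep status None and record why.
        item["error"] = str(exc)
        return item
    if item["status"] == 404:
        try:
            detail = get_hav_detail(identifier)
        except Exception as exc:
            item["error"] = str(exc)
        else:
            for key in ("euType", "euMotive", "NotEuMotive"):
                item[key] = detail.get(key)
    return item


# Hypothetical fakes standing in for the two HTTP services.
def fake_status(url):
    return 404 if url.endswith("SE0411082000000171") else 200


def fake_detail(identifier):
    return {"euType": False, "euMotive": None, "NotEuMotive": None}


def broken_status(url):
    raise OSError("connection timed out")


missing = check_bath("SE0411082000000171", PREFIX + "SE0411082000000171", fake_status, fake_detail)
found = check_bath("SE0A11381000000181", PREFIX + "SE0A11381000000181", fake_status, fake_detail)
failed = check_bath("SE0A11381000000183", PREFIX + "SE0A11381000000183", broken_status, fake_detail)
print(missing["status"], found["status"], failed["status"])
```

Passing the fetchers in as arguments keeps the branching testable; in the notebook they would wrap `http.request('GET', ...)`.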





In [6]:
#listBath

In [7]:
Eionettot = pd.DataFrame(listBath,
                  columns=['wikidata','bathwateridentifier','status','eionet','euType','euMotive','NotEuMotive'])
Eionettot.shape


(2654, 7)

In [8]:
Eionettot.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2654 entries, 0 to 2653
Data columns (total 7 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   wikidata             2654 non-null   object 
 1   bathwateridentifier  2654 non-null   object 
 2   status               2654 non-null   int64  
 3   eionet               2654 non-null   object 
 4   euType               1821 non-null   object 
 5   euMotive             6 non-null      object 
 6   NotEuMotive          0 non-null      float64
dtypes: float64(1), int64(1), object(5)
memory usage: 145.3+ KB


In [9]:
pd.set_option('max_colwidth', 400)
Eionettot.head(10)

Unnamed: 0,wikidata,bathwateridentifier,status,eionet,euType,euMotive,NotEuMotive
0,Q106707050,SE0411082000000171,404,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0411082000000171,False,,
1,Q106707054,SE0411082000000175,404,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0411082000000175,False,,
2,Q106707055,SE0411082000000176,404,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0411082000000176,False,,
3,Q106707056,SE0A21430000000177,404,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A21430000000177,False,,
4,Q106707057,SE0A11381000000181,200,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000181,,,
5,Q106707059,SE0A11381000000183,200,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000183,,,
6,Q106707058,SE0A11381000000182,200,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000182,,,
7,Q106707063,SE0A11381000000195,404,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000195,False,,
8,Q106707060,SE0A11381000000186,200,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000186,,,
9,Q106707062,SE0A11381000000189,200,https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode.SE0A11381000000189,,,


In [10]:
#Eionettot["link"] = "<a href='https://dd.eionet.europa.eu/vocabularyconcept/wise/WFDProtectedArea/euProtectedAreaCode." + Eionettot["eionet"].astype(str) + "'">link eionet</a>"
Eionettot["link"] = "<a href='" + Eionettot["eionet"].astype(str) + "'>link eionet</a>"
Eionettot["WD"] = "<a href='https://www.wikidata.org/wiki/" + Eionettot["wikidata"].astype(str) + "'>link WD</a>"


In [11]:
from IPython.display import display, HTML  

#Eionettot.value_counts({"status","country"})
#Eionettot[['status', 'country']].apply(pd.Series.value_counts)
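# Select columns with a list: a set has no defined order and is rejected as an indexer by recent pandas versions
HTML(Eionettot[['euMotive','status','link','WD','euType','bathwateridentifier','NotEuMotive']].tail(50).to_html(escape=False))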

Unnamed: 0,euMotive,status,link,WD,euType,bathwateridentifier,NotEuMotive
2604,,404,link eionet,link WD,False,SE0A21496000004517,
2605,,404,link eionet,link WD,False,SE0812480000004523,
2606,,404,link eionet,link WD,False,SE0622026000004521,
2607,,404,link eionet,link WD,False,SE0622039000004525,
2608,,404,link eionet,link WD,False,SE0812480000004522,
2609,,404,link eionet,link WD,False,SE0110163000004528,
2610,,404,link eionet,link WD,False,SE0622039000004526,
2611,,404,link eionet,link WD,False,SE0622039000004527,
2612,,404,link eionet,link WD,False,SE0A11384000004534,
2613,,404,link eionet,link WD,False,SE0822505000004531,


In [12]:
EionettotOk = Eionettot[(Eionettot['status']==200)] 
EionettotError = Eionettot[(Eionettot['status']==404)]

In [13]:
EionettotOk.shape

(833, 9)

In [14]:
EionettotError.shape

(1821, 9)
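
The two filters keep only status 200 and status 404, so a row with any other status would land in neither frame. On this run 833 + 1821 = 2654, which means nothing was dropped. A minimal sketch of a split that makes such leftovers visible is shown below; `partition_by_status` and the extra `None` row are assumptions for illustration, while the ten statuses are those of the first ten rows in the `Eionettot.head(10)` output.

```python
from collections import defaultdict


def partition_by_status(items):
    """Group result dicts into 'ok' (200), 'error' (404) and 'other'."""
    groups = defaultdict(list)
    for item in items:
        status = item.get("status")
        if status == 200:
            groups["ok"].append(item)
        elif status == 404:
            groups["error"].append(item)
        else:
            groups["other"].append(item)
    return groups


# Statuses of the first ten rows in the notebook output.
sample = [{"status": s} for s in (404, 404, 404, 404, 200, 200, 200, 404, 200, 200)]
# Hypothetical extra row: a request that never received a response.
sample.append({"status": None})

groups = partition_by_status(sample)
print({name: len(rows) for name, rows in groups.items()})
```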

In [15]:
#EionettotError.value_counts("country")

In [16]:
EionettotError.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 1821 entries, 0 to 2653
Data columns (total 9 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   wikidata             1821 non-null   object 
 1   bathwateridentifier  1821 non-null   object 
 2   status               1821 non-null   int64  
 3   eionet               1821 non-null   object 
 4   euType               1821 non-null   object 
 5   euMotive             6 non-null      object 
 6   NotEuMotive          0 non-null      float64
 7   link                 1821 non-null   object 
 8   WD                   1821 non-null   object 
dtypes: float64(1), int64(1), object(7)
memory usage: 142.3+ KB


In [17]:

HTML(EionettotError[['euMotive','status','link','WD','euType','bathwateridentifier','NotEuMotive']].tail(10).to_html(escape=False))

Unnamed: 0,euMotive,status,link,WD,euType,bathwateridentifier,NotEuMotive
2644,,404,link eionet,link WD,False,SE0722380000005054,
2645,,404,link eionet,link WD,False,SE0110125000005035,
2646,,404,link eionet,link WD,False,SE0722380000005055,
2647,,404,link eionet,link WD,False,SE0722380000005056,
2648,,404,link eionet,link WD,False,SE0110188000005094,
2649,,404,link eionet,link WD,False,SE0722380000005057,
2650,,404,link eionet,link WD,False,SE0812418000005114,
2651,,404,link eionet,link WD,False,SE0611785000005134,
2652,,404,link eionet,link WD,False,SE0110126000005174,
2653,,404,link eionet,link WD,False,SE0930861000005154,


In [18]:
EionettotErrorEuType = EionettotError[EionettotError["euType"] == True] 
HTML(EionettotErrorEuType[['euMotive','status','link','WD','euType','bathwateridentifier','NotEuMotive']].tail(10).to_html(escape=False))

Unnamed: 0,euMotive,status,link,WD,euType,bathwateridentifier,NotEuMotive
701,Trädäcken vid badplatsen Sundspromenaden är en naturlig och mycket populär mötesplats för alla sol- och badsugna under sommarhalvåret i Malmö men används även för vinterbadare. Besöksstatistiken för Sundspromenadens badplats har registrerats under badsäsongen som ca 3000-10000personer per vecka beroende på väder.,404,link eionet,link WD,True,SE0441280000004499,
781,"En insjö med klart vatten som är en vattentäkt. Badplatsen är välbesökt året runt. Föreningen har gjort flera grillplatser, byggt en egen vacker grillkåta och byggt till lekplatsen på badplatsen. Badplatsen har ett bryggomslutet barnbad och ytterligare en brygga med hopptorn samt två flottar. Toaletter och omklädningsbås finns. Antalet besökare uppskattas till drygt 200 personer fina sommardagar.",404,link eionet,link WD,True,SE0411080000000220,
1873,Uppskattat antal badande till mer än 200 personer/dag under fina sommardagar.,404,link eionet,link WD,True,SE0A21435000006039,
2246,Uppskattat antal badande till mer än 200 personer/dag under fina sommardagar.,404,link eionet,link WD,True,SE0A21435000004327,
2295,Uppskattat antal badande till mer än 200 personer/dag under fina sommardagar.,404,link eionet,link WD,True,SE0A21435000004425,


In [19]:
EionettotError["euMotive"].value_counts()

Uppskattat antal badande till mer än 200 personer/dag under fina sommardagar.    3
Trädäcken vid badplatsen Sundspromenaden är en naturlig och mycket populär mötesplats för alla sol- och badsugna under sommarhalvåret i Malmö men används även för vinterbadare. Besöksstatistiken för Sundspromenadens badplats har registrerats under badsäsongen som ca 3000-10000personer per vecka beroende på väder.    1
Felaktig inmatning

In [20]:
EionettotOk.shape

(833, 9)

In [21]:
#EionettotOk.value_counts("country")

In [22]:
EionettotOk.to_csv("BathIdentifier_Ok.csv")
EionettotError.to_csv("BathIdentifier_Error.csv")
Eionettot.to_csv("BathIdentifier_All.csv")


Generate a Markdown row for the Status table, e.g.
| 20210610 | 3176 | 2240 | 936 |


In [23]:
print("|", start_time.strftime("%Y%m%d"), "|",
      Eionettot.shape[0], "|",
      EionettotOk.shape[0], "|",
      EionettotError.shape[0], "|")


| 20210626 | 2654 | 833 | 1821 |
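
If the row is needed in more than one place, the formatting can be wrapped in a small function. This is only a sketch; `status_row` is a hypothetical helper, and the sample call uses the date and counts from this run.

```python
from datetime import datetime


def status_row(run_time, total, ok, error):
    """Format one row of the Status table: | Date | Total | Ok | Error |."""
    return f"| {run_time:%Y%m%d} | {total} | {ok} | {error} |"


row = status_row(datetime(2021, 6, 26), 2654, 833, 1821)
print(row)
```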


In [24]:
end = datetime.now()
print("Ended: ", end) 
# Use the captured end time so both lines describe the same moment
print('Time elapsed (hh:mm:ss.ms) {}'.format(end - start_time))

Ended:  2021-06-26 15:04:48.700115
Time elapsed (hh:mm:ss.ms) 0:02:35.304834
