# EPO Linked Data Query Practice

This notebook walks through a few example queries against the EPO Linked Data SPARQL endpoint. The examples are based on the [EPO Linked Data API documentation](https://data.epo.org/linked-data/sparql.html).

In [11]:
import pandas as pd
from SPARQLWrapper import SPARQLWrapper, JSON  # only the JSON return format is used below

In [15]:
# SPARQL endpoint of the EPO Linked Data service
sparql = SPARQLWrapper("https://data.epo.org/linked-data/query")
sparql.setReturnFormat(JSON)  # request the results as JSON

In [23]:
# list of applications
sparql.setQuery("""
    prefix patent: <http://data.epo.org/linked-data/def/patent/>
    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

    SELECT ?application ?appNum ?filingDate ?authority
    WHERE {
    ?application rdf:type patent:Application ;
        patent:applicationNumber ?appNum ;
        patent:filingDate        ?filingDate ; 
        patent:applicationAuthority ?authority ;
        .
    } LIMIT 10
""")

results = sparql.query().convert()

In [24]:
results = pd.json_normalize(results['results']['bindings'])
results.head()

Unnamed: 0,application.type,application.value,appNum.type,appNum.value,filingDate.type,filingDate.datatype,filingDate.value,authority.type,authority.value
0,uri,http://data.epo.org/linked-data/id/application...,literal,92911263,literal,http://www.w3.org/2001/XMLSchema#date,1992-05-14,uri,http://data.epo.org/linked-data/id/st3/AT
1,uri,http://data.epo.org/linked-data/id/application...,literal,69230802,literal,http://www.w3.org/2001/XMLSchema#date,1992-05-14,uri,http://data.epo.org/linked-data/id/st3/DE
2,uri,http://data.epo.org/linked-data/id/application...,literal,51125992,literal,http://www.w3.org/2001/XMLSchema#date,1992-05-14,uri,http://data.epo.org/linked-data/id/st3/JP
3,uri,http://data.epo.org/linked-data/id/application...,literal,1992FR00424,literal,http://www.w3.org/2001/XMLSchema#date,1992-05-14,uri,http://data.epo.org/linked-data/id/st3/WO
4,uri,http://data.epo.org/linked-data/id/application...,literal,1908492,literal,http://www.w3.org/2001/XMLSchema#date,1992-05-14,uri,http://data.epo.org/linked-data/id/st3/AU


In [5]:
# keep only the columns whose names end with '.value'
# (str.contains('.value') would treat '.' as a regex wildcard)
results = results.loc[:, results.columns.str.endswith('.value')]
results.head()

Unnamed: 0,application.value,appNum.value,filingDate.value,authority.value
0,http://data.epo.org/linked-data/id/application...,92911263,1992-05-14,http://data.epo.org/linked-data/id/st3/AT
1,http://data.epo.org/linked-data/id/application...,69230802,1992-05-14,http://data.epo.org/linked-data/id/st3/DE
2,http://data.epo.org/linked-data/id/application...,51125992,1992-05-14,http://data.epo.org/linked-data/id/st3/JP
3,http://data.epo.org/linked-data/id/application...,1992FR00424,1992-05-14,http://data.epo.org/linked-data/id/st3/WO
4,http://data.epo.org/linked-data/id/application...,1908492,1992-05-14,http://data.epo.org/linked-data/id/st3/AU
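
The two steps above (normalizing the bindings, then keeping the `.value` columns) can also be done in one pass over the raw JSON result. The helper below is a minimal sketch, not part of the original notebook: `flatten_bindings` is a hypothetical name, and the `sample` dictionary is hand-written in the standard SPARQL JSON results layout (`head.vars` plus `results.bindings`) rather than fetched from the endpoint.

```python
def flatten_bindings(results):
    """Turn a SPARQL JSON result into a list of {variable: value} dicts."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # A variable can be unbound in a row (e.g. with OPTIONAL), so fall back to None.
        rows.append({
            var: binding[var]["value"] if var in binding else None
            for var in variables
        })
    return rows


# Hand-written sample modeled on the first two rows shown above (illustrative only).
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
sample = {
    "head": {"vars": ["appNum", "filingDate", "authority"]},
    "results": {"bindings": [
        {
            "appNum": {"type": "literal", "value": "92911263"},
            "filingDate": {"type": "literal", "datatype": XSD_DATE, "value": "1992-05-14"},
            "authority": {"type": "uri", "value": "http://data.epo.org/linked-data/id/st3/AT"},
        },
        {
            # 'authority' deliberately left unbound to show the None fallback
            "appNum": {"type": "literal", "value": "69230802"},
            "filingDate": {"type": "literal", "datatype": XSD_DATE, "value": "1992-05-14"},
        },
    ]},
}

rows = flatten_bindings(sample)
for row in rows:
    print(row)
```

Passing the returned list to `pd.DataFrame(...)` should give a frame whose columns are the plain variable names, without the `.value` suffix. The helper has to be called on the raw result of `sparql.query().convert()`, before that variable is overwritten with a DataFrame.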


In [10]:
print(results['application.value'].values)

['http://data.epo.org/linked-data/id/application/AT/92911263T'
 'http://data.epo.org/linked-data/id/application/DE/69230802T'
 'http://data.epo.org/linked-data/id/application/JP/51125992'
 'http://data.epo.org/linked-data/id/application/WO/1992FR00424'
 'http://data.epo.org/linked-data/id/application/AU/1908492'
 'http://data.epo.org/linked-data/id/application/US/14615493'
 'http://data.epo.org/linked-data/id/application/FR/9105870'
 'http://data.epo.org/linked-data/id/application/OA/60435'
 'http://data.epo.org/linked-data/id/application/EP/92911263'
 'http://data.epo.org/linked-data/id/application/WO/1992FR000424']
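
The application URIs printed above appear to follow the pattern `.../id/application/<authority>/<number>`. That pattern is inferred from this output, not taken from the EPO documentation, so the sketch below treats it as an assumption and raises an error for anything that does not match. `split_application_uri` and `APPLICATION_PREFIX` are hypothetical names introduced for illustration.

```python
from urllib.parse import urlparse

# Assumed path prefix, inferred from the URIs printed above.
APPLICATION_PREFIX = "/linked-data/id/application/"


def split_application_uri(uri):
    """Split an application URI into (authority, application number)."""
    path = urlparse(uri).path
    if not path.startswith(APPLICATION_PREFIX):
        raise ValueError(f"not an application URI: {uri}")
    authority, _, number = path[len(APPLICATION_PREFIX):].partition("/")
    if not authority or not number:
        raise ValueError(f"unexpected application URI layout: {uri}")
    return authority, number


uris = [
    "http://data.epo.org/linked-data/id/application/AT/92911263T",
    "http://data.epo.org/linked-data/id/application/WO/1992FR00424",
    "http://data.epo.org/linked-data/id/application/EP/92911263",
]
parsed = [split_application_uri(uri) for uri in uris]
for authority, number in parsed:
    print(authority, number)
```

Note that the number in the URI is not always identical to `appNum.value`: the AT row above has `92911263` in the table but `92911263T` in the URI.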


In [21]:
# full-text search for a word in the abstract
sparql.setQuery("""
prefix patent: <http://data.epo.org/linked-data/def/patent/>
prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix dcterms: <http://purl.org/dc/terms/>
prefix text: <http://jena.apache.org/text#>

SELECT DISTINCT ?publication ?application ?title ?publication_date ?abstract
WHERE {
  ?publication text:query ( dcterms:abstract "battery" ) ;
               patent:titleOfInvention ?title ;
               patent:application ?application ;
               patent:publicationDate ?publication_date ;
               dcterms:abstract        ?abstract ; 
               .
} 
ORDER BY DESC(?publication_date)
LIMIT 100
""")

results = sparql.query().convert()

In [22]:
results = pd.json_normalize(results['results']['bindings'])
# keep only the '.value' columns (endswith avoids treating '.' as a regex wildcard)
results = results.loc[:, results.columns.str.endswith('.value')]
results.head()

Unnamed: 0,publication.value,application.value,title.value,publication_date.value,abstract.value
0,http://data.epo.org/linked-data/data/publicati...,http://data.epo.org/linked-data/id/application...,"BATTERIEVERWALTUNGSVERFAHREN, SPEICHERSTEUERUN...",2024-03-20,A battery management method according to one a...
1,http://data.epo.org/linked-data/data/publicati...,http://data.epo.org/linked-data/id/application...,"BATTERY MANAGEMENT METHOD, STORAGE CONTROLLER,...",2024-03-20,A battery management method according to one a...
2,http://data.epo.org/linked-data/data/publicati...,http://data.epo.org/linked-data/id/application...,"PROCÉDÉ DE GESTION DE BATTERIE, CONTRÔLEUR DE ...",2024-03-20,A battery management method according to one a...
3,http://data.epo.org/linked-data/data/publicati...,http://data.epo.org/linked-data/id/application...,BATTERIEMODUL MIT EINER MITTELWAND,2024-03-20,A battery module includes: first and second su...
4,http://data.epo.org/linked-data/data/publicati...,http://data.epo.org/linked-data/id/application...,BATTERY MODULE INCLUDING A CENTER WALL,2024-03-20,A battery module includes: first and second su...
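
To rerun the abstract search with a different word, the query text can be generated from a template. The sketch below reuses the query from the cell above and only parametrizes the search term and the row limit. `build_abstract_query` and `ABSTRACT_QUERY` are hypothetical names. `string.Template` is used instead of an f-string because the SPARQL braces would otherwise need escaping.

```python
from string import Template

# Same query as above, with $term and $limit as placeholders.
ABSTRACT_QUERY = Template("""
prefix patent: <http://data.epo.org/linked-data/def/patent/>
prefix dcterms: <http://purl.org/dc/terms/>
prefix text: <http://jena.apache.org/text#>

SELECT DISTINCT ?publication ?application ?title ?publication_date ?abstract
WHERE {
  ?publication text:query ( dcterms:abstract "$term" ) ;
               patent:titleOfInvention ?title ;
               patent:application ?application ;
               patent:publicationDate ?publication_date ;
               dcterms:abstract ?abstract .
}
ORDER BY DESC(?publication_date)
LIMIT $limit
""")


def build_abstract_query(term, limit=100):
    """Return the abstract-search query for one search term."""
    if isinstance(limit, bool) or not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    # Escape characters that would otherwise end or break the SPARQL string literal.
    escaped = (
        term.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
    )
    return ABSTRACT_QUERY.substitute(term=escaped, limit=limit)


query = build_abstract_query("battery", limit=10)
print(query)
```

The generated string can then be passed to `sparql.setQuery(...)` as in the cells above. The escaping only protects the SPARQL string literal. The term is presumably still interpreted by the text index's own query syntax, so characters that are special there are not neutralized by this sketch.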


In [29]:
# information about a specific application
sparql.setQuery("""
SELECT DISTINCT * 
WHERE {
   <http://data.epo.org/linked-data/id/application/EP/01903571> ?uri_key ?uri_content .
}
""")

results = sparql.query().convert()

In [30]:
results = pd.json_normalize(results['results']['bindings'])
# keep only the '.value' columns (endswith avoids treating '.' as a regex wildcard)
results = results.loc[:, results.columns.str.endswith('.value')]

results

Unnamed: 0,uri_key.value,uri_content.value
0,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://data.epo.org/linked-data/def/patent/App...
1,http://data.epo.org/linked-data/def/patent/app...,http://data.epo.org/linked-data/id/st3/EP
2,http://data.epo.org/linked-data/def/patent/app...,01903571
3,http://data.epo.org/linked-data/def/patent/fil...,2001-01-20
4,http://data.epo.org/linked-data/def/patent/gra...,2006-08-16
5,http://data.epo.org/linked-data/def/patent/app...,EP20010903571
6,http://data.epo.org/linked-data/def/patent/app...,01903571
7,http://data.epo.org/linked-data/def/patent/pub...,http://data.epo.org/linked-data/data/publicati...
8,http://data.epo.org/linked-data/def/patent/pub...,http://data.epo.org/linked-data/data/publicati...
9,http://data.epo.org/linked-data/def/patent/app...,http://data.epo.org/linked-data/def/patent/app...
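
The predicate URIs in this output are long and mostly share one namespace, which makes the table hard to read. One way to tidy it is to shorten each URI to a prefixed name and group the values by predicate. The sketch below is illustrative: `compact`, `group_by_predicate`, and `PREFIXES` are hypothetical names, and `sample_pairs` is hand-written, modeled on the output above with predicate names taken from the first query in this notebook.

```python
# Namespace prefixes used elsewhere in this notebook, plus rdfs.
PREFIXES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dcterms": "http://purl.org/dc/terms/",
    "patent": "http://data.epo.org/linked-data/def/patent/",
}


def compact(uri, prefixes=PREFIXES):
    """Shorten a URI to prefix:local form when a known namespace matches."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            return f"{prefix}:{uri[len(namespace):]}"
    return uri  # unknown namespace: leave the URI unchanged


def group_by_predicate(pairs):
    """Collect all values seen for each (compacted) predicate."""
    grouped = {}
    for predicate, value in pairs:
        grouped.setdefault(compact(predicate), []).append(value)
    return grouped


# Illustrative (predicate, value) pairs, not fetched from the endpoint.
sample_pairs = [
    ("http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://data.epo.org/linked-data/def/patent/Application"),
    ("http://data.epo.org/linked-data/def/patent/applicationAuthority",
     "http://data.epo.org/linked-data/id/st3/EP"),
    ("http://data.epo.org/linked-data/def/patent/filingDate", "2001-01-20"),
]

grouped = group_by_predicate(sample_pairs)
for predicate, values in grouped.items():
    print(predicate, values)
```

With the DataFrame from the cell above, the pairs could be produced with `zip(results['uri_key.value'], results['uri_content.value'])`. Grouping keeps predicates that occur more than once, such as the two publication rows, together under a single key.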


In [47]:
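# information about a specific publication
# note: ?title and ?abstract are only variable names here; they bind to the
# predicate and the object of every triple about this publication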
sparql.setQuery("""
SELECT DISTINCT *
WHERE {
   <http://data.epo.org/linked-data/data/publication/EP/4339007/A1/-> ?title ?abstract .
}
""")
results = sparql.query().convert()

In [48]:

results = pd.json_normalize(results['results']['bindings'])
# keep only the '.value' columns (endswith avoids treating '.' as a regex wildcard)
results = results.loc[:, results.columns.str.endswith('.value')]

In [50]:
results

Unnamed: 0,title.value,abstract.value
0,http://www.w3.org/1999/02/22-rdf-syntax-ns#type,http://data.epo.org/linked-data/def/patent/Pub...
1,http://www.w3.org/2000/01/rdf-schema#label,EP 4339007 A1
2,http://data.epo.org/linked-data/def/patent/pub...,http://data.epo.org/linked-data/id/st3/EP
3,http://data.epo.org/linked-data/def/patent/pub...,2024-03-20
4,http://data.epo.org/linked-data/def/patent/pub...,http://data.epo.org/linked-data/def/patent/pub...
5,http://data.epo.org/linked-data/def/patent/pub...,EUROPEAN PATENT APPLICATION
6,http://data.epo.org/linked-data/def/patent/pub...,4339007
7,http://purl.org/dc/terms/language,http://id.loc.gov/vocabulary/iso639-1/en
8,http://data.epo.org/linked-data/def/patent/age...,http://data.epo.org/linked-data/data/vc/AC646E...
9,http://data.epo.org/linked-data/def/patent/app...,http://data.epo.org/linked-data/data/vc/DD83CB...
