### Creation and Application of Knowledge Graphs - Exercise 2 (Solution)
# Knowledge Graph Extraction from Text
## Task 2

Write the code in the provided cells and run them to produce the output.

To begin, run the code cell below to import the required libraries. To use spaCy's NLP functions for English, we also need to download the `en_core_web_sm` model.

In [1]:
# Required Libraries
from rdflib import Graph, RDF, RDFS, Namespace, URIRef, Literal
import spacy

# Download the language model for spaCy (uncomment on the first run)
# spacy.cli.download("en_core_web_sm")
nlp = spacy.load("en_core_web_sm")

# Namespaces
DBO = Namespace("https://dbpedia.org/ontology/")
DBR = Namespace("https://dbpedia.org/resource/")

#### Task 2.2
## Named Entity Recognition

Consider the following sentence from Task 1. Use the [spaCy](https://spacy.io/usage/linguistic-features#named-entities) library to identify the entities in the sentence, along with the label and type of each entity.

In [6]:
sentence = "Barack Obama visited Berlin, the capital of Germany"
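# Code here

# Run the spaCy pipeline on the sentence
document = nlp(sentence)

# Map spaCy entity labels to DBpedia ontology classes
types = {
    "PERSON": DBO.Person,
    "GPE": DBO.Place,
}

# Print each entity with its spaCy label and its mapped type
for entity in document.ents:
    print(entity.text, entity.label_, types[entity.label_])

Barack Obama PERSON https://dbpedia.org/ontology/Person
Berlin GPE https://dbpedia.org/ontology/Place
Germany GPE https://dbpedia.org/ontology/Place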


#### Task 2.3

Create a Knowledge Graph containing all entities detected in Task 2.2, with their labels and types. Use the RDFLib library to create the graph, define the types, and test the result.

In [7]:
# Code here

# Create the graph instance
g = Graph()

# Add one type triple and one label triple for every detected entity
for entity in document.ents:
    node = DBR[entity.text.replace(" ", "_")]
    g.add((node, RDF.type, types[entity.label_]))
    g.add((node, RDFS.label, Literal(entity.text)))

# Serialize the graph as Turtle
print(g.serialize(format="ttl"))

@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

<https://dbpedia.org/resource/Barack_Obama> a <https://dbpedia.org/ontology/Person> ;
    rdfs:label "Barack Obama" .

<https://dbpedia.org/resource/Berlin> a <https://dbpedia.org/ontology/Place> ;
    rdfs:label "Berlin" .

<https://dbpedia.org/resource/Germany> a <https://dbpedia.org/ontology/Place> ;
    rdfs:label "Germany" .




Use the graph created in the previous cell to run the following SPARQL query and print its output.

In [8]:
query = """
PREFIX dbo: <https://dbpedia.org/ontology/>
SELECT ?placeLabel WHERE {
    ?place rdf:type dbo:Place .
    ?place rdfs:label ?placeLabel . 
}
"""
# Code here
result = g.query(query)
# Each result row exposes the selected variables as attributes
for record in result:
    print(record.placeLabel)

Berlin
Germany
