# KEN 3140: Lab 2 (RDF basics)

**In this lab we are going to:**

- Create RDF triples with rdflib
- Save these triples to files in various RDF serialisation syntaxes
- Verify the validity of a given list of IRIs

**Creating RDF triples**

RDF allows us to make statements about resources. A statement has the following structure:
# `<subject> <predicate> <object>`.

An RDF statement expresses a relationship between two resources. The subject and the object represent the two resources being related; the predicate represents the nature of their relationship. The relationship is directional (from subject to object) and is called a property in RDF. Because RDF statements consist of three elements, they are called triples.

In [18]:
!pip install rdflib



## Creating Nodes

The subjects and objects of the triples make up the nodes in the graph, where each node is a URI reference, a blank node or a literal. In RDFLib, these node types are represented by the classes **URIRef**, **BNode** and **Literal**. *URIRefs* and *BNodes* can both be thought of as resources, such as a person, a company, a website, etc.
- A *BNode* is a node where the exact URI is not known.
- A *URIRef* is a node where the exact URI is known. *URIRefs* are also used to represent the properties/predicates in the RDF graph.
- *Literals* represent attribute values, such as a name, a date, a number, etc. The most common literal values use XML Schema datatypes, e.g. string, int.


In [19]:
from rdflib import URIRef, BNode, Literal, Namespace
from rdflib.namespace import FOAF, DCTERMS, XSD, RDF, SDO

# A URIRef can be built directly from the full URI string
remzi = URIRef('http://maastrichtuniversity.nl/Remzi')

# URI = namespace + identifier
# Namespace shared by the entities in this lab: http://maastrichtuniversity.nl/
UM = Namespace('http://maastrichtuniversity.nl/')

# URI for entity Remzi: http://maastrichtuniversity.nl/Remzi
remzi = UM['Remzi']

# schema.org class used below as the type of Remzi
person = URIRef('https://schema.org/Person')





Task: Create entities for the Mona Lisa, Leonardo da Vinci, the "has occupation" relation and computer scientist.

In [22]:
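# Note: these are Wikidata page URLs (/wiki/...). Wikidata's canonical entity IRIs
# have the form http://www.wikidata.org/entity/Q12418.
mona_lisa = URIRef('https://www.wikidata.org/wiki/Q12418')
leonardo_davinci = URIRef('https://www.wikidata.org/wiki/Q762')
occupation = URIRef('https://www.wikidata.org/wiki/Q12737077')
computer_scientist = URIRef('https://www.wikidata.org/wiki/Q82594')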

In [23]:
name = Literal("Nicholas")  # the name 'Nicholas', as a string

age = Literal(39, datatype=XSD.integer)  # the number 39, as an integer

bn = BNode()


In [24]:
from rdflib import Graph

#initialise an empty RDF graph
g = Graph()


**Example:**

Create triples with rdflib for this sentence: Remzi is a computer scientist.

In [25]:
# Bind prefix to namespace
g.bind('um', UM)
g.add((remzi, RDF.type, person))
g.add((remzi, occupation, computer_scientist))
g.add((remzi, FOAF.firstName, Literal('Remzi')))
g.add((remzi, FOAF.lastName, Literal('Celebi')))

<Graph identifier=N572596b114ea4ee198c952d846612a62 (<class 'rdflib.graph.Graph'>)>

In [26]:
print(g.serialize(format='ttl'))

@prefix ns1: <http://xmlns.com/foaf/0.1/> .
@prefix ns2: <https://www.wikidata.org/wiki/> .
@prefix um: <http://maastrichtuniversity.nl/> .

um:Remzi a <https://schema.org/Person> ;
    ns1:firstName "Remzi" ;
    ns1:lastName "Celebi" ;
    ns2:Q12737077 ns2:Q82594 .




In [27]:
print("Entities in this graph:")
print("-----------------------")

# Print the entities in our graph
print("Remzi entity: " + str(remzi))
print("Computer Scientist entity: " + str(computer_scientist))

print("----------------------")

print("Triples in this graph:")
print("----------------------")

# Unpack each triple into subject, predicate and object
for s, p, o in g:
    print(s, p, o)

print("----------------------")
# Each item yielded by the graph is a (subject, predicate, object) tuple
for triple in g:
    print(triple)

Entities in this graph:
-----------------------
Remzi entity: http://maastrichtuniversity.nl/Remzi
Computer Scientist entity: https://www.wikidata.org/wiki/Q82594
----------------------
Triples in this graph:
----------------------
http://maastrichtuniversity.nl/Remzi http://xmlns.com/foaf/0.1/firstName Remzi
http://maastrichtuniversity.nl/Remzi http://xmlns.com/foaf/0.1/lastName Celebi
http://maastrichtuniversity.nl/Remzi https://www.wikidata.org/wiki/Q12737077 https://www.wikidata.org/wiki/Q82594
http://maastrichtuniversity.nl/Remzi http://www.w3.org/1999/02/22-rdf-syntax-ns#type https://schema.org/Person
----------------------
(rdflib.term.URIRef('http://maastrichtuniversity.nl/Remzi'), rdflib.term.URIRef('http://xmlns.com/foaf/0.1/firstName'), rdflib.term.Literal('Remzi'))
(rdflib.term.URIRef('http://maastrichtuniversity.nl/Remzi'), rdflib.term.URIRef('http://xmlns.com/foaf/0.1/lastName'), rdflib.term.Literal('Celebi'))
(rdflib.term.URIRef('http://maastrichtuniversity.nl/Remzi'), r

In [28]:
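# serialize() writes the file and returns the Graph itself, so print() only shows the graph's description
print(g.serialize('KEN3140_Lab2_example.rdf', format='xml'))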

[a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory']].


In [29]:
print(g.serialize('KEN3140_Lab2_example.ttl',format='turtle'))

[a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory']].


In [30]:
print(g.serialize('KEN3140_Lab2_example.nt',format='ntriples'))

[a rdfg:Graph;rdflib:storage [a rdflib:Store;rdfs:label 'Memory']].


To load a graph with rdflib:

In [12]:
from rdflib import Graph
g = Graph()
g.parse('http://dbpedia.org/resource/Semantic_Web')

for s, p, o in g:
    print(s, p, o)


http://dbpedia.org/resource/Semantic_Web http://www.w3.org/2002/07/owl#sameAs http://cs.dbpedia.org/resource/Sémantický_web
http://dbpedia.org/resource/Semantic_Web http://dbpedia.org/ontology/wikiPageWikiLink http://dbpedia.org/resource/Library_science
http://dbpedia.org/resource/Semantic_Web http://dbpedia.org/ontology/wikiPageExternalLink https://www.wikidata.org/entity/Q1731
http://dbpedia.org/resource/Semantic_Web http://dbpedia.org/ontology/wikiPageWikiLink http://dbpedia.org/resource/Semantic_Sensor_Web
http://dbpedia.org/resource/Semantic_Web http://dbpedia.org/ontology/abstract The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable. To enable the encoding of semantics with the data, technologies such as Resource Description Framework (RDF) and Web Ontology Language (OWL) are used. These technologies are used to formal

**IRI validation**

In [13]:
!pip install validators



In [14]:
import validators

In [15]:
validators.url("http://google.com")

True

In [16]:
if not validators.url("http://google"):
  print("not valid")

not valid


### **Lab Tasks**

**Task 1: IRI validation**

In this task you are going to verify which of the following strings are valid IRIs.
Check each one with the `validators` package.

If you find some of these to be invalid IRIs, consult the [RFC 3987](https://tools.ietf.org/html/rfc3987)
IRI specification to put forward reasons why they are invalid. For each valid IRI in the list, think about
and discuss with your classmates to what extent it complies with the Linked Data principles.

1. ``myIRI``
2. ``myIRI/``
3. ``myIRI#``
4. ``ftp:/myIRI``
5. ``ftp://myIRI/``
6. ``ftp://myIRI#``
7. ``http://myIRI#``
8. ``http:///myIRI/folder1/folder2/``
9. ``http:///myIRI/folder1/folder2/my name``
10. ``http:///myIRI/folder1/folder2/my_name``
11. ``my_own_protocol:///myIRI/folder1/folder2/my_name``
12. ``:///myIRI/folder1/folder2/my_name``
13. ``https://myIRI/$/my_name``
14. ``https://myIRI/#$#/my_name``
15. ``https://136.292.181.23/#12/my_name``
16. ``https://136.255.181.23/!210382/my_name``
17. ``https://schema.org/parent``
18. ``https://www.wikidata.org/wiki/Q937``
19. ``https://en.wikipedia.org/wiki/Albert_Einstein``
20. ``https://www.w3.org/Consortium/``
    

**Task 1 solution:**

In [17]:
# IRI validation
if not validators.url("myIRI"):
    print("not valid")


not valid
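To help put forward reasons for a rejection, the sketch below runs two simple syntactic checks using only the standard library: whether the string starts with a well-formed scheme (the `scheme` rule that RFC 3987 takes over from RFC 3986), and whether it contains whitespace. It is deliberately partial and not a full IRI validator; the function name `basic_iri_problems` is our own.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." ), followed by ":"
SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")

def basic_iri_problems(candidate: str) -> list:
    """Return a list of simple syntactic problems found (empty if none)."""
    problems = []
    if not SCHEME_RE.match(candidate):
        problems.append("missing or malformed scheme")
    if any(ch.isspace() for ch in candidate):
        problems.append("contains whitespace")
    return problems

for candidate in ["myIRI", "http:///myIRI/folder1/folder2/my name",
                  "my_own_protocol:///myIRI/folder1/folder2/my_name",
                  "https://schema.org/parent"]:
    print(candidate, "->", basic_iri_problems(candidate) or "no problem found by these checks")
```

An empty result only means that these two checks passed; the remaining rules (authority, path, allowed characters) still need to be checked against the specification.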


**Task 2: Formulating RDF triples**

Using a text editor of your choice (e.g. Notepad or Sublime Text) **or** rdflib, create RDF triples capturing as fully as possible the information in the following piece of text:

“Vincent van Gogh was a Dutch artist born in Zundert, a city in the country of the Netherlands, on 30 March 1853. One of the most famous artworks created by him is ‘The Starry Night’ oil on canvas painting.”

**Requirements:**
1. Write down the triples in Turtle syntax and save the document as a .ttl file.
2. Ensure that the triples are generated using valid RDF syntax and valid IRIs. 
3. Make sure to **reuse** existing vocabulary where possible.

For convenience, a conceptual diagram of the information in the above text is given below.

![image.png](task2-vangogh.png)

**Task 2 solution:**

**Task 3: Identifying components of an RDF graph**

Study the following diagram:

![image.png](task3.png)

Now, list all the:

1. object properties in the graph
2. data properties in the graph
3. instances in the graph
4. data types in the graph
5. prefix shorthands in the graph

Discuss your answers with your classmates. You may write the answers down in a new markdown cell below this one if you wish.

**Task 3 solution:**