# Networks and ontologies

This workbook explores the concept of **semantic networks** and **ontologies** in Python and how to visualise them.

Throughout the notebook you will use two libraries, namely NetworkX and Owlready2. You will find some guided examples to aid your understanding, and some exercises for you to implement on your own.

#### Content:
* [NetworkX](#nx)
    * [Getting started](#nx-start)
    * [Exercise 1 - symptoms/conditions network](#nx-ex)
* [Owlready2](#owl)
    * [Getting started](#owl-start)
    * [Exercise 2 - city density](#owl-ex)

## NetworkX <a class="anchor" id="nx"></a>

Semantic networks are **graphical** representations of knowledge, used to organise and visualise relationships between concepts. They are often visualised as **graphs**, where nodes/concepts are connected by edges/relationships. 

**[NetworkX](https://networkx.org/documentation/stable)** is a Python library for creating, visualising and analysing graphs.

Let's start!

### Getting started <a class="anchor" id="nx-start"></a>

In [None]:
# uncomment the lines below to install networkx
# ! pip install --upgrade pip
# ! pip install networkx

import networkx as nx
import matplotlib.pyplot as plt # this will be needed to visualise the graph object

We start by creating an empty graph:

In [None]:
G = nx.Graph()

We then add **nodes**:

In [None]:
animals = ["Lion", "Tiger", "Elephant", "Giraffe", "Zebra"]
diet = ["Carnivore", "Herbivore"]
food = ["Meat", "Grass"]
habitat = ["Savanna", "Grassland"]
    
G.add_nodes_from(animals)
G.add_nodes_from(diet)
G.add_nodes_from(food)
G.add_nodes_from(habitat)

We now want to add **edges**, i.e. relationships among our nodes. For example, we know that 'Lion is a carnivore', 'Herbivore eats grass', and 'Zebra lives in grassland'.

In a semantic network, relationships have 'names', so we will label edges using the `{"label": <name>}` construct.

In [None]:
relationships = [("Lion", "Carnivore", {"label": "is_a"}),
                  ("Tiger", "Carnivore", {"label": "is_a"}),
                  ("Elephant", "Herbivore", {"label": "is_a"}),
                  ("Giraffe", "Herbivore", {"label": "is_a"}),
                  ("Zebra", "Herbivore", {"label": "is_a"}),
                  ("Carnivore", "Meat", {"label": "eats"}),
                  ("Herbivore", "Grass", {"label": "eats"}),
                  ("Lion", "Savanna", {"label": "lives_in"}),
                  ("Tiger", "Savanna", {"label": "lives_in"}),
                  ("Elephant", "Grassland", {"label": "lives_in"}),
                  ("Giraffe", "Savanna", {"label": "lives_in"}),
                  ("Zebra", "Grassland", {"label": "lives_in"})]


G.add_edges_from(relationships)

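As a quick check that the labels really are stored on the edges, the sketch below reads them back. It uses a small stand-in graph (the name `demo` and its two edges are only for illustration) built with the same `(source, target, {"label": ...})` format; note that `add_edges_from` also creates any node that does not exist yet.

```python
import networkx as nx

# A small stand-in graph using the same edge format as above
demo = nx.Graph()
demo.add_edges_from([
    ("Lion", "Carnivore", {"label": "is_a"}),
    ("Carnivore", "Meat", {"label": "eats"}),
])

# The attributes of a single edge behave like a dictionary
lion_edge = demo.edges["Lion", "Carnivore"]
print(lion_edge["label"])

# Iterate over every edge together with its 'label' attribute
labelled_edges = list(demo.edges.data("label"))
for source, target, label in labelled_edges:
    print(source, label, target)
```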
Let's now plot the complete graph. 

Note that NetworkX offers different [**layout**](https://networkx.org/documentation/stable/reference/drawing.html#module-networkx.drawing.layout) functions, which determine the position of nodes and edges within the plot based on different algorithms and criteria.

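To see what a layout function actually produces, here is a minimal sketch on a three-node stand-in graph (the name `demo` is illustrative). Each layout returns a dictionary mapping every node to its (x, y) coordinates, and it is this dictionary that the drawing functions receive as `pos`. The `seed` argument is used here only to make the spring layout repeatable.

```python
import networkx as nx

# A tiny stand-in graph: Lion - Carnivore - Meat
demo = nx.path_graph(["Lion", "Carnivore", "Meat"])

# Each layout function returns a dict: node -> (x, y) coordinates
circular_pos = nx.circular_layout(demo)
spring_pos = nx.spring_layout(demo, seed=42)  # seed fixes the random starting positions

for node in demo.nodes:
    print(node, circular_pos[node], spring_pos[node])
```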
In [None]:
# use matplotlib features to set plot size for display
plt.figure(figsize=(12, 12))

# We will use the kamada-kawai layout
pos = nx.kamada_kawai_layout(G)

# draw graph
nx.draw_networkx(G, 
                 pos, # layout
                 with_labels=True, #add nodes' names
                 node_size=2000, node_color = 'lightblue',
                 font_size=10, 
                 arrows = True, arrowstyle='->', arrowsize=15)

# add labels to edges
edge_labels = nx.get_edge_attributes(G, "label")
nx.draw_networkx_edge_labels(G, pos, edge_labels=edge_labels)

# display plot
plt.show()

We now have a knowledge base in the form of a graph, so let's see how we can access such knowledge.

In [None]:
# list all the nodes adjacent (=connected) to a specific node
G['Lion']

In [None]:
# check if there is a path between two nodes, i.e. a series of nodes and edges connecting the two nodes
print('Path between Lion and Meat:', 
      nx.has_path(G, source='Lion', target='Meat') )   

In [None]:
print('Path between Lion and Grass:', 
      nx.has_path(G, source='Lion', target='Grass') ) 

The previous answer is both correct and incorrect!

Technically, there is indeed a path that links the node 'Lion' with the node 'Grass': Lion - Savanna - Giraffe - Herbivore - Grass. However, we know that this does not make sense, given the nature and **direction** of the relationship. 

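The path described above can be recovered with `nx.shortest_path`. The sketch below rebuilds the same connections (without labels) in an undirected graph; the names `edges` and `undirected` are only for illustration.

```python
import networkx as nx

# Same connections as in the notebook, without labels, in an undirected graph
edges = [("Lion", "Carnivore"), ("Tiger", "Carnivore"), ("Elephant", "Herbivore"),
         ("Giraffe", "Herbivore"), ("Zebra", "Herbivore"), ("Carnivore", "Meat"),
         ("Herbivore", "Grass"), ("Lion", "Savanna"), ("Tiger", "Savanna"),
         ("Elephant", "Grassland"), ("Giraffe", "Savanna"), ("Zebra", "Grassland")]
undirected = nx.Graph(edges)

# Edge direction is ignored, so Lion can 'reach' Grass through Savanna and Giraffe
path = nx.shortest_path(undirected, source="Lion", target="Grass")
print(" - ".join(path))
```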
The error arises because we have not specified that our graph is **directed**! So let's build a **directed graph** instead, and infer some (correct) knowledge.

In [None]:
# Create a directed graph
D = nx.DiGraph()

D.add_nodes_from(animals)
D.add_nodes_from(diet)
D.add_nodes_from(food)
D.add_nodes_from(habitat)

D.add_edges_from(relationships)

# plot the graph - we will not need to specify arrows=True
plt.figure(figsize=(14, 14))
pos = nx.planar_layout(D)

nx.draw_networkx(D, 
                 pos, # layout
                 with_labels=True, #add nodes' names
                 node_size=2000, node_color = 'lightblue',
                 font_size=10,arrowstyle='->', arrowsize=15)

edge_labels = nx.get_edge_attributes(D, "label")
nx.draw_networkx_edge_labels(D, pos, edge_labels=edge_labels)

plt.show()

In [None]:
# list all the nodes adjacent (=connected) to a specific node
D['Lion']

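In a directed graph, indexing a node lists only the nodes its outgoing edges point to. To look along incoming edges as well, NetworkX provides `successors` and `predecessors`. A minimal sketch on a three-edge stand-in graph (the name `demo` is illustrative):

```python
import networkx as nx

demo = nx.DiGraph()
demo.add_edges_from([
    ("Lion", "Carnivore", {"label": "is_a"}),
    ("Tiger", "Carnivore", {"label": "is_a"}),
    ("Carnivore", "Meat", {"label": "eats"}),
])

# Nodes reached by edges leaving 'Carnivore'
outgoing = sorted(demo.successors("Carnivore"))
# Nodes whose edges arrive at 'Carnivore'
incoming = sorted(demo.predecessors("Carnivore"))

print("Carnivore points to:", outgoing)
print("Carnivore is pointed to by:", incoming)
```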
In [None]:
# check if there is a path between two nodes, i.e. a series of nodes and edges connecting the two nodes
print('Path between Lion and Meat:', 
      nx.has_path(D, source='Lion', target='Meat') )   
print('Path between Lion and Grass:', 
      nx.has_path(D, source='Lion', target='Grass') ) 

**Question**: what does the Lion eat?

In [None]:
# option 1: for each possible food, check whether there is a path between it and Lion
# this option works if you know the objects/concepts in your graph

for f in food:
    if nx.has_path(D, source='Lion', target=f):
        print('Lion eats {}'.format(f))
    else:
        print('Lion does not eat {}'.format(f))


In [None]:
# option 2: iterate through all the edges labelled 'eats', take the node each one points to,
# and check whether there is a path from 'Lion' to that node
# this option works if you know the relationships in your graph
for node1, node2, label in D.edges.data('label'):
    if label == 'eats':
        # edge goes from node1 to node2, i.e. 'node1' eats 'node2'
        eaten = node2  # separate name, so the `food` list defined earlier is not overwritten
        # check path
        if nx.has_path(D, 'Lion', eaten):
            print('Lion eats {}'.format(eaten))

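A possible third option, sketched here as an alternative rather than as part of the original walkthrough, is to collect every node reachable from 'Lion' with `nx.descendants` and intersect that set with the known foods. The stand-in graph `demo` and the set `food_nodes` are illustrative names.

```python
import networkx as nx

demo = nx.DiGraph()
demo.add_edges_from([
    ("Lion", "Carnivore"), ("Zebra", "Herbivore"),
    ("Carnivore", "Meat"), ("Herbivore", "Grass"),
    ("Lion", "Savanna"),
])
food_nodes = {"Meat", "Grass"}

# All nodes that can be reached from 'Lion' by following edge directions
reachable = nx.descendants(demo, "Lion")

# Keep only the reachable nodes that are foods
lion_food = reachable & food_nodes
print("Lion eats:", lion_food)
```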
### Exercise 1 <a class="anchor" id="nx-ex"></a>

You have been tasked with creating a semantic network that connects symptoms, related conditions, and possible treatments.

The list of conditions, symptoms and treatments is:

- Conditions: diabetes, hypertension, asthma
- Symptoms: polydipsia, polyuria, fatigue, headaches, shortness of breath, blurred vision, chest pain, dizziness
- Treatments: insulin therapy, blood pressure medication, inhaled corticosteroids, pain relievers

You also know that: 

Condition: Symptoms

- Diabetes: polydipsia, polyuria, fatigue, blurred vision
- Hypertension: fatigue, headaches, shortness of breath, chest pain, dizziness
- Asthma: fatigue, shortness of breath

Treatment: Symptoms
- Insulin therapy:  polydipsia, polyuria
- Blood pressure medication:  fatigue, headaches, chest pain, dizziness
- Inhaled corticosteroids:  shortness of breath, chest pain
- Pain relievers:  headaches

**Task1**

Create a semantic network (directed graph) using the knowledge listed above.

In [None]:
# write your code here

**Task2**

Answer the following questions:

- **Q1**: What are the best treatments for a patient with diabetes?
- **Q2**: A patient has dizziness and fatigue. What is a possible condition?

In [None]:
# write your code here

## Owlready2 <a class="anchor" id="owl"></a>

In knowledge-based systems, we use **ontologies** to formalise the representation of knowledge in a specific domain, which may be a wide one.

**[Owlready2](https://owlready2.readthedocs.io/en/latest/)** is a python library for working with ontologies, particularly those expressed in the [Web Ontology Language (OWL)](https://en.wikipedia.org/wiki/Web_Ontology_Language).


Let's start!

### Getting started <a class="anchor" id="owl-start"></a>

In [None]:
# uncomment the lines below to install owlready2
# ! pip install --upgrade pip
# ! pip install owlready2

import owlready2 as owl

We first create an empty ontology with the `get_ontology()` function. It takes a single parameter, the IRI of the ontology (a URL-like identifier):

In [None]:
# create an empty ontology of countries and associate it with an IRI
onto = owl.get_ontology("http://test.org/onto_pays.owl")

Our ontology is empty and needs to be populated with concepts, properties, rules and individuals.

We start by adding the [**classes and properties**](https://owlready2.readthedocs.io/en/latest/class.html) to our ontology.

All classes in Owlready2 are subclasses of the `Thing` class, which is built into every ontology. `Thing` is the usual starting point when defining new classes or creating instances: it is the root from which you organise concepts into a hierarchy of more specific classes and define relationships between them.

What follows is an adaptation of the [pays ontology](https://github.com/KaziPratique/Ontologies-Owlready2/blob/main/code.py).

In [None]:
# add classes and properties to the ontology

with onto:
    # main classes (direct subclasses of 'Thing'): Country, Region, City
    class Country(owl.Thing): pass
    class Region(owl.Thing): pass
    class City(owl.Thing): pass
    
    # disjoint classes: an individual cannot belong to more than one of these classes at the same time
    owl.AllDisjoint([Country, Region, City])
    
    # create a transitive object property (Thing -> Thing) that the properties below will specialise
    class part_of(owl.Thing >> owl.Thing, owl.TransitiveProperty): pass
    
    # create properties to connect classes in a hierarchical way
    class in_country(part_of): pass    
    class in_region(part_of): pass
    
    # create a data property of City with integer values (this is a property, not a subclass)
    class population(City >> int): pass   
    
    # these classes will be inferred, i.e. each individual city will be assigned to one of these
    # two classes through specific rules based on the population property
    class BigCity(City): pass
    class SmallCity(City): pass


Let's look at our ontology so far.

In [None]:
# list classes
list(onto.classes())

In [None]:
# list properties
list(onto.properties())

To infer values for the classes `BigCity` and `SmallCity` based on the population property we need to define some **rules**. These rules will allow you to automatically assign the appropriate class to each individual city based on its population.

In Owlready2, rules are created with `Imp()` and then defined with `.set_as_rule()`, using a [Protégé](https://en.wikipedia.org/wiki/Prot%C3%A9g%C3%A9_(software))-like syntax.

In [None]:
with onto:
    
    #set rules
    owl.Imp().set_as_rule("City(?c), population(?c, ?pop), greaterThan(?pop, 200000) -> BigCity(?c)")
    owl.Imp().set_as_rule("City(?c), population(?c, ?pop), lessThan(?pop, 50000) -> SmallCity(?c)")

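To make the two thresholds concrete, here is a plain-Python sketch of the same conditions. It does not use Owlready2 or a reasoner; the function name `classify_city` and the sample populations are illustrative assumptions. Note that a city whose population lies between the two thresholds matches neither rule.

```python
# Thresholds copied from the two rules above
BIG_CITY_MIN = 200000
SMALL_CITY_MAX = 50000

def classify_city(population):
    """Return the class name a rule would assign, or None if no rule applies."""
    if population > BIG_CITY_MIN:
        return "BigCity"
    if population < SMALL_CITY_MAX:
        return "SmallCity"
    return None

# Illustrative population values
for name, population in [("Bristol", 463400), ("Wells", 11000), ("Exeter", 131000)]:
    print(name, classify_city(population))
```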
Now it's time to add **individuals**; in other words, we populate the ontology with actual instances.

In [None]:
with onto:
    
    # add country
    uk = Country("UK")
    # add regions
    south_west = Region("South West", in_country = [uk])
    north_east = Region("North East", in_country = [uk])
    
    bristol = City("Bristol", in_region = [south_west])
    bristol.population.append(463400)
    
    wells = City("Wells", in_region = [south_west])
    wells.population.append(11000)
    
    exeter = City("Exeter", in_region = [south_west])
    exeter.population.append(131000)
    
    newcastle = City("Newcastle", in_region = [north_east])
    newcastle.population.append(300000)
    
    durham = City("Durham", in_region = [north_east])
    durham.population.append(48000)
    
    # this is like AllDisjoint but for instances
    owl.AllDifferent([bristol, wells, exeter, newcastle, durham])

In [None]:
# list instances
list(onto.individuals())

Let's save the ontology. This will allow us to upload it and visualise it in [WebVOWL](https://service.tib.eu/webvowl/).

In [None]:
onto.save(file="./cities.owl")

**[The following requires a Java installation! Feel free to skip it]**

An **OWL reasoner** can be used to verify the consistency of an ontology and to infer new information within it. The reasoners bundled with Owlready2 are written in Java, so you need a Java Virtual Machine to perform inference.

Under Linux or macOS, Owlready2 should find Java automatically. Under Windows, you will need to specify the path of the Java interpreter.

_Please install Java if you don't have it already, and find the path of java.exe._

In [None]:
# on Windows, specify the path to java.exe
# replace the path below with your own

owl.JAVA_EXE = "C:\\Program Files\\Java\\jdk-21.0.1\\bin\\java.exe"

Owlready2 bundles the HermiT and Pellet [reasoners](https://owlready2.readthedocs.io/en/latest/reasoning.html). Since our rules depend on the values of a data property (`population`), we will use the Pellet reasoner, which can also infer data property values.

In [None]:
# run the reasoner
with onto:
    owl.sync_reasoner_pellet(infer_property_values = True, infer_data_property_values = True)

In [None]:
# Query: is Bristol a big city?
print('Bristol population:', onto.Bristol.population)
print('Bristol a big city:', onto.Bristol in onto.BigCity.instances())

### Exercise 2 <a class="anchor" id="owl-ex"></a>

We want to know which of the cities in our ontology has the highest population density (population divided by city area).

**Task1**

Add an `area` property to each city, as done for population.

In [None]:
# write your code here

**Task2**

Add a rule to compute the density for each city. Tip: use a [functional property](https://owlready2.readthedocs.io/en/latest/rule.html).


In [None]:
# write your code here

**Task3**

Question: which city has the highest density?

In [None]:
# write your code here