## Ontology Instance Population

In this notebook, we imported the merged disaster dataset into the ontology as instances of the corresponding classes. The ontology schema was defined and built in Protégé, an open-source ontology editor that can export the ontology as an RDF file, which Python's rdflib parses directly. We loaded the ontology with rdflib, then used pandas to load the merged disaster dataset into a DataFrame. We then iterated over the DataFrame rows and added each one to the ontology as instances of the corresponding classes, together with their object and data properties.

Note: this folder contains both the ontology RDF with no instances defined and the RDF produced after importing the instances.

- DisasterOntology.rdf: Ontology schema with no instances; only the classes, relationships (object properties), and data properties are defined.
- disaster_ontology.rdf: Completed ontology, with the instances imported.

In [3]:
import pandas as pd

In [4]:
%pip install rdflib

Collecting rdflib
  Obtaining dependency information for rdflib from https://files.pythonhosted.org/packages/f4/31/e9b6f04288dcd3fa60cb3179260d6dad81b92aef3063d679ac7d80a827ea/rdflib-7.1.4-py3-none-any.whl.metadata
  Downloading rdflib-7.1.4-py3-none-any.whl.metadata (11 kB)
Collecting isodate<1.0.0,>=0.7.2 (from rdflib)
  Obtaining dependency information for isodate<1.0.0,>=0.7.2 from https://files.pythonhosted.org/packages/15/aa/0aca39a37d3c7eb941ba736ede56d689e7be91cab5d9ca846bde3999eba6/isodate-0.7.2-py3-none-any.whl.metadata
  Downloading isodate-0.7.2-py3-none-any.whl.metadata (11 kB)
Downloading rdflib-7.1.4-py3-none-any.whl (565 kB)

In [5]:
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF, RDFS, XSD

In [None]:
# Load the ontology
g = Graph()
g.parse("DisasterOntology.rdf", format="xml")  # RDF/XML format

# Defining namespace
EX = Namespace("http://www.semanticweb.org/zakar/ontologies/2025/1/DisasterOntology#")
g.bind("ex", EX)

# Print the number of triples loaded
print(f"Ontology loaded with {len(g)} triples.")


Ontology loaded with 41 triples.


In [7]:
df = pd.read_csv("merged_disaster_news.csv")
df

Unnamed: 0,DisNo.,Disaster Group,Disaster Subgroup,Disaster Subtype,Country,Subregion,Region,Start Year,Source URL,Source,Report
0,1999-9388-DJI,Natural,Climatological,Drought,Djibouti,Sub-Saharan Africa,Africa,2001,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
1,1999-9388-SDN,Natural,Climatological,Drought,Sudan,Northern Africa,Africa,2000,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
2,1999-9388-SOM,Natural,Climatological,Drought,Somalia,Sub-Saharan Africa,Africa,2000,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
3,2000-0001-AGO,Technological,Transport,Road,Angola,Sub-Saharan Africa,Africa,2000,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
4,2000-0002-AGO,Natural,Hydrological,Riverine flood,Angola,Sub-Saharan Africa,Africa,2000,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
...,...,...,...,...,...,...,...,...,...,...,...
17075,28186,,,Flood,Iraq,,Iraq,Oct 2015,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,Heavy rains in late October have caused floodi...
17076,27771,,,"Flash Flood, Flood",Ethiopia,,Ethiopia,Oct 2015,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,"Since the beginning of bega season, incidents ..."
17077,27741,,,Earthquake,Afghanistan,,"Afghanistan, Pakistan",Oct 2015,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,Around 13:40 local time (UTC +4:30) on 26 Octo...
17078,27706,,,Flood,Algeria,,"Algeria, Western Sahara",Oct 2015,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,"Mid-October, heavy rains and flooding caused w..."
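The ReliefWeb rows in the preview above have empty `Disaster Group`, `Disaster Subgroup`, and `Subregion` cells, which pandas loads as `NaN`. Counting missing cells per column shows which data properties will be affected when rows are turned into instances. The frame below is a hypothetical three-row stand-in built from values in the preview, not the real file.

```python
import pandas as pd

# Hypothetical miniature of the merged dataset (values taken from the preview)
sample = pd.DataFrame(
    {
        "DisNo.": ["1999-9388-DJI", "2000-0001-AGO", "28186"],
        "Disaster Group": ["Natural", "Technological", None],
        "Disaster Subtype": ["Drought", "Road", "Flood"],
        "Country": ["Djibouti", "Angola", "Iraq"],
        "Subregion": ["Sub-Saharan Africa", "Sub-Saharan Africa", None],
    }
)

# Count missing cells per column; non-zero columns need a decision
# (skip, default value, etc.) before they are written as literals
missing = sample.isna().sum()
print(missing[missing > 0])
```

On the real data, the equivalent call would be `df.isna().sum()`.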


In [8]:
df = df.drop(columns='Start Year')
df

Unnamed: 0,DisNo.,Disaster Group,Disaster Subgroup,Disaster Subtype,Country,Subregion,Region,Source URL,Source,Report
0,1999-9388-DJI,Natural,Climatological,Drought,Djibouti,Sub-Saharan Africa,Africa,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
1,1999-9388-SDN,Natural,Climatological,Drought,Sudan,Northern Africa,Africa,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
2,1999-9388-SOM,Natural,Climatological,Drought,Somalia,Sub-Saharan Africa,Africa,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
3,2000-0001-AGO,Technological,Transport,Road,Angola,Sub-Saharan Africa,Africa,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
4,2000-0002-AGO,Natural,Hydrological,Riverine flood,Angola,Sub-Saharan Africa,Africa,https://www.emdat.be/,EM-DAT: The International Disaster Database. C...,
...,...,...,...,...,...,...,...,...,...,...
17075,28186,,,Flood,Iraq,,Iraq,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,Heavy rains in late October have caused floodi...
17076,27771,,,"Flash Flood, Flood",Ethiopia,,Ethiopia,https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,"Since the beginning of bega season, incidents ..."
17077,27741,,,Earthquake,Afghanistan,,"Afghanistan, Pakistan",https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,Around 13:40 local time (UTC +4:30) on 26 Octo...
17078,27706,,,Flood,Algeria,,"Algeria, Western Sahara",https://reliefweb.int/,Relief Web: Informing humanitarians worldwide ...,"Mid-October, heavy rains and flooding caused w..."


In [9]:
# Iterate over rows in the CSV and add to the ontology
for _, row in df.iterrows():
    disaster_event = URIRef(EX + row["DisNo."].replace(" ", "_"))
    location = URIRef(EX + row["Country"].replace(" ", "_"))
    source = URIRef(EX + row["Source"].replace(" ", "_"))
    news_report = URIRef(EX + "Report_" + row["DisNo."].replace(" ", "_"))

    # Assert the class (rdf:type) of each individual
    g.add((disaster_event, RDF.type, EX.DisasterEvent))
    g.add((location, RDF.type, EX.Location))
    g.add((source, RDF.type, EX.Source))
    g.add((news_report, RDF.type, EX.NewsReport))

    # Link the individuals through object properties
    g.add((news_report, EX.mentionsEvent, disaster_event))
    g.add((disaster_event, EX.ocurredAt, location))
    g.add((news_report, EX.reportedBy, source))

    # Attach data properties as typed literals
    g.add((disaster_event, EX.disasterGroup, Literal(row["Disaster Group"], datatype=XSD.string)))
    g.add((disaster_event, EX.disasterSubgroup, Literal(row["Disaster Subgroup"], datatype=XSD.string)))
    g.add((disaster_event, EX.disasterSubtype, Literal(row["Disaster Subtype"], datatype=XSD.string)))
    g.add((location, EX.subregion, Literal(row["Subregion"], datatype=XSD.string)))
    g.add((location, EX.region, Literal(row["Region"], datatype=XSD.string)))
    g.add((news_report, EX.sourceURL, Literal(row["Source URL"], datatype=XSD.anyURI)))

# Save ontology to file
g.serialize("disaster_ontology.rdf", format="xml")

print(f"Ontology updated with {len(g)} triples.")

Ontology updated with 154816 triples.
