<a href="https://colab.research.google.com/github/sreent/data-management-intro/blob/main/past-exam-papers/september-2021/notebook-september-2021.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CM3010 September 2021 - Practice Notebook

This notebook provides hands-on practice for the September 2021 exam.

**Instructions:**
1. Run the Setup cells first
2. Write your answers in the empty code cells
3. Check your answers against the solution sheet

---

# 1. Environment Setup

Run these cells first to set up MySQL, MongoDB, xmllint, rapper, and rdflib.

In [1]:
# === MySQL Setup (for SQL exercises) ===
!apt -qq update > /dev/null
!apt -y -qq install mysql-server > /dev/null
!service mysql start

# Create user and database
!mysql -e "CREATE USER IF NOT EXISTS 'examuser'@'localhost' IDENTIFIED BY 'exampass';"
!mysql -e "CREATE DATABASE IF NOT EXISTS exam_db;"
!mysql -e "GRANT ALL PRIVILEGES ON *.* TO 'examuser'@'localhost';"

# === xmllint Setup (for XML/XPath exercises) ===
# libxml2-utils provides xmllint for XML validation and XPath queries
!apt -y -qq install libxml2-utils > /dev/null

# === rapper Setup (for RDF/Turtle validation) ===
# raptor2-utils provides rapper for command-line Turtle validation
!apt -y -qq install raptor2-utils > /dev/null

# === Python libraries ===
!pip install -q sqlalchemy==2.0.20 ipython-sql==0.5.0 pymysql==1.1.0 prettytable==2.0.0 rdflib

%reload_ext sql
%sql mysql+pymysql://examuser:exampass@localhost/exam_db

print("MySQL ready!")
print("xmllint ready!")
print("rapper ready!")



W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)


 * Starting MySQL database server mysqld
   ...done.

MySQL ready!
xmllint ready!
rapper ready!


In [2]:
# Install and start MongoDB
!wget -q http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.1f-1ubuntu2_amd64.deb
!dpkg -i libssl1.1_1.1.1f-1ubuntu2_amd64.deb > /dev/null 2>&1
!wget -qO - https://www.mongodb.org/static/pgp/server-4.4.asc | apt-key add - > /dev/null 2>&1
!echo "deb [ arch=amd64,arm64 ] http://repo.mongodb.org/apt/ubuntu bionic/mongodb-org/4.4 multiverse" | tee /etc/apt/sources.list.d/mongodb-org-4.4.list > /dev/null
!apt-get update -qq > /dev/null
!apt-get install -y -qq mongodb-org > /dev/null
!mkdir -p /data/db
!mongod --fork --logpath /var/log/mongodb.log --dbpath /data/db

# Test MongoDB is running
!mongo --quiet --eval 'print("MongoDB ready!")'

W: http://repo.mongodb.org/apt/ubuntu/dists/bionic/mongodb-org/4.4/InRelease: Key is stored in legacy trusted.gpg keyring (/etc/apt/trusted.gpg), see the DEPRECATION section in apt-key(8) for details.
W: Skipping acquire of configured file 'main/source/Sources' as repository 'https://r2u.stat.illinois.edu/ubuntu jammy InRelease' does not seem to provide it (sources.list entry misspelt?)
about to fork child process, waiting until server is ready for connections.
forked process: 5738
child process started successfully, parent exiting
MongoDB ready!


---

# Section A: MCQ Practice

## Q1(e) & Q1(f): XML Well-Formedness and Validation

These MCQs test whether you can identify:
- **Q1(e)**: Why XML is not **well-formed** (syntax errors)
- **Q1(f)**: Why XML is not **valid** against a schema (structural/constraint violations)

**Key concepts:**
- **Well-formed**: Correct XML syntax (tags closed, properly nested, etc.)
- **Valid**: Conforms to an XSD schema (required elements/attributes present, correct types)
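The well-formedness half of this distinction can be checked programmatically. The sketch below uses Python's standard `xml.etree.ElementTree` parser, which rejects XML that is not well-formed. It does not perform schema validation, so checking validity still needs a tool such as `xmllint --schema`. The helper name `is_well_formed` and the two sample strings are illustrative assumptions, not part of the exam.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the first never closes <cast>, the second does.
broken = "<movie><cast><actor>A</actor></movie>"
fixed = "<movie><cast><actor>A</actor></cast></movie>"

print(is_well_formed(broken))  # False
print(is_well_formed(fixed))   # True
```

A document that passes this check can still be invalid: the parser knows nothing about required attributes or elements declared in an XSD.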

### The Exam Question

Given this XML and XSD schema, identify well-formedness and validation issues.

### movies.xsd (Schema)

In [3]:
%%writefile movies.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="movie">
    <xs:complexType>
      <xs:all>
        <xs:element ref="cast"/>
        <xs:element ref="releaseYear"/>
        <xs:element ref="title"/>
      </xs:all>
    </xs:complexType>
  </xs:element>
  <xs:element name="cast">
    <xs:complexType>
      <xs:sequence>
        <xs:element maxOccurs="unbounded" ref="actor"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:element name="actor">
    <xs:complexType mixed="true">
      <xs:attribute name="role"/>
    </xs:complexType>
  </xs:element>
  <xs:element name="releaseYear" type="xs:integer"/>
  <xs:element name="title">
    <xs:complexType mixed="true">
      <xs:attribute name="lang" use="required"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

Writing movies.xsd


### Q1(e): The Original XML (NOT well-formed)

This is the XML from the exam - it has a well-formedness issue:

In [4]:
%%writefile movies_original.xml
<movie>
  <title>Citizen Kane</title>
  <cast>
    <actor>Orson Welles</actor>
    <actor role="Jebediah Leland">Joseph Cotton</actor>
</movie>

Writing movies_original.xml


### Q1(f): Well-formed but NOT valid XML

To test validation issues (Q1(f)), we first need well-formed XML. Here is the same document with the missing `</cast>` closing tag added; it is now well-formed but still invalid against the schema:

In [5]:
%%writefile movies_wellformed_invalid.xml
<movie>
  <title>Citizen Kane</title>
  <cast>
    <actor>Orson Welles</actor>
    <actor role="Jebediah Leland">Joseph Cotton</actor>
  </cast>
</movie>

Writing movies_wellformed_invalid.xml


### Valid XML (for comparison)

This XML is both well-formed AND valid against the schema:

In [6]:
%%writefile movies_valid.xml
<movie>
  <title lang="en">Citizen Kane</title>
  <cast>
    <actor>Orson Welles</actor>
    <actor role="Jebediah Leland">Joseph Cotton</actor>
  </cast>
  <releaseYear>1941</releaseYear>
</movie>

Writing movies_valid.xml


### Run Validation - See the difference!

In [7]:
# Q1(e): Test well-formedness of the original XML
print("=== Q1(e): Testing well-formedness of movies_original.xml ===")
print("(The cast element is NOT closed)")
!xmllint movies_original.xml 2>&1

=== Q1(e): Testing well-formedness of movies_original.xml ===
(The cast element is NOT closed)
movies_original.xml:6: parser error : Opening and ending tag mismatch: cast line 3 and movie
</movie>
        ^
movies_original.xml:7: parser error : Premature end of data in tag movie line 1

^


In [11]:
# Q1(f): Test validation of well-formed but invalid XML
print("=== Q1(f): Validating movies_wellformed_invalid.xml ===")
print("(Well-formed but missing: lang attribute on title, releaseYear element)")
!xmllint --schema movies.xsd movies_wellformed_invalid.xml 2>&1

=== Q1(f): Validating movies_wellformed_invalid.xml ===
(Well-formed but missing: lang attribute on title, releaseYear element)
<?xml version="1.0"?>
<movie>
  <title>Citizen Kane</title>
  <cast>
    <actor>Orson Welles</actor>
    <actor role="Jebediah Leland">Joseph Cotton</actor>
  </cast>
</movie>
movies_wellformed_invalid.xml:2: element title: Schemas validity error : Element 'title': The attribute 'lang' is required but missing.
movies_wellformed_invalid.xml:1: element movie: Schemas validity error : Element 'movie': Missing child element(s). Expected is ( releaseYear ).
movies_wellformed_invalid.xml fails to validate


In [12]:
print("\n" + "="*60)
print("=== For comparison: Validating movies_valid.xml ===")
!xmllint --schema movies.xsd movies_valid.xml 2>&1


=== For comparison: Validating movies_valid.xml ===
<?xml version="1.0"?>
<movie>
  <title lang="en">Citizen Kane</title>
  <cast>
    <actor>Orson Welles</actor>
    <actor role="Jebediah Leland">Joseph Cotton</actor>
  </cast>
  <releaseYear>1941</releaseYear>
</movie>
movies_valid.xml validates


---

# 2. Question 2: Bird Spotter's Database

## Sample Data: Sightings Table

| Species | Date | NumberSighted | ConservationStatus | NatureReserve | Location |
|---------|------|---------------|-------------------|---------------|----------|
| Bar-tailed godwit | 2021-04-21 | 31 | Least concern | Rainham Marshes | 51.5N 0.2E |
| Wood pigeon | 2021-04-21 | 31 | Least concern | Rainham Marshes | 51.5N 0.2E |
| Greater spotted woodpecker | 2021-06-13 | 1 | Least concern | Epping Forest | 51.6N 0.0E |
| European turtle dove | 2021-06-13 | 2 | Vulnerable | Epping Forest | 51.6N 0.0E |
| Wood pigeon | 2021-06-13 | 2 | Least concern | Epping Forest | 51.6N 0.0E |
| Great bustard | 2020-04-15 | 3 | Vulnerable | Salisbury Plain | 51.1N -1.8W |
| Bar-tailed godwit | 2020-04-20 | 53 | Least concern | Rainham Marshes | 51.5N 0.2E |

## Data Setup

In [13]:
%%sql
DROP TABLE IF EXISTS Sightings;

CREATE TABLE Sightings (
    Species VARCHAR(100),
    Date DATE,
    NumberSighted INT,
    ConservationStatus VARCHAR(50),
    NatureReserve VARCHAR(100),
    Location VARCHAR(50)
);

INSERT INTO Sightings VALUES
('Bar-tailed godwit', '2021-04-21', 31, 'Least concern', 'Rainham Marshes', '51.5N 0.2E'),
('Wood pigeon', '2021-04-21', 31, 'Least concern', 'Rainham Marshes', '51.5N 0.2E'),
('Greater spotted woodpecker', '2021-06-13', 1, 'Least concern', 'Epping Forest', '51.6N 0.0E'),
('European turtle dove', '2021-06-13', 2, 'Vulnerable', 'Epping Forest', '51.6N 0.0E'),
('Wood pigeon', '2021-06-13', 2, 'Least concern', 'Epping Forest', '51.6N 0.0E'),
('Great bustard', '2020-04-15', 3, 'Vulnerable', 'Salisbury Plain', '51.1N -1.8W'),
('Bar-tailed godwit', '2020-04-20', 53, 'Least concern', 'Rainham Marshes', '51.5N 0.2E');

SELECT 'Sightings table ready!' AS Status;

 * mysql+pymysql://examuser:***@localhost/exam_db
0 rows affected.
0 rows affected.
7 rows affected.
1 rows affected.


Status
Sightings table ready!


## Q2(a): Retrieve bird types seen since January 1, 2021 [4 marks]

In [None]:
%%sql
-- Write your query here:


## Q2(c): Create normalized tables [7 marks]

Create tables for: Species, NatureReserves, and Sightings (normalized)

In [None]:
%%sql
-- Create your normalized tables here:


## Q2(e): Query with JOIN [5 marks]

Using normalized tables, retrieve bird types and conservation status for birds seen since January 1, 2021.

### Normalized Tables:

**Species:**
| SpeciesName | ConservationStatus |
|-------------|--------------------|
| Bar-tailed godwit | Least concern |
| Wood pigeon | Least concern |
| Greater spotted woodpecker | Least concern |
| European turtle dove | Vulnerable |
| Great bustard | Vulnerable |

**NatureReserves:**
| ReserveName | Location |
|-------------|----------|
| Rainham Marshes | 51.5N 0.2E |
| Epping Forest | 51.6N 0.0E |
| Salisbury Plain | 51.1N -1.8W |

**SightingsNorm:**
| SpeciesName | ReserveName | Date | NumberSighted |
|-------------|-------------|------|---------------|
| Bar-tailed godwit | Rainham Marshes | 2021-04-21 | 31 |
| Wood pigeon | Rainham Marshes | 2021-04-21 | 31 |
| Greater spotted woodpecker | Epping Forest | 2021-06-13 | 1 |
| European turtle dove | Epping Forest | 2021-06-13 | 2 |
| Wood pigeon | Epping Forest | 2021-06-13 | 2 |
| Great bustard | Salisbury Plain | 2020-04-15 | 3 |
| Bar-tailed godwit | Rainham Marshes | 2020-04-20 | 53 |
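As a rough illustration of how these three tables relate to the original flat table, the sketch below derives them in plain Python by projecting the flat rows onto the relevant columns and removing duplicates. The variable names are assumptions for illustration; the exam answer itself is expected in SQL.

```python
# Flat rows copied from the sample Sightings table:
# (Species, Date, NumberSighted, ConservationStatus, NatureReserve, Location)
flat_rows = [
    ("Bar-tailed godwit", "2021-04-21", 31, "Least concern", "Rainham Marshes", "51.5N 0.2E"),
    ("Wood pigeon", "2021-04-21", 31, "Least concern", "Rainham Marshes", "51.5N 0.2E"),
    ("Greater spotted woodpecker", "2021-06-13", 1, "Least concern", "Epping Forest", "51.6N 0.0E"),
    ("European turtle dove", "2021-06-13", 2, "Vulnerable", "Epping Forest", "51.6N 0.0E"),
    ("Wood pigeon", "2021-06-13", 2, "Least concern", "Epping Forest", "51.6N 0.0E"),
    ("Great bustard", "2020-04-15", 3, "Vulnerable", "Salisbury Plain", "51.1N -1.8W"),
    ("Bar-tailed godwit", "2020-04-20", 53, "Least concern", "Rainham Marshes", "51.5N 0.2E"),
]

# Conservation status depends only on the species, so store each pair once.
species = sorted({(r[0], r[3]) for r in flat_rows})

# Location depends only on the reserve, so store each pair once.
reserves = sorted({(r[4], r[5]) for r in flat_rows})

# What remains per sighting: which species, where, when, and how many.
sightings_norm = [(r[0], r[4], r[1], r[2]) for r in flat_rows]

print(len(species), len(reserves), len(sightings_norm))  # 5 3 7
```

The seven flat rows repeat the same status and location values; after the split each of those facts is stored exactly once.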

### Setup

In [15]:
%%sql
DROP TABLE IF EXISTS SightingsNorm;
DROP TABLE IF EXISTS Species;
DROP TABLE IF EXISTS NatureReserves;

CREATE TABLE Species (
    SpeciesName VARCHAR(100) PRIMARY KEY,
    ConservationStatus VARCHAR(50)
);

CREATE TABLE NatureReserves (
    ReserveName VARCHAR(100) PRIMARY KEY,
    Location VARCHAR(50)
);

CREATE TABLE SightingsNorm (
    SpeciesName VARCHAR(100),
    ReserveName VARCHAR(100),
    Date DATE,
    NumberSighted INT,
    PRIMARY KEY (SpeciesName, ReserveName, Date),
    FOREIGN KEY (SpeciesName) REFERENCES Species(SpeciesName),
    FOREIGN KEY (ReserveName) REFERENCES NatureReserves(ReserveName)
);

INSERT INTO Species VALUES
('Bar-tailed godwit', 'Least concern'),
('Wood pigeon', 'Least concern'),
('Greater spotted woodpecker', 'Least concern'),
('European turtle dove', 'Vulnerable'),
('Great bustard', 'Vulnerable');

INSERT INTO NatureReserves VALUES
('Rainham Marshes', '51.5N 0.2E'),
('Epping Forest', '51.6N 0.0E'),
('Salisbury Plain', '51.1N -1.8W');

INSERT INTO SightingsNorm VALUES
('Bar-tailed godwit', 'Rainham Marshes', '2021-04-21', 31),
('Wood pigeon', 'Rainham Marshes', '2021-04-21', 31),
('Greater spotted woodpecker', 'Epping Forest', '2021-06-13', 1),
('European turtle dove', 'Epping Forest', '2021-06-13', 2),
('Wood pigeon', 'Epping Forest', '2021-06-13', 2),
('Great bustard', 'Salisbury Plain', '2020-04-15', 3),
('Bar-tailed godwit', 'Rainham Marshes', '2020-04-20', 53);

SELECT 'Normalized tables ready!' AS Status;

 * mysql+pymysql://examuser:***@localhost/exam_db
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
5 rows affected.
3 rows affected.
7 rows affected.
1 rows affected.


Status
Normalized tables ready!


In [16]:
%%sql
-- Write your JOIN query here:



## Q2(f): Write a transaction example [7 marks]

In [17]:
%%sql
-- Write a transaction example here:



---

# 3. Question 3: MEI Music Encoding

## Sample Data: MEI XML

```xml
<measure>
  <staff n="2">
    <layer n="1">
      <chord xml:id="d13e1" dur="8" dur.ppq="12" stem.dir="up">
        <note xml:id="d1e101" pname="c" oct="5"/>
        <note xml:id="d1e118" pname="a" oct="4"/>
        <note xml:id="d1e136" pname="c" oct="4"/>
      </chord>
    </layer>
  </staff>
  <staff n="3">
    <layer n="1">
      <chord xml:id="d17e1" dur="8" dur.ppq="12" stem.dir="up">
        <note xml:id="d1e157" pname="f" oct="3"/>
        <note xml:id="d1e174" pname="f" oct="2"/>
      </chord>
    </layer>
  </staff>
</measure>
```

## Data Setup

In [18]:
%%writefile mei_sample.xml
<measure>
  <staff n="2">
    <layer n="1">
      <chord xml:id="d13e1" dur="8" dur.ppq="12" stem.dir="up">
        <note xml:id="d1e101" pname="c" oct="5"/>
        <note xml:id="d1e118" pname="a" oct="4"/>
        <note xml:id="d1e136" pname="c" oct="4"/>
      </chord>
    </layer>
  </staff>
  <staff n="3">
    <layer n="1">
      <chord xml:id="d17e1" dur="8" dur.ppq="12" stem.dir="up">
        <note xml:id="d1e157" pname="f" oct="3"/>
        <note xml:id="d1e174" pname="f" oct="2"/>
      </chord>
    </layer>
  </staff>
</measure>

Writing mei_sample.xml


## Q3(b): Fix the XPath [3 marks]

**The Question:** "I am trying to retrieve all chords in the staff with n of 2... but I only want chords that contain notes with a pname of **f**, but my XPath is incorrect."

**The Incorrect XPath:** `/staff[n="2"]/layer/chord[note/@pname="c"]`

**Issues to identify:**
1. Path starts with `/staff` but `staff` is not the root element
2. `[n="2"]` looks for a *child element* `<n>`, not an *attribute* `@n`
3. The XPath searches for `pname="c"` but we want `pname="f"`
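Issue 2 is easy to overlook, so here is a small sketch of the difference between a child-element predicate and an attribute predicate. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, and a hypothetical `library`/`shelf` document rather than the MEI sample, so it does not give away the exam answer.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: one shelf carries n as an attribute,
# the other carries n as a child element.
doc = ET.fromstring("""
<library>
  <shelf n="2"><book>A</book></shelf>
  <shelf><n>2</n><book>B</book></shelf>
</library>
""")

# [n='2'] tests a CHILD ELEMENT named n whose text is "2".
by_child = doc.findall("shelf[n='2']/book")

# [@n='2'] tests an ATTRIBUTE named n whose value is "2".
by_attr = doc.findall("shelf[@n='2']/book")

print([b.text for b in by_child])  # ['B']
print([b.text for b in by_attr])   # ['A']
```

The same rule applies in `xmllint --xpath`: without the `@`, the predicate never looks at attributes.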

Write a correct XPath expression:

In [19]:
# Test the INCORRECT XPath from the exam (notice the errors!)
print("=== Testing INCORRECT XPath: /staff[n=\"2\"]/layer/chord[note/@pname=\"c\"] ===")
print("This fails because: 1) staff is not root, 2) n=\"2\" looks for child element not attribute")
!xmllint --xpath '/staff[n="2"]/layer/chord[note/@pname="c"]' mei_sample.xml 2>&1 || true

print("\n=== What we actually want: chords in staff n=2 with notes containing pname='f' ===")
print("But staff n=2 has notes with pname c and a, NOT f!")
print("Staff n=3 has the notes with pname='f'")

=== Testing INCORRECT XPath: /staff[n="2"]/layer/chord[note/@pname="c"] ===
This fails because: 1) staff is not root, 2) n="2" looks for child element not attribute
XPath set is empty

=== What we actually want: chords in staff n=2 with notes containing pname='f' ===
But staff n=2 has notes with pname c and a, NOT f!
Staff n=3 has the notes with pname='f'


In [None]:
# Write your CORRECTED XPath here:
# Hint: Use // for descendant, [@attr] for attributes, and fix the pname value

In [21]:
# Example: Find all chords in staff with @n="2" that have notes with pname="c"
!xmllint --xpath '...' mei_sample.xml 2>&1

<chord xml:id="d13e1" dur="8" dur.ppq="12" stem.dir="up">
        <note xml:id="d1e101" pname="c" oct="5"/>
        <note xml:id="d1e118" pname="a" oct="4"/>
        <note xml:id="d1e136" pname="c" oct="4"/>
      </chord>


In [None]:
print("\n--- Now try to find chords with notes having pname='f' ---")
# Your answer: (note: this would be in staff n="3", not n="2")
!xmllint --xpath '...' mei_sample.xml 2>&1

## Q3(c)(i): Translate the first chord to JSON [5 marks]

First chord element:
```xml
<chord xml:id="d13e1" dur="8" dur.ppq="12" stem.dir="up">
  <note xml:id="d1e101" pname="c" oct="5"/>
  <note xml:id="d1e118" pname="a" oct="4"/>
  <note xml:id="d1e136" pname="c" oct="4"/>
</chord>
```

In [None]:
import json

# Create your JSON representation:
chord_json = {
    # Fill in here
}

print(json.dumps(chord_json, indent=2))

## Q3(c)(ii): MongoDB find query [5 marks]

Find chords with upward stems that have 'f' in one of their notes.

### Sample Data: Chord Documents

```json
{
  "xml_id": "d13e1",
  "dur": 8,
  "stem_dir": "up",
  "notes": [
    {"pname": "c", "oct": 5},
    {"pname": "a", "oct": 4},
    {"pname": "c", "oct": 4}
  ]
}

{
  "xml_id": "d17e1",
  "dur": 8,
  "stem_dir": "up",
  "notes": [
    {"pname": "f", "oct": 3},
    {"pname": "f", "oct": 2}
  ]
}
```
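Documents of this shape can be derived mechanically from the MEI markup. The sketch below is one possible mapping, using Python's standard `xml.etree.ElementTree`. The field names (`xml_id`, `stem_dir`) follow the sample documents above; the choice to drop `dur.ppq` and the note identifiers is an assumption that mirrors those samples, not a requirement of the exam.

```python
import json
import xml.etree.ElementTree as ET

# ElementTree reports xml:id under the predefined XML namespace.
XML_NS = "{http://www.w3.org/XML/1998/namespace}"

chord_xml = """
<chord xml:id="d13e1" dur="8" dur.ppq="12" stem.dir="up">
  <note xml:id="d1e101" pname="c" oct="5"/>
  <note xml:id="d1e118" pname="a" oct="4"/>
  <note xml:id="d1e136" pname="c" oct="4"/>
</chord>
"""

chord = ET.fromstring(chord_xml)

# Attributes become fields; child <note> elements become an array.
chord_doc = {
    "xml_id": chord.get(XML_NS + "id"),
    "dur": int(chord.get("dur")),
    "stem_dir": chord.get("stem.dir"),
    "notes": [
        {"pname": note.get("pname"), "oct": int(note.get("oct"))}
        for note in chord.findall("note")
    ],
}

print(json.dumps(chord_doc, indent=2))
```

Renaming `stem.dir` to `stem_dir` avoids dots in field names, which would otherwise clash with MongoDB's dotted-path syntax for nested fields.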

### Setup

In [24]:
%%bash
# Insert chord data using mongo shell (command line)
mongo music_db --quiet --eval '
db.chords.drop();
db.chords.insertMany([
    {
        "xml_id": "d13e1",
        "dur": 8,
        "stem_dir": "up",
        "notes": [
            {"pname": "c", "oct": 5},
            {"pname": "a", "oct": 4},
            {"pname": "c", "oct": 4}
        ]
    },
    {
        "xml_id": "d17e1",
        "dur": 8,
        "stem_dir": "up",
        "notes": [
            {"pname": "f", "oct": 3},
            {"pname": "f", "oct": 2}
        ]
    }
]);
'

{
	"acknowledged" : true,
	"insertedIds" : [
		ObjectId("697c6e0403e794b2441c691d"),
		ObjectId("697c6e0403e794b2441c691e")
	]
}


In [None]:
%%bash
# Write your MongoDB find query using mongo shell:
# Find chords with stem_dir="up" AND notes.pname="f"

mongo music_db --quiet --eval '
db.chords.find({
    // Fill in your query here
}).pretty();
'

In [None]:
# Example answer (remove the leading "# " to test; the "!" runs it as a shell command):
# !mongo music_db --quiet --eval 'db.chords.find({"stem_dir": "up", "notes.pname": "f"})'
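When a dotted path such as `notes.pname` points into an array of subdocuments, MongoDB matches a document if at least one array element has the given value. The following plain-Python sketch mimics that matching rule on the two sample documents. It makes no MongoDB calls, and the helper name `matches` is hypothetical.

```python
chords = [
    {
        "xml_id": "d13e1",
        "dur": 8,
        "stem_dir": "up",
        "notes": [
            {"pname": "c", "oct": 5},
            {"pname": "a", "oct": 4},
            {"pname": "c", "oct": 4},
        ],
    },
    {
        "xml_id": "d17e1",
        "dur": 8,
        "stem_dir": "up",
        "notes": [
            {"pname": "f", "oct": 3},
            {"pname": "f", "oct": 2},
        ],
    },
]


def matches(chord: dict, stem_dir: str, pname: str) -> bool:
    """Both conditions must hold; the note condition needs only ONE matching note."""
    has_note = any(note.get("pname") == pname for note in chord.get("notes", []))
    return chord.get("stem_dir") == stem_dir and has_note


result = [c["xml_id"] for c in chords if matches(c, "up", "f")]
print(result)  # ['d17e1']
```

Both chords have upward stems, so the note condition is what separates them here.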

## Q3(d)(ii): Write RDF in Turtle for the first chord [5 marks]

In [None]:
%%writefile chord.ttl
@prefix mei: <http://example.org/mei#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Write your RDF triples here:


In [None]:
# Validate your Turtle using both methods:

# Method 1: Command-line with rapper (quick validation)
print("=== rapper (command-line validation) ===")
!rapper -i turtle -c chord.ttl 2>&1

# Method 2: Python rdflib (parse and inspect triples)
print("\n=== rdflib (Python - inspect triples) ===")
from rdflib import Graph
g = Graph()
try:
    g.parse('chord.ttl', format='turtle')
    print(f"Valid Turtle! Contains {len(g)} triples.")
    print("\nTriples:")
    for s, p, o in g:
        print(f"  {s} -- {p} --> {o}")
except Exception as e:
    print(f"Error: {e}")

---

# 4. Question 4: Zoo Database

## Sample Data

**Zoo:**
| Name | Country |
|------|---------|
| Singapore Zoo | Singapore |
| London Zoo | UK |

**Enclosure:**
| Name | Location | ZooName |
|------|----------|---------|
| Tropical Aviary | Mandai Lake | Singapore Zoo |
| Savannah Zone | Outer Gardens | Singapore Zoo |
| Reptile House | Regents Park | London Zoo |
| Bird Paradise | Regents Park | London Zoo |

**Species:**
| LatinName | ConservationStatus |
|-----------|--------------------|
| Buceros bicornis | Vulnerable |
| Panthera leo | Vulnerable |
| Varanus komodoensis | Endangered |

**Animal:**
| Identifier | DateOfBirth | SpeciesLatinName | EnclosureName |
|------------|-------------|-----------|---------------|
| 1 | 2010-04-10 | Buceros bicornis | Tropical Aviary |
| 2 | 2012-06-15 | Panthera leo | Savannah Zone |
| 3 | 2005-02-01 | Varanus komodoensis | Reptile House |
| 4 | 2015-09-09 | Buceros bicornis | Savannah Zone |
| 5 | 2008-03-15 | Buceros bicornis | Bird Paradise |
| 6 | 2018-11-20 | Buceros bicornis | Bird Paradise |

*Note: Sample data includes Buceros bicornis (Great Hornbill) in both zoos for Q4(d)*
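To check a Q4(d) answer, it helps to know the expected result in advance. The oldest animal is the one with the earliest `DateOfBirth`, so the task is a minimum per zoo, not a maximum. The sketch below works this out in plain Python from the sample rows above; the variable names are illustrative assumptions, and the exam answer itself is expected in SQL.

```python
from datetime import date

# Enclosure -> zoo, copied from the Enclosure sample table.
enclosure_zoo = {
    "Tropical Aviary": "Singapore Zoo",
    "Savannah Zone": "Singapore Zoo",
    "Reptile House": "London Zoo",
    "Bird Paradise": "London Zoo",
}

# (Identifier, DateOfBirth, SpeciesLatinName, EnclosureName) from the Animal table.
animals = [
    (1, date(2010, 4, 10), "Buceros bicornis", "Tropical Aviary"),
    (2, date(2012, 6, 15), "Panthera leo", "Savannah Zone"),
    (3, date(2005, 2, 1), "Varanus komodoensis", "Reptile House"),
    (4, date(2015, 9, 9), "Buceros bicornis", "Savannah Zone"),
    (5, date(2008, 3, 15), "Buceros bicornis", "Bird Paradise"),
    (6, date(2018, 11, 20), "Buceros bicornis", "Bird Paradise"),
]

oldest = {}  # zoo name -> (identifier, date of birth)
for ident, dob, species, enclosure in animals:
    if species != "Buceros bicornis":
        continue
    zoo = enclosure_zoo[enclosure]
    # Keep the animal with the EARLIEST date of birth in each zoo.
    if zoo not in oldest or dob < oldest[zoo][1]:
        oldest[zoo] = (ident, dob)

for zoo, (ident, dob) in sorted(oldest.items()):
    print(zoo, ident, dob.isoformat())
# London Zoo 5 2008-03-15
# Singapore Zoo 1 2010-04-10
```

Note that reaching the zoo requires going through the enclosure, which is the same path a SQL answer has to follow with joins.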

## Data Setup

In [25]:
%%sql
DROP TABLE IF EXISTS SightingsNorm;
DROP TABLE IF EXISTS Animal;
DROP TABLE IF EXISTS Species;
DROP TABLE IF EXISTS Enclosure;
DROP TABLE IF EXISTS Zoo;


CREATE TABLE Zoo (
    Name VARCHAR(255) PRIMARY KEY,
    Country VARCHAR(255)
);

CREATE TABLE Enclosure (
    Name VARCHAR(255) PRIMARY KEY,
    Location VARCHAR(255),
    ZooName VARCHAR(255),
    FOREIGN KEY (ZooName) REFERENCES Zoo(Name)
);

CREATE TABLE Species (
    LatinName VARCHAR(255) PRIMARY KEY,
    ConservationStatus VARCHAR(50)
);

CREATE TABLE Animal (
    Identifier INT AUTO_INCREMENT PRIMARY KEY,
    DateOfBirth DATE,
    SpeciesLatinName VARCHAR(255),
    EnclosureName VARCHAR(255),
    FOREIGN KEY (SpeciesLatinName) REFERENCES Species(LatinName),
    FOREIGN KEY (EnclosureName) REFERENCES Enclosure(Name)
);

-- Insert Zoos
INSERT INTO Zoo VALUES ('Singapore Zoo', 'Singapore'), ('London Zoo', 'UK');

-- Insert Enclosures (note: Bird Paradise added for London Zoo)
INSERT INTO Enclosure VALUES
('Tropical Aviary', 'Mandai Lake', 'Singapore Zoo'),
('Savannah Zone', 'Outer Gardens', 'Singapore Zoo'),
('Reptile House', 'Regents Park', 'London Zoo'),
('Bird Paradise', 'Regents Park', 'London Zoo');

-- Insert Species (Varanus komodoensis is the Komodo dragon, housed in the Reptile House)
INSERT INTO Species VALUES
('Buceros bicornis', 'Vulnerable'),
('Panthera leo', 'Vulnerable'),
('Varanus komodoensis', 'Endangered');

-- Insert Animals (Buceros bicornis in BOTH zoos for Q4(d) to be meaningful)
INSERT INTO Animal (DateOfBirth, SpeciesLatinName, EnclosureName) VALUES
('2010-04-10', 'Buceros bicornis', 'Tropical Aviary'),   -- Singapore, older
('2012-06-15', 'Panthera leo', 'Savannah Zone'),         -- Singapore
('2005-02-01', 'Varanus komodoensis', 'Reptile House'),  -- London
('2015-09-09', 'Buceros bicornis', 'Savannah Zone'),     -- Singapore, younger
('2008-03-15', 'Buceros bicornis', 'Bird Paradise'),     -- London, older
('2018-11-20', 'Buceros bicornis', 'Bird Paradise');     -- London, younger

SELECT 'Zoo database ready!' AS Status;

 * mysql+pymysql://examuser:***@localhost/exam_db
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
0 rows affected.
2 rows affected.
4 rows affected.
3 rows affected.
6 rows affected.
1 rows affected.


Status
Zoo database ready!


## Q4(b): Write CREATE TABLE for two tables [6 marks]

In [None]:
%%sql
-- Write CREATE TABLE commands for any two tables:


## Q4(c): Count species in Singapore Zoo [5 marks]

In [None]:
%%sql
-- Write your query here:


## Q4(d): Find oldest 'Buceros bicornis' in each zoo [5 marks]

In [None]:
%%sql
-- Write your query here:


## Q4(e): Write instance data in RDF [10 marks]

In [None]:
%%writefile zoo.ttl
@prefix zoo: <http://example.org/zoo#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Write your RDF instance data here:


In [None]:
# Validate your Turtle using both methods:

# Method 1: Command-line with rapper (quick validation)
print("=== rapper (command-line validation) ===")
!rapper -i turtle -c zoo.ttl 2>&1

# Method 2: Python rdflib (parse and inspect triples)
print("\n=== rdflib (Python - inspect triples) ===")
from rdflib import Graph
g = Graph()
try:
    g.parse('zoo.ttl', format='turtle')
    print(f"Valid Turtle! Contains {len(g)} triples.")
    print("\nTriples:")
    for s, p, o in g:
        print(f"  {s} -- {p} --> {o}")
except Exception as e:
    print(f"Error: {e}")

---

# Done!

Check your answers against the **solution sheet**.