
# SQL with Python
This lab introduces SQL using Python and SQLite, a lightweight database system. You will learn how to:
- Connect to an SQLite database.
- Create tables and insert data.
- Perform batch data insertion.
- Query the database.

SQLite is well suited to learning because it needs no server or extra setup, and Python's standard library already includes the `sqlite3` module.


## Install and Import Dependencies
SQLite is included with Python, so no additional installation is required. However, we'll use pandas for CSV manipulation.


In [1]:
import sqlite3
import pandas as pd
import os


## Connect to SQLite Database
We'll create or connect to an SQLite database file named `lab_2.db`. If the file doesn't exist, SQLite will create it.


In [2]:
# Remove any existing database file so the lab starts from a clean state
if os.path.exists("lab_2.db"):
    os.remove("lab_2.db")

# Connect to SQLite database
connection = sqlite3.connect("lab_2.db")

# Create a cursor object to execute SQL commands
cursor = connection.cursor()

print("Database connection established.")


Database connection established.


## Creating a Table
We'll create a table called `Students` with the following schema:
- `ID`: INTEGER, Primary Key
- `Name`: TEXT
- `Age`: INTEGER
- `Grade`: TEXT


In [None]:
# Create a table
create_table_query = '''
CREATE TABLE IF NOT EXISTS Students (
    ID INTEGER PRIMARY KEY,
    Name TEXT,
    Age INTEGER,
    Grade TEXT
);
'''
cursor.execute(create_table_query)
connection.commit()

print("Table 'Students' created successfully.")


Table 'Students' created successfully.


## Inserting Data
Insert a few records into the `Students` table.


In [None]:

insert_query = '''
INSERT INTO Students (Name, Age, Grade)
VALUES (?, ?, ?);
'''

# data
students_data = [
    ('Alice', 20, 'A'),
    ('Bob', 22, 'B'),
    ('Charlie', 21, 'A'),
]


cursor.executemany(insert_query, students_data)
connection.commit()

print("Records inserted successfully.")


Records inserted successfully.
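
The insert above relies on two things that are easy to miss: the `?` placeholders pass values as data rather than as SQL text, and `ID` is filled in by SQLite because it is an `INTEGER PRIMARY KEY` left out of the column list. A minimal sketch of both, assuming a throwaway in-memory database (the names `demo_conn`, `demo_cur`, and `tricky_name` are illustrative and not part of the lab):

```python
import sqlite3

# Throwaway in-memory database so the lab's lab_2.db file is left untouched.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute(
    "CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Grade TEXT)"
)

# A name containing a quote would break naive string formatting;
# with a ? placeholder it is passed as data, not as SQL text.
tricky_name = "O'Brien"
demo_cur.execute(
    "INSERT INTO Students (Name, Age, Grade) VALUES (?, ?, ?)",
    (tricky_name, 23, "B"),
)
demo_conn.commit()

# ID was omitted from the INSERT, so SQLite assigned it automatically.
new_id = demo_cur.lastrowid
stored = demo_cur.execute(
    "SELECT ID, Name FROM Students WHERE ID = ?", (new_id,)
).fetchone()
print(stored)
demo_conn.close()
```

Building the statement with string formatting instead would require escaping the quote by hand, which is the usual source of SQL injection bugs.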


## Querying Data
Use SQL SELECT queries to retrieve data from the `Students` table.


In [None]:
# Retrieve data
query = "SELECT * FROM Students;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)


(1, 'Alice', 20, 'A')
(2, 'Bob', 22, 'B')
(3, 'Charlie', 21, 'A')
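
Rows come back as plain tuples, so columns are addressed by position. If access by column name is preferred, `sqlite3.Row` can be set as the connection's row factory. The sketch below assumes a separate in-memory copy of the `Students` data; `row_conn` and `grade_a` are illustrative names:

```python
import sqlite3

row_conn = sqlite3.connect(":memory:")
row_conn.row_factory = sqlite3.Row  # rows now support access by column name

row_conn.execute(
    "CREATE TABLE Students (ID INTEGER PRIMARY KEY, Name TEXT, Age INTEGER, Grade TEXT)"
)
row_conn.executemany(
    "INSERT INTO Students (Name, Age, Grade) VALUES (?, ?, ?)",
    [("Alice", 20, "A"), ("Bob", 22, "B"), ("Charlie", 21, "A")],
)

# Filter with a placeholder and read the result by column name.
grade_a = [
    row["Name"]
    for row in row_conn.execute(
        "SELECT Name, Grade FROM Students WHERE Grade = ? ORDER BY Name", ("A",)
    )
]
print(grade_a)
row_conn.close()
```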


## Batch Insertion
We will load data into a new table named `Employees`.
The table should contain the following columns:
- `ID` (INTEGER)
- `Name` (TEXT)
- `Position` (TEXT)
- `Salary` (REAL)

Let's create the `Employees` table and insert the data.


In [None]:
# Create Employees table
create_employees_table = '''
CREATE TABLE IF NOT EXISTS Employees (
    ID INTEGER PRIMARY KEY,
    Name TEXT,
    Position TEXT,
    Salary REAL
);
'''
cursor.execute(create_employees_table)

#data
csv_data = [
    (1, 'John Doe', 'Manager', 75000.0),
    (2, 'Jane Smith', 'Developer', 65000.0),
    (3, 'Emily Davis', 'Designer', 50000.0),
]

# Insert data into Employees table
insert_employees_query = '''
INSERT INTO Employees (ID, Name, Position, Salary)
VALUES (?, ?, ?, ?);
'''

cursor.executemany(insert_employees_query, csv_data)
connection.commit()

print("Rows inserted into 'Employees' table.")


Rows inserted into 'Employees' table.
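
The rows in this cell are defined inline as a Python list. If they lived in a CSV file instead, one possible approach is to parse them with the standard `csv` module and hand the resulting tuples to `executemany`. This is only a sketch: `csv_text` stands in for a real file, and the names `csv_conn`, `rows`, and `loaded` are assumptions made for illustration.

```python
import csv
import io
import sqlite3

# Stand-in for the contents of a CSV file (hypothetical; the lab defines rows inline).
csv_text = """ID,Name,Position,Salary
1,John Doe,Manager,75000.0
2,Jane Smith,Developer,65000.0
3,Emily Davis,Designer,50000.0
"""

csv_conn = sqlite3.connect(":memory:")
csv_conn.execute(
    "CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, Position TEXT, Salary REAL)"
)

# csv yields strings, so convert numeric columns before inserting.
reader = csv.DictReader(io.StringIO(csv_text))
rows = [
    (int(r["ID"]), r["Name"], r["Position"], float(r["Salary"]))
    for r in reader
]

csv_conn.executemany(
    "INSERT INTO Employees (ID, Name, Position, Salary) VALUES (?, ?, ?, ?)",
    rows,
)
csv_conn.commit()

loaded = csv_conn.execute("SELECT COUNT(*), MAX(Salary) FROM Employees").fetchone()
print(loaded)
csv_conn.close()
```

For a file on disk, `io.StringIO(csv_text)` would be replaced by `open(path, newline="")`.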


In [None]:

# Salary is omitted for these rows, so SQLite stores NULL in that column
csv_data = [
    (4, 'John Doe', 'Manager'),
    (5, 'Jane Smith', 'Developer'),
    (6, 'Emily Davis', 'Designer'),
]
insert_employees_query = '''
INSERT INTO Employees (ID, Name, Position)
VALUES (?, ?, ?);
'''

cursor.executemany(insert_employees_query, csv_data)
connection.commit()

print("Rows inserted into 'Employees' table.")

Rows inserted into 'Employees' table.


In [None]:
cursor.execute("SELECT * FROM Employees;")



<sqlite3.Cursor at 0x7b1dbbb888c0>

In [None]:
results = cursor.fetchmany(1)
results

[(1, 'John Doe', 'Manager', 75000.0)]
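
`fetchmany(1)` returned only the first row because a cursor hands out each row of a result set once, in order. The sketch below shows how `fetchone`, `fetchmany`, and `fetchall` share that position; the `Numbers` table and the `fetch_*` names are made up for the example.

```python
import sqlite3

fetch_conn = sqlite3.connect(":memory:")
fetch_cur = fetch_conn.cursor()
fetch_cur.execute("CREATE TABLE Numbers (n INTEGER)")
fetch_cur.executemany(
    "INSERT INTO Numbers (n) VALUES (?)", [(i,) for i in range(1, 6)]
)

fetch_cur.execute("SELECT n FROM Numbers ORDER BY n")
first = fetch_cur.fetchone()    # next single row
batch = fetch_cur.fetchmany(2)  # up to 2 further rows
rest = fetch_cur.fetchall()     # everything that is left
after = fetch_cur.fetchone()    # None: the result set is exhausted

print(first, batch, rest, after)
fetch_conn.close()
```

To read the rows again, the query has to be executed again.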

## Query Examples
Let's perform some queries:
1. Retrieve all employees earning more than $60,000.
2. Count the number of employees.


In [None]:
# Query: Employees earning more than $60,000
query_high_salary = "SELECT * FROM Employees WHERE Salary > 60000;"
cursor.execute(query_high_salary)
# we use fetch all when we have multiple rows as output
high_salary_employees = cursor.fetchall()
print("Employees earning more than $60,000:")
for emp in high_salary_employees:
    print(emp)

# Query: Count the number of employees
query_count = "SELECT COUNT(*) FROM Employees;"
cursor.execute(query_count)
# COUNT(*) always yields exactly one row, so fetchone is enough
employee_count = cursor.fetchone()[0]
print(f"Total number of employees: {employee_count}")


Employees earning more than $60,000:
(1, 'John Doe', 'Manager', 75000.0)
(2, 'Jane Smith', 'Developer', 65000.0)
Total number of employees: 6
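
Only two of the six employees appear in the salary filter because rows 4 to 6 were inserted without a salary, and a comparison against `NULL` is never true. The following sketch rebuilds the same six rows in an in-memory database (the `null_conn` name and the extra queries are illustrative additions) to show how `COUNT(*)`, `COUNT(Salary)`, and `AVG(Salary)` treat those rows:

```python
import sqlite3

null_conn = sqlite3.connect(":memory:")
null_conn.execute(
    "CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Name TEXT, Position TEXT, Salary REAL)"
)
null_conn.executemany(
    "INSERT INTO Employees (ID, Name, Position, Salary) VALUES (?, ?, ?, ?)",
    [
        (1, "John Doe", "Manager", 75000.0),
        (2, "Jane Smith", "Developer", 65000.0),
        (3, "Emily Davis", "Designer", 50000.0),
    ],
)
# No Salary column here, so these rows get NULL salaries.
null_conn.executemany(
    "INSERT INTO Employees (ID, Name, Position) VALUES (?, ?, ?)",
    [
        (4, "John Doe", "Manager"),
        (5, "Jane Smith", "Developer"),
        (6, "Emily Davis", "Designer"),
    ],
)

# COUNT(*) counts rows; COUNT(col) and AVG(col) skip NULL values.
total_rows, with_salary, avg_salary = null_conn.execute(
    "SELECT COUNT(*), COUNT(Salary), AVG(Salary) FROM Employees"
).fetchone()

# NULL must be tested with IS NULL, not with = or >.
missing = [
    row[0]
    for row in null_conn.execute(
        "SELECT ID FROM Employees WHERE Salary IS NULL ORDER BY ID"
    )
]
print(total_rows, with_salary, missing)
null_conn.close()
```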


## Closing the Database Connection
Close the connection when you are done so that SQLite releases the database file. Commit first: changes that have not been committed are discarded when the connection closes.


In [None]:
# Close the connection
connection.close()
print("Database connection closed.")


Database connection closed.
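
Closing by hand is easy to forget when a cell raises an exception. One alternative, sketched here under the assumption of a throwaway in-memory database, is `contextlib.closing`, which closes the connection when the block exits. Note that `with connection:` on its own only manages the transaction (commit on success, rollback on error) and does not close anything. The `Log` table and the `ctx_conn` name are illustrative.

```python
import sqlite3
from contextlib import closing

with closing(sqlite3.connect(":memory:")) as ctx_conn:
    ctx_conn.execute("CREATE TABLE Log (msg TEXT)")
    ctx_conn.commit()

    try:
        # Transaction scope: rolled back because the block raises.
        with ctx_conn:
            ctx_conn.execute(
                "INSERT INTO Log (msg) VALUES (?)", ("will be rolled back",)
            )
            raise RuntimeError("simulated failure")
    except RuntimeError:
        pass

    # Transaction scope: committed because the block finishes normally.
    with ctx_conn:
        ctx_conn.execute("INSERT INTO Log (msg) VALUES (?)", ("kept",))

    messages = [row[0] for row in ctx_conn.execute("SELECT msg FROM Log")]

# closing() has closed the connection; further use raises ProgrammingError.
try:
    ctx_conn.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(messages, still_open)
```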


# It is your turn!
## **Lab: Transform the ER Diagram from Exercise 2.3 into SQL Tables**

### **Objective**
The goal of this lab is to transform the given ER diagram into SQL tables, populate the tables with sample data, and write SQL queries to answer specific questions.

---

### **Instructions**


1. **Create SQL Tables**  
   - Use `CREATE TABLE` statements to define tables for each entity and relationship.  
   - Ensure proper use of primary keys, foreign keys, and data types.

2. **Insert Sample Data**  
   - Populate each table with at least 5–10 rows of sample data.

3. **Write SQL Queries**  
   - Use SQL queries to answer the provided questions.
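
A note on the foreign keys in step 1: in most SQLite builds, foreign-key constraints are parsed but not enforced unless `PRAGMA foreign_keys = ON` is issued on each connection. The sketch below uses simplified, hypothetical versions of two tables (`Dept` and `Graduate`, with fewer columns than the lab schema) to show the constraint rejecting a row once enforcement is on:

```python
import sqlite3

fk_conn = sqlite3.connect(":memory:")
fk_conn.execute("PRAGMA foreign_keys = ON")  # per-connection setting

fk_conn.execute("CREATE TABLE Dept (dno INTEGER PRIMARY KEY, dName TEXT)")
fk_conn.execute(
    """
    CREATE TABLE Graduate (
        grad_ssn TEXT PRIMARY KEY,
        name TEXT,
        major INTEGER,
        FOREIGN KEY (major) REFERENCES Dept(dno)
    )
    """
)
fk_conn.execute(
    "INSERT INTO Dept (dno, dName) VALUES (?, ?)", (1, "Computer Science")
)
# Valid: department 1 exists.
fk_conn.execute(
    "INSERT INTO Graduate VALUES (?, ?, ?)", ("1000000001", "Emily Davis", 1)
)

# Invalid: department 99 does not exist, so the insert should be rejected.
try:
    fk_conn.execute(
        "INSERT INTO Graduate VALUES (?, ?, ?)", ("1000000002", "James Brown", 99)
    )
    rejected = False
except sqlite3.IntegrityError as exc:
    rejected = True
    print("Rejected:", exc)

graduate_count = fk_conn.execute("SELECT COUNT(*) FROM Graduate").fetchone()[0]
fk_conn.close()
```

Without the pragma, the second insert would typically succeed silently, which is how a foreign key that points at a misspelled table or the wrong column can go unnoticed.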

---

### **Questions**

1. **List all professors and their research specialties.**  

2. **Retrieve all projects managed by a specific professor.**  

3. **Find the names of graduate students working on a specific project.**  

4. **Find professors supervising graduate students on a specific project.**  

5. **Count how many professors work in each department.**  

6. **List graduate students and their advisors.**  

7. **Find all projects with a budget greater than $1,000,000.**  

8. **Retrieve all departments and their chairmen.**  

9. **Find the total number of projects a specific professor is managing.**  

10. **Find the names of graduate students and their major department.**  



In [3]:
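# CREATE TABLES
# The connection was closed at the end of the previous section, so reopen it first.
connection = sqlite3.connect("lab_2.db")
cursor = connection.cursor()

# Professors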
create_table_Professor = '''
CREATE TABLE IF NOT EXISTS Professor (
    prof_ssn CHAR(10) PRIMARY KEY,
    Name CHAR(64),
    Age INTEGER,
    Rank INTEGER,
    Speciality CHAR(64)
);
'''
cursor.execute(create_table_Professor)

create_table_Dept= '''
CREATE TABLE IF NOT EXISTS Dept (
    dno INTEGER PRIMARY KEY,
    dName CHAR(64),
    Office CHAR(10)
);
'''
cursor.execute(create_table_Dept)

create_table_Runs = '''
CREATE TABLE IF NOT EXISTS Runs (
    dno INTEGER,
    prof_ssn CHAR(10),
    PRIMARY KEY ( dno, prof_ssn),
    FOREIGN KEY (prof_ssn) REFERENCES Professor(prof_ssn),
    FOREIGN KEY (dno) REFERENCES Dept(dno)
);
'''
cursor.execute(create_table_Runs)

create_table_Work_Dept = '''
CREATE TABLE IF NOT EXISTS Work_Dept (
    dno INTEGER,
    prof_ssn CHAR(10),
    pc_time INTEGER,
    PRIMARY KEY (dno, prof_ssn),
    FOREIGN KEY (prof_ssn) REFERENCES Professor(prof_ssn),
    FOREIGN KEY (dno) REFERENCES Dept(dno)
);
'''
cursor.execute(create_table_Work_Dept)

create_table_Projet = '''
CREATE TABLE IF NOT EXISTS Projet (
    pid INTEGER PRIMARY KEY,
    sponsor CHAR(32),
    start_date DATE,
    end_date DATE,
    budget FLOAT
);
'''
cursor.execute(create_table_Projet)

create_table_Graduate = '''
CREATE TABLE IF NOT EXISTS Graduate (
    grad_ssn CHAR(10) PRIMARY KEY,
    age INTEGER,
    name CHAR(64),
    deg_prog CHAR(32),
    major INTEGER,
    FOREIGN KEY (major) REFERENCES Dept(dno)
);
'''
cursor.execute(create_table_Graduate)

create_table_Advisor = '''
CREATE TABLE IF NOT EXISTS Advisor (
    senior_ssn CHAR(10),
    grad_ssn CHAR(10),
    PRIMARY KEY (senior_ssn, grad_ssn),
    FOREIGN KEY (senior_ssn) REFERENCES Graduate(grad_ssn),
    FOREIGN KEY (grad_ssn) REFERENCES Graduate(grad_ssn)
);
'''
cursor.execute(create_table_Advisor)

create_table_Manage = '''
CREATE TABLE IF NOT EXISTS Manage (
    pid INTEGER,
    prof_ssn CHAR(10),
    PRIMARY KEY (pid, prof_ssn),
    FOREIGN KEY (prof_ssn) REFERENCES Professor(prof_ssn),
    FOREIGN KEY (pid) REFERENCES Projet(pid)
);
'''
cursor.execute(create_table_Manage)

create_table_Work_In = '''
CREATE TABLE IF NOT EXISTS Work_In (
    pid INTEGER,
    prof_ssn CHAR(10),
    PRIMARY KEY (pid, prof_ssn),
    FOREIGN KEY (prof_ssn) REFERENCES Professor(prof_ssn),
    FOREIGN KEY (pid) REFERENCES Projet(pid)
);
'''
cursor.execute(create_table_Work_In)

create_table_Supervise = '''
CREATE TABLE IF NOT EXISTS Supervise (
    prof_ssn CHAR(10),
    grad_ssn CHAR(10),
    pid INTEGER,
    PRIMARY KEY (prof_ssn, grad_ssn, pid),
    FOREIGN KEY (prof_ssn) REFERENCES Professor(prof_ssn),
    FOREIGN KEY (grad_ssn) REFERENCES Graduate(grad_ssn),
    FOREIGN KEY (pid) REFERENCES Projet(pid)
);
'''
cursor.execute(create_table_Supervise)

connection.commit()

In [4]:
#inserting data
insert_query1= '''
INSERT INTO Professor (prof_ssn, Name, Age, Rank, Speciality)
VALUES (?, ?, ?, ?, ?);
'''

# Sample data
professors_data = [
    ('1234567890', 'Dr. Alice Johnson', 45, 1, 'Computer Science'),
    ('0987654321', 'Dr. Bob Smith', 50, 2, 'Mathematics'),
    ('1122334455', 'Dr. Charlie Brown', 38, 3, 'Physics'),
    ('2233445566', 'Dr. Diana Prince', 40, 2, 'Biology'),
    ('3344556677', 'Dr. Ethan Hunt', 42, 1, 'Chemistry'),
    ('4455667788', 'Dr. Fiona Gallagher', 35, 3, 'Psychology'),
    ('5566778899', 'Dr. George Clooney', 48, 2, 'Engineering'),
]

# Execute insertion
cursor.executemany(insert_query1, professors_data)

# Insert query
insert_query2 = '''
INSERT INTO Dept (dno, dName, Office)
VALUES (?, ?, ?);
'''

# Sample data
departments_data = [
    (1, 'Computer Science', 'C101'),
    (2, 'Mathematics', 'M202'),
    (3, 'Physics', 'P303'),
    (4, 'Biology', 'B404'),
    (5, 'Chemistry', 'C505'),
    (6, 'Psychology', 'P606'),
    (7, 'Engineering', 'E707'),
]

# Execute insertion
cursor.executemany(insert_query2, departments_data)

# Insert query
insert_query3 = '''
INSERT INTO Runs (dno, prof_ssn)
VALUES (?, ?);
'''

# Sample data
runs_data = [
    (1, '1234567890'),  # Department 1 (Computer Science), Professor Alice Johnson
    (2, '0987654321'),  # Department 2 (Mathematics), Professor Bob Smith
    (3, '1122334455'),  # Department 3 (Physics), Professor Charlie Brown
    (4, '2233445566'),  # Department 4 (Biology), Professor Diana Prince
    (5, '3344556677'),  # Department 5 (Chemistry), Professor Ethan Hunt
    (6, '4455667788'),  # Department 6 (Psychology), Professor Fiona Gallagher
    (7, '5566778899'),  # Department 7 (Engineering), Professor George Clooney
]

# Execute insertion
cursor.executemany(insert_query3, runs_data)

# Insert query
insert_query4 = '''
INSERT INTO Work_Dept (dno, prof_ssn, pc_time)
VALUES (?, ?, ?);
'''

# Sample data
work_dept_data = [
    (1, '1234567890', 20),  # Professor Alice Johnson spends 20 hours
    (2, '0987654321', 25),  # Professor Bob Smith spends 25 hours
    (3, '1122334455', 15),  # Professor Charlie Brown spends 15 hours
    (4, '2233445566', 30),  # Professor Diana Prince spends 30 hours
    (5, '3344556677', 18),  # Professor Ethan Hunt spends 18 hours
    (6, '4455667788', 22),  # Professor Fiona Gallagher spends 22 hours
    (7, '5566778899', 28),  # Professor George Clooney spends 28 hours
]

# Execute insertion
cursor.executemany(insert_query4, work_dept_data)

# Insert query
insert_query5 = '''
INSERT INTO Projet (pid, sponsor, start_date, end_date, budget)
VALUES (?, ?, ?, ?, ?);
'''

# Sample data
projet_data = [
    (101, 'NASA', '2024-01-01', '2024-12-31', 500000),
    (102, 'NIH', '2023-05-01', '2025-04-30', 300000),
    (103, 'Google', '2024-06-01', '2024-11-30', 200000),
    (104, 'Microsoft', '2023-01-15', '2024-01-15', 450000),
    (105, 'Intel', '2024-03-01', '2025-02-28', 350000),
    (106, 'IBM', '2023-09-01', '2024-08-31', 250000),
    (107, 'Amazon', '2022-01-01', '2024-12-31', 600000),
]

# Execute insertion
cursor.executemany(insert_query5, projet_data)

# Insert query
insert_query6 = '''
INSERT INTO Graduate (grad_ssn, age, name, deg_prog, major)
VALUES (?, ?, ?, ?, ?);
'''

# Sample data
graduate_data = [
    ('1000000001', 24, 'Emily Davis', 'MSc', 1),  # Computer Science
    ('1000000002', 23, 'James Brown', 'PhD', 2),  # Mathematics
    ('1000000003', 22, 'Liam Smith', 'MSc', 3),  # Physics
    ('1000000004', 25, 'Sophia Johnson', 'PhD', 4),  # Biology
    ('1000000005', 26, 'Oliver Garcia', 'MSc', 5),  # Chemistry
    ('1000000006', 27, 'Isabella Martinez', 'PhD', 6),  # Psychology
    ('1000000007', 23, 'Noah Wilson', 'MSc', 7),  # Engineering
]

# Execute insertion
cursor.executemany(insert_query6, graduate_data)

# Insert query
insert_query7 = '''
INSERT INTO Advisor (senior_ssn, grad_ssn)
VALUES (?, ?);
'''

# Sample data
advisor_data = [
    ('1000000002', '1000000001'),  # Senior James Brown advises Emily Davis
    ('1000000004', '1000000003'),  # Senior Sophia Johnson advises Liam Smith
    ('1000000006', '1000000005'),  # Senior Isabella Martinez advises Oliver Garcia
    ('1000000007', '1000000006'),  # Senior Noah Wilson advises Isabella Martinez
    ('1000000001', '1000000007'),  # Senior Emily Davis advises Noah Wilson
    ('1000000003', '1000000002'),  # Senior Liam Smith advises James Brown
    ('1000000005', '1000000004'),  # Senior Oliver Garcia advises Sophia Johnson
]

# Execute insertion
cursor.executemany(insert_query7, advisor_data)

# Insert query
insert_query8 = '''
INSERT INTO Manage (pid, prof_ssn)
VALUES (?, ?);
'''

# Sample data
manage_data = [
    (101, '1234567890'),  # Professor Alice Johnson manages Project 101
    (102, '1234567890'),  # Professor Alice Johnson also manages Project 102
    (103, '1122334455'),  # Professor Charlie Brown manages Project 103
    (104, '2233445566'),  # Professor Diana Prince manages Project 104
    (105, '3344556677'),  # Professor Ethan Hunt manages Project 105
    (106, '4455667788'),  # Professor Fiona Gallagher manages Project 106
    (107, '5566778899'),  # Professor George Clooney manages Project 107
]

# Execute insertion
cursor.executemany(insert_query8, manage_data)

# Insert query
insert_query9 = '''
INSERT INTO Work_In (pid, prof_ssn)
VALUES (?, ?);
'''

# Sample data
work_in_data = [
    (101, '1234567890'),  # Professor Alice Johnson works on Project 101
    (102, '0987654321'),  # Professor Bob Smith works on Project 102
    (103, '1122334455'),  # Professor Charlie Brown works on Project 103
    (104, '2233445566'),  # Professor Diana Prince works on Project 104
    (105, '3344556677'),  # Professor Ethan Hunt works on Project 105
    (106, '4455667788'),  # Professor Fiona Gallagher works on Project 106
    (107, '5566778899'),  # Professor George Clooney works on Project 107
]

# Execute insertion
cursor.executemany(insert_query9, work_in_data)

# Insert query
insert_query10 = '''
INSERT INTO Supervise (prof_ssn, grad_ssn, pid)
VALUES (?, ?, ?);
'''

# Sample data
supervise_data = [
    ('1234567890', '1000000001', 101),  # Professor Alice Johnson supervises Emily Davis on Project 101
    ('0987654321', '1000000002', 102),  # Professor Bob Smith supervises James Brown on Project 102
    ('1122334455', '1000000003', 103),  # Professor Charlie Brown supervises Liam Smith on Project 103
    ('2233445566', '1000000004', 104),  # Professor Diana Prince supervises Sophia Johnson on Project 104
    ('3344556677', '1000000005', 105),  # Professor Ethan Hunt supervises Oliver Garcia on Project 105
    ('4455667788', '1000000006', 106),  # Professor Fiona Gallagher supervises Isabella Martinez on Project 106
    ('5566778899', '1000000007', 107),  # Professor George Clooney supervises Noah Wilson on Project 107
]

# Execute insertion
cursor.executemany(insert_query10, supervise_data)


connection.commit()


In [None]:
#visualize data
query = "SELECT * FROM Professor;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('1234567890', 'Dr. Alice Johnson', 45, 1, 'Computer Science')
('0987654321', 'Dr. Bob Smith', 50, 2, 'Mathematics')
('1122334455', 'Dr. Charlie Brown', 38, 3, 'Physics')
('2233445566', 'Dr. Diana Prince', 40, 2, 'Biology')
('3344556677', 'Dr. Ethan Hunt', 42, 1, 'Chemistry')
('4455667788', 'Dr. Fiona Gallagher', 35, 3, 'Psychology')
('5566778899', 'Dr. George Clooney', 48, 2, 'Engineering')


In [None]:
#all professor and their research specialities
query = "SELECT Name, Speciality FROM Professor;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Dr. Alice Johnson', 'Computer Science')
('Dr. Bob Smith', 'Mathematics')
('Dr. Charlie Brown', 'Physics')
('Dr. Diana Prince', 'Biology')
('Dr. Ethan Hunt', 'Chemistry')
('Dr. Fiona Gallagher', 'Psychology')
('Dr. George Clooney', 'Engineering')


In [None]:
#all projects managed by a specific professor (Dr. Alice Johnson)
query = "SELECT m.pid FROM Manage as m JOIN Professor as p ON m.prof_ssn=p.prof_ssn WHERE p.Name='Dr. Alice Johnson';"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

(101,)
(102,)
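
The professor's name is hard-coded in the query string above. Since the question asks about "a specific professor", it may be more convenient to pass the name as a parameter. A minimal sketch, assuming cut-down versions of the `Professor` and `Manage` tables in an in-memory database (the helper `projects_managed_by` is a hypothetical name, not part of the lab):

```python
import sqlite3


def projects_managed_by(conn, professor_name):
    """Return the project IDs managed by the professor with the given name."""
    sql = """
        SELECT m.pid
        FROM Manage AS m
        JOIN Professor AS p ON m.prof_ssn = p.prof_ssn
        WHERE p.Name = ?
        ORDER BY m.pid
    """
    return [row[0] for row in conn.execute(sql, (professor_name,))]


param_conn = sqlite3.connect(":memory:")
param_conn.execute("CREATE TABLE Professor (prof_ssn TEXT PRIMARY KEY, Name TEXT)")
param_conn.execute(
    "CREATE TABLE Manage (pid INTEGER, prof_ssn TEXT, PRIMARY KEY (pid, prof_ssn))"
)
param_conn.executemany(
    "INSERT INTO Professor VALUES (?, ?)",
    [("1234567890", "Dr. Alice Johnson"), ("1122334455", "Dr. Charlie Brown")],
)
param_conn.executemany(
    "INSERT INTO Manage VALUES (?, ?)",
    [(101, "1234567890"), (102, "1234567890"), (103, "1122334455")],
)

alice_projects = projects_managed_by(param_conn, "Dr. Alice Johnson")
nobody_projects = projects_managed_by(param_conn, "Dr. Nobody")
print(alice_projects, nobody_projects)
param_conn.close()
```

The same pattern applies to the other "specific project" and "specific professor" questions.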


In [None]:
#names of graduate students working on a specific project(ex 101)
query = "SELECT g.name FROM Graduate as g JOIN Supervise as s ON g.grad_ssn=s.grad_ssn WHERE s.pid=101;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Emily Davis',)


In [None]:
#professors supervising graduate students on a specific project
query = "SELECT p.Name FROM Professor as p JOIN Supervise as s ON p.prof_ssn=s.prof_ssn WHERE s.pid=101"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Dr. Alice Johnson',)


In [6]:
#count professor working in each dept
query = "SELECT w.dno, COUNT(w.prof_ssn) FROM Work_Dept as w GROUP BY w.dno;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

(1, 1)
(2, 1)
(3, 1)
(4, 1)
(5, 1)
(6, 1)
(7, 1)
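
Grouping `Work_Dept` alone reports department numbers only, and a department with no professors would not appear at all. One way to show names and keep empty departments is a `LEFT JOIN` from `Dept`. The sketch below uses hypothetical sample rows (three departments, one of them empty) rather than the lab's data:

```python
import sqlite3

dept_conn = sqlite3.connect(":memory:")
dept_conn.execute("CREATE TABLE Dept (dno INTEGER PRIMARY KEY, dName TEXT)")
dept_conn.execute(
    "CREATE TABLE Work_Dept (dno INTEGER, prof_ssn TEXT, pc_time INTEGER, "
    "PRIMARY KEY (dno, prof_ssn))"
)
dept_conn.executemany(
    "INSERT INTO Dept VALUES (?, ?)",
    [(1, "Computer Science"), (2, "Mathematics"), (3, "Physics")],
)
# Hypothetical assignments: two professors in department 1, one in 2, none in 3.
dept_conn.executemany(
    "INSERT INTO Work_Dept VALUES (?, ?, ?)",
    [(1, "1234567890", 20), (1, "0987654321", 10), (2, "0987654321", 25)],
)

# COUNT(w.prof_ssn) ignores the NULLs produced by the LEFT JOIN,
# so a department without professors is reported with 0.
counts = dept_conn.execute(
    """
    SELECT d.dName, COUNT(w.prof_ssn)
    FROM Dept AS d
    LEFT JOIN Work_Dept AS w ON w.dno = d.dno
    GROUP BY d.dno, d.dName
    ORDER BY d.dno
    """
).fetchall()
print(counts)
dept_conn.close()
```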


In [10]:
#List graduate students and their advisors.
query = "SELECT g1.name, g2.name FROM Graduate as g1 JOIN Advisor as a ON g1.grad_ssn=a.grad_ssn JOIN Graduate as g2 ON g2.grad_ssn=a.senior_ssn"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Emily Davis', 'James Brown')
('Liam Smith', 'Sophia Johnson')
('Oliver Garcia', 'Isabella Martinez')
('Isabella Martinez', 'Noah Wilson')
('Noah Wilson', 'Emily Davis')
('James Brown', 'Liam Smith')
('Sophia Johnson', 'Oliver Garcia')


In [11]:
#all projects with a budget greater than $1,000,000
query = "SELECT * FROM Projet as p WHERE p.budget>1000000;"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

In [12]:
#all departments and their chairmen
query = "SELECT d.dName, p.Name FROM Dept as d JOIN Runs as r ON r.dno=d.dno JOIN Professor as p ON p.prof_ssn=r.prof_ssn"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Computer Science', 'Dr. Alice Johnson')
('Mathematics', 'Dr. Bob Smith')
('Physics', 'Dr. Charlie Brown')
('Biology', 'Dr. Diana Prince')
('Chemistry', 'Dr. Ethan Hunt')
('Psychology', 'Dr. Fiona Gallagher')
('Engineering', 'Dr. George Clooney')


In [16]:
#the total number of projects a specific professor is managing (Dr. Alice Johnson)
query = "SELECT COUNT(m.pid) FROM Manage as m JOIN Professor as p ON p.prof_ssn=m.prof_ssn WHERE p.Name='Dr. Alice Johnson';"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

(2,)


In [17]:
# names of graduate students and their major department.
query = "SELECT g.name, d.dName FROM Graduate as g JOIN Dept as d ON g.major=d.dno"
cursor.execute(query)

# Fetch and display results
results = cursor.fetchall()

for row in results:
    print(row)

('Emily Davis', 'Computer Science')
('James Brown', 'Mathematics')
('Liam Smith', 'Physics')
('Sophia Johnson', 'Biology')
('Oliver Garcia', 'Chemistry')
('Isabella Martinez', 'Psychology')
('Noah Wilson', 'Engineering')
