<img src="images/lasalle_logo.png" style="width:375px;height:110px;">
<p style=  "text-align: right; color: blue;"> WIM250 - Fall 2025</p>

# Week 8 - Basic Structured Query Language

### WIM250 - Introduction to Scripting Languages 
### Instructor: Ivaldo Tributino

Sources:
- Python for Everybody Exploring Data Using Python 3 by Dr. Charles R. Severance.
- [Oracle: what is database](https://www.oracle.com/ca-en/database/what-is-database/)

## 1. What is a database

A `database` is an organized collection of structured information, or data, typically stored electronically in a computer system. 

Data within the most common types of databases in operation today is typically modeled in rows and columns in a series of tables to make processing and data querying efficient. The data can then be easily accessed, managed, modified, updated, controlled, and organized. Most databases use `structured query language (SQL)` for writing and querying data.

`SQL` is a programming language used by nearly all relational databases to query, manipulate, and define data, and to provide access control. SQL was first developed at IBM in the 1970s with Oracle as a major contributor, which led to the implementation of the SQL ANSI standard. Since then, SQL has spurred many extensions from companies such as IBM, Oracle, and Microsoft.

There are many different database systems which are used for a wide variety of purposes including: Oracle, MySQL, Microsoft SQL Server, PostgreSQL, and SQLite. We focus on `SQLite` in this course because it is a very common database and is already built into Python. `SQLite` is designed to be embedded into other applications to provide database support within the application.
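
Because `sqlite3` is part of Python's standard library, a quick check like the sketch below should run without installing anything. It uses the special name `:memory:` for a temporary in-RAM database, which is an assumption of this example and is not used elsewhere in the lesson.

```python
import sqlite3

# ':memory:' creates a temporary database that lives only in RAM
conn = sqlite3.connect(':memory:')
cur = conn.cursor()

# Ask the embedded SQLite engine for its version string
cur.execute('SELECT sqlite_version()')
engine_version = cur.fetchone()[0]
print(f"SQLite engine version: {engine_version}")

conn.close()
```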

<img src="images/database.png" style="width:100px;height:100px;">

Download [DB Browser for SQLite](https://sqlitebrowser.org/dl/).

## 2. Creating a database table

Defining structure for your data up front may seem inconvenient at the beginning, but the payoff is fast access to your data even when the database contains a large amount of data.

The following code creates a database file and a table named `Courses` with three columns: `course_id`, `course_name`, and `teacher_id`.


In [None]:
import sqlite3

def initialize_database(db_name: str = 'courses.sqlite') -> None:
    """
    Initializes the SQLite database by creating the Courses table.
    
    Parameters:
    - db_name (str): Name of the SQLite database file.
    """
    conn = None  # Stays None if the connection cannot be opened
    try:
        # Connect to the SQLite database
        conn = sqlite3.connect(db_name)
        cur = conn.cursor()

        # Drop the Courses table if it exists
        cur.execute('DROP TABLE IF EXISTS Courses')

        # Create the Courses table with appropriate columns
        cur.execute('''
            CREATE TABLE Courses (
                course_id TEXT NOT NULL,
                course_name TEXT NOT NULL,
                teacher_id INTEGER
            )
        ''')

        # Commit changes
        conn.commit()
        print(f"Database '{db_name}' initialized successfully.")
    except sqlite3.Error as e:
        print(f"SQLite error: {e}")
    finally:
        # Close the connection only if it was opened
        if conn is not None:
            conn.close()


In [None]:
initialize_database()

You can look at the various data types supported by SQLite at the following url:
http://www.sqlite.org/datatypes.html

As a convention, we will show the `SQL keywords` in uppercase and the parts of the execute command that we are adding (such as the table and column names) will be shown in lowercase.

Now that we have created a table named `Courses`, we can put some data into that table using the `SQL INSERT` operation. Again, we begin by making a connection to the database and obtaining the `cursor`. We can then execute SQL commands using the cursor.
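
As a possible alternative to closing the connection by hand, the connect, execute, commit sequence can be sketched with context managers. This is only an illustration under stated assumptions: it uses a throwaway `:memory:` database and the `conn.execute` shortcut rather than the explicit cursor used in this lesson.

```python
import sqlite3
from contextlib import closing

# closing() guarantees conn.close() at the end of the outer block;
# "with conn" commits on success and rolls back if an exception is raised
with closing(sqlite3.connect(':memory:')) as conn:
    with conn:
        conn.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')
        conn.execute(
            'INSERT INTO Courses VALUES (?, ?, ?)',
            ('GE091', 'Math Transitional', 3363)
        )
    # The insert has been committed at this point
    row_count = conn.execute('SELECT COUNT(*) FROM Courses').fetchone()[0]

print(row_count)
```

Note that `with conn` on its own does not close the connection, which is why `closing()` wraps it here.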

The SQL INSERT command indicates which table we are using and then defines a new row by listing the fields we want to include. Example:

```python
cur.execute('INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)', ('GE091', 'Math Transitional', 3363))
```
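
To see the `?` placeholders at work, here is a minimal self-contained sketch. It assumes a throwaway `:memory:` database so that it does not touch `courses.sqlite`; the table definition mirrors the one created earlier.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # throwaway database for the demo
cur = conn.cursor()
cur.execute('''
    CREATE TABLE Courses (
        course_id TEXT NOT NULL,
        course_name TEXT NOT NULL,
        teacher_id INTEGER
    )
''')

# Each ? is filled, in order, with one value from the tuple
cur.execute(
    'INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)',
    ('GE091', 'Math Transitional', 3363)
)
conn.commit()

cur.execute('SELECT course_id, course_name, teacher_id FROM Courses')
inserted_rows = cur.fetchall()
print(inserted_rows)
conn.close()
```

Passing the values separately, instead of building the SQL string by hand, lets the `sqlite3` module handle quoting for us.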

In [None]:
import sqlite3

def insert_and_display_courses(db_name: str = 'courses.sqlite'):
    """
    Inserts sample course data into the Courses table and displays all records.
    
    Parameters:
    - db_name (str): Name of the SQLite database file.
    """
    conn = None  # Stays None if the connection cannot be opened
    try:
        # Connect to the database
        conn = sqlite3.connect(db_name)
        cur = conn.cursor()

        # Insert sample course records
        courses = [
            ('GE091', 'Math Transitional', 3363),
            ('MTH180', 'Geometry', 3363)
        ]
        cur.executemany('INSERT INTO Courses (course_id, course_name, teacher_id) VALUES (?, ?, ?)', courses)

        # Commit changes
        conn.commit()

        # Retrieve and display all courses
        print('Courses:')
        cur.execute('SELECT course_id, course_name, teacher_id FROM Courses')
        rows = cur.fetchall()
        for row in rows:
            print(f"Course ID: {row[0]}, Name: {row[1]}, Teacher ID: {row[2]}")

    except sqlite3.Error as e:
        print(f"SQLite error: {e}")
    finally:
        # Close the connection only if it was opened
        if conn is not None:
            conn.close()


In [None]:
insert_and_display_courses()

Now let's open `DB Browser for SQLite`, click `Open Database`, and look for `courses.sqlite`. Then go to `Browse Data` and add the row `MTH100, Mathematics, 3363`.

<img src="images/browseData.png" style="width:700px;height:400px;">

Well, we need to know some SQL commands, so let's add two rows using the SQL `INSERT` command in `Execute SQL`:

```SQL
INSERT INTO Courses (course_id, course_name, teacher_id) VALUES ('WIM250','Introduction to Scripting Languages',3363);
INSERT INTO Courses (course_id, course_name, teacher_id) VALUES ('WDIM150','Introduction to Scripting Languages',3363);
```

<img src="images/executeSQL.png" style="width:700px;height:400px;">

Cool, huh? We have now added rows to our database in three different ways.

### Let's see three more commands: `SELECT`, `DELETE`, and `UPDATE`.

```SQL
SELECT * FROM Courses WHERE course_name = 'Introduction to Scripting Languages'
```
Using * indicates that you want the database to return all of the columns for each row that matches the `WHERE clause`.

**Note, unlike in Python, in a SQL WHERE clause we use a single equal sign to indicate a test for equality rather than a double equal sign. Other logical operations allowed in a WHERE clause include <, >, <=, >=, !=, as well as AND and OR and parentheses to build your logical expressions.**
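
The operators listed in the note can be tried out from Python. The sketch below assumes a throwaway `:memory:` database and made-up sample rows; it combines `>=`, `=`, `!=`, `AND`, `OR`, and parentheses in a single WHERE clause.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')
cur.executemany(
    'INSERT INTO Courses VALUES (?, ?, ?)',
    [
        ('GE091', 'Math Transitional', 3363),
        ('MTH180', 'Geometry', 3363),
        ('WIM250', 'Introduction to Scripting Languages', 5491),  # sample teacher_id
    ]
)

# A single = tests equality; parentheses group the OR before the AND applies
cur.execute('''
    SELECT course_id FROM Courses
    WHERE teacher_id >= 3000 AND (course_id = 'GE091' OR teacher_id != 3363)
''')
matching_ids = sorted(row[0] for row in cur.fetchall())
print(matching_ids)
conn.close()
```

`MTH180` is filtered out because it fails both conditions inside the parentheses.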


- You can request that the returned rows be sorted by one of the fields as follows:
```SQL
SELECT course_name, course_id FROM Courses ORDER BY course_name
```
- To remove a row, you need a WHERE clause on an SQL DELETE statement. The WHERE clause determines which rows are to be deleted: 

```SQL
DELETE FROM Courses WHERE course_id = 'WDIM150'
```

- To DELETE rows that contain NULL, use:
```SQL
DELETE FROM Courses WHERE course_id IS NULL
```
- If you have duplicated rows and would like to DELETE them, you can use this statement, which keeps only the first copy (the smallest `rowid`) of each row:

```SQL
DELETE FROM Courses
WHERE rowid NOT IN (SELECT MIN(rowid)
                    FROM Courses
                    GROUP BY course_id, course_name, teacher_id)
```

- It is possible to UPDATE a column or columns within one or more rows in a table using the SQL UPDATE statement as follows:

```SQL
UPDATE Courses SET teacher_id = 5491 WHERE course_name = 'Introduction to Scripting Languages'
```
The UPDATE statement specifies a table and then a list of fields and values to change after the SET keyword and then an optional WHERE clause to select the rows that are to be updated. A single UPDATE statement will change all of the rows that match the WHERE clause. If a WHERE clause is not specified, it performs the UPDATE on all of the rows in the table.
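
The duplicate-removing DELETE and the UPDATE shown above can be checked from Python as follows. This is a sketch on made-up rows in a throwaway `:memory:` database; `cur.rowcount` is used to report how many rows each statement changed.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('CREATE TABLE Courses (course_id TEXT, course_name TEXT, teacher_id INTEGER)')
cur.executemany(
    'INSERT INTO Courses VALUES (?, ?, ?)',
    [
        ('GE091', 'Math Transitional', 3363),
        ('GE091', 'Math Transitional', 3363),  # exact duplicate
        ('WIM250', 'Introduction to Scripting Languages', 3363),
    ]
)

# Keep only the first copy (smallest rowid) of each identical row
cur.execute('''
    DELETE FROM Courses
    WHERE rowid NOT IN (SELECT MIN(rowid)
                        FROM Courses
                        GROUP BY course_id, course_name, teacher_id)
''')
deleted_count = cur.rowcount

# Change the teacher only for rows matching the WHERE clause
cur.execute(
    'UPDATE Courses SET teacher_id = ? WHERE course_name = ?',
    (5491, 'Introduction to Scripting Languages')
)
updated_count = cur.rowcount
conn.commit()

cur.execute('SELECT course_id, teacher_id FROM Courses ORDER BY course_id')
remaining = cur.fetchall()
print(deleted_count, updated_count, remaining)
conn.close()
```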

## 3. Data storage

Let's take some information from the JSON file `daily_14.json` and build a database with it. We are interested in each city's name, its coordinates, and the highest of its daily maximum temperatures, in Celsius, over the 16 days of the forecast.
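
The file is assumed to be line-delimited JSON, meaning each line holds one complete JSON object. The sketch below illustrates that format on a made-up two-city sample held in memory; the values are invented and only the key names follow the structure used in this lesson.

```python
import io
import json

# Made-up sample: one JSON object per line, with a blank line in between
sample_text = (
    '{"city": {"name": "A", "coord": {"lon": 1.0, "lat": 2.0}}, "data": [{"temp": {"max": 293.15}}]}\n'
    '\n'
    '{"city": {"name": "B", "coord": {"lon": 3.0, "lat": 4.0}}, "data": [{"temp": {"max": 300.15}}]}\n'
)

records = []
for line in io.StringIO(sample_text):
    line = line.strip()
    if line:  # Skip empty lines
        records.append(json.loads(line))

city_names = [r['city']['name'] for r in records]
print(city_names)
```

Calling `json.load` on the whole file would fail for this format, because the file as a whole is not a single JSON document.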

In [None]:
import json

def load_daily_data(filepath: str = 'daily_14.json') -> list:
    """
    Loads line-delimited JSON objects from a file.

    Parameters:
    - filepath (str): Path to the JSON file.

    Returns:
    - list: A list of parsed JSON objects.
    """
    daily_data = []
    try:
        with open(filepath, 'r', encoding='utf-8') as file:
            for line in file:
                line = line.strip()
                if line:  # Skip empty lines
                    daily_data.append(json.loads(line))
    except (json.JSONDecodeError, FileNotFoundError) as e:
        print(f"Error loading JSON data: {e}")
    return daily_data


In [None]:
daily_data = load_daily_data()
type(daily_data)

In [None]:
print('There are %d distinct cities' %len(daily_data))
print('You can search %d day weather forecast with daily average parameters by city name' %len(daily_data[0]['data']))

Let's see what kind of information is offered for each city. For more information see [forecast16](https://openweathermap.org/forecast16). 

In [None]:
print(json.dumps(daily_data[0], indent = 4, sort_keys=True))

Let's create a class called `daily` with the following attributes:

- city_name
- longitude
- latitude
- max_temp

In [None]:
class daily:
    
    def __init__(self, city_name, longitude, latitude, max_temp):
        
        self.city_name = city_name
        self.longitude = longitude
        self.latitude = latitude
        self.max_temp = max_temp
        
    def __str__(self):
        return f'City: {self.city_name}'    

Now we are going to fill a dictionary with objects of the newly created class. The keys will be strings of the form:

```python
'city0' ... 'city22635'
```
and the values will be objects of the `daily` class.
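
Here is a small sketch of that idea on made-up data. It assumes the temperatures are stored in Kelvin (as the subtraction of 273.15 in this lesson suggests) and uses a hypothetical `CityRecord` namedtuple as a stand-in for the class so the block runs on its own.

```python
from collections import namedtuple

# Hypothetical stand-in for the lesson's class, so the sketch is self-contained
CityRecord = namedtuple('CityRecord', ['city_name', 'longitude', 'latitude', 'max_temp'])

# Made-up sample in the same shape as the parsed JSON objects
sample_data = [
    {'city': {'name': 'A', 'coord': {'lon': 1.0, 'lat': 2.0}},
     'data': [{'temp': {'max': 290.15}}, {'temp': {'max': 295.65}}]},
    {'city': {'name': 'B', 'coord': {'lon': 3.0, 'lat': 4.0}},
     'data': [{'temp': {'max': 280.15}}, {'temp': {'max': 278.15}}]},
]

cities = {}
for idx, d in enumerate(sample_data):
    # Highest daily maximum over all forecast days, still in Kelvin
    hottest_kelvin = max(day['temp']['max'] for day in d['data'])
    hottest_celsius = round(hottest_kelvin - 273.15, 2)  # Kelvin -> Celsius
    cities[f'city{idx}'] = CityRecord(
        d['city']['name'], d['city']['coord']['lon'], d['city']['coord']['lat'], hottest_celsius
    )

print(cities['city0'])
```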

In [None]:
# daily_data[0]['city']['coord']['lat']
daily_data[3]['city']['name']

In [None]:
[daily_data[i]['city']['name'] for i in range(len(daily_data))][:10]

In [None]:
dic = {}
for idx, d in enumerate(daily_data):
    
    name = d["city"]['name']
    lon = d["city"]['coord']['lon']
    lat = d["city"]['coord']['lat']
    
    maxm = max([x['temp']['max'] for x in d["data"]])
    maxm = round(maxm - 273.15,2)
    
    dic['city' + str(idx)] = daily(name, lon, lat ,maxm)

In [None]:
print(dic['city22'])
print(dic['city135'].city_name)
print(dic['city134'].longitude)
print(dic['city134'].latitude)
print(dic['city899'].max_temp)

Finally, we will create a table that is also called `Daily`; we are not very creative.

In [None]:
import sqlite3

def insert_daily_data(dic: dict, db_name: str = 'daily.sqlite') -> None:
    """
    Inserts daily weather data into the Daily table in the SQLite database.

    Parameters:
    - dic (dict): Dictionary of city objects with attributes: city_name, longitude, latitude, max_temp.
    - db_name (str): Name of the SQLite database file.
    """
    conn = None  # Stays None if the connection cannot be opened
    try:
        conn = sqlite3.connect(db_name)
        cur = conn.cursor()

        # Recreate the Daily table
        cur.execute('DROP TABLE IF EXISTS Daily')
        cur.execute('''
            CREATE TABLE Daily (
                city_name TEXT NOT NULL,
                longitude REAL,
                latitude REAL,
                max_tem REAL
            )
        ''')

        # Insert one row per city object
        for city in dic.values():
            cur.execute('''
                INSERT INTO Daily (city_name, longitude, latitude, max_tem)
                VALUES (?, ?, ?, ?)
            ''', (city.city_name, city.longitude, city.latitude, city.max_temp))

        conn.commit()
        print(f"Inserted {len(dic)} records into 'Daily' table.")

    except sqlite3.Error as e:
        print(f"SQLite error: {e}")
    finally:
        # Close the connection only if it was opened
        if conn is not None:
            conn.close()


In [None]:
insert_daily_data(dic)

<img src="images/dailyTable.png" style="width:700px;height:400px;">

In [None]:
# SELECT city_name, max_tem FROM Daily WHERE max_tem > 20
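
The query in the comment above can also be run from Python rather than from `Execute SQL`. The sketch below assumes a throwaway `:memory:` database with three invented cities, so it does not depend on `daily.sqlite`; the threshold is passed as a parameter and the `ORDER BY` is an addition for readability.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
cur = conn.cursor()
cur.execute('''
    CREATE TABLE Daily (
        city_name TEXT NOT NULL,
        longitude REAL,
        latitude REAL,
        max_tem REAL
    )
''')
cur.executemany(
    'INSERT INTO Daily (city_name, longitude, latitude, max_tem) VALUES (?, ?, ?, ?)',
    [
        ('Coldtown', 10.0, 60.0, 12.5),   # invented sample rows
        ('Warmville', 20.0, 30.0, 27.3),
        ('Hotburg', 30.0, 10.0, 35.1),
    ]
)

# Cities whose maximum temperature exceeds the threshold, hottest first
threshold = 20
cur.execute(
    'SELECT city_name, max_tem FROM Daily WHERE max_tem > ? ORDER BY max_tem DESC',
    (threshold,)
)
warm_cities = cur.fetchall()
for city_name, max_tem in warm_cities:
    print(f"{city_name}: {max_tem}")
conn.close()
```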

In [None]:
import os
os.getcwd()

In [None]:
import sqlite3
from openweather import air_pollution


def insert_air_pollution_data(daily_data: list, db_name: str = 'air_pollution.sqlite') -> None:
    """
    Inserts air pollution data for cities into an SQLite database.

    Parameters:
    - daily_data (list): List of dictionaries containing city metadata.
    - db_name (str): Name of the SQLite database file.
    """
    # API key passed to air_pollution (left blank here)
    APIID = ''
    conn = None  # Stays None if the connection cannot be opened
    try:
        conn = sqlite3.connect(db_name)
        cur = conn.cursor()

        # Create table
        cur.execute('DROP TABLE IF EXISTS Air_Pollution')
        cur.execute('''
            CREATE TABLE Air_Pollution (
                city_name TEXT,
                CO REAL,
                NO REAL,
                NO2 REAL,
                O3 REAL,
                SO2 REAL,
                PM2_5 REAL,
                PM10 REAL,
                NH3 REAL
            )
        ''')

        # Insert data
        for i in daily_data:
            name = i['city']['name']
            coord = i['city']['coord']

            try:
                # Get pollution data for the city's coordinates
                response = air_pollution(APIID, coord)

                # Take the pollutant concentrations from the first entry
                comp = response['list'][0]['components']
                cur.execute('''
                    INSERT INTO Air_Pollution 
                    (city_name, CO, NO, NO2, O3, SO2, PM2_5, PM10, NH3)
                    VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)
                ''', (
                    name,
                    comp.get('co'),
                    comp.get('no'),
                    comp.get('no2'),
                    comp.get('o3'),
                    comp.get('so2'),
                    comp.get('pm2_5'),
                    comp.get('pm10'),
                    comp.get('nh3')
                    ))

            except Exception as e:
                # Skip this city but keep processing the others
                print(f"Error processing {name}: {e}")

        conn.commit()
        print("Air pollution data inserted successfully.")

    except sqlite3.Error as e:
        print(f"SQLite error: {e}")
    finally:
        # Close the connection only if it was opened
        if conn is not None:
            conn.close()


In [None]:
insert_air_pollution_data(daily_data[:60])