# FRE 521D: Data Analytics in Climate, Food and Environment

## Lecture 1: Introduction to Databases and SQL

**Date:** January 5, 2026  
**Instructor:** Asif Ahmed Neloy  
**Term:** Winter 2026

---

## 1. Learning Objectives

By the end of this lecture, you will be able to:

1. Set up a local development environment with Docker and MySQL
2. Understand the difference between SQL and NoSQL databases
3. Identify when to use different types of databases
4. Create databases and tables in MySQL
5. Write basic SQL queries (SELECT, INSERT, UPDATE, DELETE)
6. Understand SQL data types and their appropriate use cases

---

## 2. Environment Setup

Before we dive into databases, we need to set up our development environment. This section provides step-by-step instructions for both Windows and Mac users.

### 2.1 Docker Installation

Docker allows us to run databases in isolated containers without installing them directly on our machines. This makes it easy to set up, tear down, and share consistent development environments.

#### For Windows Users

1. **Enable WSL 2 (Windows Subsystem for Linux)**
   - Open PowerShell as Administrator
   - Run: `wsl --install`
   - Restart your computer when prompted

2. **Download Docker Desktop**
   - Go to: https://www.docker.com/products/docker-desktop/
   - Click "Download for Windows"
   - Run the installer (Docker Desktop Installer.exe)

3. **Install Docker Desktop**
   - Follow the installation wizard
   - Ensure "Use WSL 2 instead of Hyper-V" is checked
   - Click "OK" to proceed

4. **Start Docker Desktop**
   - Launch Docker Desktop from the Start menu
   - Wait for the whale icon in the system tray to stop animating
   - Accept the terms of service

5. **Verify Installation**
   - Open Command Prompt or PowerShell
   - Run: `docker --version`
   - You should see something like: `Docker version 24.x.x`

#### For Mac Users

1. **Download Docker Desktop**
   - Go to: https://www.docker.com/products/docker-desktop/
   - Click "Download for Mac"
   - Choose the correct version:
     - **Apple Silicon** (M1, M2, M3 chips): Download "Apple Chip"
     - **Intel**: Download "Intel Chip"
   - To check your chip: Click Apple menu, select "About This Mac"

2. **Install Docker Desktop**
   - Open the downloaded .dmg file
   - Drag Docker to the Applications folder
   - Open Docker from Applications

3. **Grant Permissions**
   - Click "Open" when prompted about the application from the internet
   - Enter your password to allow privileged access
   - Accept the terms of service

4. **Start Docker Desktop**
   - Docker will start automatically
   - Look for the whale icon in the menu bar
   - Wait until it shows "Docker Desktop is running"

5. **Verify Installation**
   - Open Terminal
   - Run: `docker --version`
   - You should see something like: `Docker version 24.x.x`

### 2.2 VS Code Extensions

Install the following extensions in VS Code to enhance your development experience:

1. **Python** (Microsoft)
   - Extension ID: `ms-python.python`
   - Provides Python language support, debugging, and IntelliSense

2. **Jupyter** (Microsoft)
   - Extension ID: `ms-toolsai.jupyter`
   - Enables running Jupyter notebooks directly in VS Code

3. **SQLTools** (Matheus Teixeira)
   - Extension ID: `mtxr.sqltools`
   - Database management and query execution

4. **SQLTools MySQL/MariaDB** (Matheus Teixeira)
   - Extension ID: `mtxr.sqltools-driver-mysql`
   - MySQL driver for SQLTools

5. **Docker** (Microsoft)
   - Extension ID: `ms-azuretools.vscode-docker`
   - Manage Docker containers, images, and compose files

**Installation Steps:**
1. Open VS Code
2. Click the Extensions icon in the sidebar (or press `Ctrl+Shift+X` / `Cmd+Shift+X`)
3. Search for each extension by name
4. Click "Install"

### 2.3 Starting Docker Desktop

Before running any containers, ensure Docker Desktop is running:

#### Windows
- Search for "Docker Desktop" in the Start menu and click to open
- Wait for the whale icon in the system tray to become steady (not animated)
- The status should show "Docker Desktop is running"

#### Mac
- Open Docker from Applications (or Spotlight search)
- Wait for the whale icon in the menu bar to become steady
- Click the icon to verify status shows "Docker Desktop is running"

**Troubleshooting:**
- If Docker fails to start on Windows, ensure virtualization is enabled in BIOS
- If Docker fails to start on Mac, ensure you have sufficient disk space (at least 10GB free)
- Restart Docker Desktop if it becomes unresponsive

### 2.4 Creating the Docker Compose File

Docker Compose allows us to define and run multi-container applications. We will use it to set up our MySQL database.

Create a file named `docker-compose.yaml` in your project directory with the following content:

```yaml
services:
  mysql:
    image: mysql:8.0
    container_name: mfre521d-mysql
    restart: unless-stopped
    environment:
      MYSQL_ROOT_PASSWORD: mfre521d_root_pw
      MYSQL_DATABASE: mfre521d
      MYSQL_USER: mfre521d_user
      MYSQL_PASSWORD: mfre521d_user_pw
    ports:
      - "3306:3306"
    volumes:
      - mysql_data:/var/lib/mysql

volumes:
  mysql_data:
```

**Configuration Explained:**

| Parameter | Value | Description |
|-----------|-------|-------------|
| `image` | mysql:8.0 | Uses MySQL version 8.0 official image |
| `container_name` | mfre521d-mysql | Name for easy reference |
| `restart` | unless-stopped | Auto-restart unless manually stopped |
| `MYSQL_ROOT_PASSWORD` | mfre521d_root_pw | Root user password |
| `MYSQL_DATABASE` | mfre521d | Default database created on startup |
| `MYSQL_USER` | mfre521d_user | Non-root user for daily operations |
| `MYSQL_PASSWORD` | mfre521d_user_pw | Password for the non-root user |
| `ports` | 3306:3306 | Maps host port 3306 (left) to container port 3306 (right) |
| `volumes` | mysql_data | Persists data between container restarts |
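
To make the `HOST:CONTAINER` order of the `ports` entry concrete, here is a small illustrative sketch in Python. The helper name `parse_port_mapping` is hypothetical (it is not part of Docker or Compose), and it only handles the simple two-part form used in this file.

```python
def parse_port_mapping(mapping: str) -> dict:
    """Split a Compose short-form port entry "HOST:CONTAINER" into its parts.

    Hypothetical helper for illustration; it ignores the optional IP and
    protocol parts that Compose also accepts.
    """
    host_port, container_port = mapping.split(":")
    return {"host": int(host_port), "container": int(container_port)}


# The mapping from docker-compose.yaml: both sides happen to be 3306.
course_mapping = parse_port_mapping("3306:3306")

# If 3306 is already taken on your machine, you could publish the container's
# port 3306 on another host port, for example 3307, and connect to localhost:3307.
alternate_mapping = parse_port_mapping("3307:3306")
print(course_mapping, alternate_mapping)
```

If you change the host side of the mapping, remember to use the same port in the connection string later in this lecture.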

### 2.5 Running the MySQL Container

Open a terminal in the directory containing your `docker-compose.yaml` file and run the following commands.

#### Starting the Container

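The command is the same on Windows (Command Prompt or PowerShell) and Mac (Terminal):
```bash
docker-compose up -d
```

The `-d` flag runs the container in detached mode (in the background). Recent Docker Desktop releases ship Compose V2, where the same command is written `docker compose up -d` (with a space); use that form if `docker-compose` is not found.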

#### Verifying the Container is Running

```bash
docker ps
```

You should see output similar to:
```
CONTAINER ID   IMAGE       COMMAND                  STATUS         PORTS                    NAMES
abc123def456   mysql:8.0   "docker-entrypoint.s..." Up 2 minutes   0.0.0.0:3306->3306/tcp   mfre521d-mysql
```

#### Stopping the Container

When you are done working, stop and remove the container (your data stays in the `mysql_data` volume):
```bash
docker-compose down
```

To stop the container and also delete the volume, which erases all data (fresh start):
```bash
docker-compose down -v
```

### 2.6 Connecting to MySQL from Jupyter Notebook

We will use Python libraries to connect to our MySQL database. First, let's install the required packages.

In [1]:
# Install required packages (run this cell once)
!pip install -q sqlalchemy pymysql ipython-sql cryptography pandas

In [2]:
# Import required libraries
import pymysql
from sqlalchemy import create_engine

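# Load the SQL magic extension
%load_ext sql

# Workaround for a table-style error seen with some ipython-sql / prettytable version combinations
%config SqlMagic.style = '_DEPRECATED_DEFAULT'
# Return plain result sets instead of pandas DataFrames
%config SqlMagic.autopandas = False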

In [3]:
# Connect with a literal connection URL (the extension was already loaded in the previous cell)
%sql mysql+pymysql://mfre521d_user:mfre521d_user_pw@127.0.0.1:3306/mfre521d


In [4]:
# Database connection parameters
DB_USER = "mfre521d_user"
DB_PASSWORD = "mfre521d_user_pw"
DB_HOST = "localhost"
DB_PORT = "3306"
DB_NAME = "mfre521d"

# Create connection string
connection_string = f"mysql+pymysql://{DB_USER}:{DB_PASSWORD}@{DB_HOST}:{DB_PORT}/{DB_NAME}"

# Connect using SQL magic
%sql {connection_string}

In [5]:
%%sql

SELECT 'Connection successful!' AS status, NOW() AS current_ts;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
1 rows affected.


status,current_ts
Connection successful!,2026-01-12 20:55:39
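
The connection string in cell `In [4]` follows the pattern `dialect+driver://user:password@host:port/database`. The sketch below rebuilds it with only the standard library and percent-encodes the user name and password, which matters if a password contains characters such as `@` or `/`. The function name `build_mysql_url` and the password `p@ss/word` are made up for illustration.

```python
from urllib.parse import quote_plus


def build_mysql_url(user: str, password: str, host: str, port: int, database: str) -> str:
    """Assemble a SQLAlchemy-style URL for the PyMySQL driver.

    Hypothetical helper: user and password are percent-encoded so that
    reserved characters do not break the URL.
    """
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Same values as the course setup; no special characters, so nothing changes.
course_url = build_mysql_url("mfre521d_user", "mfre521d_user_pw", "localhost", 3306, "mfre521d")

# A made-up password with reserved characters gets encoded.
tricky_url = build_mysql_url("mfre521d_user", "p@ss/word", "localhost", 3306, "mfre521d")
print(course_url)
print(tricky_url)
```

The first URL is identical to the one used with `%sql` above; the second shows why hand-built strings can fail once a password contains reserved characters.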


---

## 3. Introduction to Databases

### What is a Database?

A **database** is an organized collection of structured data stored electronically. Databases are managed by **Database Management Systems (DBMS)**, which provide tools for storing, retrieving, and manipulating data efficiently.

### Why Do We Need Databases?

Consider managing data about climate measurements, crop yields, or food supply chains. Without databases, you might store this data in:
- Spreadsheets (Excel, Google Sheets)
- Text files (CSV, JSON)
- In-memory data structures

**Problems with these approaches:**

| Challenge | Without Database | With Database |
|-----------|-----------------|---------------|
| Data volume | Slow with large files | Handles millions of records efficiently |
| Concurrent access | File conflicts | Multiple users simultaneously |
| Data integrity | Manual validation | Enforced constraints |
| Querying | Load entire file | Query only what you need |
| Relationships | Duplicate data | Linked tables, less redundancy |
| Security | File-level only | Fine-grained permissions (database, table, column; row-level in some systems) |
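
The "Querying" row can be illustrated with a short sketch. It uses Python's built-in `sqlite3` module with an in-memory database as a stand-in for MySQL (an assumption made so the example runs without the Docker container); the table name `readings` and the sample values are invented.

```python
import sqlite3

# Invented sample data: (station, temperature in Celsius).
sample_rows = [("Vancouver", 5.2), ("Toronto", -8.5), ("Calgary", -15.0), ("Montreal", -12.0)]

# File-style approach: hold every row in memory, then filter in Python.
below_zero_in_python = [row for row in sample_rows if row[1] < 0]

# Database approach: let the engine filter and return only the matching rows.
conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute("CREATE TABLE readings (station TEXT, temperature_c REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", sample_rows)
below_zero_in_sql = conn.execute(
    "SELECT station, temperature_c FROM readings WHERE temperature_c < 0"
).fetchall()
conn.close()

print(below_zero_in_sql)
```

With four rows the two approaches are equivalent; the difference shows up when the table holds millions of rows and only a few match.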

---

## 4. Types of Databases

Databases can be broadly categorized into two main types: **Relational (SQL)** and **Non-Relational (NoSQL)**.

### 4.1 Relational Databases (SQL)

Relational databases store data in **tables** with predefined **schemas**. Tables are related to each other through **keys**.

**Key Characteristics:**
- Data stored in rows and columns (like spreadsheets)
- Fixed schema defined before inserting data
- Relationships between tables using foreign keys
- ACID compliance (Atomicity, Consistency, Isolation, Durability)
- Uses SQL (Structured Query Language) for data manipulation

**Example Structure:**

```
Table: weather_stations
+----+---------------+----------+-----------+
| id | station_name  | latitude | longitude |
+----+---------------+----------+-----------+
| 1  | Vancouver BC  | 49.2827  | -123.1207 |
| 2  | Toronto ON    | 43.6532  | -79.3832  |
+----+---------------+----------+-----------+

Table: temperature_readings
+----+------------+------------+-------------+
| id | station_id | date       | temperature |
+----+------------+------------+-------------+
| 1  | 1          | 2025-01-01 | 5.2         |
| 2  | 1          | 2025-01-02 | 4.8         |
+----+------------+------------+-------------+
```
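
As a rough sketch of how keys link these two tables, the example below recreates them in an in-memory SQLite database (a stand-in for MySQL, chosen so the code runs anywhere) and tries to store a reading for a station that does not exist. Note one assumption specific to the stand-in: SQLite only checks foreign keys after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only checks foreign keys when asked to
conn.execute(
    "CREATE TABLE weather_stations (id INTEGER PRIMARY KEY, station_name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE temperature_readings (
           id INTEGER PRIMARY KEY,
           station_id INTEGER NOT NULL REFERENCES weather_stations(id),
           date TEXT,
           temperature REAL
       )"""
)
conn.execute("INSERT INTO weather_stations VALUES (1, 'Vancouver BC')")
conn.execute("INSERT INTO temperature_readings VALUES (1, 1, '2025-01-01', 5.2)")

# Station 99 does not exist, so the database refuses the row.
try:
    conn.execute("INSERT INTO temperature_readings VALUES (2, 99, '2025-01-02', 4.8)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

reading_count = conn.execute("SELECT COUNT(*) FROM temperature_readings").fetchone()[0]
conn.close()
print(orphan_rejected, reading_count)
```

The rejected insert is the "enforced constraints" idea in practice: the database, not your code, guarantees that every reading belongs to a real station.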

**Popular Relational Databases:**
- MySQL (open-source, widely used)
- PostgreSQL (open-source, feature-rich)
- Microsoft SQL Server (enterprise)
- Oracle Database (enterprise)
- SQLite (embedded, file-based)

### 4.2 NoSQL Databases

NoSQL ("Not Only SQL") databases provide flexible schemas and are designed for specific data models.

**Types of NoSQL Databases:**

#### Document Databases
Store data as JSON-like documents. Each document can have different fields.
- **Examples:** MongoDB, CouchDB
- **Use case:** Content management, user profiles, catalogs

```json
{
  "station_name": "Vancouver BC",
  "location": {"lat": 49.2827, "lng": -123.1207},
  "sensors": ["temperature", "humidity", "pressure"]
}
```
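
To show what "each document can have different fields" means for code that reads the data, here is a minimal sketch using only Python's `json` module rather than a real document database. The second document and its `maintainer` field are invented.

```python
import json

# Two station documents; the second has a field the first lacks and no "sensors" list.
documents = [
    {
        "station_name": "Vancouver BC",
        "location": {"lat": 49.2827, "lng": -123.1207},
        "sensors": ["temperature", "humidity", "pressure"],
    },
    {
        "station_name": "Toronto ON",
        "location": {"lat": 43.6532, "lng": -79.3832},
        "maintainer": "example-team",  # invented extra field
    },
]

# Round-trip through JSON text, the form in which documents are usually exchanged.
restored = json.loads(json.dumps(documents))

# Code that reads documents must tolerate missing fields.
sensor_counts = {doc["station_name"]: len(doc.get("sensors", [])) for doc in restored}
print(sensor_counts)
```

The flexibility comes at a price: the check for missing fields moves from the database schema into every piece of code that reads the documents.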

#### Key-Value Stores
Simple key-value pairs, extremely fast for lookups.
- **Examples:** Redis, Amazon DynamoDB
- **Use case:** Caching, session storage, real-time data

```
"station:1:temp" -> "5.2"
"station:1:humidity" -> "78"
```
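
The access pattern can be sketched with a plain Python dictionary standing in for a key-value store. This is only an analogy: a real store such as Redis adds networking, persistence, and expiry. The helper `make_key` is a hypothetical name that follows the key scheme shown above.

```python
# A plain dict stands in for the key-value store (illustration only).
store = {}


def make_key(station_id: int, metric: str) -> str:
    """Build a key like "station:1:temp", following the naming scheme above."""
    return f"station:{station_id}:{metric}"


store[make_key(1, "temp")] = "5.2"
store[make_key(1, "humidity")] = "78"

# Lookup by exact key is the operation these stores are built around.
latest_temp = store.get(make_key(1, "temp"))
missing = store.get(make_key(2, "temp"))  # unknown key: nothing stored
print(latest_temp, missing)
```

Notice that there is no way to ask "which stations are below zero?" without scanning every key, which is where a relational database is the better fit.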

#### Column-Family Stores
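Group columns into families and store each row under a row key, so different rows can hold different sets of columns. Optimized for high write throughput and queries over large, distributed datasets.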
- **Examples:** Apache Cassandra, HBase
- **Use case:** Time-series data, IoT sensor data, analytics

#### Graph Databases
Store data as nodes and edges, optimized for relationship queries.
- **Examples:** Neo4j, Amazon Neptune
- **Use case:** Social networks, supply chains, fraud detection
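
A relationship query of the kind graph databases are built for can be sketched in plain Python. The supply chain below is invented, and the function `downstream` is a hypothetical helper that performs a breadth-first traversal; a graph database would express the same question in its own query language.

```python
from collections import deque

# Invented supply chain: each node maps to the nodes it ships to (directed edges).
ships_to = {
    "farm": ["processor"],
    "processor": ["distributor"],
    "distributor": ["grocery_store", "restaurant"],
    "grocery_store": [],
    "restaurant": [],
}


def downstream(graph: dict, start: str) -> list:
    """Return every node reachable from start, in breadth-first order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                queue.append(neighbour)
    return order


# Which parts of the chain are affected by a problem at the processor?
affected = downstream(ships_to, "processor")
print(affected)
```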

### 4.3 SQL vs NoSQL: When to Use Each

| Criteria | SQL (Relational) | NoSQL |
|----------|------------------|-------|
| **Data Structure** | Structured, predictable | Flexible, evolving |
| **Schema** | Fixed, defined upfront | Dynamic, can change |
| **Relationships** | Complex relationships | Simple or embedded |
| **Transactions** | ACID compliant | Eventually consistent (usually) |
| **Scaling** | Vertical (bigger server) | Horizontal (more servers) |
| **Query Language** | Standardized SQL | Varies by database |
| **Best For** | Financial, inventory, ERP | Real-time, big data, IoT |

**When to Choose SQL:**
- Data has clear relationships (e.g., orders, customers, products)
- Need strong consistency and transaction support
- Complex queries with joins are required
- Data structure is well-defined and stable

**When to Choose NoSQL:**
- Handling large volumes of unstructured data
- Need horizontal scaling across many servers
- Data structure changes frequently
- High-speed read/write operations needed

### 4.4 Database Landscape Overview

| Database | Type | License | Best For |
|----------|------|---------|----------|
| MySQL | Relational | Open Source | Web applications, general purpose |
| PostgreSQL | Relational | Open Source | Complex queries, GIS, analytics |
| SQLite | Relational | Public Domain | Embedded, mobile, testing |
| SQL Server | Relational | Commercial | Enterprise Windows environments |
| Oracle | Relational | Commercial | Large enterprise, mission-critical |
| MongoDB | Document | Source-available (SSPL) | Content management, catalogs |
| Redis | Key-Value | Open Source or source-available (depends on version) | Caching, real-time analytics |
| Cassandra | Column-Family | Open Source | Time-series, IoT, high availability |
| Neo4j | Graph | Open Source (Community) / Commercial (Enterprise) | Social networks, recommendations |
| BigQuery | Cloud DW | Commercial | Big data analytics, ML |

**In this course, we will primarily use MySQL and Google BigQuery.**

---

## 5. SQL Fundamentals

### 5.1 What is SQL?

**SQL (Structured Query Language)** is the standard language for interacting with relational databases. It allows you to:

- **Define** data structures (DDL - Data Definition Language)
- **Manipulate** data (DML - Data Manipulation Language)
- **Query** data (DQL - Data Query Language)
- **Control** access (DCL - Data Control Language)

| Category | Commands | Purpose |
|----------|----------|----------|
| DDL | CREATE, ALTER, DROP | Define database structure |
| DML | INSERT, UPDATE, DELETE | Modify data |
| DQL | SELECT | Retrieve data |
| DCL | GRANT, REVOKE | Manage permissions |
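
The categories are easiest to tell apart when you see them in sequence. The sketch below runs one statement of each kind against an in-memory SQLite database (a stand-in for MySQL so the example is self-contained); the `crops` table and its values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL

# DDL: define structure.
conn.execute(
    "CREATE TABLE crops (crop_id INTEGER PRIMARY KEY, name TEXT NOT NULL, yield_t_ha REAL)"
)

# DML: change data.
conn.execute("INSERT INTO crops (name, yield_t_ha) VALUES ('wheat', 3.4)")
conn.execute("INSERT INTO crops (name, yield_t_ha) VALUES ('canola', 2.1)")
conn.execute("UPDATE crops SET yield_t_ha = 3.5 WHERE name = 'wheat'")
conn.execute("DELETE FROM crops WHERE name = 'canola'")

# DQL: read data.
remaining = conn.execute("SELECT name, yield_t_ha FROM crops").fetchall()

# DCL (GRANT, REVOKE) has no SQLite equivalent because SQLite has no user accounts;
# in MySQL those statements are run by an administrative user.
conn.close()
print(remaining)
```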

### 5.2 SQL Data Types in MySQL

Choosing the correct data type is important for data integrity and storage efficiency.

#### Numeric Types

| Type | Storage | Range | Use Case |
|------|---------|-------|----------|
| TINYINT | 1 byte | -128 to 127 | Small integers, flags |
| SMALLINT | 2 bytes | -32,768 to 32,767 | Small counts |
| INT | 4 bytes | -2.1B to 2.1B | General integers |
| BIGINT | 8 bytes | About -9.2 × 10^18 to 9.2 × 10^18 | IDs, large counts |
| DECIMAL(p,s) | Varies | Exact, up to 65 digits | Money, measurements |
| FLOAT | 4 bytes | Approximate, about 7 significant digits | Scientific data |
| DOUBLE | 8 bytes | Approximate, about 15 significant digits | High precision scientific |
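
The difference between exact (`DECIMAL`) and approximate (`FLOAT`, `DOUBLE`) types can be demonstrated in Python, whose `float` is a binary floating-point number and whose `decimal.Decimal` behaves like an exact decimal. The helper `fits_decimal` is a hypothetical function written for this example; it is a simplified check and not MySQL's own validation logic.

```python
from decimal import Decimal

# Binary floating point (like FLOAT/DOUBLE) cannot represent 0.1 exactly.
float_sum = 0.1 + 0.2
float_is_exact = (float_sum == 0.3)

# Decimal arithmetic (like DECIMAL) keeps the digits you wrote.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_is_exact = (decimal_sum == Decimal("0.3"))


def fits_decimal(value: str, precision: int, scale: int) -> bool:
    """Check whether a value fits DECIMAL(precision, scale) without rounding.

    Hypothetical helper: precision is the total number of digits and scale
    is the number of digits after the decimal point.
    """
    limit = Decimal(10) ** (precision - scale)  # e.g. 1000 for DECIMAL(5, 2)
    number = Decimal(value)
    has_allowed_scale = number == round(number, scale)
    return abs(number) < limit and has_allowed_scale


print(float_is_exact, decimal_is_exact)
print(fits_decimal("999.99", 5, 2), fits_decimal("1015.20", 5, 2))
```

A typical air pressure of 1015.20 hPa does not fit `DECIMAL(5, 2)`, so a pressure column needs a larger precision such as `DECIMAL(7, 2)`.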

#### String Types

| Type | Max Length | Use Case |
|------|------------|----------|
| CHAR(n) | 255 chars | Fixed-length (e.g., country codes) |
| VARCHAR(n) | 65,535 bytes (shared row limit; fewer characters with multi-byte character sets) | Variable-length text |
| TEXT | 65,535 bytes | Long text, descriptions |
| MEDIUMTEXT | 16 MB | Articles, documents |
| LONGTEXT | 4 GB | Very large text |

#### Date and Time Types

| Type | Format | Example |
|------|--------|----------|
| DATE | YYYY-MM-DD | 2025-01-05 |
| TIME | HH:MM:SS | 14:30:00 |
| DATETIME | YYYY-MM-DD HH:MM:SS | 2025-01-05 14:30:00 |
| TIMESTAMP | YYYY-MM-DD HH:MM:SS | 2025-01-05 14:30:00 (stored in UTC, shown in the session time zone; range 1970 to 2038) |
| YEAR | YYYY | 2025 |
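
When you move values between Python and MySQL, these text formats are what you parse and produce. A minimal sketch using only the standard `datetime` module (the variable names are illustrative):

```python
from datetime import date, datetime, time

# MySQL's text formats are close to ISO 8601, so the standard library can parse them.
reading_date = date.fromisoformat("2025-01-05")  # DATE
reading_time = time.fromisoformat("14:30:00")    # TIME
reading_dt = datetime.strptime("2025-01-05 14:30:00", "%Y-%m-%d %H:%M:%S")  # DATETIME

# Going the other way: format a Python value as the string MySQL expects.
as_mysql_datetime = reading_dt.strftime("%Y-%m-%d %H:%M:%S")

# A DATE and a TIME column together carry the same information as one DATETIME.
combined = datetime.combine(reading_date, reading_time)
print(as_mysql_datetime, combined == reading_dt)
```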

#### Other Types

| Type | Use Case |
|------|----------|
| BOOLEAN | True/False values (stored as TINYINT) |
| JSON | Structured JSON data |
| BLOB | Binary data (images, files) |
| ENUM | Predefined list of values |

### 5.3 Creating Databases and Tables

Let's create a sample database schema for environmental monitoring data.

In [6]:
%%sql

-- Show existing databases
SHOW DATABASES;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
3 rows affected.


Database
information_schema
mfre521d
performance_schema


In [7]:
%%sql

-- Use our database
USE mfre521d;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
0 rows affected.


[]

In [8]:
%%sql

-- Create a table for weather stations
CREATE TABLE IF NOT EXISTS weather_stations (
    station_id INT AUTO_INCREMENT PRIMARY KEY,
    station_name VARCHAR(100) NOT NULL,
    city VARCHAR(50) NOT NULL,
    province VARCHAR(50),
    country VARCHAR(50) DEFAULT 'Canada',
    latitude DECIMAL(9, 6),
    longitude DECIMAL(9, 6),
    elevation_m INT,
    installed_date DATE,
    is_active BOOLEAN DEFAULT TRUE,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
0 rows affected.


[]

In [9]:
%%sql
-- Show tables in the database
SHOW TABLES;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
16 rows affected.


Tables_in_mfre521d
AirQuality_2
air_quality_readings
climate_agriculture_analysis
countries
country_mapping
crop_production
daily_summary
monthly_summary
pollution_thresholds
region


In [10]:
%%sql

-- Create a table for temperature readings
CREATE TABLE IF NOT EXISTS temperature_readings (
    reading_id INT AUTO_INCREMENT PRIMARY KEY,
    station_id INT NOT NULL,
    reading_date DATE NOT NULL,
    reading_time TIME,
    temperature_c DECIMAL(5, 2),
    humidity_percent DECIMAL(5, 2),
    pressure_hpa DECIMAL(7, 2),
    wind_speed_kmh DECIMAL(5, 2),
    precipitation_mm DECIMAL(6, 2),
    recorded_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (station_id) REFERENCES weather_stations(station_id)
);

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
0 rows affected.


[]

In [11]:
%%sql

-- Verify tables were created
SHOW TABLES;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
16 rows affected.


Tables_in_mfre521d
AirQuality_2
air_quality_readings
climate_agriculture_analysis
countries
country_mapping
crop_production
daily_summary
monthly_summary
pollution_thresholds
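region

Note: only the first 10 of the 16 rows are displayed here, so `temperature_readings` and `weather_stations` (which sort after `region`) do not appear in the listing.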


In [12]:
%%sql

-- View table structure
DESCRIBE weather_stations;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
11 rows affected.


Field,Type,Null,Key,Default,Extra
station_id,int,NO,PRI,,auto_increment
station_name,varchar(100),NO,,,
city,varchar(50),NO,,,
province,varchar(50),YES,,,
country,varchar(50),YES,,Canada,
latitude,"decimal(9,6)",YES,,,
longitude,"decimal(9,6)",YES,,,
elevation_m,int,YES,,,
installed_date,date,YES,,,
is_active,tinyint(1),YES,,1,


In [13]:
%%sql

-- View table structure for temperature_readings
DESCRIBE temperature_readings;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
10 rows affected.


Field,Type,Null,Key,Default,Extra
reading_id,int,NO,PRI,,auto_increment
station_id,int,NO,MUL,,
reading_date,date,NO,,,
reading_time,time,YES,,,
temperature_c,"decimal(5,2)",YES,,,
humidity_percent,"decimal(5,2)",YES,,,
pressure_hpa,"decimal(7,2)",YES,,,
wind_speed_kmh,"decimal(5,2)",YES,,,
precipitation_mm,"decimal(6,2)",YES,,,
recorded_at,timestamp,YES,,CURRENT_TIMESTAMP,DEFAULT_GENERATED


### 5.4 Basic SQL Commands

Now let's learn the four fundamental SQL operations: INSERT, SELECT, UPDATE, and DELETE.

#### INSERT - Adding Data

In [14]:
%%sql

-- Insert weather stations
INSERT INTO weather_stations (station_name, city, province, latitude, longitude, elevation_m, installed_date)
VALUES 
    ('YVR Airport', 'Vancouver', 'British Columbia', 49.1947, -123.1839, 4, '2015-06-15'),
    ('Downtown Vancouver', 'Vancouver', 'British Columbia', 49.2827, -123.1207, 70, '2018-03-22'),
    ('Pearson Airport', 'Toronto', 'Ontario', 43.6777, -79.6248, 173, '2010-09-01'),
    ('Calgary Tower', 'Calgary', 'Alberta', 51.0447, -114.0719, 1045, '2012-11-10'),
    ('Montreal Downtown', 'Montreal', 'Quebec', 45.5017, -73.5673, 233, '2016-04-05');

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
5 rows affected.


[]

In [15]:
%%sql

-- Insert temperature readings
INSERT INTO temperature_readings (station_id, reading_date, reading_time, temperature_c, humidity_percent, pressure_hpa, wind_speed_kmh, precipitation_mm)
VALUES
    (1, '2025-01-01', '06:00:00', 2.5, 85.0, 1015.2, 12.5, 0.0),
    (1, '2025-01-01', '12:00:00', 5.8, 72.0, 1016.5, 8.3, 0.0),
    (1, '2025-01-01', '18:00:00', 4.2, 78.0, 1014.8, 15.2, 2.5),
    (2, '2025-01-01', '06:00:00', 3.1, 88.0, 1015.0, 5.5, 0.0),
    (2, '2025-01-01', '12:00:00', 6.5, 70.0, 1016.2, 10.8, 0.0),
    (3, '2025-01-01', '06:00:00', -8.5, 65.0, 1022.5, 22.0, 0.0),
    (3, '2025-01-01', '12:00:00', -5.2, 55.0, 1021.8, 18.5, 0.0),
    (4, '2025-01-01', '06:00:00', -15.0, 45.0, 1025.0, 8.0, 0.0),
    (4, '2025-01-01', '12:00:00', -10.5, 40.0, 1024.5, 12.0, 0.5),
    (5, '2025-01-01', '06:00:00', -12.0, 70.0, 1020.0, 25.0, 5.0),
    (5, '2025-01-01', '12:00:00', -8.0, 62.0, 1019.5, 20.0, 2.0);

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
11 rows affected.


[]
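
When rows come from Python (for example, from a CSV file) rather than from hand-written SQL, the usual approach is a parameterized insert. The sketch below uses an in-memory SQLite database as a stand-in for MySQL, with a reduced version of the `temperature_readings` table; the three rows are copied from the cell above. One assumption to keep in mind: `sqlite3` uses `?` as the placeholder, while PyMySQL uses `%s`.

```python
import sqlite3

# Rows as Python tuples, e.g. loaded from a CSV file.
new_readings = [
    (1, "2025-01-01", "06:00:00", 2.5),
    (1, "2025-01-01", "12:00:00", 5.8),
    (3, "2025-01-01", "06:00:00", -8.5),
]

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute(
    """CREATE TABLE temperature_readings (
           reading_id INTEGER PRIMARY KEY,   -- SQLite's counterpart to AUTO_INCREMENT
           station_id INTEGER NOT NULL,
           reading_date TEXT NOT NULL,
           reading_time TEXT,
           temperature_c REAL
       )"""
)

# Placeholders keep values separate from the SQL text, which avoids quoting bugs
# and SQL injection. sqlite3 uses "?"; PyMySQL uses "%s" in the same position.
insert_sql = (
    "INSERT INTO temperature_readings (station_id, reading_date, reading_time, temperature_c) "
    "VALUES (?, ?, ?, ?)"
)
conn.executemany(insert_sql, new_readings)
conn.commit()

inserted = conn.execute("SELECT COUNT(*) FROM temperature_readings").fetchone()[0]
conn.close()
print(inserted)
```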

#### SELECT - Retrieving Data

In [16]:
%%sql

-- Select all columns from weather_stations
SELECT * FROM weather_stations;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
10 rows affected.


station_id,station_name,city,province,country,latitude,longitude,elevation_m,installed_date,is_active,created_at
1,YVR International Airport,Vancouver,British Columbia,Canada,49.1947,-123.1839,5,2015-06-15,1,2026-01-04 03:19:15
2,Downtown Vancouver,Vancouver,British Columbia,Canada,49.2827,-123.1207,75,2018-03-22,1,2026-01-04 03:19:15
3,Pearson Airport,Toronto,Ontario,Canada,43.6777,-79.6248,173,2010-09-01,1,2026-01-04 03:19:15
4,Calgary Tower,Calgary,Alberta,Canada,51.0447,-114.0719,1045,2012-11-10,1,2026-01-04 03:19:15
5,Montreal Downtown,Montreal,Quebec,Canada,45.5017,-73.5673,233,2016-04-05,1,2026-01-04 03:19:15
7,YVR Airport,Vancouver,British Columbia,Canada,49.1947,-123.1839,4,2015-06-15,1,2026-01-12 21:39:21
8,Downtown Vancouver,Vancouver,British Columbia,Canada,49.2827,-123.1207,70,2018-03-22,1,2026-01-12 21:39:21
9,Pearson Airport,Toronto,Ontario,Canada,43.6777,-79.6248,173,2010-09-01,1,2026-01-12 21:39:21
10,Calgary Tower,Calgary,Alberta,Canada,51.0447,-114.0719,1045,2012-11-10,1,2026-01-12 21:39:21
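11,Montreal Downtown,Montreal,Quebec,Canada,45.5017,-73.5673,233,2016-04-05,1,2026-01-12 21:39:21

Note: this output was captured from a database in which the INSERT cell had also been run in an earlier session, so every station appears twice (IDs 1 to 5 and 7 to 11), and stations 1 and 2 already carry the values set by the UPDATE examples later in this section. On a fresh database, expect five rows with IDs 1 to 5.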


In [17]:
%%sql

-- Select specific columns
SELECT station_name, city, province, latitude, longitude
FROM weather_stations;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
10 rows affected.


station_name,city,province,latitude,longitude
YVR International Airport,Vancouver,British Columbia,49.1947,-123.1839
Downtown Vancouver,Vancouver,British Columbia,49.2827,-123.1207
Pearson Airport,Toronto,Ontario,43.6777,-79.6248
Calgary Tower,Calgary,Alberta,51.0447,-114.0719
Montreal Downtown,Montreal,Quebec,45.5017,-73.5673
YVR Airport,Vancouver,British Columbia,49.1947,-123.1839
Downtown Vancouver,Vancouver,British Columbia,49.2827,-123.1207
Pearson Airport,Toronto,Ontario,43.6777,-79.6248
Calgary Tower,Calgary,Alberta,51.0447,-114.0719
Montreal Downtown,Montreal,Quebec,45.5017,-73.5673


In [18]:
%%sql

-- Select with column aliases
SELECT 
    station_name AS "Station",
    city AS "City",
    elevation_m AS "Elevation (m)"
FROM weather_stations;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
10 rows affected.


Station,City,Elevation (m)
YVR International Airport,Vancouver,5
Downtown Vancouver,Vancouver,75
Pearson Airport,Toronto,173
Calgary Tower,Calgary,1045
Montreal Downtown,Montreal,233
YVR Airport,Vancouver,4
Downtown Vancouver,Vancouver,70
Pearson Airport,Toronto,173
Calgary Tower,Calgary,1045
Montreal Downtown,Montreal,233


In [None]:
%%sql

-- Select all temperature readings
SELECT * FROM temperature_readings;

#### UPDATE - Modifying Data

In [19]:
%%sql

-- Update a single record
UPDATE weather_stations
SET elevation_m = 75
WHERE station_id = 2;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
1 rows affected.


[]

In [22]:
%%sql

-- Verify the update
SELECT station_id, station_name, elevation_m 
FROM weather_stations 
WHERE station_id = 2;

   mysql+pymysql://mfre521d_user:***@127.0.0.1:3306/mfre521d
 * mysql+pymysql://mfre521d_user:***@localhost:3306/mfre521d
1 rows affected.


station_id,station_name,elevation_m
2,Downtown Vancouver,75


In [None]:
%%sql

-- Update multiple columns
UPDATE weather_stations
SET 
    station_name = 'YVR International Airport',
    elevation_m = 5
WHERE station_id = 1;
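
An `UPDATE` without a `WHERE` clause changes every row, so it helps to check how many rows a statement touched before making the change permanent. The following sketch shows one way to do that from Python, using an in-memory SQLite database as a stand-in for MySQL and a reduced `weather_stations` table. It relies on the cursor's `rowcount` attribute and on transactions (`commit` / `rollback`).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute(
    "CREATE TABLE weather_stations "
    "(station_id INTEGER PRIMARY KEY, station_name TEXT, elevation_m INTEGER)"
)
conn.executemany(
    "INSERT INTO weather_stations VALUES (?, ?, ?)",
    [(1, "YVR Airport", 4), (2, "Downtown Vancouver", 70)],
)
conn.commit()

# A targeted update should touch exactly one row; check before committing.
cursor = conn.execute(
    "UPDATE weather_stations SET elevation_m = ? WHERE station_id = ?", (75, 2)
)
if cursor.rowcount == 1:
    conn.commit()    # keep the change
else:
    conn.rollback()  # unexpected row count: undo the change

# Forgetting WHERE changes every row; here we notice and roll back.
cursor = conn.execute("UPDATE weather_stations SET elevation_m = 0")
rows_hit_without_where = cursor.rowcount
conn.rollback()

elevations = conn.execute(
    "SELECT station_id, elevation_m FROM weather_stations ORDER BY station_id"
).fetchall()
conn.close()
print(rows_hit_without_where, elevations)
```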

#### DELETE - Removing Data

In [None]:
%%sql

-- First, insert a record we will delete
INSERT INTO weather_stations (station_name, city, province, latitude, longitude)
VALUES ('Test Station', 'Test City', 'Test Province', 50.0000, -120.0000);

In [None]:
%%sql

-- Verify the record exists
SELECT * FROM weather_stations WHERE station_name = 'Test Station';

In [None]:
%%sql

-- Delete the test record
DELETE FROM weather_stations WHERE station_name = 'Test Station';

In [None]:
%%sql

-- Verify deletion
SELECT * FROM weather_stations;
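
The verify-then-delete routine above can also be scripted: count the rows that match the condition, and delete only if the count is what you expect. This is a sketch under the same assumptions as before (in-memory SQLite as a stand-in for MySQL, reduced table, illustrative variable names).

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute(
    "CREATE TABLE weather_stations (station_id INTEGER PRIMARY KEY, station_name TEXT)"
)
conn.executemany(
    "INSERT INTO weather_stations (station_name) VALUES (?)",
    [("YVR Airport",), ("Test Station",), ("Pearson Airport",)],
)
conn.commit()

# The condition is a fixed string written by us; the value is passed as a parameter.
where_clause = "station_name = ?"
params = ("Test Station",)

# Step 1: count the rows the condition matches, using the same WHERE clause.
to_delete = conn.execute(
    f"SELECT COUNT(*) FROM weather_stations WHERE {where_clause}", params
).fetchone()[0]

# Step 2: delete only if the count is what we expected.
if to_delete == 1:
    conn.execute(f"DELETE FROM weather_stations WHERE {where_clause}", params)
    conn.commit()

remaining = [
    row[0]
    for row in conn.execute("SELECT station_name FROM weather_stations ORDER BY station_id")
]
conn.close()
print(to_delete, remaining)
```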

### 5.5 Filtering with WHERE

The WHERE clause filters rows based on specified conditions.

In [None]:
%%sql

-- Filter by exact match
SELECT * FROM weather_stations
WHERE province = 'British Columbia';

In [None]:
%%sql

-- Filter with comparison operators
SELECT * FROM temperature_readings
WHERE temperature_c < 0;

In [None]:
%%sql

-- Filter with AND operator
SELECT * FROM temperature_readings
WHERE temperature_c < 0 AND humidity_percent > 50;

In [None]:
%%sql

-- Filter with OR operator
SELECT station_name, city, province 
FROM weather_stations
WHERE province = 'British Columbia' OR province = 'Ontario';

In [None]:
%%sql

-- Filter with IN operator
SELECT station_name, city, province 
FROM weather_stations
WHERE province IN ('British Columbia', 'Ontario', 'Quebec');

In [None]:
%%sql

-- Filter with BETWEEN operator
SELECT * FROM temperature_readings
WHERE temperature_c BETWEEN -5 AND 5;

In [None]:
%%sql

-- Filter with LIKE operator (pattern matching)
SELECT * FROM weather_stations
WHERE station_name LIKE '%Airport%';

In [None]:
%%sql

-- Filter out NULL values (keep rows where the elevation is known)
SELECT * FROM weather_stations
WHERE elevation_m IS NOT NULL;
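
`NULL` needs the special `IS NULL` / `IS NOT NULL` operators because a comparison with `NULL` is never true, not even `= NULL`. The sketch below demonstrates this with an in-memory SQLite database, which follows the same rule as MySQL for these three conditions; the helper `names` and the two sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not MySQL
conn.execute("CREATE TABLE weather_stations (station_name TEXT, elevation_m INTEGER)")
conn.executemany(
    "INSERT INTO weather_stations VALUES (?, ?)",
    [("YVR Airport", 4), ("Test Station", None)],  # None is stored as NULL
)


def names(where: str) -> list:
    """Run a query with the given WHERE condition and return the station names."""
    sql = f"SELECT station_name FROM weather_stations WHERE {where} ORDER BY station_name"
    return [row[0] for row in conn.execute(sql)]


with_equals_null = names("elevation_m = NULL")  # comparison with NULL is never true
with_is_null = names("elevation_m IS NULL")
with_is_not_null = names("elevation_m IS NOT NULL")
conn.close()
print(with_equals_null, with_is_null, with_is_not_null)
```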

### 5.6 Sorting with ORDER BY

The ORDER BY clause sorts the result set by one or more columns.

In [None]:
%%sql

-- Sort by single column (ascending - default)
SELECT station_name, city, elevation_m
FROM weather_stations
ORDER BY elevation_m;

In [None]:
%%sql

-- Sort descending
SELECT station_name, city, elevation_m
FROM weather_stations
ORDER BY elevation_m DESC;

In [None]:
%%sql

-- Sort by multiple columns
SELECT station_id, reading_date, reading_time, temperature_c
FROM temperature_readings
ORDER BY station_id ASC, temperature_c DESC;

In [None]:
%%sql

-- Combine WHERE and ORDER BY
SELECT station_id, reading_time, temperature_c, humidity_percent
FROM temperature_readings
WHERE temperature_c > 0
ORDER BY temperature_c DESC;

In [None]:
%%sql

-- Limit results
SELECT station_id, reading_time, temperature_c
FROM temperature_readings
ORDER BY temperature_c ASC
LIMIT 5;
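
If you already know Python, it may help to see what `ORDER BY` and `LIMIT` correspond to in plain list operations. This is only an analogy for the semantics (the database does not work this way internally); the pairs are station 1 and station 2 readings from the INSERT cell earlier.

```python
# (station_id, temperature_c) pairs taken from the sample readings.
readings = [(2, 3.1), (1, 5.8), (1, 2.5), (2, 6.5), (1, 4.2)]

# ORDER BY station_id ASC, temperature_c DESC:
# sort by the first key ascending and, within ties, by the second key descending.
ordered = sorted(readings, key=lambda r: (r[0], -r[1]))

# ORDER BY temperature_c ASC LIMIT 2: sort, then keep the first two rows.
two_coldest = sorted(readings, key=lambda r: r[1])[:2]
print(ordered)
print(two_coldest)
```

Note that `LIMIT` without `ORDER BY` gives no guarantee about which rows you get back, which is why the two are usually combined.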

---

## 6. Practice Exercises

Try these exercises on your own before looking at the solutions.

### Exercise 1
Select all weather stations in Canada that have an elevation greater than 100 meters.

In [None]:
%%sql

-- Your answer here


### Exercise 2
Find the coldest temperature reading recorded, showing the station_id, date, time, and temperature.

In [None]:
%%sql

-- Your answer here


### Exercise 3
List all temperature readings where it was raining (precipitation > 0), ordered by precipitation amount from highest to lowest.

In [None]:
%%sql

-- Your answer here


### Exercise 4
Insert a new weather station for "Halifax Harbour" in Nova Scotia at latitude 44.6488 and longitude -63.5752, with elevation 15 meters.

In [None]:
%%sql

-- Your answer here


---

### Solutions

In [None]:
%%sql

-- Solution 1
SELECT * FROM weather_stations
WHERE country = 'Canada' AND elevation_m > 100;

In [None]:
%%sql

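-- Solution 2
-- NULL sorts first in ascending order in MySQL, so exclude missing readings
SELECT station_id, reading_date, reading_time, temperature_c
FROM temperature_readings
WHERE temperature_c IS NOT NULL
ORDER BY temperature_c ASC
LIMIT 1;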

In [None]:
%%sql

-- Solution 3
SELECT station_id, reading_date, reading_time, temperature_c, precipitation_mm
FROM temperature_readings
WHERE precipitation_mm > 0
ORDER BY precipitation_mm DESC;

In [None]:
%%sql

-- Solution 4
INSERT INTO weather_stations (station_name, city, province, latitude, longitude, elevation_m)
VALUES ('Halifax Harbour', 'Halifax', 'Nova Scotia', 44.6488, -63.5752, 15);

---

## 7. Summary

### What We Covered Today

1. **Environment Setup**
   - Installed Docker Desktop (Windows and Mac)
   - Configured VS Code with essential extensions
   - Created and ran a MySQL container using Docker Compose
   - Connected to MySQL from Jupyter Notebook

2. **Database Fundamentals**
   - Understanding what databases are and why we need them
   - Relational vs NoSQL databases
   - When to use each type

3. **SQL Basics**
   - Data types in MySQL
   - Creating databases and tables
   - CRUD operations (Create, Read, Update, Delete)
   - Filtering with WHERE clause
   - Sorting with ORDER BY clause

### Key SQL Commands Learned

| Command | Purpose | Example |
|---------|---------|----------|
| CREATE TABLE | Define new table | `CREATE TABLE name (...)` |
| INSERT INTO | Add new rows | `INSERT INTO table VALUES (...)` |
| SELECT | Retrieve data | `SELECT * FROM table` |
| UPDATE | Modify existing data | `UPDATE table SET col = val WHERE ...` |
| DELETE | Remove rows | `DELETE FROM table WHERE ...` |
| WHERE | Filter rows | `WHERE column = value` |
| ORDER BY | Sort results | `ORDER BY column DESC` |

### Next Class Preview

In Lecture 2, we will cover:
- SQL JOINs (INNER, LEFT, RIGHT, FULL)
- Common Table Expressions (CTEs)
- Window Functions
- Aggregation Functions (COUNT, SUM, AVG, MIN, MAX)
- GROUP BY and HAVING clauses

---

## 8. References

### Official Documentation

1. **MySQL Documentation**  
   https://dev.mysql.com/doc/refman/8.0/en/

2. **Docker Documentation**  
   https://docs.docker.com/

3. **Docker Compose Documentation**  
   https://docs.docker.com/compose/

4. **VS Code Documentation**  
   https://code.visualstudio.com/docs

### Books

5. Beaulieu, A. (2020). *Learning SQL: Generate, Manipulate, and Retrieve Data* (3rd ed.). O'Reilly Media.

6. Schwartz, B., Zaitsev, P., & Tkachenko, V. (2012). *High Performance MySQL* (3rd ed.). O'Reilly Media.

### Online Tutorials

7. **W3Schools SQL Tutorial**  
   https://www.w3schools.com/sql/

8. **SQLBolt - Learn SQL with Interactive Exercises**  
   https://sqlbolt.com/

9. **Mode SQL Tutorial**  
   https://mode.com/sql-tutorial/

### Python Libraries

10. **SQLAlchemy Documentation**  
    https://docs.sqlalchemy.org/

11. **PyMySQL Documentation**  
    https://pymysql.readthedocs.io/

12. **ipython-sql Documentation**  
    https://github.com/catherinedevlin/ipython-sql

### Database Comparisons

13. **DB-Engines Ranking**  
    https://db-engines.com/en/ranking

14. **AWS Database Services Overview**  
    https://aws.amazon.com/products/databases/

---

## Cleanup (Optional)

Run these commands when you are finished to clean up the database.

In [None]:
%%sql

-- Drop tables (uncomment to run)
-- DROP TABLE IF EXISTS temperature_readings;
-- DROP TABLE IF EXISTS weather_stations;

To stop the Docker container, run this command in your terminal:

```bash
docker-compose down
```

To stop and remove all data:

```bash
docker-compose down -v
```