# Installation of MySQL on Google Colab
The Google Colab runtime uses Ubuntu Linux as its underlying operating system.
*   In Colab, put the `!` character at the beginning of a line in a code cell to run a shell command
*   Refresh the local package index with the `apt-get update` command
*   Install the `mysql-server` package with the `apt install` command



In [1]:
%%capture
! sudo apt-get update

In [2]:
%%capture
# -y answers the confirmation prompt automatically (the prompt is hidden by %%capture)
! sudo apt install -y mysql-server

## Check if MySQL is running

After MySQL has been installed, use the MySQL service script located in `/etc/init.d/` to start, stop, restart, or check the status of the MySQL database server.

*   `/etc/init.d/mysql status` to check status of MySQL database server
*   `/etc/init.d/mysql start` to start MySQL database server
*   `/etc/init.d/mysql stop` to stop MySQL database server
*   `/etc/init.d/mysql restart` to restart MySQL database server
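
The four actions above can also be driven from Python instead of `!` shell lines. The following is a minimal sketch, assuming the script path `/etc/init.d/mysql` used in this notebook; the helper names `service_argv` and `run_service` are hypothetical and not part of any library.

```python
import subprocess

# Actions supported by the MySQL service script, as listed above.
ALLOWED_ACTIONS = ("status", "start", "stop", "restart")


def service_argv(action, script="/etc/init.d/mysql"):
    """Build the argument list for one service-script action."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    return [script, action]


def run_service(action):
    """Run the action and return (exit code, combined output text)."""
    result = subprocess.run(
        service_argv(action),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return result.returncode, result.stdout
```

Calling `run_service("status")` on a machine where the script exists returns the exit code together with the text that the shell version prints. Rejecting unknown actions early keeps a typo from reaching the shell.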



In [3]:
# Exercise the service script: stop, check status, start, then restart the MySQL database server
! /etc/init.d/mysql stop
! /etc/init.d/mysql status
! /etc/init.d/mysql start
! /etc/init.d/mysql restart

 * Stopping MySQL database server mysqld
   ...done.
 * MySQL is stopped.
 * Starting MySQL database server mysqld
   ...done.
 * Stopping MySQL database server mysqld
   ...done.
 * Starting MySQL database server mysqld
   ...done.


In [4]:
# Check MySQL server running Status
! /etc/init.d/mysql status

 * /usr/bin/mysqladmin  Ver 8.0.41-0ubuntu0.22.04.1 for Linux on x86_64 ((Ubuntu))
Copyright (c) 2000, 2025, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Server version		8.0.41-0ubuntu0.22.04.1
Protocol version	10
Connection		Localhost via UNIX socket
UNIX socket		/var/run/mysqld/mysqld.sock
Uptime:			2 sec

Threads: 2  Questions: 8  Slow queries: 0  Opens: 119  Flush tables: 3  Open tables: 38  Queries per second avg: 4.000


## Change root's password

When the MySQL server is running, we can use the `mysql` command-line client to execute commands. Options can be added, for example:
*   `-u <username>`: specify the username
*   `-p` : prompt for the password
*   `-e <query>` : execute a query directly from the command line without opening the MySQL shell


After the installation has completed, you can change the root password with:

```
!mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '8OQsfBWBP4'; FLUSH PRIVILEGES;"
```

This command:

*   changes the MySQL `root` account for logins from localhost to use `mysql_native_password` as its authentication plugin (the traditional password-based method) and sets `8OQsfBWBP4` as the new password
*   reloads the grant tables with `FLUSH PRIVILEGES`; `ALTER USER` already takes effect immediately, so this step is harmless but not required

In [5]:
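# End each SQL statement with ';'
# 'FLUSH PRIVILEGES' reloads the grant tables
# Note: ERROR 1045 "(using password: NO)" in the output below means the client sent no password
# and was refused, most likely because this cell was re-run after the root password had been set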
! mysql -e "ALTER USER 'root'@'localhost' IDENTIFIED WITH mysql_native_password BY '8OQsfBWBP4'; FLUSH PRIVILEGES;"

ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: NO)


# MySQL commands

### Login to MySQL
Do not put a space between `-p` and the password:
```
mysql -u [username] -p[password]
```

### Execute SQL Command Using `-e` Option
```
mysql -u [username] -p[password] -e "[SQL command]"
```
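
The same command line can be assembled in Python and run with `subprocess`, which avoids shell quoting problems when the SQL contains quotes. This is a minimal sketch under stated assumptions: the helper names `mysql_argv` and `run_mysql` are hypothetical, and the optional `database` argument mirrors the `mysql ... weather_db -e "..."` form used later in this notebook.

```python
import subprocess


def mysql_argv(user, password, query, database=None):
    """Build the argument list for: mysql -u USER -pPASSWORD [DATABASE] -e QUERY."""
    argv = ["mysql", "-u", user, f"-p{password}"]  # no space between -p and the password
    if database is not None:
        argv.append(database)  # optional default database
    argv += ["-e", query]
    return argv


def run_mysql(user, password, query, database=None):
    """Run the query through the mysql client and return its standard output."""
    completed = subprocess.run(
        mysql_argv(user, password, query, database),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError when mysql exits with an error
    )
    return completed.stdout
```

For example, `run_mysql("test_user", "test_p@ssw0rd", "SHOW DATABASES;")` would correspond to the shell form shown above, provided the server is running and the account exists.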

### Example: Add a New User
In the `GRANT` statement below, `*.*` (in place of `[database name].*`) means all tables in all databases.
```sh
!mysql -u root -p8OQsfBWBP4 -e "CREATE USER 'test_user'@'localhost' IDENTIFIED BY 'test_p@ssw0rd';"
!mysql -u root -p8OQsfBWBP4 -e "GRANT ALL PRIVILEGES ON *.* TO 'test_user'@'localhost';"
!mysql -u root -p8OQsfBWBP4 -e "FLUSH PRIVILEGES;"
```
The above three commands execute MySQL statements under user `root` with password `8OQsfBWBP4` to perform the following operations:
*   Create user `test_user` for logging in from `localhost`, with password `test_p@ssw0rd`
*   Grant all permissions on any table in any database to user `test_user` for logging in from `localhost`
*   Apply changes on privileges


### Example: Insert Multiple Records
Here’s how you can write a single SQL statement to insert multiple records:
```sh
!mysql -u root -p8OQsfBWBP4 -e "
INSERT INTO prediction (column1, column2, column3)
VALUES
('value1_row1', 'value2_row1', 'value3_row1'),
('value1_row2', 'value2_row2', 'value3_row2'),
('value1_row3', 'value2_row3', 'value3_row3');
"
```

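The `INSERT` statement above inserts 3 rows into the table `prediction`. For the first row, the value for `column1` is `value1_row1`, the value for `column2` is `value2_row1`, and the value for `column3` is `value3_row1`; the other rows follow the same pattern. Because no database is named on the command line, the statement also needs a database to be selected: pass the database name as an extra argument to `mysql`, or write the table as `[database name].prediction`.

These commands show how to log in, execute SQL commands, and insert multiple records into your MySQL database using the `-e` option. Remember to replace `[username]`, `[password]`, `[SQL command]`, and the column names and values with your actual data.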



In [6]:
# Add a new user test_user to MySQL
# (CREATE USER typically reports ERROR 1396 when the account already exists, e.g. when this cell is re-run)

!mysql -u root -p8OQsfBWBP4 -e "CREATE USER 'test_user'@'localhost' IDENTIFIED BY 'test_p@ssw0rd';"
!mysql -u root -p8OQsfBWBP4 -e "GRANT ALL PRIVILEGES ON *.* TO 'test_user'@'localhost';"
!mysql -u root -p8OQsfBWBP4 -e "FLUSH PRIVILEGES;"

ERROR 1396 (HY000) at line 1: Operation CREATE USER failed for 'test_user'@'localhost'
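
ERROR 1396 on `CREATE USER` typically means the account already exists, which happens when the cell is run a second time. The sketch below builds statements that tolerate re-runs by using `CREATE USER IF NOT EXISTS`, which the MySQL 8.0 server shown above supports. The function name `user_setup_statements` is hypothetical, and it performs no escaping, so it is only suitable for trusted literal values.

```python
def user_setup_statements(user, password, host="localhost"):
    """Return SQL statements that create an account and grant it all privileges.

    Values are interpolated as-is (no escaping), so pass trusted literals only.
    """
    account = f"'{user}'@'{host}'"
    return [
        # IF NOT EXISTS turns the "already exists" error into a warning on re-run.
        f"CREATE USER IF NOT EXISTS {account} IDENTIFIED BY '{password}';",
        f"GRANT ALL PRIVILEGES ON *.* TO {account};",
    ]


for statement in user_setup_statements("test_user", "test_p@ssw0rd"):
    print(statement)
```

Each printed statement can be passed to `mysql ... -e "..."` as in the cell above. `FLUSH PRIVILEGES` is left out because `CREATE USER` and `GRANT` take effect immediately.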


## Actual command lines

Use the `mysql` command to log in to the MySQL database. This opens the MySQL interactive prompt, `mysql>`.

At the MySQL prompt, enter MySQL statements to perform MySQL operations.
*   `show databases` : To show list of databases available to the user
*   `use <DB_NAME>` : To select the database `DB_NAME` for use
*   `desc <TABLE_NAME>` : To describe the structure of the table `TABLE_NAME` in the selected database
*   `exit` : To exit the MySQL shell prompt

In [7]:
# Log-in to MySQL and show databases
# ! mysql -u test_user -ptest_p@ssw0rd

In [8]:
# Execute the SHOW DATABASES command directly from the command line without opening the MySQL shell
! mysql -u test_user -ptest_p@ssw0rd -e "show databases;"

+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| weather_db         |
+--------------------+


## Using Python to connect to MySQL

`mysql-connector-python` is a Python driver for communicating with MySQL servers. It enables Python programs to access MySQL databases.

Use the Python package installer (pip) to install `mysql-connector-python`.

In [9]:
%%capture
# Install mysql-connector-python
!pip install mysql-connector-python

## Connect to a MySQL server

In a Python program, use `mysql.connector` to create a connection to the MySQL database.

**`mysql.connector.connect(host, port, user, password, database)`** creates a connection to the MySQL database
*   `host` : host name or IP address of the MySQL server
*   `port` : TCP/IP port of the MySQL server. Must be an integer
*   `user` : user name used to authenticate with the MySQL server
*   `password` : password to authenticate the user with the MySQL server
*   `database` : database in the MySQL server to connect to

This returns a connection to the database.


**`connection.cursor()`** returns a new Cursor Object using the connection. It represents a database cursor, which is used to manage the context of a fetch operation.


**`cursor.execute(operation [, parameters])`** prepares and executes a database operation (query or command).
*   `parameters` may be provided as sequence or mapping and will be bound to variables in the operation.


**`connection.close()`** closes the connection
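
The calls above fit together as connect, cursor, execute, fetch, close. The following is a minimal sketch of that lifecycle written against the Python DB-API, which both `mysql.connector` and the standard-library `sqlite3` module follow. `sqlite3` is used here only as a stand-in so the sketch runs without a server; the helper name `run_query` is hypothetical.

```python
import sqlite3


def run_query(connect, sql, params=()):
    """Open a connection, run one statement, return all rows, and always close."""
    connection = connect()
    try:
        cursor = connection.cursor()
        try:
            cursor.execute(sql, params)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        connection.close()


# Stand-in: an in-memory SQLite database instead of a MySQL server.
rows = run_query(lambda: sqlite3.connect(":memory:"), "SELECT 1 + 1")
print(rows)
```

With MySQL, the first argument would instead be something like `lambda: mysql.connector.connect(host="localhost", user="test_user", password="test_p@ssw0rd")`. The `try`/`finally` blocks make sure the cursor and connection are released even when the statement fails.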




In [10]:
import mysql.connector

# Make connection
mydb = mysql.connector.connect(
    host="localhost",
    user='test_user',
    password='test_p@ssw0rd'
)

# Create cursor
mycursor = mydb.cursor()

# Execute show databases command
mycursor.execute("SHOW DATABASES;")

# Show results
[x for x in mycursor]

[('information_schema',),
 ('mysql',),
 ('performance_schema',),
 ('sys',),
 ('weather_db',)]

## Get 7timer.info data

We will get weather data for BLOCK28x Samyan from http://www.7timer.info/bin/api.pl?lon=100.5231&lat=13.7367&product=civil&output=json, store it in the database, and perform operations on the data.

(The request URL is built with a formatted string literal so that the location can be changed easily.)



*   Only the elements of the array under the key `dataseries` are retrieved from the result and stored in a DataFrame
*   The `rh2m` column contains strings representing percentage values (ending with `%`). For each value in the column, remove the trailing `%` and convert it to an integer.
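
The cleaning step for `rh2m` can be sketched in plain Python on a couple of hand-copied records, independent of pandas and of the network. The values below are copied from the response shown in this notebook; the helper names `parse_percent` and `clean_record` are hypothetical.

```python
# Two records shaped like entries of the 'dataseries' array.
sample_dataseries = [
    {"timepoint": 3, "temp2m": 22, "rh2m": "51%",
     "wind10m": {"direction": "NW", "speed": 2}},
    {"timepoint": 6, "temp2m": 22, "rh2m": "52%",
     "wind10m": {"direction": "NW", "speed": 2}},
]


def parse_percent(text):
    """Convert a string such as '51%' to the integer 51."""
    return int(text.rstrip("%"))


def clean_record(record):
    """Return a copy of the record with 'rh2m' stored as an integer."""
    cleaned = dict(record)
    cleaned["rh2m"] = parse_percent(record["rh2m"])
    return cleaned


cleaned_records = [clean_record(r) for r in sample_dataseries]
print([r["rh2m"] for r in cleaned_records])
```

Copying each record before changing it leaves the original response untouched, which makes it easier to re-run the cleaning step.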



In [11]:
import pandas as pd
import requests
import json

lat = 13.7367
lon = 100.5231

url = f"http://www.7timer.info/bin/api.pl?lon={lon}&lat={lat}&product=civil&output=json"
response = requests.get(url)
data = response.json()
data

{'product': 'civil',
 'init': '2025021018',
 'dataseries': [{'timepoint': 3,
   'cloudcover': 6,
   'lifted_index': 10,
   'prec_type': 'none',
   'prec_amount': 0,
   'temp2m': 22,
   'rh2m': '51%',
   'wind10m': {'direction': 'NW', 'speed': 2},
   'weather': 'mcloudynight'},
  {'timepoint': 6,
   'cloudcover': 6,
   'lifted_index': 10,
   'prec_type': 'none',
   'prec_amount': 0,
   'temp2m': 22,
   'rh2m': '52%',
   'wind10m': {'direction': 'NW', 'speed': 2},
   'weather': 'mcloudyday'},
  {'timepoint': 9,
   'cloudcover': 2,
   'lifted_index': 10,
   'prec_type': 'none',
   'prec_amount': 0,
   'temp2m': 28,
   'rh2m': '42%',
   'wind10m': {'direction': 'W', 'speed': 2},
   'weather': 'clearday'},
  {'timepoint': 12,
   'cloudcover': 2,
   'lifted_index': 6,
   'prec_type': 'none',
   'prec_amount': 0,
   'temp2m': 33,
   'rh2m': '35%',
   'wind10m': {'direction': 'N', 'speed': 2},
   'weather': 'clearday'},
  {'timepoint': 15,
   'cloudcover': 2,
   'lifted_index': 6,
   'prec_typ

In [12]:
# Create DataFrame from elements 'dataseries'
df = pd.DataFrame(data['dataseries'])
df

| | timepoint | cloudcover | lifted_index | prec_type | prec_amount | temp2m | rh2m | wind10m | weather |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 6 | 10 | none | 0 | 22 | 51% | {'direction': 'NW', 'speed': 2} | mcloudynight |
| 1 | 6 | 6 | 10 | none | 0 | 22 | 52% | {'direction': 'NW', 'speed': 2} | mcloudyday |
| 2 | 9 | 2 | 10 | none | 0 | 28 | 42% | {'direction': 'W', 'speed': 2} | clearday |
| 3 | 12 | 2 | 6 | none | 0 | 33 | 35% | {'direction': 'N', 'speed': 2} | clearday |
| 4 | 15 | 2 | 6 | none | 0 | 34 | 42% | {'direction': 'NW', 'speed': 2} | clearday |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 59 | 180 | 1 | -4 | none | 0 | 35 | 70% | {'direction': 'S', 'speed': 3} | clearday |
| 60 | 183 | 1 | -4 | none | 0 | 35 | 67% | {'direction': 'S', 'speed': 3} | clearday |
| 61 | 186 | 1 | -4 | none | 0 | 30 | 85% | {'direction': 'S', 'speed': 3} | clearnight |
| 62 | 189 | 1 | -4 | none | 0 | 28 | 85% | {'direction': 'S', 'speed': 3} | clearnight |


In [13]:
# For each value in the column 'rh2m', strip the '%' sign and convert the result to an integer.
df['rh2m'] = df['rh2m'].str.replace('%', '').astype(int)

# CRUD Operations

## Create a Database
Example: Create a New Database

```
!mysql -u [username] -p[password] -e "
CREATE DATABASE IF NOT EXISTS weather_db;
"
```
This command creates the `weather_db` database if it does not already exist. As in the previous examples, it logs in with the given username and password and executes the SQL command.

Just replace `[username]` and `[password]` with your MySQL credentials.

In [14]:
# Create Database from mysql command
! mysql -u test_user -ptest_p@ssw0rd -e "CREATE DATABASE IF NOT EXISTS weather_db;"



Create a Python program to do the same:
*   Connect to the MySQL server
*   Create the database `weather_db` if it does not exist
*   Connect to the `weather_db` database


In [15]:
import mysql.connector
import pandas as pd
import requests
import json

# Create connection
mydb = mysql.connector.connect(
  host="localhost",
  user="test_user",
  password="test_p@ssw0rd"
)

# Create Database
mycursor = mydb.cursor()
mycursor.execute("CREATE DATABASE IF NOT EXISTS weather_db")

# Connect to the newly created database
mydb = mysql.connector.connect(
  host="localhost",
  user="test_user",
  password="test_p@ssw0rd",
  database="weather_db"
)

# Get cursor to database
mycursor = mydb.cursor()

## Create a table

Example: Create a New Table

```
!mysql -u [username] -p[password] -e "
CREATE TABLE IF NOT EXISTS prediction (
    timepoint INT,
    rh2m INT,
    prec_amount INT,
    temp2m INT
);
"
```
This command will create the `prediction` table with the specified columns if it doesn't already exist. Similar to the previous examples, it logs in with the given username and password, and executes the SQL command. The `prediction` table consists of 4 columns:
*   `timepoint` stores integer data
*   `rh2m` stores integer data
*   `prec_amount` stores integer data
*   `temp2m` stores integer data



In [16]:
# Continue the Python program: execute the CREATE TABLE statement shown above.
mycursor.execute(
    """
    CREATE TABLE IF NOT EXISTS prediction (
        timepoint INT,
        rh2m INT,
        prec_amount INT,
        temp2m INT
    );
    """
)

In [17]:
# Execute SHOW TABLES to show all tables in the database
mycursor.execute("SHOW TABLES")
[t for t in mycursor]


[('prediction',)]

## Insert Data Using INSERT INTO  

Example: Add a New Record  

```python
sql = "INSERT INTO prediction (timepoint, rh2m, prec_amount, temp2m) VALUES (%s, %s, %s, %s)"  
val = (12, 80, 5, 28)  

mycursor.execute(sql, val)  
mydb.commit()  

print(f"{mycursor.rowcount} record inserted.")  
```  

The **INSERT INTO** statement adds a new row to the specified table:  

- **INSERT INTO table_name (column1, column2, ...)**: Specifies the target table and columns.  
- **VALUES (%s, %s, ...)**: Uses placeholders to insert values safely.  
- **execute(sql, val)**: Passes the SQL query and values separately to prevent SQL injection.  
- **commit()**: Saves the changes to the database.  
- **rowcount**: Shows the number of rows affected.  
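
To see why passing values separately matters, the sketch below stores a string that would break a query built by string concatenation. It is an illustration under stated assumptions: the standard-library `sqlite3` module stands in for the MySQL connection so the code runs without a server, its placeholder is `?` where `mysql-connector-python` uses `%s`, and the `notes` table is hypothetical.

```python
import sqlite3

# Stand-in database; with mysql-connector-python the placeholder would be %s.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute("CREATE TABLE notes (body TEXT)")

# A value that would break a query assembled by string concatenation.
hostile = "x'); DROP TABLE notes; --"

# The value travels separately from the SQL text, so it is stored literally.
cursor.execute("INSERT INTO notes (body) VALUES (?)", (hostile,))
connection.commit()

stored = cursor.execute("SELECT body FROM notes").fetchall()
print(stored)
connection.close()
```

The table still exists after the insert and contains the hostile string unchanged, because the driver never interprets the value as SQL.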

### Insert Multiple Records  
```python
sql = "INSERT INTO prediction (timepoint, rh2m, prec_amount, temp2m) VALUES (%s, %s, %s, %s)"  
values = [  
    (13, 75, 0, 30),  
    (14, 60, 2, 27),  
    (15, 85, 10, 26)  
]  

mycursor.executemany(sql, values)  
mydb.commit()  

print(f"{mycursor.rowcount} records inserted.")  
```  
This inserts multiple rows at once using **`executemany(operation, seq_of_parameters)`**:
*   `operation` : database query or command
*   `seq_of_parameters` : sequence of parameter tuples, one per row
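
The parameter sequence is often built from records rather than typed by hand. The following sketch turns a list of dictionaries into the list of tuples that `executemany` expects. The sample values are copied from this notebook, and `sqlite3` is a stand-in for the MySQL connection (placeholder `?` instead of `%s`) so that the code runs without a server.

```python
import sqlite3

# Records shaped like the cleaned weather data.
records = [
    {"timepoint": 3, "rh2m": 51, "prec_amount": 0, "temp2m": 22},
    {"timepoint": 6, "rh2m": 52, "prec_amount": 0, "temp2m": 22},
    {"timepoint": 9, "rh2m": 42, "prec_amount": 0, "temp2m": 28},
]
columns = ("timepoint", "rh2m", "prec_amount", "temp2m")

# One tuple per record, with values in the same order as the column list.
values = [tuple(record[name] for name in columns) for record in records]

connection = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE prediction (timepoint INT, rh2m INT, prec_amount INT, temp2m INT)"
)
cursor.executemany(
    "INSERT INTO prediction (timepoint, rh2m, prec_amount, temp2m) VALUES (?, ?, ?, ?)",
    values,
)
connection.commit()
inserted = cursor.rowcount
print(inserted, "records inserted.")
connection.close()
```

Keeping the column names in one tuple means the SQL column list and the value order cannot drift apart silently.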




In [18]:
# Iterate over the rows of DataFrame as pairs of index and row
# For each row, create tuple containing values of timepoint, rh2m, prec_amount, temp2m
# Append the tuple to a list
values = []
for index, row in df.iterrows():
    values.append((row['timepoint'], row['rh2m'], row['prec_amount'], row['temp2m']))

# Insert many into prediction table
sql = "INSERT INTO prediction (timepoint, rh2m, prec_amount, temp2m) VALUES (%s, %s, %s, %s)"
mycursor.executemany(sql, values)

# Commit changes
mydb.commit()

# Print rowcount
print(mycursor.rowcount, "records inserted.")

64 records inserted.


## Read Data Using SELECT  

Example: Retrieve Data from a Table  

```  
!mysql -u [username] -p[password] -e "  
SELECT timepoint, temp2m FROM prediction  
WHERE temp2m > 25;  
"  
```  

This command retrieves specific columns (`timepoint` and `temp2m`) from the `prediction` table where the temperature (`temp2m`) is greater than 25.  

- **SELECT**: Specifies the columns to retrieve. Use `*` to select all columns.  
- **FROM**: Defines the table from which data is selected.  
- **WHERE**: Filters the results based on a condition (e.g., `temp2m > 25`).  

Modify the column names, table name, and condition to fit your needs.

**`SELECT * FROM TABLE`** retrieves the values of all columns from the `TABLE`.

Here are some examples of reading data using `SELECT` with `mycursor` in Python:  

### 1. **Retrieve All Rows**

**`cursor.fetchall()`** returns all remaining rows as a sequence of sequences (e.g. a list of tuples)
```python
mycursor.execute("SELECT * FROM prediction")  
results = mycursor.fetchall()  

for row in results:  
    print(row)  
```  
This retrieves all columns and rows from the `prediction` table and prints each row.  



In [19]:
# Fetch all rows in prediction table

mycursor.execute("SELECT * FROM prediction")
results = mycursor.fetchall()

for row in results:
  print(row)


(3, 51, 0, 22)
(6, 52, 0, 22)
(9, 42, 0, 28)
(12, 35, 0, 33)
(15, 42, 0, 34)
(18, 51, 0, 30)
(21, 50, 0, 29)
(24, 73, 0, 26)
(27, 65, 0, 24)
(30, 64, 0, 24)
(33, 53, 0, 30)
(36, 48, 0, 35)
(39, 49, 0, 36)
(42, 61, 0, 31)
(45, 73, 0, 29)
(48, 73, 0, 27)
(51, 79, 0, 26)
(54, 77, 0, 25)
(57, 60, 0, 31)
(60, 57, 0, 36)
(63, 51, 0, 36)
(66, 75, 0, 31)
(69, 80, 0, 28)
(72, 83, 0, 27)
(75, 82, 0, 26)
(78, 84, 0, 25)
(81, 70, 0, 31)
(84, 59, 0, 36)
(87, 60, 0, 37)
(90, 75, 0, 31)
(93, 82, 0, 28)
(96, 80, 0, 27)
(99, 77, 0, 26)
(102, 75, 0, 25)
(105, 51, 0, 31)
(108, 49, 0, 36)
(111, 50, 0, 38)
(114, 57, 0, 32)
(117, 72, 0, 29)
(120, 70, 0, 27)
(123, 75, 0, 26)
(126, 80, 0, 25)
(129, 59, 0, 31)
(132, 49, 0, 36)
(135, 50, 0, 37)
(138, 67, 0, 31)
(141, 77, 0, 28)
(144, 80, 0, 27)
(147, 79, 0, 26)
(150, 78, 0, 26)
(153, 71, 0, 31)
(156, 70, 0, 34)
(159, 67, 0, 35)
(162, 81, 0, 30)
(165, 83, 0, 28)
(168, 89, 0, 27)
(171, 85, 0, 26)
(174, 83, 0, 26)
(177, 72, 0, 31)
(180, 70, 0, 35)
(183, 67, 0, 35)

### 2. **Retrieve Specific Columns**

**`SELECT COLUMN1, COLUMN2, COLUMN3, ... FROM TABLE`** retrieves the values of `COLUMN1`, `COLUMN2`, `COLUMN3`, `...` from the `TABLE`.

```python
mycursor.execute("SELECT timepoint, temp2m FROM prediction")  
results = mycursor.fetchall()  

for row in results:  
    print(f"Time: {row[0]}, Temperature: {row[1]}")  
```  
This fetches only `timepoint` and `temp2m`, displaying them in a formatted way.  

In [20]:
# Select timepoint and temp2m columns
mycursor.execute("SELECT timepoint, temp2m FROM prediction")
results = mycursor.fetchall()

# Print each row which is a tuple (timepoint, temp2m)
for row in results:
  print(f"Time: {row[0]}, Temperature: {row[1]}")

Time: 3, Temperature: 22
Time: 6, Temperature: 22
Time: 9, Temperature: 28
Time: 12, Temperature: 33
Time: 15, Temperature: 34
Time: 18, Temperature: 30
Time: 21, Temperature: 29
Time: 24, Temperature: 26
Time: 27, Temperature: 24
Time: 30, Temperature: 24
Time: 33, Temperature: 30
Time: 36, Temperature: 35
Time: 39, Temperature: 36
Time: 42, Temperature: 31
Time: 45, Temperature: 29
Time: 48, Temperature: 27
Time: 51, Temperature: 26
Time: 54, Temperature: 25
Time: 57, Temperature: 31
Time: 60, Temperature: 36
Time: 63, Temperature: 36
Time: 66, Temperature: 31
Time: 69, Temperature: 28
Time: 72, Temperature: 27
Time: 75, Temperature: 26
Time: 78, Temperature: 25
Time: 81, Temperature: 31
Time: 84, Temperature: 36
Time: 87, Temperature: 37
Time: 90, Temperature: 31
Time: 93, Temperature: 28
Time: 96, Temperature: 27
Time: 99, Temperature: 26
Time: 102, Temperature: 25
Time: 105, Temperature: 31
Time: 108, Temperature: 36
Time: 111, Temperature: 38
Time: 114, Temperature: 32
Time: 117,

### 3. **Retrieve Data with a Condition (`WHERE`)**  

**`SELECT COLUMN1, COLUMN2, COLUMN3, ... FROM TABLE WHERE CONDITION`** retrieves the values of `COLUMN1`, `COLUMN2`, `COLUMN3`, `...` from the `TABLE`, for only rows that match the `CONDITION`.

```python
mycursor.execute("SELECT timepoint, temp2m FROM prediction WHERE temp2m > 25")  
results = mycursor.fetchall()  

for row in results:  
    print(f"Time: {row[0]}, Temp: {row[1]}")  
```  
This filters results to only include rows where `temp2m > 25`.  

In [21]:
# Select timepoint and temp2m columns, for rows with temp2m > 25

mycursor.execute("SELECT timepoint, temp2m FROM prediction WHERE temp2m > 25")
results = mycursor.fetchall()

# Print each row which is a tuple (timepoint, temp2m)
for row in results:
  print(row)


(9, 28)
(12, 33)
(15, 34)
(18, 30)
(21, 29)
(24, 26)
(33, 30)
(36, 35)
(39, 36)
(42, 31)
(45, 29)
(48, 27)
(51, 26)
(57, 31)
(60, 36)
(63, 36)
(66, 31)
(69, 28)
(72, 27)
(75, 26)
(81, 31)
(84, 36)
(87, 37)
(90, 31)
(93, 28)
(96, 27)
(99, 26)
(105, 31)
(108, 36)
(111, 38)
(114, 32)
(117, 29)
(120, 27)
(123, 26)
(129, 31)
(132, 36)
(135, 37)
(138, 31)
(141, 28)
(144, 27)
(147, 26)
(150, 26)
(153, 31)
(156, 34)
(159, 35)
(162, 30)
(165, 28)
(168, 27)
(171, 26)
(174, 26)
(177, 31)
(180, 35)
(183, 35)
(186, 30)
(189, 28)
(192, 27)
(9, 28)
(12, 33)
(15, 34)
(18, 30)
(21, 29)
(24, 26)
(33, 30)
(36, 35)
(39, 36)
(42, 31)
(45, 29)
(48, 27)
(51, 26)
(57, 31)
(60, 36)
(63, 36)
(66, 31)
(69, 28)
(72, 27)
(75, 26)
(81, 31)
(84, 36)
(87, 37)
(90, 31)
(93, 28)
(96, 27)
(99, 26)
(105, 31)
(108, 36)
(111, 38)
(114, 32)
(117, 29)
(120, 27)
(123, 26)
(129, 31)
(132, 36)
(135, 37)
(138, 31)
(141, 28)
(144, 27)
(147, 26)
(150, 26)
(153, 31)
(156, 34)
(159, 35)
(162, 30)
(165, 28)
(168, 27)
(171, 26)
(174, 

### 4. **Read Data Using `SELECT` with `ORDER BY`**

Example: Retrieve Data in a Specific Order  

```python
mycursor.execute("SELECT timepoint, temp2m FROM prediction ORDER BY temp2m ASC")  
results = mycursor.fetchall()  

for row in results:  
    print(f"Time: {row[0]}, Temp: {row[1]}")  
```  

The **ORDER BY** clause is used to sort the results:  

- **ORDER BY column_name**: Sorts the results based on a specific column.  
- **ASC (Ascending, default)**: Sorts from lowest to highest (e.g., `ORDER BY temp2m ASC`).  
- **DESC (Descending)**: Sorts from highest to lowest (e.g., `ORDER BY temp2m DESC`).  


In [22]:
# Select timepoint and temp2m columns order by temp2m
mycursor.execute("SELECT timepoint, temp2m FROM prediction ORDER BY temp2m ASC")
results = mycursor.fetchall()

for row in results:
    print(f"Time: {row[0]}, Temp: {row[1]}")


Time: 3, Temp: 22
Time: 6, Temp: 22
Time: 3, Temp: 22
Time: 6, Temp: 22
Time: 27, Temp: 24
Time: 30, Temp: 24
Time: 27, Temp: 24
Time: 30, Temp: 24
Time: 54, Temp: 25
Time: 78, Temp: 25
Time: 102, Temp: 25
Time: 126, Temp: 25
Time: 54, Temp: 25
Time: 78, Temp: 25
Time: 102, Temp: 25
Time: 126, Temp: 25
Time: 24, Temp: 26
Time: 51, Temp: 26
Time: 75, Temp: 26
Time: 99, Temp: 26
Time: 123, Temp: 26
Time: 147, Temp: 26
Time: 150, Temp: 26
Time: 171, Temp: 26
Time: 174, Temp: 26
Time: 24, Temp: 26
Time: 51, Temp: 26
Time: 75, Temp: 26
Time: 99, Temp: 26
Time: 123, Temp: 26
Time: 147, Temp: 26
Time: 150, Temp: 26
Time: 171, Temp: 26
Time: 174, Temp: 26
Time: 48, Temp: 27
Time: 72, Temp: 27
Time: 96, Temp: 27
Time: 120, Temp: 27
Time: 144, Temp: 27
Time: 168, Temp: 27
Time: 192, Temp: 27
Time: 48, Temp: 27
Time: 72, Temp: 27
Time: 96, Temp: 27
Time: 120, Temp: 27
Time: 144, Temp: 27
Time: 168, Temp: 27
Time: 192, Temp: 27
Time: 9, Temp: 28
Time: 69, Temp: 28
Time: 93, Temp: 28
Time: 141, Tem

In [23]:
# Observe current DataFrame for next exercise
df

| | timepoint | cloudcover | lifted_index | prec_type | prec_amount | temp2m | rh2m | wind10m | weather |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 3 | 6 | 10 | none | 0 | 22 | 51 | {'direction': 'NW', 'speed': 2} | mcloudynight |
| 1 | 6 | 6 | 10 | none | 0 | 22 | 52 | {'direction': 'NW', 'speed': 2} | mcloudyday |
| 2 | 9 | 2 | 10 | none | 0 | 28 | 42 | {'direction': 'W', 'speed': 2} | clearday |
| 3 | 12 | 2 | 6 | none | 0 | 33 | 35 | {'direction': 'N', 'speed': 2} | clearday |
| 4 | 15 | 2 | 6 | none | 0 | 34 | 42 | {'direction': 'NW', 'speed': 2} | clearday |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 59 | 180 | 1 | -4 | none | 0 | 35 | 70 | {'direction': 'S', 'speed': 3} | clearday |
| 60 | 183 | 1 | -4 | none | 0 | 35 | 67 | {'direction': 'S', 'speed': 3} | clearday |
| 61 | 186 | 1 | -4 | none | 0 | 30 | 85 | {'direction': 'S', 'speed': 3} | clearnight |
| 62 | 189 | 1 | -4 | none | 0 | 28 | 85 | {'direction': 'S', 'speed': 3} | clearnight |


## SELECT from SELECT

In [24]:
# Observe the starting timepoint received from the weather API
# It is stored under the key 'init' of the returned JSON
data['init']

'2025021018'

MySQL functions can be used inside SQL statements.

**`STR_TO_DATE(str, format)`** converts a string into a `DATE` or `DATETIME` value according to the given format

**`DATE_ADD(date, INTERVAL value unit)`** adds the specified time interval (days, months, years, hours, minutes, seconds, etc.) to a date or datetime value
*   `date`: Starting date or datetime value to which the interval is added
*   `value`: Amount of time to add
*   `unit`: Unit of time, such as DAY, MONTH, YEAR, HOUR, etc.
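
The date arithmetic can be checked in plain Python before writing it in SQL. This sketch mirrors the pattern `STR_TO_DATE` followed by two `DATE_ADD` calls: parse the `init` string, add a fixed 7-hour shift, then add `timepoint` hours. The function name `forecast_time` is hypothetical, and the 7-hour shift is the same constant used in the query in the next cell.

```python
from datetime import datetime, timedelta

SHIFT_HOURS = 7  # same fixed shift as INTERVAL 7 HOUR in the query


def forecast_time(init, timepoint, shift_hours=SHIFT_HOURS):
    """Mirror DATE_ADD(DATE_ADD(STR_TO_DATE(init, '%Y%m%d%H'), 7 HOUR), timepoint HOUR)."""
    start = datetime.strptime(init, "%Y%m%d%H")  # counterpart of STR_TO_DATE
    return start + timedelta(hours=shift_hours) + timedelta(hours=timepoint)


print(forecast_time("2025021018", 3))
```

For `init = '2025021018'` and `timepoint = 3` this gives 11 February 2025 at 04:00, which matches the first `dt` value returned by the SQL query in this notebook.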




In [25]:
# Create the SELECT statement
# Convert 'init' to a DATETIME, add 7 hours (a shift to local time, assuming 'init' is in UTC),
# then add 'timepoint' hours; call the resulting field 'dt'
sql = f'''
select DATE_ADD(DATE_ADD(STR_TO_DATE({data['init']},'%Y%m%d%H'),INTERVAL 7 HOUR),INTERVAL timepoint HOUR) AS dt, rh2m, prec_amount, temp2m
FROM prediction ORDER BY dt ASC
'''

In [26]:
# Execute the sql command
mycursor.execute(sql)
results = mycursor.fetchall()

# Print the returned tuples of dt, rh2m, prec_amount, temp2m
for row in results:
  print(row)

(datetime.datetime(2025, 2, 11, 4, 0), 51, 0, 22)
(datetime.datetime(2025, 2, 11, 4, 0), 51, 0, 22)
(datetime.datetime(2025, 2, 11, 7, 0), 52, 0, 22)
(datetime.datetime(2025, 2, 11, 7, 0), 52, 0, 22)
(datetime.datetime(2025, 2, 11, 10, 0), 42, 0, 28)
(datetime.datetime(2025, 2, 11, 10, 0), 42, 0, 28)
(datetime.datetime(2025, 2, 11, 13, 0), 35, 0, 33)
(datetime.datetime(2025, 2, 11, 13, 0), 35, 0, 33)
(datetime.datetime(2025, 2, 11, 16, 0), 42, 0, 34)
(datetime.datetime(2025, 2, 11, 16, 0), 42, 0, 34)
(datetime.datetime(2025, 2, 11, 19, 0), 51, 0, 30)
(datetime.datetime(2025, 2, 11, 19, 0), 51, 0, 30)
(datetime.datetime(2025, 2, 11, 22, 0), 50, 0, 29)
(datetime.datetime(2025, 2, 11, 22, 0), 50, 0, 29)
(datetime.datetime(2025, 2, 12, 1, 0), 73, 0, 26)
(datetime.datetime(2025, 2, 12, 1, 0), 73, 0, 26)
(datetime.datetime(2025, 2, 12, 4, 0), 65, 0, 24)
(datetime.datetime(2025, 2, 12, 4, 0), 65, 0, 24)
(datetime.datetime(2025, 2, 12, 7, 0), 64, 0, 24)
(datetime.datetime(2025, 2, 12, 7, 0), 6

A nested SELECT can be performed by placing a subquery in different parts of the SELECT statement, for example in the `FROM` clause.

The SELECT statement above returns the fields `dt` (as a `DATETIME` value), `rh2m`, `prec_amount`, and `temp2m`. This multi-column result can in turn be the source for an outer query.
*   The subquery gives `dt` (as a `DATETIME` value), `rh2m`, `prec_amount`, `temp2m`
*   The outer query then selects `dt`, `HOUR(dt)`, `rh2m`, `prec_amount`, `temp2m`
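
The inner/outer structure can be tried on a small scale without a MySQL server. The sketch below is a simplified illustration, not the notebook's query: `sqlite3` stands in for MySQL, and because SQLite does not provide `STR_TO_DATE` or `DATE_ADD`, the inner query uses plain integer arithmetic (25 = init hour 18 plus the 7-hour shift) to derive a column that the outer query then reduces to an hour of day.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
cursor = connection.cursor()
cursor.execute("CREATE TABLE prediction (timepoint INT, temp2m INT)")
cursor.executemany(
    "INSERT INTO prediction VALUES (?, ?)",
    [(3, 22), (6, 22), (24, 26)],
)

# Inner query: derive a new column from each row.
inner = "SELECT 25 + timepoint AS hours_from_midnight, temp2m FROM prediction"

# Outer query: treat the inner result as a table named 'shifted'.
outer = (
    "SELECT hours_from_midnight % 24 AS hour, temp2m "
    f"FROM ({inner}) AS shifted "
    "ORDER BY hours_from_midnight"
)

hours = cursor.execute(outer).fetchall()
print(hours)
connection.close()
```

The alias after the closing parenthesis (`AS shifted`) names the derived table; MySQL requires such an alias for every subquery in a `FROM` clause.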



In [27]:
# Create a query that contains subquery
sql_all=f'SELECT dt, HOUR(dt) as hour, rh2m, prec_amount, temp2m FROM ({sql}) AS datetime_weather'
print(sql_all)

SELECT dt, HOUR(dt) as hour, rh2m, prec_amount, temp2m FROM (
select DATE_ADD(DATE_ADD(STR_TO_DATE(2025021018,'%Y%m%d%H'),INTERVAL 7 HOUR),INTERVAL timepoint HOUR) AS dt, rh2m, prec_amount, temp2m
FROM prediction ORDER BY dt ASC
) AS datetime_weather


In [28]:
# Execute the query
mycursor.execute(sql_all)
results = mycursor.fetchall()

# Print the returned tuples of dt, hour, rh2m, prec_amount, temp2m
for row in results:
  print(row)


(datetime.datetime(2025, 2, 11, 4, 0), 4, 51, 0, 22)
(datetime.datetime(2025, 2, 11, 4, 0), 4, 51, 0, 22)
(datetime.datetime(2025, 2, 11, 7, 0), 7, 52, 0, 22)
(datetime.datetime(2025, 2, 11, 7, 0), 7, 52, 0, 22)
(datetime.datetime(2025, 2, 11, 10, 0), 10, 42, 0, 28)
(datetime.datetime(2025, 2, 11, 10, 0), 10, 42, 0, 28)
(datetime.datetime(2025, 2, 11, 13, 0), 13, 35, 0, 33)
(datetime.datetime(2025, 2, 11, 13, 0), 13, 35, 0, 33)
(datetime.datetime(2025, 2, 11, 16, 0), 16, 42, 0, 34)
(datetime.datetime(2025, 2, 11, 16, 0), 16, 42, 0, 34)
(datetime.datetime(2025, 2, 11, 19, 0), 19, 51, 0, 30)
(datetime.datetime(2025, 2, 11, 19, 0), 19, 51, 0, 30)
(datetime.datetime(2025, 2, 11, 22, 0), 22, 50, 0, 29)
(datetime.datetime(2025, 2, 11, 22, 0), 22, 50, 0, 29)
(datetime.datetime(2025, 2, 12, 1, 0), 1, 73, 0, 26)
(datetime.datetime(2025, 2, 12, 1, 0), 1, 73, 0, 26)
(datetime.datetime(2025, 2, 12, 4, 0), 4, 65, 0, 24)
(datetime.datetime(2025, 2, 12, 4, 0), 4, 65, 0, 24)
(datetime.datetime(2025, 2

## GROUP BY

### Find AVG temp2m

The **`AVG()`** function in SQL is an aggregate function that calculates the average value of a numeric column.

In [29]:
# Create our subquery
sql=f'''SELECT DATE_ADD(DATE_ADD(STR_TO_DATE({data['init']},'%Y%m%d%H'),INTERVAL 7 HOUR),INTERVAL timepoint HOUR) AS dt, rh2m, prec_amount, temp2m
FROM prediction ORDER BY dt ASC'''

**`GROUP BY`** clause groups rows that have the same values in specified columns into summary rows, allowing the calculation of aggregate values such as `COUNT()`, `AVG()`, `SUM()`, `MIN()`, or `MAX()` for each group.
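
What `GROUP BY` with `AVG()` computes can be written out by hand in plain Python: first collect the rows that share a key, then reduce each group to one value. The pairs below are illustrative `(hour, temp2m)` values taken from rows shown in this notebook, not the full data set.

```python
from collections import defaultdict
from statistics import mean

# (hour, temp2m) pairs.
readings = [(4, 22), (7, 22), (4, 24), (7, 24), (4, 26), (4, 26)]

# Step 1: collect rows that share the same key (what GROUP BY does).
groups = defaultdict(list)
for hour, temp in readings:
    groups[hour].append(temp)

# Step 2: reduce each group to one summary value (what AVG does).
average_by_hour = {hour: mean(temps) for hour, temps in groups.items()}
print(average_by_hour)
```

Each key appears once in the result, just as each distinct `HOUR(dt)` value produces one row in the grouped SQL query.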

In [30]:
# Create query with group by
sql_all=f'''SELECT HOUR(dt), AVG(temp2m)
FROM ({sql}) AS datetime_weather
GROUP BY HOUR(dt)'''
print(sql_all)

SELECT HOUR(dt), AVG(temp2m)
FROM (SELECT DATE_ADD(DATE_ADD(STR_TO_DATE(2025021018,'%Y%m%d%H'),INTERVAL 7 HOUR),INTERVAL timepoint HOUR) AS dt, rh2m, prec_amount, temp2m
FROM prediction ORDER BY dt ASC) AS datetime_weather
GROUP BY HOUR(dt)


In [31]:
# Execute sql_all mysql command
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "{sql_all}"

+----------+-------------+
| HOUR(dt) | AVG(temp2m) |
+----------+-------------+
|        4 |     25.2500 |
|        7 |     24.7500 |
|       10 |     30.5000 |
|       13 |     35.1250 |
|       16 |     36.0000 |
|       19 |     30.7500 |
|       22 |     28.3750 |
|        1 |     26.8750 |
+----------+-------------+


In [32]:
# Execute the query
mycursor.execute(sql_all)
results = mycursor.fetchall()

# Print the returned tuples of hour and avg(temp2m)
for row in results:
  print(row)


(4, Decimal('25.2500'))
(7, Decimal('24.7500'))
(10, Decimal('30.5000'))
(13, Decimal('35.1250'))
(16, Decimal('36.0000'))
(19, Decimal('30.7500'))
(22, Decimal('28.3750'))
(1, Decimal('26.8750'))


## **Update Data**  

Example: Modify Existing Records  

```python
mycursor.execute("UPDATE prediction SET temp2m = 25 WHERE timepoint = 10")  
mydb.commit()  
print(f"{mycursor.rowcount} record(s) updated.")  
```  

The **UPDATE** statement is used to modify existing records in a table:  

- **UPDATE table_name**: Specifies which table to update.  
- **SET column_name = value**: Defines the new value for the specified column.  
- **WHERE condition**: Ensures only specific records are updated (without `WHERE`, all records will be updated).  

To update multiple columns:  

```python
mycursor.execute("UPDATE prediction SET temp2m = 25, rh2m = 60 WHERE timepoint = 10")  
mydb.commit()  
```  

This updates both `temp2m` and `rh2m` where `timepoint = 10`. Always use `WHERE` to avoid modifying all rows.
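
The new value and the filter can also be passed as parameters instead of being written into the SQL text. The sketch below is a hedged illustration: `sqlite3` stands in for the MySQL connection (placeholder `?` instead of `%s`), and the table holds a duplicated row, as can happen when an insert cell is run twice, to show that `rowcount` reports every affected row.

```python
import sqlite3

connection = sqlite3.connect(":memory:")  # stand-in for the MySQL connection
cursor = connection.cursor()
cursor.execute("CREATE TABLE prediction (timepoint INT, temp2m INT)")
# timepoint 3 appears twice on purpose.
cursor.executemany(
    "INSERT INTO prediction VALUES (?, ?)",
    [(3, 22), (3, 22), (6, 22)],
)

# New value and filter are passed as parameters, not pasted into the SQL text.
cursor.execute("UPDATE prediction SET temp2m = ? WHERE timepoint = ?", (10, 3))
connection.commit()
updated = cursor.rowcount  # number of rows affected by the UPDATE

remaining = cursor.execute(
    "SELECT timepoint, temp2m FROM prediction ORDER BY timepoint"
).fetchall()
print(updated, remaining)
connection.close()
```

Both rows with `timepoint = 3` change, while the row with `timepoint = 6` keeps its value, which is the effect of the `WHERE` clause.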

In [33]:
# Select original data from prediction table
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SELECT * FROM prediction"

+-----------+------+-------------+--------+
| timepoint | rh2m | prec_amount | temp2m |
+-----------+------+-------------+--------+
|         3 |   51 |           0 |     22 |
|         6 |   52 |           0 |     22 |
|         9 |   42 |           0 |     28 |
|        12 |   35 |           0 |     33 |
|        15 |   42 |           0 |     34 |
|        18 |   51 |           0 |     30 |
|        21 |   50 |           0 |     29 |
|        24 |   73 |           0 |     26 |
|        27 |   65 |           0 |     24 |
|        30 |   64 |           0 |     24 |
|        33 |   53 |           0 |     30 |
|        36 |   48 |           0 |     35 |
|        39 |   49 |           0 |     36 |
|        42 |   61 |           0 |     31 |
|        45 |   73 |           0 |     29 |
|        48 |   73 |           0 |     27 |
|        51 |   79 |           0 |     26 |
|        54 |   77 |           0 |     25 |
|        57 |   60 |           0 |     31 |
|        60 |   57 |           0

In [34]:
# Update temp2m to 10 for rows where timepoint is 3
mycursor.execute("UPDATE prediction SET temp2m = 10 WHERE timepoint = 3")
mydb.commit()

print(f"{mycursor.rowcount} record(s) updated.")

2 record(s) updated.


In [35]:
# Check the result
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SELECT * FROM prediction WHERE timepoint = 3"

+-----------+------+-------------+--------+
| timepoint | rh2m | prec_amount | temp2m |
+-----------+------+-------------+--------+
|         3 |   51 |           0 |     10 |
|         3 |   51 |           0 |     10 |
+-----------+------+-------------+--------+


## **Delete Data**  



### **1. Using 'DELETE'**
Example: Remove Records from a Table  

```python
mycursor.execute("DELETE FROM prediction WHERE timepoint <= 20")  
mydb.commit()  
print(f"{mycursor.rowcount} record(s) deleted.")  
```  

The **DELETE** statement is used to remove records from a table:  

- **DELETE FROM table_name**: Specifies which table to delete records from.  
- **WHERE condition**: Filters the records to be deleted (without `WHERE`, all records will be deleted).  

To delete all records in a table (use with caution!):  

```python
mycursor.execute("DELETE FROM prediction")  
mydb.commit()  
```  

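This deletes all rows in the `prediction` table but leaves the table structure intact. Always double-check the `WHERE` clause before committing a **DELETE**: on a transactional engine such as InnoDB, and with autocommit off (the default in `mysql.connector`), the deletion can still be undone with `mydb.rollback()` until `mydb.commit()` is called.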
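
As a rough sketch of a more defensive delete, the helper below passes the threshold as a bound parameter using the `%s` placeholder style of `mysql.connector`, counts the matching rows first, and commits only if the delete removed exactly that many rows. `delete_rows_up_to` is a hypothetical name, not part of the notebook; the sketch assumes `conn` and `cursor` are the connection and cursor created earlier (`mydb` and `mycursor`) and that autocommit is off.

```python
def delete_rows_up_to(conn, cursor, max_timepoint):
    """Delete rows with timepoint <= max_timepoint; return how many were removed.

    Hypothetical helper for illustration; `conn` and `cursor` are assumed to be
    a mysql.connector connection and cursor such as `mydb` and `mycursor`.
    """
    params = (max_timepoint,)

    # Preview: how many rows match the condition?
    cursor.execute("SELECT COUNT(*) FROM prediction WHERE timepoint <= %s", params)
    (expected,) = cursor.fetchone()

    # The value is bound by the driver instead of being pasted into the SQL text
    cursor.execute("DELETE FROM prediction WHERE timepoint <= %s", params)
    deleted = cursor.rowcount

    if deleted != expected:
        # Something changed between the preview and the delete: undo it
        conn.rollback()
        raise RuntimeError(f"expected {expected} rows, deleted {deleted}")

    conn.commit()
    return deleted
```

With the notebook's objects this would be called as `delete_rows_up_to(mydb, mycursor, 20)`.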

In [36]:
# Delete rows from the prediction table where timepoint <= 20
mycursor.execute("DELETE FROM prediction WHERE timepoint <= 20")
mydb.commit()

print(f"{mycursor.rowcount} record(s) deleted.")

12 record(s) deleted.


In [37]:
# Check the result
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SELECT * FROM prediction WHERE timepoint <= 40"

+-----------+------+-------------+--------+
| timepoint | rh2m | prec_amount | temp2m |
+-----------+------+-------------+--------+
|        21 |   50 |           0 |     29 |
|        24 |   73 |           0 |     26 |
|        27 |   65 |           0 |     24 |
|        30 |   64 |           0 |     24 |
|        33 |   53 |           0 |     30 |
|        36 |   48 |           0 |     35 |
|        39 |   49 |           0 |     36 |
|        21 |   50 |           0 |     29 |
|        24 |   73 |           0 |     26 |
|        27 |   65 |           0 |     24 |
|        30 |   64 |           0 |     24 |
|        33 |   53 |           0 |     30 |
|        36 |   48 |           0 |     35 |
|        39 |   49 |           0 |     36 |
+-----------+------+-------------+--------+


### **2. Truncate Data Using `TRUNCATE`**  

Example: Remove All Records from a Table  

```python
mycursor.execute("TRUNCATE TABLE prediction")  
mydb.commit()  
print("All records in 'prediction' have been deleted.")  
```  

The **TRUNCATE TABLE** statement removes all rows from a table. In MySQL it drops and re-creates the table instead of deleting rows one by one, which is why it is usually faster than `DELETE` on large tables:  

- **TRUNCATE TABLE table_name**: Removes all records from the specified table.  
- **No WHERE clause**: Unlike `DELETE`, `TRUNCATE` does not allow filtering; it clears all rows in the table.  

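**Important**:  
- **TRUNCATE** is more efficient than `DELETE` for removing all rows, but in MySQL it causes an implicit commit and therefore **cannot be rolled back**, so use it carefully. The `mydb.commit()` call after it is harmless but not required.
- It **does not fire `ON DELETE` triggers**, whereas `DELETE` does.  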

For clearing large tables, **TRUNCATE** is often preferred because it is faster.
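
One way to make the choice between the two statements explicit is a small wrapper like the sketch below. `clear_table` and `ALLOWED_TABLES` are hypothetical names introduced for illustration. Because a table name cannot be passed as a bound parameter, the sketch checks it against an allow-list before building the SQL text.

```python
ALLOWED_TABLES = {"prediction"}  # hypothetical allow-list of tables that may be cleared


def clear_table(cursor, table, keep_rollback=False):
    """Remove every row from `table` and return the statement that was run."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"refusing to clear unknown table: {table!r}")

    if keep_rollback:
        # DELETE stays inside the current transaction until the caller
        # commits or rolls back
        sql = f"DELETE FROM `{table}`"
    else:
        # TRUNCATE is faster on large tables but commits implicitly in MySQL
        sql = f"TRUNCATE TABLE `{table}`"

    cursor.execute(sql)
    return sql
```

Calling `clear_table(mycursor, "prediction", keep_rollback=True)` would leave the decision to commit or roll back with the caller, while the default path behaves like the `TRUNCATE` cell in this notebook.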

In [38]:
# Truncate table prediction
mycursor.execute("TRUNCATE TABLE prediction")
mydb.commit()

print("All records in 'prediction' have been deleted.")

All records in 'prediction' have been deleted.


In [39]:
# Check the result
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SELECT COUNT(*) FROM prediction"

+----------+
| COUNT(*) |
+----------+
|        0 |
+----------+


### **3. Delete a Table Using `DROP`**  

Example: Remove a Table from a Database  

```python
mycursor.execute("DROP TABLE IF EXISTS prediction")  
mydb.commit()  
print("Table 'prediction' has been deleted.")  
```  

The **DROP TABLE** statement is used to delete a table from the database entirely:  

- **DROP TABLE table_name**: Deletes the specified table and all its data.  
- **IF EXISTS**: Suppresses the error that would otherwise be raised if the table does not exist; MySQL issues a note instead.  

**Important**:  
- **`DROP` is irreversible**: `DROP TABLE` causes an implicit commit, so a dropped table **cannot be recovered** with a rollback (unless you have a backup). By contrast, a `DELETE` that has not been committed yet can still be rolled back.
- **Does not take a `WHERE` clause**: `DROP TABLE` always removes the entire table, both its structure and all of its rows.
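
Since a dropped table can only be restored from a backup, one option is to copy it first. The sketch below is an illustration under assumptions rather than part of the notebook: `drop_with_backup` and the `prediction_backup` table name are hypothetical, and the connected user is assumed to have permission to create tables in `weather_db`.

```python
def drop_with_backup(conn, cursor, table="prediction", backup="prediction_backup"):
    """Copy `table` into `backup`, then drop `table`. Names are illustrative only."""
    # CREATE TABLE ... LIKE copies the column definitions and indexes, but no rows
    cursor.execute(f"CREATE TABLE `{backup}` LIKE `{table}`")

    # Copy the rows, then commit so the copy is durable before the drop
    cursor.execute(f"INSERT INTO `{backup}` SELECT * FROM `{table}`")
    copied = cursor.rowcount
    conn.commit()

    cursor.execute(f"DROP TABLE IF EXISTS `{table}`")
    return copied
```

The return value is the number of rows copied into the backup table, which can be compared against a `SELECT COUNT(*)` taken beforehand.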

In [40]:
# Show tables
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SHOW TABLES"

+----------------------+
| Tables_in_weather_db |
+----------------------+
| prediction           |
+----------------------+


In [41]:
# Drop table prediction
mycursor.execute("DROP TABLE IF EXISTS prediction")
mydb.commit()

print("Table 'prediction' has been deleted.")

Table 'prediction' has been deleted.


In [42]:
# Show tables to check the result
!mysql -u test_user -ptest_p@ssw0rd weather_db -e "SHOW TABLES"

