# MySQL Database Connector

Before running this notebook, make sure the MySQL Python connector is installed (see "Installing MySQL Connector") and that the environment variables needed to connect to the MySQL TPC-H Docker image are set (see "Loading credentials and connecting to MySQL").

## Installing MySQL Connector

Make sure the `mysql-connector-python` package is installed:

- If you're working inside the repo:
    ```shell
    pip install -e ".[mysql]"
    ```

- Or install the connector directly with:
    ```shell
    pip install "mysql-connector-python"
    ```
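Before moving on, it can help to confirm that the connector is importable from the kernel the notebook is using. The snippet below is a minimal sketch: `connector_available` is a hypothetical helper name, not part of PyDough, and it only checks that the `mysql.connector` module can be found, not that a database is reachable.

```python
import importlib.util


def connector_available() -> bool:
    """Return True if the `mysql.connector` module can be located."""
    try:
        return importlib.util.find_spec("mysql.connector") is not None
    except ModuleNotFoundError:
        # Raised when the parent `mysql` package itself is not installed.
        return False


print("mysql-connector-python available:", connector_available())
```

If this prints `False`, the package was most likely installed into a different Python environment than the one backing the notebook kernel.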

## Importing Required Libraries

In [9]:
import pydough
import datetime
import os

## Loading credentials and connecting to MySQL

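1. Load credentials from a local `.env` file
    * The `.env` file contains your MySQL login details: `MYSQL_USERNAME`, `MYSQL_PASSWORD`, `MYSQL_DB` and `MYSQL_HOST`.
    * These are read with the `os.getenv()` function. Note that `os.getenv()` only sees variables already present in the process environment, so the contents of the `.env` file must be loaded or exported before the cell below runs.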

2. Connect to MySQL using PyDough
    * `pydough.active_session.load_metadata_graph(...)` loads a metadata graph that maps your MySQL schema (used for query planning or optimizations).
    * `connect_database(...)` uses the loaded credentials to establish a live connection to your MySQL database.

Note:
- Make sure the `.env` file exists and contains all the required keys.
- Make sure the metadata graph path points to a valid JSON file that represents your schema.
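The credential checks described above can be sketched with the standard library alone. This is an illustrative example, not part of PyDough: `load_env_file` and `missing_keys` are hypothetical helpers, the parser handles only simple `KEY=VALUE` lines, and the `.env` path is assumed to sit next to the notebook. A dedicated package such as `python-dotenv` is a common alternative for real projects.

```python
import os

# The four keys this notebook expects, as listed above.
REQUIRED_KEYS = ("MYSQL_USERNAME", "MYSQL_PASSWORD", "MYSQL_DB", "MYSQL_HOST")


def load_env_file(path: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    with open(path, encoding="utf-8") as handle:
        for raw in handle:
            line = raw.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop surrounding whitespace and optional quotes around the value.
            values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(env) -> list[str]:
    """Return the required keys that are absent or empty in `env`."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


# Assumed location of the file; adjust to wherever your .env lives.
if os.path.exists(".env"):
    os.environ.update(load_env_file(".env"))

missing = missing_keys(os.environ)
if missing:
    print("Missing MySQL settings:", ", ".join(missing))
```

Running a check like this before connecting turns a silently missing variable (where `os.getenv()` returns `None`) into an explicit message.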


In [2]:
mysql_username = os.getenv("MYSQL_USERNAME")
mysql_password = os.getenv("MYSQL_PASSWORD")
mysql_tpch_db = os.getenv("MYSQL_DB")
mysql_host = os.getenv("MYSQL_HOST")

pydough.active_session.load_metadata_graph("../../tests/test_metadata/sample_graphs.json", "TPCH")
pydough.active_session.connect_database(
    "mysql",
    user=mysql_username,
    password=mysql_password,
    database=mysql_tpch_db,
    host=mysql_host,
)

DatabaseContext(connection=<pydough.database_connectors.database_connector.DatabaseConnection object at 0x121296710>, dialect=<DatabaseDialect.MYSQL: 'mysql'>)

## Enabling PyDough's Jupyter Magic Commands

The cell below loads the `pydough.jupyter_extensions` module, which adds custom magic commands (like `%%pydough`) to the notebook.

These magic commands allow you to:

- Write PyDough directly in notebook cells using `%%pydough`
- Automatically render results

This is a Jupyter-specific feature: the `%load_ext` command dynamically loads these extensions into your current notebook session.

In [3]:
%load_ext pydough.jupyter_extensions

## TPC-H Query 1 with PyDough and MySQL

This cell runs TPC-H Query 1 using PyDough's Python-style DSL instead of raw SQL.

The query computes summary statistics (sums, averages, and counts) over line items, grouped by return flag and line status, and filtered by a shipping date cutoff.

Finally, `pydough.to_df(output)` executes the query and returns the result as a pandas DataFrame for easy inspection and analysis in Python.

In [10]:
%%pydough
# TPCH Q1
output = (lines.WHERE((ship_date <= datetime.date(1998, 12, 1)))
        .PARTITION(name="groups", by=(return_flag, status))
        .CALCULATE(
            L_RETURNFLAG=return_flag,
            L_LINESTATUS=status,
            SUM_QTY=SUM(lines.quantity),
            SUM_BASE_PRICE=SUM(lines.extended_price),
            SUM_DISC_PRICE=SUM(lines.extended_price * (1 - lines.discount)),
            SUM_CHARGE=SUM(
                lines.extended_price * (1 - lines.discount) * (1 + lines.tax)
            ),
            AVG_QTY=AVG(lines.quantity),
            AVG_PRICE=AVG(lines.extended_price),
            AVG_DISC=AVG(lines.discount),
            COUNT_ORDER=COUNT(lines),
        )
        .ORDER_BY(L_RETURNFLAG.ASC(), L_LINESTATUS.ASC())
)

pydough.to_df(output)

|   | L_RETURNFLAG | L_LINESTATUS | SUM_QTY | SUM_BASE_PRICE | SUM_DISC_PRICE | SUM_CHARGE | AVG_QTY | AVG_PRICE | AVG_DISC | COUNT_ORDER |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A | F | 37734107.0 | 56586554400.73 | 53758257134.87 | 55909065222.82769 | 25.522006 | 38273.129735 | 0.049985 | 1478493 |
| 1 | N | F | 991417.0 | 1487504710.38 | 1413082168.0541 | 1469649223.194375 | 25.516472 | 38284.467761 | 0.050093 | 38854 |
| 2 | N | O | 76633518.0 | 114935210409.19 | 109189591897.472 | 113561024263.01378 | 25.50202 | 38248.015609 | 0.05 | 3004998 |
| 3 | R | F | 37719753.0 | 56568041380.9 | 53741292684.604 | 55889619119.831924 | 25.505794 | 38250.854626 | 0.050009 | 1478870 |
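
To make the aggregates in this result concrete, the sketch below recomputes the same columns in plain Python over a handful of made-up line items. It does not use PyDough or MySQL. The `LineItem` type, the `summarize` helper, and the sample rows are all hypothetical and exist only to illustrate the filter, grouping, and arithmetic of the query.

```python
import datetime
from collections import defaultdict
from typing import NamedTuple


class LineItem(NamedTuple):
    return_flag: str
    status: str
    quantity: float
    extended_price: float
    discount: float
    tax: float
    ship_date: datetime.date


def summarize(items, cutoff):
    """Filter by ship date, group by (return_flag, status), then aggregate."""
    groups = defaultdict(list)
    for item in items:
        if item.ship_date <= cutoff:  # mirrors the WHERE clause
            groups[(item.return_flag, item.status)].append(item)

    result = []
    for (flag, status), rows in sorted(groups.items()):  # mirrors ORDER_BY
        n = len(rows)
        result.append({
            "L_RETURNFLAG": flag,
            "L_LINESTATUS": status,
            "SUM_QTY": sum(r.quantity for r in rows),
            "SUM_BASE_PRICE": sum(r.extended_price for r in rows),
            "SUM_DISC_PRICE": sum(r.extended_price * (1 - r.discount) for r in rows),
            "SUM_CHARGE": sum(
                r.extended_price * (1 - r.discount) * (1 + r.tax) for r in rows
            ),
            "AVG_QTY": sum(r.quantity for r in rows) / n,
            "AVG_PRICE": sum(r.extended_price for r in rows) / n,
            "AVG_DISC": sum(r.discount for r in rows) / n,
            "COUNT_ORDER": n,
        })
    return result


# Made-up rows; the last one ships after the cutoff and is filtered out.
sample = [
    LineItem("A", "F", 10, 100.0, 0.10, 0.05, datetime.date(1995, 1, 1)),
    LineItem("A", "F", 20, 200.0, 0.00, 0.10, datetime.date(1996, 6, 1)),
    LineItem("N", "O", 5, 50.0, 0.20, 0.00, datetime.date(1998, 11, 30)),
    LineItem("N", "O", 7, 70.0, 0.00, 0.00, datetime.date(1998, 12, 2)),
]

for row in summarize(sample, datetime.date(1998, 12, 1)):
    print(row)
```

The same relationships hold in the real output: for example, `AVG_QTY` in each row equals `SUM_QTY` divided by `COUNT_ORDER`.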
