# Step 4 - Transaction Table
The third table in our database, `trading.transactions`, records each BUY or SELL transaction for a specific ticker performed by each member.

## View The Data
You can inspect the 10 most recent transactions for `member_id = 'c4ca42'` (do you remember who that is?).

In [43]:
import pandas as pd
import mysql.connector as sql
import os

In [42]:
connection = sql.connect(
    host = os.environ.get('mysql_host'),
    user = os.environ.get('mysql_user'),
    password = os.environ.get('mysql_password')
)

In [3]:
pd.read_sql_query("""
    SELECT *
    FROM trading.transactions
    WHERE member_id = 'c4ca42'
    ORDER BY txn_time DESC
    LIMIT 10;
    """,
    connection
    )

Unnamed: 0,txn_id,member_id,ticker,txn_date,txn_time,txn_type,quantity,percentage_fee
0,22905,c4ca42,ETH,2021-08-27,2021-08-27 07:24:00,BUY,2.67788,0.3
1,22896,c4ca42,ETH,2021-08-26,2021-08-26 09:11:52,BUY,7.92003,0.28
2,22836,c4ca42,BTC,2021-08-21,2021-08-21 07:02:18,SELL,1.58242,0.3
3,22831,c4ca42,ETH,2021-08-21,2021-08-21 01:39:06,BUY,3.76509,0.1
4,22823,c4ca42,BTC,2021-08-20,2021-08-20 02:42:02,BUY,5.3231,0.3
5,22818,c4ca42,BTC,2021-08-19,2021-08-19 20:53:21,BUY,5.02113,0.23
6,22805,c4ca42,BTC,2021-08-19,2021-08-19 04:31:19,BUY,1.46109,0.3
7,22793,c4ca42,BTC,2021-08-18,2021-08-18 18:25:31,BUY,6.94422,0.19
8,22792,c4ca42,BTC,2021-08-18,2021-08-18 17:27:52,BUY,7.94524,0.03
9,22780,c4ca42,ETH,2021-08-18,2021-08-18 01:00:39,BUY,1.6855,0.3


## Data Dictionary
| Column Name | Description |
|---|---|
| txn_id | unique ID for each transaction |
| member_id | member identifier for each trade |
| ticker | the ticker for each trade |
| txn_date | the date for each transaction |
| txn_type | either BUY or SELL |
| quantity | the total quantity for each trade |
| percentage_fee | % of total amount charged as fees |
| txn_time | the timestamp for each trade |

## Transactions Questions
Let's finish our initial data exploration with a few more questions for the `trading.transactions` table!

### Question 1
How many records are there in the `trading.transactions` table?

In [5]:
pd.read_sql_query(
    """
    SELECT 
        COUNT(*) AS record_count 
    FROM trading.transactions;
    """,
    connection
)

Unnamed: 0,record_count
0,22918


### Question 2
How many unique transactions are there?

In [9]:
pd.read_sql_query(
    """
    SELECT
        COUNT(DISTINCT txn_id) AS unique_transactions
    FROM trading.transactions;
    """,
    connection
)

Unnamed: 0,unique_transactions
0,22918
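
Both counts come back as 22918, which suggests that `txn_id` has no duplicates in this table. As a rough illustration of how the two counts would diverge if duplicates did exist, here is a minimal sketch using Python's built-in `sqlite3` module and a handful of made-up rows (the toy table and its values are assumptions, not course data):

```python
import sqlite3

# Hypothetical toy rows (not from the course dataset);
# txn_id 2 appears twice on purpose.
rows = [(1, "BTC"), (2, "ETH"), (2, "ETH"), (3, "BTC")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (txn_id INTEGER, ticker TEXT)")
conn.executemany("INSERT INTO transactions VALUES (?, ?)", rows)

# COUNT(*) counts every row, COUNT(DISTINCT txn_id) counts each ID once.
record_count, unique_transactions = conn.execute(
    """
    SELECT
        COUNT(*),
        COUNT(DISTINCT txn_id)
    FROM transactions;
    """
).fetchone()
conn.close()

print(record_count, unique_transactions)
```

If the two numbers differ, the gap is the number of extra rows that share a `txn_id` with another row.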


### Question 3
How many buy and sell transactions are there for Bitcoin?

In [11]:
pd.read_sql_query(
    """
    SELECT
        txn_type,
        COUNT(*) AS transactions_count
    FROM trading.transactions
    WHERE ticker = 'BTC'
    GROUP BY txn_type;
    """,
    connection
)

Unnamed: 0,txn_type,transactions_count
0,BUY,10440
1,SELL,2044


### Question 4
For each year, calculate the following buy and sell metrics for Bitcoin:
- total transaction count
- total quantity
- average quantity per transaction

In [23]:
pd.read_sql_query(
    """
    SELECT
        EXTRACT(YEAR FROM txn_date) AS txn_year,
        txn_type,
        COUNT(*) AS total_transactions_count,
        ROUND(SUM(quantity), 2) AS total_quantity,
        ROUND(SUM(quantity)/COUNT(*), 2) AS average_quantity_per_transaction
    FROM trading.transactions
    WHERE ticker = 'BTC'
    GROUP BY txn_year, txn_type;
    """,
    connection
)

Unnamed: 0,txn_year,txn_type,total_transactions_count,total_quantity,average_quantity_per_transaction
0,2017,BUY,2261,12069.58,5.34
1,2017,SELL,419,2160.22,5.16
2,2018,BUY,2204,11156.06,5.06
3,2018,SELL,433,2145.05,4.95
4,2019,BUY,2192,11114.43,5.07
5,2019,SELL,443,2316.24,5.23
6,2020,BUY,2350,11748.76,5.0
7,2020,SELL,456,2301.98,5.05
8,2021,BUY,1433,7161.32,5.0
9,2021,SELL,293,1478.0,5.04


### Question 5
What was the monthly total quantity purchased and sold for Ethereum in 2020?

In [36]:
pd.read_sql_query(
    """
    SELECT
        EXTRACT(MONTH FROM txn_date) AS txn_month,
        ROUND(SUM(CASE WHEN txn_type='BUY' THEN quantity ELSE 0 END), 2) AS quantity_bought,
        ROUND(SUM(CASE WHEN txn_type='SELL' THEN quantity ELSE 0 END), 2) AS quantity_sold
    FROM trading.transactions
    WHERE EXTRACT(YEAR FROM txn_date) = 2020
        AND ticker = 'ETH'
    GROUP BY txn_month
    ORDER BY txn_month;
    """,
    connection
)

Unnamed: 0,txn_month,quantity_bought,quantity_sold
0,1,801.05,158.13
1,2,687.89,160.07
2,3,804.24,182.19
3,4,761.87,203.17
4,5,787.42,149.08
5,6,787.47,208.34
6,7,890.78,117.02
7,8,800.6,178.54
8,9,767.65,118.87
9,10,744.79,174.27
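
The `SUM(CASE WHEN ... END)` pattern used above is often called conditional aggregation: each `CASE` keeps the quantity only for the matching `txn_type`, so one pass over the rows produces one column per type. The sketch below replays the idea on a few made-up rows with Python's built-in `sqlite3` module. The toy data is an assumption, and SQLite has no `EXTRACT`, so `strftime('%m', ...)` stands in for it here.

```python
import sqlite3

# Hypothetical toy rows (not from the course dataset).
rows = [
    ("2020-01-05", "BUY", 2.0),
    ("2020-01-20", "SELL", 0.5),
    ("2020-02-11", "BUY", 1.5),
    ("2020-02-12", "BUY", 1.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (txn_date TEXT, txn_type TEXT, quantity REAL)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

# One output row per month, with BUY and SELL totals side by side.
monthly = conn.execute(
    """
    SELECT
        CAST(strftime('%m', txn_date) AS INTEGER) AS txn_month,
        SUM(CASE WHEN txn_type = 'BUY' THEN quantity ELSE 0 END) AS quantity_bought,
        SUM(CASE WHEN txn_type = 'SELL' THEN quantity ELSE 0 END) AS quantity_sold
    FROM transactions
    GROUP BY txn_month
    ORDER BY txn_month;
    """
).fetchall()
conn.close()

for row in monthly:
    print(row)
```

Note that February has no SELL rows in the toy data, so its `quantity_sold` comes out as 0 instead of the month disappearing from the result.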


### Question 6
Summarise all buy and sell transactions for each `member_id` by generating 1 row for each member with the following additional columns:
- Bitcoin buy quantity
- Bitcoin sell quantity
- Ethereum buy quantity
- Ethereum sell quantity

In [48]:
pd.read_sql_query(
    """
    SELECT
        member_id,
        ROUND(
            SUM(
                CASE 
                  WHEN txn_type='BUY' AND ticker='BTC' THEN quantity 
                  ELSE 0 
                END), 2
            ) AS bitcoin_buy_quantity,
        ROUND(
            SUM(
                CASE 
                  WHEN txn_type='SELL' AND ticker='BTC' THEN quantity 
                  ELSE 0 
                END), 2
            ) AS bitcoin_sell_quantity,
        ROUND(SUM(
            CASE 
              WHEN txn_type='BUY' AND ticker='ETH' THEN quantity 
              ELSE 0 
            END), 2
            ) AS ethereum_buy_quantity,
        ROUND(
            SUM(
                CASE 
                  WHEN txn_type='SELL' AND ticker='ETH' THEN quantity 
                  ELSE 0 
                END), 2
            ) AS ethereum_sell_quantity
    FROM trading.transactions
    GROUP BY member_id;
    """,
    connection
)

Unnamed: 0,member_id,bitcoin_buy_quantity,bitcoin_sell_quantity,ethereum_buy_quantity,ethereum_sell_quantity
0,c81e72,2600.93,974.1,4852.52,729.42
1,eccbc8,2844.65,305.35,2573.75,794.06
2,a87ff6,5023.71,863.49,3822.04,318.15
3,e4da3b,3567.39,998.38,2053.98,581.71
4,167909,4448.24,503.04,1119.74,707.0
5,c4ca42,4380.44,1075.56,4516.6,1011.56
6,8f14e4,2647.08,445.74,3233.48,663.78
7,c9f0f8,4572.88,852.36,2343.47,254.27
8,45c48c,3814.24,198.13,4442.14,216.5
9,d3d944,4270.86,735.87,1744.65,1057.42


### Question 7
What was the final quantity holding of Bitcoin for each member? Sort the output from the highest BTC holding to the lowest.

In [60]:
pd.read_sql_query(
    """
    SELECT
        member_id,
        ROUND(
            SUM(
                CASE 
                  WHEN txn_type = 'BUY' THEN quantity 
                  WHEN txn_type = 'SELL' THEN -quantity
                  ELSE 0
                END), 2
            ) AS btc_final_quantity
    FROM trading.transactions
    WHERE ticker = 'BTC'
    GROUP BY member_id
    ORDER BY btc_final_quantity DESC;
    """,
    connection
)

Unnamed: 0,member_id,btc_final_quantity
0,a87ff6,4160.22
1,c20ad4,4046.09
2,167909,3945.2
3,c9f0f8,3720.52
4,45c48c,3616.11
5,d3d944,3534.99
6,6512bd,3456.91
7,c4ca42,3304.88
8,aab323,2575.58
9,e4da3b,2569.01


### Question 8
Which members have sold less than 500 Bitcoin? Sort the output from the most BTC sold to the least.

We can actually do this in 3 different ways!
1. `HAVING` solution
2. CTE solution
3. Subquery solution

In [55]:
# Having Solution
pd.read_sql_query(
    """
    SELECT
        member_id,
        ROUND(
            SUM(quantity), 2
            ) AS btc_quantity_sold
    FROM trading.transactions
    WHERE ticker = 'BTC' 
        AND txn_type = 'SELL'
    GROUP BY member_id
    HAVING btc_quantity_sold < 500
    ORDER BY btc_quantity_sold DESC;
    """,
    connection
)

Unnamed: 0,member_id,btc_quantity_sold
0,8f14e4,445.74
1,eccbc8,305.35
2,45c48c,198.13


In [58]:
# CTE Solution
pd.read_sql_query(
    """
    WITH cte AS (
    SELECT
        member_id,
        ROUND(
            SUM(quantity), 2
            ) AS btc_quantity_sold
    FROM trading.transactions
    WHERE ticker = 'BTC' 
        AND txn_type = 'SELL'
    GROUP BY member_id    
    )

    SELECT *
    FROM cte
    WHERE btc_quantity_sold < 500
    ORDER BY btc_quantity_sold DESC;
    """,
    connection
)

Unnamed: 0,member_id,btc_quantity_sold
0,8f14e4,445.74
1,eccbc8,305.35
2,45c48c,198.13


In [57]:
# Subquery Solution
pd.read_sql_query(
    """
    SELECT * FROM (
        SELECT
            member_id,
            ROUND(
                SUM(quantity), 2
                ) AS btc_quantity_sold
        FROM trading.transactions
        WHERE ticker = 'BTC' 
            AND txn_type = 'SELL'
        GROUP BY member_id
    ) AS subquery
    WHERE btc_quantity_sold < 500
    ORDER BY btc_quantity_sold DESC;
    """,
    connection
)

Unnamed: 0,member_id,btc_quantity_sold
0,8f14e4,445.74
1,eccbc8,305.35
2,45c48c,198.13
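
One thing worth keeping in mind with all three solutions: the `WHERE txn_type = 'SELL'` filter runs before grouping, so a member with no BTC sell transactions at all would never show up, even though they have technically sold less than 500. Whether any such member exists in the course data is not checked here. The sketch below uses made-up rows and Python's built-in `sqlite3` module to show the difference, with conditional aggregation as one possible way to keep zero-sale members in the result.

```python
import sqlite3

# Hypothetical toy rows (not from the course dataset); m3 never sells.
rows = [
    ("m1", "BTC", "SELL", 600.0),
    ("m1", "BTC", "BUY", 50.0),
    ("m2", "BTC", "SELL", 100.0),
    ("m3", "BTC", "BUY", 10.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(member_id TEXT, ticker TEXT, txn_type TEXT, quantity REAL)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?, ?)", rows)

# Same shape as the HAVING solution: SELL rows are filtered before grouping.
where_version = conn.execute(
    """
    SELECT member_id, SUM(quantity) AS btc_quantity_sold
    FROM transactions
    WHERE ticker = 'BTC' AND txn_type = 'SELL'
    GROUP BY member_id
    HAVING SUM(quantity) < 500
    ORDER BY btc_quantity_sold DESC;
    """
).fetchall()

# Alternative: keep every BTC row and sum only the SELL quantities.
case_version = conn.execute(
    """
    SELECT
        member_id,
        SUM(CASE WHEN txn_type = 'SELL' THEN quantity ELSE 0 END) AS btc_quantity_sold
    FROM transactions
    WHERE ticker = 'BTC'
    GROUP BY member_id
    HAVING SUM(CASE WHEN txn_type = 'SELL' THEN quantity ELSE 0 END) < 500
    ORDER BY btc_quantity_sold DESC;
    """
).fetchall()
conn.close()

print(where_version)
print(case_version)
```

In the toy data only the second query lists `m3`, with a sold quantity of 0.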


### Question 9
What is the total Bitcoin quantity that each `member_id` owns after adding up all of the BUY and SELL transactions from the `transactions` table? Sort the output by descending total quantity.

In [59]:
pd.read_sql_query(
    """
    SELECT
        member_id,
        ROUND(
        SUM(
            CASE
              WHEN txn_type = 'BUY' THEN quantity
              ELSE -quantity
            END), 2
            ) AS btc_quantity 
    FROM trading.transactions
    WHERE ticker = 'BTC'
    GROUP BY member_id
    ORDER BY btc_quantity DESC;
    """,
    connection
)

Unnamed: 0,member_id,btc_quantity
0,a87ff6,4160.22
1,c20ad4,4046.09
2,167909,3945.2
3,c9f0f8,3720.52
4,45c48c,3616.11
5,d3d944,3534.99
6,6512bd,3456.91
7,c4ca42,3304.88
8,aab323,2575.58
9,e4da3b,2569.01


### Question 10
Which `member_id` has the highest buy to sell ratio by quantity?

In [62]:
pd.read_sql_query(
    """
    SELECT
        member_id,
        ROUND(
            SUM(CASE WHEN txn_type = 'BUY' THEN quantity ELSE 0 END)
            /
            SUM(CASE WHEN txn_type = 'SELL' THEN quantity ELSE 0 END
            ), 2
        ) AS quantity_buy_to_sell_ratio
    FROM trading.transactions
    GROUP BY member_id
    ORDER BY quantity_buy_to_sell_ratio DESC;
    """,
    connection
)

Unnamed: 0,member_id,quantity_buy_to_sell_ratio
0,45c48c,19.91
1,a87ff6,7.49
2,c9f0f8,6.25
3,8f14e4,5.3
4,eccbc8,4.93
5,c20ad4,4.65
6,167909,4.6
7,aab323,4.55
8,6512bd,4.53
9,c81e72,4.38
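
The ratio above divides by the total SELL quantity, which would be zero for any member who has never sold. How a division by zero behaves depends on the database and its settings, so it can be clearer to make the intent explicit with `NULLIF`, which turns a zero denominator into `NULL`. This is a minimal sketch with made-up rows using Python's built-in `sqlite3` module, not a change the course itself makes.

```python
import sqlite3

# Hypothetical toy rows (not from the course dataset); m2 never sells.
rows = [
    ("m1", "BUY", 10.0),
    ("m1", "SELL", 4.0),
    ("m2", "BUY", 5.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (member_id TEXT, txn_type TEXT, quantity REAL)"
)
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", rows)

# NULLIF(x, 0) returns NULL when x is 0, so the ratio becomes NULL
# instead of relying on the engine's division-by-zero behaviour.
ratios = conn.execute(
    """
    SELECT
        member_id,
        ROUND(
            SUM(CASE WHEN txn_type = 'BUY' THEN quantity ELSE 0 END)
            /
            NULLIF(SUM(CASE WHEN txn_type = 'SELL' THEN quantity ELSE 0 END), 0),
            2
        ) AS quantity_buy_to_sell_ratio
    FROM transactions
    GROUP BY member_id
    ORDER BY member_id;
    """
).fetchall()
conn.close()

print(ratios)
```

A `NULL` ratio (shown as `None` in Python) then reads as "no sells to compare against" instead of a misleading number.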


### Question 11
For each `member_id` - which month had the highest total Ethereum quantity sold?

In [89]:
pd.read_sql_query(
    """
    WITH cte_rank AS (
    SELECT
        member_id,
        EXTRACT(MONTH FROM txn_date) AS txn_month,
        ROUND(SUM(quantity), 2) AS eth_quantity_sold,
        RANK() OVER (PARTITION BY member_id ORDER BY SUM(quantity) DESC) AS month_rank
    FROM trading.transactions
    WHERE ticker = 'ETH'
        AND txn_type = 'SELL'
    GROUP BY member_id, txn_month
    )

    SELECT 
        member_id,
        txn_month,
        eth_quantity_sold
    FROM cte_rank
    WHERE month_rank = 1
    ORDER BY eth_quantity_sold DESC
    """,
    connection
)

Unnamed: 0,member_id,txn_month,eth_quantity_sold
0,d3d944,4,137.98
1,c51ce4,3,131.49
2,167909,12,122.79
3,c4ca42,4,115.01
4,eccbc8,1,114.26
5,6512bd,1,108.11
6,8f14e4,4,96.11
7,c20ad4,5,90.82
8,a87ff6,7,89.35
9,aab323,1,88.12


In [92]:
pd.read_sql_query(
    """
    SELECT
        member_id,
        EXTRACT(MONTH FROM txn_date) AS txn_month,
        ROUND(SUM(quantity), 2) AS eth_quantity_sold,
        RANK() OVER (PARTITION BY member_id ORDER BY SUM(quantity) DESC) AS month_rank
    FROM trading.transactions
    WHERE ticker = 'ETH'
        AND txn_type = 'SELL'
    GROUP BY member_id, txn_month
    ORDER BY eth_quantity_sold DESC
    """,
    connection
)

Unnamed: 0,member_id,txn_month,eth_quantity_sold,month_rank
0,167909,12,41.85,1
1,167909,4,41.85,2
2,167909,6,41.85,3
3,167909,2,41.85,4
4,167909,9,41.85,5
...,...,...,...,...
163,eccbc8,4,41.85,8
164,eccbc8,11,41.85,9
165,eccbc8,2,41.85,10
166,eccbc8,9,41.85,11


# References
- [Data With Danny Course - Step 4](https://github.com/DataWithDanny/sql-masterclass/blob/main/course-content/step4.md)