## 1831. Maximum Transaction Each Day
### Table: Transactions

| Column Name    | Type     |
|----------------|----------|
| transaction_id | int      |
| day            | datetime |
| amount         | int      |

transaction_id is the primary key for this table.  
Each row contains information about one transaction.

---

Write an SQL query to report the IDs of the transactions with the maximum amount on their respective day. If several transactions share the maximum amount on the same day, return all of them.

Return the result table in ascending order by transaction_id.

---

### Transactions table:

| transaction_id | day                | amount |
|----------------|--------------------|--------|
| 8              | 2021-4-3 15:57:28  | 57     |
| 9              | 2021-4-28 08:47:25 | 21     |
| 1              | 2021-4-29 13:28:30 | 58     |
| 5              | 2021-4-28 16:39:59 | 40     |
| 6              | 2021-4-29 23:39:28 | 58     |

---

### Result table:

| transaction_id |
|----------------|
| 1              |
| 5              |
| 6              |
| 8              |

---

**Explanation:**  
"2021-4-3"  --> We have one transaction with ID 8, so we add 8 to the result table.  
"2021-4-28" --> We have two transactions with IDs 5 and 9. The transaction with ID 5 has an amount of 40, while the transaction with ID 9 has an amount of 21. We only include the transaction with ID 5 as it has the maximum amount this day.  
"2021-4-29" --> We have two transactions with IDs 1 and 6. Both transactions have the same amount of 58, so we include both in the result table.  
We order the result table by transaction_id after collecting these IDs.
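
The grouping logic described above can be sketched in plain Python before moving to Spark. This is only an illustrative sketch, not the notebook's solution: the helper name `max_transactions_per_day` and the in-memory `rows` list are assumptions introduced here, and the rows mirror the sample table.

```python
from collections import defaultdict
from datetime import datetime


def max_transactions_per_day(rows):
    """Return the IDs of transactions whose amount is the maximum for their calendar day."""
    # Group (transaction_id, amount) pairs by calendar date, dropping the time of day.
    by_day = defaultdict(list)
    for transaction_id, day, amount in rows:
        by_day[day.date()].append((transaction_id, amount))

    result = []
    for entries in by_day.values():
        top = max(amount for _, amount in entries)
        # Keep every transaction that ties for the daily maximum.
        result.extend(tid for tid, amount in entries if amount == top)

    # The problem asks for ascending order by transaction_id.
    return sorted(result)


# Hypothetical in-memory copy of the sample Transactions table.
rows = [
    (8, datetime(2021, 4, 3, 15, 57, 28), 57),
    (9, datetime(2021, 4, 28, 8, 47, 25), 21),
    (1, datetime(2021, 4, 29, 13, 28, 30), 58),
    (5, datetime(2021, 4, 28, 16, 39, 59), 40),
    (6, datetime(2021, 4, 29, 23, 39, 28), 58),
]

print(max_transactions_per_day(rows))  # [1, 5, 6, 8]
```

Comparing each amount against the daily maximum, rather than picking a single row per day, is what keeps both tied transactions on 2021-4-29. The `rank()` window function plays the same role in the Spark cells that follow.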

In [0]:
from datetime import datetime

from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, IntegerType, TimestampType

# Start Spark session
spark = SparkSession.builder.appName("MaxTransactionPerDay").getOrCreate()

# Define schema
schema = StructType([
    StructField("transaction_id", IntegerType(), True),
    StructField("day", TimestampType(), True),
    StructField("amount", IntegerType(), True)
])

# Sample data
data = [
    (8, datetime.strptime("2021-04-03 15:57:28", "%Y-%m-%d %H:%M:%S"), 57),
    (9, datetime.strptime("2021-04-28 08:47:25", "%Y-%m-%d %H:%M:%S"), 21),
    (1, datetime.strptime("2021-04-29 13:28:30", "%Y-%m-%d %H:%M:%S"), 58),
    (5, datetime.strptime("2021-04-28 16:39:59", "%Y-%m-%d %H:%M:%S"), 40),
    (6, datetime.strptime("2021-04-29 23:39:28", "%Y-%m-%d %H:%M:%S"), 58)
]

# Create DataFrame
df = spark.createDataFrame(data, schema)

# Register a temporary view so the data can be queried with SQL
df.createOrReplaceTempView("Transactions")



In [0]:
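%sql
-- Reduce each timestamp to its calendar date, rank transactions within each date
-- by amount (ties share rank 1), and keep only the top-ranked rows.
WITH cte AS (
    SELECT
        date_format(day, 'yyyy-MM-dd') AS formatted_day,
        transaction_id,
        day,
        amount
    FROM Transactions
),
cte2 AS (
    SELECT
        rank() OVER (PARTITION BY formatted_day ORDER BY amount DESC) AS d_rank,
        *
    FROM cte
)
SELECT transaction_id
FROM cte2
WHERE d_rank = 1
ORDER BY transaction_id ASC;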

In [0]:
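from pyspark.sql.functions import col, date_format, rank
from pyspark.sql.window import Window

# Reduce each timestamp to its calendar date
df_formatted = df.withColumn("formatted_date", date_format(col("day"), "yyyy-MM-dd"))

# Rank transactions within each date by amount, highest first; ties share the same rank
win_spec = Window.partitionBy("formatted_date").orderBy(col("amount").desc())

# Keep the top-ranked rows per day; display() is Databricks-specific, use show() elsewhere
(
    df_formatted.withColumn("amount_rank", rank().over(win_spec))
    .filter(col("amount_rank") == 1)
    .select("transaction_id")
    .orderBy("transaction_id")
    .display()
)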