## Importing Libraries

In [0]:
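# Note: the wildcard import shadows Python builtins such as sum, min, and max
# with Spark column functions; the cells below rely on Spark's sum.
from pyspark.sql.functions import *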
from pyspark.sql.types import *
from pyspark.sql.window import Window

**2066. Account Balance (Medium)**

**Table: Transactions**

| Column Name | Type |
|-------------|------|
| account_id  | int  |
| day         | date |
| type        | ENUM |
| amount      | int  |

(account_id, day) is the primary key (combination of columns with unique values) for this table.
Each row contains information about one transaction, including the transaction type, the day it occurred on, and the amount.
type is an ENUM (category) with the values ('Deposit', 'Withdraw').
 
**Write a solution to report the balance of each account after each transaction. You may assume that the balance of each account before any transaction is 0 and that the balance will never be below 0 at any moment.**

Return the result table in ascending order by account_id, then by day in case of a tie.

The result format is in the following example.

**Example 1:**

**Input:** 

**Transactions table:**

| account_id | day        | type     | amount |
|------------|------------|----------|--------|
| 1          | 2021-11-07 | Deposit  | 2000   |
| 1          | 2021-11-09 | Withdraw | 1000   |
| 1          | 2021-11-11 | Deposit  | 3000   |
| 2          | 2021-12-07 | Deposit  | 7000   |
| 2          | 2021-12-12 | Withdraw | 7000   |

**Output:**

| account_id | day        | balance |
|------------|------------|---------|
| 1          | 2021-11-07 | 2000    |
| 1          | 2021-11-09 | 1000    |
| 1          | 2021-11-11 | 4000    |
| 2          | 2021-12-07 | 7000    |
| 2          | 2021-12-12 | 0       |

**Explanation:** 
- Account 1:
  - Initial balance is 0.
  - 2021-11-07 --> deposit 2000. Balance is 0 + 2000 = 2000.
  - 2021-11-09 --> withdraw 1000. Balance is 2000 - 1000 = 1000.
  - 2021-11-11 --> deposit 3000. Balance is 1000 + 3000 = 4000.
- Account 2:
  - Initial balance is 0.
  - 2021-12-07 --> deposit 7000. Balance is 0 + 7000 = 7000.
  - 2021-12-12 --> withdraw 7000. Balance is 7000 - 7000 = 0.
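
The arithmetic in the explanation above can be sketched in plain Python before turning to Spark. This is only an illustration under stated assumptions: the helper name `running_balances` and the tuple layout are hypothetical, and the rows are copied from Example 1. Deposits count as positive amounts, withdrawals as negative, and the balance is the running total per account in day order.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Rows copied from Example 1: (account_id, day, type, amount)
transactions = [
    (1, "2021-11-07", "Deposit", 2000),
    (1, "2021-11-09", "Withdraw", 1000),
    (1, "2021-11-11", "Deposit", 3000),
    (2, "2021-12-07", "Deposit", 7000),
    (2, "2021-12-12", "Withdraw", 7000),
]


def running_balances(rows):
    """Return (account_id, day, balance) tuples ordered by account_id, then day."""
    # ISO-8601 date strings sort chronologically, so a plain tuple sort works here.
    ordered = sorted(rows, key=itemgetter(0, 1))
    result = []
    for account_id, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        # Deposits add to the balance, withdrawals subtract from it.
        signed = [amt if kind == "Deposit" else -amt for _, _, kind, amt in group]
        # accumulate() yields the running total after each transaction.
        for (_, day, _, _), balance in zip(group, accumulate(signed)):
            result.append((account_id, day, balance))
    return result


for row in running_balances(transactions):
    print(row)
```

The printed tuples should match the rows of the expected output table for Example 1.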

In [0]:
transactions_data_2066 = [
    (1, "2021-11-07", "Deposit", 2000),
    (1, "2021-11-09", "Withdraw", 1000),
    (1, "2021-11-11", "Deposit", 3000),
    (2, "2021-12-07", "Deposit", 7000),
    (2, "2021-12-12", "Withdraw", 7000),
]

transactions_columns_2066 = ["account_id", "day", "type", "amount"]
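# Cast day to DATE so the DataFrame matches the schema described above.
transactions_df_2066 = spark.createDataFrame(transactions_data_2066, transactions_columns_2066)\
    .withColumn("day", to_date("day"))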
transactions_df_2066.show()

+----------+----------+--------+------+
|account_id|       day|    type|amount|
+----------+----------+--------+------+
|         1|2021-11-07| Deposit|  2000|
|         1|2021-11-09|Withdraw|  1000|
|         1|2021-11-11| Deposit|  3000|
|         2|2021-12-07| Deposit|  7000|
|         2|2021-12-12|Withdraw|  7000|
+----------+----------+--------+------+



In [0]:
# Sign each amount: deposits add to the balance, withdrawals subtract from it.
transactions_df_2066 = transactions_df_2066.withColumn(
    "net_amount",
    when(col("type") == "Deposit", col("amount")).otherwise(-col("amount")),
)

In [0]:
# Running-total frame: per account, ordered by day, from the first row through the current row.
windowSpec = Window.partitionBy("account_id").orderBy("day").rowsBetween(Window.unboundedPreceding, Window.currentRow)
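
The frame on `windowSpec` is a row-based one: each row sums everything from the first row of its partition through itself. When an ordering is given without an explicit frame, Spark instead defaults to a range-based frame, which treats rows with equal ordering keys as peers. Because (account_id, day) is unique in this table, both should give the same balances here. The plain-Python sketch below is only an illustration: the function names are hypothetical, and the data deliberately repeats a day (which the real table does not allow) to show where the two frame types would differ.

```python
import builtins  # the notebook's wildcard import shadows sum, so use the builtin explicitly

# Hypothetical signed amounts for one account; two rows share a day on purpose,
# which the (account_id, day) primary key rules out in the real table.
rows = [("2021-11-07", 2000), ("2021-11-09", -1000), ("2021-11-09", 500)]


def rows_frame_sums(ordered):
    """Sum from the first row through the current row (a ROWS-style frame)."""
    return [
        builtins.sum(amt for _, amt in ordered[: i + 1])
        for i in range(len(ordered))
    ]


def range_frame_sums(ordered):
    """Sum every row whose ordering key is <= the current key (a RANGE-style frame)."""
    return [
        builtins.sum(amt for d, amt in ordered if d <= day)
        for day, _ in ordered
    ]


print(rows_frame_sums(rows))   # [2000, 1000, 1500]
print(range_frame_sums(rows))  # [2000, 1500, 1500]
```

With unique days the two lists would be identical, so spelling out `rowsBetween` mainly makes the intended running-total behavior explicit.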

In [0]:
# sum is pyspark.sql.functions.sum (via the wildcard import), not the Python builtin.
transactions_df_2066\
    .withColumn("balance", sum("net_amount").over(windowSpec))\
    .select("account_id", "day", "balance")\
    .orderBy("account_id", "day")\
    .show()

+----------+----------+-------+
|account_id|       day|balance|
+----------+----------+-------+
|         1|2021-11-07|   2000|
|         1|2021-11-09|   1000|
|         1|2021-11-11|   4000|
|         2|2021-12-07|   7000|
|         2|2021-12-12|      0|
+----------+----------+-------+

