## Importing Libraries

In [0]:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**1194. Tournament Winners (Hard)**

**Table: Players**

| Column Name | Type  |
|-------------|-------|
| player_id   | int   |
| group_id    | int   |

player_id is the primary key (column with unique values) of this table.
Each row of this table indicates the group of each player.

**Table: Matches**

| Column Name   | Type    |
|---------------|---------|
| match_id      | int     |
| first_player  | int     |
| second_player | int     | 
| first_score   | int     |
| second_score  | int     |

match_id is the primary key (column with unique values) of this table.
Each row is a record of a match. first_player and second_player contain the player_id of each player in the match.
first_score and second_score contain the number of points of the first_player and second_player respectively.
You may assume that, in each match, players belong to the same group.
 
The winner in each group is the player who scored the maximum total points within the group. In the case of a tie, the lowest player_id wins.

**Write a solution to find the winner in each group.**

Return the result table in any order.

The result format is in the following example.

**Example 1:**

**Input:** 

**Players table:**

| player_id | group_id   |
|-----------|------------|
| 15        | 1          |
| 25        | 1          |
| 30        | 1          |
| 45        | 1          |
| 10        | 2          |
| 35        | 2          |
| 50        | 2          |
| 20        | 3          |
| 40        | 3          |

**Matches table:**

| match_id   | first_player | second_player | first_score | second_score |
|------------|--------------|---------------|-------------|--------------|
| 1          | 15           | 45            | 3           | 0            |
| 2          | 30           | 25            | 1           | 2            |
| 3          | 30           | 15            | 2           | 0            |
| 4          | 40           | 20            | 5           | 2            |
| 5          | 35           | 50            | 1           | 1            |

**Output:**

| group_id  | player_id  |
|-----------|------------|
| 1         | 15         |
| 2         | 35         |
| 3         | 40         |
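
The winner rule above (highest total points within the group, lowest player_id on a tie) can be cross-checked against Example 1 with a small plain-Python sketch. This is only an illustrative reference, not the PySpark solution; the function name `tournament_winners` and the tuple layout of the inputs are assumptions made for this example.

```python
from collections import defaultdict


def tournament_winners(players, matches):
    """Return {group_id: winning player_id} under the stated rule.

    Hypothetical helper: players is a list of (player_id, group_id),
    matches is a list of (match_id, first_player, second_player,
    first_score, second_score).
    """
    # Sum points per player, counting both sides of every match.
    totals = defaultdict(int)
    for _match_id, first, second, first_score, second_score in matches:
        totals[first] += first_score
        totals[second] += second_score

    # Per group, keep the smallest (-total, player_id) tuple: the highest
    # total wins, and the lowest player_id breaks ties.
    best = {}
    for player_id, group_id in players:
        candidate = (-totals[player_id], player_id)
        if group_id not in best or candidate < best[group_id]:
            best[group_id] = candidate
    return {group_id: player_id for group_id, (_, player_id) in best.items()}


players = [
    (15, 1), (25, 1), (30, 1), (45, 1),
    (10, 2), (35, 2), (50, 2),
    (20, 3), (40, 3),
]
matches = [
    (1, 15, 45, 3, 0),
    (2, 30, 25, 1, 2),
    (3, 30, 15, 2, 0),
    (4, 40, 20, 5, 2),
    (5, 35, 50, 1, 1),
]

winners = tournament_winners(players, matches)
print(sorted(winners.items()))  # [(1, 15), (2, 35), (3, 40)]
```

In group 1, players 15 and 30 both total 3 points, so the lower id 15 wins. In group 2, players 35 and 50 tie at 1 point, so 35 wins.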


In [0]:
players_data_1194 = [
    (15, 1), (25, 1), (30, 1), (45, 1),
    (10, 2), (35, 2), (50, 2),
    (20, 3), (40, 3)
]

players_columns_1194 = ["player_id", "group_id"]
players_df_1194 = spark.createDataFrame(players_data_1194, players_columns_1194)
players_df_1194.show()

matches_data_1194 = [
    (1, 15, 45, 3, 0),
    (2, 30, 25, 1, 2),
    (3, 30, 15, 2, 0),
    (4, 40, 20, 5, 2),
    (5, 35, 50, 1, 1),
]

matches_columns_1194 = ["match_id", "first_player", "second_player", "first_score", "second_score"]
matches_df_1194 = spark.createDataFrame(matches_data_1194, matches_columns_1194)
matches_df_1194.show()

+---------+--------+
|player_id|group_id|
+---------+--------+
|       15|       1|
|       25|       1|
|       30|       1|
|       45|       1|
|       10|       2|
|       35|       2|
|       50|       2|
|       20|       3|
|       40|       3|
+---------+--------+

+--------+------------+-------------+-----------+------------+
|match_id|first_player|second_player|first_score|second_score|
+--------+------------+-------------+-----------+------------+
|       1|          15|           45|          3|           0|
|       2|          30|           25|          1|           2|
|       3|          30|           15|          2|           0|
|       4|          40|           20|          5|           2|
|       5|          35|           50|          1|           1|
+--------+------------+-------------+-----------+------------+



In [0]:
first_scores_df_1194 = matches_df_1194.select(
    col("first_player").alias("player_id"),
    col("first_score").alias("score")
)

second_scores_df_1194 = matches_df_1194.select(
    col("second_player").alias("player_id"),
    col("second_score").alias("score")
)


In [0]:
all_scores_df_1194 = first_scores_df_1194.union(second_scores_df_1194)

In [0]:
player_scores_df_1194 = all_scores_df_1194\
    .groupBy("player_id")\
    .agg(sum("score").alias("total_score"))

In [0]:
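# Left join keeps players with no matches; their missing total_score becomes 0.
players_with_scores_df_1194 = players_df_1194.join(player_scores_df_1194, on="player_id", how="left").fillna(0, subset=["total_score"])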
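
A small plain-Python sketch of why the join direction matters here. With an inner join, a group whose players never appear in Matches would drop out of the result entirely. Player 60 and group 4 are hypothetical additions used only to show that case; they are not part of Example 1.

```python
# (player_id, group_id); player 60 in group 4 is a hypothetical extra row.
players = [(10, 2), (35, 2), (50, 2), (60, 4)]

# Totals exist only for players who appear in at least one match.
totals = {35: 1, 50: 1}

# Inner-join behaviour: players without a total are dropped.
inner = [(pid, gid, totals[pid]) for pid, gid in players if pid in totals]

# Left-join-plus-fill behaviour: every player is kept, missing totals become 0.
left = [(pid, gid, totals.get(pid, 0)) for pid, gid in players]

groups_inner = {gid for _, gid, _ in inner}
groups_left = {gid for _, gid, _ in left}

print(sorted(groups_inner))  # [2]
print(sorted(groups_left))   # [2, 4]
```

With the left-join variant, group 4 still produces a winner (player 60 with 0 points), which matches the requirement to report a winner for each group.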

In [0]:
window_spec = Window.partitionBy("group_id").orderBy(
    col("total_score").desc(),
    col("player_id").asc()
)

In [0]:
players_with_scores_df_1194\
    .withColumn("rank", row_number().over(window_spec))\
    .filter(col("rank") == 1)\
    .select("group_id", "player_id")\
    .show()

+--------+---------+
|group_id|player_id|
+--------+---------+
|       1|       15|
|       2|       35|
|       3|       40|
+--------+---------+

