## Importing Libraries

In [0]:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**1783. Grand Slam Titles (Medium)**

**Table: Players**

| Column Name    | Type    |
|----------------|---------|
| player_id      | int     |
| player_name    | varchar |

player_id is the primary key (column with unique values) for this table.
Each row in this table contains the name and the ID of a tennis player.
 
**Table: Championships**

| Column Name   | Type    |
|---------------|---------|
| year          | int     |
| Wimbledon     | int     |
| Fr_open       | int     |
| US_open       | int     |
| Au_open       | int     |

year is the primary key (column with unique values) for this table.
Each row of this table contains the IDs of the players who won each tennis tournament of the grand slam.
 
**Write a solution to report the number of grand slam tournaments won by each player. Do not include the players who did not win any tournament.**

Return the result table in any order.

The result format is in the following example.

**Example 1:**

**Input:** 

**Players table:**

| player_id | player_name |
|-----------|-------------|
| 1         | Nadal       |
| 2         | Federer     |
| 3         | Novak       |

**Championships table:**

| year | Wimbledon | Fr_open | US_open | Au_open |
|------|-----------|---------|---------|---------|
| 2018 | 1         | 1       | 1       | 1       |
| 2019 | 1         | 1       | 2       | 2       |
| 2020 | 2         | 1       | 2       | 2       |

**Output:** 

| player_id | player_name | grand_slams_count |
|-----------|-------------|-------------------|
| 2         | Federer     | 5                 |
| 1         | Nadal       | 7                 |

**Explanation:** 
- Player 1 (Nadal) won 7 titles: Wimbledon (2018, 2019), Fr_open (2018, 2019, 2020), US_open (2018), and Au_open (2018).
- Player 2 (Federer) won 5 titles: Wimbledon (2020), US_open (2019, 2020), and Au_open (2019, 2020).
- Player 3 (Novak) did not win any tournament, so they are not included in the result table.
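The counts in the explanation can be cross-checked without Spark. The following is a minimal plain-Python sketch, not part of the original notebook; the names `players`, `championships`, `wins`, and `result` are illustrative and simply mirror the example tables above.

```python
from collections import Counter

# Example data copied from the tables above.
players = {1: "Nadal", 2: "Federer", 3: "Novak"}
championships = [
    (2018, 1, 1, 1, 1),
    (2019, 1, 1, 2, 2),
    (2020, 2, 1, 2, 2),
]

# Every column after `year` holds a winner's player_id, so flatten them all
# and count how often each id appears.
wins = Counter(pid for _, *winners in championships for pid in winners)

# Only ids present in `wins` are reported, which drops players with no title.
result = sorted((pid, players[pid], n) for pid, n in wins.items())
print(result)
```

Player 3 never appears in a tournament column, so it has no entry in `wins` and is left out of `result`, matching the requirement to exclude players without a title.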

In [0]:
players_data_1783 = [
    (1, "Nadal"),
    (2, "Federer"),
    (3, "Novak"),
]
players_columns_1783 = ["player_id", "player_name"]
players_df_1783 = spark.createDataFrame(players_data_1783, players_columns_1783)
players_df_1783.show()

championships_data_1783 = [
    (2018, 1, 1, 1, 1),
    (2019, 1, 1, 2, 2),
    (2020, 2, 1, 2, 2),
]

championships_columns_1783 = ["year", "Wimbledon", "Fr_open", "US_open", "Au_open"]
championships_df_1783 = spark.createDataFrame(championships_data_1783, championships_columns_1783)
championships_df_1783.show()

+---------+-----------+
|player_id|player_name|
+---------+-----------+
|        1|      Nadal|
|        2|    Federer|
|        3|      Novak|
+---------+-----------+

+----+---------+-------+-------+-------+
|year|Wimbledon|Fr_open|US_open|Au_open|
+----+---------+-------+-------+-------+
|2018|        1|      1|      1|      1|
|2019|        1|      1|      2|      2|
|2020|        2|      1|      2|      2|
+----+---------+-------+-------+-------+



In [0]:
# Unpivot the four tournament columns into one row per (year, winning player_id).
# array() packs the four winners into a single column; explode() emits one row per element.
champ_unpivoted_df_1783 = championships_df_1783\
                                .select( col("year"),
                                        explode(
                                            array(
                                            col("Wimbledon"),
                                            col("Fr_open"),
                                            col("US_open"),
                                            col("Au_open")
                                            )
                                        ).alias("player_id"))
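To make the shape of the unpivoted data concrete, here is a small plain-Python sketch of what the `explode(array(...))` step is expected to produce for the example input. It is an illustration only; `championships` and `long_rows` are hypothetical names, and no Spark session is involved.

```python
# Example Championships rows: (year, Wimbledon, Fr_open, US_open, Au_open).
championships = [
    (2018, 1, 1, 1, 1),
    (2019, 1, 1, 2, 2),
    (2020, 2, 1, 2, 2),
]

# One output row per tournament column, mirroring explode(array(...)):
# each input row fans out into four (year, player_id) rows.
long_rows = [
    (year, player_id)
    for year, *winners in championships
    for player_id in winners
]

print(len(long_rows))  # 3 years x 4 tournaments
print(long_rows[4:8])  # the four rows generated from the 2019 record
```

Each input row becomes four rows, so counting rows per `player_id` in this long form is the same as counting titles.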

In [0]:
# Each unpivoted row is one title, so the row count per player_id is the title count.
wins_df_1783 = champ_unpivoted_df_1783\
                        .groupBy("player_id").count()\
                            .withColumnRenamed("count", "grand_slams_count")


In [0]:
# The default inner join keeps only players that have at least one title.
wins_df_1783\
    .join(players_df_1783, on="player_id")\
        .select("player_id", "player_name", "grand_slams_count").show()


+---------+-----------+-----------------+
|player_id|player_name|grand_slams_count|
+---------+-----------+-----------------+
|        1|      Nadal|                7|
|        2|    Federer|                5|
+---------+-----------+-----------------+

