## Importing Libraries

In [0]:
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**1811. Find Interview Candidates (Medium)**

**Table: Contests**

| Column Name  | Type |
|--------------|------|
| contest_id   | int  |
| gold_medal   | int  |
| silver_medal | int  |
| bronze_medal | int  |

contest_id is the column with unique values for this table.
This table contains the LeetCode contest ID and the user IDs of the gold, silver, and bronze medalists.
It is guaranteed that any consecutive contests have consecutive IDs and that no ID is skipped.
 
**Table: Users**

| Column Name | Type    |
|-------------|---------|
| user_id     | int     |
| mail        | varchar |
| name        | varchar |

user_id is the column with unique values for this table.
This table contains information about the users.
 
**Write a solution to report the name and the mail of all interview candidates. A user is an interview candidate if at least one of these two conditions is true:**
- The user won any medal in three or more consecutive contests.
- The user won the gold medal in three or more different contests (not necessarily consecutive).
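
The two conditions can also be checked directly in plain Python, which is handy as a reference when validating the Spark solution. The following is a minimal sketch, not part of the original notebook: `find_candidate_ids` is a hypothetical helper, and it assumes the contests arrive as `(contest_id, gold, silver, bronze)` tuples with gap-free IDs, as the problem guarantees. The sample rows are the Contests table from Example 1.

```python
from collections import Counter, defaultdict


def find_candidate_ids(contests):
    """Return the set of user IDs meeting at least one candidate condition.

    `contests` is an iterable of (contest_id, gold, silver, bronze) tuples.
    Hypothetical helper for illustration only.
    """
    gold_counts = Counter()
    medal_contests = defaultdict(set)  # user_id -> contest IDs with any medal

    for contest_id, gold, silver, bronze in contests:
        gold_counts[gold] += 1
        for user_id in (gold, silver, bronze):
            medal_contests[user_id].add(contest_id)

    # Condition 2: gold medal in three or more different contests.
    candidates = {user for user, n in gold_counts.items() if n >= 3}

    # Condition 1: a medal in contests c, c + 1 and c + 2 for some c.
    for user_id, ids in medal_contests.items():
        if any(c + 1 in ids and c + 2 in ids for c in ids):
            candidates.add(user_id)

    return candidates


sample = [
    (190, 1, 5, 2),
    (191, 2, 3, 5),
    (192, 5, 2, 3),
    (193, 1, 3, 5),
    (194, 4, 5, 2),
    (195, 4, 2, 1),
    (196, 1, 5, 2),
]
print(sorted(find_candidate_ids(sample)))  # [1, 2, 3, 5]
```

User 4 (Hercy) is excluded: two gold medals and a streak of only two contests satisfy neither condition.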

Return the result table in any order.

The result format is in the following example.

**Example 1:**

**Input:** 

**Contests table:**

| contest_id | gold_medal | silver_medal | bronze_medal |
|------------|------------|--------------|--------------|
| 190        | 1          | 5            | 2            |
| 191        | 2          | 3            | 5            |
| 192        | 5          | 2            | 3            |
| 193        | 1          | 3            | 5            |
| 194        | 4          | 5            | 2            |
| 195        | 4          | 2            | 1            |
| 196        | 1          | 5            | 2            |

**Users table:**

| user_id | mail               | name  |
|---------|--------------------|-------|
| 1       | sarah@leetcode.com | Sarah |
| 2       | bob@leetcode.com   | Bob   |
| 3       | alice@leetcode.com | Alice |
| 4       | hercy@leetcode.com | Hercy |
| 5       | quarz@leetcode.com | Quarz |

**Output:**

| name  | mail               |
|-------|--------------------|
| Sarah | sarah@leetcode.com |
| Bob   | bob@leetcode.com   |
| Alice | alice@leetcode.com |
| Quarz | quarz@leetcode.com |

**Explanation:** 
- Sarah won 3 gold medals (190, 193, and 196), so we include her in the result table.
- Bob won a medal in 3 consecutive contests (190, 191, and 192), so we include him in the result table.
    - Note that he also won a medal in 3 other consecutive contests (194, 195, and 196).
- Alice won a medal in 3 consecutive contests (191, 192, and 193), so we include her in the result table.
- Quarz won a medal in 5 consecutive contests (190, 191, 192, 193, and 194), so we include them in the result table.

In [0]:
contests_data_1811 = [
    (190, 1, 5, 2),
    (191, 2, 3, 5),
    (192, 5, 2, 3),
    (193, 1, 3, 5),
    (194, 4, 5, 2),
    (195, 4, 2, 1),
    (196, 1, 5, 2),
]

contests_columns_1811 = ["contest_id", "gold_medal", "silver_medal", "bronze_medal"]
contests_df_1811 = spark.createDataFrame(contests_data_1811, contests_columns_1811)
contests_df_1811.show()

users_data_1811 = [
    (1, "sarah@leetcode.com", "Sarah"),
    (2, "bob@leetcode.com", "Bob"),
    (3, "alice@leetcode.com", "Alice"),
    (4, "hercy@leetcode.com", "Hercy"),
    (5, "quarz@leetcode.com", "Quarz"),
]

users_columns_1811 = ["user_id", "mail", "name"]
users_df_1811 = spark.createDataFrame(users_data_1811, users_columns_1811)
users_df_1811.show()

+----------+----------+------------+------------+
|contest_id|gold_medal|silver_medal|bronze_medal|
+----------+----------+------------+------------+
|       190|         1|           5|           2|
|       191|         2|           3|           5|
|       192|         5|           2|           3|
|       193|         1|           3|           5|
|       194|         4|           5|           2|
|       195|         4|           2|           1|
|       196|         1|           5|           2|
+----------+----------+------------+------------+

+-------+------------------+-----+
|user_id|              mail| name|
+-------+------------------+-----+
|      1|sarah@leetcode.com|Sarah|
|      2|  bob@leetcode.com|  Bob|
|      3|alice@leetcode.com|Alice|
|      4|hercy@leetcode.com|Hercy|
|      5|quarz@leetcode.com|Quarz|
+-------+------------------+-----+



In [0]:
# Unpivot the three medal columns into one row per (contest_id, user_id).
unpivoted_df_1811 = contests_df_1811\
    .select(
        col("contest_id"),
        explode(
            array(
                col("gold_medal"),
                col("silver_medal"),
                col("bronze_medal")
            )
        ).alias("user_id")
    )

In [0]:
# Condition 2: users with a gold medal in three or more contests.
gold_winners_df_1811 = contests_df_1811\
    .groupBy("gold_medal").count() \
    .filter(col("count") >= 3) \
    .select(col("gold_medal").alias("user_id"))

In [0]:
# Order each user's medal-winning contests by contest_id.
window = Window.partitionBy("user_id").orderBy("contest_id")

In [0]:
# rn numbers each user's medal-winning contests 1, 2, 3, ... in contest order.
ranked_df_1811 = unpivoted_df_1811\
    .withColumn("rn", row_number().over(window))

In [0]:
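# Within a run of consecutive contests, contest_id - rn stays constant,
# so grp labels each streak (this relies on contest IDs having no gaps).
grouped_df_1811 = ranked_df_1811\
    .withColumn("grp", col("contest_id") - col("rn"))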
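
Subtracting the row number from `contest_id` is the usual gaps-and-islands trick: inside a run of consecutive contest IDs both values rise by one per row, so their difference stays fixed, and it changes as soon as the user misses a contest. A minimal plain-Python sketch of the same idea, using Bob's medal contests from Example 1 (`island_sizes` is a hypothetical name, not part of the notebook, and it assumes the input has no duplicate IDs):

```python
from collections import Counter


def island_sizes(contest_ids):
    """Map each island key (contest_id - row_number) to its streak length.

    Assumes `contest_ids` has no duplicates; hypothetical helper.
    """
    sizes = Counter()
    for rn, contest_id in enumerate(sorted(contest_ids), start=1):
        sizes[contest_id - rn] += 1
    return dict(sizes)


# Bob (user 2) won a medal in every contest except 193.
bob = [190, 191, 192, 194, 195, 196]
print(island_sizes(bob))  # {189: 3, 190: 3}
```

Both islands have length 3, matching the two streaks noted for Bob in the explanation; any island of size 3 or more qualifies the user.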

In [0]:

# Condition 1: users with a streak of three or more consecutive contests.
consecutive_df_1811 = grouped_df_1811\
    .groupBy("user_id", "grp").count().filter(col("count") >= 3) \
    .select("user_id").distinct()

In [0]:
# A user may satisfy both conditions; distinct() keeps each user once.
candidates_df_1811 = gold_winners_df_1811.union(consecutive_df_1811).distinct()

In [0]:
# Attach name and mail to the candidate IDs.
candidates_df_1811\
    .join(users_df_1811, on="user_id", how="inner") \
    .select("name", "mail").show()

+-----+------------------+
| name|              mail|
+-----+------------------+
|Sarah|sarah@leetcode.com|
|  Bob|  bob@leetcode.com|
|Alice|alice@leetcode.com|
|Quarz|quarz@leetcode.com|
+-----+------------------+

