## Importing Libraries

In [0]:
# Note: this wildcard import shadows Python built-ins such as round with Spark's column functions.
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**1633. Percentage of Users Attended a Contest (Easy)**

**Table: Users**

| Column Name | Type    |
|-------------|---------|
| user_id     | int     |
| user_name   | varchar |

user_id is the primary key (column with unique values) for this table.
Each row of this table contains the name and the id of a user.
 

**Table: Register**

| Column Name | Type    |
|-------------|---------|
| contest_id  | int     |
| user_id     | int     |

(contest_id, user_id) is the primary key (combination of columns with unique values) for this table.
Each row of this table contains the id of a user and the id of the contest they registered for.
 
**Write a solution to find the percentage of the users registered in each contest rounded to two decimals.**

Return the result table ordered by percentage in descending order. In case of a tie, order it by contest_id in ascending order.

The result format is in the following example.

**Example 1:**

**Input:**

**Users table:**

| user_id | user_name |
|---------|-----------|
| 6       | Alice     |
| 2       | Bob       |
| 7       | Alex      |

**Register table:**

| contest_id | user_id |
|------------|---------|
| 215        | 6       |
| 209        | 2       |
| 208        | 2       |
| 210        | 6       |
| 208        | 6       |
| 209        | 7       |
| 209        | 6       |
| 215        | 7       |
| 208        | 7       |
| 210        | 2       |
| 207        | 2       |
| 210        | 7       |

**Output:**

| contest_id | percentage |
|------------|------------|
| 208        | 100.0      |
| 209        | 100.0      |
| 210        | 100.0      |
| 215        | 66.67      |
| 207        | 33.33      |

**Explanation:**
- All users registered for contests 208, 209, and 210, so the percentage is 100% and these contests are sorted by contest_id in ascending order.
- Alice and Alex registered for contest 215, so the percentage is (2/3) * 100 = 66.67%.
- Bob registered for contest 207, so the percentage is (1/3) * 100 = 33.33%.
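
The arithmetic and ordering in the explanation can be checked with a few lines of plain Python. This is a minimal sketch, separate from the PySpark solution; the names `register`, `users_per_contest`, `percentages`, and `result` are illustrative, and the data is copied from Example 1. It calls `builtins.round` explicitly on the assumption that the notebook's wildcard import has replaced the bare name `round` with Spark's version.

```python
import builtins
from collections import defaultdict

# Example 1 data: (contest_id, user_id) pairs and the total number of users.
register = [
    (215, 6), (209, 2), (208, 2), (210, 6), (208, 6), (209, 7),
    (209, 6), (215, 7), (208, 7), (210, 2), (207, 2), (210, 7),
]
total_users = 3  # Alice, Bob, Alex

# Collect the distinct users registered for each contest.
users_per_contest = defaultdict(set)
for contest_id, user_id in register:
    users_per_contest[contest_id].add(user_id)

# percentage = registered users / all users * 100, rounded to two decimals.
# builtins.round is used so the sketch still works if round has been shadowed.
percentages = {
    contest_id: builtins.round(len(users) / total_users * 100, 2)
    for contest_id, users in users_per_contest.items()
}

# Sort by percentage descending, then contest_id ascending to break ties.
result = sorted(percentages.items(), key=lambda row: (-row[1], row[0]))
print(result)
# [(208, 100.0), (209, 100.0), (210, 100.0), (215, 66.67), (207, 33.33)]
```

Negating the percentage in the sort key gives descending order on the first field while keeping ascending order on `contest_id`.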

In [0]:
users_data_1633 = [
    (6, "Alice"),
    (2, "Bob"),
    (7, "Alex")
]
users_columns_1633 = ["user_id", "user_name"]
users_df_1633 = spark.createDataFrame(users_data_1633, users_columns_1633)
users_df_1633.show()

register_data_1633 = [
    (215, 6),
    (209, 2),
    (208, 2),
    (210, 6),
    (208, 6),
    (209, 7),
    (209, 6),
    (215, 7),
    (208, 7),
    (210, 2),
    (207, 2),
    (210, 7)
]
register_columns_1633 = ["contest_id", "user_id"]
register_df_1633 = spark.createDataFrame(register_data_1633, register_columns_1633)
register_df_1633.show()

+-------+---------+
|user_id|user_name|
+-------+---------+
|      6|    Alice|
|      2|      Bob|
|      7|     Alex|
+-------+---------+

+----------+-------+
|contest_id|user_id|
+----------+-------+
|       215|      6|
|       209|      2|
|       208|      2|
|       210|      6|
|       208|      6|
|       209|      7|
|       209|      6|
|       215|      7|
|       208|      7|
|       210|      2|
|       207|      2|
|       210|      7|
+----------+-------+



In [0]:
# Denominator for every percentage: the total number of users.
total_users = users_df_1633.count()

In [0]:
# Count the distinct users registered for each contest.
contest_users_df_1633 = (
    register_df_1633
    .groupBy("contest_id")
    .agg(countDistinct("user_id").alias("contest_user_count"))
)

In [0]:
# Convert each count to a percentage of all users, then sort as the problem requires.
(
    contest_users_df_1633
    .withColumn("percentage", round((col("contest_user_count") / total_users) * 100, 2))
    .select("contest_id", "percentage")
    .orderBy(col("percentage").desc(), col("contest_id").asc())
    .show()
)

+----------+----------+
|contest_id|percentage|
+----------+----------+
|       208|     100.0|
|       209|     100.0|
|       210|     100.0|
|       215|     66.67|
|       207|     33.33|
+----------+----------+
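
A side note on rounding: Spark's `round` is documented as using HALF_UP, while Python's built-in `round` rounds halves to even, so the two can disagree when a percentage lands exactly on a half. The sketch below illustrates the difference with the standard `decimal` module rather than Spark itself. The figures (1 registered user out of 800) are hypothetical and do not come from the example data.

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

# Hypothetical case: 1 registered user out of 800 gives exactly 0.125 percent,
# which sits halfway between 0.12 and 0.13 at two decimal places.
percentage = Decimal(1) / Decimal(800) * 100
two_places = Decimal("0.01")

# HALF_UP is the mode Spark's round is documented to use.
half_up = percentage.quantize(two_places, rounding=ROUND_HALF_UP)
# HALF_EVEN is the mode behind Python's built-in round (and Spark's bround).
half_even = percentage.quantize(two_places, rounding=ROUND_HALF_EVEN)

print(half_up, half_even)  # 0.13 0.12
```

For the Example 1 data no percentage falls on a half, so both modes give the same output shown above.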

