##### 176. Second Highest Salary

Write a SQL query to get the second highest salary from the `Employee` table.

##### Table: Employee

| Id  | Salary |
|-----|--------|
| 1   | 100    |
| 2   | 200    |
| 3   | 300    |

---

##### Expected Output

For example, given the above `Employee` table, the query should return **200** as the second highest salary.  
If there is no second highest salary, then the query should return **null**.

| SecondHighestSalary |
|---------------------|
| 200                 |


##### PySpark Solution

In [0]:
# Sample data matching the Employee table above
df_sal = spark.createDataFrame(data=[(1, 100), (2, 200), (3, 300)], schema=['Id', 'Salary'])
df_sal.display()

In [0]:
from pyspark.sql.functions import dense_rank, col
from pyspark.sql.window import Window

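# Rank over the whole table, highest salary first. Partitioning by 'Id' would put
# every row in its own partition, so each row would get rank 1 and the
# rank == 2 filter would never match.
windowSpec = Window.orderBy(col('Salary').desc())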

df_2nd_highest_sal = (df_sal
                      .withColumn('rank', dense_rank().over(windowSpec))
                      .filter(col("rank") == 2)
                      .withColumnRenamed("Salary", "SecondHighestSalary")
                      .drop("Id")
                      .drop("rank")
                      )

df_2nd_highest_sal.display()

In [0]:
window_spec = Window.orderBy(col("Salary").desc())

ranked_df = df_sal.withColumn("rank", dense_rank().over(window_spec))

# distinct() keeps a single row when several employees share the second highest salary
second_highest_df = ranked_df.filter(col("rank") == 2).select(col("Salary").alias("SecondHighestSalary")).distinct()

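# If no row has rank 2, return a single null row, as the problem requires.
# The explicit schema is needed because Spark cannot infer a column type from None alone.
if second_highest_df.count() == 0:
    result_df = spark.createDataFrame([(None,)], "SecondHighestSalary long")
else:
    result_df = second_highest_df

display(result_df)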