## Ranking Functions

Ranking functions assign a rank to each record within a partition, based on the ordering defined for the window.

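* Sparse Rank - `rank` (tied rows share a rank, and the ranks that follow are skipped)
* Dense Rank - `dense_rank` (tied rows share a rank, with no gaps afterwards)
* Assigning Row Numbers - `row_number` (every row gets a unique sequential number, even when values tie)
* Percentage Rank - `percent_rank` (relative rank of the row, between 0 and 1)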
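To make the difference between the four functions concrete, here is a minimal plain-Python sketch (not Spark code) that computes them by hand for a single partition. The delay values are made-up sample data, assumed to be already sorted in descending order, and the variable names are only illustrative.

```python
# Plain-Python sketch of the four ranking functions over one partition.
# The delay values are hypothetical sample data, already sorted descending.
delays = [90, 75, 75, 60, 60, 60, 30]

n = len(delays)
rows = []
dense = 0
for i, delay in enumerate(delays):
    row_num = i + 1                  # row_number: always unique, 1..n
    if i == 0 or delay != delays[i - 1]:
        sparse = row_num             # rank: jumps to the row position after a tie
        dense += 1                   # dense_rank: increases by exactly 1
    # percent_rank: (rank - 1) / (rows in partition - 1); 0.0 for a single row
    pct = (sparse - 1) / (n - 1) if n > 1 else 0.0
    rows.append((delay, sparse, dense, round(pct, 2), row_num))

# Columns: delay, rank, dense_rank, percent_rank, row_number
for row in rows:
    print(row)
```

In this sample, the two rows with a delay of 75 both get rank 2 and dense rank 2, while the next value (60) gets rank 4 but dense rank 3. Row numbers keep increasing regardless of ties.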

### Tasks

Let us perform a few tasks related to ranking.

In [None]:
airlines_path = "/public/airlines_all/airlines-part/flightmonth=200801"

In [None]:
airlines = spark. \
    read. \
    parquet(airlines_path)

In [None]:
from pyspark.sql.functions import col, lit, lpad, concat
from pyspark.sql.functions import rank, dense_rank
# Note: this `round` is Spark's column function; it shadows Python's built-in round.
from pyspark.sql.functions import percent_rank, row_number, round
from pyspark.sql.window import Window

In [None]:
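# Partition by flight date and origin airport; within each partition,
# order flights from the longest departure delay to the shortest.
spec = Window. \
    partitionBy("FlightDate", "Origin"). \
    orderBy(col("DepDelay").desc())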
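As a rough illustration of what a window specification like this means, the plain-Python sketch below groups rows by (FlightDate, Origin) and numbers them by descending delay within each group. It only mimics the semantics on a small in-memory list. It is not how Spark executes the window, and the sample rows are hypothetical.

```python
from itertools import groupby

# Hypothetical sample rows: (FlightDate, Origin, DepDelay)
flights = [
    ("20080101", "ATL", 45),
    ("20080101", "ORD", 120),
    ("20080101", "ATL", 80),
    ("20080102", "ATL", 15),
    ("20080101", "ORD", 30),
    ("20080101", "ATL", 80),
]


def partition_key(row):
    # Mirrors partitionBy("FlightDate", "Origin")
    return (row[0], row[1])


# Sort by partition key, then by delay descending (mirrors the orderBy clause)
ordered = sorted(flights, key=lambda r: (r[0], r[1], -r[2]))

ranked = []
for key, group in groupby(ordered, key=partition_key):
    # Numbering restarts at 1 for every partition, like row_number().over(spec)
    for position, row in enumerate(group, start=1):
        ranked.append(row + (position,))

for row in ranked:
    print(row)
```

The numbering restarts for each (FlightDate, Origin) pair, which is the key property of a partitioned window: ranks are only comparable within the same partition.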

In [None]:
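# Keep delayed, non-cancelled flights, build FlightDate as yyyyMMdd,
# then compute all four rankings over the same window specification.
airlines. \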
    filter("IsDepDelayed = 'YES' and Cancelled = 0"). \
    select(concat("Year", 
                  lpad("Month", 2, "0"), 
                  lpad("DayOfMonth", 2, "0")
                 ).alias("FlightDate"),
           "Origin",
           "UniqueCarrier",
           "FlightNum",
           "CRSDepTime",
           "IsDepDelayed",
           col("DepDelay").cast("int").alias("DepDelay")
          ). \
    withColumn("srank", rank().over(spec)). \
    withColumn("drank", dense_rank().over(spec)). \
    withColumn("prank", round(percent_rank().over(spec), 2)). \
    withColumn("rn", row_number().over(spec)). \
    orderBy("FlightDate", "Origin", col("DepDelay").desc()). \
    show()