# PySpark SQL desc() Function: Sorting in Descending Order

## Introduction to the `desc()` Function

The `desc()` function in PySpark returns a sort expression that orders a column in descending order. Passed to `orderBy()` or `sort()`, it sorts a DataFrame by one or more columns, much like the SQL `DESC` keyword in an `ORDER BY` clause. It is commonly used to rank or prioritize data, such as finding the top values.


## Basic Syntax:

```
DataFrame.orderBy(df.columnName.desc())
DataFrame.orderBy(desc("columnName"))
```

### Parameter:

- **`columnName`**: The column you want to sort in descending order.


## Why Use `desc()`?

- Sorting in descending order is useful when you want to display the largest, most recent, or highest values first.
- It is particularly useful for ranking, leaderboards, financial data analysis (e.g., showing top sales), or highlighting the top performers in any dataset.


## Practical Examples

### 1. Sorting a Column in Descending Order

**Scenario**: You have a DataFrame with sales data, and you want to sort it by `SALES` in descending order using `desc()`.

**Code Example**:

```python
from pyspark.sql.functions import desc

df = spark.createDataFrame([
    ("ItemA", 100),
    ("ItemB", 200),
    ("ItemA", 300),
    ("ItemC", 400),
    ("ItemB", 500)
], ["ITEM", "SALES"])

# Sort by SALES in descending order using desc()
df.orderBy(df.SALES.desc()).show()
```

```text
+-----+-----+
| ITEM|SALES|
+-----+-----+
|ItemB|  500|
|ItemC|  400|
|ItemA|  300|
|ItemB|  200|
|ItemA|  100|
+-----+-----+
```



### 2. Sorting Multiple Columns with Descending Order

**Scenario**: You want to sort by `ITEM` in ascending order and then by `SALES` in descending order.

**Code Example**:

```python
# Sort by ITEM in ascending and SALES in descending order using desc()
df.orderBy("ITEM", df.SALES.desc()).show()
```


### 3. Using `desc()` with Aggregations

**Scenario**: You want to group by `ITEM` and calculate the total sales, and then sort the result by `TOTAL_SALES` in descending order.

**Code Example**:

```python
# Alias Spark's sum so it does not shadow Python's built-in sum()
from pyspark.sql.functions import desc, sum as spark_sum

# Group by ITEM, sum SALES, and sort by TOTAL_SALES in descending order
(
    df.groupBy("ITEM")
    .agg(spark_sum("SALES").alias("TOTAL_SALES"))
    .orderBy(desc("TOTAL_SALES"))
    .show()
)
```

```text
+-----+-----------+
| ITEM|TOTAL_SALES|
+-----+-----------+
|ItemB|        700|
|ItemA|        400|
|ItemC|        400|
+-----+-----------+
```



### 4. Using `desc()` with Column Expressions

**Scenario**: You want to sort by a calculated value in descending order, such as `SALES` increased by 10%. The expression is used only for ordering, so the output still shows the original `SALES` values.

**Code Example**:

```python
from pyspark.sql.functions import expr

# Sort by SALES after increasing it by 10% in descending order
df.orderBy(expr("SALES * 1.1").desc()).show()
```

```text
+-----+-----+
| ITEM|SALES|
+-----+-----+
|ItemB|  500|
|ItemC|  400|
|ItemA|  300|
|ItemB|  200|
|ItemA|  100|
+-----+-----+
```



### 5. Handling Null Values with Descending Order

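**Scenario**: You want to sort by `SALES` in descending order and control where null values appear. `desc_nulls_last()` places nulls after all non-null values, which is also what plain `desc()` does by default in Spark, while `desc_nulls_first()` places them before.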

**Code Example**:

```python
df_with_nulls = spark.createDataFrame([
    ("ItemA", 100),
    ("ItemB", None),
    ("ItemA", 300),
    ("ItemC", 400),
    ("ItemB", None)
], ["ITEM", "SALES"])

# Sort by SALES in descending order and place nulls last
df_with_nulls.orderBy(df_with_nulls.SALES.desc_nulls_last()).show()
```

```text
+-----+-----+
| ITEM|SALES|
+-----+-----+
|ItemC|  400|
|ItemA|  300|
|ItemA|  100|
|ItemB| null|
|ItemB| null|
+-----+-----+
```



### 6. Sorting Strings in Descending Order

**Scenario**: You want to sort a column of strings in descending alphabetical order using `desc()`.

**Code Example**:

```python
df_items = spark.createDataFrame([
    ("Apple", 100),
    ("Banana", 200),
    ("Cherry", 300),
    ("Date", 400),
    ("Elderberry", 500)
], ["FRUIT", "QUANTITY"])

# Sort by FRUIT in descending alphabetical order
df_items.orderBy(df_items.FRUIT.desc()).show()
```

```text
+----------+--------+
|     FRUIT|QUANTITY|
+----------+--------+
|Elderberry|     500|
|      Date|     400|
|    Cherry|     300|
|    Banana|     200|
|     Apple|     100|
+----------+--------+
```

