# Conditional Logic in PySpark

- **How to use expr() with a CASE statement**

- **How to store CASE logic in a variable**

- **How to use selectExpr()**

- **How to use PySpark’s native when() function**

- **How to apply conditional calculations**

In [0]:
df = spark.read.table('ytdemo.data.global_superstore')
df.display()

## Method 1: Using expr() with CASE Statement

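The goal is to assign each order to a profit bucket:

- Profit < 0 → Loss

- Profit between 0 and 100 → Low Profit

- Profit between 101 and 500 → Medium Profit

- Otherwise → High Profit

Both ranges are inclusive. Because the second range starts at 101, a fractional profit such as 100.50 matches neither range and falls through to High Profit.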
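To sanity-check the bucket boundaries before running anything on Spark, the rules can be mirrored in plain Python. This is only an illustrative sketch: `profit_bucket` is a hypothetical helper, not part of PySpark, and it assumes the CASE semantics used in this notebook (inclusive `BETWEEN`, first matching branch wins, a NULL profit matching no branch).

```python
def profit_bucket(profit):
    """Plain-Python mirror of the CASE rules (hypothetical helper for edge-case checks)."""
    # In SQL, a comparison with NULL is never true, so a NULL profit reaches ELSE.
    if profit is None:
        return "High Profit"
    if profit < 0:
        return "Loss"
    if 0 <= profit <= 100:      # BETWEEN 0 AND 100 is inclusive on both ends
        return "Low Profit"
    if 101 <= profit <= 500:    # BETWEEN 101 AND 500 is inclusive on both ends
        return "Medium Profit"
    return "High Profit"


# Print the bucket for a handful of boundary values.
for value in (-5, 0, 100, 100.5, 101, 500, 500.01, None):
    print(value, "->", profit_bucket(value))
```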

In [0]:
from pyspark.sql.functions import expr

In [0]:
# Add a Profit_Bucket column computed by a SQL CASE expression.
df.withColumn(
    "Profit_Bucket",
    expr("""
        CASE
            WHEN Profit < 0 THEN 'Loss'
            WHEN Profit BETWEEN 0 AND 100 THEN 'Low Profit'
            WHEN Profit BETWEEN 101 AND 500 THEN 'Medium Profit'
            ELSE 'High Profit'
        END
    """)
).display()

In [0]:
v_sql_string = """
    CASE 
        WHEN Profit < 0 THEN 'Loss' 
        WHEN Profit BETWEEN 0 AND 100 THEN 'Low Profit' 
        WHEN Profit BETWEEN 101 AND 500 THEN 'Medium Profit' 
        ELSE 'High Profit'
    END
"""

In [0]:
df.select(
    "Order_id",
    "Customer_Name",
    "Ship_mode",
    "Profit",
    expr(v_sql_string).alias("Profit_Bucket")
).display()
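Because the CASE logic is just a string, it can also be assembled from data instead of being typed out by hand. The sketch below is one possible way to do that. `build_case_expr` and `PROFIT_RULES` are hypothetical names introduced here, and the helper assumes labels contain no single quotes, so no SQL escaping is attempted.

```python
def build_case_expr(rules, default):
    """Build a SQL CASE expression from (condition, label) pairs (hypothetical helper)."""
    # Reject labels that would need escaping; this sketch does not escape quotes.
    for label in [lbl for _, lbl in rules] + [default]:
        if "'" in label:
            raise ValueError(f"label must not contain a single quote: {label!r}")
    branches = " ".join(f"WHEN {cond} THEN '{lbl}'" for cond, lbl in rules)
    return f"CASE {branches} ELSE '{default}' END"


# Same rules as the hand-written CASE statement, expressed as data.
PROFIT_RULES = [
    ("Profit < 0", "Loss"),
    ("Profit BETWEEN 0 AND 100", "Low Profit"),
    ("Profit BETWEEN 101 AND 500", "Medium Profit"),
]

case_sql = build_case_expr(PROFIT_RULES, default="High Profit")
print(case_sql)
```

The resulting string can be passed to `expr()` in the same way as a hand-written one, for example `expr(case_sql).alias("Profit_Bucket")`.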

### Using selectExpr()

In [0]:
# selectExpr() accepts SQL expressions directly, so no expr() wrapper is needed.
df.selectExpr(
    "Order_id",
    "Customer_Name",
    "Ship_mode",
    "Profit",
    """
    CASE
        WHEN Profit < 0 THEN 'Loss'
        WHEN Profit BETWEEN 0 AND 100 THEN 'Low Profit'
        WHEN Profit BETWEEN 101 AND 500 THEN 'Medium Profit'
        ELSE 'High Profit'
    END AS Profit_Bucket
    """
).display()

### Using Native PySpark when() Functions

In [0]:
from pyspark.sql.functions import when

In [0]:
# Chain when() calls; the first matching condition wins, otherwise() is the fallback.
df.withColumn(
    'Profit_Bucket',
    when(df.Profit < 0, 'Loss')
    .when((df.Profit >= 0) & (df.Profit <= 100), 'Low Profit')
    .when((df.Profit >= 101) & (df.Profit <= 500), 'Medium Profit')
    .otherwise('High Profit')
).display()

### Real Business Example: Conditional Calculation

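Conditional logic can also drive a calculation. Here the discount amount depends on the customer segment:

- Consumer → 5% of Sales

- Corporate → 10% of Sales

- Home Office → 8% of Sales

- Otherwise → 0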

In [0]:
df.withColumn(
    'Discount_Amount',
    expr("""
        CASE 
            WHEN Segment = 'Consumer' THEN Sales * 0.05
            WHEN Segment = 'Corporate' THEN Sales * 0.10
            WHEN Segment = 'Home Office' THEN Sales * 0.08
            ELSE 0
        END
    """)
).display()
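The rates can also be kept in one place and the expected values checked without a cluster. The following is a minimal sketch in plain Python. `SEGMENT_RATES` and `discount_amount` are hypothetical names introduced for illustration, and the function simply mirrors the CASE expression above, including the `ELSE 0` fallback for unknown or missing segments.

```python
# Discount rate per customer segment, taken from the rules listed above.
SEGMENT_RATES = {
    "Consumer": 0.05,
    "Corporate": 0.10,
    "Home Office": 0.08,
}


def discount_amount(segment, sales):
    """Mirror of the CASE expression: rate * sales, or 0 for any other segment."""
    # dict.get falls back to 0 for unknown segments and for None (SQL NULL),
    # matching the ELSE 0 branch. The lookup is an exact, case-sensitive match.
    return sales * SEGMENT_RATES.get(segment, 0)


print(discount_amount("Corporate", 200.0))
print(discount_amount("Unknown", 200.0))
```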