Problem Statement:

You have a table named seats with two columns, id and student, representing a seating arrangement for students. Your goal is to rearrange the students so that:

Odd-numbered seats (i.e., where id is odd) are filled with the student seated in the next seat (i.e., the student from the seat with the next higher id). If an odd-numbered seat is the last seat (has no next seat), it keeps the same student.

Even-numbered seats (i.e., where id is even) are filled with the student seated in the previous seat (i.e., the student from the seat with the next lower id).
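
Before turning to Spark, the rule can be sketched in plain Python. This is only an illustration, not part of the notebook's solution: the helper name swap_seats is hypothetical, and it assumes the ids are consecutive integers starting at 1.

```python
def swap_seats(seats):
    """Return (id, student) pairs with each odd/even pair of seats exchanged.

    Assumption: ids are consecutive integers starting at 1.
    """
    by_id = dict(seats)
    result = []
    for seat_id, student in seats:
        if seat_id % 2 == 1:
            # Odd seat: take the next seat's student, or keep the current
            # student when there is no next seat (last seat).
            result.append((seat_id, by_id.get(seat_id + 1, student)))
        else:
            # Even seat: take the previous seat's student.
            result.append((seat_id, by_id[seat_id - 1]))
    return result


example = [(1, "Amit"), (2, "Deepa"), (3, "Rohit")]
print(swap_seats(example))  # [(1, 'Deepa'), (2, 'Amit'), (3, 'Rohit')]
```

With three seats, seats 1 and 2 trade students and seat 3, having no next seat, keeps its own.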

In [0]:
# Define the data
data = [
    (1, "Amit"),
    (2, "Deepa"),
    (3, "Rohit"),
    (4, "Anjali"),
    (5, "Neha"),
    (6, "Sanjay"),
    (7, "Priya"),
]

# Define the schema
columns = ["id", "student"]

# Create a DataFrame
df = spark.createDataFrame(data, schema=columns)

# Display the DataFrame in the Databricks notebook
df.display()

id,student
1,Amit
2,Deepa
3,Rohit
4,Anjali
5,Neha
6,Sanjay
7,Priya


In [0]:
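# Register a session-scoped temp view so the SQL cell can query it as `seats`.
# (A global temp view would have to be referenced as global_temp.seats.)
df.createOrReplaceTempView("seats")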

In [0]:
%sql
with cte as (
  select
    *,
    lead(student) over(
      order by
        id
    ) as ld,
    lag(student) over(
      order by
        id
    ) as lg
  from
    seats
)
select
  id,
  case
    when id % 2 = 1 then coalesce(ld, student)
    else lg
  end as swapped_student
from
  cte

id,swapped_student
1,Deepa
2,Amit
3,Anjali
4,Rohit
5,Sanjay
6,Neha
7,Priya


In [0]:
from pyspark.sql.window import Window
import pyspark.sql.functions as F

# Define a window specification that orders by 'id'
window_spec = Window.orderBy("id")

# Apply LEAD and LAG functions
df_with_lag_lead = df.withColumn("ld", F.lead("student").over(window_spec)) \
                     .withColumn("lg", F.lag("student").over(window_spec))

# Apply CASE logic with when and coalesce
df_swapped = df_with_lag_lead.withColumn(
    "swapped_student",
    F.when(F.col("id") % 2 == 1, F.coalesce(F.col("ld"), F.col("student")))  # If id is odd
     .otherwise(F.col("lg"))  # If id is even
)

# Select the relevant columns
result_df = df_swapped.select("id", "swapped_student")

# display the result
result_df.display()


id,swapped_student
1,Deepa
2,Amit
3,Anjali
4,Rohit
5,Sanjay
6,Neha
7,Priya


Explanation:

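Window Specification: Defines the window for LEAD and LAG, ordered by id. No partitionBy is given, so Spark evaluates the window over all rows in a single partition, which is fine for a table this small.
withColumn: Adds two columns: ld (the next seat's student, from lead) and lg (the previous seat's student, from lag).
when and otherwise: Implement the CASE logic: odd ids take ld, even ids take lg.
coalesce: Returns its first non-null argument, so when lead returns null for the last seat, the current student value is retained.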
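
To make the intermediate ld and lg values visible, the same steps can be traced without Spark. The sketch below is a plain-Python emulation, not Spark's implementation: the helpers lead, lag, and coalesce are hypothetical stand-ins that only mimic the behaviour of the corresponding Spark functions over a list already sorted by id.

```python
def lead(values, i):
    # Next value in the ordered list, or None when there is no next row.
    return values[i + 1] if i + 1 < len(values) else None


def lag(values, i):
    # Previous value in the ordered list, or None when there is no previous row.
    return values[i - 1] if i - 1 >= 0 else None


def coalesce(*args):
    # First argument that is not None, or None if all are None.
    return next((a for a in args if a is not None), None)


rows = [
    (1, "Amit"), (2, "Deepa"), (3, "Rohit"), (4, "Anjali"),
    (5, "Neha"), (6, "Sanjay"), (7, "Priya"),
]
rows.sort(key=lambda r: r[0])  # plays the role of the ORDER BY id window
students = [student for _, student in rows]

trace = []
for i, (seat_id, student) in enumerate(rows):
    ld = lead(students, i)
    lg = lag(students, i)
    # Same CASE logic as the query: odd ids take ld (or keep student), even ids take lg.
    swapped = coalesce(ld, student) if seat_id % 2 == 1 else lg
    trace.append((seat_id, student, ld, lg, swapped))

for row in trace:
    print(row)
```

The last row comes out as (7, 'Priya', None, 'Sanjay', 'Priya'): ld is None because seat 7 has no next seat, so coalesce falls back to the current student.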