## Importing Libraries

In [0]:
# Wildcard import: shadows Python built-ins such as min and max with Spark column functions
from pyspark.sql.functions import *
from pyspark.sql.types import *
from pyspark.sql.window import Window

**3140. Consecutive Available Seats II (Medium)**

**Table: Cinema**

| Column Name | Type |
|-------------|------|
| seat_id     | int  |
| free        | bool |

seat_id is an auto-increment column for this table.
Each row of this table indicates whether the ith seat is free or not. 1 means free while 0 means occupied.

**Write a solution to find the length of the longest consecutive sequence of available seats in the cinema.**

**Note:**
- There will always be at most one longest consecutive sequence.
- If there are multiple consecutive sequences with the same length, include all of them in the output.

Return the result table ordered by first_seat_id in ascending order.

The result format is in the following example.

**Example:**

**Input:**

**Cinema table:**

| seat_id | free |
|---------|------|
| 1       | 1    |
| 2       | 0    |
| 3       | 1    |
| 4       | 1    |
| 5       | 1    |

**Output:**

| first_seat_id   | last_seat_id   | consecutive_seats_len |
|-----------------|----------------|-----------------------|
| 3               | 5              | 3                     |

**Explanation:**
- The longest consecutive sequence of available seats starts at seat 3 and ends at seat 5, with a length of 3.

The output table is ordered by first_seat_id in ascending order.
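
As a cross-check that does not depend on Spark, the expected result can be sketched in plain Python. The helper name `longest_free_runs` and the list-of-tuples input are assumptions made for illustration and are not part of the problem statement; a gap in seat_id values is treated as ending a run.

```python
def longest_free_runs(seats):
    """Return (first_seat_id, last_seat_id, length) for every longest run of free seats.

    `seats` is a list of (seat_id, free) tuples; hypothetical helper for illustration.
    """
    runs = []
    start = prev = None
    for seat_id, free in sorted(seats):
        if free == 1 and prev is not None and seat_id == prev + 1:
            prev = seat_id  # extend the current run
            continue
        if start is not None:
            runs.append((start, prev, prev - start + 1))  # close the finished run
        # Start a new run on a free seat, or reset on an occupied one
        start = prev = seat_id if free == 1 else None
    if start is not None:
        runs.append((start, prev, prev - start + 1))
    if not runs:
        return []
    best = max(r[2] for r in runs)
    # Keep all runs tied for the maximum length, already in first_seat_id order
    return [r for r in runs if r[2] == best]


print(longest_free_runs([(1, 1), (2, 0), (3, 1), (4, 1), (5, 1)]))  # [(3, 5, 3)]
```

With two runs of equal length, for example `[(1, 1), (2, 1), (3, 0), (4, 1), (5, 1)]`, the sketch returns both runs, matching the note about ties.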

In [0]:
cinema_data_3140 = [
    (1, 1),
    (2, 0),
    (3, 1),
    (4, 1),
    (5, 1),
]

cinema_columns_3140 = ["seat_id", "free"]
cinema_df_3140 = spark.createDataFrame(cinema_data_3140, cinema_columns_3140)
cinema_df_3140.show()

+-------+----+
|seat_id|free|
+-------+----+
|      1|   1|
|      2|   0|
|      3|   1|
|      4|   1|
|      5|   1|
+-------+----+



In [0]:
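# Order all rows by seat_id; with no partitionBy, Spark moves the data to a single partition (fine for this small table)
window_spec = Window.orderBy("seat_id")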

In [0]:
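# Gaps-and-islands: among free seats, seat_id - row_number is constant within each consecutive run
free_df_3140 = (
    cinema_df_3140
    .filter(col("free") == 1)
    .withColumn("rn", row_number().over(window_spec))
    .withColumn("grp", col("seat_id") - col("rn"))
)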
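
The `grp` column works because, within a run of consecutive free seats, seat_id and the row number both increase by 1, so their difference stays constant; every occupied seat that is skipped raises the difference for all later runs. A minimal plain-Python sketch of the same idea on the example data follows (the variable names are illustrative only and do not appear in the Spark code):

```python
from itertools import groupby

# Example data from the Cinema table: (seat_id, free)
seats = [(1, 1), (2, 0), (3, 1), (4, 1), (5, 1)]

# Keep free seats in seat_id order, mirroring the filter plus Window.orderBy("seat_id")
free_ids = sorted(seat_id for seat_id, free in seats if free == 1)

# Pair each seat with its 1-based row number and compute grp = seat_id - rn
keyed = [(seat_id, rn, seat_id - rn) for rn, seat_id in enumerate(free_ids, start=1)]
print(keyed)  # [(1, 1, 0), (3, 2, 1), (4, 3, 1), (5, 4, 1)]

# Seats sharing a grp value form one consecutive run
islands = {grp: [s for s, _, _ in rows] for grp, rows in groupby(keyed, key=lambda t: t[2])}
print(islands)  # {0: [1], 1: [3, 4, 5]}
```

Seat 1 lands in its own group, while seats 3 to 5 share `grp = 1`, which is the run the later aggregation reports.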



In [0]:
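# Collapse each run into its first seat, last seat, and length
# (min, max, and count here are the Spark functions from the wildcard import)
seq_df_3140 = (
    free_df_3140
    .groupBy("grp")
    .agg(
        min("seat_id").alias("first_seat_id"),
        max("seat_id").alias("last_seat_id"),
        count("*").alias("consecutive_seats_len"),
    )
)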



In [0]:
# Pull the maximum run length to the driver as a Python value (None if there are no free seats)
max_len_3140 = (
    seq_df_3140
    .agg(max("consecutive_seats_len").alias("max_len"))
    .collect()[0]["max_len"]
)



In [0]:
# Keep every run that ties for the maximum length; display() is the Databricks notebook renderer
(
    seq_df_3140
    .filter(col("consecutive_seats_len") == max_len_3140)
    .select("first_seat_id", "last_seat_id", "consecutive_seats_len")
    .orderBy("first_seat_id")
    .display()
)



first_seat_id,last_seat_id,consecutive_seats_len
3,5,3
