# Challenge 1

```yaml
title: "Prep Air's Flow Card"
week: 1
posted_on: 2024-01-03
created_on: 2024-03-07
last_updated: 2024-03-08
input: [
    "PD 2024 Wk 1 Input.csv"
]
output: [
    "passenger_flight_details.ndjson"
]
```

## Set up the notebook

In [1]:
import polars as pl

from src.challenge1 import run_pipeline

## Parameters

In [2]:
INPUT_CSV = "input/PD 2024 Wk 1 Input.csv"
OUTPUT_NDJSON = "output/passenger_flight_details.ndjson"

## Run the pipeline

Load the data.

In [3]:
data = pl.scan_csv(INPUT_CSV)

Preview the input data.

In [4]:
data.collect().sample(8)

| Flight Details | Flow Card? | Bags Checked | Meal Type |
| --- | --- | --- | --- |
| str | i64 | i64 | str |
| "2024-05-10//PA…" | 0 | 1 | "Nut Free" |
| "2024-08-23//PA…" | 0 | 0 | "Egg Free" |
| "2024-08-09//PA…" | 1 | 1 | "Nut Free" |
| "2024-12-11//PA…" | 0 | 0 | "Egg Free" |
| "2024-08-15//PA…" | 1 | 1 | "Nut Free" |
| "2024-01-16//PA…" | 0 | 1 | "Dairy Free" |
| "2024-01-03//PA…" | 1 | 2 | "Dairy Free" |
| "2024-05-26//PA…" | 1 | 1 | "Egg Free" |


Run the transformation pipeline.

In [5]:
transformed_data = data.pipe(run_pipeline)
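
The body of `run_pipeline` lives in `src.challenge1` and is not shown in this notebook. Judging only from the input and output columns, it appears to split `Flight Details` into separate typed fields and rename the remaining columns. The following plain-Python sketch illustrates that per-row logic under stated assumptions: the `//` field separator matches the preview, but the field order and the `from-to` route format are guesses, `transform_row` is a hypothetical helper, and the sample string is made up. The real pipeline presumably uses Polars expressions on the `LazyFrame`, and the `id` column is left out here because its origin is not visible.

```python
from datetime import date


def transform_row(row: dict) -> dict:
    """Turn one input row into the shape of one output row.

    ASSUMPTION: "Flight Details" is laid out as
    date//flight_number//from-to//class//price.
    """
    flight_date, flight_number, route, flight_class, price = row[
        "Flight Details"
    ].split("//")
    # Split on the first hyphen only, so "New York-Tokyo" keeps its space.
    origin, destination = route.split("-", maxsplit=1)
    return {
        "date": date.fromisoformat(flight_date),
        "flight_number": flight_number,
        "from": origin,
        "to": destination,
        "class": flight_class,
        "price": float(price),
        "has_flow_card": bool(row["Flow Card?"]),  # 0/1 flag to bool
        "number_of_bags_checked": row["Bags Checked"],
        "meal_type": row["Meal Type"],
    }


# Made-up sample row; not taken from the input file.
sample_row = {
    "Flight Details": "2024-01-01//PA000//New York-Tokyo//Economy//100.5",
    "Flow Card?": 1,
    "Bags Checked": 2,
    "Meal Type": "Vegan",
}
print(transform_row(sample_row))
```
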

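Everything up to this point is lazy: `pl.scan_csv` returns a `LazyFrame`, so nothing has been computed yet. This is the end of the activity, so collect the data to materialize the result.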

In [6]:
collected_data = transformed_data.collect()

Preview the output data.

In [7]:
collected_data.sample(8)

| id | date | flight_number | from | to | class | price | has_flow_card | number_of_bags_checked | meal_type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| u32 | date | str | str | str | str | f64 | bool | i64 | str |
| 2716 | 2024-02-19 | "PA010" | "Tokyo" | "New York" | "Premium Econom…" | 1502.5 | False | 0 | "Nut Free" |
| 661 | 2024-07-22 | "PA010" | "Tokyo" | "New York" | "Premium Econom…" | 1540.0 | False | 0 | "Dairy Free" |
| 1742 | 2024-10-14 | "PA009" | "New York" | "Tokyo" | "Economy" | 1435.0 | False | 1 | "Dairy Free" |
| 2293 | 2024-06-01 | "PA012" | "Tokyo" | "Perth" | "First Class" | 647.0 | False | 2 | "Vegan" |
| 436 | 2024-04-04 | "PA010" | "Tokyo" | "New York" | "First Class" | 266.0 | False | 1 | "Dairy Free" |
| 3644 | 2024-12-26 | "PA006" | "Tokyo" | "London" | "Premium Econom…" | 1060.0 | True | 1 | "Vegan" |
| 3415 | 2024-01-13 | "PA002" | "New York" | "London" | "Economy" | 3040.0 | False | 0 | "None" |
| 2943 | 2024-07-30 | "PA011" | "Perth" | "Tokyo" | "Premium Econom…" | 1305.0 | False | 1 | "Vegetarian" |


Output the data.

In [8]:
collected_data.write_ndjson(OUTPUT_NDJSON)
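
For readers unfamiliar with the output format: NDJSON (newline-delimited JSON) stores one JSON object per line, so the file can be read back row by row without parsing it as a single document. The sketch below uses only the standard library to show that layout. The records are made up, and `dump_ndjson` and `load_ndjson` are hypothetical helpers for illustration, not part of Polars or of this project.

```python
import io
import json

# Made-up records that only mimic a few of the output columns.
records = [
    {"id": 1, "flight_number": "PA000", "has_flow_card": True},
    {"id": 2, "flight_number": "PA001", "has_flow_card": False},
]


def dump_ndjson(rows, handle):
    """Write each row as one JSON object on its own line."""
    for row in rows:
        handle.write(json.dumps(row) + "\n")


def load_ndjson(handle):
    """Read rows back, one JSON object per non-empty line."""
    return [json.loads(line) for line in handle if line.strip()]


# An in-memory buffer stands in for the output file.
buffer = io.StringIO()
dump_ndjson(records, buffer)
print(buffer.getvalue())

buffer.seek(0)
round_tripped = load_ndjson(buffer)
```
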