### Cleaning Data: Pandas

The following table must be cleaned:
```
            From_To  FlightNumber  RecentDelays              Airline
0      LoNDon_paris       10045.0      [23, 47]               KLM(!)
1      MAdrid_miLAN           NaN            []    <Air France> (12)
2  londON_StockhOlm       10065.0  [24, 43, 87]  (British Airways. )
3    Budapest_PaRis           NaN          [13]       12. Air France
4   Brussels_londOn       10085.0      [67, 32]          "Swiss Air"
```

```
Inspect the data
  - Look for missing values (NaN)
  - Identify inconsistent formats (e.g., "abc" vs 'abc', or strings vs numbers)

Fix broken strings or lists
  - Split strings if needed (e.g., 'a,b,c' → ['a', 'b', 'c'])
  - Use .str.strip(), .str.replace(), etc. to clean text

Handle missing data
  - Use df.isnull() and df.fillna() or df.dropna()
  - Decide whether to fill missing values (with mean, median, etc.) or drop them

Standardize formats
  - Convert all dates to datetime format
  - Convert all numeric fields to the correct type (int, float)

Remove duplicates or irrelevant rows

Rename columns or fix typos
```
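The checklist above can be sketched end to end on a small example. The DataFrame below is hypothetical toy data (the column names `name`, `joined`, and `score` are assumptions made for illustration, not part of the flight table), and filling with the median is just one of the options the checklist mentions.

```python
import pandas as pd

# Hypothetical toy data, not the flight table above.
raw = pd.DataFrame({
    "name": ["  alice ", "BOB", "BOB", None],
    "joined": ["2021-01-05", "2021-02-10", "2021-02-10", "2021-03-15"],
    "score": ["10", "20", "20", None],
})

# Inspect: count missing values per column.
missing_per_column = raw.isnull().sum()

# Fix strings: trim whitespace and normalize capitalization.
clean = raw.copy()
clean["name"] = clean["name"].str.strip().str.title()

# Standardize formats: parse dates and convert numeric strings to numbers.
clean["joined"] = pd.to_datetime(clean["joined"], format="%Y-%m-%d")
clean["score"] = pd.to_numeric(clean["score"])

# Handle missing data: fill numeric gaps with the median,
# and drop rows that have no name at all.
clean["score"] = clean["score"].fillna(clean["score"].median())
clean = clean.dropna(subset=["name"])

# Remove duplicate rows and rename a column.
clean = (
    clean.drop_duplicates()
    .rename(columns={"joined": "joined_on"})
    .reset_index(drop=True)
)
```

The order matters here: text is normalized before `drop_duplicates()` so that rows differing only in whitespace or capitalization are recognized as duplicates.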

```python
import numpy as np
import pandas as pd

df9 = pd.DataFrame({'From_To': ['LoNDon_paris', 'MAdrid_miLAN', 'londON_StockhOlm',
                                'Budapest_PaRis', 'Brussels_londOn'],
                    'FlightNumber': [10045, np.nan, 10065, np.nan, 10085],
                    'RecentDelays': [[23, 47], [], [24, 43, 87], [13], [67, 32]],
                    'Airline': ['KLM(!)', '<Air France> (12)', '(British Airways. )',
                                '12. Air France', '"Swiss Air"']})

# Split 'From_To' into separate city columns and normalize capitalization.
tem = df9['From_To'].str.split('_', expand=True)
tem.columns = ['From', 'To']
tem['From'] = tem['From'].str.capitalize()
tem['To'] = tem['To'].str.capitalize()

# Replace the combined column with the two cleaned columns.
df9 = df9.drop('From_To', axis=1)
df9 = pd.concat([df9, tem], axis=1)

# Fill missing flight numbers with 0 as a placeholder.
df9['FlightNumber'] = df9['FlightNumber'].fillna(0)

# Keep only letters and whitespace in airline names, then trim leftover spaces.
df9['Airline'] = df9['Airline'].str.replace(r'[^a-zA-Z\s]', '', regex=True).str.strip()
df9
```

```
   FlightNumber  RecentDelays          Airline      From         To
0       10045.0      [23, 47]              KLM    London      Paris
1           0.0            []       Air France    Madrid      Milan
2       10065.0  [24, 43, 87]  British Airways    London  Stockholm
3           0.0          [13]       Air France  Budapest      Paris
4       10085.0      [67, 32]        Swiss Air  Brussels     London
```
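Two columns in the result could arguably be cleaned further. The sketch below shows one possible approach, not the method used above. It assumes that the flight numbers step evenly from row to row (which the known values 10045, 10065, 10085 suggest but do not prove), so linear interpolation is a plausible alternative to filling with 0. It also expands the `RecentDelays` lists into separate columns. The names `flights`, `delays`, and `delay_1` to `delay_3` are hypothetical.

```python
import numpy as np
import pandas as pd

flights = pd.DataFrame({
    "FlightNumber": [10045, np.nan, 10065, np.nan, 10085],
    "RecentDelays": [[23, 47], [], [24, 43, 87], [13], [67, 32]],
})

# Assumption: flight numbers increase evenly, so each gap can be filled
# with the midpoint of its neighbours. With no NaN left, cast to int.
flights["FlightNumber"] = flights["FlightNumber"].interpolate().astype(int)

# Expand each list of delays into its own columns; shorter lists are
# padded with NaN so every row has the same number of columns.
delays = pd.DataFrame(flights["RecentDelays"].tolist(), index=flights.index)
delays.columns = [f"delay_{i + 1}" for i in range(delays.shape[1])]

# Swap the list column for the expanded columns.
flights = pd.concat([flights.drop(columns="RecentDelays"), delays], axis=1)
```

Whether interpolation is appropriate depends on what the missing values mean. If a missing flight number is genuinely unknown, a placeholder or a dropped row may be the safer choice.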
