# **Travel Trip dataset - ETL Pipeline**


Structured ETL pipeline to clean, transform, and enrich the traveler trip dataset.
Includes enrichment with fuel price and temperature data, derived features (destination country, age group, climate category), and visualizations.

## 🌍 Project Objectives – Travel Insights & Cost Analysis

This project explores travel behaviors, preferences, and cost patterns using a rich dataset of traveler demographics, trip details, accommodation, and transport types. The ultimate goal is to generate insights for travel agencies, transport providers, and tourism planners to enhance service design, bundling, and personalization.

This notebook showcases how the dataset can help answer strategic questions and validate hypotheses using both statistical and visual analytics.

## 📌 Key Goals:

- Understand destination preferences by demographic groups.

- Analyze how trip duration, age, and gender affect travel behavior.

- Examine cost variations across accommodation and transportation types.

- Investigate relationships between fuel prices, temperature, and travel cost over time (from external datasets).

- Support product bundling strategies through multi-attribute analysis.

## 🔍 Key Questions

- Which destinations are most popular across age, gender, and nationality?

- How does trip duration vary with age or gender?

- Are accommodation and transport preferences linked to demographics?

- How have transportation costs evolved, and how are they affected by fuel price and climate?

- Which regions or cities show the strongest travel demand?

- Can we categorize travel behaviors into actionable segments for agencies?

## ✅ **Hypotheses**

| ID | Hypothesis |
|----|------------|
| H1 | Gender significantly influences destination preference |
| H2 | Age group affects trip duration |
| H3 | Age group has a significant correlation with accommodation cost |
| H4 | Age group is associated with destination preference |
| H5 | Age group influences accommodation type |
| H6 | Transportation cost is significantly affected by fuel price |
| H7 | There is a strong association between fuel price and average destination temperature |
| H8 | Fuel prices increased post-2022 war; dropped during 2020–21 COVID years |

Note: The pre-formulated hypothesis on nationality vs. destination was excluded due to sparse entries.
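
Hypotheses such as H1 relate two categorical variables, so one common way to check them is a chi-square test of independence on a contingency table. The sketch below is only an illustration: the sample rows are made up (not taken from the project data), the 0.05 threshold is an assumed convention, and with counts this small the test itself would not be reliable.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Made-up sample for illustration only; not taken from the project data.
sample = pd.DataFrame({
    "Traveler gender": ["Male", "Female", "Male", "Female",
                        "Male", "Female", "Male", "Female"],
    "Dest_Country": ["UK", "France", "UK", "France",
                     "Japan", "Japan", "UK", "France"],
})

# Contingency table: rows are genders, columns are destination countries.
table = pd.crosstab(sample["Traveler gender"], sample["Dest_Country"])

# chi2_contingency returns the statistic, p-value, degrees of freedom,
# and the expected counts under independence.
chi2, p_value, dof, expected = chi2_contingency(table)

alpha = 0.05  # assumed significance level
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
print("Reject independence" if p_value < alpha else "No evidence against independence")
```

The same pattern would apply to H4 and H5 by swapping in `Age_Group` and the relevant categorical column.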


## 📥 **Inputs**

*Dataset Overview*

**Dataset 1:** Traveler Trip Dataset
Includes:

- Traveler demographics (age, gender, nationality)
- Trip duration and dates
- Accommodation and transportation types and costs
- Destination city/country

**Dataset 2:** Global Weather Repository
Used to enrich destination countries with:
- Average temperature (°C)
- Climate zone classification (derived)

**Dataset 3:** U.S. Fuel Price Data (Excel)
Captures monthly fuel price trends from 1990–2025.

- Used to analyze impact of geopolitical events on travel costs

## **Outputs**

- Cleaned and enriched dataset
- Power BI dashboards with multi-page views:
    - Destination preferences
    - Trip duration and demographics
    - Accommodation & transport trends
    - Cost vs fuel price and climate zones
- Markdown-based documentation
- Final .pbix Power BI dashboard file
- Statistical plots (correlation heatmaps, boxplots) from Python
- README for GitHub 

## **Additional Comments**

- City-to-country mapping was resolved using a custom lookup table
- Transport and accommodation costs were binned for better interpretability
- Rare category encoding and ordinal encoding were used to simplify analysis
- Fuel prices were averaged and mapped against time and country climate
- Advanced visuals (heatmaps, scatter matrices) are created in Python and imported into Power BI
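
The binning and rare-category steps mentioned above could look roughly like the following. This is a sketch under stated assumptions: the bin edges, labels, the `min_count` threshold, and the new column names are hypothetical, since the actual values used for the dashboards are not given here.

```python
import pandas as pd

# Made-up values for illustration only.
demo = pd.DataFrame({
    "Transportation cost": [50, 200, 600, 1500, 3000],
    "Transportation type": ["Bus", "Train", "Flight", "Flight", "Ferry"],
})

# Hypothetical bin edges and labels; intervals are closed on the right.
cost_bins = [0, 250, 1000, float("inf")]
cost_labels = ["Low", "Medium", "High"]
demo["Transport_cost_band"] = pd.cut(
    demo["Transportation cost"], bins=cost_bins, labels=cost_labels
)

# Rare-category encoding: categories seen fewer than min_count times
# are grouped under a single "Other" label.
min_count = 2  # hypothetical threshold
counts = demo["Transportation type"].value_counts()
rare = counts[counts < min_count].index
demo["Transport_type_grouped"] = demo["Transportation type"].where(
    ~demo["Transportation type"].isin(rare), "Other"
)

print(demo)
```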

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import pingouin as pg
import scipy

### Travel Data set upload

In [2]:
import os
current_dir = os.getcwd()
current_dir

os.chdir(os.path.dirname(current_dir))
print("You set a new current directory")
current_dir = os.getcwd()
current_dir

You set a new current directory


'c:\\Users\\organ\\Desktop\\Code institute\\python\\hackatonproyect\\travel_trends'

In [3]:
# Create environment
# source/Scripts/activate
# pip install -r requirements.txt
# pip freeze > requirements.txt

# Set paths
raw_path = "data/raw/Travel details dataset.csv" 
processed_path = "data/processed/Travel details dataset_cleaned.csv"

# Ensure directories exist
os.makedirs(os.path.dirname(raw_path), exist_ok=True)
os.makedirs(os.path.dirname(processed_path), exist_ok=True)

# Load dataset
df = pd.read_csv(raw_path)
df.head()

Unnamed: 0,Trip ID,Destination,Start date,End date,Duration (days),Traveler name,Traveler age,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost
0,1,"London, UK",5/1/2023,5/8/2023,7.0,John Smith,35.0,Male,American,Hotel,1200,Flight,600
1,2,"Phuket, Thailand",6/15/2023,6/20/2023,5.0,Jane Doe,28.0,Female,Canadian,Resort,800,Flight,500
2,3,"Bali, Indonesia",7/1/2023,7/8/2023,7.0,David Lee,45.0,Male,Korean,Villa,1000,Flight,700
3,4,"New York, USA",8/15/2023,8/29/2023,14.0,Sarah Johnson,29.0,Female,British,Hotel,2000,Flight,1000
4,5,"Tokyo, Japan",9/10/2023,9/17/2023,7.0,Kim Nguyen,26.0,Female,Vietnamese,Airbnb,700,Train,200


### Inspect the number of non-null entries, data types, and memory usage for each column

In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 139 entries, 0 to 138
Data columns (total 13 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   Trip ID               139 non-null    int64  
 1   Destination           137 non-null    object 
 2   Start date            137 non-null    object 
 3   End date              137 non-null    object 
 4   Duration (days)       137 non-null    float64
 5   Traveler name         137 non-null    object 
 6   Traveler age          137 non-null    float64
 7   Traveler gender       137 non-null    object 
 8   Traveler nationality  137 non-null    object 
 9   Accommodation type    137 non-null    object 
 10  Accommodation cost    137 non-null    object 
 11  Transportation type   136 non-null    object 
 12  Transportation cost   136 non-null    object 
dtypes: float64(2), int64(1), object(10)
memory usage: 14.2+ KB


### Drop the Trip ID and Traveler name columns

In [5]:
df.drop(columns=['Trip ID', 'Traveler name'], inplace=True)

### Drop all rows with missing values

In [6]:
df.dropna(inplace=True)

### Convert the 'Start date' and 'End date' columns to datetime format

In [7]:
df['Start date'] = pd.to_datetime(df['Start date'])
df['End date'] = pd.to_datetime(df['End date'])
df.head()
df.shape

(136, 11)
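
When the date format is known, passing it explicitly makes the parsing stricter and surfaces malformed entries. The sketch below assumes the month/day/year layout seen in the `df.head()` output; the sample values, including the deliberately bad one, are made up.

```python
import pandas as pd

# Made-up sample; the last entry is intentionally malformed.
raw_dates = pd.Series(["5/1/2023", "6/15/2023", "not a date"])

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(raw_dates, format="%m/%d/%Y", errors="coerce")

# Count the entries that failed to parse so they can be inspected.
n_bad = int(parsed.isna().sum())
print(parsed)
print(f"Unparseable dates: {n_bad}")
```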

### Create three new columns: 'Month', 'Month_name', 'Year'

In [8]:
df['Month'] = df['Start date'].dt.month
df['Month_name'] = df['Start date'].dt.month_name()
df['Year'] = df['Start date'].dt.year

### Remove the ' USD' suffix and thousands separators so that Accommodation cost and Transportation cost can be converted to numeric

In [9]:
df['Accommodation cost'] = df['Accommodation cost'].replace({' USD': '', ',':''}, regex=True)
df['Transportation cost'] = df['Transportation cost'].replace({' USD': '', ',':''}, regex=True)
df.head()

Unnamed: 0,Destination,Start date,End date,Duration (days),Traveler age,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Month,Month_name,Year
0,"London, UK",2023-05-01,2023-05-08,7.0,35.0,Male,American,Hotel,1200,Flight,600,5,May,2023
1,"Phuket, Thailand",2023-06-15,2023-06-20,5.0,28.0,Female,Canadian,Resort,800,Flight,500,6,June,2023
2,"Bali, Indonesia",2023-07-01,2023-07-08,7.0,45.0,Male,Korean,Villa,1000,Flight,700,7,July,2023
3,"New York, USA",2023-08-15,2023-08-29,14.0,29.0,Female,British,Hotel,2000,Flight,1000,8,August,2023
4,"Tokyo, Japan",2023-09-10,2023-09-17,7.0,26.0,Female,Vietnamese,Airbnb,700,Train,200,9,September,2023


### Remove the $ sign and convert Accommodation cost and Transportation cost to integers

In [10]:
df['Accommodation cost'] = df['Accommodation cost'].str.replace('$', '', regex=False).astype(int)
df['Transportation cost'] = df['Transportation cost'].str.replace('$', '', regex=False).astype(int)
df.head()

Unnamed: 0,Destination,Start date,End date,Duration (days),Traveler age,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Month,Month_name,Year
0,"London, UK",2023-05-01,2023-05-08,7.0,35.0,Male,American,Hotel,1200,Flight,600,5,May,2023
1,"Phuket, Thailand",2023-06-15,2023-06-20,5.0,28.0,Female,Canadian,Resort,800,Flight,500,6,June,2023
2,"Bali, Indonesia",2023-07-01,2023-07-08,7.0,45.0,Male,Korean,Villa,1000,Flight,700,7,July,2023
3,"New York, USA",2023-08-15,2023-08-29,14.0,29.0,Female,British,Hotel,2000,Flight,1000,8,August,2023
4,"Tokyo, Japan",2023-09-10,2023-09-17,7.0,26.0,Female,Vietnamese,Airbnb,700,Train,200,9,September,2023
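
The two cleaning steps above (removing ` USD` and commas, then the `$` sign) can also be folded into one helper. The function name `clean_cost` and the sample strings are hypothetical; the sketch assumes costs contain only digits plus currency decoration.

```python
import pandas as pd

def clean_cost(series: pd.Series) -> pd.Series:
    """Drop everything except digits and the decimal point, then convert to numbers."""
    stripped = series.astype(str).str.replace(r"[^\d.]", "", regex=True)
    # Values that are still not numeric (for example, empty strings) become NaN.
    return pd.to_numeric(stripped, errors="coerce")

# Made-up examples of the formats handled by the cells above.
costs = pd.Series(["$1,200", "800 USD", "1000", "$900 "])
cleaned = clean_cost(costs)
print(cleaned.tolist())
```

Using `pd.to_numeric` with `errors="coerce"` rather than `astype(int)` means an unexpected value shows up as a missing entry instead of stopping the notebook.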


### Convert Duration (days) and Traveler age columns to 'int' data type

In [11]:
df['Duration (days)'] = df['Duration (days)'].astype(int)
df['Traveler age'] = df['Traveler age'].astype(int)

### Create new Dest_Country column from Destination column for future analysis

In [12]:
# Map each raw destination string to its country.
dest_to_country = {
    'London, UK': 'UK',
    'Phuket, Thailand': 'Thailand',
    'Bangkok, Thai': 'Thailand',
    'Bali, Indonesia': 'Indonesia',
    'New York, USA': 'USA',
    'Tokyo, Japan': 'Japan',
    'Paris, France': 'France',
    'Sydney, Australia': 'Australia',
    'Rio de Janeiro, Brazil': 'Brazil',
    'Amsterdam, Netherlands': 'Netherlands',
    'Amsterdam': 'Netherlands',
    'Dubai, United Arab Emirates': 'UAE',
    'Cancun, Mexico': 'Mexico',
    'Honolulu, Hawaii': 'USA',
    'Barcelona, Spain': 'Spain',
    'Berlin, Germany': 'Germany',
    'Marrakech, Morocco': 'Morocco',
    'Paris': 'France',
    'Bali': 'Indonesia',
    'Tokyo': 'Japan',
    'London': 'UK',
    'New York': 'USA',
    'Sydney': 'Australia',
    'Rome': 'Italy',
    'Bangkok': 'Thailand',
    'Hawaii': 'USA',
    'Barcelona': 'Spain',
    'New York City, USA': 'USA',
    'Los Angeles, USA': 'USA',
    'Vancouver, Canada': 'Canada',
    'Sydney, AUS': 'Australia',
    'Sydney, Aus': 'Australia',
    'Seoul, South Korea': 'South Korea',
    'Cape Town': 'South Africa',
    'Cape Town, SA': 'South Africa',
    'Athens, Greece': 'Greece',
    'Rome, Italy': 'Italy',
    'Auckland, New Zealand': 'New Zealand',
    'Rio de Janeiro': 'Brazil',
    'Dubai': 'UAE',
    'Bangkok, Thailand': 'Thailand',
    'Seoul': 'South Korea',
    'Phuket, Thai': 'Thailand',
    'Phuket': 'Thailand',
    'Santorini': 'Greece',
    'Phnom Penh': 'Cambodia',
    'Cape Town, South Africa': 'South Africa',
    'United Kingdom': 'UK',
    'Edinburgh, Scotland': 'UK',
}

# Destinations without a mapping entry keep their original value.
df['Dest_Country'] = df['Destination'].map(dest_to_country).fillna(df['Destination'])

### Create Age_Group column from Traveler Age column for future analysis

In [13]:
Age_Group = []
for i in df['Traveler age']:
    if i < 29:
        Age_Group.append('Youth')
    elif i < 39:
        Age_Group.append('Young Adult')
    elif i < 49:
        Age_Group.append('Mid Adult')
    elif i < 59:
        Age_Group.append('Mature Adult')
    else:
        Age_Group.append('Senior')
df['Age_Group'] = Age_Group
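
The same age bands can be expressed with `pd.cut`, which keeps the boundaries in one place. This is an alternative sketch, not the notebook's method; the sample ages are made up, and `right=False` is needed so that each band includes its lower edge, matching the `<` comparisons in the loop above.

```python
import numpy as np
import pandas as pd

# Made-up ages chosen to sit on and around each boundary.
ages = pd.Series([20, 28, 29, 38, 39, 49, 58, 59, 60])

# Left-closed intervals: [-inf, 29), [29, 39), [39, 49), [49, 59), [59, inf)
edges = [-np.inf, 29, 39, 49, 59, np.inf]
labels = ["Youth", "Young Adult", "Mid Adult", "Mature Adult", "Senior"]

age_group = pd.cut(ages, bins=edges, labels=labels, right=False)
print(pd.DataFrame({"age": ages, "Age_Group": age_group}))
```

A side effect of `pd.cut` is that the result is an ordered categorical, which can help when sorting age groups in plots.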

### Upload the fuel_price data set, which was partially cleaned separately

In [14]:
# Path is relative to the project root set as the working directory in In [2]
df1 = pd.read_csv('jupyter_notebooks/fuel_price.csv')
df1.head()

Unnamed: 0,Date,Fuel Price($/gal),Year,Month
0,2021-01-15,2.42,2021,1
1,2021-02-15,2.587,2021,2
2,2021-03-15,2.898,2021,3
3,2021-04-15,2.948,2021,4
4,2021-05-15,3.076,2021,5


### Convert Date column to datetime format
### Create two new columns: 'Pricing_Month_Name' and 'Pricing_Year'
### Drop Date, Year and Month columns

In [15]:
df1['Date'] = pd.to_datetime(df1['Date'])
df1['Pricing_Month_Name'] = df1['Date'].dt.month_name()
df1['Pricing_Year'] = df1['Date'].dt.year
df1.drop(columns=['Date','Year','Month'], inplace=True)
df1.head()

Unnamed: 0,Fuel Price($/gal),Pricing_Month_Name,Pricing_Year
0,2.42,January,2021
1,2.587,February,2021
2,2.898,March,2021
3,2.948,April,2021
4,3.076,May,2021


### Merge the fuel price data with the travel data set into the travel_da dataframe for easier analysis

In [16]:
travel_da = pd.merge(df,df1, how = 'inner', left_on = ['Year','Month_name'], right_on = ['Pricing_Year','Pricing_Month_Name'])
travel_da.head()

Unnamed: 0,Destination,Start date,End date,Duration (days),Traveler age,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Month,Month_name,Year,Dest_Country,Age_Group,Fuel Price($/gal),Pricing_Month_Name,Pricing_Year
0,"London, UK",2023-05-01,2023-05-08,7,35,Male,American,Hotel,1200,Flight,600,5,May,2023,UK,Young Adult,3.666,May,2023
1,"Bangkok, Thailand",2023-05-01,2023-05-07,6,29,Female,Indian,Airbnb,500,Bus,50,5,May,2023,Thailand,Young Adult,3.666,May,2023
2,New York,2023-05-08,2023-05-14,6,50,Male,China,Airbnb,800,Car rental,300,5,May,2023,USA,Mature Adult,3.666,May,2023
3,"Paris, France",2023-05-01,2023-05-07,6,35,Male,American,Hotel,5000,Airplane,2500,5,May,2023,France,Young Adult,3.666,May,2023
4,"Tokyo, Japan",2023-05-15,2023-05-22,7,28,Female,British,Airbnb,7000,Train,1500,5,May,2023,Japan,Youth,3.666,May,2023
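
An inner merge silently drops trips that have no matching fuel price month. The shape printed after cleaning is (136, 11), while the final summary reports 135 rows, so one row is lost in one of the two inner merges. A left merge with `indicator=True` is one way to see which rows fail to match; the sketch below uses made-up rows and prices, with column names mirroring the ones above.

```python
import pandas as pd

# Made-up rows for illustration; the 2019 trip has no fuel price entry.
trips = pd.DataFrame({
    "Year": [2023, 2023, 2019],
    "Month_name": ["May", "June", "May"],
    "Transportation cost": [600, 500, 400],
})
fuel = pd.DataFrame({
    "Pricing_Year": [2023, 2023],
    "Pricing_Month_Name": ["May", "June"],
    "Fuel Price($/gal)": [3.5, 3.6],  # made-up prices
})

checked = pd.merge(
    trips, fuel,
    how="left",
    left_on=["Year", "Month_name"],
    right_on=["Pricing_Year", "Pricing_Month_Name"],
    validate="many_to_one",  # raises if a year/month appears twice in fuel
    indicator=True,          # adds a _merge column: both / left_only
)

# Rows an inner merge would have dropped.
unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched[["Year", "Month_name", "Transportation cost"]])
```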


### Remove the 'Pricing_Month_Name' and 'Pricing_Year' columns from the merged dataframe as they are redundant

In [17]:
travel_da.drop(columns=['Pricing_Month_Name', 'Pricing_Year'], inplace=True)
travel_da.head()

Unnamed: 0,Destination,Start date,End date,Duration (days),Traveler age,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Month,Month_name,Year,Dest_Country,Age_Group,Fuel Price($/gal)
0,"London, UK",2023-05-01,2023-05-08,7,35,Male,American,Hotel,1200,Flight,600,5,May,2023,UK,Young Adult,3.666
1,"Bangkok, Thailand",2023-05-01,2023-05-07,6,29,Female,Indian,Airbnb,500,Bus,50,5,May,2023,Thailand,Young Adult,3.666
2,New York,2023-05-08,2023-05-14,6,50,Male,China,Airbnb,800,Car rental,300,5,May,2023,USA,Mature Adult,3.666
3,"Paris, France",2023-05-01,2023-05-07,6,35,Male,American,Hotel,5000,Airplane,2500,5,May,2023,France,Young Adult,3.666
4,"Tokyo, Japan",2023-05-15,2023-05-22,7,28,Female,British,Airbnb,7000,Train,1500,5,May,2023,Japan,Youth,3.666


### Re-arrange the columns in the merged dataframe

In [18]:
travel_da = travel_da[['Destination','Dest_Country', 'Start date', 'End date', 'Month', 'Month_name', 'Year', 'Duration (days)', 'Traveler age', 'Age_Group', 'Traveler gender', 'Traveler nationality', 'Accommodation type','Accommodation cost', 'Transportation type','Transportation cost', 'Fuel Price($/gal)']]

### Upload Temperature data set which was partially cleaned separately

In [19]:
# Path is relative to the project root set as the working directory in In [2]
df2 = pd.read_csv('jupyter_notebooks/Temperature Data.csv')
df2.head()

Unnamed: 0,country,temperature_celsius,W_Date,Temp_month,Temp_year
0,Afghanistan,26.6,2024-05-16,5,2024
1,Albania,19.0,2024-05-16,5,2024
2,Algeria,23.0,2024-05-16,5,2024
3,Andorra,6.3,2024-05-16,5,2024
4,Angola,26.0,2024-05-16,5,2024


### Rename and adjust country column data according to the travel data set

In [20]:
def country_mapping(country):
    if country == 'United States of America':
        return 'USA'
    elif country == 'United Kingdom':
        return 'UK'
    elif country == 'South Korea':
        return 'South Korea'
    elif country == 'United Arab Emirates':
        return 'UAE'
    else:
        return country

df2['country'] = df2['country'].apply(country_mapping)

### Find the country names present in both the temperature and travel data sets and save them as a set

In [21]:
matching = set(travel_da['Dest_Country']) & set(df2['country'])
matching

{'Australia',
 'Brazil',
 'Cambodia',
 'Canada',
 'Egypt',
 'France',
 'Germany',
 'Greece',
 'Indonesia',
 'Italy',
 'Japan',
 'Mexico',
 'Morocco',
 'Netherlands',
 'New Zealand',
 'South Africa',
 'South Korea',
 'Spain',
 'Thailand',
 'UAE',
 'UK',
 'USA'}
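
The intersection shows what matched; a set difference shows what did not, which helps spot destination labels that still need a mapping entry. The two sets below are made up for illustration.

```python
# Made-up country sets; "Scotland" stands in for a label that was never mapped.
travel_countries = {"UK", "USA", "Scotland", "Thailand"}
weather_countries = {"UK", "USA", "Thailand", "France"}

# Countries present in both sources.
matching = travel_countries & weather_countries

# Travel destinations with no temperature data; these rows would be
# dropped by an inner merge on country.
missing_from_weather = travel_countries - weather_countries

print(sorted(matching))
print(sorted(missing_from_weather))
```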

### Keep country names in the temperature data set which are in 'matching' SET

In [22]:
df2['Country'] = df2['country'].apply(lambda x: x if x in matching else None)

### Drop rows from the temperature data set whose country is not in the 'matching' set

In [23]:
df2.dropna(subset=['Country'], inplace=True)
df2

Unnamed: 0,country,temperature_celsius,W_Date,Temp_month,Temp_year,Country
8,Australia,9.0,2024-05-16,5,2024,Australia
23,Brazil,23.1,2024-05-16,5,2024,Brazil
30,Cambodia,38.0,2024-05-16,5,2024,Cambodia
32,Canada,12.0,2024-05-16,5,2024,Canada
51,Egypt,27.0,2024-05-16,5,2024,Egypt
...,...,...,...,...,...,...
68766,Spain,13.1,2025-05-05,5,2025,Spain
68775,Thailand,29.1,2025-05-05,5,2025,Thailand
68786,UAE,43.1,2025-05-05,5,2025,UAE
68787,UK,10.0,2025-05-05,5,2025,UK


### Get average temperature of a country from the data set
### Rename 'temperature_celsius' column to 'Avg_Temperature'

In [24]:
weather_data = pd.DataFrame(df2.groupby('Country')['temperature_celsius'].mean()).reset_index()
weather_data.rename(columns={'temperature_celsius': 'Avg_Temperature'}, inplace=True)

### Derive a climate category from the average temperature, then check the weather_data dataframe

In [25]:
def climate_mapping(temp):
    if temp < 10:
        return 'Very Cold'
    elif temp < 16:
        return 'Cool'
    elif temp < 22:
        return 'Mild'
    elif temp < 28:
        return 'Warm'
    else:
        return 'Hot'

weather_data['Climate'] = weather_data['Avg_Temperature'].apply(climate_mapping)

In [26]:
weather_data.head()

Unnamed: 0,Country,Avg_Temperature,Climate
0,Australia,12.971751,Cool
1,Brazil,25.356286,Warm
2,Cambodia,31.021246,Hot
3,Canada,5.389744,Very Cold
4,Egypt,27.292635,Warm


### Now, merge weather_data with the travel_da dataframe
### Remove the 'Country' column from the merged dataframe as it is redundant

In [27]:
travel_da = pd.merge(travel_da,weather_data, how = 'inner', left_on = ['Dest_Country'], right_on = ['Country'])
travel_da.drop(columns=['Country'], inplace=True)
travel_da.head()

Unnamed: 0,Destination,Dest_Country,Start date,End date,Month,Month_name,Year,Duration (days),Traveler age,Age_Group,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Fuel Price($/gal),Avg_Temperature,Climate
0,"London, UK",UK,2023-05-01,2023-05-08,5,May,2023,7,35,Young Adult,Male,American,Hotel,1200,Flight,600,3.666,12.891477,Cool
1,London,UK,2023-07-22,2023-07-28,7,July,2023,6,35,Young Adult,Female,British,Hotel,1200,Train,150,3.712,12.891477,Cool
2,"London, UK",UK,2024-03-15,2024-03-23,3,March,2024,8,35,Young Adult,Male,British,Hotel,1000,Train,200,3.542,12.891477,Cool
3,"Edinburgh, Scotland",UK,2024-09-05,2024-09-12,9,September,2024,7,32,Young Adult,Male,Scottish,Hotel,900,Train,150,3.338,12.891477,Cool
4,London,UK,2022-06-10,2022-06-15,6,June,2022,5,38,Young Adult,Female,United Kingdom,Hotel,900,Train,150,5.032,12.891477,Cool


### Check the status of the final merged dataframe

In [28]:
travel_da.describe(include='all')

Unnamed: 0,Destination,Dest_Country,Start date,End date,Month,Month_name,Year,Duration (days),Traveler age,Age_Group,Traveler gender,Traveler nationality,Accommodation type,Accommodation cost,Transportation type,Transportation cost,Fuel Price($/gal),Avg_Temperature,Climate
count,135,135,135,135,135.0,135,135.0,135.0,135.0,135,135,135,135,135.0,135,135.0,135.0,135.0,135
unique,59,22,,,,12,,,,5,2,41,8,,9,,,,5
top,"Paris, France",France,,,,September,,,,Young Adult,Female,American,Hotel,,Plane,,,,Cool
freq,7,15,,,,17,,,,63,69,23,59,,56,,,,48
mean,,,2023-04-25 06:24:00,2023-05-02 19:12:00,6.688889,,2022.814815,7.607407,33.088889,,,,,1249.481481,,642.555556,3.793822,18.313095,
min,,,2021-06-15 00:00:00,2021-06-20 00:00:00,1.0,,2021.0,5.0,20.0,,,,,150.0,,20.0,3.157,5.389744,
25%,,,2022-08-28 12:00:00,2022-09-06 00:00:00,5.0,,2022.0,7.0,28.0,,,,,600.0,,200.0,3.496,12.971751,
50%,,,2023-06-07 00:00:00,2023-06-14 00:00:00,7.0,,2023.0,7.0,31.0,,,,,900.0,,500.0,3.712,17.404533,
75%,,,2023-11-16 00:00:00,2023-11-22 12:00:00,9.0,,2023.0,8.0,37.5,,,,,1200.0,,800.0,3.954,24.485487,
max,,,2025-02-14 00:00:00,2025-02-20 00:00:00,12.0,,2025.0,14.0,60.0,,,,,8000.0,,3000.0,5.032,32.225141,
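
Alongside `describe()`, a few programmatic checks can confirm the merged table is internally consistent before export. The sketch below runs on a two-row stand-in built from values shown in the earlier outputs; the specific checks chosen are an assumption about what is worth validating, not part of the original pipeline.

```python
import pandas as pd

# Two-row stand-in for the merged dataframe.
final = pd.DataFrame({
    "Start date": pd.to_datetime(["2023-05-01", "2023-06-15"]),
    "End date": pd.to_datetime(["2023-05-08", "2023-06-20"]),
    "Duration (days)": [7, 5],
    "Accommodation cost": [1200, 800],
})

# 1. No missing values should remain after cleaning.
null_counts = final.isna().sum()

# 2. Reported duration should equal the gap between start and end dates.
computed = (final["End date"] - final["Start date"]).dt.days
duration_mismatch = final[computed != final["Duration (days)"]]

# 3. Costs should never be negative.
negative_costs = final[final["Accommodation cost"] < 0]

print(null_counts)
print(f"Duration mismatches: {len(duration_mismatch)}")
print(f"Negative costs: {len(negative_costs)}")
```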


In [None]:
# Reuse processed_path from In [3]; avoids backslash escapes in a string literal
travel_da.to_csv(processed_path, index=False)