pandas as pd: Used for handling data in tabular format, perfect for working with flight or hotel data.

amadeus.Client: Allows interaction with the Amadeus API, which provides access to travel data like flights, hotels, and city info.

datetime, timedelta: Used for working with and manipulating dates, which matters when querying flight availability for specific time windows.

IPython.display.display: Enables pretty display of DataFrames or outputs in Jupyter/Colab environments.

time: Standard Python module for managing time-based tasks like delays or measuring performance.

In [None]:
import pandas as pd
from amadeus import Client
from datetime import datetime, timedelta
from IPython.display import display
import time

### Function Overview
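This function automates collecting flight offer data from New York (JFK) to major tech hub cities using the Amadeus API. It queries one departure date at a time, starting seven days from today, and works through the collection period in batches of batch_days days (default: weekly batches over 28 days). The results are saved to tech_city_flights.csv.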


Key Components

| Component | Description |
| --- | --- |
| `cities` | Dictionary of tech hub cities and their airport codes |
| `batch_days` | Number of days per batch fetch (default = 7) |
| `Client()` | Initializes Amadeus API with provided credentials |
| `flight_offers_search.get(...)` | Retrieves flight offers for specified date and destination |
| `all_flights.append(...)` | Stores relevant flight data such as airline, price, stops, and travel time |
| `time.sleep(3)` | Pauses to avoid exceeding API rate limits |
| `df.to_csv(...)` | Saves all collected data to a CSV file for future analysis |
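The batching described above can be sketched on its own, without any API calls. This is a minimal illustration, assuming a fixed start date for reproducibility; `batch_windows` is a hypothetical helper name, not part of the notebook or the Amadeus SDK.

```python
from datetime import date, timedelta

def batch_windows(start, total_days=28, batch_days=7):
    """Yield (batch_start, batch_end) pairs; batch_end is exclusive."""
    end = start + timedelta(days=total_days)
    current = start
    while current < end:
        # The last batch is clipped so it never runs past the overall end date
        batch_end = min(current + timedelta(days=batch_days), end)
        yield current, batch_end
        current = batch_end

# Fixed start date (an assumption for the example; the notebook uses today + 7 days)
windows = list(batch_windows(date(2025, 7, 1)))
for batch_start, batch_end in windows:
    print(batch_start, "to", batch_end)
```

With the defaults this produces four seven-day windows. A period that is not a multiple of `batch_days` simply ends with a shorter final window.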

In [None]:
def fetch_travel_data_batch(batch_days=7, total_days=28):
    # Credentials should not be hard-coded in a notebook. With no arguments,
    # Client() reads AMADEUS_CLIENT_ID and AMADEUS_CLIENT_SECRET from the environment.
    amadeus = Client()

    cities = {
        "San Francisco": {"airport": "SFO", "city_code": "SFO"},
        "London": {"airport": "LON", "city_code": "LON"},
        "Bangalore": {"airport": "BLR", "city_code": "BLR"},
        "Singapore": {"airport": "SIN", "city_code": "SIN"},
        "Tel Aviv": {"airport": "TLV", "city_code": "TLV"}
    }

    start_date = datetime.today() + timedelta(days=7)
    end_date = start_date + timedelta(days=total_days)

    all_flights = []

    current_start = start_date

    while current_start < end_date:
        current_end = min(current_start + timedelta(days=batch_days), end_date)
        date_range = [current_start + timedelta(days=i) for i in range((current_end - current_start).days)]

        print(f"Fetching data from {current_start.strftime('%Y-%m-%d')} to {current_end.strftime('%Y-%m-%d')}...")

        for travel_date in date_range:
            check_in_date = travel_date.strftime('%Y-%m-%d')

            for city_name, codes in cities.items():
                try:
                    flight_response = amadeus.shopping.flight_offers_search.get(
                        originLocationCode='JFK',
                        destinationLocationCode=codes['airport'],
                        departureDate=check_in_date,
                        adults=1,
                        # Request totals in USD so they match the 'Price (USD)' column
                        currencyCode='USD'
                    )

                    if flight_response.data:
                        for offer in flight_response.data:
                            itinerary = offer['itineraries'][0]
                            segment = itinerary['segments'][0]
                            all_flights.append({
                                'Destination': city_name,
                                'Departure Date': segment['departure']['at'][:10],
                                'Airline': segment['carrierCode'],
                                'Price (USD)': float(offer['price']['total']),
                                'Number of Stops': len(itinerary['segments']) - 1,
                                # Use the itinerary's total duration; a segment's duration covers only one leg
                                'Travel Time': itinerary['duration'].replace('PT', ''),
                                'Available Seats': offer.get('numberOfBookableSeats', 'N/A')
                            })
                except Exception as e:
                    print(f"Flight error for {city_name} on {check_in_date}: {e}")
                time.sleep(3)

        current_start = current_end

    df_flights = pd.DataFrame(all_flights)
    df_flights.to_csv("tech_city_flights.csv", index=False)

    print("\n Flights Data Sample:")
    display(df_flights.head())

#fetch_travel_data_batch()
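The Travel Time column stores strings such as 6H15M (the API's duration with the leading PT removed), which are awkward to sort or average. A minimal sketch of converting them to minutes follows. `duration_to_minutes` is a hypothetical helper, and it assumes durations contain only days, hours, and minutes.

```python
import re

# Matches durations such as PT6H15M, PT45M, or P1DT2H (seconds are not handled)
_DURATION = re.compile(r"^P(?:(\d+)D)?(?:T(?:(\d+)H)?(?:(\d+)M)?)?$")

def duration_to_minutes(value):
    """Convert '6H15M' or 'PT6H15M' to a total number of minutes."""
    # Restore the 'PT' prefix if it was stripped when the data was collected
    text = value if value.startswith("P") else "PT" + value
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"Unrecognized duration: {value!r}")
    days, hours, minutes = (int(group) if group else 0 for group in match.groups())
    return days * 24 * 60 + hours * 60 + minutes

print(duration_to_minutes("6H15M"))
print(duration_to_minutes("PT45M"))
```

Applied to the saved data, this could look like `df_flights['Travel Time'].map(duration_to_minutes)`, which gives a numeric column suitable for comparisons.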


### Loading the Collected Flight Data

pd.read_csv("tech_city_flights.csv"):
Loads the flight data previously saved into a CSV file. This file contains all flight offers collected by the fetch_travel_data_batch() function.

df_flights:
Displays the DataFrame in the notebook or Google Colab environment so you can visually inspect the rows and columns. For long tables, pandas shows only the first and last rows.

In [None]:
df_flights = pd.read_csv("tech_city_flights.csv")
df_flights

Unnamed: 0,Destination,Departure Date,Airline,Price (USD),Number of Stops,Travel Time,Available Seats
0,San Francisco,2025-07-01,B6,79.90,0,6H15M,1
1,San Francisco,2025-07-01,B6,79.90,0,6H20M,1
2,San Francisco,2025-07-01,B6,79.90,0,6H35M,1
3,San Francisco,2025-07-01,B6,79.90,0,6H45M,1
4,San Francisco,2025-07-01,F9,156.03,1,5H42M,3
...,...,...,...,...,...,...,...
10704,Tel Aviv,2025-07-28,LY,1892.03,1,1H48M,9
10705,Tel Aviv,2025-07-28,LY,1892.03,1,1H29M,4
10706,Tel Aviv,2025-07-28,LX,4883.50,1,7H50M,9
10707,Tel Aviv,2025-07-28,IB,9694.38,1,7H30M,2
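The output above includes an Unnamed: 0 column, which typically appears when a CSV was written together with its index. As a sketch, passing `index_col=0` to `pd.read_csv` uses that column as the index instead, and `parse_dates` turns the departure dates into real datetimes. The inline CSV text is an assumption about how the file looks on disk, built from a few rows shown above.

```python
import io
import pandas as pd

# A few rows copied from the output above; the empty first header cell
# is assumed to be the saved index.
csv_text = """,Destination,Departure Date,Airline,Price (USD),Number of Stops,Travel Time,Available Seats
0,San Francisco,2025-07-01,B6,79.90,0,6H15M,1
1,San Francisco,2025-07-01,B6,79.90,0,6H20M,1
4,San Francisco,2025-07-01,F9,156.03,1,5H42M,3
"""

sample = pd.read_csv(
    io.StringIO(csv_text),
    index_col=0,                    # use the first column as the index
    parse_dates=["Departure Date"]  # parse dates instead of keeping strings
)
print(sample.dtypes)
```

For the real file, the same arguments would be passed as `pd.read_csv("tech_city_flights.csv", index_col=0, parse_dates=["Departure Date"])`, but only if the file actually contains an index column.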


### Counting Flight Entries by Destination
The .value_counts() function is used to count the number of occurrences of each unique value in the Destination column of the df_flights DataFrame.

This tells us how many flight offers were collected for each tech hub city.

It helps identify which cities have the most or least data in the dataset.

It is also useful for balancing comparisons (e.g., are there fewer offers for Tel Aviv than for London?).



In [None]:
df_flights['Destination'].value_counts()


Destination
London           4442
Tel Aviv         1719
Bangalore        1658
San Francisco    1467
Singapore        1423
Name: count, dtype: int64
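When comparing cities, relative shares are often easier to read than raw counts. A minimal sketch using `value_counts(normalize=True)` on a small made-up DataFrame (the toy rows are illustrative, not taken from the collected data):

```python
import pandas as pd

# Toy data for illustration only
toy = pd.DataFrame(
    {"Destination": ["London", "London", "London", "Tel Aviv", "Singapore"]}
)

counts = toy["Destination"].value_counts()                 # absolute counts
shares = toy["Destination"].value_counts(normalize=True)   # fractions that sum to 1

print(counts)
print(shares.round(2))
```

The same call on the real data would be `df_flights['Destination'].value_counts(normalize=True)`.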

In [None]:
pip install amadeus

Note: you may need to restart the kernel to use updated packages.
