# COMP30760 Assignment 1 Sample - Task 1

In this assignment we make requests to the ***Ticketmaster API*** to obtain information about USA events, including venues, tickets, prices, segments and more. The data is then processed and saved to a CSV file. This notebook covers ***Task 1 - Data Collection***.

The code performs the following steps:

1. Build a request to the Ticketmaster API using an API key and query parameters.

2. Submit the request and check that it succeeded.

3. If it succeeded, retrieve the event data from the API response.

4. Process the data, extracting information about the event, location, cost, and category.

5. Store the processed data in a structured DataFrame.

6. Save the DataFrame to a CSV file for later examination.

7. If the request failed, print an error message.


API link: https://developer.ticketmaster.com/products-and-docs/apis/discovery-api/v2/

Create the directory for storing raw data if it does not already exist.

In [1]:
from pathlib import Path
dir_raw = Path("raw")
dir_raw.mkdir(parents=True, exist_ok=True)

## Data Collection 

This section contains the code that gathers the relevant event information from the Ticketmaster API, using the API key and query parameters described below.

****
We configure the URL and API key needed to submit a request to the Ticketmaster API.
The `params` dictionary defines the query parameters, which include the API key and the extra arguments that refine the query (here, the page size and the page number).
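
As a rough illustration of how these parameters turn into a request, the sketch below builds the URL for each page without sending anything. It uses the standard library's `urllib.parse.urlencode` rather than `requests`, the `build_page_url` helper is hypothetical, and `YOUR_API_KEY` is a placeholder rather than a real key.

```python
from urllib.parse import urlencode

BASE_URL = "https://app.ticketmaster.com/discovery/v2/events.json"


def build_page_url(api_key, page, size=200):
    """Return the full request URL for one page of results (hypothetical helper)."""
    params = {"apikey": api_key, "size": size, "page": page}
    return f"{BASE_URL}?{urlencode(params)}"


# Five pages of 200 records each, matching the loop used in this notebook
page_urls = [build_page_url("YOUR_API_KEY", p) for p in range(5)]
for page_url in page_urls:
    print(page_url)
```

Printing the URLs first can be a cheap way to confirm the paging arithmetic before spending any API quota.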

In [2]:
import pandas as pd
import requests
from pathlib import Path

# List that accumulates one row per event
event_data = []

# API key and URL
api_key = "SLAUh2m7UqGMFF20aWqGpIvElNCn7O0W"
url = "https://app.ticketmaster.com/discovery/v2/events.json"

for p in range(5):  # (5 API req * 200 records) = 1000 records

    # Set up the API request parameters
    params = {
        'apikey': api_key,
        'size': 200,
        'page': p,
    }

    # API request
    response = requests.get(url, params=params)

    # Confirm the success of the request (status code 200)
    if response.status_code == 200:

        # Parse the response body as JSON
        data = response.json()

        # Extracting and handling event data
        events = data.get('_embedded', {}).get('events', [])

        for event in events:
            
            
            # Event information
            event_id = event.get('id', '') 
            event_name = event.get('name', '')
            event_url = event.get('url', '')
            event_type = event.get('type', '')
            start_date_time = event.get("dates", {}).get("start", {}).get("localDate", "")
          

            # Location information
            location = event.get('_embedded', {}).get('venues', [{}])[0]
            city = location.get('city', {}).get('name', '')
            state_info = location.get('state', {}) if location else {}
            state = state_info.get('name', '')
            country_info = location.get('country', {}) if location else {}
            country = country_info.get('name', '')
            venue_name = location.get('name', '')
            box_office_info = location.get('boxOfficeInfo', {}).get("phoneNumberDetail","")
            open_hours = location.get('boxOfficeInfo', {}).get("openHoursDetail","")
            willCallDetail = location.get('boxOfficeInfo', {}).get("willCallDetail","")
            
            
            
            # Venue details
            postal_code = location.get('postalCode', '')
            timezone = location.get('timezone', '')
            address = location.get('address', {}).get('line1', '')
            longitude = location.get('location', {}).get('longitude', '')
            latitude = location.get('location', {}).get('latitude', '')
            parking_info = location.get("parkingDetail", "")
            
            # Extracting and handling the classifications information
            classifications = event.get('classifications', [{}])[0]
            segment = classifications.get('segment', {}).get('name', '')
            genre = classifications.get('genre', {}).get('name', '')
            subGenre = classifications.get('subGenre', {}).get('name', '')
            type = classifications.get('type', {}).get('name', '')
            subType = classifications.get('subType', {}).get('name', '')
            family = classifications.get('family', False)
            

            # Price ranges
            price_ranges = event.get('priceRanges', [{}])[0]
            standard_price_currency = price_ranges.get('currency', '')
            standard_price_min = price_ranges.get('min', 0.0)
            standard_price_max = price_ranges.get('max', 0.0)

            # Cost information
            cost = f"{standard_price_currency} {standard_price_min} - {standard_price_currency} {standard_price_max}"

            # Product information
            products = event.get('products', [{}])[0]
            product_name = products.get('name', '')
            product_id = products.get('id', '')
            product_url = products.get('url', '')
            product_type = products.get('type', '')
            

            # Promoter name, if available (stored under 'Payment Info')
            payment_info = event.get('promoter', {}).get('name', '')
            
            
            # Ticket limit information
            ticket_limit = event.get('accessibility', {}).get('ticketLimit', 'Not Available')
            ticketLimit_info = event.get("info", "")
            
            
            
            # Append Data 
            event_data.append([
                event_id, event_name, event_url, start_date_time, venue_name,
                segment, genre, subGenre, type, subType, family,
                standard_price_currency, standard_price_min, standard_price_max,
                cost, ticket_limit, ticketLimit_info,   
                product_name, product_id, product_url, product_type,
                city, state, country, event_type, payment_info,
                postal_code, timezone, address, longitude, latitude,
                box_office_info, open_hours, willCallDetail, parking_info
            ])
            
        # Transform the event data into a DataFrame
        df = pd.DataFrame(event_data, columns=[
            'Event ID', 'Event Name', 'Event URL', 'Event Date', 'Venue',
            'Segment', 'Genre', 'SubGenre', 'Type', 'SubType', 'Family',
            'Standard Price Currency', 'Standard Price Min', 'Standard Price Max',
            'Cost', 'Ticket Limit', 'Ticket Limit Info', 
            'Product Name', 'Product ID', 'Product URL', 'Product Type',
            'City', 'State', 'Country', 'Event Type', 'Payment Info',
            'Postal Code', 'Timezone', 'Address', 'Longitude', 'Latitude',
            'Box Office Info','Open hours','will Call Details', 'Parking Info'
        ])

        df = df.drop_duplicates(subset=['Event ID'])

    else:
        print("API request failed with status code:", response.status_code)
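
The chained `.get(...)` calls above fall back to defaults when a key is missing, but an expression such as `event.get('priceRanges', [{}])[0]` would still raise an `IndexError` if the key were present with an empty list. A minimal sketch of one way to guard against that is shown below. The `dig` helper is hypothetical (it is not part of the notebook or of any library), and `sample_event` is a hand-written stand-in rather than real API output.

```python
def dig(obj, *path, default=""):
    """Follow a path of dict keys and list indexes, returning default on any miss."""
    current = obj
    for step in path:
        if isinstance(step, int):
            # List index: the value must be a list long enough to hold it
            if not isinstance(current, list) or not -len(current) <= step < len(current):
                return default
            current = current[step]
        else:
            # Dict key: the value must be a dict that contains the key
            if not isinstance(current, dict) or step not in current:
                return default
            current = current[step]
    return current


# Hand-written sample shaped like the fields read in the notebook (not real API output)
sample_event = {
    "name": "Sample Event",
    "_embedded": {"venues": [{"name": "Sample Arena", "city": {"name": "Sample City"}}]},
    "priceRanges": [],
}

venue_name = dig(sample_event, "_embedded", "venues", 0, "name")
city = dig(sample_event, "_embedded", "venues", 0, "city", "name")
price_min = dig(sample_event, "priceRanges", 0, "min", default=0.0)
print(venue_name, city, price_min)
```

With a helper like this, each field becomes a single call, and an empty or missing list yields the default instead of stopping the whole collection run.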


1. Define the full path to the raw data CSV file.
2. Save the DataFrame to that path.

In [3]:
# Save the data to a CSV file in the raw data directory
raw_csv_path = Path('raw/raw_Event_Data.csv')
df.to_csv(raw_csv_path, index=False)
# Report the actual number of rows rather than a hard-coded count
print(f"{len(df)} records have been added to {raw_csv_path}.")


1000 records have been added to raw/raw_Event_Data.csv.


##### Load CSV from raw directory 

In [4]:
rawdata = pd.read_csv('raw/raw_Event_Data.csv')
print(rawdata.shape)

(1000, 35)
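
Beyond checking the shape, it can be useful to confirm that no `Event ID` appears twice in the saved file. The sketch below does this with the standard library's `csv` module; the `summarize_events` helper is hypothetical, and the in-memory CSV holds made-up values standing in for `raw/raw_Event_Data.csv`.

```python
import csv
import io


def summarize_events(csv_file, id_column="Event ID"):
    """Count the rows and the distinct IDs in an open CSV file (hypothetical helper)."""
    reader = csv.DictReader(csv_file)
    ids = [row[id_column] for row in reader]
    return {"rows": len(ids), "unique_ids": len(set(ids))}


# Tiny in-memory stand-in for raw/raw_Event_Data.csv (made-up values)
sample_csv = io.StringIO(
    "Event ID,Event Name\n"
    "A1,First sample event\n"
    "B2,Second sample event\n"
)
summary = summarize_events(sample_csv)
print(summary)
```

To run it against the real file, pass `open('raw/raw_Event_Data.csv', newline='', encoding='utf-8')` instead of the sample; equal counts indicate that no `Event ID` is repeated.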


## Summary

To summarize, this code collects event data from the Ticketmaster API, obtaining 1000 records in five requests of 200 records each.

For every page of results, it extracts event information from the API response, such as the name, date, location, categories, cost, promoter and more.

It builds a DataFrame from the data, with each row representing an event, and removes duplicate events by `Event ID`.

By the end of the loop, the DataFrame contains the data from all pages, which is then saved to a CSV file.

The end product is an extensive dataset that was gathered through multiple API requests and includes price information for the events.