In [1]:
import os
import pandas as pd
import numpy as np
import json

import acquire


In [2]:

#pd.set_option('display.max_rows', 500)
pd.set_option('display.max_columns', 500)
pd.set_option('display.width', 1000)

## Overview

This project integrates and analyzes data collected by the Houston Rockets: ticket transactions, retail sales, and fan surveys. The data comes from different systems and arrives in different formats, which makes it difficult to build a comprehensive picture of the team's fan base. The goal is a single unified database table that gives the Business Intelligence & Innovation team a consolidated dataset for building fan segments and understanding fan behavior.

Before starting, download the project files from https://htxrockets.com/redirect/to/id?id=148

Deliverables (need-to-haves):

- acquire.py - Script that acquires all data from the project source files.
- prepare.py - Script that wrangles the data into a unified database table that meets the stakeholder's requirements.
- Final notebook that runs the project
- README



## Data Dictionary

**Tickets.csv** - Ticket sales transactions over the course of 41 home games.

|   Field Name      |   Description                                                     |
|-------------------|-------------------------------------------------------------------|
|   transaction_id  |   identification number for ticket transaction                    |
|   account_no      |   customer account number                                         |
|   email           |   customer email address                                          |
|   zip             |   customer zip code                                               |
|   phone_no        |   customer phone number                                           |
|   section         |   section of the arena that the tickets were purchased for        |
|   row             |   row of the section that the tickets were purchased for          |
|   qty             |   quantity of tickets purchased in transaction                    |
|   total_price     |   total transaction price                                         |
|   event_id        |   identification number for event the tickets were purchased for  |
|   channel         |   distribution channel for ticket transaction                     |



**Retail.json** - Online retail purchases

|   Field Name      |   Description                                       |
|-------------------|-----------------------------------------------------|
|   transaction_id  |   identification number for the retail transaction  |
|   email           |   customer email address                            |
|   account_no      |   customer account number                           |
|   product_type    |   type of product purchased                         |
|   quantity        |   quantity of items purchased                       |
|   unit_price      |   price per unit                                    |
|   shipping_cost   |   shipping cost for the transaction                 |



**Surveys.csv** - Melted response data from post-game surveys

|   Field Name                                                        |   Description                                                                                              |
|---------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
|   Submission ID                                                     |   unique identifier for each survey submitted (hint: can be used as index for pivot table of responses)    |
|   phone_no                                                          |   survey respondent phone number                                                                           |
|   event_id                                                          |   identification number for the event related to the survey                                                |
|   how_satisfied_were_you_with_this_event                            |   5-point scale response to question: "How satisfied were you with this event?"                            |
|   how_satisfied_were_you_with_your_retail_experience_at_this_event  |   5-point scale response to question: "How satisfied were you with your retail experience at this event?"  |
|   how_likely_are_you_to_attend_this_event_in_the_future             |   5-point scale response to question: "How likely are you to attend this event in the future?"             |
|   what_is_your_birthdate                                            |   survey respondent's date of birth                                                                        |
|   what_is_your_household_income                                     |   survey respondent's household income range                                                               |
|   what_is_your_highest_level_of_education_that_you_have_attained    |   survey respondent's highest level of education                                                           |


## Acquire

In [6]:
retail_data, survey_data, ticket_data = acquire.import_data(
    json_file='retail.json',
    csv_file1='surveys.csv',
    csv_file2='tickets.csv',
)
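
The `acquire` module itself is not shown in this notebook. As a rough idea of what `import_data` might look like, the sketch below assumes it simply wraps `pd.read_json` and `pd.read_csv` and returns the frames in the order retail, surveys, tickets. The function body is a guess for illustration, not the project's actual code, and it assumes the JSON file holds a list of records.

```python
import pandas as pd


def import_data(json_file, csv_file1, csv_file2):
    """Hypothetical sketch: load one JSON file and two CSV files.

    ASSUMPTION: the real acquire.import_data may differ; only the
    parameter names and the return order are taken from the call above.
    """
    retail = pd.read_json(json_file)    # retail transactions (JSON)
    surveys = pd.read_csv(csv_file1)    # melted survey responses (CSV)
    tickets = pd.read_csv(csv_file2)    # ticket transactions (CSV)
    return retail, surveys, tickets
```

Because the three frames come back positionally, the unpacking order on the calling side has to match the return order exactly.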

In [9]:
# Preview of retail data (loaded from JSON)
retail_data.head()


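|   | transaction_id | account_no | email | zip | phone_no | section | row | qty | total_price | event_id | channel |
|---|----------------|------------|-------|-----|----------|---------|-----|-----|-------------|----------|---------|
| 0 | 1 | A87144476G | user400@rockets.com | 77066 | 280-379-5220 | 109 | 9 | 1 | 200 | 3223 | Web |
| 1 | 2 | A66578188Z | user141@rockets.com | 76673 | 490-491-8071 | 101 | 10 | 4 | 800 | 3221 | Box Office |
| 2 | 3 | A11689958W | user98@rockets.com | 77031 | 244-805-9413 | 100 | 18 | 8 | 1600 | 3237 | Box Office |
| 3 | 4 | A47432461Z | user213@rockets.com | 76136 | 826-458-9773 | 400 | 7 | 1 | 50 | 3240 | Web |
| 4 | 5 | A80089942I | user472@rockets.com | 75559 | 803-733-6051 | 414 | 17 | 1 | 25 | 3215 | Box Office |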
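
Since `import_data` returns three frames that are unpacked by position, a quick comparison of each frame's columns against the Data Dictionary helps confirm that every variable holds the dataset its name suggests. The helper below is a minimal sketch; `missing_columns` and `EXPECTED_COLUMNS` are hypothetical names introduced here, and the stand-in frame exists only so the example runs on its own.

```python
import pandas as pd

# Expected columns, copied from the Data Dictionary.
EXPECTED_COLUMNS = {
    "tickets": ["transaction_id", "account_no", "email", "zip", "phone_no",
                "section", "row", "qty", "total_price", "event_id", "channel"],
    "retail": ["transaction_id", "email", "account_no", "product_type",
               "quantity", "unit_price", "shipping_cost"],
}


def missing_columns(df, expected):
    """Return the expected column names that are absent from df."""
    return [col for col in expected if col not in df.columns]


# Stand-in frame (hypothetical) so the sketch is self-contained;
# in the notebook, pass retail_data or ticket_data instead.
example = pd.DataFrame(columns=["transaction_id", "email", "account_no", "qty"])
print(missing_columns(example, EXPECTED_COLUMNS["retail"]))
# → ['product_type', 'quantity', 'unit_price', 'shipping_cost']
```

An empty list means the frame carries every documented column; a non-empty list for `retail_data` would suggest the frames were loaded or unpacked in the wrong order.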


In [12]:
# Preview of survey data
survey_data.head(9)

|   | Submission ID | Attribute | Value |
|---|---------------|-----------|-------|
| 0 | 1 | phone_no | 290-551-1299 |
| 1 | 1 | event_id | 3220 |
| 2 | 1 | how_satisfied_were_you_with_this_event | 2 |
| 3 | 1 | how_satisfied_were_you_with_your_retail_experi... | 3 |
| 4 | 1 | how_likely_are_you_to_attend_this_event_in_the... | 5 - Very Likely |
| 5 | 1 | what_is_your_birthdate | 33939 |
| 6 | 1 | what_is_your_household_income | Less than $50,000 |
| 7 | 1 | what_is_your_highest_level_of_education_that_y... | Associate's Degree |
| 8 | 2 | phone_no | 663-795-4865 |
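
The survey data is in long (melted) form, with one row per question per submission. The Data Dictionary hints that `Submission ID` can serve as the index of a pivot table, so a minimal sketch of reshaping to one row per submission might look like the following. The stand-in frame reuses a few rows from the preview. Treating the birthdate value `33939` as an Excel serial day count is an assumption, not something the source confirms.

```python
import pandas as pd

# Stand-in for survey_data, built from rows shown in the preview.
survey_long = pd.DataFrame({
    "Submission ID": [1, 1, 1, 1],
    "Attribute": ["phone_no", "event_id",
                  "what_is_your_birthdate", "what_is_your_household_income"],
    "Value": ["290-551-1299", "3220", "33939", "Less than $50,000"],
})

# One row per submission, one column per survey question.
survey_wide = (
    survey_long
    .pivot(index="Submission ID", columns="Attribute", values="Value")
    .rename_axis(columns=None)   # drop the leftover "Attribute" axis label
    .reset_index()
)

# ASSUMPTION: birthdates are Excel serial day counts (days since 1899-12-30).
survey_wide["what_is_your_birthdate"] = pd.to_datetime(
    pd.to_numeric(survey_wide["what_is_your_birthdate"], errors="coerce"),
    unit="D",
    origin="1899-12-30",
)

print(survey_wide)
```

`DataFrame.pivot` raises a `ValueError` if a submission answers the same question twice; in that case `pivot_table` with `aggfunc="first"` is one possible fallback.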


In [13]:
# Preview of ticket data
ticket_data.head()

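|   | transaction_id | account_no | email | zip | phone_no | section | row | qty | total_price | event_id | channel |
|---|----------------|------------|-------|-----|----------|---------|-----|-----|-------------|----------|---------|
| 0 | 1 | A87144476G | user400@rockets.com | 77066 | 280-379-5220 | 109 | 9 | 1 | 200 | 3223 | Web |
| 1 | 2 | A66578188Z | user141@rockets.com | 76673 | 490-491-8071 | 101 | 10 | 4 | 800 | 3221 | Box Office |
| 2 | 3 | A11689958W | user98@rockets.com | 77031 | 244-805-9413 | 100 | 18 | 8 | 1600 | 3237 | Box Office |
| 3 | 4 | A47432461Z | user213@rockets.com | 76136 | 826-458-9773 | 400 | 7 | 1 | 50 | 3240 | Web |
| 4 | 5 | A80089942I | user472@rockets.com | 75559 | 803-733-6051 | 414 | 17 | 1 | 25 | 3215 | Box Office |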


## Prepare