# Project Description

## **Prompt:**

#### <u>Company Background</u>:

Awesome Sports is a dynamic player in the athletic market, dedicated to offering a diverse range of top-tier sports and fitness products.

#### <u>Objective</u>:

Recognizing the power of data-driven decision-making, Awesome Sports seeks to optimize customer experience and market positioning. Your expertise as a data analyst is enlisted to extract actionable insights from a comprehensive market research survey.

#### <u>Key Questions</u>:

Customer Satisfaction and Loyalty:

* Overall Rating: Gauge customer perceptions of Awesome Sports.
* Recommendation Rate: Determine the percentage of customers who would recommend Awesome Sports to their family and friends.

Product Perception:

* Pricing Perception: Assess how Awesome Sports is perceived regarding pricing.
* Product Quality Satisfaction: Evaluate customer satisfaction with product quality.

Brand and Trust:

* Brand Trust-Recommendation Correlation: Investigate the correlation between brand trust and the likelihood of customer recommendation.

Fitness and Lifestyle:

* Sport Preferences: Identify the most popular sport among Awesome Sports customers.

## **Approach:**

To effectively analyze the survey data, I systematically streamlined the dataset to directly extract insights related to the key questions, ensuring both clarity and data integrity. Here are the steps I followed:

1. <u>Data Cleaning and Reduction:</u>
    * I began by curating the dataset to include only the columns that are directly relevant to the key questions. This selective approach minimizes clutter and enhances focus on critical information.

2. <u>Data Transformation:</u>
    * For non-numerical answers within the dataset, I mapped each response to a numeric code to standardize and quantify the responses. However, for the question regarding favorite sports (Q3), I retained the original answers without alteration.

    * The numeric coding adhered to the following pattern:

        * For binary responses ("No"= 1, "Yes"= 2).
        * For agreement-scale responses ("Strongly Disagree"= 1, "Disagree"= 2, "Neither Agree or Disagree"= 3, "Agree"= 4, "Strongly Agree"= 5).
        * For satisfaction-scale responses ("Very unsatisfied"= 1, "Unsatisfied"= 2, "Neither"= 3, "Satisfied"= 4, "Very satisfied"= 5).
        * Blank or missing responses remained as NaN to preserve the integrity of the data and prevent introducing bias into the analysis.

This approach ensures that the data remains concise, facilitates a standardized format for analysis, and enables precise examination of the key questions without compromising the data's original structure and significance.
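The coding scheme described above can be illustrated with a minimal sketch. The toy DataFrame and its column names (`recommend`, `well_priced`) are hypothetical stand-ins, not the survey's real columns; the point is only that `Series.map` with a dictionary leaves blank answers as NaN.

```python
import pandas as pd

# Hypothetical toy responses; the real survey columns are loaded later in the notebook.
responses = pd.DataFrame({
    "recommend": ["No", "Yes", None, "Yes"],
    "well_priced": ["Strongly Agree", "Neither Agree or Disagree", None, "Disagree"],
})

yes_no_codes = {"No": 1, "Yes": 2}
agreement_codes = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree or Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

# Series.map returns NaN for any value that has no key in the dictionary,
# so blank answers stay missing instead of being coded as a number.
responses["recommend_code"] = responses["recommend"].map(yes_no_codes)
responses["well_priced_code"] = responses["well_priced"].map(agreement_codes)

print(responses)
```

Because the mapped columns contain NaN, pandas stores them as floats (1.0, 2.0, ...) rather than integers.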

## **Results:**

<u>Customer Satisfaction and Loyalty:</u>

* Overall Rating: The average rating for Awesome Sports is 3.1 out of 5. This suggests a moderate brand perception. It implies that while Awesome Sports meets the expectations of some customers, there's room for improvement. Further segmentation analysis can provide deeper insights.

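* Recommendation Rate: Approximately 35.27% of all surveyed customers answered "Yes" when asked whether they would recommend Awesome Sports to family and friends. Blank answers are counted in the denominator, so the share among customers who actually answered the question is higher. This indicates that a portion of customers had positive experiences and are willing to share them. However, there's room for enhancement through improved customer experiences and satisfaction.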

<u>Product Perception:</u>

* Pricing Perception: The average response to the question "Is Awesome Sports well priced" is 4.14 out of 5. This indicates a positive pricing perception, with a high number of customers believing Awesome Sports offers well-priced products. This perception can also serve as a competitive advantage by influencing customer decisions.

* Product Quality Satisfaction: The average response for quality satisfaction is 2.55 out of 5, which falls between "Unsatisfied" (2) and "Neither" (3) and indicates a relatively low level of satisfaction with product quality. This underscores the need for enhancements in product quality to better align with customer expectations.

<u>Brand and Trust:</u>

* Brand Trust-Recommendation Correlation: The correlation between brand trust and customer recommendation is approximately 0.43. This indicates a moderate positive relationship: customers who trust the brand are more likely to recommend it. This highlights the importance of building and maintaining brand trust for customer advocacy and referrals.

<u>Fitness and Lifestyle:</u>

* Sport Preferences: The most popular sport among Awesome Sports' customers is basketball, with 218 responses. This suggests a significant interest in basketball-related products or activities within the customer base. Recognizing this popularity can guide the company's efforts, potentially leading to expanded product lines, basketball-related promotions, and sponsorship of related events to engage this customer segment.

# Imports

In [15]:
import pandas as pd
import matplotlib.pyplot as plt

# Datasets

In [44]:
#Read file
survey_data= pd.read_excel("Market Research Survey-Data Analysis.xlsx")

#Inspect DataFrame
survey_data

Unnamed: 0,Age,Customer Since,Gender,ID,Location,Number of Records,Q10__How_do_you_rate_AWESOME_SPORTS___1___5_,Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_,Q12__Where_do_you_normally_purchase_from_,Q13__What_was_your_last_sporting_purchase_,...,Q22__Can_we_share_this_information_with_our_partners_,Q1__Have_you_purchased_from_AWESOMESPORT_in_the_last_3_months_,Q3__What_is_your_favourite_sport_,Q4__How_often_do_you_train_,Q5__Do_you_have_a_current_gym_membership_,Q6__Do_you_take_suppliments____i_e___Protein_Shakes_,Q7__How_would_you_describe_yourself_,Q8__Do_you_plan_to_get_fitter_in_the_next_12_months_,Q9__Is_equipment_key_to_your_training_goals_,Segment
0,21,2014-01-04,Female,394,The Netherlands,1,3.0,No,Outlets,Accessories,...,Yes,No,Basketball,Once a month,No,No,Athletic,No,No,Corporate
1,35,2014-01-05,Male,134,France,1,3.0,,Outlets,Accessories,...,,,Basketball,1 - 3 times a week,,,Casual Runner,,,Corporate
2,18,2014-01-06,Male,205,The Netherlands,1,2.0,No,Outlets,Clothing,...,No,Yes,Rugby,Once a week,Yes,No,Prefer not to say,Yes,No,New Customer
3,60,2014-01-10,Male,929,The Netherlands,1,2.0,No,Online,Smart Product,...,No,No,Rugby,Once a month,Yes,No,Casual Runner,Yes,Yes,Returning Customer
4,58,2014-01-11,Female,36,Spain,1,3.0,Yes,In store,Equipment,...,Yes,No,Tennis,1 - 3 times a week,No,Yes,Unfit,No,No,New Customer
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
426,39,2016-03-09,Male,1114,France,1,5.0,,3rd Party,Equipment,...,,,Other,More than 3 times a week,,,Unfit,,,New Customer
427,22,2016-03-10,Female,404,Sweden,1,4.0,No,Outlets,Clothing,...,Yes,No,Basketball,Once a week,No,Yes,Casual Runner,Yes,Yes,Members
428,32,2016-03-11,Male,393,Italy,1,4.0,No,Outlets,Clothing,...,Yes,No,Rugby,Once a week,No,No,Athletic,Yes,Yes,Corporate
429,35,2016-03-11,Male,122,France,1,4.0,No,Online,Clothing,...,Yes,Yes,Rugby,Once a week,No,Yes,Bodybuilder,Yes,No,Members


In [17]:
#Inspecting columns in DataFrame
survey_data.columns

Index(['Age', 'Customer Since', 'Gender', 'ID', 'Location',
       'Number of Records', 'Q10__How_do_you_rate_AWESOME_SPORTS___1___5_',
       'Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_',
       'Q12__Where_do_you_normally_purchase_from_',
       'Q13__What_was_your_last_sporting_purchase_',
       'Q14__AWESOME_SPORT_has_a_good_selection_of_products',
       'Q15__AWESOME_SPORT_is_well_priced',
       'Q16__The_AWESOME_SPORT_brand_I_like_the_most_is',
       'Q17__How_satisfied_are_you_in_AWESOME_SPORT_quality_',
       'Q18__AWESOME_SPORT_is_both_fashionable_and_practical',
       'Q19__I_prefer_AWESOME_SPORT_to_other_brands', 'Unnamed: 16',
       'Q20__AWESOME_SPORT_is_a_brand_I_trust',
       'Q21__Would_you_like_to_sign_up_to_our_newsletters_',
       'Q22__Can_we_share_this_information_with_our_partners_',
       'Q1__Have_you_purchased_from_AWESOMESPORT_in_the_last_3_months_',
       'Q3__What_is_your_favourite_sport_', 'Q4__How_often_do_you_train_',
       'Q5_

# Data Cleaning and Formatting

In [18]:
#set "ID" column as the DataFrame's index
survey_data.set_index("ID", inplace=True)

In [19]:
#Condense the DataFrame so we only have what we want to work with
survey_data= survey_data[[
    "Q10__How_do_you_rate_AWESOME_SPORTS___1___5_",
    "Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_",
    "Q15__AWESOME_SPORT_is_well_priced",
    "Q17__How_satisfied_are_you_in_AWESOME_SPORT_quality_",
    "Q20__AWESOME_SPORT_is_a_brand_I_trust",
    "Q3__What_is_your_favourite_sport_"
]]

In [20]:
#Obtain unique values for each column
for column in survey_data.columns:
    unique_values= survey_data[column].unique()
    print(column) 
    print(unique_values)

Q10__How_do_you_rate_AWESOME_SPORTS___1___5_
[ 3.  2.  4.  1. nan  5.]
Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_
['No' nan 'Yes']
Q15__AWESOME_SPORT_is_well_priced
['Strongly Agree' 'Neither Agree or Disagree' 'Agree' 'Strongly Disagree'
 nan 'Disagree']
Q17__How_satisfied_are_you_in_AWESOME_SPORT_quality_
['Unsatisfied' 'Very unsatisfied' 'Satisfied' 'Neither' nan
 'Very satisfied']
Q20__AWESOME_SPORT_is_a_brand_I_trust
['No' nan 'Yes']
Q3__What_is_your_favourite_sport_
['Basketball' 'Rugby' 'Tennis' 'Football' 'Other' nan]


In [21]:
#Create dictionaries to numerize all of the unique text responses

#Create dictionary for numerizing "Yes" or "No" answers
yes_no_dict= {"No":1, "Yes":2}

#Create dictionary for numerizing "Strongly Disagree" to "Strongly Agree" answers
agree_disagree_dict= {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neither Agree or Disagree": 3,
    "Agree": 4,
    "Strongly Agree": 5
}

#Create dictionary for numerizing "Very unsatisfied" to "Very satisfied" answers
satisfied_unsatisfied_dict= {
    "Very unsatisfied": 1,
    "Unsatisfied": 2,
    "Neither": 3,
    "Satisfied": 4,
    "Very satisfied": 5
} 

In [22]:
#Create new columns with numerized answers for Q11, Q20 (no, yes, blanks= remain blanks/NaN)
survey_data["Q11_clean"]= survey_data["Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_"].map(yes_no_dict)
survey_data["Q20_clean"]= survey_data["Q20__AWESOME_SPORT_is_a_brand_I_trust"].map(yes_no_dict)

#Create new columns with numerized answers for Q15 (agree/disagree likert scale, blanks= remain blanks/NaN)
survey_data["Q15_clean"]= survey_data["Q15__AWESOME_SPORT_is_well_priced"].map(agree_disagree_dict)

#Create new columns with numerized answers for Q17 (satisfied/unsatisfied likert scale, blanks= remain blanks/NaN)
survey_data["Q17_clean"]= survey_data["Q17__How_satisfied_are_you_in_AWESOME_SPORT_quality_"].map(satisfied_unsatisfied_dict)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  survey_data["Q11_clean"]= survey_data["Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_"].map(yes_no_dict)
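The warning above appears because `survey_data` was created by selecting columns from a larger DataFrame, and pandas cannot tell whether a later column assignment is meant to affect the original. A common way to avoid it is to take an explicit copy when subsetting. The sketch below uses a hypothetical toy table (`full_data`) and shortened column names rather than the real survey file.

```python
import pandas as pd

# Hypothetical stand-in for the full survey table.
full_data = pd.DataFrame({
    "ID": [394, 134, 205],
    "Q11": ["No", None, "No"],
    "Age": [21, 35, 18],
}).set_index("ID")

# .copy() makes the column subset an independent DataFrame,
# so adding a column afterwards is unambiguous.
subset = full_data[["Q11"]].copy()
subset["Q11_clean"] = subset["Q11"].map({"No": 1, "Yes": 2})

print(subset)
```

The original table is left untouched; only `subset` gains the new column.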

In [23]:
#Inspect the updated dataframe
survey_data

ID,Q10__How_do_you_rate_AWESOME_SPORTS___1___5_,Q11__Would_you_recommend_AWESOME_SPORT_to_family_friends_,Q15__AWESOME_SPORT_is_well_priced,Q17__How_satisfied_are_you_in_AWESOME_SPORT_quality_,Q20__AWESOME_SPORT_is_a_brand_I_trust,Q3__What_is_your_favourite_sport_,Q11_clean,Q20_clean,Q15_clean,Q17_clean
394,3.0,No,Strongly Agree,Unsatisfied,No,Basketball,1.0,1.0,5.0,2.0
134,3.0,,Neither Agree or Disagree,Unsatisfied,,Basketball,,,3.0,2.0
205,2.0,No,Strongly Agree,Very unsatisfied,Yes,Rugby,1.0,2.0,5.0,1.0
929,2.0,No,Agree,Satisfied,No,Rugby,1.0,1.0,4.0,4.0
36,3.0,Yes,Neither Agree or Disagree,Unsatisfied,Yes,Tennis,2.0,2.0,3.0,2.0
...,...,...,...,...,...,...,...,...,...,...
1114,5.0,,Disagree,Very satisfied,,Other,,,2.0,5.0
404,4.0,No,Strongly Agree,Unsatisfied,Yes,Basketball,1.0,2.0,5.0,2.0
393,4.0,No,Strongly Agree,Neither,No,Rugby,1.0,1.0,5.0,3.0
122,4.0,No,Strongly Agree,Neither,No,Rugby,1.0,1.0,5.0,3.0


# Data Analysis

### How do customers rate Awesome Sports overall?

In [24]:
#Calculate the average rating
avg_rating= survey_data["Q10__How_do_you_rate_AWESOME_SPORTS___1___5_"].mean()

#Round the average rating to two decimal places
rounded_avg= round(avg_rating,2)

#Print rounded average rating
print("The average rating for Awesome Sports is:", rounded_avg, "out of 5")

The average rating for Awesome Sports is: 3.1 out of 5
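A note on missing ratings: by default pandas skips NaN in `sum()` and `mean()`, and `count()` counts only non-missing values. Dividing the sum by the count and calling `mean()` therefore give the same average over the customers who answered. A small sketch with made-up ratings:

```python
import numpy as np
import pandas as pd

# Made-up ratings with one blank answer.
ratings = pd.Series([3.0, 2.0, np.nan, 5.0])

manual_mean = ratings.sum() / ratings.count()  # NaN is skipped by both calls
builtin_mean = ratings.mean()                  # NaN is skipped by default

print(ratings.count(), "answers out of", len(ratings), "rows")
print(round(manual_mean, 2), round(builtin_mean, 2))
```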


### What percentage of customers would recommend Awesome Sports to family and friends?

In [25]:
#Calculate "yes" responses
Q11_yes_count= (survey_data["Q11_clean"]== 2).sum()

#Count all survey rows (blank Q11 answers are included in this total)
total_responses= len(survey_data)

#Calculate percentage
percentage= (Q11_yes_count/total_responses) * 100

#Round the percentage to two decimal places
rounded_percentage= round(percentage,2)

#Print rounded percentage
print("The percentage of customers that would recommend Awesome Sports to family and friends is:",rounded_percentage,"%") 

The percentage of customers that would recommend Awesome Sports to family and friends is: 35.27 %
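The percentage in this cell divides by `len(survey_data)`, so customers who left Q11 blank count toward the total. Whether to include them is a judgment call; the sketch below, on made-up codes, shows how the two denominators differ. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Made-up Q11 codes: 1 = "No", 2 = "Yes", NaN = blank.
q11_clean = pd.Series([1.0, np.nan, 1.0, 1.0, 2.0, np.nan, 2.0, 1.0])

yes_count = (q11_clean == 2).sum()

# Denominator 1: every row, including blanks.
share_of_all_rows = yes_count / len(q11_clean) * 100

# Denominator 2: only customers who answered the question.
share_of_answers = yes_count / q11_clean.count() * 100

print(round(share_of_all_rows, 2), round(share_of_answers, 2))
```

With blanks present, the share among answered rows is always at least as large as the share among all rows.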


### Is Awesome Sports perceived as a well-priced brand?

In [26]:
#Calculate the average response for Q15
Q15_average_response= survey_data["Q15_clean"].mean()

#Round average response to two decimal places
rounded_Q15_average_responses= round(Q15_average_response, 2)

#Print rounded average response
print("The average response to Q15 is:", rounded_Q15_average_responses, "out of 5")

The average response to Q15 is: 4.14 out of 5


### What is the customers' satisfaction level with the quality of Awesome Sports products?

In [27]:
#Calculate the average response for Q17
Q17_average_response= survey_data["Q17_clean"].mean()

#Round average response to two decimal places
rounded_Q17_average_responses= round(Q17_average_response, 2)

#Print rounded average response
print("The average response to Q17 is:", rounded_Q17_average_responses, "out of 5")

The average response to Q17 is: 2.55 out of 5


### Is there a correlation between brand trust and the likelihood of customer recommendation for Awesome Sports?

In [43]:
#Calculate the correlation coefficient between brand trust and the likelihood of customer recommendation
correlation_coefficient= survey_data["Q20_clean"].corr(survey_data["Q11_clean"])

#Print out correlation coefficient
print("The correlation between brand trust and the likelihood of customer recommendation is:", correlation_coefficient)

The correlation between brand trust and the likelihood of customer recommendation is: 0.42845852946138546
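Two properties of this calculation are worth keeping in mind. `Series.corr` computes the Pearson coefficient using only rows where both values are present, and for two yes/no variables the result does not depend on whether the answers are coded 1/2 or 0/1. The sketch below checks both on made-up data; the column names `trust` and `recommend` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up codes: 1 = "No", 2 = "Yes", NaN = blank.
toy = pd.DataFrame({
    "trust":     [1, 2, 2, 1, 2, np.nan, 1, 2],
    "recommend": [1, 2, 1, 1, 2, 2, np.nan, 1],
})

# Pearson correlation; rows with a NaN in either column are ignored.
r_coded_1_2 = toy["trust"].corr(toy["recommend"])

# Shifting the codes to 0/1 leaves the coefficient unchanged.
r_coded_0_1 = (toy["trust"] - 1).corr(toy["recommend"] - 1)

# A cross-tabulation of the complete rows shows the counts behind the number.
complete = toy.dropna()
table = pd.crosstab(complete["trust"], complete["recommend"])

print(r_coded_1_2, r_coded_0_1)
print(table)
```

A correlation describes association only; it does not show that trust causes recommendations.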


### What is the most popular sport among Awesome Sports customers?

In [28]:
#What is the most popular sport among Awesome Sports customers

#Obtain count of each sport
sport_count= survey_data["Q3__What_is_your_favourite_sport_"].value_counts()

#obtain the name and count for the most popular sport
most_popular_sport= sport_count.idxmax()
most_popular_count= sport_count.max()

#Print the most popular sport
print("The most popular sport among Awesome Sports' customers is:", most_popular_sport, "with", most_popular_count, "responses in its favor")

The most popular sport among Awesome Sports' customers is: Basketball with 218 responses in its favor
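The raw count is easier to interpret next to its share of answers. `value_counts` excludes blank answers by default, and `normalize=True` returns fractions of the non-missing answers. A sketch on made-up responses:

```python
import pandas as pd

# Made-up favourite-sport answers with one blank.
sports = pd.Series(["Basketball", "Rugby", "Basketball", None, "Tennis", "Basketball"])

counts = sports.value_counts()                # blanks are excluded by default
shares = sports.value_counts(normalize=True)  # fractions of non-missing answers

top_sport = counts.idxmax()
print(top_sport, counts[top_sport], round(shares[top_sport] * 100, 1), "%")
```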
