![gym](gym.png)


You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.

### The Data

The "Files/data" folder contains four CSV files with worldwide and country-level Google Trends data on keyword searches for fitness and related products.

### workout.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'workout_worldwide'` | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |

### three_keywords.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'home_workout_worldwide'` | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
| `'gym_workout_worldwide'` | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
| `'home_gym_worldwide'` | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |

### workout_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'workout_2018_2023'` | Index representing the popularity of the keyword 'workout' over the period 2018 to 2023. |

### three_keywords_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'home_workout_2018_2023'` | Index representing the popularity of the keyword 'home workout' over the period 2018 to 2023. |
| `'gym_workout_2018_2023'` | Index representing the popularity of the keyword 'gym workout' over the period 2018 to 2023. |
| `'home_gym_2018_2023'` | Index representing the popularity of the keyword 'home gym' over the period 2018 to 2023. |

In [234]:
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

## Import all the data

In [235]:
# Load the four CSV files from the data folder
workout_data = pd.read_csv('data/workout.csv')
three_keywords_data = pd.read_csv('data/three_keywords.csv')
workout_geo_data = pd.read_csv('data/workout_geo.csv')
three_keywords_geo_data = pd.read_csv('data/three_keywords_geo.csv')

In [236]:
# Extract the year (the part before the first '-') from each month string, for grouping later on
years = []
for month in workout_data['month']:
    years.append(month.split('-')[0])

## Add the years to the DataFrame and find the year with the most searches

In [237]:
workout_data["years"] = pd.Series(years)
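The year extraction can also be done without an explicit loop. This is a minimal sketch, assuming the `month` strings start with the year followed by a `-`, as the `split('-')` logic implies. The `demo` frame and its values are invented for illustration and are not taken from `workout.csv`.

```python
import pandas as pd

# Hypothetical stand-in for workout_data; the values are made up
demo = pd.DataFrame({
    "month": ["2018-03", "2018-04", "2019-01"],
    "workout_worldwide": [59, 61, 64],
})

# Split every month string on '-' and keep the first piece (the year) for all rows at once
demo["years"] = demo["month"].str.split("-").str[0]

print(demo["years"].tolist())
```

The years stay as strings here, which matches the `'2018'`-style labels used in the rest of the notebook.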

In [238]:
# Total the monthly 'workout' index for each year
max_workout_searches = workout_data.groupby("years")["workout_worldwide"].sum()

In [239]:
max_workout_searches.index

Index(['2018', '2019', '2020', '2021', '2022', '2023'], dtype='object', name='years')

In [240]:
# Find the index label (year) at which the yearly totals reach their maximum
year_str = max_workout_searches[max_workout_searches == max_workout_searches.max()].index[0]
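Two details of this step may be worth a closer look. First, `Series.idxmax()` returns the index label of the first maximum directly. Second, a yearly sum favors years with more recorded months, so if the first or last year in the file is only partially covered (not verified here), the yearly mean may be a fairer comparison. The sketch below uses invented data in a hypothetical `demo` frame to show how the two measures can disagree.

```python
import pandas as pd

# Hypothetical data: 2018 has three recorded months, 2019 only two
demo = pd.DataFrame({
    "years": ["2018", "2018", "2018", "2019", "2019"],
    "workout_worldwide": [50, 50, 50, 70, 70],
})

yearly_total = demo.groupby("years")["workout_worldwide"].sum()
yearly_mean = demo.groupby("years")["workout_worldwide"].mean()

# idxmax() returns the index label of the first maximum value
peak_by_total = yearly_total.idxmax()
peak_by_mean = yearly_mean.idxmax()

print(peak_by_total, peak_by_mean)
```

With this toy data the total picks 2018 while the mean picks 2019, purely because 2019 has fewer months.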

## Calculate highest interest during COVID (2020) vs. now (2023)

In [241]:
# Extract the year from each month string, as was done for the workout data
years_2 = []
for month in three_keywords_data['month']:
    years_2.append(month.split('-')[0])

In [242]:
three_keywords_data["years"] = pd.Series(years_2)

In [243]:
word_searches_covid = three_keywords_data[["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]][three_keywords_data.years=="2020"].sum()

In [244]:
peak_covid = word_searches_covid[word_searches_covid == word_searches_covid.max()].index[0]

In [245]:
word_searches_now = three_keywords_data[["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]][three_keywords_data.years=="2023"].sum()

In [246]:
current = word_searches_now[word_searches_now == word_searches_now.max()].index[0]
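The 2020 and 2023 lookups follow the same pattern, so they could be folded into one small function. This is only a sketch: `top_keyword_for_year` and the `demo` frame are hypothetical names introduced here, and the numbers are invented rather than taken from `three_keywords.csv`.

```python
import pandas as pd

KEYWORD_COLUMNS = [
    "home_workout_worldwide",
    "gym_workout_worldwide",
    "home_gym_worldwide",
]

def top_keyword_for_year(df, year):
    """Return the keyword column with the largest summed index in the given year."""
    # Select the rows for that year and only the keyword columns, then total each column
    totals = df.loc[df["years"] == year, KEYWORD_COLUMNS].sum()
    # idxmax() gives the column name holding the largest total
    return totals.idxmax()

# Hypothetical stand-in for three_keywords_data; the values are made up
demo = pd.DataFrame({
    "years": ["2020", "2020", "2023", "2023"],
    "home_workout_worldwide": [40, 60, 10, 12],
    "gym_workout_worldwide": [15, 10, 20, 22],
    "home_gym_worldwide": [20, 25, 14, 15],
})

peak_covid_demo = top_keyword_for_year(demo, "2020")
current_demo = top_keyword_for_year(demo, "2023")
print(peak_covid_demo, current_demo)
```

Using `.loc` with a row mask and a column list in one step avoids the chained `df[cols][mask]` selection.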

## Country-specific searches for the United States, Australia, or Japan

In [247]:
workout_geo_data.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 250 entries, 0 to 249
Data columns (total 2 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   country            250 non-null    object 
 1   workout_2018_2023  61 non-null     float64
dtypes: float64(1), object(1)
memory usage: 4.0+ KB


In [248]:
# Sort countries from highest to lowest 'workout' index (rows with missing values are placed last)
all_world_search = workout_geo_data.sort_values(by="workout_2018_2023", ascending=False)

## Comparing values between the three countries

In [249]:
USA = all_world_search[all_world_search["country"]=="United States"]
AUS = all_world_search[all_world_search["country"]=="Australia"]
JPN = all_world_search[all_world_search["country"]=="Japan"]

In [250]:
# Largest 'workout' index among the three countries
highest_interest = max(USA.workout_2018_2023.item(), AUS.workout_2018_2023.item(), JPN.workout_2018_2023.item())

In [251]:
top_country = all_world_search.country[all_world_search.workout_2018_2023==highest_interest].item()
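Looking the winning value up in the full table works only while no other country shares that value, because `.item()` raises a `ValueError` when the selection holds more than one element. A possible alternative, sketched here with an invented `demo` frame and a hypothetical `candidates` list, is to restrict the table to the three countries first and then ask for the label of the largest value.

```python
import pandas as pd

# Hypothetical stand-in for workout_geo_data; the values are made up.
# Canada deliberately ties with the United States, and Iceland has no value.
demo = pd.DataFrame({
    "country": ["United States", "Australia", "Japan", "Canada", "Iceland"],
    "workout_2018_2023": [100.0, 77.0, 1.0, 100.0, None],
})

candidates = ["United States", "Australia", "Japan"]

# Keep only the candidate rows, index them by country name,
# and take the label (country) of the largest index value
subset = demo[demo["country"].isin(candidates)].set_index("country")["workout_2018_2023"]
top_country_demo = subset.idxmax()

print(top_country_demo)
```

Because the comparison never leaves the candidate rows, the tie with Canada in the toy data does not affect the result.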

## Checking interest in the Philippines or Malaysia

In [252]:
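# Keep only the country name and the 'home workout' index
# (the column is referenced as 'Country' here, capitalized, unlike 'country' in the table above)
home_workout_search = three_keywords_geo_data[["Country", "home_workout_2018_2023"]]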

In [253]:
PHI = home_workout_search[home_workout_search["Country"]=="Philippines"]
MLY = home_workout_search[home_workout_search["Country"]=="Malaysia"]

In [254]:
PHI_MLY = max(PHI.home_workout_2018_2023.item(), MLY.home_workout_2018_2023.item())

In [255]:
# Pick whichever of the two countries holds the higher 'home workout' index
home_workout_geo = "Philippines" if PHI_MLY == PHI.home_workout_2018_2023.item() else "Malaysia"