![gym](gym.png)


You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.

### The Data

You are provided with a number of CSV files in the "Files/data" folder, which offer international and national-level data on Google Trends keyword searches related to fitness and related products. 

### workout.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'workout_worldwide'` | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |

### three_keywords.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'home_workout_worldwide'` | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
| `'gym_workout_worldwide'` | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
| `'home_gym_worldwide'` | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |

### workout_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'workout_2018_2023'` | Index representing the popularity of the keyword 'workout' during the five-year period from 2018 to 2023. |

### three_keywords_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'home_workout_2018_2023'` | Index representing the popularity of the keyword 'home workout' during the five-year period from 2018 to 2023. |
| `'gym_workout_2018_2023'` | Index representing the popularity of the keyword 'gym workout' during the five-year period from 2018 to 2023. |
| `'home_gym_2018_2023'` | Index representing the popularity of the keyword 'home gym' during the five-year period from 2018 to 2023. |

### Project Instructions

Help the fitness studio explore interest in workouts at a global and national level.

When was global search interest in 'workout' at its peak? Save the year of peak interest as a string named `year_str` in the format "yyyy".

Of the keywords available, which was the most popular during the COVID-19 pandemic, and which is the most popular now? Save your answers as variables called `peak_covid` and `current`, respectively.

Which country has the highest interest in workouts among the following: United States, Australia, or Japan? Save your answer as `top_country`.

You are interested in expanding your virtual home workout offering to either the Philippines or Malaysia. Which of the two countries has the higher interest in home workouts? Identify the country and save it as `home_workout_geo`.

In [10]:
# Import the necessary libraries
import warnings

import pandas as pd
import matplotlib.pyplot as plt

# Silence library warnings and show wide DataFrames in full
warnings.filterwarnings("ignore")
pd.set_option("display.max_columns", None)
pd.set_option("display.width", 500)

# Load the CSV files (this folder path is specific to the author's machine)
data_dir = '/Users/irvincastillopalacios/Desktop/Proyectos/Data-Driven Product Management Conducting a Market Analysis'

workout = pd.read_csv(f'{data_dir}/workout.csv')
workout_geo = pd.read_csv(f'{data_dir}/workout_geo.csv')
three_keywords = pd.read_csv(f'{data_dir}/three_keywords.csv')
three_keywords_geo = pd.read_csv(f'{data_dir}/three_keywords_geo.csv')
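
The absolute paths above only resolve on the machine where the notebook was written. As a rough sketch of a more portable approach, the helper below reads the four files from a single folder passed in as an argument. The names `load_trend_files` and `FILE_NAMES` are hypothetical and not part of the original notebook, and the sketch assumes all four CSV files sit side by side in that folder.

```python
from pathlib import Path

import pandas as pd

# File stems named in the project description; the helper below is illustrative only.
FILE_NAMES = ["workout", "workout_geo", "three_keywords", "three_keywords_geo"]


def load_trend_files(data_dir):
    """Read each expected CSV from data_dir into a dict of DataFrames keyed by file stem."""
    data_dir = Path(data_dir)
    frames = {}
    for name in FILE_NAMES:
        csv_path = data_dir / f"{name}.csv"
        # Fail early with a clear message instead of a long pandas traceback
        if not csv_path.exists():
            raise FileNotFoundError(f"Expected data file not found: {csv_path}")
        frames[name] = pd.read_csv(csv_path)
    return frames
```

With the folder named in the project description, a call might look like `frames = load_trend_files("Files/data")`, after which `frames["workout"]` would play the role of the `workout` DataFrame.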


In [11]:
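def dataframe_info(df):
    """Print a structural summary of a DataFrame, followed by a blank line."""
    # DataFrame.info() prints its report and returns None, so print() also emits "None"
    print(df.info())
    print()

dataframe_info(workout)
dataframe_info(workout_geo)
dataframe_info(three_keywords)
dataframe_info(three_keywords_geo)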

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 61 entries, 0 to 60
Data columns (total 2 columns):
 #   Column             Non-Null Count  Dtype 
---  ------             --------------  ----- 
 0   month              61 non-null     object
 1   workout_worldwide  61 non-null     int64 
dtypes: int64(1), object(1)
memory usage: 1.1+ KB
None

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 250 entries, 0 to 249
Data columns (total 2 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   country            250 non-null    object 
 1   workout_2018_2023  61 non-null     float64
dtypes: float64(1), object(1)
memory usage: 4.0+ KB
None

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 61 entries, 0 to 60
Data columns (total 4 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   month                   61 non-null     object
 1   home_workout_worldwide  61 non-null  

In [12]:
# Ensure 'month' is a string column in case it is a datetime or other type
workout["month"] = workout["month"].astype(str)

# Get the peak search value for 'workout' worldwide
peak_workout_search = workout["workout_worldwide"].max()

# Get the rows where the peak value occurs
peak_row_location = workout[workout["workout_worldwide"] == peak_workout_search]

# Extract the year from the 'month' column and get the first occurrence
year_str = peak_row_location['month'].str[:4].values[0]

print(year_str)


2020
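
As an alternative to slicing the first four characters of the string, the peak year can also be read from a parsed date. This is only a sketch: the values in `sample` are invented for illustration and are not taken from `workout.csv`, the helper name `peak_year` is hypothetical, and the code assumes `month` is stored as "YYYY-MM" text.

```python
import pandas as pd

# Made-up values for illustration only; not taken from workout.csv.
sample = pd.DataFrame(
    {
        "month": ["2019-11", "2019-12", "2020-03", "2020-04", "2020-05"],
        "workout_worldwide": [55, 52, 80, 100, 90],
    }
)


def peak_year(df, date_col="month", value_col="workout_worldwide"):
    """Return the year (as a 'yyyy' string) of the row with the highest value."""
    # Assumes the dates are written as 'YYYY-MM'
    dates = pd.to_datetime(df[date_col], format="%Y-%m")
    peak_index = df[value_col].idxmax()  # index label of the first maximum
    return str(dates.loc[peak_index].year)


year_str_sketch = peak_year(sample)
print(year_str_sketch)
```

Parsing the dates first means a malformed `month` value raises an error instead of silently producing a wrong four-character prefix.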


In [17]:
# Of the keywords available, what was the most popular during the covid pandemic, and what 
# is the most popular now? Save your answers as variables called peak_covid and current respectively.
keyword_cols = ["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]

# Treat calendar year 2020 as the pandemic window; ISO-style 'month' strings sort chronologically
covid_dates = (three_keywords["month"] >= "2020-01") & (three_keywords["month"] <= "2020-12")
covid_data = three_keywords[covid_dates]
# Rank the keywords by their highest monthly value inside the window
peak_covid = covid_data[keyword_cols].max().idxmax()
print(peak_covid)

# Treat the months after "2022-12" as "now"
current_data = three_keywords[three_keywords["month"] > "2022-12"]
current = current_data[keyword_cols].max().idxmax()
print(current)


home_workout_worldwide
gym_workout_worldwide
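
The cell above ranks keywords by their single highest month in each window. A keyword with one short spike could win that comparison while trailing on average, so it may be worth checking the mean as well. The sketch below wraps both options in one hypothetical helper, `top_keyword`. The numbers in `sample` are invented for illustration and are not taken from `three_keywords.csv`, and the code assumes `month` holds "YYYY-MM" strings.

```python
import pandas as pd

KEYWORD_COLS = ["home_workout_worldwide", "gym_workout_worldwide", "home_gym_worldwide"]

# Made-up values for illustration only; not taken from three_keywords.csv.
sample = pd.DataFrame(
    {
        "month": ["2020-03", "2020-04", "2020-05", "2023-01", "2023-02"],
        "home_workout_worldwide": [30, 100, 60, 14, 13],
        "gym_workout_worldwide": [20, 12, 11, 18, 19],
        "home_gym_worldwide": [15, 40, 35, 12, 11],
    }
)


def top_keyword(df, start, end, how="max"):
    """Return the keyword column with the highest max (or mean) between start and end, inclusive."""
    # String comparison works here because 'YYYY-MM' text sorts chronologically
    in_window = df["month"].between(start, end)
    window = df.loc[in_window, KEYWORD_COLS]
    if window.empty:
        raise ValueError(f"No rows between {start} and {end}")
    summary = window.max() if how == "max" else window.mean()
    return summary.idxmax()


print(top_keyword(sample, "2020-01", "2020-12", how="max"))
print(top_keyword(sample, "2023-01", "2023-12", how="mean"))
```

If the two settings of `how` disagree on the real data, that is a signal to look at a line plot of the three series before settling on an answer.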


In [14]:
# What country has the highest interest for workouts among the following: United States, Australia, or Japan? Save your answer as top_country.
countries = ["United States", "Australia", "Japan"]
countries_data = workout_geo[workout_geo["country"].isin(countries)]
top_country = countries_data.loc[countries_data["workout_2018_2023"].idxmax(), "country"]
print(top_country)


United States
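
The `info()` output earlier shows that only 61 of the 250 rows in `workout_geo` have a value in `workout_2018_2023`. `idxmax()` skips missing values, so a shortlisted country with no score would simply drop out of the comparison without any warning. The sketch below makes that case visible. The helper name `top_country_checked` is hypothetical, and the values in `sample_geo` are invented for illustration, not taken from `workout_geo.csv`.

```python
import pandas as pd

# Made-up values for illustration only; not taken from workout_geo.csv.
sample_geo = pd.DataFrame(
    {
        "country": ["United States", "Australia", "Japan", "Iceland"],
        "workout_2018_2023": [100.0, 77.0, None, None],
    }
)


def top_country_checked(df, countries, value_col="workout_2018_2023"):
    """Return (top country, shortlisted countries with no score)."""
    subset = df[df["country"].isin(countries)]
    found = set(subset["country"])
    absent = [c for c in countries if c not in found]
    if absent:
        # A misspelled or missing country name would otherwise go unnoticed
        raise KeyError(f"Countries not present in the data: {absent}")
    missing_scores = subset.loc[subset[value_col].isna(), "country"].tolist()
    scored = subset.dropna(subset=[value_col])
    if scored.empty:
        raise ValueError("None of the requested countries has a score")
    top = scored.loc[scored[value_col].idxmax(), "country"]
    return top, missing_scores


top, unscored = top_country_checked(sample_geo, ["United States", "Australia", "Japan"])
print(top, unscored)
```

Returning the unscored countries alongside the winner lets the reader judge whether the comparison was actually made across the full shortlist.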


In [15]:
# You'd be interested in expanding your virtual home workouts offering to either the Philippines or Malaysia. Which of the 
# two countries has the higher interest in home workouts? Identify the country and save it as home_workout_geo.
# Note: this cell selects the column as "Country" (capital C), unlike the lowercase 'country' listed in the data dictionary.
asean_countries = ["Philippines", "Malaysia"]
asean_countries_data = three_keywords_geo[three_keywords_geo["Country"].isin(asean_countries)]
home_workout_geo = asean_countries_data.loc[asean_countries_data["home_workout_2018_2023"].idxmax(), "Country"]
print(home_workout_geo)



Philippines
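
The last cell selects `"Country"` while the `workout_geo` cell selects `"country"`, so the two geo files appear to differ in header capitalisation. One way to avoid tracking that per file is to normalise the column names once after loading. The helper `normalize_columns` below is a hypothetical sketch, and the values in `sample_geo` are invented for illustration, not taken from `three_keywords_geo.csv`.

```python
import pandas as pd

# Made-up values for illustration only; not taken from three_keywords_geo.csv.
sample_geo = pd.DataFrame(
    {
        "Country": ["Philippines", "Malaysia"],
        "home_workout_2018_2023": [52.0, 47.0],
    }
)


def normalize_columns(df):
    """Return a copy whose column names are stripped, lower-cased, and use underscores."""
    cleaned = df.copy()
    cleaned.columns = (
        cleaned.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    )
    return cleaned


tidy = normalize_columns(sample_geo)
print(tidy.columns.tolist())
```

After this step, every geo DataFrame could be queried with the lowercase `country` name used in the data dictionary.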
