![gym](gym.png)


You are a product manager for a fitness studio and are interested in understanding the current demand for digital fitness classes. You plan to conduct a market analysis in Python to gauge demand and identify potential areas for growth of digital products and services.

### The Data

The "Files/data" folder contains several CSV files with worldwide and country-level Google Trends data on searches for fitness keywords and related products.

### workout.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'workout_worldwide'` | Index representing the popularity of the keyword 'workout', on a scale of 0 to 100. |

### three_keywords.csv

| Column     | Description              |
|------------|--------------------------|
| `'month'` | Month when the data was measured. |
| `'home_workout_worldwide'` | Index representing the popularity of the keyword 'home workout', on a scale of 0 to 100. |
| `'gym_workout_worldwide'` | Index representing the popularity of the keyword 'gym workout', on a scale of 0 to 100. |
| `'home_gym_worldwide'` | Index representing the popularity of the keyword 'home gym', on a scale of 0 to 100. |

### workout_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'workout_2018_2023'` | Index representing the popularity of the keyword 'workout' during the 5-year period. |

### three_keywords_geo.csv

| Column     | Description              |
|------------|--------------------------|
| `'country'` | Country where the data was measured. |
| `'home_workout_2018_2023'` | Index representing the popularity of the keyword 'home workout' during the 5-year period. |
| `'gym_workout_2018_2023'` | Index representing the popularity of the keyword 'gym workout' during the 5-year period. |
| `'home_gym_2018_2023'` | Index representing the popularity of the keyword 'home gym' during the 5-year period. |

In [176]:
# Import the necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
import os

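def path_builder(path_add, path_based='data'):
    """Build the path to a data file and check that it exists.

    Args:
        path_add (str): the addition to the root path
        path_based (str): the root path
    Returns:
        str: the final path
    Raises:
        FileNotFoundError: if the resulting path does not exist
    """
    value_tmp = os.path.join(path_based, path_add)
    # Fail loudly instead of silently returning None for a missing file
    if not os.path.exists(value_tmp):
        raise FileNotFoundError(f'No such file: {value_tmp}')
    return value_tmp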


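def print_clean(value):
    """Print a value between two separator lines."""
    # 'value' avoids shadowing the built-in str
    print('-' * 30)
    print(value)
    print('-' * 30)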

In [177]:
# reading the file and cleaning it
data_country_workout = pd.read_csv(path_builder('workout_geo.csv'))

data_country_kws = pd.read_csv(path_builder('three_keywords_geo.csv'))
data_country_kws.columns = [col.lower() for col in data_country_kws.columns]

data_country = (
    data_country_workout.merge(data_country_kws, on='country', how='outer')
    .dropna(subset=['workout_2018_2023',
                    'home_workout_2018_2023',
                    'gym_workout_2018_2023',
                    'home_gym_2018_2023'], how='all')
)

# Which of the United States, Australia, or Japan shows the highest
# interest in 'workout'? Select the row with the largest index value.
data_country_tmp = data_country.loc[data_country['country'].isin(['United States', 'Australia', 'Japan'])]
top_country = data_country_tmp.loc[data_country_tmp['workout_2018_2023'].idxmax(), 'country']
print(top_country)

# Expanding the virtual home workouts offering to either the
# Philippines or Malaysia: which shows more interest in 'home workout'?

data_country_tmp = data_country.loc[data_country['country'].isin(['Philippines', 'Malaysia'])]
home_workout_geo = data_country_tmp.loc[data_country_tmp['home_workout_2018_2023'].idxmax(), 'country']
print(home_workout_geo)

United States
Philippines
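
One detail worth checking in a lookup like this: `DataFrame.max()` is computed column by column, so on a frame that holds both a name column and a score column it returns the alphabetically last name alongside the largest score, and the two need not come from the same row. A minimal sketch of the difference, using invented scores (not the Google Trends values) and a hypothetical `score` column:

```python
import pandas as pd

# Toy data: the scores are invented for illustration only.
toy = pd.DataFrame({
    'country': ['United States', 'Australia', 'Japan'],
    'score': [50, 80, 60],
})

# Column-wise maximum: 'country' and 'score' are reduced independently,
# so this picks the alphabetically last country name.
column_wise = toy[['country', 'score']].max()['country']

# Row-wise lookup: find the row with the largest score, then read its country.
row_wise = toy.loc[toy['score'].idxmax(), 'country']

print(column_wise)
print(row_wise)
```

With these toy values the column-wise version reports United States even though Australia has the highest score, while the `idxmax()` lookup returns Australia.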


In [178]:
data_workout = pd.read_csv(path_builder('workout.csv'))
data_kws = pd.read_csv(path_builder('three_keywords.csv'))
data_year = (
    data_workout.merge(data_kws, on='month', how='outer')
    .dropna(subset=['workout_worldwide',
                    'home_workout_worldwide',
                    'gym_workout_worldwide',
                    'home_gym_worldwide'], how='all')
)

# adding the column year to the DataFrame
data_year['year'] = pd.to_datetime(data_year['month']).dt.strftime('%Y')

# finding the year with highest search for workout
year_str = data_year[['year', 'workout_worldwide']].sort_values('workout_worldwide', ascending = False).iloc[0,0]
print_clean(year_str)


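# Of the keywords available, which was the most popular
# during the covid pandemic?
# The window covers months strictly after 2020-03 and before 2023-05;
# comparing strings works here because 'month' is formatted as YYYY-MM.
keyword_cols = ['home_workout_worldwide',
                'gym_workout_worldwide',
                'home_gym_worldwide']
in_covid = (data_year['month'] > '2020-03') & (data_year['month'] < '2023-05')
peak_covid_tmp = data_year.loc[in_covid, keyword_cols].sum()
print_clean(peak_covid_tmp)

# Read off the printed totals: 'home workout' has the largest sum
peak_covid = 'home_workout'

# And which is the most popular now (the latest month in the data)?
current_tmp = data_year.sort_values('month', ascending = False).iloc[0]
print_clean(current_tmp)

# Read off the printed row: 'gym workout' has the largest value
current = 'gym_workout'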


------------------------------
2020
------------------------------
------------------------------
home_workout_worldwide    696
gym_workout_worldwide     573
home_gym_worldwide        587
dtype: int64
------------------------------
------------------------------
month                     2023-03
workout_worldwide              55
home_workout_worldwide         13
gym_workout_worldwide          19
home_gym_worldwide             12
year                         2023
Name: 60, dtype: object
------------------------------
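
The `peak_covid` and `current` labels are typed in by hand after reading the printed output. As a sketch, the same labels could be derived with `Series.idxmax()`. The values below are copied from the printed totals and from the latest row; the `top_keyword` helper and its suffix stripping are assumptions for illustration, not part of the original notebook.

```python
import pandas as pd

# Totals over the pandemic window, copied from the printed output.
covid_totals = pd.Series({
    'home_workout_worldwide': 696,
    'gym_workout_worldwide': 573,
    'home_gym_worldwide': 587,
})

# Most recent month (2023-03), copied from the printed output.
latest = pd.Series({
    'home_workout_worldwide': 13,
    'gym_workout_worldwide': 19,
    'home_gym_worldwide': 12,
})


def top_keyword(scores):
    """Hypothetical helper: label of the largest value, without the suffix."""
    return scores.idxmax().replace('_worldwide', '')


peak_covid = top_keyword(covid_totals)
current = top_keyword(latest)
print(peak_covid, current)
```

Deriving the labels this way keeps them in step with the data if the CSV files are refreshed, at the cost of one small helper.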
