Task 1: Generate a synthetic dataset that matches the description
- Participant ID (numeric)
- Exercise group (jogging, weightlifting, yoga)
- Pre-exercise systolic blood pressure (numeric)
- Post-exercise systolic blood pressure (numeric)


In [11]:
# Import libraries 
import pandas as pd
import numpy as np

#Set random seed for numpy's pseudo random generator
np.random.seed(0)

#Define number of participants
num_participants = 100

# ---------------------- TASK 1 ----------------------------------------
# Generate participant IDs with np.arange,
# starting at 1 and stopping before num_participants + 1
participants_id = np.arange(1, num_participants + 1)

# Randomly assign each participant to an exercise group
exercise_groups = np.random.choice(['jogging', 'weightlifting', 'yoga'], num_participants)

# Define the ranges for the pre- and post-exercise blood pressure
# The pre and post bounds are rough estimates of systolic blood pressure values
pre_minValue = 90
pre_maxValue = 150
post_minValue = 160
post_maxValue = 220

# Generate systolic blood pressure before and after exercise
# np.random.randint excludes the upper bound, so max+1 makes the max value reachable
pre_bp = np.random.randint(pre_minValue, pre_maxValue+1, num_participants)
post_bp = np.random.randint(post_minValue, post_maxValue+1, num_participants)

# Create a DataFrame with the columns listed in the task description
data = pd.DataFrame({
    'Participant ID': participants_id,
    'Exercise Group': exercise_groups,
    'Pre-exercise Systolic BP': pre_bp,
    'Post-exercise Systolic BP': post_bp
})

# Export the dataframe to a csv file
data.to_csv('exercise_data.csv', index=False)

# Display the first and last rows of the data. The CSV file is created for better readability of the full dataset
print(data)



    Participant ID Exercise Group  Pre-exercise Systolic BP  \
0                1        jogging                       132   
1                2  weightlifting                       103   
2                3        jogging                       138   
3                4  weightlifting                       129   
4                5  weightlifting                       111   
..             ...            ...                       ...   
95              96        jogging                       143   
96              97           yoga                       109   
97              98        jogging                       123   
98              99  weightlifting                       130   
99             100           yoga                       122   

    Post-exercise Systolic BP  
0                         196  
1                         219  
2                         219  
3                         217  
4                         166  
..                        ...  
95                 
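
The cell above uses NumPy's legacy global seeding (`np.random.seed`). As a minimal sketch of the same generation step, the version below uses a local `np.random.default_rng` generator instead. The function name `make_dataset` and its parameters are hypothetical, and because the generator differs, the values it produces will not match the output printed above.

```python
import numpy as np
import pandas as pd

def make_dataset(num_participants=100, seed=0):
    """Hypothetical helper: build the synthetic exercise dataset."""
    # A local generator keeps the randomness independent of global state
    rng = np.random.default_rng(seed)
    groups = ['jogging', 'weightlifting', 'yoga']
    return pd.DataFrame({
        'Participant ID': np.arange(1, num_participants + 1),
        'Exercise Group': rng.choice(groups, num_participants),
        # endpoint=True makes the upper bound inclusive, so no max+1 is needed
        'Pre-exercise Systolic BP': rng.integers(90, 150, num_participants, endpoint=True),
        'Post-exercise Systolic BP': rng.integers(160, 220, num_participants, endpoint=True),
    })

sample = make_dataset()
print(sample.head())
```

Passing the same `seed` twice should give the same DataFrame, which keeps the dataset reproducible without touching NumPy's global random state.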

Task 2: Count Vowels in Exercise Types
- Takes in name of the exercise
- Returns number of vowels in the string

In [7]:
# ------------------------- TASK 2 ---------------------------------
# Ask the user for the name of an exercise
inputStr = input('Enter the exercise: ')

# Create a count variable
vowel_count = 0

# Set of vowels in both lower and upper case
vowels = set("aeiouAEIOU")

# Iterate over every character in inputStr
for letter in inputStr:
    # If the character is a vowel, increment the count
    if letter in vowels:
        vowel_count += 1

# Print out the input string and the number of vowels
print(inputStr)
print('Number of vowels : ', vowel_count)

# When running the code in Jupyter notebook, the input prompt appears at the top of the page


YOGA
Number of vowels :  2
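
The task describes something that takes an exercise name and returns a count, so the same logic can also be wrapped in a function. This is a minimal sketch; the name `count_vowels` is an assumption and does not appear in the cell above.

```python
def count_vowels(name):
    """Return the number of vowels (a, e, i, o, u, either case) in name."""
    vowels = set("aeiouAEIOU")
    # sum() adds 1 for every character that is a vowel
    return sum(1 for letter in name if letter in vowels)

print(count_vowels("YOGA"))           # 2
print(count_vowels("weightlifting"))  # 4
```

A function avoids the interactive `input` call, which makes the logic easier to reuse and to test.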


Task 3: Highest Pre-Exercise Blood Pressure by Group
- Read the exercise_data.csv 
- Print the participant with highest pre-exercise systolic blood pressure in each group

In [12]:
# Import libraries
import pandas as pd

# Read the dataset from the CSV file
exercise = pd.read_csv('exercise_data.csv')

# Group the data by Exercise Group
grouped = exercise.groupby('Exercise Group')

# Use max() to find the highest pre-exercise BP in each exercise group
max_pre_bp = grouped['Pre-exercise Systolic BP'].max()

# Print the maximum pre-exercise blood pressure per group
print(max_pre_bp)

Exercise Group
jogging          150
weightlifting    149
yoga             147
Name: Pre-exercise Systolic BP, dtype: int64
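
The output above gives the highest value per group but not which participant it belongs to. One possible way to recover the full rows is `idxmax`, sketched below on a small hypothetical DataFrame `toy` (these four rows are made up for illustration and are not from exercise_data.csv).

```python
import pandas as pd

# Hypothetical toy data with the same column names as the dataset
toy = pd.DataFrame({
    'Participant ID': [1, 2, 3, 4],
    'Exercise Group': ['jogging', 'yoga', 'jogging', 'yoga'],
    'Pre-exercise Systolic BP': [120, 110, 135, 105],
})

# idxmax returns, per group, the row label of the highest value
top_index = toy.groupby('Exercise Group')['Pre-exercise Systolic BP'].idxmax()

# Use those labels to select the complete rows, including Participant ID
top_rows = toy.loc[top_index]
print(top_rows)
```

If two participants in a group share the highest value, `idxmax` returns only the first occurrence, so ties would need separate handling.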


Task 4: Extract Even Participant IDs
- List Participant IDs
- Return a new list containing only even-numbered IDs

In [13]:
# ----------------------------Task 4 -----------------------------
# Import library
import pandas as pd

# Import exercise_data 
data = pd.read_csv('exercise_data.csv')

# Classify a value as odd or even
def parity(x):
    # A remainder of 0 when divided by 2 means x is even
    if x % 2 == 0:
        return 'even'
    # Anything else is odd
    return 'odd'

# Store the parity of each Participant ID in a new column
data['Parity'] = data['Participant ID'].apply(parity)
# Keep only the rows with even parity
evenParity = data[data['Parity'] == 'even']

# Print out the Participants with even parity
print(evenParity)

# Export the participants with even IDs to a CSV file
evenParity.to_csv('even_parity.csv', index=False)

    Participant ID Exercise Group  Pre-exercise Systolic BP  \
1                2  weightlifting                       103   
3                4  weightlifting                       129   
5                6           yoga                        99   
7                8           yoga                       100   
9               10        jogging                       133   
11              12           yoga                       113   
13              14           yoga                        92   
15              16        jogging                       124   
17              18  weightlifting                       120   
19              20  weightlifting                        93   
21              22  weightlifting                       136   
23              24        jogging                       110   
25              26           yoga                       140   
27              28           yoga                       104   
29              30  weightlifting                      
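
The task statement asks for a new list, while the cell above produces a filtered DataFrame. As a minimal sketch of the list-based version, a list comprehension with the same `% 2` test is enough; the function name `even_ids` is hypothetical.

```python
def even_ids(ids):
    """Return a new list holding only the even-numbered IDs."""
    # An ID is even when dividing it by 2 leaves no remainder
    return [participant_id for participant_id in ids if participant_id % 2 == 0]

print(even_ids(range(1, 11)))  # [2, 4, 6, 8, 10]
```

With the DataFrame from the cell above, the input could be `data['Participant ID'].tolist()`.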

Task 5: Monthly Blood Pressure Change
- Assume blood pressure measurements were taken monthly
- Compute and print average change in blood pressure for each group



In [14]:
# Import library
import pandas as pd

# Read the dataset from the CSV file
exercise = pd.read_csv('exercise_data.csv')

# Define a function that calculates the mean and the average difference
# Its parameter is the name of the exercise group
def mean_and_meanDiff(exercise_group):
    # Retrieve the exercise for grouping
    group_data = exercise[exercise['Exercise Group']== exercise_group]

    # Calculate the average of all the Pre-exercise BP in the exercise group
    pre_bp_mean = group_data['Pre-exercise Systolic BP'].mean()
    # Calculate the average difference between consecutive Pre-exercise BP rows in the exercise group
    pre_bp_diff = group_data['Pre-exercise Systolic BP'].diff().mean()

    # Calculate the average of all the Post-exercise BP in the exercise group
    post_bp_mean = group_data['Post-exercise Systolic BP'].mean()
    # Calculate the average difference between consecutive Post-exercise BP rows in the exercise group
    post_bp_diff = group_data['Post-exercise Systolic BP'].diff().mean()

    # Display the pre and post exercise mean and mean difference
    print(f'{exercise_group.upper()} Exercise Group:' )
    print('BEFORE EXERCISE:')
    print('Mean:', pre_bp_mean)
    print('Average Difference:', pre_bp_diff)
    print('AFTER EXERCISE:')
    print('Mean:', post_bp_mean)
    print('Average Difference:', post_bp_diff)
    print()

# Call the function for each exercise group
mean_and_meanDiff('jogging')
mean_and_meanDiff('weightlifting')
mean_and_meanDiff('yoga')

JOGGING Exercise Group:
BEFORE EXERCISE:
Mean: 122.33333333333333
Average Difference: -0.23684210526315788
AFTER EXERCISE:
Mean: 186.84615384615384
Average Difference: -0.5263157894736842

WEIGHTLIFTING Exercise Group:
BEFORE EXERCISE:
Mean: 123.6470588235294
Average Difference: 0.8181818181818182
AFTER EXERCISE:
Mean: 191.94117647058823
Average Difference: -1.5454545454545454

YOGA Exercise Group:
BEFORE EXERCISE:
Mean: 114.18518518518519
Average Difference: 0.8846153846153846
AFTER EXERCISE:
Mean: 186.33333333333334
Average Difference: -0.7692307692307693
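
The "Average Difference" values come from `diff().mean()`, which treats consecutive rows of a group as successive measurements. The sketch below uses a hypothetical Series `readings` (made-up values) to show what that computes: the row-to-row differences cancel in the middle, so the result depends only on the first and last reading.

```python
import pandas as pd

# Hypothetical readings, treated as successive monthly measurements
readings = pd.Series([120, 126, 123, 130])

# diff() gives NaN, 6, -3, 7; mean() skips the NaN
avg_change = readings.diff().mean()

# The intermediate values cancel, leaving (last - first) / (n - 1)
shortcut = (readings.iloc[-1] - readings.iloc[0]) / (len(readings) - 1)

print(avg_change, shortcut)
```

This explains why the averages printed above are small: they reflect only the gap between the first and last row of each group, divided by the number of steps.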



Task 6: Compare Pre and Post Exercise Blood Pressure
- Takes pre- and post-exercise systolic blood pressure
- Returns a new list representing their differences

In [15]:
# Import Library
import pandas as pd

# Read the file using pandas
data = pd.read_csv('exercise_data.csv')

# Create a BP Difference column by subtracting the pre-exercise BP from the post-exercise BP
data['BP Difference'] = data['Post-exercise Systolic BP'] - data['Pre-exercise Systolic BP']

# Write the data, including the new difference column, back to the same CSV file
data.to_csv('exercise_data.csv', index=False)

# Print the BP Difference
print(data['BP Difference'])

0      64
1     116
2      81
3      88
4      55
     ... 
95     22
96    104
97     53
98     38
99     39
Name: BP Difference, Length: 100, dtype: int64
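
The task statement describes taking the two sets of readings and returning a new list. A minimal sketch of that list-based form follows; the function name `bp_differences` is hypothetical, and the sample values are the first three rows of the output shown for Task 1.

```python
def bp_differences(pre, post):
    """Return post - pre for each pair of readings as a new list."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    # zip pairs each pre-exercise reading with its post-exercise reading
    return [after - before for before, after in zip(pre, post)]

print(bp_differences([132, 103, 138], [196, 219, 219]))  # [64, 116, 81]
```

The result agrees with the first three entries of the `BP Difference` column printed above.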


Task 7: Total Blood Pressure Reduction for Each Exercise Group
- Read CSV file
- Compute the total blood pressure reduction of each exercise group

In [16]:
# Import Library
import pandas as pd

# Read the file using pandas
data = pd.read_csv('exercise_data.csv')

# Group the data by Exercise Group
grouped = data.groupby('Exercise Group')

# Calculate the total blood pressure change for each group:
# the sum of post BP in each group minus the sum of pre BP in that group.
# A positive value means BP was higher after exercise than before
total_reduction = grouped['Post-exercise Systolic BP'].sum() - grouped['Pre-exercise Systolic BP'].sum()

# Print the total blood pressure reduction
print("Total Blood Pressure Reduction:")
print(total_reduction)


Total Blood Pressure Reduction:
Exercise Group
jogging          2516
weightlifting    2322
yoga             1948
dtype: int64
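
Because Task 6 already stores a `BP Difference` column, the per-group total could also be obtained by summing that column directly. The sketch below checks on a small hypothetical DataFrame `toy` (made-up values) that both routes agree.

```python
import pandas as pd

# Hypothetical toy data with the same column names as the dataset
toy = pd.DataFrame({
    'Exercise Group': ['jogging', 'yoga', 'jogging', 'yoga'],
    'Pre-exercise Systolic BP': [120, 110, 130, 100],
    'Post-exercise Systolic BP': [180, 170, 200, 165],
})
toy['BP Difference'] = toy['Post-exercise Systolic BP'] - toy['Pre-exercise Systolic BP']

grouped = toy.groupby('Exercise Group')

# Route 1: sum the per-participant differences within each group
by_column = grouped['BP Difference'].sum()

# Route 2: difference of the per-group sums, as in the cell above
by_sums = grouped['Post-exercise Systolic BP'].sum() - grouped['Pre-exercise Systolic BP'].sum()

print(by_column)
print((by_column == by_sums).all())
```

Both routes give the same totals because a sum of differences equals the difference of the sums.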
