### Assessment Description

A researcher is conducting a study on the effects of different exercise regimens on blood pressure. The study involves 100 participants who are randomly assigned to one of three exercise groups: jogging, weightlifting, or yoga. Each participant's blood pressure is measured before and after the 6-week exercise program.

The researcher has collected the data and stored it in a CSV file. The file contains the following columns:

- Participant ID (numeric)
- Exercise group (text: "jogging", "weightlifting", or "yoga")
- Pre-exercise systolic blood pressure (numeric)
- Post-exercise systolic blood pressure (numeric)

The researcher wants to analyze the data using Python and NumPy. Complete the following tasks as part of the initial statistical analysis of the scenario above.

### Generate Synthetic Dataset on Exercise and Blood Pressure

1.     Create a Python script that generates a synthetic dataset matching the description of the study. The dataset should be saved as a CSV file named "exercise_data.csv".

import numpy as np
import pandas as pd

# Number of participants
number_of_participants = 100

# Create synthetic data
np.random.seed(0)  # For reproducibility
participant_ids = np.arange(1, number_of_participants + 1)
exercise_groups = np.random.choice(['jogging', 'weightlifting', 'yoga'], number_of_participants)
pre_exercise_bp = np.random.normal(120, 15, number_of_participants)  # Assume normal distribution around 120 mmHg
post_exercise_bp = pre_exercise_bp - np.random.normal(5, 10, number_of_participants)  # Decrease with some variability

# Create DataFrame
data = {
    'Participant ID': participant_ids,
    'Exercise group': exercise_groups,
    'Pre-exercise systolic BP': pre_exercise_bp,
    'Post-exercise systolic BP': post_exercise_bp
}
df = pd.DataFrame(data)

# Save as CSV file
csv_file_path = 'exercise_data.csv'
df.to_csv(csv_file_path, index=False)




### Highest Pre-Exercise Blood Pressure by Group

2.     Write a Python script to read the "exercise_data.csv" file and print the participant with the highest pre-exercise systolic blood pressure in each exercise group.
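
One possible approach, sketched under stated assumptions: the column names match those used by the generator script above, and the helper name `highest_pre_bp_by_group` is hypothetical. The sample rows are made-up values so the sketch runs without the CSV file on disk.

```python
import io

import pandas as pd

# Hypothetical sample rows (made-up values) in the same layout as exercise_data.csv.
SAMPLE_CSV = """Participant ID,Exercise group,Pre-exercise systolic BP,Post-exercise systolic BP
1,jogging,118.0,112.0
2,yoga,131.5,127.0
3,weightlifting,125.0,121.5
4,jogging,140.0,133.0
5,yoga,110.0,111.0
6,weightlifting,122.5,116.0
"""


def highest_pre_bp_by_group(source):
    """Return the row with the highest pre-exercise BP in each exercise group.

    `source` may be a file path such as "exercise_data.csv" or a file-like object.
    """
    df = pd.read_csv(source)
    # idxmax yields, for each group, the index label of the row holding the maximum.
    top_index = df.groupby("Exercise group")["Pre-exercise systolic BP"].idxmax()
    return df.loc[top_index].reset_index(drop=True)


result = highest_pre_bp_by_group(io.StringIO(SAMPLE_CSV))
print(result.to_string(index=False))
```

To run it against the real dataset, pass the file name instead: `highest_pre_bp_by_group("exercise_data.csv")`. If two participants in a group tie for the maximum, `idxmax` keeps only the first one.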

### Extract the 5 Participants with Highest Blood Pressure

3.     Write a Python function that sorts the participant records by pre-exercise systolic blood pressure in descending order and displays the full record of the top 5.
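
A minimal sketch of such a function, assuming "blood pressure" here means the pre-exercise reading and that the records are already loaded into a pandas DataFrame. The function name `top_n_by_bp` and the sample rows are hypothetical.

```python
import pandas as pd


def top_n_by_bp(df, column="Pre-exercise systolic BP", n=5):
    """Return the full records of the n participants with the highest value in `column`."""
    # Sort in descending order so the highest readings come first, then keep n rows.
    return df.sort_values(column, ascending=False).head(n)


# Hypothetical records (made-up values) in the layout of exercise_data.csv.
records = pd.DataFrame({
    "Participant ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Exercise group": ["jogging", "yoga", "weightlifting", "jogging",
                       "yoga", "weightlifting", "jogging", "yoga"],
    "Pre-exercise systolic BP": [118.0, 131.5, 125.0, 140.0, 110.0, 122.5, 135.0, 129.0],
    "Post-exercise systolic BP": [112.0, 127.5, 121.0, 133.0, 111.0, 116.0, 130.5, 131.0],
})

top5 = top_n_by_bp(records)
print(top5.to_string(index=False))
```

With the real data, the DataFrame would come from `pd.read_csv("exercise_data.csv")`. Passing `column="Post-exercise systolic BP"` ranks by the post-exercise reading instead.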

### Monthly Blood Pressure Changes

4.     Write a Python script that assumes blood pressure measurements were taken monthly, then computes and prints the average change in blood pressure for each exercise group. Note: this is hypothetical, as the original study runs for only 6 weeks.
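
Because the dataset holds no monthly readings, any solution has to assume a layout for them. The sketch below assumes one row per participant and one column per monthly reading. The column names `Month 1` to `Month 3` and all values are hypothetical. It takes "average change" to mean the mean month-to-month difference, which is only one reasonable reading of the task.

```python
import pandas as pd

# Hypothetical monthly layout (made-up values): one column per monthly reading.
monthly = pd.DataFrame({
    "Participant ID": [1, 2, 3, 4, 5, 6],
    "Exercise group": ["jogging", "jogging", "weightlifting",
                       "weightlifting", "yoga", "yoga"],
    "Month 1": [130.0, 124.0, 128.0, 120.0, 118.0, 126.0],
    "Month 2": [126.0, 122.0, 127.0, 118.0, 117.0, 122.0],
    "Month 3": [122.0, 118.0, 124.0, 118.0, 114.0, 120.0],
})
month_columns = ["Month 1", "Month 2", "Month 3"]

# Month-to-month change per participant: difference between adjacent columns.
# The first column has no predecessor, so its all-NaN result is dropped.
changes = monthly[month_columns].diff(axis=1).iloc[:, 1:]

# Average the changes per participant, then average those within each group.
monthly["Mean monthly change"] = changes.mean(axis=1)
avg_change_by_group = monthly.groupby("Exercise group")["Mean monthly change"].mean()
print(avg_change_by_group)
```

Negative values indicate that blood pressure fell on average from one month to the next.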

### Compare Pre- and Post-Exercise Blood Pressure

5.     Look up the 5 participants with the highest pre-exercise blood pressure (from Task 3) and find their post-exercise blood pressure. Produce a table that compares their pre- and post-exercise blood pressure and displays the difference.
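
The comparison table could be built roughly as follows. This sketch assumes the five participants are those with the highest pre-exercise readings and defines the difference as post minus pre, so a negative value means a reduction. The sample rows and the column name `Difference (post - pre)` are hypothetical.

```python
import pandas as pd

# Hypothetical records (made-up values) in the layout of exercise_data.csv.
records = pd.DataFrame({
    "Participant ID": [1, 2, 3, 4, 5, 6, 7, 8],
    "Exercise group": ["jogging", "yoga", "weightlifting", "jogging",
                       "yoga", "weightlifting", "jogging", "yoga"],
    "Pre-exercise systolic BP": [118.0, 131.5, 125.0, 140.0, 110.0, 122.5, 135.0, 129.0],
    "Post-exercise systolic BP": [112.0, 127.5, 121.0, 133.0, 111.0, 116.0, 130.5, 131.0],
})

# Select the five rows with the highest pre-exercise reading, highest first.
comparison = records.nlargest(5, "Pre-exercise systolic BP").copy()

# Post minus pre: negative values mean blood pressure went down.
comparison["Difference (post - pre)"] = (
    comparison["Post-exercise systolic BP"] - comparison["Pre-exercise systolic BP"]
)
print(comparison.to_string(index=False))
```

Because each row already carries both readings, no separate lookup by participant ID is needed. If the pre- and post-exercise values lived in separate tables, a merge on `Participant ID` would take its place.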

### Total Blood Pressure Reduction for Each Exercise Group

6.     Write a Python script to read the "exercise_data.csv" file and compute descriptive statistics for each exercise group: the mean and mode (measures of central tendency) and the standard deviation (a measure of spread).
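
The task does not say which variable to summarize. Following the section heading, the sketch below assumes it is the blood pressure reduction (pre minus post). It also rounds the reduction to whole mmHg before taking the mode, since a mode of continuous values is rarely informative. Both choices, the helper `first_mode`, and the sample values are assumptions.

```python
import pandas as pd

# Hypothetical records (made-up values); real data would come from
# pd.read_csv("exercise_data.csv").
records = pd.DataFrame({
    "Exercise group": ["jogging"] * 4 + ["weightlifting"] * 4 + ["yoga"] * 4,
    "Pre-exercise systolic BP": [130.0, 124.0, 128.0, 120.0,
                                 128.0, 120.0, 125.0, 131.0,
                                 118.0, 126.0, 122.0, 130.0],
    "Post-exercise systolic BP": [124.0, 118.0, 124.0, 112.0,
                                  125.0, 117.0, 122.0, 126.0,
                                  116.0, 124.0, 121.0, 128.0],
})

# Assumption: the quantity of interest is the reduction (pre minus post).
records["Reduction"] = (
    records["Pre-exercise systolic BP"] - records["Post-exercise systolic BP"]
)
# Assumption: round to whole mmHg so that repeated values, and a mode, can occur.
records["Reduction (rounded)"] = records["Reduction"].round().astype(int)


def first_mode(series):
    """Return the smallest of the most frequent values (Series.mode may return several)."""
    return series.mode().iloc[0]


grouped = records.groupby("Exercise group")
summary = pd.DataFrame({
    "mean": grouped["Reduction"].mean(),
    "mode (rounded)": grouped["Reduction (rounded)"].agg(first_mode),
    "std": grouped["Reduction"].std(),  # sample standard deviation (ddof=1)
})
print(summary)
```

pandas computes the sample standard deviation by default, whereas `np.std` defaults to the population version (`ddof=0`), so the two give slightly different results on the same data.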