# Simulating and modelling datasets for investigating the effects of six weeks of aerobic and anaerobic intermittent swimming on VO2max and selected lung volumes and capacities in athletes, using the numpy.random package
## Abstract:
Growing interest in physical exercise, and in specialised training to reach optimum performance and excel in sport, raises many questions. One of them concerns the physiological changes an athlete undergoes during performance. For athletes, the primary area of interest is the respiratory organs.
The questions to investigate are:
- What are the most important possible changes in the respiratory system?
- How can these changes help athletes plan their training and develop their skills?


## In this project, we:
1. Chose a real-world phenomenon that can be measured and for which we could model at least 100 data points across 4 different variables, among 2 groups.
2. Investigated the types of variables involved, their likely distributions, and their relationships with each other, then synthesized a data set matching those properties as closely as possible.
3. Presented the details of our research findings and the implementation of the simulated data in this Jupyter notebook.

# Submitted by: Francis Adepoju (G00364694)

### Definitions:
1. Aerobic exercise: An exercise regime designed to increase heart and lung activity while toning muscles by burning fat.
2. Anaerobic exercise: Short, high-intensity exercise, such as weight training, that improves the athlete's strength and draws mainly on energy pathways that do not require oxygen.


Within the human body, the respiratory system has a special place: the respiratory organs play a prominent role in metabolism, helping to provide the energy that the body's tissues and organs need to perform their various functions.
Research has shown that the respiratory system is influenced by both short-term and long-term exercise [1].

To enhance performance, most coaches expose athletes to sessions of aerobic or anaerobic intermittent training. Intermittent exercise has been reported to be effective for the athlete's physical preparation and to promote the efficiency of different body systems, especially the respiratory system, which is of great importance in that preparation [2].
It has been observed in [2] that aerobic and anaerobic intermittent exercises play an important role in raising the VO2max of athletes from different sports, enabling them to do more work with less exhaustion. Some researchers believe that, in addition to raising VO2max, intermittent exercise also influences some pulmonary capacities and volumes, which can be attributed to increased respiratory muscle strength and improved pulmonary performance.

VO2max is the maximum rate at which the body consumes oxygen during exercise to exhaustion. It is one of the best variables for predicting cardiorespiratory endurance and aerobic fitness.
Because an individual's energy requirement is proportional to body size, VO2max can also be expressed relative to body weight. Regular aerobic physical activity increases VO2max [3].
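Expressing VO2max relative to body weight is a simple unit conversion, sketched below. The function name `relative_vo2max` and the example athlete (3.5 L/min at 70 kg) are hypothetical and are not taken from the references; only the conversion from L/min to mL/kg/min is standard.

```python
def relative_vo2max(absolute_l_per_min: float, body_mass_kg: float) -> float:
    """Convert absolute VO2max (L/min) to relative VO2max (mL/kg/min)."""
    if body_mass_kg <= 0:
        raise ValueError("body_mass_kg must be positive")
    # 1 L = 1000 mL, then divide by body mass to normalise
    return absolute_l_per_min * 1000.0 / body_mass_kg


# Hypothetical athlete, for illustration only
print(relative_vo2max(3.5, 70.0))  # 50.0
```

The relative form makes athletes of different body sizes comparable, which is why it is the form usually reported for endurance sports.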

Because there is a strong link between VO2max and pulmonary capacities and volumes, we decided to model the influence of aerobic and anaerobic intermittent swimming exercises on VO2max, Expiratory Reserve Volume (ERV), Residual Volume (RV), and Total Lung Capacity (TLC) in a simulated batch of 100 athletes (50 in each group), measured before and after a six-week programme of aerobic or anaerobic intermittent exercise.
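The study design described above (two groups of 50, four variables, each measured before and after) could be laid out as in the following minimal sketch. All means, standard deviations, and the size of the training effect are hypothetical placeholders, not values from [1]-[3]. Building the "after" value as "before" plus a random change is also an assumption of this sketch, made so that each athlete's two measurements are correlated.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # fixed seed so the sketch is reproducible
n_per_group = 50                 # 50 athletes per group, as described in the text

# HYPOTHETICAL baseline (mean, sd) per variable; not taken from any study
baseline = {
    "vo2": (45.0, 5.0),
    "erv": (1.2, 0.2),
    "rv": (1.3, 0.2),
    "tlc": (6.0, 0.6),
}

frames = []
for group in ("aerobic", "anaerobic"):
    cols = {"group": [group] * n_per_group}
    for var, (mean, sd) in baseline.items():
        before = rng.normal(mean, sd, n_per_group)
        # HYPOTHETICAL training effect: about 5% of the baseline mean
        change = rng.normal(0.05 * mean, 0.02 * mean, n_per_group)
        cols[var + "b"] = before           # suffix "b": before training
        cols[var + "a"] = before + change  # suffix "a": after training
    frames.append(pd.DataFrame(cols))

simulated = pd.concat(frames, ignore_index=True)
print(simulated.shape)                    # (100, 9)
print(simulated.groupby("group").size())  # 50 rows per group
```

Keeping the group label as a column makes it straightforward to compare the aerobic and anaerobic groups later with `groupby`.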


# Synthesized Data

In [2]:
import numpy as np
import pandas as pd

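# Small hand-typed frame used to lay out the column structure.
# Column naming: vo2/erv/vc/rv/tlc with suffix b (before) or a (after) the six-week programme.
data = {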
        'vo2b': [100, 200, 300, 400, 500],
        'vo2a': [110, 230, 330, 410, 580],
        'ervb': [100, 200, 300, 400, 500],
        'erva': [100, 200, 300, 400, 500],
        'vcb' : [100, 200, 300, 400, 500],
        'vca' : [100, 200, 300, 400, 500],
        'rvb' : [100, 200, 300, 400, 500],
        'rva' : [110, 230, 330, 410, 580],
        'tlcb': [100, 200, 300, 400, 500],
        'tlca': [110, 230, 330, 410, 580]
       }

frame = pd.DataFrame(data)
print(frame)


   vo2b  vo2a  ervb  erva  vcb  vca  rvb  rva  tlcb  tlca
0   100   110   100   100  100  100  100  110   100   110
1   200   230   200   200  200  200  200  230   200   230
2   300   330   300   300  300  300  300  330   300   330
3   400   410   400   400  400  400  400  410   400   410
4   500   580   500   500  500  500  500  580   500   580
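With paired before/after columns like these, the per-athlete change is a column-wise subtraction. The sketch below reuses two of the column pairs from the frame above. The `changes` frame and the choice of pairs are illustrative only, and the interpretation of the suffixes as "before" and "after" is an assumption based on the study description.

```python
import pandas as pd

# Two of the before/after pairs from the frame above
frame = pd.DataFrame({
    "vo2b": [100, 200, 300, 400, 500],
    "vo2a": [110, 230, 330, 410, 580],
    "ervb": [100, 200, 300, 400, 500],
    "erva": [100, 200, 300, 400, 500],
})

# After minus before, one column per variable
changes = pd.DataFrame({
    var: frame[var + "a"] - frame[var + "b"] for var in ("vo2", "erv")
})
print(changes)
print(changes.mean())  # mean change per variable
```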


In [3]:
import numpy as np
import pandas as pd

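# Chose to use the normal distribution to generate my synthesized data due to its ability to...
# Each "before" column is drawn from N(mean=1.8, sd=0.05) and each "after" column
# from N(mean=85, sd=1), with 100 samples per column.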
data2 = pd.DataFrame({'vo2b': np.random.normal(1.8, 0.05, 100), 
                      'vo2a': np.random.normal(85, 1, 100),
                      'ervb': np.random.normal(1.8, 0.05, 100),
                      'erva': np.random.normal(85, 1, 100),
                      'vcb' : np.random.normal(1.8, 0.05, 100),
                      'vca' : np.random.normal(85, 1, 100),
                      'rvb' : np.random.normal(1.8, 0.05, 100),
                      'rva' : np.random.normal(85, 1, 100),
                      'tlcb': np.random.normal(1.8, 0.05, 100),
                      'tlca': np.random.normal(85, 1, 100),
                     })
print(data2)

        vo2b       vo2a      ervb       erva       vcb        vca       rvb  \
0   1.829932  85.583779  1.777143  84.121151  1.791195  86.833990  1.844461   
1   1.801518  83.000198  1.744366  86.027849  1.688775  81.763278  1.719472   
2   1.712542  85.260146  1.732403  86.328520  1.783636  84.492260  1.786097   
3   1.760631  85.455248  1.813264  84.993409  1.811874  85.722781  1.822216   
4   1.800920  84.007661  1.817079  85.571198  1.785068  84.007871  1.859371   
5   1.797757  86.223190  1.720082  86.781886  1.844476  83.017011  1.852802   
6   1.825137  87.072358  1.801419  84.389741  1.839572  83.608363  1.813114   
7   1.864123  84.780071  1.812878  83.959939  1.828770  86.689028  1.771366   
8   1.772879  83.612477  1.725656  83.420182  1.759462  85.747891  1.841306   
9   1.856098  85.185868  1.740948  84.861643  1.744631  83.458538  1.811474   
10  1.802761  85.760466  1.725327  84.734458  1.809021  84.909060  1.742896   
11  1.782923  86.429449  1.757971  84.473226  1.7434