# Title: Using Python in Noteable  
**Author details:** *Author:*  Mairead Bermingham. *Contact details:* mairead.bermingham@ed.ac.uk.  
**Notebook and data info:** This notebook shows how to use Python as a calculator, how to collect BMI data entered in the notebook, and how to save the data as a CSV file in the working 'Data' folder.  
**Data:** The data are numerical and were entered on 3rd June 2022.  
**Copyright statement:** This notebook is the product of The University of Edinburgh.  


## Python arithmetic operators
Python operators perform operations on values and variables. Arithmetic operators perform mathematical operations such as addition, subtraction, multiplication and division. The table below lists five of Python's arithmetic operators:

Operator | Name           | Example
-------- | -----------    |-----------
+        | Addition       | x + y
-        | Subtraction    | x - y
*        | Multiplication | x * y
/        | Division       | x / y
**       | Exponentiation | x ** y

**Note**: R arithmetic operators are identical, except that you can use ^ or ** for  exponentiation.

### Here is an example of using Python as a calculator to perform a numerical operation.

In [1]:
(3 + 7)**2 - 1*10/4

97.5
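
To see how Python arrives at 97.5, the expression can be broken into the steps implied by operator precedence. This is only an illustrative sketch; the variable names (`bracket`, `squared`, `product`, `quotient`, `result`) are not part of the original notebook.

```python
# Break the expression (3 + 7)**2 - 1*10/4 into the steps Python follows.
bracket = 3 + 7            # parentheses are evaluated first
squared = bracket ** 2     # exponentiation binds tighter than * and /
product = 1 * 10           # * and / share a precedence level, evaluated left to right
quotient = product / 4     # in Python 3, / always returns a float
result = squared - quotient  # subtraction is applied last

print(bracket, squared, product, quotient, result)
```

Writing the steps out like this can help when a longer expression gives an unexpected answer.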

If you would like to do the same in R, here is the code:
#### Here is an example of using R as a calculator to perform a numerical operation. 
(3 + 7)^2 - 1*10/4

# Collecting data entered into the notebook

Here we are going to use the `array()` function from the *numpy* Python package to collect data entered into the notebook. The *numpy* package is the core package for scientific computing in Python. It provides a high-performance multidimensional array object, and tools for working with these arrays. A *numpy*  array is a grid of values, all of the same type. We are going to name each collection of data, and then perform a numerical operation to calculate BMI. In this example we conduct an analysis that is analogous to working in a spreadsheet.   

To calculate BMI we need to divide weight in kilograms by height in meters squared. We use the `power()` function from the *numpy* package to square height. The `power()` function treats each element of the first input array as a base and returns it raised to the power of the corresponding element of the second input (here the single value 2, which is applied to every element).  
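
As a small sketch of how `np.power()` behaves, the example below squares the height values from this notebook and also pairs a base array with an exponent array. The lowercase variable names and the `bases`/`exponents` arrays are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

height = np.array([1.6, 1.8, 2.0, 2.5])  # height in m (values from this notebook)

# A single exponent is applied to every element of the array.
height_squared = np.power(height, 2)

# An exponent array is paired element by element with the base array.
bases = np.array([2, 3, 4])       # illustrative values
exponents = np.array([3, 2, 1])   # illustrative values
paired = np.power(bases, exponents)  # 2**3, 3**2, 4**1

# For arrays, the ** operator gives the same result as np.power().
same_as_operator = np.allclose(height_squared, height ** 2)

print(height_squared)
print(paired)
print(same_as_operator)
```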

Jupyter will only display the result of the last line in a cell. You will need to use the Python `print()` function to show the output from earlier lines. The Python `print()` function takes any number of arguments and prints them on one line of text.  
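
The difference between `print()` and the automatic display of a cell's last line can be sketched as follows. The two-record arrays are illustrative values chosen for brevity, not the notebook's full data.

```python
import numpy as np

weight = np.array([50, 60])    # weight in kg (illustrative values)
height = np.array([1.6, 1.8])  # height in m (illustrative values)
bmi = weight / height ** 2

# print() accepts several arguments and writes them on one line,
# separated by a space by default.
print("Weight (kg):", weight)
print("Height (m):", height, "BMI:", bmi)

# In a notebook cell, only this final bare expression is displayed automatically.
bmi
```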

We will then use the `matrix()` function from the *numpy* package to combine the Height, Weight and BMI arrays, and the `transpose()` function from the *numpy* package to turn the record numbers into rows and the Height, Weight and BMI arrays into columns. The `matrix()` function returns a matrix from an array-like object, or from a string of data. A matrix is a specialized 2-D array that retains its 2-D nature through operations. The `transpose()` function reverses the axes of an array, so here rows and columns are swapped.  

In [11]:
#Load the 'numpy' package
import numpy as np
Height =np.array([1.6, 1.8, 2.0, 2.5]) # height data in m
Height2 =np.power(Height,2) # height squared, in m^2
Weight =np.array([50, 60, 64, 95])     # weight data in kg
BMI = Weight/Height2      # BMI
print(BMI) #to print the BMI array
np.transpose(np.matrix([Height , Weight, BMI])) # column bind, like spreadsheet

[19.53125    18.51851852 16.         15.2       ]


matrix([[ 1.6       , 50.        , 19.53125   ],
        [ 1.8       , 60.        , 18.51851852],
        [ 2.        , 64.        , 16.        ],
        [ 2.5       , 95.        , 15.2       ]])
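
The NumPy documentation no longer recommends the `matrix` class for new code. As an alternative sketch (a swapped-in technique, not the notebook's own method), `np.column_stack()` places 1-D arrays side by side as the columns of an ordinary 2-D array, so no transpose is needed. The lowercase variable names are illustrative.

```python
import numpy as np

height = np.array([1.6, 1.8, 2.0, 2.5])  # height in m
weight = np.array([50, 60, 64, 95])      # weight in kg
bmi = weight / height ** 2               # BMI in kg/m^2

# Each 1-D array becomes one column; each row is one record.
table = np.column_stack([height, weight, bmi])

print(table.shape)
print(table)
```

The result holds the same values as the transposed matrix above, but as a regular `ndarray`.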

If you would like to do the same in R, here is the code:

### Using the `c()` and `cbind()` *base* R functions to combine data values and objects respectively. 
Here we are going to use the `c()` function from the *base* R package to collect data entered at the console. The `c()` function in R stands for combine. We are then going to name each collection of data and perform a numerical operation to calculate BMI. We will then use `cbind()` to combine the Height, Weight and BMI objects. The `cbind()` *base* R function combines R objects by columns. In this example, we conduct an analysis analogous to working in a spreadsheet.  

`Height <- c(1.6, 1.8, 2.0, 2.5)` # height data in m  
`Weight <- c(50, 60, 64, 95)`    # weight data in kg   
`BMI <-   Weight/(Height^2)`      # BMI  
`cbind(Height , Weight, BMI)` # column bind, like spreadsheet  
    

# Using *pandas* to write the BMI data to file
*pandas* is a Python package for data manipulation and analysis.
Packages extend the Python programming language. Python packages contain code, data, and documentation in a standardised collection format that can be installed by Python users.  
Load the Python package 'pandas' that you will need to run the next few lines of code.  

In [3]:
import pandas as pd
#Create the numpy array
array = np.transpose(np.matrix([Height , Weight, BMI])) # column bind, like spreadsheet
print(array)
#Create a list of index names
index_values = list(range(1,5))
print(index_values) 
#Create a list of column names
column_values = ['Height' , 'Weight', 'BMI']
column_values  
#Create the dataframe
df = pd.DataFrame(data = array, 
                  index = index_values, 
                  columns = column_values)
  
#Displaying the dataframe
print(df)  

[[ 1.6        50.         19.53125   ]
 [ 1.8        60.         18.51851852]
 [ 2.         64.         16.        ]
 [ 2.5        95.         15.2       ]]
[1, 2, 3, 4]
   Height  Weight        BMI
1     1.6    50.0  19.531250
2     1.8    60.0  18.518519
3     2.0    64.0  16.000000
4     2.5    95.0  15.200000


### Print the data  
If you’re using a Jupyter notebook, simply typing the name of the data frame on the last line of a cell produces nicely formatted output. Printing is a convenient way to preview your loaded data: you can confirm that the column names were imported correctly, that the data formats are as expected, and whether there are missing values anywhere.

In [4]:
df

   Height  Weight        BMI
1     1.6    50.0  19.531250
2     1.8    60.0  18.518519
3     2.0    64.0  16.000000
4     2.5    95.0  15.200000


#### head() and tail()  *pandas* functions
`head()` and `tail()` should be among your go-to *pandas* functions for investigating your datasets.
By default, `df.head()` returns the first five rows of the data set and `df.tail()` returns the last five. Because this data set has only four rows, `df.head()` shows all of them.

In [5]:
df.head()

   Height  Weight        BMI
1     1.6    50.0  19.531250
2     1.8    60.0  18.518519
3     2.0    64.0  16.000000
4     2.5    95.0  15.200000
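
Both functions accept an optional number of rows. The sketch below rebuilds the BMI table as a separate data frame (the name `df_demo` is an assumption, chosen so the notebook's own `df` is left untouched) and shows `head(2)` and `tail(2)`.

```python
import pandas as pd

# Rebuild the BMI table from this notebook as a small data frame.
df_demo = pd.DataFrame(
    {
        "Height": [1.6, 1.8, 2.0, 2.5],   # height in m
        "Weight": [50.0, 60.0, 64.0, 95.0],  # weight in kg
    },
    index=[1, 2, 3, 4],
)
df_demo["BMI"] = df_demo["Weight"] / df_demo["Height"] ** 2

first_two = df_demo.head(2)  # first two rows
last_two = df_demo.tail(2)   # last two rows

print(first_two)
print(last_two)
```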


### Writing files to your working 'Data' folder
One *pandas* function you will need to know is `to_csv()`, which writes a data frame to a .csv file, here in your working 'Data' folder.

In [6]:
df.to_csv('../Data/BMIDataPY.csv', index=False)
# That is the BMI data frame saved to the working 'Data' folder.
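
One way to check what was written is to read the file back with `pd.read_csv()`. The sketch below assumes a temporary directory in place of the 'Data' folder and a hypothetical file name (`BMIDemo.csv`), so it does not touch the notebook's real files; `df_demo` and `df_check` are illustrative names.

```python
import tempfile
from pathlib import Path

import pandas as pd

df_demo = pd.DataFrame(
    {"Height": [1.6, 1.8], "Weight": [50.0, 60.0]},
    index=[1, 2],
)

# A temporary directory stands in for the 'Data' folder (assumption for this sketch).
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "BMIDemo.csv"  # hypothetical file name

    # index=False leaves the row labels (1, 2) out of the file.
    df_demo.to_csv(csv_path, index=False)

    # Read the file back to confirm what was written.
    df_check = pd.read_csv(csv_path)

print(df_check.columns.tolist())
print(df_check.index.tolist())
```

Because `index=False` was used, the record numbers are not stored in the file, and the data frame read back gets a fresh default index starting at 0. Dropping `index=False` would keep the record numbers as an extra first column.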

If you would like to do the same in R, here is the code:

### Using *readr* from *tidyverse* to write the BMI data to file
*tidyverse* is a collection of essential R packages for data science. *readr* provides a fast and friendly way to read rectangular data from delimited files, such as comma-separated values (CSV) and tab-separated values (TSV). A data frame is a table or a two-dimensional array-like structure in which each column contains values of one variable and each row contains one set of values from each column. The function `data.frame()` creates data frames.  

Load the R packages that you will need to run the next few lines of R code.  
`library(tidyverse)`  
`library(readr)`  

Create a data frame from the BMI data.  
`Data<-data.frame(cbind(Height , Weight, BMI))`  
`view(Data)`  
`head(Data)`  
`write_csv(Data, file="./Data/BMIData.csv")`   

#That is the BMI data frame saved to the working 'Data' folder.


I hope these examples help you to improve your Python programming skills. Happy Coding!