# Coding is Just Coding 

Once you learn one coding language, you can read (and sometimes even write) a lot of other coding languages. Today we're going to go through how to code what we did on the first day in Python (another popular coding language). The syntax is slightly different, but you can still read a lot of it. The point of this is to realize that if someone hands you a script in a different language, you shouldn't freak out. 

## Getting started 

In RStudio, create a Python script. This requires the `reticulate` package, and you may also need the `IRkernel` package. Install both. 

We're now going to go through what we did on day one. 

Open a .py script

# Basic Data Types

In [32]:
# Andie Creel / January 2024 / Goal: Redo day one in python

# Run basic arithmetic 

2 + 3

# variable assignment is done with an '=' sign, instead of the '<-' sign
a = 2
b = 3

print(a + b)


# Numeric -- integer: no decimal points
myInt = 1

# Numeric -- floating point: decimal points
myNum = 2.4

# logical (Boolean): a true/false statement. The comparison evaluates to True or False (the parentheses are optional, but make it easier to read)
myBool_1 = (3 < 4)
myBool_2 = (3 > 4)

# character (string)
myChar_a = "a"
myChar_b = 'b'


5
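
If you want to confirm which type Python assigned to a value, the built-in `type()` function plays a role similar to `class()` in R. Here is a minimal sketch that re-creates a few of the variables from the cell above so it runs on its own:

```python
# Re-create variables from the cell above so this block is self-contained
myInt = 1
myNum = 2.4
myBool_1 = (3 < 4)
myChar_a = "a"

# type() reports the data type of a value, much like class() in R
print(type(myInt))     # <class 'int'>
print(type(myNum))     # <class 'float'>
print(type(myBool_1))  # <class 'bool'>
print(type(myChar_a))  # <class 'str'>
```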


# Ways to store datatypes  

You will need to install the `numpy` package by running `pip install numpy` in the terminal. NumPy is the main package for scientific computing in Python. 

## Vectors and Matrices

Notice that indexing in Python starts at 0, rather than 1 (as it did in R). In Python, we use lists where we used vectors in R. 

In [33]:
import numpy as np

# Lists can contain elements of different data types
myList_n = [1, 2, 3, 4, 5] 
print(myList_n[0])

myList_s = ["str", "b", "c"]
print(myList_s[0])

myList_all = ["str", 1, True]
print(myList_all)



1
str
['str', 1, True]
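
Zero-based indexing also changes where the last element lives. As a small sketch using the same list as above, the last element of a five-item list sits at index 4, not 5:

```python
myList_n = [1, 2, 3, 4, 5]

# The first element is at index 0, so the last one is at index len(...) - 1
first = myList_n[0]
last = myList_n[len(myList_n) - 1]
print(first, last)  # 1 5

# Asking for myList_n[5] would raise an IndexError,
# because the valid indices are 0 through 4
```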


In [16]:

# NumPy array (similar to R matrix): should contain elements of the same data type
# In this case, we're creating a 2x5 matrix
# the . chains a method onto the result, similar in spirit to %>% in dplyr 
myMat_n = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(2, 5)
myMat_n

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])
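
To pull values out of the array, you index with `[row, column]`, and both positions start at 0. The sketch below rebuilds the same 2x5 array so it runs on its own. One difference worth knowing: `reshape` fills the array row by row, whereas R's `matrix()` fills column by column by default.

```python
import numpy as np

myMat_n = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]).reshape(2, 5)

# shape gives (rows, columns), like dim() in R
print(myMat_n.shape)   # (2, 5)

# Index with [row, column]; both start at 0
print(myMat_n[0, 0])   # 1  (first row, first column)
print(myMat_n[1, 4])   # 10 (second row, fifth column)

# A colon selects everything along that dimension
first_col = myMat_n[:, 0]
print(first_col)       # [1 6]
```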

## Lists

In [17]:
# Lists: Can contain elements of different data types, including other lists or arrays
myList = [2, "c", myMat_n]

# Accessing the first element of the list
myList[0]  # returns numeric (2 in this case)


2

In [18]:
myList[1] # returns the string 'c'

'c'

In [19]:
myList[2] # returns the matrix

array([[ 1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10]])

## Data frames

To work with data frames, we need to install and load the `pandas` package. Run `pip install pandas` in your terminal. 

In [21]:
import pandas as pd 

# Create a DataFrame from the NumPy array
myDF = pd.DataFrame(myMat_n)
myDF


   0  1  2  3   4
0  1  2  3  4   5
1  6  7  8  9  10


In [22]:
# Print column names (initially they are just integer indices)
print(myDF.columns)

RangeIndex(start=0, stop=5, step=1)


Unlike R, pandas automatically labels unnamed columns with integers. 

In [23]:
# Rename the columns 
myDF.columns = ["age_yr", "weight_lb", "income_$", "height_ft", "height_in"]
myDF

   age_yr  weight_lb  income_$  height_ft  height_in
0       1          2         3          4          5
1       6          7         8          9         10


In [24]:
# investigate one column (index is the left column, value is the right column)
myDF['age_yr']

0    1
1    6
Name: age_yr, dtype: int64

In [25]:
# Create a new column
myDF['nonsense'] = myDF['age_yr'] + myDF['weight_lb']
myDF['nonsense'] 

0     3
1    13
Name: nonsense, dtype: int64

In [26]:
# Create a DataFrame
myPpl = pd.DataFrame({
    'gender': ["Male", "non-binary", "Female"],
    'male': [True, False, False],
    'height': [152, 171.5, 165],
    'weight': [81, 93, 78],
    'age': [42, 38, 26]
})

# Reference one column (bracket notation here; the dot notation in the next cell also works)
myPpl['male']



0     True
1    False
2    False
Name: male, dtype: bool

In [27]:
myPpl.male

0     True
1    False
2    False
Name: male, dtype: bool
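
The two ways of referencing a column are interchangeable for a name like `male`, but dot notation only works when the column name is a valid Python name. A column such as `income_$` needs brackets. Here is a short sketch that rebuilds `myPpl` so it runs on its own:

```python
import pandas as pd

myPpl = pd.DataFrame({
    'gender': ["Male", "non-binary", "Female"],
    'male': [True, False, False],
    'height': [152, 171.5, 165],
    'weight': [81, 93, 78],
    'age': [42, 38, 26]
})

# Bracket and dot notation return the same column
same = myPpl['male'].equals(myPpl.male)
print(same)  # True

# Brackets also accept a list of names to select several columns at once
subset = myPpl[['gender', 'age']]
print(subset.shape)  # (3, 2)
```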

# Functions

`def` stands for definition. The syntax for writing a function is different, and is a good example of how white space is important in Python (notice that there are no curly braces; the indentation marks the function body).

In [28]:
def myF(x):
    y = x - x**2
    return y

myF(.5)

0.25
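
For comparison, here is the same function with the R version written out in comments. This is only a side-by-side sketch: R uses curly braces to mark the body and `^` for powers, while Python uses indentation and `**`.

```python
# R version, for comparison (curly braces mark the body):
#   myF <- function(x) {
#     y <- x - x^2
#     return(y)
#   }

def myF(x):
    # Indented lines belong to the function body
    y = x - x**2
    return y

# This line is not indented, so it is outside the function
result = myF(.5)
print(result)  # 0.25
```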

# Loops 

Loops are another example where you can read the code even if you don't know Python. However, the `range` function has some different syntax: specifically, the last value is excluded. 


In [29]:
# Notice that 5 doesn't print 
for i in range(1, 5):  # range(start, stop) in Python is inclusive of start and exclusive of stop
    print(i)


1
2
3
4
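
To see exactly which values `range` produces, you can wrap it in `list()`. A quick sketch of the three ways to call it:

```python
# range(stop) starts at 0 by default
from_zero = list(range(5))
print(from_zero)      # [0, 1, 2, 3, 4]

# range(start, stop) excludes stop, so stop at 6 to include 5 (like 1:5 in R)
one_to_five = list(range(1, 6))
print(one_to_five)    # [1, 2, 3, 4, 5]

# range(start, stop, step) counts in steps
odds = list(range(1, 10, 2))
print(odds)           # [1, 3, 5, 7, 9]
```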


In [30]:
# combining loop and function
for i in range(1,5):
    y = myF(i/4)
    print(y) 

0.1875
0.25
0.1875
0.0


# If Else Statements 

Same example as day one: Our RA did not record men's ages right and all men are actually 3 years younger than what's recorded. 

The main thing we need to be aware of in Python is the `.loc` indexer, which lets us reference cells by their row index and column name. 



In [31]:
# Initialize new column 
myPpl['age_new_m'] = myPpl['age']

# Iterating through the DataFrame using the row (i) and column (male) to reference the location 
for i in range(len(myPpl)):
    if myPpl.loc[i, 'male']:
        myPpl.loc[i, 'age_new_m'] = myPpl.loc[i, 'age'] - 3

print(myPpl)


       gender   male  height  weight  age  age_new_m
0        Male   True   152.0      81   42         39
1  non-binary  False   171.5      93   38         38
2      Female  False   165.0      78   26         26
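
The loop above mirrors what we did on day one, but `.loc` also accepts a column of True/False values in place of a single row index. The sketch below is an alternative to the loop, not a replacement for it. It uses a trimmed-down copy of `myPpl` (only the `male` and `age` columns) so it runs on its own:

```python
import pandas as pd

# Trimmed-down copy of myPpl with only the columns this example needs
myPpl = pd.DataFrame({
    'male': [True, False, False],
    'age': [42, 38, 26]
})

# Start from the recorded ages
myPpl['age_new_m'] = myPpl['age']

# The boolean column picks out the rows where male is True;
# subtract 3 only in those rows
myPpl.loc[myPpl['male'], 'age_new_m'] = myPpl.loc[myPpl['male'], 'age'] - 3

print(myPpl['age_new_m'].tolist())  # [39, 38, 26]
```

This gives the same `age_new_m` column as the loop, without stepping through the rows one at a time.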


# Advanced

I wrote this in a Jupyter Notebook, which is Python's version of an R notebook. To write a Jupyter Notebook using R, you need to do a few additional (and advanced) steps. 

In an R console, run `install.packages('IRkernel')`, then `IRkernel::installspec()` to register the R kernel with Jupyter

In the Terminal, run `jupyter notebook`

This will open up a Jupyter Notebook host in your web browser. You can create a new .ipynb file (aka a Jupyter Notebook). 

Unlike an R Markdown file, you cannot automatically knit it to an HTML or PDF document. However, you can download it as HTML, or you can download a LaTeX file and compile it to a PDF on your computer (which is a more advanced step than downloading the HTML).