# Table of Contents
1. [Basics](#basics)
1. [Data Types](#data_types)
    - Booleans
    - Integers
    - Floats
    - Strings
    - Variables
    - Basic Operations
1. [Data Structures](#data_structs)
    - Lists
    - Tuples
    - Dictionaries
1. [Basic Logic Operations](#logic_ops)
    - If/Else Statements
    - For Loops
    - While Loops
    - String Functions
    - Nested For Loops
    - List Comprehension
1. [Objects](#objects)
    - What is an object?
    - Initialization
    - Attributes
    - Methods
1. [Advanced Topics](#advanced)
    - Debugging
    - Big O Notation (Runtime)
    - Importing Libraries
    - Data Manipulation
    - Statistics
1. [Next Steps](#next_steps)  

# 1. Basics
<a id='basics'></a>

In [None]:
# This is an inline comment. This does not do anything, but allows you to make notes for yourself or other readers

In [None]:
# This is a print statement. You will use this extensively when debugging your code

print('Hello, World!')

In [None]:
# By default, Jupyter also displays the value of the last expression in a cell, even without a print statement

'Hello, World!'

To execute a cell, click into it and press Shift + Enter.

# 2. Data Types
<a id='data_types'></a>

## Booleans

In [None]:
# In arithmetic and comparisons, True/False behave the same as 1/0

True 
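
As a small illustration of the comment above, the sketch below shows booleans taking part in arithmetic as 1 and 0. The list `flags` and the name `n_true` are made up for this example.

```python
# Booleans behave like the integers 1 and 0 in arithmetic and comparisons
print(True == 1)      # comparison with an integer
print(True + True)    # adds like 1 + 1

# Summing a list of booleans counts how many are True
flags = [True, False, True]   # hypothetical example data
n_true = sum(flags)
print(n_true)
```

This is handy later on for counting how many items in a list meet some condition.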

## Integers

In [None]:
1

## Floats

In [None]:
3.1415

## Strings

In [None]:
'Hello, World!'

## Variables

In [None]:
var = 'Hello, World!'

var

## Basic Operations

In [None]:
# Add

3 + 2

In [None]:
# Subtract

3 - 2

In [None]:
# Multiply

3 * 2

In [None]:
# Divide

3 / 2

In [None]:
# Remainder

3 % 2

In [None]:
# Exponent

2 ** 2

In [None]:
# You can also add strings

'a' + 'b'

In [None]:
# Equal to (note the double equals!)

1 == 2

In [None]:
# Not equal to

1 != 2

In [None]:
# Less than

1 < 2

In [None]:
# Greater than

2 > 1

In [None]:
# Less than or equal to

1 <= 2

In [None]:
# Greater than or equal to

2 >= 1

In [None]:
# Some types can't be combined directly: adding a string and an integer raises a TypeError!

'a' + 1
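
Python will not guess how to combine a string and an integer, so the cell above stops with an error. One way around it, sketched here with the built-in `str()` and `int()` functions, is to convert one value explicitly so both sides have the same type. The variable names are illustrative only.

```python
# Convert the number to a string to join it with text
joined = 'a' + str(1)
print(joined)

# Or convert a numeric string to an integer to do arithmetic
total = int('1') + 1
print(total)
```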

# 3. Data Structures
<a id='data_structs'></a>

## Lists

In [None]:
my_list = [1, 2]

my_list

### List Operations

In [None]:
# Get element at position (list indexing)

my_list[0]

In [None]:
# Append item

my_list.append(4) 

my_list

In [None]:
# Edit item

my_list[2] = 3

my_list

In [None]:
# Reverse list

my_list.reverse()

my_list

In [None]:
# Sort list

my_list.sort()

my_list

In [None]:
# Add lists

my_list += [11, 12, 13]

# for lists, the above gives the same result as:
# my_list = my_list + [11, 12, 13]

my_list

In [None]:
# There are many many many more list operations
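
To give a flavor of those other operations, here is a short sketch of a few commonly used ones. The list `numbers` is a made-up example and this is only a sample, not a complete list.

```python
numbers = [3, 1, 2]        # hypothetical example list

print(len(numbers))        # number of items
print(2 in numbers)        # membership test
print(numbers[-1])         # a negative index counts from the end
print(numbers[0:2])        # slice: items at positions 0 and 1

last = numbers.pop()       # remove and return the last item
numbers.insert(0, 10)      # insert 10 at position 0
print(last, numbers)
```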

## Tuples

In [None]:
my_tuple = (1, 2)

my_tuple

### Tuple Operations

In [None]:
my_tuple[1]

In [None]:
# Unlike lists, tuples are immutable (cannot be changed once made), so this raises an AttributeError!

my_tuple.append(3)
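
Because a tuple cannot be modified, the usual alternatives are to build a new tuple or to unpack its values. A minimal sketch, using a made-up tuple named `point`:

```python
point = (1, 2)             # hypothetical example tuple

# Concatenation creates a brand new tuple; the original is unchanged
longer = point + (3,)      # note the trailing comma for a one-item tuple
print(point, longer)

# Unpacking assigns each element to its own variable
first, second = point
print(first, second)
```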

## Dictionaries

In [None]:
my_dictionary = {
    'first': 'John',
    'second': 'Doe',
    'systolic': 120,
    'diastolic': 80
}

my_dictionary

### Dictionary Operations

In [None]:
# Get value

my_dictionary['first']

In [None]:
# Set value

my_dictionary['first'] = 'Jane'

my_dictionary
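
A few more dictionary operations tend to come up quickly in practice. The sketch below uses a made-up `patient` dictionary; the `heart_rate` key is hypothetical and deliberately missing.

```python
patient = {'first': 'Jane', 'systolic': 120}   # hypothetical example data

# Add a new key
patient['diastolic'] = 80

# .get returns a default instead of raising a KeyError when a key is missing
heart_rate = patient.get('heart_rate', 'not recorded')
print(heart_rate)

# Check for a key, then loop over key/value pairs
print('systolic' in patient)
for key, value in patient.items():
    print(key, value)
```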

# 4. Basic Logic Operations
<a id='logic_ops'></a>

## If / Else Statements 

In [None]:
var = 1

if (var == 1):
    print('low')
elif (var == 2):
    print('medium')
else:
    print('high')

## For Loops

In [None]:
for x in [1, 2, 3]:
    print('Do something')
    print(x)


In [None]:
# Another way to do the same thing

for x in range(1, 4):
    print('Do something')
    print(x)

## While Loops

In [None]:
x = 1

while x <= 3:
    print(x)
    
    x += 1
    
print('Final Value:', x)

## String Functions

In [None]:
# Strings come with a whole set of complex operations

'Hello, World!'.replace('World', 'Georgie').split(',')[1].strip()
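
Chained calls like the one above run left to right, each working on the result of the previous one. Breaking the chain into steps, as in this sketch, can make it easier to follow (the `step` variable names are illustrative):

```python
text = 'Hello, World!'

step1 = text.replace('World', 'Georgie')   # 'Hello, Georgie!'
step2 = step1.split(',')                   # ['Hello', ' Georgie!']
step3 = step2[1]                           # ' Georgie!' (note the leading space)
step4 = step3.strip()                      # 'Georgie!'

print(step4)
```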

In [None]:
# As an example, let's pretend we have an amino acid sequence, like the one below for CLK1.

CLK1 = 'HYLESRSINEKDYHSRRYIDEYRNDYTQGCEPGHRQRDHESRYQNHSSKSSGRSGRSSYKSKHRIHHSTSHRRSHG'

# How would we identify the resulting peptides if we cleaved this with a protease that cuts at K (lysine)? One way would be to iterate and store a new peptide each time we encounter a K.

peptides = []
new_pep = ''

for letter in CLK1:
    if letter == 'K':
        peptides.append(new_pep)
        new_pep = ''
    else:
        new_pep += letter

# need to add the last one
if len(new_pep) > 0:
    peptides.append(new_pep)
        
peptides


In [None]:
# Or, we could simply use the built-in Python string method "split". This is often the case in Python and CS in general: readily accessible methods already exist for many object types.
CLK1 = 'HYLESRSINEKDYHSRRYIDEYRNDYTQGCEPGHRQRDHESRYQNHSSKSSGRSGRSSYKSKHRIHHSTSHRRSHG'

peptides = CLK1.split('K')

peptides
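
To check that the manual loop and `split` really agree, the sketch below wraps the loop in a helper (the name `split_at_k` is hypothetical) and compares the two results.

```python
CLK1 = 'HYLESRSINEKDYHSRRYIDEYRNDYTQGCEPGHRQRDHESRYQNHSSKSSGRSGRSSYKSKHRIHHSTSHRRSHG'

def split_at_k(sequence):
    """Manual version: start a new peptide each time a K is encountered."""
    peptides = []
    new_pep = ''
    for letter in sequence:
        if letter == 'K':
            peptides.append(new_pep)
            new_pep = ''
        else:
            new_pep += letter
    # add the last peptide if anything is left over
    if len(new_pep) > 0:
        peptides.append(new_pep)
    return peptides

manual = split_at_k(CLK1)
builtin = CLK1.split('K')

print(manual == builtin)
print(len(builtin))
```

The two agree for this sequence. They can differ at the edges: for a sequence that ends in K, `split` keeps a trailing empty string while the loop skips it.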

## Nested For Loops

In [None]:
# Loops can be nested - beware of performance!

for x in [1, 2]:
    for y in [10, 11]:
        print('Do something')
        print(x, y)

## List Comprehension (Slightly more complex!)

In [None]:
my_list = [1, 2, 3]

[x * 10 for x in my_list]
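
A list comprehension is a compact way of writing a loop that builds a list. The sketch below shows the equivalent for loop, plus an optional filtering condition; the names `scaled` and `odds_scaled` are illustrative.

```python
my_list = [1, 2, 3]

# [x * 10 for x in my_list] is equivalent to this loop
scaled = []
for x in my_list:
    scaled.append(x * 10)
print(scaled)

# An optional condition filters items: keep only the odd numbers
odds_scaled = [x * 10 for x in my_list if x % 2 == 1]
print(odds_scaled)
```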

# 5. Objects
<a id='objects'></a>

## What is an object?

In [None]:
# Objects add structure to code and are created from "classes", which act as blueprints. If we were modeling a chess game,
# we might represent each game as an object. Python has a particular syntax for defining the class, as below:

class chessGame:
    def __init__(self):
        self.number_moves_made = 0


# Now, let's say I wanted to use a chessGame in another area of code. I could create an object (an instance) of the chessGame class.
chess1 = chessGame()

# I could even create two. 
chess2 = chessGame()

# And now, I can manipulate these entities as individual objects of the "chessGame" class
my_games = [chess1, chess2]

## Initialization

In [None]:
# Let's pull our code from above down, and expand on it a little bit. 

class chessGame:
    def __init__(self, time_limit):
        self.number_moves_made = 0
        self.time_limit = time_limit
        

# Notice how we added "time_limit", an argument to the init function, which initializes our object. 
# Thus, we can now create chessGame objects with different time limits

chess1 = chessGame(time_limit = 5)
chess2 = chessGame(time_limit = 10)

# Our initialization function takes in these arguments, stores them as attributes, and sets up our object for us.

## Attributes

In [None]:
class chessGame:
    def __init__(self, time_limit):
        self.number_moves_made = 0
        self.time_limit = time_limit
        self.whose_turn = 'white'
        

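# In the above, attributes refer to values held/possessed by the object. 
chess1 = chessGame(time_limit = 5)

# We can access these attributes and learn about the particular object.
# (print is needed here because only the last expression in a cell is displayed automatically)
print(chess1.time_limit)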

# And we can change them, too.
chess1.whose_turn = 'black'
chess1.whose_turn

## Methods

In [None]:
class chessGame:
    def __init__(self, time_limit):
        self.number_moves_made = 0
        self.time_limit = time_limit
        self.whose_turn = 'white'
        
    def take_turn(self):
        if self.whose_turn == 'white':
            self.whose_turn = 'black'
        else:
            self.whose_turn = 'white'
        
        
# In the above, we added an imaginary "take_turn" function. This function is called a method, because it
# is a function attached to the class chessGame.

chess1 = chessGame(time_limit = 5)

# the current turn (print is needed because only the last expression in a cell is displayed automatically)
print(chess1.whose_turn)

# make a move
chess1.take_turn()

# now whose turn?
chess1.whose_turn
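
Methods can update more than one attribute at a time. As a hedged sketch, the variation below also increments `number_moves_made` on every turn; this extra line is an assumption for illustration and is not part of the class as written above.

```python
class chessGame:
    def __init__(self, time_limit):
        self.number_moves_made = 0
        self.time_limit = time_limit
        self.whose_turn = 'white'

    def take_turn(self):
        # Hypothetical extension: count the move as well as switching sides
        self.number_moves_made += 1
        if self.whose_turn == 'white':
            self.whose_turn = 'black'
        else:
            self.whose_turn = 'white'


chess1 = chessGame(time_limit=5)

# Take three turns: white, black, white
chess1.take_turn()
chess1.take_turn()
chess1.take_turn()

print(chess1.number_moves_made, chess1.whose_turn)
```

Keeping the bookkeeping inside the method means every caller gets consistent behavior without having to update the counter by hand.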



# 6. Advanced Topics
<a id='advanced'></a>

## Debugging

In [None]:
# Create a list from 1 to 10 with primes removed

l = []
primes = [2, 3, 5, 7]

for x in range(10):
    if (x not in primes):
        l.append(x)
        
print(f'Desired Outcome:\n[1, 4, 6, 8, 9, 10]\n\nActual Outcome:\n{l}')
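
One way to track down a mismatch like this is to print the values the loop actually visits. The sketch below assumes the problem lies in the range bounds: `range(10)` runs from 0 through 9, so 0 is included and 10 is missing. The name `non_primes` is illustrative.

```python
primes = [2, 3, 5, 7]

# Inspect what the original loop iterates over
print(list(range(10)))     # starts at 0 and stops before 10

# range(1, 11) covers 1 through 10 inclusive
non_primes = []
for x in range(1, 11):
    if x not in primes:
        non_primes.append(x)

print(non_primes)
```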

## Big O Notation (Runtime)

In [None]:
loops = 100

In [None]:
%%timeit -n1 -r1

for x in range(loops):
    z = 1

    continue
    
# O(n)

In [None]:
%%timeit -n1 -r1

x = loops

while int(x) > 0:
    z = 1
    
    x = x / 2

# O(log(n))

In [None]:
%%timeit -n1 -r1

for x in range(loops):
    for y in range(loops):
        z = 1
        
        continue
        
# O(n^2)
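
Timings of such small loops can be noisy, so it may be clearer to count how many times the loop body runs as the input grows. The helper functions below (`count_linear`, `count_log`, `count_quadratic`) are hypothetical names for this sketch and mirror the three loops above.

```python
def count_linear(n):
    """One pass over n items: O(n)."""
    steps = 0
    for _ in range(n):
        steps += 1
    return steps

def count_log(n):
    """Halve n until it reaches 0: O(log(n))."""
    steps = 0
    while n > 0:
        steps += 1
        n = n // 2
    return steps

def count_quadratic(n):
    """A loop inside a loop: O(n^2)."""
    steps = 0
    for _ in range(n):
        for _ in range(n):
            steps += 1
    return steps

# Compare how the step counts grow as n increases tenfold
for n in [10, 100, 1000]:
    print(n, count_linear(n), count_log(n), count_quadratic(n))
```

Each tenfold increase in `n` multiplies the linear count by 10 and the quadratic count by 100, while the logarithmic count only grows by about 3.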

## Importing Libraries

In [None]:
import os

os

To install packages, use the Anaconda GUI, or in a terminal, type: `pip install package`

### Data Manipulation: Pandas

In [None]:
import pandas as pd

df = pd.DataFrame.from_dict({
    'patient_id': ['A1', 'B2', 'C3'],
    'first_initial': ['A', 'B', 'C'],
    'bp_systolic': [120, 135, 110],
    'bp_diastolic': [80, 90, 75],
    'hypertension': [0, 1, 0]
})

df

### Statistics: SciPy

In [None]:
from scipy import stats

x = df.bp_systolic
y = df.bp_diastolic

# Paired t-test
t, p = stats.ttest_rel(x, y)
print(p < 0.05)

# Pearson's correlation
corr, p = stats.pearsonr(x, y)
print(p < 0.05)

#### Framework to Conduct a Data Analysis
1. Data Cleaning
    - Starting with collected data (Excel sheets), combine data by patient ID
    - Remove outliers and handle missing data
    - Standardize and scale data (all data should be numeric or binary)
    - Encode categorical data
1. Exploratory Analysis
    - Evaluate for normality
    - Evaluate for multicollinearity (regression analysis)
1. Data Analysis
    - Descriptive stats and group comparisons
        - Student's t-test for numeric and chi-squared for binary data
    - Regression analysis
        - Linear and logistic regression
1. Report Data
    - Metrics include: means, standard deviations, counts, percentages, p-values, relative risks, odds ratios, confidence intervals, etc.
    - Visualizations include:
        - Descriptive Tables
        - Bar and boxplots
        - Forest plots
        - ROC curves
        - Survival plots
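
As a rough sketch of the "Report Data" step, the snippet below computes a few of the listed metrics (mean, standard deviation, count, percentage) with Python's built-in `statistics` module. The values reuse the systolic pressures and hypertension flags from the pandas example above; the variable names are illustrative, and in a real analysis pandas and SciPy would likely be more convenient.

```python
import statistics

bp_systolic = [120, 135, 110]   # values from the example DataFrame above
hypertension = [0, 1, 0]

mean_sys = statistics.mean(bp_systolic)
sd_sys = statistics.stdev(bp_systolic)       # sample standard deviation

n_htn = sum(hypertension)                    # count of patients flagged 1
pct_htn = 100 * n_htn / len(hypertension)    # percentage of patients

print(f'Systolic BP: {mean_sys:.1f} +/- {sd_sys:.1f}')
print(f'Hypertension: {n_htn} ({pct_htn:.1f}%)')
```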

# 7. Next Steps
<a id='next_steps'></a>

#### What do I do with this knowledge?
- Research
    - Many labs and projects need help with statistics
    - Database studies are purely computer-based and often do not require full IRB review (confirm with your institution)
    - Everything you can do in things like GraphPad Prism, STATA, and SPSS, you can do in Python!
- Clinical tools
    - e.g. create a tool to screen for tumors in brain MRIs
- Academic tools
    - e.g. create a scheduler for volunteering shifts
- Personal use
    - e.g. create an interface to keep track of workouts

#### What happens if I get stuck?
- Copy and paste your error into Google and go from there
    - Important to learn the terminology so you can Google effectively
- https://stackoverflow.com/

#### Continued Learning
- What we covered today is just the tip of the tip of the iceberg. Just like in medicine, learning in programming never ends
    - https://learnpython.org/
    - https://www.codecademy.com/
    - https://google.github.io/styleguide/pyguide.html (Google Python Style Guide)
    - Learn bash scripting to get around your terminal and file system

- Other tools you might want to download: 
    - Visual Studio Code and PyCharm are two free and feature-rich development environments (alternatives to Jupyter)

- Useful libraries to look into:
    - Data Science and Statistics
        - Pandas
        - SciPy
        - Numpy
        - Statsmodels
        - Matplotlib/Plotly/Seaborn for data visualization (e.g. making figures)
    - Machine Learning
        - OpenCV (also computer vision)
        - Scikit-learn
        - Tensorflow
        - Keras
        - PyTorch
    - Software Development
        - Tkinter/PyQT (user interfaces)
        - Django/Flask (application backends)
    - Biological Computation
        - BioPython
    - Whatever you want to do, there's probably a library to help you do it

#### Potential Future Sessions
- We will potentially host more advanced future sessions based on interest from you guys
- Potential sessions include:
    - Data analysis for research workshop
    - Leetcode/algorithms workshop
    - Intro to scripting
    - Intro to machine learning and computer vision
    - Whatever you guys want to learn!
- Please complete the post-survey to indicate which sessions you're interested in!