# Lecture 1 Python Basics

## Topics:

1. Jupyter notebook and Python
2. Python Operators and Variables
3. Lists
4. FOR loops


# 1. Jupyter

## Why Jupyter?

- Interactive
- Document your work: code, visualizations, and markdown comments
- Easy to replicate
- Easy to share
- Works with Python and R
- Pick JupyterLab or Jupyter Notebook

Check it out:
https://jupyter.org

Documentation for jupyter lab: https://jupyterlab.readthedocs.io/en/stable/user/interface.html

### Idea 1: A notebook consists of code and markdown cells
A notebook is made up of cells, mainly markdown cells and code cells. You can copy cells, create new ones, delete them, or run the whole notebook at once.

With these cells, you do two things:

1. Note down ideas, write explanations, etc. in markdown cells. Run these cells to format them.
2. Write Python commands in code cells. Run these cells to execute code.

### Idea 2: The kernel is the engine under the hood
The kernel runs your Python session and keeps track of variables and their values.
Restarting the kernel clears the memory of everything that happened in the previous session, but it does not alter the text or code typed in cells.

To oversimplify things, think of Python as made up of two parts: 1) a big table that keeps track of variables and their values, and 2) an 'interpreter' that executes commands and updates the variable table.

### Idea 3: All your notebook work is saved as an .ipynb file
This file keeps code, markdown text, etc. which you can easily share. It does NOT contain the external data files you may have used to generate images, plots, etc. Sharing notebooks is a good way to share code and reproduce analyses! Ipynb files are the way we distribute and turn in homework in this class.

The file does NOT save your active Python session. When you quit, the kernel is shut down and the session memory is wiped clean. You need to rerun your code when you restart a new session.

## Markdown formatting

# Big Header
## Smaller Header
### Even Smaller Header

*Italic* text
**Bold** text

Use backticks to show code examples as plain text:
`print('Hello World\n')`

Code example with syntax coloring:

```python
data = [1,2,3]
print(data)
```


Numbered lists are created by typing a number followed by a period or a closing parenthesis:
1. Adenine
2. Cytosine
3. Guanine
4. Thymine

Math (LaTeX format) is enclosed by dollar signs:

$sample\_mean = \frac{sum}{n}$

$x = {-b \pm \sqrt{b^2-4ac} \over 2a}$


In [4]:
# This is a code cell. When you press Shift+Enter, its commands run in line order, top to bottom.
# After the cell runs, note the input number on the left, e.g. In [1]:
# It tells you that this cell was run and that it was the nth cell run in this session.
# Rerun this cell and the number will increase.

x = 'Hello World'
print(x) # prints the value of the variable

Hello World


## Important: The order in which cells are run, not their visual order, is what counts.

Under the hood, the Python interpreter doesn't care about the visual order of the cells in your notebook. Only the order in which the cells were run matters. Variable values depend on run order, not notebook order. If you write your code so that a lower cell requires a higher cell to run first, running them out of order won't give the result you expect.

Note the input numbers on the left - they tell you the order in which things were run.

You can always reset the entire notebook by selecting one of the Restart options from the Kernel menu above. 


Two rules:

1. Lines of code within a cell are executed in strict order, top to bottom.
2. Cells can be run in any order (as long as the result is valid Python)

In [5]:
x = 'Hello World'

In [8]:
print(x)

10000


In [7]:
x = 10000 
# running the cell above will print the new value, 
# even though the new value was set in a cell below
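
The run-order effect shown in the three cells above can also be reproduced in a single script. This is only a rough sketch: the dictionary `kernel_memory` and the cell labels are hypothetical stand-ins for the kernel's variable table and for notebook cells, and `exec()` is used here purely for illustration, not as a description of how Jupyter itself works.

```python
# Each string stands in for one notebook cell; the labels are hypothetical.
cells = {
    'top': "x = 'Hello World'",
    'middle': "shown = x",
    'bottom': "x = 10000",
}

kernel_memory = {}  # plays the role of the kernel's variable table

# Run the cells out of visual order: top, then bottom, then middle.
for label in ['top', 'bottom', 'middle']:
    exec(cells[label], kernel_memory)

print(kernel_memory['shown'])  # 10000, because 'bottom' ran before 'middle'
```

Changing the run order to `['top', 'middle', 'bottom']` would store `'Hello World'` in `shown` instead.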

# 2. Python


Python is a very widely used programming language in science, especially in machine learning.

- 'Scripting language': interpreted at runtime and high level, which makes it easy to write code and run it across systems
- Very general purpose: can be applied to a wide range of computational problems
- Many scientific packages for data wrangling, statistics, and visualization
     
## What does Python do?

1. Executes commands
2. Interacts with the host system (filesystem, OS)
3. Displays any output
4. Keeps a table of variables and their values

# 3. Python Operators 

In [9]:
#subtraction
10-5

5

In [10]:
# multiplication
3*5

15

In [12]:
# Operators act differently with different data types 
'DNA ' * 5

'DNA DNA DNA DNA DNA '

In [13]:
# Parentheses group operations just as in math

(10*3) + 2

32

In [14]:
# Comparison
10 > 5

True
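
The cells above cover only a handful of operators. As a starting point, here is a short sketch of a few other common binary operators; the variable names are made up for this example.

```python
quotient = 7 / 2        # true division always returns a float: 3.5
whole = 7 // 2          # floor division: 3
remainder = 7 % 2       # modulo (remainder): 1
power = 2 ** 10         # exponentiation: 1024
same = (10 == 5)        # equality test: False
different = (10 != 5)   # inequality test: True

print(quotient, whole, remainder, power, same, different)
```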

### Ask Wustl ChatGPT what the binary Python operators are

# 4. Python Data Types
(Ask ChatGPT)

In [15]:
# integer
10

10

In [16]:
# floating point number

3.14

3.14

In [17]:
# Boolean

True

True

In [18]:
# String (any characters)

'DNA'

'DNA'

In [19]:
# Also a string

'10'

'10'

In [20]:
# operators work only on correct data types
# Integer and float:

10/3.2

3.125

In [21]:
# Try 10/'DNA'. Does it work?
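
If you want to compare notes after trying the cell above, the sketch below shows one way to see what happens without stopping the notebook. It uses `try`/`except`, which has not been covered yet, and the names `result` and `text_number` are made up for this example.

```python
# Dividing an integer by a string is not defined, so Python raises a TypeError.
try:
    result = 10 / 'DNA'
except TypeError as error:
    result = None
    print('TypeError:', error)

# The string '10' is not the number 10 until you convert it.
text_number = '10'
repeated = text_number * 3         # string repetition: '101010'
multiplied = int(text_number) * 3  # integer multiplication: 30
print(repeated, multiplied)
print(type(text_number), type(int(text_number)))
```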

# 5. The Assignment Operator

Python would just be a basic calculator if you couldn't save values to variables.

Key idea: we *assign* values to a variable with the assignment operator

The `=` is one of the most important operators in Python. It *assigns* values to variables.

Name of the variable is on the left of the assignment operator and an expression is on the right.

Deep cut: Don't think of a variable as a container that holds a value. Instead, variables are pointers that point to objects in memory. So `n` points to the `int` object 25 in the below example. You can reassign `n` to point to a different object, and the same object can have multiple variables pointing at it. This latter phenomenon can lead to some unexpected results if you don't understand the concept of a variable as a pointer. For example:
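
A minimal sketch of the "multiple variables, one object" surprise. It uses a list, a data type introduced later in this lecture, and the names `first`, `second`, `n_copy`, and `n_start` are made up for this illustration.

```python
first = [10.2, 11.1]
second = first           # second points to the SAME list object, not a copy
second.append(9.5)       # change the object through one name...

print(first)             # ...and the change is visible through the other: [10.2, 11.1, 9.5]
print(first is second)   # True: both names point to one object

# Reassignment is different: it points the name at a new object.
n_start = 25
n_copy = n_start
n_copy = n_copy + 1      # n_copy now points to the int object 26
print(n_start, n_copy)   # 25 26
```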

In [22]:
# Assigning values to a variable 
# note variable names in black

n = 25 # assigning an integer
nucleotide = 'adenine' # assigning a character string
pi_approx = 3.14 # assigning a float
is_detected = True # boolean type

In [23]:
# You can write expressions that operate on data values

answer = 25 + 3.14
answer

28.14

In [24]:
# More often we write expressions that operate on variables

answer = n + pi_approx
answer

28.14

### Rules for naming variables:
1. Names are case sensitive: `DNA`, `Dna`, and `dna` are all distinct.
2. Don't begin with a number
3. Generally stick to upper and lower case letters, and numbers
4. Avoid these reserved names (keywords) that are already used by Python:

`and, as, assert, async, await, break, class, continue, def, del, elif, else,
except, False, finally, for, from, global, if, import, in, is, 
lambda, None, nonlocal, not, or, pass, raise, return, True, try, 
while, with, yield`

In [25]:
# To remember the list, try
help()
# then type keywords. TYPE QUIT IN THE HELP INTERFACE TO CONTINUE.

Welcome to Python 3.12's help utility! If this is your first time using
Python, you should definitely check out the tutorial at
https://docs.python.org/3.12/tutorial/.

Enter the name of any module, keyword, or topic to get help on writing
Python programs and using Python modules.  To get a list of available
modules, keywords, symbols, or topics, enter "modules", "keywords",
"symbols", or "topics".

Each module also comes with a one-line summary of what it does; to list
the modules whose name or summary contain a given string such as "spam",
enter "modules spam".

To quit this help utility and return to the interpreter,
enter "q" or "quit".



help>  q



You are now leaving help and returning to the Python interpreter.
If you want to ask for help on a particular object directly from the
interpreter, you can type "help(object)".  Executing "help('string')"
has the same effect as typing a particular string at the help> prompt.
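
As an alternative to the interactive help prompt, the standard-library `keyword` module can list the reserved names directly. A small sketch follows; the example names `'dna'`, `'dna2'`, and `'2dna'` are made up.

```python
import keyword

print(keyword.kwlist)                 # the full list for the running Python version

is_reserved = keyword.iskeyword('lambda')   # True: reserved, don't use as a name
is_free = not keyword.iskeyword('dna')      # True: 'dna' is not reserved
print(is_reserved, is_free)

# str.isidentifier() checks the spelling rules (for example, no leading digit).
print('dna2'.isidentifier())          # True
print('2dna'.isidentifier())          # False
```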


# 6. Built-in functions

Every function has a name. To call a function, type the function name followed by parentheses. Within the parentheses go zero or more arguments.

In [26]:
print(answer)

28.14


In [27]:
type(nucleotide)

str

In [28]:
len('AGCTAGCGATCG')

12
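
To illustrate "zero or more arguments", here is a short sketch calling built-in functions with different numbers of arguments. The variable names are made up for this example.

```python
sequence = 'AGCTAGCGATCG'

print()                           # zero arguments: prints an empty line
length = len(sequence)            # one argument
rounded = round(3.14159, 2)       # two arguments: the value and the number of decimals
largest = max(10.2, 11.1, 9.5)    # several arguments

print(length, rounded, largest)   # 12 3.14 11.1
```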

# 7. Lists
Lists are one of the most important data types in this course and in data analysis. 

List properties:

1. Defined by enclosing zero or more values in **square brackets**, with individual items separated by commas.

2. Lists are **ordered**: values are organized by their position in the list. We'll see other data types for which this isn't true.

3. List elements can be of any type, and a single list can mix multiple types. But best practice is to stick with a single data type in the list.

4. Lists are **mutable**: values can be changed


## Key things to know working with lists:

1. Indexing/slices
2. Appending
3. Other list functions (`sum()`, `len()`, `zip()`, etc.)
4. Iteration with `for` loops

In [30]:
# A list of floats - say four measurements from an experiment.
data = [10.2, 11.1, 11.0, 9.5]

In [31]:
# How many items are in the list? Use the length function len() to find out

len(data)

4

In [32]:
# Access individual values in a list using indexing. Remember, lists are ordered!
# COMPUTER SCIENTISTS BEGIN COUNTING FROM 0
# Rerun this cell a few times with different position values.

data[0]

10.2

In [33]:
# Access by counting backwards from the end
data[-1]

9.5

In [34]:
# Extract a range of positions. This is called a "slice".
# To take a slice, the first number is the start position, the second is the *last position plus one*:

data[2:4]

# Try this a few times with different values. 
# Why is this numbering useful? What do you get if you subtract the first index from the last in a slice?

[11.0, 9.5]
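
One way to explore the slice numbering, sketched with a separate list so the lecture's `data` is left untouched. The names `measurements`, `start`, and `stop` are made up for this example.

```python
measurements = [10.2, 11.1, 11.0, 9.5]

start, stop = 1, 3
middle = measurements[start:stop]    # positions 1 and 2; position 3 is excluded
print(middle)                        # [11.1, 11.0]
print(len(middle) == stop - start)   # True: the slice length is stop minus start

# Leaving out a number means "from the beginning" or "to the end".
head = measurements[:2]              # [10.2, 11.1]
tail = measurements[2:]              # [11.0, 9.5]
print(head + tail == measurements)   # True: the two slices meet without overlap
```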

In [35]:
# You can change what's inside a list. In tech terms, lists are mutable.
# Replace values in a list using indexing.

data[1] = 11.3
print(data)

[10.2, 11.3, 11.0, 9.5]


In [36]:
# Test for membership in a list with 'in'

11.3 in data

True

In [37]:
# Add items to a list using append() - note the dot notation as we use this function.

data.append(10.6)
print(data)

[10.2, 11.3, 11.0, 9.5, 10.6]


In [38]:
# Very useful function - count occurrences of a value

count_data = [0,1,4,2,4,0,0,2,6,3]
count_data.count(0) # how many zeros in our data?

3
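
The other list functions mentioned earlier (`sum()`, `len()`, `zip()`) are not demonstrated in the cells above. Here is a brief sketch with made-up replicate labels and integer counts.

```python
labels = ['rep1', 'rep2', 'rep3', 'rep4']   # hypothetical replicate names
values = [10, 11, 11, 9]                    # hypothetical integer counts

total = sum(values)                  # 41
mean = total / len(values)           # 10.25
paired = list(zip(labels, values))   # pairs items by position

print(total, mean)
print(min(values), max(values))      # 9 11
print(sorted(values))                # [9, 10, 11, 11] (a new list; values is unchanged)
print(paired)
```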

# 8. Quick introduction to FOR loops 

`for` loops repeat a block of code. They are a good way to **iteratively** perform actions on items in data structures like lists and dictionaries. You can also set `for` loops to run a fixed number of times.

**FOR loop SYNTAX TO REMEMBER:**
Two ways to code a FOR loop to repeat a block of code:


In [40]:
# Syntax option 1: iterate directly over a list
for data_point in data:
    print(data_point)

10.2
11.3
11.0
9.5
10.6
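
These notes show only the first syntax option. A likely candidate for the second, assumed here rather than taken from the lecture, is looping over `range()`. That is also the usual way to run a loop a fixed number of times. The names `repeat`, `position`, and `positions_seen` are made up for this sketch.

```python
data = [10.2, 11.3, 11.0, 9.5, 10.6]

# Repeat a block a fixed number of times: range(3) yields 0, 1, 2.
for repeat in range(3):
    print('pass number', repeat)

# Loop over positions, then use each position to index into the list.
positions_seen = []
for position in range(len(data)):
    positions_seen.append(position)
    print(position, data[position])
```

Iterating directly over the list is usually simpler. The `range()` form helps when you need the position as well as the value.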


In [41]:
# Example of a FOR loop
# Note the indents: indented lines are executed each time through the for loop
# Non-indented lines are executed *after* the for loop

total = 0
for i in data:
    total += i
print(total/len(data))

10.52


# Use ChatGPT to generate code for a list of the squares of the first 10 integers
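
For comparison with whatever code you get back, here is one possible sketch that uses only the tools from this lecture (a list, `append()`, and a `for` loop). It assumes "the first 10 integers" means 1 through 10.

```python
squares = []
for number in range(1, 11):          # 1 through 10; range stops before 11
    squares.append(number * number)

print(squares)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```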

# ACTIVITY: Complete Activity 1 to practice working with Python variables

To learn more on this topic see:

https://python.swaroopch.com/basics.html

https://python.swaroopch.com/op_exp.html

## Supplementary reading:

This was just a brief overview of basic Python syntax. **If you are unfamiliar with Python and need to reinforce your understanding, work through the supplementary reading.**

If you're new to Python:

Read the chapters Basics, Operators & Expressions from the online textbook *A Byte of Python*
https://python.swaroopch.com


If you know Python:
**I strongly recommend that you build your Python knowledge by reading the official documentation, which presents more rigorous explanations of the implementation of Python syntax.**

A more advanced introduction is in the official Python documentation:

Variables: https://docs.python.org/3/tutorial/introduction.html
