# Lecture 3: Expressions and Data Types

# REPL (read-eval-print loop)
* Type an **expression** into a code cell.
* The Python interpreter **evaluates** the expression.
* The notebook **displays the value** of the (last) expression in the cell.

In [None]:
print('Hello world!')

In [None]:
2 * 3

In [None]:
# two expressions; only the value of the last one is displayed
3
4

# Numbers and Arithmetic


<img src="./data/arithmetic_table.png"  width="80%" align="middle"/>

## Python uses the typical order of operations

In [None]:
3*2**2

In [None]:
(3*2)**2
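
A few more cases are worth checking by hand. The sketch below (variable names are only for illustration) shows that exponentiation is applied before multiplication and before a leading minus sign, and that chained `**` groups from the right.

```python
a = 3 * 2 ** 2     # exponent first, then multiply
b = (3 * 2) ** 2   # parentheses override the default order
c = -2 ** 2        # the minus sign is applied after the exponent
d = 2 ** 3 ** 2    # chained exponents group right to left: 2 ** (3 ** 2)

print(a, b, c, d)  # 12 36 -4 512
```

When in doubt, add parentheses; they cost nothing and make the intent explicit.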

# Assignment: names and variables

$$ \overbrace{\texttt{myvariable}}^{\text{name}} = \overbrace{\texttt{2 + 3}}^{\text{any expression}} $$

* Assignment statements like above don't have a value.
* An assignment statement changes the meaning of the name to the left of the `=` symbol.
* `myvariable` is bound to `5` (value) not `2 + 3` (expression).

In [None]:
more_than_1 = 2 + 3 #assignment statement

In [None]:
more_than_1 #typing the name of a variable displays its value (contents of a box in memory)

In [None]:
more_than_1 * 2

In [None]:
more_than_1

### Aside: press ```tab``` to autocomplete a name that has already been defined

### A variable's value is set at the time of assignment

In [None]:
x = 2
y = 3 + x
y

In [None]:
x = 3

In [None]:
y
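
The same cells, gathered into one sketch with the results recorded: the right-hand side is evaluated once, at the moment of assignment, so rebinding `x` later leaves `y` alone. The names `y_after_rebinding_x` and `y_after_reassignment` are introduced here only to keep the intermediate values.

```python
x = 2
y = 3 + x                  # evaluated now: y is bound to the value 5
x = 3                      # rebinding x does not touch y
y_after_rebinding_x = y
print(y_after_rebinding_x)     # 5

y = 3 + x                  # re-run the assignment to pick up the new x
y_after_reassignment = y
print(y_after_reassignment)    # 6
```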

# Call Expressions
* Call expressions invoke functions
* Functions are called in Python just like in standard mathematics:
$$ y = f(x) $$
* Inputs are called arguments

In [None]:
abs(-12)

### Some functions can take a variable number of arguments

In [None]:
max?

In [None]:
max(3, -4)

In [None]:
max(2, -3, -6, 10, -4)

### Use ```?``` after a function name to see its documentation
* Or use the `help` function.
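
The `?` suffix is a notebook (IPython) convenience rather than plain Python, so it only works in a code cell. As a minimal sketch, the built-in `help` function and the `__doc__` attribute give the same documentation anywhere Python runs:

```python
# help() prints the documentation for any object.
help(abs)

# The same text is stored on the function's __doc__ attribute.
doc = abs.__doc__
print(doc)
```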

In [None]:
# round
my_number = 1.22
round(my_number)

In [None]:
round?

In [None]:
round(1.22222, 3)

### What functions are available for use? [builtin functions](https://docs.python.org/3/library/functions.html)

<img src="./data/python_builtins.png"  width="900" align="middle"/>

## Import functions from python modules
* Roughly speaking, modules are collections of Python functions.
* Access these functions via an *import statement*.
* Call the functions using `module.function()` syntax.

### Import the `math` module and look around
* sqrt, log, etc...

In [None]:
import math

In [None]:
math.sqrt(9)

In [None]:
math.pow(3,2)

In [None]:
# what base is log?
math.log?
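
To answer the question without reading the documentation, one can simply try it. The sketch below suggests that `math.log` with one argument is the natural logarithm, that an optional second argument sets the base, and that `math.log2` and `math.log10` cover the two most common bases. The variable names are only for illustration.

```python
import math

natural = math.log(math.e)   # one argument: natural logarithm (base e)
base_2 = math.log2(8)        # dedicated base-2 function
base_10 = math.log10(1000)   # dedicated base-10 function
custom = math.log(81, 3)     # optional second argument sets the base

print(natural, base_2, base_10, custom)
```

The results are floats, so values such as `custom` may be off in the last digit; compare them with `math.isclose` rather than `==`.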

In [None]:
# tab completion for browsing
math.

<center><img src="data/q2.png"  width="1000"/></center>

In [None]:
x=3 
y=-2

In [None]:
abs(x, y)

In [None]:
math.pow(x, abs(y))

In [None]:
round(x, max(abs(y**2)))

In [None]:
math.pow(x, math.pow(y,x))

# Data Types
* Every value in Python has a type (use the `type` function!)
* All data analyzed in this class is stored as a Python data type.
* Understanding the data often requires understanding how the data was stored.

# Two data types: ```int``` and ```float``` 
* ```int``` : an integer of any size
* ```float```: a number with an optional fractional part

### ```int```
* integer arithmetic: `+`, `-`, `*`, `**`
* ints have arbitrary precision

In [None]:
type(3+5)

In [None]:
2**300

In [None]:
2**3000

### ```float```
* a float is specified using a decimal point
* a float might be printed using scientific notation

In [None]:
type(2.0 + 3.2)

In [None]:
2.0**300

### ```float```
* floats have limited size (but the limit is huge)
* floats have limited precision: about 15-16 significant decimal digits
* after arithmetic, the final few decimal places can be wrong (limited precision!)

In [None]:
3.0*4.2

In [None]:
2.0**3000
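
Both limits can be inspected directly. The sketch below uses `sys.float_info` from the standard library to read the largest float and the number of reliable decimal digits, then shows the usual workaround for rounding error: compare with a tolerance (`math.isclose`) instead of `==`. The variable names are only for illustration.

```python
import math
import sys

largest = sys.float_info.max     # largest representable float, about 1.8e308
digits = sys.float_info.dig      # decimal digits that are always preserved
print(largest, digits)

total = 0.1 + 0.2
print(total)                     # 0.30000000000000004
print(total == 0.3)              # False: the last digit is off
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# Unlike ints, floats overflow once a result exceeds the limit.
try:
    2.0 ** 3000
except OverflowError as err:
    overflow_message = str(err)
    print("OverflowError:", overflow_message)
```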

## Type coercion between ```int``` and ```float```
* by default, Python converts an int to a float in a mixed expression
* a value can be explicitly converted using the ```int``` and ```float``` functions.
* division of two integers with `/` always returns a float value

In [None]:
2.0 + 3

In [None]:
2/1

In [None]:
# want an integer back
int(2/1)

In [None]:
# int drops the fractional part (rounds DOWN for positive numbers)
int(3.9)
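
For negative numbers the distinction matters: `int` cuts off the fractional part, which moves the value toward zero rather than down. The sketch below compares it with `math.floor`, `math.ceil`, and `round`; the variable names are only for illustration.

```python
import math

truncated_pos = int(3.9)        # 3
truncated_neg = int(-3.9)       # -3: toward zero, not down
floored = math.floor(-3.9)      # -4: always down
ceiled = math.ceil(3.1)         # 4: always up
nearest = round(3.9)            # 4: nearest integer

print(truncated_pos, truncated_neg, floored, ceiled, nearest)
```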

### Be careful converting between ```int``` and ```float```

In [None]:
2.51 * 100

In [None]:
int(2.51 * 100)
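
The product lands just below 251 because 2.51 has no exact binary representation, and `int` then discards nearly a whole unit. One common remedy, sketched here with illustrative variable names, is to round to the nearest integer before converting.

```python
product = 2.51 * 100
print(product)             # 250.99999999999997

truncated = int(product)   # 250: the fractional part is simply dropped
rounded = round(product)   # 251: nearest integer
print(truncated, rounded)
```

With one argument, `round` already returns an `int`, so no further conversion is needed.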

### The consequences of `float` to `int` conversion error

The Ariane 5 exploded shortly after launch in 1996 when a 64-bit floating-point value was converted to a 16-bit signed integer and overflowed:
[see story here](https://itsfoss.com/a-floating-point-error-that-caused-a-damage-worth-half-a-billion/)

<center><img src="data/ariane.jpg" width="400"/></center>
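
Python's own `int` never overflows, so this failure mode cannot be reproduced directly. The sketch below only imitates it: `to_int16` is a hypothetical helper (not the flight software, and not a library function) that refuses any value outside the range of a 16-bit signed integer.

```python
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1   # range of a 16-bit signed integer

def to_int16(value):
    """Convert a float to an int, refusing values a 16-bit integer cannot hold."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in 16 bits")
    return result

small = to_int16(1234.5)
print(small)               # 1234

try:
    to_int16(40000.0)
except OverflowError as err:
    failure = str(err)
    print("conversion failed:", failure)
```

Checking the range before converting, and deciding in advance what to do when the check fails, is the general lesson.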

# Text, Strings, and Types

## A string value is a snippet of text of any length
* enclose a string in either single or double quotes

In [None]:
'oink'

In [None]:
"oink"

In [None]:
"12.0"

### String arithmetic

In [None]:
s1 = 'baby'
s2 = 'porcupine'

In [None]:
s1 + s2

In [None]:
s1 + ' ' + s2

In [None]:
s1*3

### string methods
* Strings are associated with certain functions called *string methods*.
* Access string methods with a `.` after the string.
* e.g. `.upper()`, `.replace()`,...

In [None]:
my_cool_string = 'data science is super cool!'

In [None]:
my_cool_string.upper()

In [None]:
my_cool_string.replace('super', 'super-duper')
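
Note that string methods return a new string and leave the original untouched. A minimal sketch (the names `shouted` and `chained` are only for illustration), which also shows that method calls can be chained:

```python
my_cool_string = 'data science is super cool!'

shouted = my_cool_string.upper()
print(shouted)           # DATA SCIENCE IS SUPER COOL!
print(my_cool_string)    # data science is super cool!  (unchanged)

# Each call returns a string, so another method can follow immediately.
chained = my_cool_string.replace('super', 'super-duper').upper()
print(chained)           # DATA SCIENCE IS SUPER-DUPER COOL!
```

To keep a result, assign it to a name, as with `shouted` above.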

### Special characters in strings
* apostrophes, quotes, new-lines, etc...

In [None]:
'my string's full of apostrophes!'

In [None]:
"my string's full of apostrophes!"

In [None]:
# escape the apostrophe with a backslash!
'my string\'s "full" of apostrophes!'

In [None]:
print('my string\'s "full" of apostrophes!')

## Digression: ```print()```
* By default, Jupyter notebooks display the "raw" value of the expression on the last line of a cell.
* The function ```print``` displays the value as human-readable text when it is evaluated.

In [None]:
12 # 12 won't be displayed
23

In [None]:
print(12)
print(23)

In [None]:
my_newline_str = 'here is a string with two lines.\nhere is the second line'  # '\n' inserts a new line
my_newline_str

In [None]:
print(my_newline_str)  # notice the quotes disappear!
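
The "raw" form the notebook shows for a string is, roughly, what the built-in `repr` function returns. The sketch below uses a hypothetical string `two_lines` to put the two forms side by side.

```python
two_lines = 'line one\nline two'

raw = repr(two_lines)    # quotes and the escape sequence stay visible
print(raw)               # 'line one\nline two'
print(two_lines)         # printed on two lines, without quotes

length = len(two_lines)  # '\n' counts as a single character
print(length)            # 17
```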

## Type conversion to and from strings
* Any value can be converted to a string using ```str```
* Strings can be converted to ```int``` and ```float``` when possible

In [None]:
str(3)

In [None]:
float('3')

In [None]:
int('4')

In [None]:
int('bunnies')
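
A failed conversion raises a `ValueError`, which a program can catch instead of crashing. The sketch below wraps `float` in a hypothetical helper, `to_number`, and also shows that `int` rejects a string containing a decimal point unless it goes through `float` first.

```python
def to_number(text):
    """Hypothetical helper: return text as a float, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None

print(to_number('3'))        # 3.0
print(to_number('bunnies'))  # None

# int('2.5') would raise ValueError; convert to float first, then truncate.
whole = int(float('2.5'))
print(whole)                 # 2
```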

<center><img src="data/q3.png"  width="1000"/></center>

In [None]:
x=3
y='4'
z='5.6'

In [None]:
x+y #like 3+"bunnies"

In [None]:
x+int(y+z)

In [None]:
str(x)+int(y)

In [None]:
str(x)+z

## Type inference: #MachineFail
<br/>
<center><img src="./data/type_inference_2.png"  width="700"/></center>

### Type conversion causes messy data!

Genomics data (string-to-date):
> "Geneticists use MARCH1 as shorthand for membrane associated ring-CH-type finger 1. But Excel interprets MARCH1 as a date, automatically converting it to 1-Mar or another designation for the first of March."

[Excel Is Autocorrecting Scientific Research. And That's Not Cool](https://science.howstuffworks.com/innovation/scientific-experiments/excel-is-autocorrecting-scientific-research-thats-not-cool.htm)

### Type conversion causes messy data!

Genomics data (string-to-float):

> "Excel misidentifies some other gene names as coordinates or floating points. You might be able to suss out that 1-Mar is actually MARCH1, but how about 2.31E+13? That's how Excel converts the RIKEN identifier 2310009E13."

[Excel Is Autocorrecting Scientific Research. And That's Not Cool](https://science.howstuffworks.com/innovation/scientific-experiments/excel-is-autocorrecting-scientific-research-thats-not-cool.htm)
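
The same trap is easy to recreate in Python. The sketch below is a toy, not Excel's actual logic: `infer` is a hypothetical function that tries `int`, then `float`, and otherwise keeps the string. The RIKEN identifier happens to be valid scientific notation, so it silently becomes a float.

```python
def infer(text):
    """Hypothetical naive type inference: try int, then float, else keep the string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text

for raw in ['42', '5.6', 'MARCH1', '2310009E13']:
    value = infer(raw)
    print(repr(raw), '->', repr(value), type(value).__name__)
```

The safer habit is to keep identifiers as strings and convert a column to a number only when it is known to hold numbers.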