# Lecture 2

This notebook contains more than may be covered in lecture. Please use this to explore and understand the fundamentals of the language.

---

## Jupyter Notebooks, expressions, and data types

![image](./logos.png)

---

# Programming languages help extract information from data

* Python is popular for both data science and software development
* Learn through practice!
* Learn just enough to use it as you need it!
* Follow along at [jupyterhub.ucsd.edu](https://jupyterhub.ucsd.edu)

---

# Jupyter notebooks allow you to mix text with code

* Perfect for experimenting with code
    - Annotate your code experimentation for others to learn from
* Perfect for presentations about data
    - Annotate your data analysis with explanations
* See the documentation for references:
    - [Jupyter Documentation](https://jupyter-notebook.readthedocs.io/)

# Change the cell below to a "markdown cell"

In [None]:
# Header

* list item 1
    - sublist item1
    - sublist item2
* list item 2

---

## Sub-Header

You can link to websites:

[Jupyter Documentation](https://jupyter-notebook.readthedocs.io/)

You can display images:

![this is an image](./file/path)

## You can also use math equations:

Embed math such as $e^{i\pi} + 1 = 0$ inline in text, or put it on its own line:

$$e^{i\pi} + 1 = 0$$

## You can also write tables easily:

|header 1|header 2|header 3|
|--------|--------|--------|
|value 1 |value 2 |value 3 |

# Getting started with Python: expressions

Write an expression in a "code cell" and either hit "Shift-Enter" or press the "Run" button to evaluate the code!

In [None]:
print("hello world!")

In [None]:
2 * 3

In [None]:
# this is a comment
1 + 2 # this code demonstrates addition

# Numbers and Arithmetic

![](./arithmetic_table.png)

## Python uses typical order of operations

In [None]:
3*2**2

In [None]:
(3*2)**2
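
The two cells above can be compared side by side. The sketch below uses illustrative variable names that are not part of the lecture; it also shows that exponentiation binds tighter than a leading minus sign, which is easy to overlook.

```python
# Exponentiation is evaluated before multiplication.
no_parens = 3 * 2 ** 2        # same as 3 * (2 ** 2)
with_parens = (3 * 2) ** 2    # parentheses force the multiplication first

# Exponentiation also binds tighter than a leading minus sign.
negated_square = -2 ** 2      # same as -(2 ** 2)
square_of_negative = (-2) ** 2

print(no_parens, with_parens, negated_square, square_of_negative)  # 12 36 -4 4
```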

# Two data types: ```float``` and ```int```
* ```int``` : an integer of any size
* ```float```: a number with an optional fractional part

### ```int```
* ints have arbitrary precision

In [None]:
2 + 3

In [None]:
2**5

In [None]:
2**4000

### ```float```
* a float is specified using a decimal point
* a float might be printed using scientific notation
* floats have limited size (but the limit is huge)
* floats have limited precision: about 15-16 significant decimal digits
* after arithmetic, the final few decimal places can be wrong (limited precision!)

In [None]:
2.0 + 3.2

In [None]:
3.0**400

In [None]:
3.0*4.2

In [None]:
3.0**4000

In [None]:
float(3)
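
A minimal sketch contrasting the two types. The `try`/`except` construct has not been introduced in this lecture; it is used here only so the cell runs to the end instead of stopping at the error, and the variable names are illustrative.

```python
import sys

# ints grow as large as needed; floats cannot.
big_int = 3 ** 4000                  # an exact integer with well over 1000 digits
largest_float = sys.float_info.max   # the largest representable float

# Limited precision: 0.1 and 0.2 are not stored exactly.
total = 0.1 + 0.2
print(total)           # 0.30000000000000004
print(total == 0.3)    # False

# Going past the float size limit with ** raises OverflowError.
try:
    3.0 ** 4000
    overflowed = False
except OverflowError:
    overflowed = True
print(overflowed)      # True
```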

## Type coercion between ```int``` and ```float```
* by default, Python converts an int to a float in a mixed expression
* a value can be explicitly converted using the ```int``` and ```float``` functions.
* division of two integers automatically returns a float value

In [None]:
2.0 + 3

In [None]:
2/1

In [None]:
# want an integer back
int(2/1)

In [None]:
# int drops the fractional part (it truncates toward zero; it does not round)
int(3.9)

### Be careful switching between ```int``` and ```float```

In [None]:
2.51 * 100

In [None]:
int(2.51 * 100)
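
A short sketch of why the cell above is surprising, and one possible way around it. The product is stored as the float nearest to the true result, which on typical platforms is slightly below 251; `int` then drops the fractional part, while `round` picks the nearest integer. The variable names are illustrative.

```python
product = 2.51 * 100        # a float very close to, but not exactly, 251
truncated = int(product)    # int() drops the fractional part
rounded = round(product)    # round() picks the nearest integer

print(product, truncated, rounded)

# Truncation is toward zero, so negative values move up, not down.
toward_zero = int(-3.9)
print(toward_zero)          # -3
```

If the nearest whole number is what you want, `round` is usually the safer choice.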

### Ariane 5: the consequences of a floating-point to integer conversion error

The Ariane 5 rocket exploded shortly after launch in 1996 due to a floating-point conversion error:
[see story here](https://itsfoss.com/a-floating-point-error-that-caused-a-damage-worth-half-a-billion/)

<img src="ariane.jpg" alt="drawing" width="600"/>

# Assignment: names and variables

$$ \overbrace{\texttt{myvariable}}^{\text{name}} = \overbrace{\texttt{2 + 3}}^{\text{any expression}} $$

* Assignment statements like above don't have a value; they perform an action.
* An assignment statement changes the meaning of the name to the left of the ```=``` symbol
* The name is bound to a value (not an equation).
    - ```myvariable``` is bound to the value ```5``` not the expression ```2 + 3```.

In [None]:
more_than_1 = 2 + 3

In [None]:
more_than_1

In [None]:
more_than_1 * 2

### hit ```tab``` to autocomplete a name that is already defined

In [None]:
mor

In [None]:
x = 2
y = 3 + x

In [None]:
x + y

In [None]:
x = 3

In [None]:
y
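
The cells above, collected into one sketch: `y` is bound to a value at the moment of assignment, so changing `x` later leaves it alone. The name `y_before` is added here for illustration only.

```python
x = 2
y = 3 + x        # the expression is evaluated now; y is bound to 5
x = 3            # rebinding x does not re-run the line above
y_before = y
print(x, y)      # 3 5

y = 3 + x        # to update y, run the assignment again
print(y)         # 6
```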

# Call Expressions
Functions are called in Python just as in standard mathematical notation:
$$ y = f(x) $$

In [None]:
abs(-12)

In [None]:
f = abs
x = -12
y = f(x)

In [None]:
y

### Some functions can take a variable number of arguments

In [None]:
min(3, -4)

In [None]:
max(2, -3, -6, 10, -4)

### Discussion Question

Assume you have run the following statements:
```
x = 3
y = -2
```

Which of these examples results in an error?

* A. ```abs (x, y)```
* B. ```math.pow(x, abs(y))```
* C. ```round(x, max(abs(y**2)))```
* D. ```math.pow(x, math.pow(y, x))```
* E. More than one of these

### use ```?``` after a function name to see its documentation

In [None]:
my_number = 1.22

In [None]:
round(my_number)

In [None]:
round?

In [None]:
round(1.22222, 3)
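
A brief sketch of the two ways to call `round`, plus one behavior that often surprises people: in Python 3, a value exactly halfway between two integers is rounded to the even one. The variable names are illustrative.

```python
my_number = 1.22222

to_int = round(my_number)            # no second argument: returns an int
to_3_places = round(my_number, 3)    # second argument: number of decimal places
print(to_int, to_3_places)           # 1 1.222

# Ties go to the nearest even integer.
ties = (round(0.5), round(1.5), round(2.5))
print(ties)                          # (0, 2, 2)
```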

# What functions are available for use?

## Builtin functions 
* see the [documentation](https://docs.python.org/3/library/functions.html)

![](./python_builtins.png)

In [None]:
help?

## Import functions from Python modules

In [None]:
import math

In [None]:
math.sqrt(9)

In [None]:
math.log?

In [None]:
math.
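
A few calls from the `math` module, as a small sketch of what becomes available after `import math`. Note that these functions return floats even when the inputs are ints. The variable names are illustrative.

```python
import math

root = math.sqrt(9)                # 3.0, a float even though 9 is an int
power = math.pow(2, 5)             # 32.0, a float (unlike 2 ** 5, which is the int 32)
natural_log = math.log(math.e)     # natural logarithm by default
log_base_2 = math.log(8, 2)        # an optional second argument sets the base

print(root, power, natural_log, log_base_2)
print(math.pi)                     # the module also provides constants
```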

# Text, Strings, and Types
---

## A string value is a snippet of text of any length
* enclose a string in either single or double quotes
* some arithmetic operations work on strings
* strings come with methods that transform them (for example, ```.upper()``` and ```.replace()```)

In [None]:
'a'

In [None]:
"word"

In [None]:
"here is a full sentence. Here is another sentence."

In [None]:
"12.0"

In [None]:
s1 = 'hello'
s2 = 'world'

In [None]:
s1 + s2

In [None]:
s1 + ' ' + s2

In [None]:
s1*3

In [None]:
my_cool_string = 'data science is super cool!'

In [None]:
my_cool_string.

In [None]:
my_cool_string.upper()

In [None]:
my_cool_string.replace('super', 'super-duper')

In [None]:
bbm = 'Benoit B. Mandelbrot'
joke = 'The B in Benoit B. Mandelbrot stands for Benoit B. Mandelbrot'
joke

In [None]:
for k in range(6):
    print(joke, end='\n\n')
    joke = joke.replace('B.', bbm)

## Special characters in strings

In [None]:
'my string's full of apostrophes!'

In [None]:
"my string's full of apostrophes!"

In [None]:
'my string\'s "full" of apostrophes!'  # escape the apostrophe with a backslash!

In [None]:
print('my string\'s "full" of apostrophes!')

## Digression: ```print()``` vs. ```__repr__()```
* By default, a Jupyter notebook displays the string representation (```__repr__```) of the value of the last expression in a cell.
* The function ```print``` displays a value as human-readable text at the moment it is called.

In [None]:
12 # 12 won't be displayed
23

In [None]:
print(12)
print(23)

In [None]:
my_newline_str = 'here is a string with two lines.\nhere is the second line'  # '\n' inserts a new line
my_newline_str

In [None]:
print(my_newline_str)  # notice the quotes disappear!
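
The difference can be made explicit with the built-in functions `repr` and `str`, which produce the two forms as ordinary strings. This is a minimal sketch with illustrative names.

```python
text = 'two lines.\nsecond line'

as_repr = repr(text)   # the form a notebook shows for the last expression in a cell
as_str = str(text)     # the form print() writes

print(as_repr)         # 'two lines.\nsecond line'  (quotes kept, \n shown literally)
print(as_str)          # printed on two lines, without quotes
```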

## Type conversion to and from strings
* Any value can be converted to a string using ```str```
* Strings can be converted to ```int``` and ```float``` when possible

In [None]:
str(3)

In [None]:
float('3')

In [None]:
int('4')

In [None]:
int('chicken!')

In [None]:
'6.0' + 3.0

In [None]:
int('4.0')

In [None]:
int(float('4.0'))
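
The conversions above, summarized in one sketch. The `try`/`except` block is not covered in this lecture; it is used only to capture the error raised by `int('4.0')` so the rest of the cell still runs. The variable names are illustrative.

```python
whole = int('4')                  # 4
decimal = float('4.0')            # 4.0
via_float = int(float('4.0'))     # 4: parse as a float first, then truncate
joined = str(3) + '4'             # '34': + on two strings concatenates them

# int() only accepts strings that look like integers.
try:
    int('4.0')
    error_name = None
except ValueError as err:
    error_name = type(err).__name__

print(whole, decimal, via_float, joined, error_name)  # 4 4.0 4 34 ValueError
```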

### Discussion Question:

Assume you have run the following statements:
```
x = 3
y = '4'
z = '5.6'
```
Choose the expression that will be evaluated without error:
* A. ```x + y```
* B. ```x + int(y + z)```
* C. ```str(x) + int(y)```
* D. ```str(x) + z```
* E. All of them have errors

# Arrays
* An array contains a sequence of values.
* All elements of an array should have the same type.
* Arithmetic is applied to each element individually
* When two arrays are added, they must have the same size; corresponding elements are added in the result.
    - Unless one of the arrays has size one.
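
The rules above can be sketched as follows. This example assumes NumPy is installed and builds arrays with `np.array` instead of the course's `make_array`, so that it runs without the `datascience` library; the `bonus` array is made up for illustration, and `try`/`except` is used only to capture the size-mismatch error.

```python
import numpy as np

friends = np.array([10, 4, 2, 7, 0])
bonus = np.array([1, 1, 2, 2, 3])      # hypothetical second array of the same size

doubled = friends * 2                  # arithmetic applies to each element
summed = friends + bonus               # same size: corresponding elements are added
shifted = friends + np.array([4])      # size one: the single value is reused

print(doubled.tolist())   # [20, 8, 4, 14, 0]
print(summed.tolist())    # [11, 5, 4, 9, 3]
print(shifted.tolist())   # [14, 8, 6, 11, 4]

# Sizes 5 and 2 do not match, so the addition fails.
try:
    friends + np.array([1, 2])
    sizes_compatible = True
except ValueError:
    sizes_compatible = False
print(sizes_compatible)   # False
```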

In [None]:
from datascience import *        # datascience library for course
import numpy as np               # 'numerical python library' for working with arrays

In [None]:
# Number of friends:

Nancy = 10
Max = 4
Rob = 2
Tom = 7
Sarah = 0

In [None]:
friends = make_array(10, 4, 2, 7, 0)
friends

In [None]:
friends + 4

### use ```.item()``` to access an array element by index
* Warning: array indices start with zero!

In [None]:
friends.item(0)

In [None]:
friends.item(3)
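
A short sketch of zero-based indexing with `.item()`, again assuming NumPy and using `np.array` in place of `make_array`. The last valid index is always one less than the array's size.

```python
import numpy as np

friends = np.array([10, 4, 2, 7, 0])

first = friends.item(0)                 # index 0 is the first element
fourth = friends.item(3)                # index 3 is the fourth element
last = friends.item(friends.size - 1)   # the last index is size - 1

print(first, fourth, last)              # 10 7 0
print(type(first).__name__)             # int: .item() returns a plain Python number
```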

## Arrays make working with data easy

In [None]:
a1 = make_array(1,2,3)
a2 = make_array(3,2,1)

In [None]:
a1

In [None]:
a2

In [None]:
a1 + a2

In [None]:
a1 - a2

In [None]:
a1 * a2

In [None]:
a1/a2

In [None]:
a1**a2

## Arrays for basic statistics: newborn birth weight

In [None]:
# four girls with weight in kg:
g1 = 3.405
g2 = 3.207
g3 = 2.42
g4 = 3.984

# average weight of a newborn girl (in kg)
girl_av_weight = 3.3

Load the weights into an array of floats

In [None]:
weights_kg_g = make_array(g1, g2, g3, g4) 

weights_kg_g

Calculate the deviation of weights from the average weight

In [None]:
weights_kg_g - girl_av_weight

Convert the weights to pounds (lbs)

In [None]:
weights_lbs_g = weights_kg_g * 2.2
weights_lbs_g

How many girls are recorded in the array?

In [None]:
len(weights_lbs_g)

## Arrays for basic statistics: daily temperatures

Below is an array of daily high temperatures in San Diego from August 2018

In [None]:
temps = make_array(86, 85, 85, 84, 85, 86, 91, 89, 90, 88, 88, 85, 83, 82, 79, 81, 82, 83, 82, 79, 81, 83, 83, 79, 80, 80, 79, 80, 82, 82, 80)

Number of days on which temperatures were collected in August:

In [None]:
temps.size

Average temp

In [None]:
temps.sum() / temps.size  # use sum and size

In [None]:
temps.mean() # use the built-in mean method

In [None]:
min(temps), max(temps) # builtin functions work on array

In [None]:
temps.min(), temps.max() # the array has its own min/max methods (faster)

Sorted array of temps

In [None]:
np.sort(temps)

Temperature differences by day

In [None]:
np.diff(temps)
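
To see what `np.sort` and `np.diff` do, it helps to try them on an array small enough to check by hand. The sketch below assumes NumPy and uses a made-up five-day array, not the August data above.

```python
import numpy as np

short_temps = np.array([86, 85, 85, 84, 91])   # hypothetical five-day sample

changes = np.diff(short_temps)      # each element minus the one before it
ordered = np.sort(short_temps)      # a new sorted array; the original is unchanged

print(changes.tolist())             # [-1, 0, -1, 7]
print(ordered.tolist())             # [84, 85, 85, 86, 91]
print(changes.size)                 # 4: one fewer than the number of days
```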

# Ranges
* A range is an array of consecutive numbers
* ```np.arange(end)```: An array of increasing integers from 0 up to end
* ```np.arange(start, end)```: An array of increasing integers from start up to end
* ```np.arange(start, end, step)```: A range with step between consecutive values
* The range always includes start but excludes end (i.e. a half-open interval)

In [None]:
np.arange(5)

In [None]:
np.arange(3, 9)

In [None]:
np.arange(3, 30, 5)

In [None]:
np.arange(-3, 2, 0.5)

In [None]:
np.arange(1, -3)

In [None]:
np.arange(1, -3, -1)
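
A compact sketch of the half-open rule, assuming NumPy. It also compares `np.arange` with Python's built-in `range`, which follows the same start, end, step convention for integers, and shows why counting down needs a negative step.

```python
import numpy as np

stepped = np.arange(3, 30, 5)
print(stepped.tolist())               # [3, 8, 13, 18, 23, 28]  (30 is excluded)
print(list(range(3, 30, 5)))          # [3, 8, 13, 18, 23, 28]  (built-in range agrees)

empty = np.arange(1, -3)              # the default step is +1, so nothing lies between 1 and -3
counting_down = np.arange(1, -3, -1)  # a negative step counts down; -3 is excluded
print(empty.size)                     # 0
print(counting_down.tolist())         # [1, 0, -1, -2]

halves = np.arange(-3, 2, 0.5)        # a non-integer step gives an array of floats
print(halves.size)                    # 10
```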

### Discussion Question
Assume you have run the following commands:
```
x = make_array(2,3,4)
y = np.arange(2,3,4)
z = np.arange(3)
```
Which of the following expressions will cause an error?
* A. ```x + y```
* B. ```x + z```
* C. ```x.item(0) + y.item(0)```
* D. ```x.item(1) + y.item(1)```