[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/remingtonsexton/PHYS-050-Introduction-to-Applied-Data-Science/blob/master/notebooks/1-A-Crash-Course-In-Python.ipynb)


# Introduction to Applied Data Science - PHYS 050
# 1 - A Crash Course in Python

This notebook is based on Chapters 1-3 of a book called *A Primer on Scientific Programming with Python* by Hans Petter Langtangen.

This notebook will serve as a basic introduction to the language, its syntax, and features.

### `print` Statements
The most basic thing to do with a programming language is to print something.  This is done using the `print()` function, with whatever you want to print inside the parentheses:

In [1]:
print('\n Hello, world! \n') # This is a comment


 Hello, world! 



The `\n` escape sequence inserts a newline, so a blank line is printed before and after "Hello, world!".  Text enclosed in quotes (single `''` or double `""`) is called a ***string***.  The `#` symbol indicates that anything after it on that line is "commented out" and is ignored by Python.  Comments are useful for annotating your code to remind you what things do.  You can check the type of an object by printing the result of `type()`, like so:

In [2]:
print(type('cat')) 

<class 'str'>


An integer is an ***int*** type:

In [3]:
print(type(42))

<class 'int'>


A number with a decimal point is called a ***float***:

In [4]:
print(type(234.43))

<class 'float'>


We can assign any value to a variable so that we can easily reference it anywhere in the code.  For example:

In [5]:
a = 'cat' # assign 'cat' as a variable a
print(a)

cat


You can also turn an `int` or `float` type into a string:

In [6]:
a = 3.1415 # make a float
print(a) # print it
print('a is a %s' % type(a)) # check its type
a = str(3.1415) # turn the float into a string
print(a) # print it
print('a is now a %s' % type(a)) # check its type again

3.1415
a is a <class 'float'>
3.1415
a is now a <class 'str'>


Above we used ***string formatting*** to insert the value of `type(a)` into a string.  The `%s` placeholder is replaced by the string form of whatever follows the `%` operator after the closing quote.  Let's see another example, where we insert a string with `%s`, an `int` with `%d`, and a `float` with `%f`.



In [7]:
a = 45346.945345345
b = 4.0
c = 'cat'
print('\n We can print a using only two decimal places like this: %0.2f' % a )
print('\n or four decimal places: %0.4f' % a)
print('\n We can also print a as an int, but the decimal part is dropped (not rounded): %d' % a)
print('\n We can round a to the nearest int using round(a): %d' % round(a))
print('\n We can print multiple arguments like %0.2f and %d and %s' % (a,b,c) )


 We can print a using only two decimal places like this: 45346.95

 or four decimal places: 45346.9453

 We can also print a as an int, but the decimal part is dropped (not rounded): 45346

 We can round a to the nearest int using round(a): 45347

 We can print multiple arguments like 45346.95 and 4 and cat
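
The same formatting can also be written with f-strings, which are available in Python 3.6 and later. The notebook itself only uses `%` formatting, so treat the sketch below as an optional alternative; the names `two_places`, `as_int`, and `combined` are made up for illustration.

```python
a = 45346.945345345
b = 4.0
c = 'cat'

# An f-string evaluates the expression inside {} and applies the format given after the colon
two_places = f'{a:0.2f}'                      # same result as '%0.2f' % a
as_int = f'{int(a)}'                          # int() drops the decimal part, like %d does for a float
combined = f'{a:0.2f} and {int(b)} and {c}'   # several values in one string

print(two_places)
print(as_int)
print(combined)
```

Both styles produce the same text here; which one to use is mostly a matter of taste.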


### Mathematical Operations

Probably the most useful thing you can do with a programming language is math:

In [8]:
a = 7
b = 6
c = a + b # add the ints
print(a)
print(b)
print(a+b)
print(c)

7
6
13
13


**Warning**: if you convert these numbers to strings using `str()`, the `+` operator joins (concatenates) them instead of adding them:

In [9]:
a = str(7) # convert int to string
b = str(6) # convert int to string
c = a + b # add the strings
print(a)
print(b)
print(a+b)
print(c)

7
6
76
76
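
Going the other way, the built-in `int()` and `float()` functions turn a numeric string back into a number so that `+` adds again. A minimal sketch, with variable names chosen only for illustration:

```python
a = '7'   # a string, not a number
b = '6'

joined = a + b            # string concatenation
total = int(a) + int(b)   # convert to ints first, then add
half = float('3.5') / 2   # float() handles strings with a decimal point

print(joined)
print(total)
print(half)
```

Note that `int('3.5')` raises a `ValueError`, since the string does not describe a whole number.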


We can also do other types of mathematical operations:

In [10]:
a = 7
b = 6
print('Addition: a + b = %d' % (a+b) ) # Addition
print('Subtraction: a - b = %d' % (a-b) ) # Subtraction
print('Multiplication: a * b = %d' % (a*b) ) # multiplication
print('Division: a / b = %0.2f' % (a/b) )
print('Powers: a**b = %d' % (a**b)) # exponentiation uses **, not ^

Addition: a + b = 13
Subtraction: a - b = 1
Multiplication: a * b = 42
Division: a / b = 1.17
Powers: a**b = 117649


What about square roots? For that, we need to import the `math` module.  In general, we can import any module by typing `import module_name`, or we can import the module under a shorter alias, like so:

In [11]:
import math as m

Now we can call any number of mathematical `functions` supported by the `math` module.  You can find them all here [(Python math module)](https://docs.python.org/3/library/math.html).  Here are some examples:

In [12]:
print('Square Root: sqrt(a) = %0.2f' % (m.sqrt(a)))
print('Exponential: e^a = %0.2f' % (m.exp(a)))
print('Base-10 Logarithm: log10(a) = %0.2f' % (m.log10(a)))
print('Sine function: sin(a) = %0.2f' % (m.sin(a)) ) # radians 
print('pi = %0.100f' % m.pi)

Square Root: sqrt(a) = 2.65
Exponential: e^a = 1096.63
Base-10 Logarithm: log10(a) = 0.85
Sine function: sin(a) = 0.66
pi = 3.1415926535897931159979634685441851615905761718750000000000000000000000000000000000000000000000000000


You'll notice above that when we tried to print the first 100 decimal places of $\pi$, the digits eventually turned into zeros (and the digits after roughly the 15th decimal place are not the true digits of $\pi$).  This is because most programming languages represent floating-point numbers with a fixed, finite number of binary digits (you can read all about it [here](https://en.wikipedia.org/wiki/Floating-point_arithmetic)). All you need to know is that most decimal numbers cannot be represented exactly as a float:

In [13]:
0.1 + 0.1 + 0.1 

0.30000000000000004

This can be problematic if you were expecting the exact value:

In [14]:
0.1 + 0.1 + 0.1 == 0.3

False

This seems kind of dumb, but in practice you'll rarely notice it.  Just be aware that this happens and that if a float undergoes enough operations, this ***rounding error*** can accumulate, causing a calculated value to deviate significantly from the exact one.
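
When two floats need to be compared, a common approach is to check whether they are close rather than exactly equal. The sketch below uses `math.isclose()` from the standard library (Python 3.5+); the tolerance `1e-9` in the manual version is an arbitrary choice for illustration.

```python
import math

total = 0.1 + 0.1 + 0.1

exact_match = (total == 0.3)                 # exact comparison fails
close_match = math.isclose(total, 0.3)       # compares within a relative tolerance
manual_match = abs(total - 0.3) < 1e-9       # the same idea written by hand

print(exact_match, close_match, manual_match)
```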


Now that we've introduced mathematical functions, what if we want to make our own functions?

### Functions

Let's say you wanted to compute a formula that converts Celsius to Fahrenheit:

\begin{equation}
F = \frac{9}{5}C+32
\end{equation}

Let's perform that calculation with $C = 25$: 

In [15]:
C = 25
F = 9/5*C+32
print('%0.1f Celsius is %0.1f degrees Fahrenheit.' % (C,F))

25.0 Celsius is 77.0 degrees Fahrenheit.


This is cool and all, but we're going to have to type out the equation every time we want to calculate $F$ for a new value of $C$.  Luckily, we can define a ***function***, which takes a value `C`, performs an operation, and returns a value `F`, like so:

In [16]:
def celsius_to_fahrenheit(C): # you can name the function whatever you want, in this case 'celsius_to_fahrenheit', and it takes 1 argument C
    print(' Input: %0.1f C' % C)
    F = 9/5*C+32 # perform operation
    print(' Output: %0.1f F' % F)
    return F # have the function return a value F

Now we can call our function by giving it a value and it will return F:

In [17]:
C = 44.2
F = celsius_to_fahrenheit(C)

 Input: 44.2 C
 Output: 111.6 F


The output of the function is stored in `F`:

In [18]:
print(F)

111.56
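
As a further illustration, the formula can be solved for $C$, giving $C = \frac{5}{9}(F - 32)$, and wrapped in a second function. This is only a sketch: `fahrenheit_to_celsius` is a hypothetical name that does not appear in the notebook, and `celsius_to_fahrenheit` is repeated here without its `print` calls so the block runs on its own.

```python
def celsius_to_fahrenheit(C):
    return 9/5*C + 32       # same formula as above

def fahrenheit_to_celsius(F):   # hypothetical inverse function
    return (F - 32) * 5/9       # the formula above, solved for C

C_from_77 = fahrenheit_to_celsius(77.0)
round_trip = fahrenheit_to_celsius(celsius_to_fahrenheit(44.2))

print(C_from_77)
print('%0.1f' % round_trip)   # formatted, since floats may carry tiny rounding errors
```

Functions can be passed the result of other functions, as in the `round_trip` line.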


### Lists & Tuples, Indexing, and Slicing

A ***tuple*** is a sequence enclosed in parentheses.  A ***list*** is a sequence enclosed in square brackets:

In [19]:
a = (1,2,3,4,5) # a tuple
print(a, type(a)) # print a and its type
b = [6,7,8,9,10] # a list
print(b,type(b)) # print b and its type

(1, 2, 3, 4, 5) <class 'tuple'>
[6, 7, 8, 9, 10] <class 'list'>


Each element in a tuple or list has an ***index***.  Lists and tuples in Python are "zero-indexed", which means the first index in a list/tuple is 0, the second index is 1, the third index is 2, etc.  You can think of an index as an element's location in the list.  We can print the element at the $i^{th}$ index like so:

In [20]:
print( a[0]) # The first element in a has an index of 0
print( a[2]) # The third element in a has an index of 2
print( a[-1]) # The last element in a has an index of -1; think of it like counting backwards
print( a[-2]) # The second to last element has an index of -2, and so on...

1
3
5
4


We can also extract parts of lists and tuples.  Let's say we only wanted the last three elements of `a`.  We can ***slice*** the tuple using the `:` symbol:

In [21]:
print(a) # before manipulation
print(a[:]) # the full tuple
print(a[2:]) # last three elements: start from index 2 (the third element) and keep the rest
print(a[-3:]) # last three elements: start from the third-to-last element and keep the rest

(1, 2, 3, 4, 5)
(1, 2, 3, 4, 5)
(3, 4, 5)
(3, 4, 5)


In the above, `a[2:]` gives the same result as `a[-3:]` because `a` has five elements.

Or if we wanted the middle three elements:

In [22]:
print( a[1:4] )
print( a[1:-1] ) 

(2, 3, 4)
(2, 3, 4)


The `a[1:-1]` method is useful if we don't know the length of `a`.  But we can check the length of `a` using `len()`:

In [23]:
print( len(a)) # len returns the length of a tuple/list

5
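
To see why the negative-index form is convenient, the sketch below applies the same slice to two tuples of different lengths. The tuples `short` and `longer` are made up for illustration.

```python
short = (1, 2, 3, 4, 5)
longer = (10, 20, 30, 40, 50, 60, 70)

# [1:-1] drops the first and last element, whatever the length is
middle_short = short[1:-1]
middle_longer = longer[1:-1]

# the equivalent slice written with len() instead of a negative index
middle_with_len = longer[1:len(longer)-1]

print(middle_short)
print(middle_longer)
print(middle_with_len)
```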


You can slice both lists and tuples.  The main difference between them is that we cannot change the values of a tuple (tuples are *immutable*), but we can change the values of a list.  Say we wanted to change the 0th index (first element) of `b` (which is a list) to a value of 9:

In [24]:
print(b)
b[0] = 9 # change of first value of b
print(b)

[6, 7, 8, 9, 10]
[9, 7, 8, 9, 10]


You cannot do this same change with `a` because `a` is a tuple.  If you try, you'll get an error:

In [25]:
print(a)
a[0] = 9 # change of first value of a
print(a)

(1, 2, 3, 4, 5)


TypeError: 'tuple' object does not support item assignment
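
If the contents of a tuple really do need to change, one possible workaround is to copy it into a list, edit the list, and build a new tuple from it. This is a sketch of that idea rather than something the notebook relies on; `a_list` and `a_new` are illustrative names.

```python
a = (1, 2, 3, 4, 5)

a_list = list(a)       # copy the tuple's elements into a list
a_list[0] = 9          # lists allow item assignment
a_new = tuple(a_list)  # build a new tuple from the edited list

print(a)      # the original tuple is unchanged
print(a_new)
```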

Lists are usually more useful than tuples. Keep in mind, lists and tuples are ***not*** arrays/matrices.  If you add two lists or tuples, all you do is ***concatenate*** them:

In [26]:
c = b+b
print(c)
d = a+a
print(d)

[9, 7, 8, 9, 10, 9, 7, 8, 9, 10]
(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)
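
If element-by-element addition is what you actually want, it has to be spelled out. The sketch below uses a list comprehension, a compact form of the `for` loop covered later in this notebook; the name `elementwise` is an assumption for illustration.

```python
b = [9, 7, 8, 9, 10]

concatenated = b + b   # + joins the two lists end to end

# add the elements at matching indices, one index at a time
elementwise = [b[i] + b[i] for i in range(len(b))]

print(concatenated)
print(elementwise)
```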


### Dictionaries

A dictionary is a collection of values, each of which is stored under a name called a ***key***.  You can create a dictionary using braces `{}`:

In [27]:
my_dict = {
    'a':1,
    'b':(2,3,4),
    'c':[5,6,7],
    'd':'cat'
}
print(my_dict)

{'a': 1, 'b': (2, 3, 4), 'c': [5, 6, 7], 'd': 'cat'}


You can call the elements by name:

In [28]:
print (my_dict['a'])
print (my_dict['d'])

1
cat
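
Dictionaries can also be modified after they are created. The following sketch shows a few common operations on the dictionary from above; the key `'e'`, the key `'z'`, and the variable names are made up for illustration.

```python
my_dict = {'a': 1, 'b': (2, 3, 4), 'c': [5, 6, 7], 'd': 'cat'}

my_dict['e'] = 2.5     # assigning to a new key adds an entry
my_dict['a'] = 100     # assigning to an existing key overwrites its value
has_c = 'c' in my_dict                    # "in" checks whether a key exists
missing = my_dict.get('z', 'not found')   # .get() returns a default instead of raising KeyError

print(list(my_dict.keys()))
print(has_c, missing)
```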


### `while` Loops

Recall the function we made earlier, copied below for convenience:

In [29]:
def celsius_to_fahrenheit(C): # you can name the function whatever you want, in this case 'celsius_to_fahrenheit', and it takes 1 argument C
    F = 9/5*C+32 # perform operation
    return F # have the function return a value F

Now say that instead of calling this function to calculate $F$ for a single value such as $C=25$, you would like to calculate it for a range or list of values such as:

In [30]:
C = [-20,-10,0,10,20,30,40,50]

If you try giving `celsius_to_fahrenheit()` a list, you'll get an error.  Instead, we can use a ***loop*** to iterate through each value in the list `C` and send it to the function one at a time.  The first kind of loop we can use is a **`while`** loop.

A `while` loop is used to repeat a set of statements as long as a condition is true.  For example:

In [31]:
C = -20 # Start C at -20
dC = 10 # this will be the value that we increment C by each time we go through a loop
while C<=50: # while the value of C is less than or equal to 50,
    F = celsius_to_fahrenheit(C) # call the function 
    print(C, F)
    C = C + dC # increase C by dC before the start of the next iteration

-20 -4.0
-10 14.0
0 32.0
10 50.0
20 68.0
30 86.0
40 104.0
50 122.0


Let's go through the above `while` loop, step-by-step:
1. We started by choosing a value `C=-20` to start.  This is also the first value in our list `C` above.
2. We chose a value `dC=10` which is how much we increase `C` by each iteration.
3. We enter the `while` loop with `C=-20`
4. We send `C=-20` to `celsius_to_fahrenheit()`, and get a value `F`.
5. The last thing we do is increase the original value of `C` by `dC` using `C = C + dC`.  We overwrite the old `C` and create a new `C` whose value increases by `dC`.  Now `C=-10`. 
6. Return to the beginning of the loop with `C=-10`
7. Repeat until `C>50`, at which point the loop stops.
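
The steps above can be checked by storing each pass through the loop in a list instead of printing it. This is a sketch under the assumption that we want to inspect the loop afterwards; the list name `results` and the use of `append()` are additions that do not appear in the notebook.

```python
def celsius_to_fahrenheit(C):
    return 9/5*C + 32

C = -20
dC = 10
results = []   # will hold one (C, F) pair per iteration

while C <= 50:
    results.append((C, celsius_to_fahrenheit(C)))  # record this iteration
    C = C + dC                                     # step 5: move on to the next value

print(len(results))   # number of times the loop body ran
print(C)              # the first value of C that failed the condition C <= 50
```

If the line `C = C + dC` were left out, `C` would never change and the loop would run forever.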

Alternatively, we can *increment* C by saying `C+=dC`, which is the same as saying `C = C + dC`, as shown below:

In [32]:
C = -20 # Start C at -20
dC = 10 # this will be the value that we increment C by each time we go through a loop
while C<=50: # while the value of C is less than or equal to 50,
    F = celsius_to_fahrenheit(C) # call the function 
    print(C, F)
    C += dC # increase C by dC before the start of the next iteration

-20 -4.0
-10 14.0
0 32.0
10 50.0
20 68.0
30 86.0
40 104.0
50 122.0


The shorthand `C+=dC` also applies to other operations:

In [33]:
# Addition
C = -20
C+=10 # same as C = C + 10
print(C)
# Subtraction
C = -20
C-=10 # same as C = C - 10
print(C)
# Multiplication
C = -20
C*=10 # same as C = C * 10
print(C)
# Division
C = -20
C/=10 # same as C = C / 10
print(C)

-10
-30
-200
-2.0


We also encountered `C<=50` in our `while` loop, which is a ***boolean*** expression.  A boolean expression evaluates to either `True` or `False`.  For example:

In [34]:
C = 10
print(C < 0)
print(C > 0)

False
True


Some other boolean expressions we could have used are:

In [35]:
print(C == 40) # does C = 40? (notice the double == )
print(C != 40) # does C not equal 40?
print(C >= 40) # is C greater or equal to 40?
print(C > 40) # is C greater than 40?
print(C < 40) # is C less than 40?

False
True
False
False
True


Finally, we can combine multiple boolean expressions to test several conditions at once.  We do this using `and` (true only if both sides are true) or `or` (true if at least one side is true):

In [36]:
C = 20
print(C>0 and C>-20) # multiple conditions using "and"
print(C<0 or C>10) # multiple conditions using "or"

True
True


`while` loops are useful in only a handful of situations.  For most applications, we can avoid `while` loops, and use `for` loops instead.

### `for` Loops

With the `while` loop, we totally forgot about the list we made, `C = [-20,-10,0,10,20,30,40,50]`, and we had to write the `while` loop knowing where the list started, where it ended, and the spacing between each element (`dC=10`).

With a `for` loop, we can iterate through the list `C` itself:


In [37]:
C = [-20,-10,0,10,20,30,40,50]
# using a for loop to iterate through the list C
for degree in C: 
    print(degree)

-20
-10
0
10
20
30
40
50


Now we can call our function on each value using the `for` loop:

In [38]:
for degree in C:
    F = celsius_to_fahrenheit(degree)
    print(degree, F)

-20 -4.0
-10 14.0
0 32.0
10 50.0
20 68.0
30 86.0
40 104.0
50 122.0


Using the `for` loop, we avoided the increment `dC` and `C+=dC`.  Furthermore, our list no longer has to be evenly spaced, since we just go through each value in the list.

What if we have two lists, and we want to iterate through both of them at the same time?  We can use the `range()` function to iterate over the indices of a list, and use each index to access the elements:

In [39]:
for index in range(len(C)): # index runs from 0 to len(C)-1
    F = celsius_to_fahrenheit(C[index])
    print(index, C[index], F) # print out index, C, and F for each iteration

0 -20 -4.0
1 -10 14.0
2 0 32.0
3 10 50.0
4 20 68.0
5 30 86.0
6 40 104.0
7 50 122.0


In [40]:
A = [1,2,3,4,5]
B = [6,7,8,9,10]

for i in range(len(A)):
    print(i, A[i], B[i]) # prints index, A, and B for each index

0 1 6
1 2 7
2 3 8
3 4 9
4 5 10
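
An alternative to indexing with `range(len(...))` is to use the built-in functions `zip()`, which pairs up the elements of two lists, and `enumerate()`, which supplies the index. The sketch below should reproduce the output above; the list `rows` is an assumption added only so the result can be inspected.

```python
A = [1, 2, 3, 4, 5]
B = [6, 7, 8, 9, 10]

rows = []
# zip(A, B) yields (1, 6), (2, 7), ...; enumerate() adds a running index i
for i, (x, y) in enumerate(zip(A, B)):
    rows.append((i, x, y))
    print(i, x, y)
```

Note that `zip()` stops at the end of the shorter list, so lists of unequal length will be silently cut short.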


### Conditional Statements: `if`, `elif`, and `else` statements

We can also test a value using conditional `if`, `elif`, and `else` statements.  These statements are useful for identifying certain values in a list.

Let's iterate through a list of the numbers 1-10.  If the number is odd, print 'cat'; if the number is even, print 'dog'.  A number is even if there is no remainder when it is divided by 2, i.e., the number modulo 2 is zero.  Otherwise, it is odd:

\begin{align*}
n\;&\%\;2 = 0, \quad\text{even}\\
n\;&\%\;2 \neq 0, \quad\text{odd}
\end{align*}
where $n$ is the number, and $\%$ is the modulo operator. So, 
1. Iterate through the list using a `for` loop
2. Determine if the number is even or odd
3. Print 'cat' if odd, print 'dog' if even.

In [41]:
A = [1,2,3,4,5,6,7,8,9,10]

for n in A:
    if n%2==0: # if even
        print(' dog')
    else:
        print(' cat')

 cat
 dog
 cat
 dog
 cat
 dog
 cat
 dog
 cat
 dog


Notice that we did not have to test for 'odd' explicitly, since a number that is not even must be odd.  We could have also been more explicit and added an `elif` statement, which is only checked if the preceding `if` condition is false:

In [42]:
A = [1,2,3,4,5,6,7,8,9,10]

for n in A:
    if n%2==0: # if even
        print(' dog')
    elif n%2!=0:
        print(' cat')

 cat
 dog
 cat
 dog
 cat
 dog
 cat
 dog
 cat
 dog
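
The three keywords are most useful together, when there are more than two possible outcomes. The sketch below sorts temperatures into three groups; the function name `classify_temperature`, the labels, and the thresholds of 0 and 25 degrees are all made up for illustration.

```python
def classify_temperature(C):   # hypothetical helper
    if C <= 0:         # checked first
        return 'freezing'
    elif C < 25:       # only checked if the first condition was False
        return 'mild'
    else:              # everything that is left over
        return 'hot'

labels = []
for degree in [-20, 0, 10, 30]:
    labels.append(classify_temperature(degree))

print(labels)
```

The conditions are tested from top to bottom, and only the first one that is true has its block executed.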


## Other Useful Tutorials and Resources

This was a very basic crash course in Python.  We still haven't covered other, more useful packages such as Matplotlib, NumPy, or SciPy (yet!). If you didn't find this tutorial helpful, the internet is full of Python tutorials, probably more helpful than this one:
- [The official Python 3 Tutorial (online)](https://docs.python.org/3/tutorial/)
- [DataQuest Jupyter Notebook tutorial (online)](https://www.dataquest.io/blog/jupyter-notebook-tutorial/)
- [Learn Python - Full 4 Hour Course - No Ads (video)](https://www.youtube.com/watch?v=rfscVS0vtbw)
- [Data Science from Scratch (textbook)](https://www.amazon.com/Data-Science-Scratch-Principles-Python/dp/1492041130/ref=sr_1_3?dchild=1&keywords=data+science+from+scratch&qid=1592988287&sr=8-3)
- [A Primer on Scientific Programming with Python (textbook PDF)](https://hplgit.github.io/primer.html/doc/pub/half/book.pdf)