# Intro to Python: Essential Essentials

Now that we've added the syllabus to our class GitHub repo and created a conda environment, it's time to get moving with our first Python notebook. The overall goal is to learn all of the basics of Python, with an emphasis on code transparency; efficiency comes later.

We begin by importing some of the libraries we'll use in this section.

In [None]:
import matplotlib as mp
import seaborn as sb
import numpy as np
import scipy as sy
import pandas as pd

## Basic data types

The essential data types in Python are integers, floats, strings, and booleans. Let's play with them.

In [None]:
# Comments start with a hash (#) and run to the end of the line
a = 1.2
b = 1
c = 'hello world'
d = True
print(a,'\n',b,'\n',c,'\n',d,'\n',type(a),type(b),type(c),type(d))

In [None]:
# Assignment can occur in one line
a, b, c, d = 1.2, 1, 'hello world', True

## Typecasting
You can cast variables across types. Some operations may be possible for one datatype and not another.

In [None]:
int(a)

In [None]:
int(d)

In [None]:
float(b)

In [None]:
int(c)  # raises ValueError: 'hello world' is not a valid integer literal

In [None]:
c + ' ' + str(a)
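
The failing cast above (`int(c)`) stops the cell with a `ValueError`. A minimal sketch of one way to handle that gracefully is shown here; the helper name `to_int_or_none` is hypothetical and not part of the original notebook.

```python
def to_int_or_none(value):
    """Return int(value), or None if the cast is not possible."""
    try:
        return int(value)
    except (ValueError, TypeError):
        # ValueError: a string that does not look like an integer
        # TypeError: a type that int() cannot handle at all (e.g. None)
        return None

print(to_int_or_none(1.2))            # float -> int drops the fractional part
print(to_int_or_none(True))           # bool -> int
print(to_int_or_none('42'))           # numeric string -> int
print(to_int_or_none('hello world'))  # not a number, so the helper returns None
```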

## Indexing
Indexing is a way of referring to one element of a larger set. You may index into a list, string, dataframe, etc.

In [None]:
c[1:5]

In [None]:
c = "hello world, this is a computational physics class!"
c.split(' ')

In [None]:
c.split(' ')[2]

In [None]:
c.split(' ')[0]
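
As a short recap of the indexing rules used above, here is a sketch on a hypothetical string `greeting`. Indices start at 0, a slice `start:stop` excludes `stop`, and negative indices count from the end.

```python
greeting = 'hello world'  # hypothetical example string

print(greeting[0])     # first character
print(greeting[1:5])   # characters at positions 1, 2, 3 and 4
print(greeting[:5])    # everything before position 5
print(greeting[-1])    # last character

words = greeting.split(' ')  # split on single spaces into a list of words
print(words[-1])             # last word
```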

## Conditionals and Loops

Conditionals are the bread and butter of programming. They allow us to make decisions based on the state of our program. The basic syntax is:

In [None]:
if 7*7 == 49:
    print("I know basic math")
else:
    print("I need to go back to school")

In [None]:
course = "statmech"
if course == "qft":
    print("I'm in the quantum field theory course")
elif course == "statmech":
    print("I'm in the statistical mechanics course")
elif course == "comp":
    print("I'm in the computational physics course")
else:
    print("You didn't list my course!")

Loops are also very important. They allow us to repeat a task many times.

In [None]:
counter, result = 0, 0
while counter < 10:
    result += counter**2 # this is the same as result = result + counter**2
    counter += 1
print(result,1**2+2**2+3**2+4**2+5**2+6**2+7**2+8**2+9**2)

In [None]:
for i in range(4):
    print(i)

In [None]:
list(range(10))

In [None]:
result = 0
for counter in range(10):
    result += counter**2
    
print(result)
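
One way to sanity-check the loops above is to compare against the built-in `sum` and the closed form for the sum of the first $n-1$ squares, $\sum_{k=0}^{n-1} k^2 = (n-1)n(2n-1)/6$. The variable names here are illustrative only.

```python
n = 10

# Same quantity as the while and for loops above, written with the built-in sum
loop_free_result = sum(k**2 for k in range(n))

# Closed form for 0^2 + 1^2 + ... + (n-1)^2
closed_form = (n - 1) * n * (2*n - 1) // 6

print(loop_free_result, closed_form)
```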

## Collections

Essential types of collections are lists, tuples, dictionaries, and sets. We'll go through each of them.

In [None]:
a, b, c = [1,2,2,4], (1,2,2,4), {1,2,2,4} # list, tuple, set
teachers = {'joe':'physics', 'jane':'math', 'jill':'english'} # dictionary
a[0], b[0], teachers['joe']

In [None]:
c[0]  # raises TypeError: sets are unordered, so they cannot be indexed

In [None]:
a[-1],a[-2]

In [None]:
print(c)

In [None]:
for idx, val in enumerate(a):
    print(idx,val,val**2)

In [None]:
for val in a: print(val**2)
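
The cells above hint at how the collections differ. The sketch below spells out three of those differences with hypothetical variable names: lists are mutable, tuples are not, and sets drop duplicates. It also shows one common way to loop over a dictionary.

```python
numbers_list = [1, 2, 2, 4]
numbers_tuple = (1, 2, 2, 4)
numbers_set = {1, 2, 2, 4}

numbers_list[0] = 10          # lists can be changed in place
try:
    numbers_tuple[0] = 10     # tuples cannot
except TypeError as err:
    print('tuples are immutable:', err)

# The duplicate 2 is stored only once in the set
print(len(numbers_list), len(numbers_tuple), len(numbers_set))

teachers = {'joe': 'physics', 'jane': 'math', 'jill': 'english'}
for name, subject in teachers.items():  # iterate over key/value pairs
    print(name, 'teaches', subject)
```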

You're a physicist and probably think sequences are cool.

In [None]:
fibonacci = [1,1] 
for i in range(10):
    fibonacci.append(fibonacci[-1] + fibonacci[-2])
    print(fibonacci)
print(fibonacci)

This one is a little complicated because each new element depends on earlier elements of the same list. For sequences whose terms don't, we have some clever tricks.

In [None]:
squares = []
for i in range(10):
    squares.append(i**2)
print(squares)

**But there is a cuter pythonic way to do it**
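
The "cuter" way is presumably a list comprehension, which builds the list in a single expression. A minimal sketch follows; `even_squares` is an extra illustration that is not in the original notebook.

```python
# Same result as the append loop above, in one line
squares = [i**2 for i in range(10)]
print(squares)

# A comprehension can also filter: keep only the squares of even numbers
even_squares = [i**2 for i in range(10) if i % 2 == 0]
print(even_squares)
```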

Let's check out a difference between lists and sets according to their lookup time (tuples behave like lists here).

In [None]:
maxval = 10000000
biglist = list(range(maxval))

import random
randoms_between_0_maxval = [random.randint(0,maxval) for _ in range(250)]
for rand in randoms_between_0_maxval:
    rand in biglist


In [None]:
for k in biglist[:10]:
    print(k)

In [None]:
maxval = 10000000
bigset = set(range(maxval))

for rand in randoms_between_0_maxval:
    rand in bigset

Sets aren't subscriptable because they are unordered. But they are very useful for checking membership because they are hash-based and have O(1) lookup time on average.

Lists, on the other hand, are ordered and subscriptable. But a membership test has to scan the list element by element, so it has O(n) lookup time.

In [None]:
biglist[700000]

In [None]:
bigset[700000]  # raises TypeError: 'set' object is not subscriptable
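
To put rough numbers on the lookup-time claim, here is a sketch using the standard-library `timeit` module. The size `n` and the names `small_list` and `small_set` are assumptions chosen to keep the cell quick; the absolute timings will vary by machine.

```python
import timeit

n = 200_000                    # smaller than maxval above, to keep this fast
small_list = list(range(n))
small_set = set(small_list)
target = n - 1                 # worst case for the list: the last element

# Time 50 membership tests against each container
list_time = timeit.timeit(lambda: target in small_list, number=50)
set_time = timeit.timeit(lambda: target in small_set, number=50)

print(f'list: {list_time:.4f} s, set: {set_time:.6f} s')
```

The list has to compare against every element before it finds `target`, while the set goes (on average) straight to the right hash bucket.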

# Plotting

We need to learn how to plot stuff.

In [None]:
from matplotlib import pyplot as plt

#reduces dpi so figures are smaller
plt.rcParams['figure.dpi'] = 75

# let's make a parabola 
x, y = [i for i in range(-10,11)], [i**2 for i in range(-10,11)]
plt.plot(x,y)

In [None]:
import seaborn as sb
import pandas as pd

sb.set_style('darkgrid')

df = pd.DataFrame({'x axis':x, 'y axis':y})
sb.lineplot(x='x axis', y='y axis', data=df)

Other types of plots in matplotlib and seaborn are histograms, scatter plots, and 3D plots.

Let's make a histogram of samples drawn from the normal distribution.

In [None]:
import random

for num_samples in [10**k for k in [2,3,4,5]]:
    gaussian_samples = [random.gauss(0,1) for _ in range(num_samples)]
    sb.displot(gaussian_samples)

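You might notice that graphs are drawn on top of each other! This is a common thing that occurs when several plots are made in one cell. We get around it by calling `plt.show()` after each plot (note that `show` lives in `matplotlib.pyplot`, which we imported as `plt`, not in the top-level `matplotlib` module).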

# Code Lab

Let's try some stuff.

Let's start by computing pi by a series expansion. We'll use the fact that

$$\frac{\pi}{4} = \sum_{n=0}^\infty \frac{(-1)^n}{2n+1}$$

In [None]:
def pi_from_k_terms(k):
    return sum([(-1)**i/(2*i+1) for i in range(k)])*4

How do we check convergence of this series?
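
One possible answer, sketched below: compare the partial sums against `math.pi` for increasing `k`. Because this is an alternating series with terms that shrink in magnitude, the truncation error is bounded by the first omitted term, which is $4/(2k+1)$ here. The function is repeated so the cell is self-contained; the list of `k` values is an arbitrary choice.

```python
import math

def pi_from_k_terms(k):
    """Leibniz series for pi, truncated after k terms (same as the cell above)."""
    return 4 * sum((-1)**i / (2*i + 1) for i in range(k))

for k in [10, 100, 1000, 10000]:
    approx = pi_from_k_terms(k)
    error = abs(approx - math.pi)
    bound = 4 / (2*k + 1)  # magnitude of the first omitted term
    print(f'k={k:6d}  approx={approx:.8f}  error={error:.2e}  bound={bound:.2e}')
```

The error shrinks only like $1/k$, so this series converges slowly.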

Cool, let's try e. We know e from its Taylor series

$$e^x = \sum_{n=0}^\infty \frac{x^n}{n!}$$

You may need the `factorial` function from `math`.

In [None]:
from math import factorial
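
The cell above is left open as an exercise. One possible solution is sketched here, mirroring `pi_from_k_terms`; the name `exp_from_k_terms` and the choice of `k` values are assumptions, not part of the original notebook.

```python
from math import exp, factorial

def exp_from_k_terms(x, k):
    """Taylor series for e**x about 0, truncated after k terms."""
    return sum(x**n / factorial(n) for n in range(k))

# Setting x = 1 gives e itself; compare against math.exp as a reference
for k in [1, 2, 5, 10, 20]:
    approx = exp_from_k_terms(1, k)
    print(f'k={k:2d}  approx={approx:.12f}  error={abs(approx - exp(1)):.2e}')
```

Unlike the series for pi, this one converges very quickly because of the factorial in the denominator.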



Let's do some physics. We'll start with ballistic motion. The equation is

$$y(t) = y_0 + v_0 t - \frac{1}{2} g t^2$$

where $y_0$ is the initial height, $v_0$ is the initial velocity, and $g$ is the acceleration due to gravity. Let's plot this for a few different initial velocities.

In [None]:
def plot_motion(y0, v0):
    g = 9.8  # acceleration due to gravity in m/s^2
    ts = list(range(10))
    y = [y0 + v0*t - 0.5*g*t**2 for t in ts]
    plt.plot(ts,y)
    plt.xlabel('time (s)')
    plt.ylabel('height (m)')
    plt.title("Our Cool Ballistic Motion Plot, Yay, :-D")
    plt.show()

In [None]:
plot_motion(10,0)

In [None]:
plot_motion(10,50)
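
The plots above run over a fixed ten seconds, so the curve can dip below $y = 0$ once the projectile has landed. As a sketch of how to find the landing time, set $y(t) = 0$ and take the positive root of the quadratic, $t = \left(v_0 + \sqrt{v_0^2 + 2 g y_0}\right)/g$. The helper `landing_time` is hypothetical and assumes $y_0 \ge 0$.

```python
import math

def landing_time(y0, v0, g=9.8):
    """Positive root of y0 + v0*t - 0.5*g*t**2 = 0, assuming y0 >= 0."""
    return (v0 + math.sqrt(v0**2 + 2*g*y0)) / g

# Same initial conditions as the two plots above
for y0, v0 in [(10, 0), (10, 50)]:
    t_land = landing_time(y0, v0)
    print(f'y0={y0} m, v0={v0} m/s -> lands after {t_land:.3f} s')
```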

Okay, now let's do some math that appears often in physics: the Central Limit Theorem (CLT). Recall that the CLT states that the suitably normalized sum of a large number of independent, identically distributed random variables with finite variance is approximately normally distributed. Let's test this by summing a bunch of random variables. We'll use the `random` library, which has a number of distributions we can sample from, including uniform, normal, exponential, and gamma.

In [None]:
import math  # needed for sqrt; random was already imported above

sum([random.uniform(-1,1) for _ in range(500)])/math.sqrt(500)

In [None]:
import random
import math 

def sum_of_random_variables(num_in_sum):
    return sum([random.uniform(-1,1) for _ in range(num_in_sum)])/math.sqrt(num_in_sum)

In [None]:
import seaborn as sb
num_samples = 10000

sb.displot([sum_of_random_variables(1) for _ in range(num_samples)])

In [None]:
sb.displot([sum_of_random_variables(2) for _ in range(num_samples)])

In [None]:
import matplotlib.pyplot as pyp

In [None]:
for num_in_sum in range(1,11):
    print(num_in_sum)
    sb.displot([sum_of_random_variables(num_in_sum) for _ in range(num_samples)])
    pyp.show()

In [None]:
sb.displot([sum_of_random_variables(1000) for _ in range(num_samples)])
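
As a numerical complement to the histograms, the sketch below checks the first two moments. A uniform variable on $[-1, 1]$ has mean 0 and variance $(1-(-1))^2/12 = 1/3$, and dividing the sum by $\sqrt{n}$ keeps the variance at $1/3$ for every $n$. The seed and sample sizes are arbitrary choices for reproducibility, and the function is repeated so the cell is self-contained.

```python
import math
import random
import statistics

def sum_of_random_variables(num_in_sum):
    """Sum of num_in_sum uniform(-1, 1) draws, normalized by sqrt(num_in_sum)."""
    return sum(random.uniform(-1, 1) for _ in range(num_in_sum)) / math.sqrt(num_in_sum)

random.seed(0)  # arbitrary fixed seed so the numbers are reproducible
samples = [sum_of_random_variables(50) for _ in range(5000)]

sample_mean = statistics.mean(samples)
sample_var = statistics.variance(samples)
print(f'mean = {sample_mean:.4f} (expected about 0)')
print(f'variance = {sample_var:.4f} (expected about {1/3:.4f})')
```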