Adapted from a notebook from Dr. Micheal Marty's Analytical Chemistry class

Welcome to Python!

Python is a very powerful yet easy to use programming language that has become one of the most commonly used languages, especially in science. There are a variety of ways to use Python. You can run Python in a command line structure and type out commands. You can also write scripts and then run the whole script at once.

Here we are going to use a Python notebook run in the cloud. This is sometimes called iPython or Jupyter.

Some cells, like this, are text only (in a format called Markdown). I'll use these to type out things to know.

Other cells are executable blocks of code. You can run them in different orders and rerun bits of code, but the results of each run will live in memory, as you will see later.

First, replace the ... in the print command below with your name, and hit Shift + Enter.

One important note: this needs to be Python 3. If you see Python 2 in the upper right corner, switch it using Kernel > Change Kernel.

This tutorial is meant to be run in order. If cells are skipped or run out of order, variables might change and break things. The number to the left of a cell shows the order in which the cells were run. A variable keeps the value from the most recently run cell that set it until another cell resets it.

Let's run our first line of Python! Hit Shift + Enter or Ctrl + Enter.

In [None]:
print("Hello World! This is how I print things out in the code!")

Let's start learning how to program. This block of code shows you how to define variables. Once you store these, you can recall them later.

Run this block of code by clicking run or Shift + Enter or Ctrl + Enter.

In [None]:
school = "University of Arizona"
name = "Tom Purcell"

Now let's print these variables out in a message

In [None]:
print(name, "currently works at", school)

We can also use methods built into string objects to manipulate them

In [None]:
lower_name = name.lower()
print(lower_name, "currently works at", school)
print(name, "currently works at", school.upper())

While this will work, it is not the easiest code to read. Let's introduce the f-string concept to make this easier to understand

In [None]:
print(f"{name} currently works at {school}")

The advantage of f-strings is that they can easily include non-string components inside the string

In [None]:
n_months = 3
print(f"{name} currently works at {school}. He has been there for {n_months} months.")

And this allows for easy formatting using the codes found here: https://docs.python.org/3/library/string.html#format-specification-mini-language

In [None]:
print(f"{name} currently works at {school}. He has been there for {n_months:03d} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:+03d} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:>3} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:^3} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:<3} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:f} months.")
print(f"{name} currently works at {school}. He has been there for {n_months:.2f} months.")
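As a quick check of the format codes above, the built-in `format` function applies the same mini-language outside of an f-string. This is only a minimal sketch; the dictionary name `examples` is made up for illustration.

```python
# The built-in format() uses the same format codes as f-strings
n_months = 3

examples = {
    "03d": format(n_months, "03d"),    # pad with zeros to a width of 3
    "+03d": format(n_months, "+03d"),  # always show the sign, pad with zeros to a width of 3
    ">3": format(n_months, ">3"),      # right-align in a width of 3
    "^3": format(n_months, "^3"),      # center in a width of 3
    "<3": format(n_months, "<3"),      # left-align in a width of 3
    "f": format(n_months, "f"),        # fixed point, 6 decimals by default
    ".2f": format(n_months, ".2f"),    # fixed point, 2 decimals
}

for code, text in examples.items():
    # !r adds quotes so the padding spaces are visible
    print(f"{code:>5} -> {text!r}")
```

Printing with `!r` makes it easier to tell the three alignment codes apart, since they only differ in where the spaces go.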

**Problem 1**

Replace name, school and months with the correct values for yourself and print the above statements such that the number of months is printed as a 5 digit number keeping preceding zeroes

Note: anything after a # is a comment and will not be interpreted as code

In [None]:
# Your code goes here
name = "Student A"
school = "University of Arizona"
months = 4
print(f"{name} is currently enrolled at {school}. He has been there for {months:05d} months.")

While string manipulation is important for describing information, chemistry problems normally are numeric in nature, so let's now do some math!

In [None]:
a = 1
b = 2
c = a + b
print(f"c = {c}")

**Problem 2**

Define a new variable d that is the product of c and b

In [None]:
# Your code goes here
d = c * b

What happens with division? What is a / b?

In [None]:
a / b

Here Python automatically converts the integers into a float when doing division; not all programming languages do this. To see what happens in those languages, use the // operator

In [None]:
a // b

The // operator is integer (floor) division. It does the division and discards anything after the decimal point, rounding down to the nearest integer. To get the remainder, use the modulus operator %

In [None]:
print(a % b)
print(19 % 7)
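The two operators fit together: the quotient times the divisor plus the remainder always gives back the original number. Below is a small sketch of that identity, including a negative number to show that // rounds down rather than toward zero. The variable names (`num`, `den`, `floored`, `truncated`) are chosen here for illustration and are not used elsewhere in this notebook.

```python
# Floor division and modulus always satisfy: num == (num // den) * den + (num % den)
for num, den in [(19, 7), (1, 2), (-19, 7)]:
    quotient = num // den
    remainder = num % den
    assert num == quotient * den + remainder
    print(f"{num} // {den} = {quotient}, {num} % {den} = {remainder}")

# divmod returns the quotient and the remainder in one call
q19, r19 = divmod(19, 7)

# For negative numbers // rounds down (toward negative infinity) ...
floored = -19 // 7
# ... while int() on the float result drops the decimals (rounds toward zero)
truncated = int(-19 / 7)
print(floored, truncated)
```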

We can use this to split longer integers into subgroups, like what Gaussian does for the range separation values

Notice how inside the f-strings we can also include expressions, not just variables

In [None]:
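# Drop the leading zero; Python does not accept integer literals with leading zeros
omega = 110002200

# Split the integer into its upper digits and its lowest five digits
int_lr = omega // 100000
int_sr = omega % 100000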

print(f"lr: {int_lr / 10000:.5f}; sr: {int_sr / 10000:.5f}")

**Problem 3**

What are the division, integer division, and modulus of c over d and d over c?

In [None]:
# Your code goes here
print(c / d)
print(c // d)
print(c % d)

print(d / c)
print(d // c)
print(d % c)

Doing single operations is good in Python, but that is something that can be done in Excel or on a calculator. The power of Python is its ability to act on collections of data.

The main Python collection of data is the list, but it is not as useful for scientific processing because it does not keep track of the type of its data or let us easily apply mathematical operations to the whole collection. However, the array class from the numpy library solves this issue for us.

To use this library, though, we need to import it

In [None]:
import numpy as np

x = np.array([0, 1, 2, 3, 4])
y = x ** 2
print(y)

What happens if we accidentally made one of the entries in x a string though?

In [None]:
x = np.array([0, 1, "2", 3, 4])
y = x ** 2
print(y)

This raises an error (fails for a specified reason) because x was automatically cast as (converted into) an array of strings and not numeric values. The list contains both strings and ints; any int can be converted into a string, but not all strings can be converted into ints, so numpy stores every entry as a string.

In [None]:
print(x)
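One way to spot this kind of problem is to look at the array's `dtype`, and one way to repair it is `astype`. The sketch below assumes every string in the array really does hold an integer; the names `mixed` and `fixed` are made up so that `x` is left alone.

```python
import numpy as np

# Hypothetical array with one entry accidentally entered as a string
mixed = np.array([0, 1, "2", 3, 4])

# The dtype shows how numpy stored the data; kind "U" means unicode strings
print(mixed.dtype, mixed.dtype.kind)

# Convert the strings back to integers before doing any math
fixed = mixed.astype(int)
print(fixed.dtype.kind)  # kind "i" means signed integer
print(fixed ** 2)
```

If one of the strings could not be read as an integer (for example "two"), `astype(int)` would raise a `ValueError` instead.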

In [None]:
x = np.array([0, 1, 2, 3, 4])
y = x ** 2.0
print(y)

Here we see that by raising x to the power 2.0 instead of 2, all of our data points became floating point numbers, signified by the "." at the end of them. In Python, if an arithmetic operation mixes ints and floats, then the result will also be a float.

To convert this back to ints, use the `astype` method

In [None]:
y = y.astype(int)
print(y)

Note that converting non-integer floats to integers simply drops everything after the decimal point; it does not round to the nearest integer

In [None]:
y = x ** 0.5
print(y)
print(y.astype(int))
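If rounding to the nearest integer is what you actually want, one option is to call `np.round` before converting. The sketch below compares the two approaches; the names `roots`, `truncated`, `rounded`, and `negative` are only for illustration.

```python
import numpy as np

roots = np.array([0, 1, 2, 3, 4]) ** 0.5

truncated = roots.astype(int)          # drops the decimals
rounded = np.round(roots).astype(int)  # rounds to the nearest integer first
print(truncated)
print(rounded)

# For negative values, astype(int) moves toward zero while np.floor moves down
negative = np.array([-1.7])
print(negative.astype(int), np.floor(negative).astype(int))
```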

We can now plot the data using matplotlib with some easy plots

In [None]:
import matplotlib.pyplot as plt

plt.plot(x, y)
plt.show()

**Problem 4**

Plot the curve y = x ** 2 for values of x from -10 to 10 with an interval of 1

In [None]:
# Your code here
x4 = np.array([-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
y4 = x4 ** 2

plt.plot(x4, y4)
plt.show()

While specifying things from lists can give the programmer a greater amount of control over what is inside the array, it becomes tedious very fast (imagine if in the last problem the interval was 0.001 instead of 1). Luckily numpy has a solution for that

In [None]:
x_min = 0
x_max = 4
interval = 0.001
x_fine = np.arange(x_min, x_max + interval / 2.0, interval)
print(x_fine)

The arange function returns an array of all numbers in the range [start, stop) with a spacing set by interval. The notation means the start value is included in the array while the stop value is not, which is why the code passes `x_max + interval / 2.0` as the stop so that x_max itself is kept. The names x_min and x_max are used instead of min and max because those are names of built-in Python functions. We can now get finer detail in the square root plot from above

In [None]:
plt.plot(x_fine, x_fine ** 0.5)
plt.plot(x, y, 'o')
plt.show()

Here I plotted the coarse data with 'o' markers. This is both the power and the weakness of matplotlib: it can give quick and simple plots easily, but publication quality figures can be much harder to generate. Here is an example of what I would do to make this publication quality (not expecting you to do this right away)

In [None]:
# Make a new figure with a width of 3.5 inches and a height of 2.5 inches
fig = plt.figure(figsize=[3.5,2.5])

# Add an axes to the figure; rect is [left, bottom, width, height] in fractions of the figure size,
# so the axes spans from 0.1 to 0.95 of the figure in both directions
ax = fig.add_axes(rect=[0.1, 0.1, 0.85, 0.85])

# Make the ticks go inside the figure
ax.tick_params(which="both", direction="in", left=True, right=True, bottom=True, top=True)

# Add x-axis label
ax.set_xlabel("x (unit)")

# Set x-axis limits
ax.set_xlim([-0.1, 4.1])

# Set x-axis major ticks
ax.set_xticks(np.arange(0, 4.1, 1.0))

# Set x-axis minor ticks
ax.set_xticks(np.arange(0, 4.1, 0.1), minor=True)

# Do the same for the y-axis
# Matplotlib understands LaTeX
ax.set_ylabel("$\\sqrt{x}$ (unit$^{0.5}$)")
ax.set_ylim([-0.05, 2.05])
ax.set_yticks(np.arange(0, 2.1, 0.5))
ax.set_yticks(np.arange(0, 2.1, 0.1), minor=True)

# Easy to use color set generator: https://coolors.co/
# c sets color with hex code, and label sets legend labels
ax.plot(x_fine, x_fine ** 0.5, c="#1A5D89", label="fine")
ax.plot(x, y, 'o', c="#EB5160", label="coarse")

# Add the legend
ax.legend(frameon=False)

fig.show()

Writing functions can make life easier when doing repetitive tasks.

A function is defined by the keyword **def** followed by a function name and a list of input arguments in parentheses. To get an output from the function we use a **return** statement.

Notice the format here: a **:** after the function signature and an indented block for the function body. What ends the function is not the return statement but the end of the indented block. *IMPORTANT: four spaces and one tab are NOT THE SAME THING! Mixing them will cause indentation errors, as we will see in a later cell*

Let's make a function that adds two numbers together

In [None]:
def add(a, b):
    return a + b

c = add(4, 5)
print(c)

We can also define the function with an internal variable `c` that we return, but this is not recommended as it adds unnecessary lines of code and allocations of memory

In [None]:
# This is fine as we are only using spaces
c = 0
def add(a, b):
    c = a + b
    return c

d = add(4, 5)
print(c, d)

If done this way CHECK INDENTATION!

In [None]:
# This fails due to mixed indentation
def add(a, b):
	c = a + b
    return c

c = add(4, 5)
print(c)

When we have two functions with the same name the later one will overwrite the earlier one (unless defined as part of a class).

This also applies to variables as we see with c

In [None]:
def add(a, b):
    return a + b

c = add(4, 5)
print(c) 

def add(a, b):
    return a + b + 1

c = add(4, 5)
print(c)

While the addition function is easy to read on its own, it is always good to give each function a docstring. A docstring is a simple description of what the code does and is marked by `'''text'''` or `"""text"""` as shown below. We do this because if we don't know what a function does while in Python, we can access the docstring using the `help` function. The more details you add, the more details you get from `help`.

For a style guide for docstrings look at the examples here: https://stackabuse.com/common-docstring-formats-in-python/

I am using the Google docstring format below

In [None]:
def add(a, b):
    """Adds two numbers a and b together"""
    return a + b

help(add)

print("\nMore in depth docstring\n")
def add(a, b):
    """Adds two numbers a and b together

    Args:
        a (float): First number to add
        b (float): Second number to add

    Returns:
        float: The sum of the two numbers
    """
    return a + b

help(add)
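The text that `help` prints comes from the function's `__doc__` attribute, which can also be read directly. A minimal sketch, using two made-up functions (`multiply` and `subtract`) so that `add` is not overwritten:

```python
def multiply(a, b):
    """Multiplies two numbers a and b together"""
    return a * b

# The docstring is stored on the function as the __doc__ attribute
print(multiply.__doc__)

# A function without a docstring has __doc__ set to None
def subtract(a, b):
    return a - b

print(subtract.__doc__)
```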

**Problem 5**

Write a function with a docstring to calculate the square of a list of numbers starting from a value `min` and ending at a value `max` with an increment `inc`, then test the results.


In [None]:
# Your code here
def get_sq_arr(min_val, max_val, inc):
    """Calculates the square of all numbers from min_val to max_val, in an increment inc

    Args:
        min_val (float): The min value of the array to take the square of
        max_val (float): The max value of the array to take the square of
        inc (float): Increment between the numbers

    Returns:
        np.array[float]: The square of all numbers
    """
    # Add half an increment so max_val is included without overshooting
    return np.arange(min_val, max_val + inc / 2, inc) ** 2.0

print(get_sq_arr(-1, 1, 0.5))


The above function works well, but if we instead take a square root over the same range, the negative inputs lead to `nan` values (not a number)

This is a result that can happen whenever a mathematically invalid operation is attempted.

In [None]:
def get_sqrt_arr(min_val, max_val, inc):
    """Calculates the square root of all numbers from min_val to max_val, in an increment inc

    Args:
        min_val (float): The min value of the array to take the square root of
        max_val (float): The max value of the array to take the square root of
        inc (float): Increment between the numbers

    Returns:
        np.array[float]: The square root of all numbers
    """
    return np.arange(min_val, max_val + inc / 2, inc) ** 0.5

print(get_sqrt_arr(-1, 1, 0.5))

We can prevent this from happening by using a conditional statement inside the function and raising an error when a negative number is passed to min or max.

*Raising an error*

Raising an error (formally an exception) early is a way of cleanly exiting the code if something is wrong, without spending more computational time calculating an invalid result. It works by using the `raise` keyword and requires some type of error (https://docs.python.org/3/library/exceptions.html). Normally a `ValueError` will work.

*Conditional Statements*

A conditional is also known as an if-then or if-else statement. It specifies that if condition X is True, then do A.

We can chain multiple conditions together using the elif (else if) keyword: elif Y is True, then do B

Finally we can set a default condition using else: if no other condition is True, then do C
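Before applying this to the square root function, here is a minimal sketch of both ideas on their own. The helpers `describe_sign` and `safe_sqrt` are hypothetical and only exist to show the syntax.

```python
def describe_sign(value):
    """Hypothetical helper that classifies a number by its sign"""
    if value < 0:
        return "negative"
    elif value == 0:
        return "zero"
    else:
        return "positive"


def safe_sqrt(value):
    """Hypothetical helper that refuses negative input"""
    if value < 0:
        # Exit early with a descriptive message instead of computing an invalid result
        raise ValueError(f"value must be >= 0, got {value}")
    return value ** 0.5


for value in [-2.5, 0, 3]:
    print(value, "is", describe_sign(value))

print(safe_sqrt(4))
```

Calling `safe_sqrt(-1)` would stop with a `ValueError` and print the message given to `raise`.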

In [None]:
def get_sqrt_arr(min, max, inc):
    """Calculates the square root of all numbers from min to max, in an increment inc

    Args:
        min (float): The min value of the array to take the square root of
        max (float): The max value of the array to take the square root of
        inc (float): Increment between the numbers

    Returns:
        np.array[float]: The square root of all numbers
    """
    if min < 0:
        raise ValueError("min value is < 0")
    elif max < min:
        raise ValueError("The max value is less than the min value.")
    else:
        return np.arange(min, max + inc / 2, inc) ** 0.5

We can now test the function to see what happens in different cases

In [None]:
get_sqrt_arr(-1, 1, 1)

In [None]:
get_sqrt_arr(0, -1, 1)

In [None]:
get_sqrt_arr(-1, -2, 1)

In [None]:
get_sqrt_arr(0, 1, 0.5)

We can see from the above examples that each condition is checked sequentially, and if one condition is True then none of the other conditions will be tested and the function will terminate at the raise or return line.

We can also condense the function down into a single if statement by using the `or` keyword. As a note, one can also use the `and` and `not` keywords

Here the final else statement is not needed because the function will exit if the condition is met.

In [None]:
def get_sqrt_arr(min, max, inc):
    """Calculates the square root of all numbers from min to max, in an increment inc

    Args:
        min (float): The min value of the array to take the square root of
        max (float): The max value of the array to take the square root of
        inc (float): Increment between the numbers

    Returns:
        np.array[float]: The square root of all numbers
    """
    if min < 0 or max < min:
        raise ValueError("Invalid input variable")
    
    return np.arange(min, max + inc / 2, inc) ** 0.5

In [None]:
get_sqrt_arr(-1, 1, 1)

In [None]:
get_sqrt_arr(0, -1, 1)

In [None]:
get_sqrt_arr(0, 1, 0.5)

Note that we do not need an else statement at the end of the updated function because the function already exits at the raise statement when the condition is met
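The `and` and `not` keywords mentioned earlier work the same way as `or`. As a small sketch, the check for invalid input can be rewritten as a check for valid input; the function names below are made up for illustration.

```python
def is_invalid_range(min_val, max_val):
    """Hypothetical check: True when at least one condition fails (uses or)"""
    return min_val < 0 or max_val < min_val


def is_valid_range(min_val, max_val):
    """Hypothetical check: True only when both conditions hold (uses and)"""
    return min_val >= 0 and max_val >= min_val


for min_val, max_val in [(-1, 1), (0, -1), (0, 1)]:
    # not flips True to False and False to True
    same = is_valid_range(min_val, max_val) == (not is_invalid_range(min_val, max_val))
    print(min_val, max_val, is_valid_range(min_val, max_val), same)
```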

While these small functions are good learning tools, they aren't particularly useful coding wise since functions for basic mathematical operators already exist. Let's build on these topics and create a function for a Gaussian function:

$f\left(x\right)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2\right)$

Where $\sigma$ is the width of the Gaussian and $\mu$ is the center point.


In [None]:
def gaussian(x, mu, sigma):
    """Calculate the probability density function (PDF) for a Gaussian for a given set of x-points

    Args:
        x (np.array[float]): The set of x-points to get the PDF for
        mu (float): The mean value of the distribution (center point)
        sigma (float): The standard deviation of the distribution (width)

    Returns:
        np.array[float]: The values of the PDF for the specified Gaussian
    """
    norm_fact = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    return norm_fact * np.exp(-0.5 * ((x - mu) / sigma) ** 2.0)

Now let's plot the results to make sure that we did everything correctly

In [None]:
x = np.arange(-3, 3.01, 0.01)

plt.plot(x, gaussian(x, 0.0, 1.0), label=1)
plt.plot(x, gaussian(x, 0.0, 2.0), label=2)
plt.plot(x, gaussian(x, 0.0, 0.5), label=0.5)
plt.legend()
plt.show()
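Besides looking at the plot, two properties of the formula can be checked numerically: the peak height should be $1/(\sigma\sqrt{2\pi})$, and the curve should fall to half of its peak at $x = \mu + \sigma\sqrt{2\ln 2}$. The sketch below repeats the `gaussian` definition so it runs on its own; the other variable names are only for illustration.

```python
import numpy as np

def gaussian(x, mu, sigma):
    """Gaussian PDF, same formula as in the cell above"""
    norm_fact = 1.0 / (np.sqrt(2 * np.pi) * sigma)
    return norm_fact * np.exp(-0.5 * ((x - mu) / sigma) ** 2.0)

for sigma in [0.5, 1.0, 2.0]:
    # The peak sits at x = mu with height 1 / (sigma * sqrt(2 * pi))
    peak = gaussian(0.0, 0.0, sigma)
    expected_peak = 1.0 / (sigma * np.sqrt(2 * np.pi))

    # At x = mu + sigma * sqrt(2 * ln 2) the curve falls to half of its peak
    half_width = sigma * np.sqrt(2.0 * np.log(2.0))
    ratio = gaussian(half_width, 0.0, sigma) / peak

    print(f"sigma = {sigma}: peak = {peak:.4f} (expected {expected_peak:.4f}), half-max ratio = {ratio:.4f}")
```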

**Problem 6**

The Lorentzian Distribution

$f\left(x\right)=\frac{1}{\pi} \frac{\frac{1}{2} \Gamma}{\left(x-x_0\right)^2 + \left(\frac{1}{2} \Gamma\right)^2}$

1) Write a function for the Lorentzian distribution
2) Plot two Lorentzians with ($x_0 = 1.0$, $\Gamma = 0.5$) and ($x_0 = -1.0$, $\Gamma = 1.0$)
3) What does $x_0$ represent and what does $\Gamma$ represent?

In [None]:
# Your code here
def lorentzian(x, x0, Gamma):
    """Calculate the probability density function (PDF) for a Lorentzian for a given set of x-points

    Args:
        x (np.array[float]): The set of x-points to get the PDF for
        x0 (float): The peak location
        Gamma (float): The width of the peak

    Returns:
        np.array[float]: The values of the PDF for the specified Lorentzian
    """
    norm_fact = 0.5 * Gamma / (np.pi)
    return norm_fact / ((x - x0) ** 2.0 + 0.25*Gamma**2.0)

x = np.arange(-3, 3.01, 0.01)

# Raw strings (r"...") keep the backslash in \Gamma from being read as an escape sequence
plt.plot(x, lorentzian(x, -1.0, 1.0), label=r"($x_0 = -1.0$, $\Gamma = 1.0$)")
plt.plot(x, lorentzian(x, 1.0, 0.5), label=r"($x_0 = 1.0$, $\Gamma = 0.5$)")
plt.legend(frameon=False)
plt.show()

One way we can see if we have implemented the functions correctly is to ensure that the distributions are properly normalized. We can do this with the scipy integrate package using both adaptive quadrature on the function itself (`quad`) and numerical integration of sampled points (`simpson`)

To see how to use these functions look at: https://docs.scipy.org/doc/scipy/tutorial/integrate.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.quad.html#scipy.integrate.quad
https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.simpson.html#scipy.integrate.simpson

In [None]:
from scipy.integrate import quad

I = quad(gaussian, -np.inf, np.inf, args=(0.0, 1.0))
print(f"{I[0]} ± {I[1]}")

I = quad(gaussian, -np.inf, np.inf, args=(0.0, 2.0))
print(f"{I[0]} ± {I[1]}")

I = quad(gaussian, -np.inf, np.inf, args=(0.0, 0.5))
print(f"{I[0]} ± {I[1]}")

Here we see some artifacts of numerics and floating point numbers, but all of these values are essentially 1.0. Now let's try numerical integration of sampled points

In [None]:
from scipy.integrate import simpson

x = np.arange(-3.0, 3.0, 0.01)
I = simpson(y=gaussian(x, 0.0, 1.0), x=x)
print(I)

I = simpson(y=gaussian(x, 0.0, 2.0), x=x)
print(I)

I = simpson(y=gaussian(x, 0.0, 0.5), x=x)
print(I)

Here we see the numerical integration falls short of 1 because the range does not extend far enough, most visibly for the widest Gaussian. Extending the range fixes this, as seen below

In [None]:
from scipy.integrate import simpson

x = np.arange(-30.0, 30.0, 0.01)
I = simpson(y=gaussian(x, 0.0, 1.0), x=x)
print(I)

I = simpson(y=gaussian(x, 0.0, 2.0), x=x)
print(I)

I = simpson(y=gaussian(x, 0.0, 0.5), x=x)
print(I)
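How much a finite range misses can also be worked out analytically: for a Gaussian centered at zero, the integral over $[-L, L]$ is $\mathrm{erf}\left(L / (\sigma\sqrt{2})\right)$. The sketch below evaluates this with `math.erf` from the standard library; the function name `fraction_inside` is made up for illustration.

```python
import math

def fraction_inside(limit, sigma):
    """Analytic integral of a zero-centered Gaussian PDF over [-limit, limit]"""
    return math.erf(limit / (sigma * math.sqrt(2.0)))

for sigma in [1.0, 2.0, 0.5]:
    narrow = fraction_inside(3.0, sigma)
    wide = fraction_inside(30.0, sigma)
    print(f"sigma = {sigma}: [-3, 3] covers {narrow:.6f}, [-30, 30] covers {wide:.6f}")
```

These values can be compared with the `simpson` results above to separate the error caused by the finite range from the error of the integration rule itself.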

**Problem 7**

Do the same integration for the two Lorentzian functions from Problem 6.

In [None]:
# Your code here
I = quad(lorentzian, -np.inf, np.inf, args=(-1.0, 1.0))
print(f"{I[0]} ± {I[1]}")

I = quad(lorentzian, -np.inf, np.inf, args=(1.0, 0.5))
print(f"{I[0]} ± {I[1]}")
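If you also try `simpson` on a finite range here, expect the result to approach 1 much more slowly than for the Gaussian, because the Lorentzian has long tails. The integral of the Lorentzian over $[a, b]$ is $\frac{1}{\pi}\left[\arctan\left(\frac{2(b - x_0)}{\Gamma}\right) - \arctan\left(\frac{2(a - x_0)}{\Gamma}\right)\right]$, which the sketch below evaluates. The function name `lorentzian_fraction` is made up for illustration.

```python
import math

def lorentzian_fraction(lower, upper, x0, gamma):
    """Analytic integral of the Lorentzian PDF from lower to upper"""
    upper_term = math.atan(2.0 * (upper - x0) / gamma)
    lower_term = math.atan(2.0 * (lower - x0) / gamma)
    return (upper_term - lower_term) / math.pi

for x0, gamma in [(1.0, 0.5), (-1.0, 1.0)]:
    narrow = lorentzian_fraction(-3.0, 3.0, x0, gamma)
    wide = lorentzian_fraction(-30.0, 30.0, x0, gamma)
    print(f"x0 = {x0}, Gamma = {gamma}: [-3, 3] covers {narrow:.4f}, [-30, 30] covers {wide:.4f}")
```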