# Python crash course for chemists

<div class="alert alert-block alert-info">
<h2>Overview</h2>

Questions:

* What is the Python programming language, and what is it used for?

* What are Jupyter notebooks, and how are they run?

* What is the basic syntax of the Python programming language?
    
* What is a Python list?
    
* What is a `for` loop?

* What are logic statements?

Objectives:

* Run a Jupyter notebook on Longleaf On Demand.

* Assign values to variables.

* Use the `print` function to check how the code is working.
    
* Use a `for` loop to perform the same action on all the items in a list.
    
* Use the append function to create new lists in `for` loops.

</div>

## What is Python and why use it?

All of the software you use on a regular basis is created through the use of programming languages. 
Programming languages allow us to write instructions to a computer. 
There are many different programming languages, each with their own strengths, weaknesses, and uses. 
Some popular programming languages you might hear about are JavaScript (used on the web - any website with interactive content likely uses JavaScript), Python (scientific programming and many other applications), C++ (high performance applications - much of computational chemistry, self-driving cars, etc.), SQL (databases), and many more.

Python is a computer programming language that has become ubiquitous in scientific programming. 
The Python programming language was first introduced in the year 1991, and has grown to be one of the most popular programming languages for both scientists and non-scientists. 

Compared to other programming languages, Python is considered more intuitive to start learning and is also extremely versatile. 
Python can be used to build web applications, interact with databases, and calculate and analyze data.
Python also has many libraries focused on science and machine learning.

In computational chemistry, we will see that we can use Python to calculate molecular properties, analyze results, and visualize data. 
We can also use some of Python's many libraries to fit data and predict properties.

## Getting Started

Our initial lessons will run Python interactively through a Python interpreter. 
We will use an environment called a Jupyter notebook. 
The Jupyter notebook is an environment in your browser that can be used to write and execute Python code interactively.

Follow the steps in the [longleaf access](../longleaf_access.md) document to open a Jupyter notebook hosted on Longleaf.


## Refresher on running code in Jupyter Notebook

You run a Jupyter notebook one cell at a time. 
To execute a cell, click inside the cell and press `Shift + Enter`.


Let's start learning some Python!

## Basic math

Any Python interpreter can work just like a calculator using basic arithmetic operators.

| Operator | Name           | Example |
| -------- | -------------- | ------- |
| +        | Addition       | `x + y`   |
| -        | Subtraction    | `x - y`   |
| *        | Multiplication | `x * y`   |
| /        | Division       | `x / y`   |

Run the cell below by selecting it and pressing `Shift + Enter`.

In [1]:
3 * 500 + 289


1789

<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

UNC is obviously better than Dook, but how much better? In the cell below, use the provided values to calculate the recent win rate of UNC basketball.
</div>

| UNC wins | Duke wins |
| :------: | :-------: |
|    8     |     6     |


win rate = UNC wins / total games


In [3]:
8/14


0.5714285714285714

## Assign a variable

To save a value for later use we can assign it as a variable. The syntax for assigning a variable is:

```python
variable_name = variable_value
```

Let's define some variables for our calculation.

In [4]:
unc_motto = 'Beat Duke!'
unc_wins = 8 #UNC basketball wins since 2018
duke_wins = 6 #Duke basketball wins since 2018

# win rate of UNC basketball since 2018
win_rate = unc_wins/(unc_wins + duke_wins)

### Code comments
Notice that some lines in the code above contain a comment, which starts with the `#` character. 
The computer does not do anything with these comments. 
Comments are often used to explain what the code is doing or leave information for future people who might use the code.

<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Calculate how many more NCAA basketball championships UNC has won compared to Dook and save the result as a variable called `championship_difference`. 
</div>

| UNC championships | Duke championships |
| :---------------: | :----------------: |
|         6         |         5          |

In [5]:
championship_difference = 6-5

## Using Functions

Functions are reusable pieces of code that perform certain tasks. 
Examples include printing, opening files, performing calculations, and many others.

A function has a name that is followed by parentheses containing the function inputs (also called *arguments*), separated by commas.

```python
function_name(argument1, argument2...)
```

A useful function that is built into Python is the `print()` function. Let's print a string (text) and one of our variables.

In [6]:
print('Beat Duke!')
print(championship_difference)

Beat Duke!
1
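
`print()` also accepts several arguments separated by commas and prints them on one line. The short sketch below illustrates this; the variable names `unc_titles` and `duke_titles` are made up for this example and hold the values from the championship table above.

```python
# Hypothetical variable names; the values come from the championship table.
unc_titles = 6
duke_titles = 5

# Arguments separated by commas are printed on one line, joined by a space.
print('UNC titles:', unc_titles)

# The optional sep argument changes the text placed between the arguments.
print('UNC', 'Duke', sep=' vs. ')
```

Putting a text label in front of a value like this makes the output easier to read when a cell prints several results.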


## Modify a variable

This season UNC will obviously beat Duke twice. 

We can add 2 to the variable `unc_wins` to account for these extra wins. However, to permanently change a variable we need to re-save the variable after modifying it. 

Compare what happens if you do or do NOT re-save a variable after making a change.

In [7]:
print('Variable NOT re-saved:')
print(unc_wins)
unc_wins + 2
print(unc_wins)

print('Variable re-saved:')
print(unc_wins)
unc_wins = unc_wins + 2
print(unc_wins)

Variable NOT re-saved:
8
8
Variable re-saved:
8
10


<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

UNC might win a championship this year. Modify the `championship_difference` variable to add an additional championship win and print the result. 
</div>

In [9]:
championship_difference = championship_difference + 1
print(championship_difference)

2


## Basic Data Types

In Python, data types define the kind of data a variable can hold, and they influence how you can use and manipulate that data. Here are three of the most common data types.

| Data type (abbreviation)        | Description                          | Example         |
| ------------------------------- | ------------------------------------ | --------------- |
| string (`str`)                  | plain text                           | `'Hello World'` |
| integer (`int`)                 | whole number                         | `23`            |
| floating point number (`float`) | rational number with a decimal point | `3.1416`        |

You can easily identify the data type of any variable with the `type()` function by inputting a variable as an argument like this:

```python
type(variable_name)
```

In [10]:
print(unc_wins)
print(win_rate)
print(unc_motto)

10
0.5714285714285714
Beat Duke!


<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Try to determine the type of each variable before running the following three cells.
</div>

In [11]:
type(unc_wins)

int

In [12]:
type(win_rate)

float

In [13]:
type(unc_motto)

str
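
To see why the data type matters, compare what the `+` operator does with integers and with strings. This is a small illustrative sketch; the variable names are invented for the example.

```python
# With integers, + performs arithmetic.
wins_as_int = 8 + 6
print(wins_as_int)        # 14

# With strings, + joins the pieces of text together.
wins_as_str = '8' + '6'
print(wins_as_str)        # 86

# int() converts a string of digits into an integer so it can be used in math.
converted = int('8') + 6
print(converted)          # 14
```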

## Functions that Return Values

Most functions do more than just perform tasks—they can also send information back to you. When a function sends back a result, we say it "returns a value." You can then save this returned value as a variable for future use like this:

```python
variable = function_name(argument1, argument2...)
```

Let's run some math related functions that return values.

In [14]:
# max() returns the highest value.
max_value = max(-13.4, -2.7, 5.4, 42.1)

# min() returns the lowest value.
min_value = min(-13.4, -2.7, 5.4, 42.1)

print(max_value)
print(min_value)

42.1
-13.4


## Functions with Additional Arguments
Arguments are used to input values into a function, and they can also be used to adjust how the function behaves.

To understand how to modify a function, we can look at the documentation for that function.
In Jupyter, you can pass a function's name to `help()` to see its documentation, which shows the available arguments and what each one does.

Let's look at the documentation for the `round()` function.

In [15]:
help(round)

Help on built-in function round in module builtins:

round(number, ndigits=None)
    Round a number to a given precision in decimal digits.

    The return value is an integer if ndigits is omitted or None.  Otherwise
    the return value has the same type as the number.  ndigits may be negative.



The documentation shows us that `round()` takes two arguments:
* *number* (the value to round)
* *ndigits* (the number of digits to keep after the decimal point)

Let's see how the *ndigits* argument affects the behavior of the `round()` function.

In [16]:
# First 12 digits of pi
pi = 3.14159265359

# Round pi using the default precision (ndigits = None).
pi_rounded = round(pi)

# Round pi to 2 decimal places (ndigits = 2).
pi_2_decimal = round(pi, ndigits=2)


print(pi_rounded)
print(pi_2_decimal)


3
3.14
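
The help text also mentions that *ndigits* may be negative. As a quick sketch (the number below is arbitrary and chosen only for illustration), a negative value rounds to the left of the decimal point:

```python
# An arbitrary example value.
value = 1234.5678

# Arguments can be given by name or simply by position.
print(round(value, ndigits=1))   # 1234.6
print(round(value, 1))           # 1234.6

# A negative ndigits rounds to tens, hundreds, and so on.
rounded_hundreds = round(value, -2)
print(rounded_hundreds)          # 1200.0
```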


## Lists

A common data structure in Python is the list.

Lists in Python are useful when you want to store multiple pieces of data in a single variable.
You can think of a list as a collection of values, where each value can be accessed by its position, or index.

Lists are defined using square brackets `[ ]`, and the elements are separated by commas. 

```python
list_name = [element_1, element_2, element_3]
```

**Key Features of Lists:**
- Ordered: Lists maintain the order of items as you add them.
- Mutable: You can change the content of a list (add, remove, or modify items).
- Indexable: You can access specific items in a list by their index (position).

In [32]:
# This is a list
energy_kcal = [-13.4, -2.7, 0.0, 5.4, 42.1]
temperature_F = [50.1, 82.6, 69.4, 74.8, 52.8, 67.9]

print(energy_kcal)
print(temperature_F)

[-13.4, -2.7, 0.0, 5.4, 42.1]
[50.1, 82.6, 69.4, 74.8, 52.8, 67.9]
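
The key features listed above can be sketched with a small example. The list `solvents` is invented for illustration; indexing and `append()` are explained in more detail later in this lesson.

```python
# A hypothetical list used only for this illustration.
solvents = ['water', 'ethanol', 'acetone']

# Ordered: items stay in the order they were added.
print(solvents)

# Indexable: access an item by its position (counting starts at zero).
print(solvents[0])            # water

# Mutable: replace an existing item and add a new one.
solvents[1] = 'methanol'
solvents.append('hexane')
print(solvents)               # ['water', 'methanol', 'acetone', 'hexane']
```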


Python has several built-in functions that can be used on lists. 

```python
function_name(list_name)
```

The built-in function `len()` can be used to determine the length of a list.

The built-in function `sum()` can be used to determine the sum of a list.

<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Use the `len()` and `sum()` functions to determine the average of the `energy_kcal` list and print your result.
</div>

average = sum / number of values

In [20]:
average = (sum(energy_kcal))/ (len(energy_kcal))
print(average)


6.28


## Indexing

In Python, indexing allows us to access individual elements in a list.

```python
element = list_name[<position_of_element>]
```

In Python, counting starts at zero, so the first element of a list is in position zero:

```python
first_element = list_name[0]
second_element = list_name[1]
```

In [21]:
# Print the first element of the list
print(energy_kcal[0])

-13.4


You can use an element of a list as a variable in a calculation.

In [22]:
# Convert the second list element to kilojoules.
energy_kilojoules = energy_kcal[1] * 4.184
print(energy_kilojoules)

-11.296800000000001
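
Because counting starts at zero, the last element of a list sits at position `len(list_name) - 1`. The sketch below repeats the `energy_kcal` list so it can run on its own; it also shows that Python accepts `-1` as a shortcut for the last position.

```python
# Same values as the energy_kcal list defined earlier.
energy_kcal = [-13.4, -2.7, 0.0, 5.4, 42.1]

# The last valid position is one less than the length of the list.
last_position = len(energy_kcal) - 1
print(last_position)                 # 4
print(energy_kcal[last_position])    # 42.1

# A negative index counts from the end of the list.
print(energy_kcal[-1])               # 42.1

# Asking for energy_kcal[5] would raise an IndexError,
# because the list only has positions 0 through 4.
```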


## Define a Function

If you plan to perform the same coding task multiple times, consider making your own function.

To define a function in Python, you use the `def` keyword, followed by the `function_name`, parentheses `( )` (which may contain arguments), and a colon `:`. The function's code block is indented beneath the definition. Finally, an optional `return` statement sends a value back from the function.

```python
def function_name(arguments):
    # Code to execute
    return result  # Optional: returns a value
```

Let's define our own function and use it.

In [28]:
# Define the get_rounded_average() function
def get_rounded_average(values, round_ndigits):
    average = sum(values) / len(values)
    rounded_average = round(average, ndigits = round_ndigits)
    return rounded_average

# Number of UNC basketball wins per season under head coach Roy Williams
roy_williams_wins = [19, 33, 23, 31, 36, 34, 20, 29, 32, 25, 24, 26, 33, 33, 26, 29, 14, 18]

# Number of Duke basketball wins per season under head coach Mike Krzyzewski
mike_krzyzewski_wins = [17, 10, 11, 24, 23, 37, 24, 28, 28, 29, 32, 34, 24, 28, 13, 18, 24, 32,
                        37, 29, 35, 31, 26, 31, 27, 32, 22, 28, 30, 35, 32, 27, 30, 26, 35, 25,
                        28, 29, 32, 25, 13, 32]

# Run the get_rounded_average() function
unc_avg_wins = get_rounded_average(roy_williams_wins, round_ndigits=1)
duke_avg_wins = get_rounded_average(mike_krzyzewski_wins, round_ndigits=1)

print('UNC wins per season:', unc_avg_wins)
print('Duke wins per season:', duke_avg_wins)

UNC wins per season: 26.9
Duke wins per season: 27.0


I guess Dook is ok....

<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Define a function called `convert_fahrenheit_to_celsius` that converts a temperature in freedom units (Fahrenheit) to Celsius and returns the Celsius value. Test your function with the value 77.0 degrees Fahrenheit; you should get 25 degrees Celsius.
</div>

celsius = (fahrenheit - 32) * 5/9
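
One possible solution is sketched below. It follows the formula above directly; the argument name `fahrenheit` is a choice made for this example, and other names would work just as well.

```python
# Convert a temperature from Fahrenheit to Celsius.
def convert_fahrenheit_to_celsius(fahrenheit):
    celsius = (fahrenheit - 32) * 5 / 9
    return celsius

# Test the function with the value suggested in the challenge.
test_value = convert_fahrenheit_to_celsius(77.0)
print(test_value)   # 25.0
```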

## Repeating an operation many times: `for` loops

Often, you will want to repeat a block of code on multiple elements of a list. The structure to do this is called a `for` loop. The general structure of a for loop is:

```python

for each_item in your_list:
    do things using each_item

```

There are two very important pieces of syntax for the for loop. 

1) Notice the colon `:` after `your_list`. You will always have a colon at the end of a `for` statement. If you forget the colon, you will get an error when you try to run your code.

2) Notice the lines of code under the `for` statement (the things you want to do several times) are indented. Indentation is very important in Python, and by convention each level consists of 4 spaces. Simply stop indenting to end the code included in the `for` loop.

Let’s use a loop to change all of our energies in kcal to kJ.


In [23]:
# Convert kcal to kJ
for number in energy_kcal:
    kJ = number * 4.184
    print(kJ)
print('finished')

-56.0656
-11.296800000000001
0.0
22.593600000000002
176.1464
finished


Now it seems like we are really getting somewhere with our program!
But it would be even better if instead of just printing the values, it saved them in a new list.

To do this, we are going to use the `append()` function, which adds a new item to the end of an existing list. Because `append()` belongs to the list, it is written after the list name, separated by a dot.

The general form of the `append()` function is:

```python
list_name.append(new_thing)
```

In [24]:
energy_kJ = []
for number in energy_kcal:
    kJ = number * 4.184
    energy_kJ.append(kJ)

print(energy_kJ)

[-56.0656, -11.296800000000001, 0.0, 22.593600000000002, 176.1464]


## Error messages

If you make a mistake in your code or syntax you will likely receive an error message from Python.

Let's see what kind of error message we receive if we forget to pre-define the list we are appending to.

In [31]:
for number in energy_kcal:
    kJ = number * 4.184
    kj_values.append(kJ)

print(kj_values)


NameError: name 'kj_values' is not defined

This is an example of an error message. If you learn to interpret these messages, they can be extremely helpful for troubleshooting. 

In Python, read the last line of the error message first to understand what went wrong in the program execution.

This code doesn’t work because on the first iteration of our loop, the list `kj_values` doesn’t exist. To make it work, we have to define the list before the loop starts.

<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Use a for loop to convert all the fahrenheit temperature values in `temperature_F` list into celsius values. Try to reuse the function that you previously created.
</div>

celsius = (fahrenheit - 32) * 5/9
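
One possible solution is sketched below. The function and the `temperature_F` list are repeated so the cell can run on its own. The names `temperature_C`, `temp_F`, and `temp_C` are choices made for this example, and rounding to one decimal place is an optional extra that the challenge does not require.

```python
# Function from the earlier challenge, repeated here for completeness.
def convert_fahrenheit_to_celsius(fahrenheit):
    return (fahrenheit - 32) * 5 / 9

# Same values as the temperature_F list defined earlier.
temperature_F = [50.1, 82.6, 69.4, 74.8, 52.8, 67.9]

# Define the empty list before the loop, then fill it one value at a time.
temperature_C = []
for temp_F in temperature_F:
    temp_C = convert_fahrenheit_to_celsius(temp_F)
    temperature_C.append(round(temp_C, 1))

print(temperature_C)
```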

## Making choices: logic statements

Within your code, you may need to evaluate a variable and then do something if the variable has a particular value. This type of logic is handled by an `if` statement. In the following example, we append only the acidic values (pH below 7) to a new list.

In [34]:
pH_values = [3.5, 0.8, 7.0, 8.2, 6.1, 7.0, 0.4, 12.4, 2.9, 6.5, 9.1, 13.4, 7.4]
acidic_values = []

for pH in pH_values:
    if pH < 7:
        acidic_values.append(pH)
        
print(acidic_values)

[3.5, 0.8, 6.1, 0.4, 2.9, 6.5]


Other logic operations include:

**Comparison operators**

| Operator | Name                     | Example  |
| -------- | ------------------------ | -------- |
| ==       | Equal                    | `x == y` |
| !=       | Not equal                | `x != y` |
| >        | Greater than             | `x > y`  |
| <        | Less than                | `x < y`  |
| >=       | Greater than or equal to | `x >= y` |
| <=       | Less than or equal to    | `x <= y` |

**Logical operators**

| Operator | Description                                             | Example                 |
| -------- | ------------------------------------------------------- | ----------------------- |
| and      | Returns True if both statements are true                | `x < 5 and  x < 10`     |
| or       | Returns True if one of the statements is true           | `x < 5 or x < 4`        |
| not      | Reverse the result, returns False if the result is true | `not(x < 5 and x < 10)` |

In [38]:
weak_acid = []
for pH in pH_values:
    if pH > 1 and pH < 7:
        weak_acid.append(pH)

print(weak_acid)

[3.5, 6.1, 2.9, 6.5]
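
Each comparison or logical expression evaluates to either `True` or `False`, and that result is what the `if` statement checks. The sketch below prints a few of these results for a single example value; the variable `is_weak_acid` is introduced only for illustration.

```python
# A single example value taken from the pH_values list.
pH = 3.5

# Comparison operators produce True or False.
print(pH < 7)                 # True
print(pH == 7)                # False

# Logical operators combine or reverse those results.
print(pH > 1 and pH < 7)      # True
print(pH < 1 or pH > 13)      # False
print(not (pH < 7))           # False

# The result can be saved in a variable like any other value.
is_weak_acid = pH > 1 and pH < 7
print(type(is_weak_acid))     # <class 'bool'>
```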


## `if`, `elif`, `else` Statements

For more complex logic you may need more options. If `if` isn't enough you can also use `elif` and `else`.

- `if`: The `if` statement checks a condition. If the condition is `True`, the code block under `if` runs.
- `elif`: Short for "else if," `elif` checks another condition if the previous `if` (or `elif`) condition was `False`. You can use multiple `elif` statements to check multiple conditions.
- `else`: The `else` statement runs if none of the `if` or `elif` conditions are `True`. It’s like a "catch-all" for any remaining cases.

Let's see an example.

In [42]:
weak_acid = []
strong_acid = []
not_acid = []

for pH in pH_values:
    if pH > 1 and pH < 7:
        weak_acid.append(pH)
    elif pH <= 1:
        strong_acid.append(pH)
    else:
        not_acid.append(pH)
        

print('Weak acid:', weak_acid)
print('Strong acid:', strong_acid)
print('Not_acid:', not_acid)

Weak acid: [3.5, 6.1, 2.9, 6.5]
Strong acid: [0.8, 0.4]
Not_acid: [7.0, 8.2, 7.0, 12.4, 9.1, 13.4, 7.4]


<div class="alert alert-block alert-warning"> 
<h3>Challenge</h3>

Modify the logic statements below to include categories for:
- `neutral` (pH 7)
- `weak_base` (7.01-12.99)
- `strong_base` (pH 13+)

Append the pH values to the appropriate lists.
</div>

In [None]:
weak_acid = []
strong_acid = []
neutral = []
weak_base = []
strong_base = []

for pH in pH_values:
    if pH > 1 and pH < 7:
        weak_acid.append(pH)
    elif pH <= 1:
        strong_acid.append(pH)
        

print('Weak acid:', weak_acid)
print('Strong acid:', strong_acid)
print('Neutral:', neutral)
print('Weak base:', weak_base)
print('Strong base:', strong_base)
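
One possible solution is sketched below; other orderings of the conditions would also work. It assumes that any value above 7 and below 13 counts as a weak base, which is one reading of the ranges given in the challenge.

```python
pH_values = [3.5, 0.8, 7.0, 8.2, 6.1, 7.0, 0.4, 12.4, 2.9, 6.5, 9.1, 13.4, 7.4]

weak_acid = []
strong_acid = []
neutral = []
weak_base = []
strong_base = []

for pH in pH_values:
    if pH > 1 and pH < 7:
        weak_acid.append(pH)
    elif pH <= 1:
        strong_acid.append(pH)
    elif pH == 7:
        # Exact comparison works for these typed-in values;
        # calculated floats may need a small tolerance instead.
        neutral.append(pH)
    elif pH < 13:
        # Assumption: everything above 7 and below 13 is a weak base.
        weak_base.append(pH)
    else:
        strong_base.append(pH)

print('Weak acid:', weak_acid)
print('Strong acid:', strong_acid)
print('Neutral:', neutral)
print('Weak base:', weak_base)
print('Strong base:', strong_base)
```

Because each `elif` is only checked when all earlier conditions were `False`, the later conditions do not need to repeat the lower bounds.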