# **ASGSA Tools of the Trade** - Basic Python

In this Tools of the Trade, we will be exploring how to do some basic coding in Python, including:

- How to make and work with lists and Numpy arrays
- How to use `for` loops
- How to use conditional statements (`if`, `elif`, and `else`)
- How to write functions
- How to plot data with matplotlib

The only libraries required today are **Numpy** and **Matplotlib**, both of which are included in the 'base' Anaconda environment. Documentation for these modules can be found at the following websites:

Numpy: https://numpy.org/doc/stable/
<br>Matplotlib: https://matplotlib.org/stable/index.html

For questions relating to this code, please contact Alec Sczepanski (alec.sczepanski@und.edu).

Without further ado, here we go!

***

To begin, we will import necessary modules. 

Python will only use the modules that are imported into the code you are writing. In other words, even if the module is installed on your PC and available in your Anaconda environment, your Python code will not know to use it unless it is imported.

When importing, you can shorten or alias the module name to make it easier to call as you code:
- `import pandas as pd`

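You can use any alias you like, though the conventional ones (such as `pd` for pandas and `np` for Numpy) make your code easier for others to read. For example, this is also valid:
- `import pandas as express`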

You can also import a specific sub-module. A common sub-module to import is `pyplot` within the `matplotlib` module:
- `import matplotlib.pyplot as plt`

Let's import the modules we will need for today's Tools of the Trade:
- `numpy`: will be used to process and manipulate lists and arrays
- `matplotlib.pyplot`: will be used to create a figure with axes and plot data

In [None]:
import numpy as np
import matplotlib.pyplot as plt

Next, we will make a couple of lists.

A **list** is a collection of items that can be of differing data types (e.g. numbers and letters, integers and floats, etc).

 ***Examples of lists:***<br>
 *List with just integers:*<br>
 `[2, 4, 6, 8, 10]`

 *List with combination of integers and strings:*<br>
 `['one', 3, 'five', 7, 'nine']`
 
 *List of integers where one element is itself a list (a list inside of a list):*<br>
 `[6, 2, 3, [8, 9, 100], 12]`
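
As a small sketch of how the example lists above can be used, the cell below pulls individual elements out by index. The variable names (`evens`, `mixed`, `nested`) are made up for this illustration and are not used elsewhere in the notebook.

```python
# Hypothetical names for the three example lists shown above
evens = [2, 4, 6, 8, 10]
mixed = ['one', 3, 'five', 7, 'nine']
nested = [6, 2, 3, [8, 9, 100], 12]

# Python counts from 0, so index 0 is the first element
first_even = evens[0]

# Negative indices count backwards from the end of the list
last_mixed = mixed[-1]

# The element at index 3 of `nested` is itself a list...
inner_list = nested[3]

# ...so a second index reaches inside that inner list
inner_value = nested[3][2]

print(first_even, last_mixed, inner_list, inner_value)

# A single list can hold different data types side by side
print(type(mixed[0]), type(mixed[1]))
```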

<br>

A similar structure in Python is an **array**, which is a grid of elements of the *same data type* (most commonly numbers) that can have one or more dimensions. 

 ***Example of 2x3 array:***<br>
 `[[2, 4, 6],`<br>
 ` [1, 3, 5]]`
<br>

Lists are generally more intuitive and flexible to use, but become computationally expensive for larger datasets and awkward to use for multi-dimensional data.

Arrays are computationally less expensive and can make manipulating data a breeze, but are stricter in their formatting and require importing a library (e.g. Numpy) to create. We will look at arrays shortly.

With that said, let's make a couple of lists:

In [None]:
x = [2, 4, 5, 6, 8] # Numbers can be listed in numerical order
y = [5, 3, 7, 4, 2] # Numbers can also be listed in any order!

Now let's calculate some basic statistics. Numpy is a powerful tool as it contains many of the functions we need to work with lists and arrays. `math` is another module that is useful for calculations, particularly of the trigonometric variety. Python itself also has built-in functions (such as `len`, `min`, `max`, and `sum`) for determining other basic stats.

Let's look at some statistics for the lists we created:

In [None]:
# Determine the length of each list:
length_x = len(x)
length_y = len(y)

print("The length of list x is: ", length_x)
print("The length of list y is: ", length_y)

In [None]:
# Calculate the average of each list:
avg_x = np.average(x)
avg_y = np.average(y)

print("This is the average value of list x: ", avg_x)
print("This is the average value of list y: ", avg_y)
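
Numpy is not the only route to these numbers. As a rough sketch, the cell below uses Python's built-in functions and the `math` module on the same values as list `x`. The variable names are chosen for this illustration only.

```python
import math

x = [2, 4, 5, 6, 8]  # same values as list x above

# Built-in functions work directly on lists; no import is needed
total_x = sum(x)
mean_x = sum(x) / len(x)
smallest_x = min(x)
largest_x = max(x)

# The math module works on single numbers, not on whole lists
root_of_total = math.sqrt(total_x)
sine_of_half_pi = math.sin(math.pi / 2)

print("Sum, mean, min, max of x:", total_x, mean_x, smallest_x, largest_x)
print("Square root of the sum:", root_of_total)
print("sin(pi/2):", sine_of_half_pi)
```

Note that `math.sqrt(x)` on the whole list would raise an error, which is one reason Numpy is used for element-wise calculations.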

***
We can find the minimum and maximum value of each list, but writing the code to find each value can be cumbersome and repetitive, especially if working with more lists than we are today. Thankfully, we can define functions.

A **function** is a block of code that is defined under a name. The code will only run when the function is called. A function can take any number of arguments, including none at all.

Let's make a basic function to find the minimum and maximum values of each list and print them to the screen:

In [None]:
# Here, we define a function called "min_max_values" that takes one required argument: the list to examine

def min_max_values(input_list):
    
    # Find the max value of the input list:
    max_value = np.max(input_list)
    
    # Find the min value of the input list:
    min_value = np.min(input_list)
    
    print("\nInput list: ", input_list)
    print("This is the maximum value of the input list: ", max_value)
    print("This is the minimum value of the input list: ", min_value)
    

# Let's use this function on our two lists:
min_max_values(input_list = x)
min_max_values(input_list = y)
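
The function above prints its results, which is fine for a quick look. If the values are needed later in the code, a function can instead hand them back with `return`. The sketch below is one possible variant; `get_min_max` is a hypothetical name and not part of the original workshop code.

```python
import numpy as np

def get_min_max(input_list):
    """Return the minimum and maximum of input_list as a pair of values."""
    return np.min(input_list), np.max(input_list)

y = [5, 3, 7, 4, 2]  # same values as list y above

# The two returned values can be unpacked into two variables
y_min, y_max = get_min_max(y)

# Because the values were returned (not just printed), we can keep using them
y_range = y_max - y_min

print("Minimum, maximum, and range of list y:", y_min, y_max, y_range)
```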

***

While lists are handy, intuitive, and work in a pinch, arrays are generally faster and more productive to work with. When dealing with data in a list, you must go element-by-element. When working with data in an array, you can operate on the entire array at once. This is quicker with Numpy because its core routines are written in C, which runs much faster than an equivalent Python loop. For example, looping through a list with 1 million elements in pure Python is typically many times slower than performing the same operation on a Numpy array with 1 million elements; the exact timings depend on your machine.
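
One rough way to see the speed difference for yourself is to time both approaches. The cell below is only a sketch: the element count `n` and the variable names are arbitrary choices, and the printed timings will vary from machine to machine, so treat them as illustrative.

```python
import time
import numpy as np

n = 1_000_000  # number of elements (arbitrary choice for this sketch)
values_list = list(range(n))
values_array = np.arange(n)

# Time doubling every element of the list, one element at a time
start = time.perf_counter()
doubled_list = []
for value in values_list:
    doubled_list.append(value * 2)
loop_seconds = time.perf_counter() - start

# Time doubling the whole array in a single operation
start = time.perf_counter()
doubled_array = values_array * 2
array_seconds = time.perf_counter() - start

print(f"Loop over list: {loop_seconds:.4f} s")
print(f"Numpy array:    {array_seconds:.4f} s")
```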

A few important things to remember about arrays: 
- Each row must be the same length 
- Each column must be the same length
- Each element in the array must be of the same data type (e.g. float, integer, datetime64, etc)

This is different from lists where you can mix and match data types, and even throw in another list as one of the elements.

Another thing to know:
- An *x* by *y* array is ***x* rows tall** and ***y* columns wide**. Example: a 3x2 array (3,2) is 3 rows, 2 columns.
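
The points above can be checked directly with an array's `shape` and `dtype` attributes. This is a minimal sketch, with `tall` and `mixed` as made-up example arrays.

```python
import numpy as np

# A 3x2 array: 3 rows tall, 2 columns wide
tall = np.array([
    [1, 2],
    [3, 4],
    [5, 6]])

print(tall.shape)   # (rows, columns)
print(tall.dtype)   # an integer type; the exact name depends on your system

# Mixing integers and a float: Numpy converts everything to one data type
mixed = np.array([1, 2.5, 3])

print(mixed)
print(mixed.dtype)  # all three elements are now floats
```

Unlike a list, the array does not keep the integers as integers: every element is stored as the same type.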

Let's try building an array and messing with the data inside of it.

In [None]:
# Build a 2x3 array:

r = np.array([
    [1, 2, 3],
    [4, 5, 6]])

print(r)

In [None]:
# Find the min, max, and mean of the array:
r_min = np.min(r)
r_max = np.max(r)
r_mean = np.mean(r)

print('The minimum, maximum, and mean values of r are, respectively: {0}, {1}, and {2}.'.format(r_min, r_max, r_mean))

In [None]:
# Multiply each element by 2, add 3, and take the square root:
q = np.sqrt((r*2)+3)

print(q)

And with that example, we performed three operations on one line without having to use any loops. What are loops? Let's look at that next.
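
To make the three operations easier to see, the sketch below splits the one-line calculation into separate steps and confirms that the result is the same. The intermediate names (`doubled`, `shifted`) are chosen for this illustration only.

```python
import numpy as np

r = np.array([
    [1, 2, 3],
    [4, 5, 6]])

# Step 1: multiply every element by 2
doubled = r * 2

# Step 2: add 3 to every element
shifted = doubled + 3

# Step 3: take the square root of every element
q_stepwise = np.sqrt(shifted)

# The same three steps written on one line, as in the cell above
q_one_line = np.sqrt((r*2)+3)

print(doubled)
print(shifted)
print("Same result both ways:", np.array_equal(q_stepwise, q_one_line))
```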

Note: for the remainder of this workshop, we will be using lists to highlight other Python features.

***
Lists (and arrays) can be looped through to manipulate the data. To do this, we will use a `for` loop. 

A **`for` loop** loops through the data in a list or array, one element at a time, until either there is no more data to go through or a prescribed amount of data has been looped through.

Let's make a `for` loop to multiply the elements of list `x` by 2: 

In [None]:
# First, create a new, empty list to append the output of the 'for' loop to:
x_modified = []

# Use a 'for' loop to modify list x:
for i in x:
    
    # Perform the operation:
    j = i*2
    
    # Append the new value to x_modified:
    x_modified.append(j)
    
print("This was the original list 'x': ", x)
print("Here is the modified list 'x': ", x_modified)
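
There are a few common variations on this pattern. The sketch below shows three of them, using variable names invented for this example: a list comprehension (a compact way to build a new list), a loop over only a prescribed number of elements, and a loop that also tracks each element's position.

```python
x = [2, 4, 5, 6, 8]  # same values as list x above

# A list comprehension builds the new list in a single line
x_doubled = [i * 2 for i in x]

# Slicing with [:3] loops through only the first three elements
first_three_doubled = []
for i in x[:3]:
    first_three_doubled.append(i * 2)

# enumerate() provides the index of each element along with its value
positions = []
for index, value in enumerate(x):
    positions.append(index)
    print("Index", index, "holds the value", value)

print(x_doubled)
print(first_three_doubled)
```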

***

Another way to process data is via **conditional statements**. These include `if`, `elif`, and `else`. 

The `if` statement defines a block of code that will only be executed if a specific condition is `True`.

If you have multiple conditions, you can use the `elif` statement, which is short for 'else-if'. This means if the initial `if` statement is not met, then the program tries the next condition ("else-if").

If no condition is met, you can assign a default output using `else`.

Let's see this in action by dividing 1 by a variety of numbers.

In [None]:
# Establish a list of numbers to divide 1 by:
numbers = [-2, -1, 0, 1, 2]

# Use a 'for' loop to loop through numbers:
for i in numbers:
    
    # If i > 0, print that the result is positive, then print the result:
    if i > 0:
        
        print('\nWhen 1 is divided by {0:1d}, the result will be positive.'.format(i))
        
        result = 1/i
        
        print('The resulting number is: ', result)
        
    # If i < 0, print that the result is negative, then print the result:
    elif i < 0:
        
        print('\nWhen 1 is divided by {0:1d}, the result will be negative.'.format(i))
        
        result = 1/i
        
        print('The resulting number is: ', result)
        
    # Assume any other number is 0, which cannot be in the denominator:
    else:
        
        print('\nYou cannot divide by zero. That is undefined. No. Just... no.')

Conditional statements can also be written all on one line, which makes the code more compact (it does not make it run faster). Here's a quick example:

In [None]:
# Will use same list of numbers as previous cell:
for i in numbers:
    
    # If i < 0, square it. Otherwise, add 1 and cube it.
    z = i**2 if i < 0 else (i+1)**3    # Named z (not x) so that list x from earlier is not overwritten
    
    print(i, '-->', z)
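
The one-line form is shorthand for an ordinary `if`/`else` block. As a quick check, the sketch below runs both forms side by side and confirms they give the same results. The list names and the variable `w` are made up for this comparison.

```python
numbers = [-2, -1, 0, 1, 2]  # same values as the list above

one_line_results = []
multi_line_results = []

for i in numbers:

    # One-line form: value_if_true if condition else value_if_false
    z = i**2 if i < 0 else (i+1)**3
    one_line_results.append(z)

    # Equivalent multi-line form
    if i < 0:
        w = i**2
    else:
        w = (i+1)**3
    multi_line_results.append(w)

print(one_line_results)
print("Both forms agree:", one_line_results == multi_line_results)
```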

***
Next, let's plot some data. 

We will use matplotlib to plot our two lists from before, `x` and `y`, in a basic line plot. The output will be displayed on the screen, or can be saved as a .png.

Let's create the line plot:

In [None]:
# Create a new figure:
fig = plt.figure(figsize = (8,6)) #figsize = (horizontal, vertical) [in inches]

# Add a fresh axis to the figure:
ax = fig.add_subplot(111) #(111) = 1x1 panel plot, 1st panel. (221) would be a 2x2 panel plot, 1st panel (upper-left)

# Plot the data:
plt.plot(x, y, color = 'red')

# Label the axes:
plt.xlabel('List x')
plt.ylabel('List y')

# Set axes limits:
plt.xlim(0,10)
plt.ylim(0,10)

# Draw gridlines:
plt.grid(which = 'major', axis = 'both')

# Set figure title: 
ax.set_title('List x vs List y', size = 14)

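# Save the figure (uncomment to use). Save BEFORE plt.show(), otherwise the saved file may be blank:
#plt.savefig('plot.png', dpi=350)

# Show the finished result:
plt.show()

# Close the plot (VERY IMPORTANT WHEN MAKING MULTIPLE PLOTS!!! If plots aren't closed, they stay in memory, and your PC may run out of memory and crash!!!)
#plt.close()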
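
The cell above mixes `plt.` calls with `ax.` calls, which works for a single plot. As an alternative sketch, the same figure can be built entirely through the `fig` and `ax` objects, which tends to be clearer once a figure has several panels. The file name `list_x_vs_list_y.png` is a placeholder chosen for this example.

```python
import matplotlib.pyplot as plt

x = [2, 4, 5, 6, 8]  # same values as list x above
y = [5, 3, 7, 4, 2]  # same values as list y above

# Create the figure and a single set of axes in one call
fig, ax = plt.subplots(figsize=(8, 6))

# Plot, label, and style everything through the axes object
ax.plot(x, y, color='red')
ax.set_xlabel('List x')
ax.set_ylabel('List y')
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.grid(which='major', axis='both')
ax.set_title('List x vs List y', size=14)

# Save first (placeholder file name), then show, then close
fig.savefig('list_x_vs_list_y.png', dpi=350)
plt.show()
plt.close(fig)
```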

***

**Exercise:** Below is a list of temperatures in Fahrenheit. Using the tools we learned today, convert the temperatures into Celsius and plot a graph of Celsius vs Fahrenheit temperatures.

Recall:
<br>
$C = (F - 32) \times \frac{5}{9}$