# Data Structures

When I first picked up Python, I did so coming from Matlab.  Most of my attempts to use it started with a Google Search of "how do I solve a system of equations in Python" or "How do I solve Initial Value Problems in Python."

These types of questions are probably the reason most of us want to use a programming language.  Both of those questions will tend to push you toward two Python packages in particular: `Numpy` and `Scipy`.  Both of them are pretty awesome; they replicate most of the tools you're probably familiar with from Matlab.  It is tempting to restrict yourself to the Numpy environment because, in Numpy, matrices, rows, etc. behave much like they do in Matlab.  However, Numpy and Scipy are subspaces within the considerably more general Python language.  If you work in Python more broadly, you are likely to encounter places where others aren't restricting themselves strictly to Numpy-like operations and instead work in the base Python environment, where matrices really don't exist.  Moreover, since Numpy and Scipy routines are still Python at heart, some of their syntax may require you to know a bit about the base Python environment.

Initially, when I picked up Python, I was confused by the different types of data structures, why so many existed, how they were different, and where each could be used. My confusion eased considerably after I learned how these different data structures behave, so I think it is worth teaching `lists`, `tuples`, and `dictionaries` in addition to the `numpy arrays` that most of us will generally want to use for math and engineering programs.

## Lists

A list is an indexed, **mutable** collection of "items" that are ordered, i.e., each entry has a unique index. The "items" in a list can include any type of scalar or any type of collection, and there are really few restrictions on what can be stored in a list structure. Because lists are **mutable**, we can change their contents after they are created.

### Creating a list - use brackets `[]`

In [None]:
G = [1, 2, 3] #A list of integers
H = ['dog','cat','bat'] #A list of strings
I = [[1.6, 2.7, 3.4], [4.4, 5.2, 6.8]] #A list of lists; each list is a list of floats.
OH_WOW = [G, H, I] #Lists in python are neat because the elements don't have to be consistent type or size -- "structure" 

In [None]:
print(G)
print(H)
print(I)
print(OH_WOW)

#### The List Constructor

In [None]:
J = list([1, 2, 3])  
L = list(range(0, 11, 1)) #range(start, stop, stepsize)
print(J, '\n')
print(type(J), '\n')
print(L, '\n')
print(type(L))

### Lists are mutable

In [None]:
print(J)
J[1] = 4
print(J)

### The many ways to manipulate Lists

You can find a guide to these methods at the very awesome https://docs.python.org/3.10/tutorial/datastructures.html

Here is just a basic example of the `listname.methodname()` convention that these methods follow; we will use the append method to add the number 443 to the list called **J** that we previously modified. We will then sort J in ascending order.  We print the results of each manipulation. Enter the following commands to get a feel for how list methods work:

<div class = "alert alert-block alert-warning">
    <b>Mutability:</b> As you will see in the cell below, each of these methods will modify a list in place, i.e., they change the contents of a list without you specifically redefining the list. Be careful with them because lists are mutable!
    </div>

In [None]:
print(J)
J.append(443)
print(J)
J.sort()
print(J)
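
To make the in-place behavior concrete, here is a minimal sketch that contrasts the built-in `sorted()` function, which returns a new list, with the `.sort()` method, which reorders the existing list and returns `None`. The names `scores`, `ordered_copy`, `result`, and `alias` are hypothetical and used only for this illustration.

```python
# Hypothetical list used only for this illustration
scores = [3, 1, 2]

# sorted() builds a new list and leaves the original untouched
ordered_copy = sorted(scores)
print(ordered_copy)  # [1, 2, 3]
print(scores)        # [3, 1, 2]

# list.sort() reorders the original list in place and returns None
result = scores.sort()
print(scores)        # [1, 2, 3]
print(result)        # None

# A second name bound to the same list sees every in-place change
alias = scores
alias.append(99)
print(scores)        # [1, 2, 3, 99]
```

A common slip is writing `scores = scores.sort()`, which replaces the list with `None`.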

### Dimensionality and shape of lists

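Now let's take a closer look at indexing in lists.  ***Lists are not matrices***. Calling `len()` on a nested list reports only the number of top-level elements (here, the two inner lists), not a (rows, columns) shape.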

In [None]:
print(I)
print(len(I))
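
As a small sketch of why a nested list has no single "shape", the example below measures the outer list and an inner list separately, and then builds a "ragged" list whose inner lists have different lengths. The names `nested`, `ragged`, and `row_lengths` are hypothetical.

```python
# A nested list with the same contents as I above
nested = [[1.6, 2.7, 3.4], [4.4, 5.2, 6.8]]

print(len(nested))     # 2 -> number of inner lists
print(len(nested[0]))  # 3 -> number of items in the first inner list

# Nothing forces the inner lists to have the same length
ragged = [[1, 2, 3], [4, 5]]
row_lengths = [len(row) for row in ragged]
print(row_lengths)     # [3, 2]
```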

### List Indexing

Let's say I want to extract the number 3.4 out of this list.  I would do so by referencing its index:  It is the 3rd element inside of the first list; therefore, I would call its value by typing:

In [None]:
print(I[0][2]) #Indexing the 3rd element in the first list (index 0 --> index 2)
# print(I[0, 2]) #(row, column) Matrix-like indexing is not supported with lists

### Slicing a List

In [None]:
print(I)
print(I[0][:])   #All elements in first list in I
print(I[0][0:2]) #Index 0 and 1 in the first list in I, excludes last index
print(I[0][1:])  #Index 1 to the end of the first list in I
print(I[-1][:])  #All elements in last list in I

## Tuples

A tuple is an **immutable** collection of "items" that are ordered and arranged by index. Lack of mutability means that, like a string, we **cannot** change the contents of a tuple once it is defined. If you need to replace an element in that tuple, you have to redefine the tuple.

### Creating a Tuple


In [None]:
A = (1.0, 2.0, 3.0, 4.0)                 #creates a tuple called A that contains four floats
B = ('apples', 'oranges', 'bananas')     #creates a tuple called B that contains 3 different strings
C = tuple(range(3, 22, 3))               #creates a tuple of the numbers from 3 to 21 in increments of 3
D = (('one', 'two', 'three'), [4, 5, 6]) #creates a tuple that contains one tuple and one list
E = (A, B, C, D)                         #creates a tuple that includes four other tuples: A, B, C, and D

In [None]:
print(A)
print(B)
print(C)
print(D)
print(E)

### Indexing with tuples

In [None]:
print(B[0]) #First element of B

In [None]:
All_Her_Favorite_Fruit = (B[0],B[2]) #Create a new tuple from elements of B
print(All_Her_Favorite_Fruit)

### The immutability of tuples


A very important property of tuples is that they are **immutable**, so if I see that I mistakenly included bananas instead of kiwis in my tuple, I cannot redefine that element as:

```python
B[2] = 'kiwis'
```

which will produce an error.  Instead, I would have to define a new tuple:

In [None]:
print(B)
# B[2] = 'kiwis'
oops = ('apples', 'oranges', 'kiwis')
print(oops)
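
If you would like to see the error without halting the notebook, one option is to wrap the assignment in `try`/`except`. This is only a sketch; the names `fruits`, `message`, and `fixed` are hypothetical, and building the replacement by slicing is just one of several ways to define the new tuple.

```python
# Hypothetical tuple with the same contents as B above
fruits = ('apples', 'oranges', 'bananas')

message = None
try:
    fruits[2] = 'kiwis'  # item assignment is not allowed on a tuple
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# The original tuple is unchanged
print(fruits)

# Build a new tuple from the first two items plus the corrected one
fixed = fruits[:2] + ('kiwis',)
print(fixed)  # ('apples', 'oranges', 'kiwis')
```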

## Dictionaries

Dictionaries are arranged as a set of *key:value* pairs (just like a real dictionary has a word:definition pair). You store key:value pairs when you create a dictionary, and when you need to recall a specific value, you do so by passing the key associated with that value into the dictionary name.

### Creating a Dictionary using `{}`

In [None]:
GD_77 = {'lead_guitar': 'Jerry', 'rhythm_guitar': 'Bobby', 'bass': 'Phil', 'keys': 'Keith', 'drums': 'Kreutzmann', 'other_drums': 'Mickey', 'vox': 'Donna'}
print(GD_77)

### Creating a Dictionary using `dict([(), (), ...])`

In [None]:
GD_1977 = dict([('lead_guitar', 'Jerry'), ('rhythm_guitar', 'Bobby'), ('bass', 'Phil'), ('keys', 'Keith'), ('drums', 'Kreutzmann'), ('other_drums', 'Mickey'), ('vox', 'Donna')])
print(GD_1977)

### Indexing in Dictionaries

In [None]:
GD_77['bass']

### Dictionaries are mutable

In [None]:
GD_77['lyrics'] = 'Hunter'
print(GD_77)

In [None]:
GD_77['bass'] = 'Oteil'
print(GD_77)
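
Looking up a key that was never stored raises a `KeyError`, so it can be useful to check for a key before indexing with it. The sketch below uses a small, hypothetical `band` dictionary (not `GD_77`) to show a membership test, the `.get()` method with a fallback value, and a loop over the key:value pairs.

```python
# Hypothetical dictionary used only for this illustration
band = {'bass': 'Phil', 'keys': 'Keith'}

# Membership tests look at the keys, not the values
has_drums = 'drums' in band
print(has_drums)  # False

# .get() returns a fallback instead of raising KeyError for a missing key
fallback = band.get('drums', 'unknown')
print(fallback)   # unknown

# .items() yields each key:value pair in turn
pairs = []
for role, name in band.items():
    pairs.append((role, name))
    print(role, '->', name)
```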

## NDArrays/Numpy Arrays

NDarrays are **mutable**, ordered collections in which each element has a specific index. They can include any type of scalar or collection, so they are superficially similar to lists, but they have some important differences. One of the major ones, in simple terms, is that you can *generally* perform mathematical operations directly on an array, whereas you cannot *generally* perform mathematical operations directly on a list. If you are familiar with Matlab, the "array" class in Python behaves similarly to the vectors and matrices that you're used to working with in Matlab (whereas lists, tuples, and dictionaries do not). To me, coming from a Matlab background, working with lists *feels* awkward, but working with arrays feels familiar. We can probably get away with mostly working with arrays in engineering courses; however, there are going to be certain cases where we need to deal with lists, tuples, or dictionaries, so it is important to know how they differ.

In Python, the most robust array support is provided by the **numpy** package, so this is a good place to introduce an aspect of Python that differs from something like Matlab. In a commercial math software package like Matlab, we mostly use features that are built into the language and accessible by default. In contrast, we frequently will need to add functionality to Python that is missing from the base. We do this by adding packages.

<div class = "alert alert-block alert-info">
    <b>Note:</b> Anaconda and Colab install most of the packages we'll need by default; we just need to add them to a particular session using the import command. If you find yourself using a standalone Python installation, or if you need packages outside of what is included in Anaconda, you may need to install packages manually using either the **pip** or **conda** package manager.
    </div>
    
We will use the import command now to bring in the numpy package, which includes very nice array support. To me, numpy *feels* like importing a Matlab-like environment into Python. It basically enables matrix support and linear algebra, so it provides an environment that is similar to Matlab, and it includes a lot of commands and modules that you may already be familiar with. Module 03 provides a detailed look at numpy arrays, but we introduce their construction here alongside other common types of data structures that we'll need to use in Python.

### Importing Numpy

At its most basic, we import a package (numpy in this case) as follows:

In [None]:
import numpy

### Creating a Numpy array (ndarray) using `numpy.array([])`

Now that I've imported numpy, I can use it to create an array with the `array()` constructor in numpy. This constructor takes either a list [] or a tuple () as its argument. You'll note similarities with the `list()` and `tuple()` constructors introduced earlier.

In [None]:
K = numpy.array([0, 10, 20])
# K = numpy.array((0, 10, 20)) #The two are equivalent in their output

In [None]:
print(K)
print(type(K))
print(len(K))

### Aliasing your imports

In [None]:
import numpy as np

In [None]:
M = np.array([1, 2, 3, 4])
print(M)

### Importing only selected  features or functions from a package


In [None]:
from numpy import array

In [None]:
N = array([1, 2, 3, 4])
print(N)

### Indexing in arrays

Generally, numpy arrays support all of the list indexing conventions.  They also support the "(row, column)" indexing you might be used to with matrices; we'll cover this with 2D arrays later.

In [None]:
N[3]
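
As a quick sketch of both conventions, the example below applies list-style indexing and slicing to a 1D array, and then compares chained indexing with "(row, column)" indexing on a small 2D array. The names `v` and `grid` are hypothetical; 2D arrays are treated properly later.

```python
import numpy as np

# Hypothetical 1D array: list-style indexing and slicing both work
v = np.array([1, 2, 3, 4])
first = v[0]     # first element
last = v[-1]     # last element
middle = v[1:3]  # indices 1 and 2; the stop index is excluded
print(first, last, middle.tolist())  # 1 4 [2, 3]

# Hypothetical 2D array: chained and (row, column) indexing agree
grid = np.array([[1, 2, 3], [4, 5, 6]])
chained = grid[0][2]  # list-style: first row, then third item
matrix = grid[0, 2]   # matrix-style: row 0, column 2
print(chained, matrix)  # 3 3
```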

### Mutability of arrays

In [None]:
print(N)
N[3] = 10
print(N)

### Built-in functions that manipulate and mutate arrays

There are many functions that allow manipulation of an array (an overwhelming number).  My suggestion: look them up when you need them.

https://numpy.org/doc/stable/reference/routines.array-manipulation.html

In [None]:
print(N)
N = numpy.append(N, 25)
print(N)
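
One detail worth noticing in the cell above is that the result is assigned back to `N`. Unlike the list `.append()` method, `numpy.append` returns a new array and leaves its input unchanged. A minimal sketch, using the hypothetical names `original` and `extended`:

```python
import numpy as np

# Hypothetical array used only for this illustration
original = np.array([1, 2, 3])

# np.append returns a new array; it does not modify its input
extended = np.append(original, 25)

print(original.tolist())  # [1, 2, 3]
print(extended.tolist())  # [1, 2, 3, 25]

# Assigning to an index, by contrast, does change the array in place
original[0] = 100
print(original.tolist())  # [100, 2, 3]
```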

## A significant difference between a list and a numpy array

In [None]:
O = [1, 2, 3, 4, 5]
P = np.array([1, 2, 3, 4, 5])
print(O, '', len(O))
print(P, '', len(P))

In [None]:
print(O + 5)   #error: raises TypeError because a list cannot be added to an integer
# print(O - 5)   #error
# print(O + O)   #concatenate
# print(O + [5]) #concatenate
# print(3*O)     #concatenate
# print(O**2)    #error
# print(P + 5)
# print(P - 5)
# print(P + P)
# print(2*P)
# print(P**2)
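
To summarize the comparison in one runnable cell, here is a sketch that performs the same two operations on a list and on an array. With a list, elementwise math needs an explicit loop or comprehension, and `*` repeats the list; with an array, both operations act on each element. The names `values`, `arr`, and the result variables are hypothetical.

```python
import numpy as np

# Hypothetical data used only for this illustration
values = [1, 2, 3, 4, 5]
arr = np.array(values)

# Adding 5 to every element
shifted_list = [x + 5 for x in values]  # a list needs a comprehension
shifted_arr = arr + 5                   # an array does this directly

# Multiplying by 2
repeated_list = 2 * values  # repeats the list: 10 items
doubled_arr = 2 * arr       # doubles each element: 5 items

print(shifted_list)          # [6, 7, 8, 9, 10]
print(shifted_arr.tolist())  # [6, 7, 8, 9, 10]
print(len(repeated_list))    # 10
print(doubled_arr.tolist())  # [2, 4, 6, 8, 10]
```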