![Energy Analytics Logo](https://github.com/jessepisel/energy_analytics/blob/master/EA_logo.jpg?raw=true)

# Introduction to Data Structures

## Freshman Research Initiative Energy Analytics CS 309

#### Jesse Pisel, Assistant Professor of Practice, University of Texas at Austin

**[Twitter](http://twitter.com/geologyjesse)** | **[GitHub](https://github.com/jessepisel)** | **[GoogleScholar](https://scholar.google.com/citations?user=Z4JzYgIAAAAJ&hl=en&oi=ao)** | **[LinkedIn](https://www.linkedin.com/in/jesse-pisel-70519430/)**

Now that we understand how variables work, and that we can assign different types of objects to variables, let's go ahead and learn about Python data structures. In this notebook we are going to cover the following data structures:
+ `list`
+ `tuple`
+ `dictionary`
+ `set`

Let's start with the `list`. Imagine for a second that you have a collection of numbers that you would like to store in memory, but you don't want to create a separate variable for each number. This is where a `list` comes in handy. Let's make our first `list`:

In [1]:
first_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10000]
# note the square brackets

Great! Now we have a `list` that contains 12 integers. Just like variables, we can create lists with `int`, `float`, `str`, or `bool` data types. Now that we have a `list`, we want to be able to access the different values in it. This is done with something called an `index`. Each item in the list is assigned an `index` value that starts at zero and increases by one for each item, so the last index is always one less than the length of the list. In `first_list`, the value 10000 is the 12th item, so it has an index of 11. Let's use the index to get that value

In [2]:
first_list[11]

10000
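
As a small sketch of how the index relates to the length of a list, the cell below uses the built-in `len` function and shows what happens when we ask for an index past the end. The list is repeated so the cell runs on its own; the names `n_items` and `last_index` are just illustrative.

```python
# Same list as above, repeated so this cell runs on its own
first_list = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 10000]

n_items = len(first_list)      # number of items in the list
last_index = n_items - 1       # the largest valid index

print(n_items)                 # 12
print(first_list[last_index])  # 10000

# Asking for an index past the end of the list raises an IndexError
try:
    first_list[n_items]
except IndexError as err:
    print("IndexError:", err)
```

Because the index starts at zero, a list with 12 items has valid indices 0 through 11.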

Now let's make a second list with strings

In [3]:
second_list = ["Texas", "New Mexico", "Colorado", "Wyoming", "Montana"]
# and let's use the index to get the third item 'Colorado'
second_list[2]  # remember the list index starts at 0

'Colorado'

We have seen how to build lists, but what if we want to add to `second_list`? We use the `append` method, which adds an item to the end of the list

In [4]:
second_list.append("Alberta")
second_list.append("Northwest Territories")

In [5]:
print(second_list)

['Texas', 'New Mexico', 'Colorado', 'Wyoming', 'Montana', 'Alberta', 'Northwest Territories']


We can also get items by counting from the end of the list with a negative index. Let's say we want the second-to-last item in the list (`Alberta`). We use:

In [6]:
second_list[-2]

'Alberta'
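
To make the link between negative and regular indices concrete, here is a short sketch. It assumes the list as it looks after the two `append` calls above; `from_end` and `from_start` are illustrative names.

```python
# second_list as it looks after the two append calls above
second_list = ["Texas", "New Mexico", "Colorado", "Wyoming",
               "Montana", "Alberta", "Northwest Territories"]

# A negative index counts back from the end of the list
from_end = second_list[-2]

# The same item, found by counting from the start instead
from_start = second_list[len(second_list) - 2]

print(from_end)         # Alberta
print(from_start)       # Alberta
print(second_list[-1])  # Northwest Territories (the last item)
```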

Our next data structure type is the `tuple`. Tuples are like lists, but they are immutable. This means that you cannot modify them after they are created.

In [7]:
first_tuple = (3.14, 2.78, -1, 1000)
# note the parentheses, not necessary but good to use

In [8]:
# we can use indexing to call values from a tuple
first_tuple[1]

2.78

Tuples are used in cases where we can assume that the collection of values will not change. 
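
To see what immutable means in practice, the sketch below tries to change a value in a tuple and catches the error Python raises. The tuple is repeated so the cell runs on its own, and `longer_tuple` is just an illustrative name.

```python
# Same tuple as above, repeated so this cell runs on its own
first_tuple = (3.14, 2.78, -1, 1000)

# Trying to change an item in a tuple raises a TypeError
try:
    first_tuple[0] = 3.14159
except TypeError as err:
    print("TypeError:", err)

# The original tuple is unchanged
print(first_tuple)

# If we need different values, we build a new tuple instead
longer_tuple = first_tuple + (42,)
print(longer_tuple)
```

Note that `first_tuple + (42,)` does not modify `first_tuple`; it creates a brand new tuple.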

Next up on our Python data structures is a `dictionary`. Dictionaries use key-value pairs to store the data in memory. This means that instead of using an index to call a value, we can use the key to call the value.

In [9]:
first_dict = {"State": "Texas", "City": "Austin", "Temperature": 95}
# note curly brackets

Let's call the value `Texas` from the `dict` using its key

In [10]:
first_dict["State"]

'Texas'

So instead of using an index value, we simply use a key to look up the value. But how do we add additional values to the `dict` like we did to the `list` above? It's actually pretty easy: we put the new key in square brackets and assign a value to it

In [11]:
first_dict["Hot"] = True

In [12]:
print(first_dict)

{'State': 'Texas', 'City': 'Austin', 'Temperature': 95, 'Hot': True}
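
The same square-bracket assignment also replaces the value of a key that already exists. The sketch below shows that, along with two common ways to handle a key that might be missing. The new temperature and the `"County"` key are made up for illustration.

```python
# Same starting dictionary as above, repeated so this cell runs on its own
first_dict = {"State": "Texas", "City": "Austin", "Temperature": 95}

# Assigning to an existing key replaces its value
first_dict["Temperature"] = 101
print(first_dict["Temperature"])  # 101

# `in` checks whether a key exists
has_city = "City" in first_dict
has_county = "County" in first_dict
print(has_city, has_county)       # True False

# get returns a fallback value instead of raising a KeyError
county = first_dict.get("County", "unknown")
print(county)                     # unknown
```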


Lastly, let's talk about the `set`. A set is an unordered collection of unique objects. We use sets when we are more interested in whether an object exists in a collection than in its order or the number of times it appears.

In [13]:
first_set = set(["porosity", "permeability", "velocity", "api"])
# note set() and square brackets

In [14]:
"porosity" in first_set

True

We can add or remove items from the set using `add` and `remove`

In [15]:
first_set.add("gamma ray")
first_set  # the display may look sorted, but a set has no guaranteed order

{'api', 'gamma ray', 'permeability', 'porosity', 'velocity'}

In [16]:
first_set.remove("api")
first_set

{'gamma ray', 'permeability', 'porosity', 'velocity'}
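
Because a set only tracks whether an item exists, duplicates collapse into a single entry. The sketch below shows this with a made-up list called `logs`, and also shows that `remove` raises an error for a missing item while `discard` does not.

```python
# A made-up list with repeated entries
logs = ["porosity", "permeability", "porosity", "velocity", "porosity"]

# Converting to a set keeps one copy of each item
unique_logs = set(logs)
print(len(logs), len(unique_logs))  # 5 3

# discard does nothing if the item is not in the set
unique_logs.discard("api")

# remove raises a KeyError if the item is not in the set
try:
    unique_logs.remove("api")
except KeyError as err:
    print("KeyError:", err)
```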

Now that we have covered the `list`, `dict`, `tuple`, and `set`, let's take a look at how we can use a `for` loop to go through data structures. We will start with list iteration on our `second_list` from above

In [17]:
for location in second_list:
    print(location)

Texas
New Mexico
Colorado
Wyoming
Montana
Alberta
Northwest Territories


How about the `first_dict`?

In [18]:
for key in first_dict:
    print("%s --> %s" % (key, first_dict[key]))

State --> Texas
City --> Austin
Temperature --> 95
Hot --> True


We can also use the `items` method, which gives us each key together with its value

In [19]:
for key, value in first_dict.items():
    print("%s --> %s" % (key, value))

State --> Texas
City --> Austin
Temperature --> 95
Hot --> True


Can you create a `for` loop to loop through the `first_tuple` above to print out the values?