# Introduction to Python Programming Language: Data Structures

## What are Data Structures?
Data structures are containers that organize and group data types together in different ways.

In this lesson, you'll learn about:

* Lists
* Tuples
* Sets
* Dictionaries


## Lists

A list is one of the most common and basic data structures in Python.

We can create a list with square brackets. Lists can contain any mix and match of the data types you have seen so far.

In [273]:
list_of_random_things = [1, 3.4, 'a string', True]

This is a list of 4 elements. All ordered containers (like lists) are indexed in Python starting at 0. Therefore, to pull the first value from the list above, we can write:

In [274]:
list_of_random_things[0]

1

You can index from the end of a list by using negative values, where -1 is the last element, -2 is the second to last element and so on.

In [275]:
list_of_random_things[-1]

True

In [276]:
list_of_random_things[-2]

'a string'

### Slice and Dice with Lists

We can pull more than one value from a list at a time by using slicing. When slicing, it is important to remember that the lower index is inclusive and the upper index is exclusive.

In [277]:
list_of_random_things = [1, 3.4, 'a string', True]

list_of_random_things[1:2]

[3.4]

This returns only 3.4, wrapped in a list. Notice that this is still different from indexing a single element, because slicing gives you a list back. The colon tells us to go from the starting index on the left of the colon up to, but not including, the index on the right.

If you know that you want to start at the beginning of the list, you can leave out the starting index.

In [278]:
list_of_random_things[:2]

[1, 3.4]

Or, to return all of the elements from a given index to the end of the list, we can leave off the upper index.

In [279]:
list_of_random_things[1:]

[3.4, 'a string', True]

This type of indexing works exactly the same on strings, where the returned value will be a string.
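As a quick illustration of slicing applied to a string, here is a minimal sketch. The variable `sample_text` is a made-up example, not something defined earlier in the lesson.

```python
# Slicing a string follows the same rules as slicing a list:
# the lower index is inclusive and the upper index is exclusive.
sample_text = "data structures"  # hypothetical example string

first_four = sample_text[:4]  # characters at indices 0 through 3
from_five = sample_text[5:]   # index 5 through the end of the string
last_char = sample_text[-1]   # negative indices work on strings too

print(first_four)  # data
print(from_five)   # structures
print(last_char)   # s
```

Each slice here is itself a string, just as slicing a list gives back a list.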

### Are you `in` OR `not in`?
You saw that we can also use `in` and `not in` to return a bool of whether an element exists within our list, or if one string is a substring of another.

In [280]:
'this' in 'this is a string'

True

In [281]:
'isa' in 'this is a string'

False

In [282]:
5 not in [1, 2, 3, 4, 6]

True

In [283]:
5 in [1, 2, 3, 4, 6]

False

### Mutability

Mutability is about whether or not we can change an object once it has been created. If an object can be changed after creation (as a list can), it is called mutable. If an object cannot be changed without creating a completely new object (as with strings), it is considered immutable.

In [284]:
my_lst = [1, 2, 3, 4, 5]
my_lst[0] = 'one'
print(my_lst)

['one', 2, 3, 4, 5]


As shown above, you are able to replace 1 with 'one' in the above list. This is because lists are mutable.

However, the following does not work:

In [285]:
greeting = "Hello there"
# greeting[0] = 'M'  # uncommenting this line raises a TypeError

This is because strings are immutable. This means to change this string, you will need to create a completely new string.
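One way to see both halves of this is the sketch below. It assumes Python 3 and introduces the names `assignment_worked` and `new_greeting` purely for illustration.

```python
greeting = "Hello there"

# Item assignment on a string raises TypeError because strings are immutable.
try:
    greeting[0] = 'M'
    assignment_worked = True
except TypeError:
    assignment_worked = False

# To "change" the string, build a completely new one from pieces of the old.
new_greeting = 'M' + greeting[1:]

print(assignment_worked)  # False
print(new_greeting)       # Mello there
print(greeting)           # Hello there (the original is unchanged)
```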

### Useful Functions and Methods for Lists

`len()` returns how many elements are in a list.


In [286]:
fruits = ["apple", "eggplant", "banana", "mango", "dragonfruit", "cherry"]

print(len(fruits))

6


`max()` returns the greatest element of the list. How the greatest element is determined depends on the type of the objects in the list. The maximum element in a list of numbers is the largest number. The maximum element in a list of strings is the element that would occur last if the list were sorted alphabetically. This works because the max function is defined in terms of the greater than comparison operator. For a list whose elements are of different, incomparable types (such as a mix of numbers and strings), max raises a TypeError.


In [287]:
print(max(fruits))

mango
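The sketch below shows `max()` on numbers, on strings, and on a list of incomparable types. It assumes Python 3, and the lists `numbers`, `words`, and `mixed` are made up for illustration.

```python
numbers = [3, 7, 5]
words = ["pear", "fig"]
mixed = [1, "two", 3.0]  # hypothetical list mixing numbers and a string

print(max(numbers))  # 7
print(max(words))    # pear

# Comparing a string with a number is not supported, so max() raises TypeError.
try:
    largest = max(mixed)
except TypeError as error:
    largest = None
    print("Cannot compare:", error)
```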


`min()` returns the smallest element in a list. min is the opposite of max, which returns the largest element in a list.


In [288]:
print(min(fruits))

apple


`sorted()` returns a copy of a list in order from smallest to largest, leaving the list unchanged.

In [289]:
print(sorted(fruits))

['apple', 'banana', 'cherry', 'dragonfruit', 'eggplant', 'mango']
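To confirm that `sorted()` leaves the original list alone, here is a small sketch. It also uses the optional `reverse=True` argument, which this lesson has not covered, to sort from largest to smallest.

```python
fruits = ["apple", "eggplant", "banana", "mango", "dragonfruit", "cherry"]

# sorted() builds a new list; reverse=True sorts from largest to smallest.
descending = sorted(fruits, reverse=True)

print(descending)
# ['mango', 'eggplant', 'dragonfruit', 'cherry', 'banana', 'apple']

# The original list keeps its original order.
print(fruits)
# ['apple', 'eggplant', 'banana', 'mango', 'dragonfruit', 'cherry']
```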


`join()` is a string method that takes a list of strings as an argument, and returns a string consisting of the list elements joined by a separator string.

In [290]:
name = "-".join(["E", "mail"])
print(name)

E-mail
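Here is a further sketch of `join()` with a different separator. It also shows that every element must be a string; the lists and the names `comma_separated`, `join_failed`, and `dashed` are made up for illustration.

```python
fruits = ["apple", "banana", "cherry"]  # hypothetical, shortened list

# The separator string goes between each pair of elements.
comma_separated = ", ".join(fruits)
print(comma_separated)  # apple, banana, cherry

numbers = [1, 2, 3]

# join() only accepts strings, so a list of integers raises TypeError.
try:
    "-".join(numbers)
    join_failed = False
except TypeError:
    join_failed = True

# Converting each number to a string first makes the join work.
dashed = "-".join(str(n) for n in numbers)
print(dashed)  # 1-2-3
```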



`append()` adds an element to the end of a list.

In [291]:
letters = ['a', 'b', 'c', 'd']
letters.append('e')
print(letters)

['a', 'b', 'c', 'd', 'e']
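Because lists are mutable, `append()` changes the list in place rather than returning a new list. The sketch below illustrates this; the name `result` is introduced only for the example.

```python
letters = ['a', 'b', 'c', 'd']

# append() modifies the existing list and returns None.
result = letters.append('e')

print(result)        # None
print(letters)       # ['a', 'b', 'c', 'd', 'e']
print(len(letters))  # 5
```

A consequence is that writing `letters = letters.append('e')` would replace the list with `None`.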


## Tuples
A tuple is another useful container. It's a data type for immutable ordered sequences of elements. Tuples are often used to store related pieces of information. Consider this example involving latitude and longitude:

In [292]:
location = (13.4125, 103.866667)
print("Angkor Wat is at latitude:", location[0])
print("Angkor Wat is at longitude:", location[1])

Angkor Wat is at latitude: 13.4125
Angkor Wat is at longitude: 103.866667


Tuples are similar to lists in that they store an ordered collection of objects which can be accessed by their indices.

Unlike lists, however, tuples are immutable - you can't add and remove items from tuples, or sort them in place.
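The sketch below shows what immutability means in practice, assuming Python 3. The names `assignment_worked` and `sorted_coords` are introduced only for illustration.

```python
location = (13.4125, 103.866667)

# Item assignment on a tuple raises TypeError because tuples are immutable.
try:
    location[0] = 0.0
    assignment_worked = True
except TypeError as error:
    assignment_worked = False
    print("Tuples are immutable:", error)

# Tuples have no in-place sort, but sorted() returns a new list of the elements.
sorted_coords = sorted(location)

print(sorted_coords)  # [13.4125, 103.866667]
print(location)       # (13.4125, 103.866667)
```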

Tuples can also be used to assign multiple variables in a compact way.

In [293]:
dimensions = 52, 40, 100
length, width, height = dimensions
print("The dimensions are " + str(length) + " x " + str(width) + " x " + str(height))

The dimensions are 52 x 40 x 100


The parentheses are optional when defining tuples, and programmers frequently omit them if parentheses don't clarify the code.

In the second line, three variables are assigned from the content of the tuple dimensions. This is called tuple unpacking. You can use tuple unpacking to assign the information from a tuple into multiple variables without having to access the elements one by one and make multiple assignment statements.

If we don't need to use dimensions directly, we can shorten those two lines of code into a single line that assigns three variables in one go!



In [294]:
length, width, height = 52, 40, 100
print("The dimensions are " + str(length) + " x " + str(width) + " x " + str(height))

The dimensions are 52 x 40 x 100
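Two further uses of tuple unpacking are sketched below: swapping two variables, and what happens when the number of variables does not match the number of elements. The `unpack_failed` flag is a hypothetical name used only in this example.

```python
dimensions = (52, 40, 100)
length, width, height = dimensions

# The right-hand side is packed into a tuple, then unpacked, so this swaps values.
length, width = width, length
print(length, width, height)  # 40 52 100

# The number of variables must match the number of elements in the tuple.
try:
    a, b = dimensions
    unpack_failed = False
except ValueError as error:
    unpack_failed = True
    print("Unpacking failed:", error)
```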


## Sets

A set is a data type for mutable unordered collections of unique elements. One application of a set is to quickly remove duplicates from a list.

In [295]:
numbers = [1, 2, 6, 3, 1, 1, 6]
unique_nums = set(numbers)
print(unique_nums)

{1, 2, 3, 6}


Sets support the `in` operator the same way lists do. You can add elements to a set using the `add` method and remove elements using the `pop` method. However, when you pop an element from a set, an arbitrary element is removed. Remember that sets, unlike lists, are unordered, so there is no "last element".

In [296]:
fruit = {"apple", "banana", "orange", "grapefruit"}  # define a set

print("watermelon" in fruit)  # check for element

False


In [297]:
fruit.add("watermelon")  # add an element
print(fruit)

{'banana', 'watermelon', 'grapefruit', 'apple', 'orange'}


In [298]:
print(fruit.pop())  # remove an arbitrary element
print(fruit)

banana
{'watermelon', 'grapefruit', 'apple', 'orange'}
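Since the element removed by `pop()` and the printed order of a set can differ from run to run, the sketch below checks only properties that always hold. The name `removed` is introduced for illustration.

```python
fruit = {"apple", "banana", "orange", "grapefruit"}

# Adding an element that is already present leaves the set unchanged.
fruit.add("apple")
print(len(fruit))  # 4

# pop() removes and returns an arbitrary element, so we only check
# that whatever was removed is no longer in the set.
removed = fruit.pop()
print(removed in fruit)  # False
print(len(fruit))        # 3
```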


## Dictionaries

A dictionary is a mutable data type that stores mappings of unique keys to values. Here's a dictionary that stores elements and their atomic numbers.

In [299]:
elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

Dictionaries can have keys of any immutable type, like integers or tuples, not just strings. It's not even necessary for every key to have the same type! We can look up values or insert new values in the dictionary using square brackets that enclose the key.

In [300]:
print(elements["helium"])  # print the value mapped to "helium"
elements["lithium"] = 3  # insert "lithium" with a value of 3 into the dictionary

2
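Here is a sketch of a dictionary whose keys have different immutable types. The dictionary `mixed_keys` and its contents are made up for this example. A list is mutable, so using one as a key raises a TypeError.

```python
mixed_keys = {
    "carbon": 6,                          # string key
    6: "carbon",                          # integer key
    (13.4125, 103.866667): "Angkor Wat",  # tuple key
}

print(mixed_keys["carbon"])               # 6
print(mixed_keys[6])                      # carbon
print(mixed_keys[(13.4125, 103.866667)])  # Angkor Wat

# A list cannot be used as a dictionary key.
try:
    bad = {[1, 2]: "list key"}
    list_key_worked = True
except TypeError as error:
    list_key_worked = False
    print("Lists cannot be keys:", error)
```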


We can check whether a key is in a dictionary the same way we check whether a value is in a list or set: with the `in` keyword. Dicts have a related method that's also useful, `get`. `get` looks up the value for a key, but unlike square brackets, it returns None (or a default value of your choice) if the key isn't found.

In [301]:
print("carbon" in elements)
print(elements.get("dilithium"))

True
None


Carbon is in the dictionary, so True is printed. Dilithium isn't in our dictionary, so None is returned by get and then printed. If you expect lookups to sometimes fail, get might be a better tool than normal square bracket lookups, because a square bracket lookup on a missing key raises a KeyError, which can crash your program if it isn't handled.
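The sketch below puts the two lookup styles side by side and shows the optional default value for `get`. The default of `0` and the names `lookup_failed` and `with_default` are arbitrary choices for this example.

```python
elements = {"hydrogen": 1, "helium": 2, "carbon": 6}

# A square bracket lookup on a missing key raises KeyError.
try:
    elements["dilithium"]
    lookup_failed = False
except KeyError as error:
    lookup_failed = True
    print("KeyError for key:", error)

# get() returns None for a missing key, or the default passed as the second argument.
without_default = elements.get("dilithium")
with_default = elements.get("dilithium", 0)

print(without_default)  # None
print(with_default)     # 0
```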

### Identity Operators

| Operator | Description |
|---|---|
| `is` | evaluates whether both sides have the same identity |
| `is not` | evaluates whether both sides have different identities |

You can check whether a lookup returned None with the `is` operator. You can check for the opposite using `is not`.

In [302]:
n = elements.get("dilithium")
print(n is None)

True


In [303]:
print(n is not None)

False
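Identity is not the same thing as equality. The sketch below compares `is` with `==` using two separately created lists; the names `first`, `second`, and `alias` are made up for the example.

```python
first = [1, 2, 3]
second = [1, 2, 3]  # a separate list with equal contents
alias = first       # a second name for the same list object

print(first == second)  # True: the contents are equal
print(first is second)  # False: they are two different objects
print(first is alias)   # True: both names refer to the same object
```

This is why `is` suits the None check above: it asks whether `n` is the None object itself, not merely something equal to it.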
