
# Lesson 5 - Data Structures: Lists, Tuples, Dictionaries, and Sets

# 1. Lists

Lists can be thought of as the most general version of a sequence in Python. In this section we will learn about:

1.   Creating lists
2.   Indexing and Slicing Lists
3.   Basic List Methods
4.   Nesting Lists
5.   Introduction to List Comprehensions

# 1.1 Creating lists

Lists are constructed with brackets [] and commas separating every element in the list.

Let's go ahead and see how we can construct lists!

In [0]:
# Assign a list to a variable named my_list
my_list = [1,2,3]
print(my_list)

[1, 2, 3]


We just created a list of integers, but lists can actually hold different object types. For example:



In [0]:
my_list = ['A string',23,100.232,'o']
print(my_list)

['A string', 23, 100.232, 'o']


Just like with strings, the len() function tells you how many items are in the list.


In [0]:
len(my_list)

4

# 1.2 Indexing and Slicing
Indexing and slicing work just like in strings. Let's make a new list to remind ourselves of how this works:


In [0]:
my_list = ['one','two','three',4,5]
# Grab element at index 0
my_list[0]

'one'

In [0]:
# Grab index 1 and everything past it
my_list[1:]

['two', 'three', 4, 5]

In [0]:
# Grab everything UP TO (but not including) index 3
my_list[:3]

['one', 'two', 'three']
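
As a further sketch of the same slicing rules (the list `letters` is a made-up example, not part of the lesson data): negative indices count from the end of the list, and an optional third slice value sets the step.

```python
# Hypothetical example list, used only for illustration
letters = ['a', 'b', 'c', 'd', 'e']

last_item = letters[-1]         # negative index counts from the end
middle = letters[1:4]           # indices 1, 2 and 3 (index 4 is excluded)
every_other = letters[::2]      # step of 2
reversed_copy = letters[::-1]   # a reversed copy; letters itself is unchanged

print(last_item)       # e
print(middle)          # ['b', 'c', 'd']
print(every_other)     # ['a', 'c', 'e']
print(reversed_copy)   # ['e', 'd', 'c', 'b', 'a']
```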

We can also use + to concatenate lists, just like we did for strings.

In [0]:
my_list + ['new item']

['one', 'two', 'three', 4, 5, 'new item']

Note: This doesn't actually change the original list!

In [0]:
my_list

['one', 'two', 'three', 4, 5]

You would have to reassign the list to make the change permanent.

In [0]:
# Reassign
my_list = my_list + ['add new item permanently']
print(my_list)

['one', 'two', 'three', 4, 5, 'add new item permanently']


We can also use * to repeat a list, just as we did with strings:

In [0]:
# Make the list double
my_list * 2

['one',
 'two',
 'three',
 4,
 5,
 'add new item permanently',
 'one',
 'two',
 'three',
 4,
 5,
 'add new item permanently']

In [0]:
# Doubling not permanent
my_list

['one', 'two', 'three', 4, 5, 'add new item permanently']

# 1.3 Basic List Methods
If you are familiar with another programming language, you might start to draw parallels between arrays in that language and lists in Python. Lists in Python, however, tend to be more flexible than arrays in other languages for two good reasons: they have no fixed size (meaning we don't have to specify how big a list will be), and they have no fixed type constraint (as we've seen above).

Let's go ahead and explore some more special methods for lists:

In [0]:
# Create a new list
list1 = [1,2,3]

Use the append method to permanently add an item to the end of a list:

In [0]:
# Append
list1.append('append me!')
print(list1)

[1, 2, 3, 'append me!']


Use pop to "pop off" an item from the list. By default pop removes and returns the last item, but you can also specify which index to pop. Let's see an example:

In [0]:
# Pop off the 0 indexed item
list1.pop(0)
print(list1)

[2, 3, 'append me!']


In [0]:
# Assign the popped element, remember default popped index is -1
popped_item = list1.pop()
print(popped_item)
print(list1)

append me!
[2, 3]


Note that indexing a list raises an IndexError if there is no element at that index. For example:

In [0]:
list1[100]

IndexError: list index out of range
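
If an index might be out of range, one option is to catch the error. The following is a minimal sketch, assuming that falling back to `None` is acceptable for your program; the name `value` is only illustrative.

```python
list1 = [2, 3]

try:
    value = list1[100]
except IndexError:
    # There is no element at index 100, so fall back to a default
    value = None

print(value)  # None
```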

In [0]:
new_list = ['a','e','x','b','c']

# Use reverse to reverse order (this is permanent!)
new_list.reverse()
print(new_list)

['c', 'b', 'x', 'e', 'a']


In [0]:
# Use sort to sort the list (in this case alphabetical order, but for numbers it will go ascending)
new_list.sort()
print(new_list)

['a', 'b', 'c', 'e', 'x']
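
Because sort() and reverse() change the list in place, it can help to compare them with the built-in sorted() function, which returns a new list instead. A short sketch (the variable names `sorted_copy`, `descending` and `result` are assumptions for illustration):

```python
new_list = ['c', 'b', 'x', 'e', 'a']

sorted_copy = sorted(new_list)                # new list; new_list is unchanged
descending = sorted(new_list, reverse=True)   # new list in descending order
unchanged = list(new_list)                    # remember the current state

result = new_list.sort()                      # sorts in place and returns None

print(sorted_copy)   # ['a', 'b', 'c', 'e', 'x']
print(descending)    # ['x', 'e', 'c', 'b', 'a']
print(result)        # None
print(new_list)      # ['a', 'b', 'c', 'e', 'x']
```

A common mistake is writing `new_list = new_list.sort()`, which replaces the list with None.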


# 1.4 Nesting Lists
A great feature of Python data structures is that they support nesting. This means we can have data structures within data structures, for example a list inside a list.

Let's see how this works!

In [0]:
# Let's make three lists
lst_1=[1,2,3]
lst_2=[4,5,6]
lst_3=[7,8,9]

# Make a list of lists to form a matrix
matrix = [lst_1,lst_2,lst_3]

print(matrix)

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]


We can again use indexing to grab elements, but now there are two levels of indexing: first the items in the matrix object, and then the items inside each of those lists!

In [0]:
# Grab first item in matrix object
matrix[0]

[1, 2, 3]

In [0]:
# Grab first item of the first item in the matrix object
matrix[0][0]

1
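
One detail worth knowing when nesting lists: the outer list stores references to the inner lists, not copies. The sketch below illustrates this with a smaller two-row matrix; `matrix_copy` is a hypothetical name for a copy whose rows are independent.

```python
lst_1 = [1, 2, 3]
lst_2 = [4, 5, 6]
matrix = [lst_1, lst_2]

# Changing lst_1 is visible through matrix, because matrix[0] IS lst_1
lst_1[0] = 'changed'
print(matrix[0])        # ['changed', 2, 3]

# Copy each row to get a matrix that no longer shares its rows
matrix_copy = [list(row) for row in matrix]
lst_2[0] = 99
print(matrix[1])        # [99, 5, 6]
print(matrix_copy[1])   # [4, 5, 6]
```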

# 1.5 List Comprehensions

Python has an advanced feature called list comprehensions. They allow for quick construction of lists. To fully understand list comprehensions we need to understand for loops. Here are a few examples!

In [0]:
# Build a list comprehension by deconstructing a for loop within a []
first_col = [row[0] for row in matrix]
print(first_col)

[1, 4, 7]
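
To make the connection with for loops concrete, here is a sketch of the same comprehension written as a loop, plus two further variations. The names `evens` and `squares` are illustrative only.

```python
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The comprehension [row[0] for row in matrix] is equivalent to this loop
first_col = []
for row in matrix:
    first_col.append(row[0])

# A comprehension can nest loops and filter with a condition
evens = [n for row in matrix for n in row if n % 2 == 0]

# It can also transform each value
squares = [x**2 for x in range(5)]

print(first_col)  # [1, 4, 7]
print(evens)      # [2, 4, 6, 8]
print(squares)    # [0, 1, 4, 9, 16]
```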


For more advanced list methods and features, check the official Python documentation.

# 2. Tuples
In Python, tuples are very similar to lists. Unlike lists, however, they are immutable, meaning they cannot be changed. You would use tuples to represent things that shouldn't be changed, such as days of the week or dates on a calendar.

In this section, we will get a brief overview of the following:



1.   Constructing Tuples
2.   Basic Tuple Methods
3.   Immutability
4.   When to Use Tuples


You'll have an intuition of how to use tuples based on what you've learned about lists. We can treat them very similarly with the major distinction being that tuples are immutable.

# 2.1 Constructing Tuples

Tuples are constructed with parentheses () and commas separating the elements. For example:

In [0]:
# Create a tuple
t = (1,2,3)

In [0]:
# Check len just like a list
len(t)

3

In [1]:

# Can also mix object types
t = ('one',2)
print(t)

('one', 2)


In [2]:
# Use indexing just like we did in lists
t[0]

'one'

In [3]:
# Negative indexing works just like a list: -1 grabs the last item
t[-1]

2
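
Two construction details are easy to miss, so here is a brief sketch of them: it is the comma, not the parentheses, that makes a tuple, and a tuple can be unpacked into separate variables. The names `point`, `single` and `not_a_tuple` are made up for this example.

```python
point = 3, 4             # parentheses are optional; the comma makes the tuple
single = ('one',)        # a one-element tuple needs a trailing comma
not_a_tuple = ('one')    # without the comma this is just a string

x, y = point             # unpack the tuple into two variables

print(type(point))        # <class 'tuple'>
print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'str'>
print(x, y)               # 3 4
```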

# 2.2 Basic Tuple Methods

Tuples have built-in methods, but not as many as lists do. Let's look at two of them:


In [4]:
# Use .index to look up a value and return the index of its first occurrence; raises ValueError if the value is not found
t.index('one')

0

In [0]:
# Use .count to count the number of times a value appears
t.count('one')

1
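
Note that .index raises a ValueError when the value is absent, so it is worth guarding the call. A minimal sketch of two ways to do that (the names `position` and `safe_position` are assumptions):

```python
t = ('one', 2)

# Option 1: catch the error
try:
    position = t.index('missing')
except ValueError:
    position = None

# Option 2: test membership with "in" before calling .index
safe_position = t.index(2) if 2 in t else None

print(position)       # None
print(safe_position)  # 1
print(t.count('missing'))  # 0 (count never raises for a missing value)
```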

# 2.3 Immutability

It can't be stressed enough that tuples are immutable. To drive that point home:

In [0]:
t[0]= 'change'


TypeError: 'tuple' object does not support item assignment

In [0]:
t.append('nope')

AttributeError: 'tuple' object has no attribute 'append'

# 2.4 When to use Tuples

You may be wondering, "Why bother using tuples when they have fewer available methods?" To be honest, tuples are not used as often as lists in programming, but are used when immutability is necessary. If in your program you are passing around an object and need to make sure it does not get changed, then a tuple becomes your solution. It provides a convenient source of data integrity.

You should now be able to create and use tuples in your programming as well as have an understanding of their immutability.

# 3. Dictionaries

**If you're familiar with other languages you can think of these Dictionaries as hash tables.**

This section will serve as a brief introduction to dictionaries and consist of:



1.   Constructing a Dictionary
2.   Nesting Dictionaries
3.   Basic Dictionary Methods
4.   Advanced Dictionaries

Dictionaries are mappings. So what are mappings? A mapping is a collection of objects stored by key, unlike a sequence, which stores objects by their relative position. This is an important distinction: you look items up by key, not by position. (Since Python 3.7, dictionaries do preserve insertion order, but they still cannot be indexed by position.)

A Python dictionary consists of a key and then an associated value. That value can be almost any Python object.





# 3.1 Constructing a Dictionary

In [0]:
# Make a dictionary with {} and : to signify a key and a value
my_dict = {'key1':'value1','key2':'value2'}

In [0]:
# Call values by their key
my_dict['key2']

'value2'
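
Looking up a key that does not exist raises a KeyError. The following sketch shows a few common ways to handle that; the names `has_key`, `missing`, `fallback` and `found` are illustrative.

```python
my_dict = {'key1': 'value1', 'key2': 'value2'}

has_key = 'key1' in my_dict                 # membership test on the keys
missing = my_dict.get('key3')               # returns None instead of raising
fallback = my_dict.get('key3', 'no value')  # returns the given default

try:
    found = my_dict['key3']
except KeyError:
    # Square-bracket access raises KeyError for an unknown key
    found = None

print(has_key)   # True
print(missing)   # None
print(fallback)  # no value
print(found)     # None
```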

It's important to note that dictionaries are very flexible in the data types they can hold. For example:

In [0]:
my_dict = {'key1':123,'key2':[12,23,33],'key3':['item0','item1','item2']}

In [0]:
# Let's call items from the dictionary
my_dict['key3']

['item0', 'item1', 'item2']

In [0]:
# Can call an index on that value
my_dict['key3'][0]

'item0'

In [0]:
# Can then even call methods on that value
my_dict['key3'][0].upper()

'ITEM0'

We can change the value stored under a key as well. For instance:


In [0]:
# Subtract 123 from the value
my_dict['key1'] = my_dict['key1'] - 123
print(my_dict['key1'])

0


Note that re-running this cell subtracts 123 again each time, so the printed value depends on how many times the cell has been run. The outputs shown here assume each cell is run once.

Python also has augmented assignment operators for in-place arithmetic: +=, -=, *= and /=. We could have used -= for the statement above. For example:


In [0]:
# Set the object equal to itself minus 123 
my_dict['key1'] -= 123
my_dict['key1']

-123

We can also create keys by assignment. For instance if we started off with an empty dictionary, we could continually add to it:

In [0]:
# Create a new dictionary
d = {}

In [0]:
# Create a new key through assignment
d['animal'] = 'Dog'

# Can do this with any object
d['answer'] = 42

In [0]:
print(d)

{'animal': 'Dog', 'answer': 42}


# 3.2 Nesting with Dictionaries

Hopefully you're starting to see how powerful Python is with its flexibility of nesting objects and calling methods on them. Let's see a dictionary nested inside a dictionary:

In [0]:
# Dictionary nested inside a dictionary nested inside a dictionary
d = {'key1':{'nestkey':{'subnestkey':'value'}}}

That's quite the inception of dictionaries! Let's see how we can grab that value:



In [0]:
# Keep calling the keys
d['key1']['nestkey']['subnestkey']

'value'
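
Nested values can be updated with the same chained keys. As a rough sketch, chaining .get() with an empty-dictionary default is one way to avoid a KeyError when an outer key may be missing; the names `updated` and `safe` are assumptions.

```python
d = {'key1': {'nestkey': {'subnestkey': 'value'}}}

# Assign through the chain of keys to change the innermost value
d['key1']['nestkey']['subnestkey'] = 'new value'
updated = d['key1']['nestkey']['subnestkey']

# If 'missing' is absent, .get returns {} and the second .get returns None
safe = d.get('missing', {}).get('nestkey')

print(updated)  # new value
print(safe)     # None
```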

# 3.3 A few Dictionary Methods

There are a few methods we can call on a dictionary. Let's get a quick introduction to a few of them:

In [0]:
# Create a typical dictionary
d = {'key1':1,'key2':2,'key3':3}

In [0]:
# Method to return a view of all keys
d.keys()

dict_keys(['key1', 'key2', 'key3'])

In [0]:
# Method to grab all values
d.values()

dict_values([1, 2, 3])

In [0]:
# Method to return a view of all items as (key, value) tuples (tuples were covered in Section 2)
d.items()

dict_items([('key1', 1), ('key2', 2), ('key3', 3)])

# 3.4 Advanced Dictionaries


**Dictionary Comprehensions:**

Just like List Comprehensions, Dictionary Data Types also support their own version of comprehension for quick creation. It is not as commonly used as List Comprehensions, but the syntax is:


In [0]:
{x:x**2 for x in range(10)}

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

One of the reasons it is not as common is the difficulty in structuring key names that are not based off the values.
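
When the keys and values already exist as two separate sequences, zip() offers one way around that difficulty. A small sketch, where the lists `names` and `scores` are invented for illustration:

```python
# Hypothetical data: two parallel lists
names = ['alice', 'bob', 'carol']
scores = [90, 75, 82]

# Pair each name with its score and build the dictionary
score_by_name = {name: score for name, score in zip(names, scores)}

# A condition can filter the pairs as well
high_scores = {name: score for name, score in zip(names, scores) if score >= 80}

print(score_by_name)  # {'alice': 90, 'bob': 75, 'carol': 82}
print(high_scores)    # {'alice': 90, 'carol': 82}
```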


**Iteration over keys, values, and items:** 

Dictionaries can be iterated over using the keys(), values() and items() methods. For example:



In [6]:
d = {'k1':1,'k2':2}
for k in d.keys():
    print(k)

k1
k2


In [7]:
for v in d.values():
    print(v)

1
2


In [8]:
for item in d.items():
    print(item)

('k1', 1)
('k2', 2)
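
Since each item is a (key, value) tuple, it can be unpacked directly in the for statement. The sketch below also shows that looping over the dictionary itself yields its keys; `pairs` and `keys_seen` are illustrative names.

```python
d = {'k1': 1, 'k2': 2}

pairs = []
for key, value in d.items():
    # key and value are unpacked from each (key, value) tuple
    pairs.append(f'{key} -> {value}')

keys_seen = []
for key in d:
    # Iterating over the dictionary directly is the same as d.keys()
    keys_seen.append(key)

print(pairs)      # ['k1 -> 1', 'k2 -> 2']
print(keys_seen)  # ['k1', 'k2']
```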


**Viewing keys, values and items:**

By themselves the keys(), values() and items() methods return a dictionary view object. This is not a separate list of items. Instead, the view is always tied to the original dictionary.

In [9]:
key_view = d.keys()

key_view

dict_keys(['k1', 'k2'])

In [0]:
d['k3'] = 3

d

{'k1': 1, 'k2': 2, 'k3': 3}

In [0]:
key_view

dict_keys(['k1', 'k2', 'k3'])
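
If a fixed snapshot is needed rather than a live view, wrapping the view in list() makes a separate copy. A brief sketch contrasting the two (`key_snapshot` is a hypothetical name):

```python
d = {'k1': 1, 'k2': 2}

key_view = d.keys()            # live view tied to d
key_snapshot = list(d.keys())  # independent list, frozen at this moment

d['k3'] = 3

print(list(key_view))  # ['k1', 'k2', 'k3']
print(key_snapshot)    # ['k1', 'k2']
```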

# 4. Sets

A set is an unordered collection of unique elements. We can construct one with the set() function. Let's go ahead and make a set to see how it works:

In [0]:
x = set()

In [0]:
# We add to sets with the add() method
x.add(1)

print(x)

{1}


**add:**

Note the curly brackets. This does not indicate a dictionary, although you can think of a set as a dictionary with only keys.

We know that a set has only unique entries. So what happens when we try to add something that is already in a set?

In [0]:
# Add a different element
x.add(2)
print(x)

{1, 2}


In [0]:
# Try to add the same element
x.add(1)
print(x)

{1, 2}


Notice how it won't place another 1 there. That's because a set is only concerned with unique elements! We can cast a list with multiple repeat elements to a set to get the unique elements. For example:

In [0]:
# Create a list with repeats
list1 = [1,1,2,2,3,4,5,6,1,1]

In [0]:
# Cast as set to get unique values
set(list1)

{1, 2, 3, 4, 5, 6}
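
Two related points, sketched under the assumption that you sometimes care about the original order: converting to a set does not promise any particular order, and empty curly brackets create a dictionary, not a set. The list `values` is a made-up example.

```python
# Hypothetical list with repeats, deliberately not sorted
values = [3, 1, 3, 2, 1]

unique = set(values)                          # unique values, no guaranteed order
ordered_unique = list(dict.fromkeys(values))  # unique values in first-seen order

empty_braces = {}     # this is an empty dictionary
empty_set = set()     # this is an empty set

print(sorted(unique))       # [1, 2, 3]
print(ordered_unique)       # [3, 1, 2]
print(type(empty_braces))   # <class 'dict'>
print(type(empty_set))      # <class 'set'>
```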

**clear:** removes all elements from the set

In [0]:
x.clear()
print(x)

set()


**copy:** returns a copy of the set. Note it is a copy, so changes to the original don't affect the copy.

In [0]:
s = {1,2,3}
sc = s.copy()
print(sc)

s.add(4)
print(s)

{1, 2, 3}
{1, 2, 3, 4}


**difference:** difference returns the difference of two or more sets. 

The syntax is: set1.difference(set2)

For example:

In [0]:
s.difference(sc)


{4}


**difference_update:**

difference_update syntax is: set1.difference_update(set2)

The method removes from set1, in place, every element that is also found in set2. It returns None rather than the updated set.

In [0]:
s1 = {1,2,3}
s2 = {1,4,5}
s1.difference_update(s2)
print(s1)

{2, 3}


**discard:** Removes an element from a set if it is a member. If the element is not a member, do nothing.

In [0]:
s

{1, 2, 3, 4}

In [0]:
s.discard(2)
print(s)

{1, 3, 4}
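
For comparison, sets also have a remove method, which behaves like discard except that it raises a KeyError when the element is absent. A minimal sketch (the flag `removed` is only illustrative):

```python
s = {1, 3, 4}

s.discard(99)        # 99 is not a member, so nothing happens

try:
    s.remove(99)     # remove insists that the element exists
    removed = True
except KeyError:
    removed = False

print(removed)       # False
print(sorted(s))     # [1, 3, 4]
```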


**intersection and intersection_update:** intersection returns the intersection of two or more sets as a new set (i.e. the elements that are common to all of the sets).

In [0]:
s1 = {1,2,3}
s2 = {1,2,4}
s1.intersection(s2)  # returns the new set {1, 2}; s1 itself is not changed

print(s1)

{1, 2, 3}


intersection_update will update a set with the intersection of itself and another.

In [0]:
s1.intersection_update(s2)
print(s1)

{1, 2}


**isdisjoint:** This method will return True if two sets have a null intersection.

In [0]:
s1 = {1,2}
s2 = {1,2,4}
s3 = {5}

In [0]:
s1.isdisjoint(s2)

False

In [0]:
s1.isdisjoint(s3)

True

**issubset:** This method reports whether another set contains this set.

In [0]:
s1

{1, 2}

In [0]:
s2

{1, 2, 4}

In [0]:
s1.issubset(s2)

True

**issuperset:** This method will report whether this set contains another set.

In [0]:
s2.issuperset(s1)

True

In [0]:
s1.issuperset(s2)

False

**symmetric_difference and symmetric_difference_update:** symmetric_difference returns the symmetric difference of two sets as a new set (i.e. all elements that are in exactly one of the sets).

In [0]:
s1

{1, 2}

In [0]:
s2

{1, 2, 4}

In [0]:
s1.symmetric_difference(s2)

{4}
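
The in-place counterpart is named symmetric_difference_update. Like the other *_update methods, it modifies the set and returns None, as this short sketch shows (`result` is an illustrative name):

```python
s1 = {1, 2}
s2 = {1, 2, 4}

# s1 is replaced by the elements found in exactly one of the two sets
result = s1.symmetric_difference_update(s2)

print(result)  # None
print(s1)      # {4}
print(s2)      # {1, 2, 4}
```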

**union:** Returns the union of two sets (i.e. all elements that are in either set.)

In [0]:
s1.union(s2)

{1, 2, 4}

**update:** Update a set with the union of itself and others.

In [0]:
s1.update(s2)
print(s1)

{1, 2, 4}
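
Most of these methods also have operator forms. The sketch below uses two made-up sets, `a` and `b`, to show the operators next to the methods they correspond to.

```python
a = {1, 2, 3}
b = {3, 4}

union = a | b           # same as a.union(b)
common = a & b          # same as a.intersection(b)
only_in_a = a - b       # same as a.difference(b)
in_exactly_one = a ^ b  # same as a.symmetric_difference(b)
is_subset = {1, 2} <= a # same as {1, 2}.issubset(a)

print(sorted(union))           # [1, 2, 3, 4]
print(sorted(common))          # [3]
print(sorted(only_in_a))       # [1, 2]
print(sorted(in_exactly_one))  # [1, 2, 4]
print(is_subset)               # True
```

One difference to keep in mind: the operators require both operands to be sets, while the methods accept any iterable, such as a list.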


**This data structure is extremely useful and is underutilized by beginners, so try to keep it in mind!**

# 5. Exercises 

# Exercise 10.1: 
Write a function called nested_sum that takes a nested list of integers and adds up the elements from all of the nested lists.


In [0]:
def nested_sum(nestedList):
    '''
    nestedList: list composed of nested lists containing int.
    Returns the sum of all the int in the nested list
    '''
    total = 0
    for item in nestedList:
        if isinstance(item, int):
            # Plain integer: add it to the running total
            total += item
        else:
            # Nested list: recurse and add its sum
            total += nested_sum(item)
    return total

In [0]:
nestedList = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
nested_sum(nestedList)

45


# Exercise 10.3:

Write a function that takes a list of numbers and returns the cumulative sum.




In [0]:
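def cumulative(numbers):
    # Running total of the numbers seen so far
    cumulative_sum = 0
    new_list = []
    for i in numbers:
        cumulative_sum += i
        new_list.append(cumulative_sum)
    return new_list

In [16]:
# Avoid naming a variable "list": that would shadow the built-in list type
numbers = [1,2,3,4]

cumulative(numbers)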

[1, 3, 6, 10]
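
For reference, the standard library offers the same computation through itertools.accumulate. This is an alternative sketch, not the solution the exercise asks you to write by hand; `running_totals` is an illustrative name.

```python
from itertools import accumulate

numbers = [1, 2, 3, 4]

# accumulate yields the running totals lazily, so wrap it in list()
running_totals = list(accumulate(numbers))

print(running_totals)  # [1, 3, 6, 10]
```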

# Exercise 11.1:

Write a function that reads the words in original_papers.txt and stores them as keys in a dictionary. It doesn’t matter what the values are. Then you can use the in operator as a fast way to check whether a string is in the dictionary.

In [0]:
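def create_diction(path='original_papers.txt'):
    '''
    Returns a dictionary whose keys are the lines of the file at path.
    The values are the line numbers, starting at 0.
    '''
    dictionary = dict()
    # Open the file inside the function so it is closed automatically
    with open(path) as fin:
        for counter, line in enumerate(fin):
            word = line.strip()
            dictionary[word] = counter
    return dictionary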

In [0]:
create_diction()

{'A general evaluation measure for document organization tasks': 0,
 'ChatNoir a search engine for the ClueWeb09 corpus': 2,
 "ERD'14 entity recognition and disambiguation challenge": 1,
 'Entity query feature expansion using knowledge base links': 6,
 'Extending average precision to graded relevance judgments': 5,
 'Learning to personalize query auto-completion': 3,
 'On building a reusable Twitter corpus': 4,
 'System effectiveness, user models, and user utility a conceptual framework for investigation': 7,
 'Time-sensitive query auto-completion': 8,
 'Toward whole-session relevance exploring intrinsic diversity in web search': 9}
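
The exercise also asks for a fast membership check with the in operator. The sketch below shows the idea with an in-memory list of two of the titles above instead of the file, so it can run anywhere; `titles` and `lookup` are hypothetical names.

```python
# Stand-in for the lines of original_papers.txt
titles = [
    'Time-sensitive query auto-completion',
    'On building a reusable Twitter corpus',
]

# Same structure as the function's result: title -> position in the list
lookup = {title: position for position, title in enumerate(titles)}

print('On building a reusable Twitter corpus' in lookup)  # True
print('A title that is not in the file' in lookup)        # False
```

The in operator checks the keys of a dictionary by hashing, which avoids scanning every entry the way a list search would.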