# 2. Python data structures

This notebook follows Chapter 2 in the [Python Workshop textbook](https://search.ebscohost.com/login.aspx?direct=true&db=edsool&AN=edsool.9781804610619&site=eds-live&scope=site&authtype=shib&custid=s8516548). An electronic version of this book is freely available from the library after logging in with TAMU credentials!

In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

There are four types of basic data structures in Python: **list, tuple, dictionary** and **set**.  Each data structure has a set of operations that can be performed on data contained in the structure.  In the following sections, we will introduce these structures and how to manipulate them within Python.

## 2.1 Lists

Lists are used to store multiple items at the same time and may contain different data types within the same list.

Lists are similar to arrays encountered in other programming languages, but differ in that they can contain mixed data types.

### 2.1.1 Nested lists

Can we have a 'list of lists'? Yes.  For example, matrices can be stored as nested lists.

In [None]:
#each element in the list of m is a list 
#containing the row elements for that index

#can access elements using [row][column] notation
#remember that indices start at 0
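
As a minimal sketch of the comments above, assume a hypothetical 3x3 matrix named `m` (the values are illustrative and not taken from the textbook):

```python
# Hypothetical 3x3 matrix stored as a list of rows
m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# m[1] is the second row (indices start at 0)
second_row = m[1]
# m[1][2] is the third element of the second row
element = m[1][2]

print(second_row)  # [4, 5, 6]
print(element)     # 6
```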

To iterate over elements of a nested list using for loops, consider the following examples.

Note that the first example explicitly states the row and column indices included in the iterations.  This is helpful when you do not want to access every element.  However, the second example spans all rows and columns and does not require us to know the number of rows or columns.
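
The two approaches described above might look like the following sketch. The matrix `m` and the choice to visit only the first two rows and columns are assumptions made for illustration.

```python
# Hypothetical 3x3 matrix stored as a list of rows
m = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

# Example 1: explicit row and column indices.
# Only the first two rows and first two columns are visited.
subset = []
for i in range(2):
    for j in range(2):
        subset.append(m[i][j])
print(subset)  # [1, 2, 4, 5]

# Example 2: loop over the rows and their elements directly.
# No knowledge of the number of rows or columns is needed.
everything = []
for row in m:
    for value in row:
        everything.append(value)
print(everything)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```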

### 2.1.2 Basic list operations

In [None]:
#length

#concatenation

#repeating elements
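
One possible way to fill in the three operations named in the comments, using two small illustrative lists (`a` and `b` are assumed names):

```python
# Illustrative lists
a = [1, 2, 3]
b = ['x', 'y']

# length
n = len(a)

# concatenation creates a new list and leaves a and b unchanged
combined = a + b

# repeating elements
repeated = b * 3

print(n)         # 3
print(combined)  # [1, 2, 3, 'x', 'y']
print(repeated)  # ['x', 'y', 'x', 'y', 'x', 'y']
```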

### 2.1.3 Accessing items in a list

**Note**: In Python, a positive index counts forward from the start of the list (beginning at 0), and a negative index counts backward from the end (beginning at -1).  Use negative indices to access elements from the back.
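
A short sketch of positive and negative indexing, using an illustrative list named `letters`. The slice on the last line is included as a common companion to indexing and is an addition to what the note covers.

```python
letters = ['a', 'b', 'c', 'd', 'e']

first = letters[0]         # positive index: first element
last = letters[-1]         # negative index: last element
second_last = letters[-2]  # second element from the back
middle = letters[1:4]      # slice: indices 1, 2, and 3

print(first, last, second_last)  # a e d
print(middle)                    # ['b', 'c', 'd']
```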

### 2.1.4 Adding and inserting items into a list
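
As a minimal sketch, the list methods `append()` and `insert()` cover the two cases in this section's title. The list `shopping` and its items are hypothetical.

```python
shopping = ['apple', 'bread']

# append() adds an item to the end of the list
shopping.append('milk')

# insert() places an item before the given index
shopping.insert(1, 'banana')

print(shopping)  # ['apple', 'banana', 'bread', 'milk']
```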

## 2.2 Dictionaries

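A Python dictionary is a collection whose elements are accessed by key rather than by position; since Python 3.7 it also preserves insertion order.  Dictionaries are written with curly brackets and consist of pairs of **keys** and **values**.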

In [None]:
email = {
    'subject': 'YoUr CaR\'s WaRrAnTy is about to EXPIYERR!!',
    'from': 'dubious_email_address@gmail.com',
    'to': 'your_email_address@yahoo.com',
    'cc': '',
    'bcc': '',
    'date_received':'2021-09-30',
    'body':"""We\'ve been trying to reach you about your car\'s 
              warranty.  Please call us at 555-555-5555 
              to maintain coverage!"""
}
print(email)

**Note**: You might have noticed a resemblance between Python dictionaries and JSON.  Although you can load JSON directly into Python, a Python dictionary is a complete data structure with its own set of algorithms and operations.  JSON is just a string written in a similar format.
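
To illustrate the distinction, the standard-library `json` module can convert between the two. The JSON text below is a made-up example.

```python
import json

# A JSON document is plain text (illustrative content)
json_text = '{"subject": "Hello", "attachment": true, "cc": []}'

# json.loads parses the text into a Python dictionary
parsed = json.loads(json_text)
print(type(parsed).__name__)  # dict
print(parsed['attachment'])   # True

# json.dumps converts a dictionary back into a JSON string
text_again = json.dumps(parsed)
print(type(text_again).__name__)  # str
```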

Dictionaries are like lists and share the following properties:

- Both can be used to store values
- Both can be changed in place and can grow and shrink on demand
- Both can be nested (i.e. lists of lists, dictionaries of dictionaries, lists of dictionaries, dictionaries of lists)

The main difference is how elements are **accessed**. List elements are accessed by their position index, while dictionary elements are accessed via keys. 

Therefore, dictionaries are more suitable for representing collections of labeled items. However, there are some rules to remember:

- Keys must be unique (no duplicate keys)
- Keys must be hashable, which in practice means immutable (a string, a number, or a tuple whose elements are themselves immutable)
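
Both rules can be seen in a small sketch; the dictionaries `ages` and `points` are hypothetical.

```python
# Assigning to an existing key overwrites its value
# instead of creating a duplicate key
ages = {'alice': 30}
ages['alice'] = 31
print(ages)  # {'alice': 31}

# Strings, numbers, and tuples of immutable items can all be keys
points = {(0, 0): 'origin', 1: 'one', 'name': 'grid'}
print(points[(0, 0)])  # origin

# A list is mutable, so using one as a key raises TypeError
try:
    bad = {[0, 0]: 'origin'}
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)  # False
```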

In [None]:
#print value for a given key

#update a value in place
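
A possible completion of the two steps in the comments, using a small stand-in dictionary named `message` (a hypothetical, shortened analogue of the email example):

```python
message = {'subject': 'Hello', 'from': 'sender@example.com'}

# print value for a given key
print(message['from'])  # sender@example.com

# update a value in place
message['from'] = 'new_sender@example.com'
print(message['from'])  # new_sender@example.com
```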


Dictionaries are flexible in size: you can start with an empty dictionary and add key-value pairs one at a time.

In [None]:
#create dictionary one key-value pair at a time
email = {}
email['subject'] = 'Your car\'s warranty is about to expire!'
email['from'] = 'dubious_email_address@gmail.com'
email['to'] = 'your_email_address@yahoo.com'
email['cc'] = ''
email['bcc'] = ''
email['date_received'] = '2021-09-30'
email['body'] = 'We\'ve been trying to reach you about your car\'s warranty.  Please call us at 555-555-5555 to maintain coverage!'
print(email)

In [None]:
#list in a dictionary
email['cc'] = ['email1@gmail.com','email2@hotmail.com','email3@aol.com']
print(email['cc'])

#dictionary in a dictionary
email['metadata'] = {
    'sender_ip': '132:123:1231:12',
    'attachment': True,
    'fraud_score': 10
}
print(email['metadata'])

By combining lists and dictionaries sensibly, we can store complex real-world information and model structures directly and easily.  This is one of the main benefits of scripting languages such as Python.

For example, a table in a database can be stored as a list with nested dictionaries for each record (see Activity 7).
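
A rough sketch of that idea follows. The table `employees` and its fields are invented for illustration and are not the data from Activity 7.

```python
# Hypothetical table: each dictionary is one record (row),
# and each key is a column name
employees = [
    {'name': 'Ana', 'department': 'Sales', 'salary': 50000},
    {'name': 'Ben', 'department': 'IT', 'salary': 60000},
]

# Loop over the records and read columns by name
for record in employees:
    print(record['name'], record['department'])

# Aggregate one column across all records
total = sum(record['salary'] for record in employees)
print(total)  # 110000
```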

### 2.2.1 Zipping and unzipping dictionaries using zip()

Sometimes you obtain information from multiple lists that need to be aggregated for analysis.  In this case you can use the built-in `zip()` function to pair up the lists element by element and create a zip object, which can then be converted into a list, tuple, or dictionary.  This is especially useful when building a dictionary from a list of keys and a list of values.
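
A minimal sketch of zipping and unzipping, assuming two illustrative lists named `keys` and `values`:

```python
keys = ['subject', 'from', 'to']
values = ['Hello', 'a@example.com', 'b@example.com']

# zip() pairs the lists element by element; list() materializes the pairs
pairs = list(zip(keys, values))
print(pairs)
# [('subject', 'Hello'), ('from', 'a@example.com'), ('to', 'b@example.com')]

# dict() turns the same pairs into a dictionary
as_dict = dict(zip(keys, values))
print(as_dict['from'])  # a@example.com

# "Unzipping": zip(*pairs) splits the pairs back into two tuples
unzipped_keys, unzipped_values = zip(*pairs)
print(unzipped_keys)  # ('subject', 'from', 'to')
```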

### 2.2.2 Dictionary methods

In [None]:
 #access values
 #store values in list (can be for later use)
 #store keys in list (can be for later use)
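
One way the steps in the comments might be written, using a hypothetical dictionary named `inventory`:

```python
inventory = {'apples': 3, 'pears': 5}

# access values (values() returns a view object)
print(inventory.values())  # dict_values([3, 5])

# store values in list (can be for later use)
values_list = list(inventory.values())

# store keys in list (can be for later use)
keys_list = list(inventory.keys())

print(values_list)  # [3, 5]
print(keys_list)    # ['apples', 'pears']
```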

**Note**: Iterating over a dictionary directly yields only its keys.  To loop over key-value pairs, use the `items()` method, which returns a view of `(key, value)` tuples that you can iterate over or convert to a list.
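
A short sketch of looping with `items()`, again using a hypothetical `inventory` dictionary:

```python
inventory = {'apples': 3, 'pears': 5}

# items() yields (key, value) tuples, unpacked here into two names
for fruit, count in inventory.items():
    print(fruit, count)

# The same pairs can be stored as a list of tuples
pairs = list(inventory.items())
print(pairs)  # [('apples', 3), ('pears', 5)]
```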

## 2.3 Tuples

A tuple object is similar to a list, but **it cannot be changed after initialization**.  Use tuples to represent fixed collections of items.

In [None]:
weekdays_list = ['Monday', 'Tuesday',
                 'Wednesday','Thursday',
                 'Friday','Saturday','Sunday']

Storing the weekdays in a list, however, does not guarantee that the values will remain unchanged throughout the list's lifetime, because lists are **mutable**.

Tuples on the other hand are **immutable**.  

In [None]:
weekdays_tup = ('Monday', 'Tuesday',
                'Wednesday','Thursday',
                'Friday','Saturday','Sunday')
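
The difference between the two can be sketched as follows, using shortened versions of the weekday collections so the example stands on its own:

```python
weekdays_list = ['Monday', 'Tuesday']
weekdays_tup = ('Monday', 'Tuesday')

# A list element can be reassigned in place
weekdays_list[0] = 'Funday'
print(weekdays_list)  # ['Funday', 'Tuesday']

# The same assignment on a tuple raises TypeError
try:
    weekdays_tup[0] = 'Funday'
    tuple_changed = True
except TypeError:
    tuple_changed = False

print(tuple_changed)  # False
print(weekdays_tup)   # ('Monday', 'Tuesday')
```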

You can't append to a tuple, but you can create a new tuple by concatenating an existing tuple with new items.
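
A minimal sketch of concatenation, with illustrative tuple names:

```python
workdays = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday')
weekend = ('Saturday', 'Sunday')

# + builds a new tuple; the originals are left untouched
full_week = workdays + weekend
print(len(full_week))  # 7

# A one-item tuple needs a trailing comma
with_extra = weekend + ('Holiday',)
print(with_extra)  # ('Saturday', 'Sunday', 'Holiday')
```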

Tuples also support mixed data types and nesting.
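
For instance, with made-up values:

```python
# Mixed data types in one tuple
record = ('Ana', 30, 1.68, True)

# Nesting: a tuple containing another tuple and a list
nested = (1, (2, 3), ['a', 'b'])

print(record[1])     # 30
print(nested[1][0])  # 2
print(nested[2][1])  # b
```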

## 2.4 Sets

Sets were added to Python later than lists, tuples, and dictionaries. Sets are unordered collections of **unique** and **immutable** objects that support operations mimicking mathematical set theory.  Because a set discards duplicates, it can also be used to prevent duplicate values.

In [None]:
#initialize by passing in a list
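
A possible version of this step; the name `s1` and the values are assumptions. `sorted()` is used for printing because a set has no guaranteed order.

```python
# Passing a list to set() drops the duplicate values
s1 = set([1, 2, 2, 3, 3, 3])

print(len(s1))     # 3
print(sorted(s1))  # [1, 2, 3]
```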


In [None]:
#initialize with curly brackets directly
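
A sketch of the literal syntax, with assumed names `s2` and `empty`. Note that empty curly brackets create a dictionary, so an empty set has to be created with `set()`.

```python
# Duplicates in the literal are discarded
s2 = {1, 2, 2, 3}
print(sorted(s2))  # [1, 2, 3]

# {} is an empty dictionary, not an empty set
print(type({}).__name__)  # dict

# Use set() for an empty set
empty = set()
print(type(empty).__name__)  # set
```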


While the objects in a set are immutable, sets are mutable.
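
Both halves of that statement can be checked in a short sketch; the set `s` is illustrative.

```python
s = {1, 2}

# The set itself can change: add and remove elements
s.add(3)
s.discard(1)
print(sorted(s))  # [2, 3]

# Elements must be immutable: adding a list raises TypeError
try:
    s.add([4, 5])
    list_added = True
except TypeError:
    list_added = False
print(list_added)  # False

# A tuple is immutable, so it is accepted as an element
s.add((4, 5))
print(len(s))  # 3
```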

### 2.4.1 Set operations

In [None]:
s5,s6 = {1,2,3,4},{3,4,5,6}

#unions

#intersections
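
One way to write the two operations, using the same `s5` and `s6` values as the cell above (the result names are assumptions):

```python
s5, s6 = {1, 2, 3, 4}, {3, 4, 5, 6}

# union: elements in either set (same as s5.union(s6))
union_set = s5 | s6
print(sorted(union_set))  # [1, 2, 3, 4, 5, 6]

# intersection: elements in both sets (same as s5.intersection(s6))
intersection_set = s5 & s6
print(sorted(intersection_set))  # [3, 4]
```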


In [None]:
s5,s6 = {1,2,3,4},{3,4,5,6}

#set minus

#check if subset

#check if proper subset

#check if superset
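
A sketch of the four operations named in the comments, using the same `s5` and `s6` values; the result names and the small comparison sets are assumptions.

```python
s5, s6 = {1, 2, 3, 4}, {3, 4, 5, 6}

# set minus: elements of s5 that are not in s6
difference = s5 - s6
print(sorted(difference))  # [1, 2]

# subset: every element of {1, 2} is in s5
is_subset = {1, 2} <= s5
print(is_subset)  # True

# proper subset: a set is a subset of itself, but not a proper one
is_proper_subset = s5 < s5
print(is_proper_subset)  # False

# superset: s5 contains every element of {3, 4}
is_superset = s5 >= {3, 4}
print(is_superset)  # True
```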


**Note**: How are sets different from lists and dictionaries?
 - Sets are different from lists in that they are unordered.  
 - Sets are different from dictionaries in that they do not map keys to values.  
 - Sets are neither a sequence nor a mapping type; they are a type of their own.

## 2.5 Choosing the right structure

Here are some general guidelines for choosing the right structure (based on their unique characteristics):
 
 - Lists are used to store multiple objects and retain a sequence
 - Dictionaries store unique key-value pair mappings
 - Tuples are immutable
 - Sets only store unique elements
 
Choosing an inappropriate data structure could lead to low efficiency when running code, security issues, and data loss.  Choose wisely!