# Python data structures     
## Author: Erika Duan

![](../02_figures/01_lists-header.jpg)

# Lists  

A list is a sequential container for data values (whether Booleans, integers, floats, strings or other lists) and has some similarities to vectors in R. 

Properties of lists include:  

+ Lists can store different primitive types and even other lists.    
+ Lists have an integer and 0-based index, which allows for list slicing (i.e. subsetting).  
+ Items can be added to a list using the methods `append()` (adds to the end of the list) or `insert()` (adds at a given position).  
+ Values inside a list can be removed using the methods `remove()` or `pop()` or using the keyword `del`.  
+ The function `len()` can calculate the number of items in a list.  
+ Two lists can be concatenated with the operator `+`.  

In [1]:
#-----example 1-----  
list_a = [1, 2.4, "hello world", [0, 1, 2, 3]]
print(list_a)  

type(list_a) 

# Python allows lists containing different primitive types  

In [2]:
#-----example 2-----
list_b = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  

print(list_b[0]) # the integer 1 occupies position 0   
print(list_b[1]) # the integer 2 occupies position 0 + 1  
print(list_b[-2]) # the integer 9 occupies the second to last position i.e. position -2  
print(list_b[0:3+1]) # extract values from position 0 to position 3
print(list_b[0::2]) # start from position 0 and extract values from every subsequent second position
print(list_b[::2]) # the same as list_b[0::2]  
print(list_b[:]) # the same as list_b.copy()
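
The last line of example 2 notes that `list_b[:]` is the same as `list_b.copy()`. As a minimal sketch of what that means in practice, the snippet below shows that both produce a new, shallow copy: the outer list is independent, but any nested lists are still shared with the original. The names `nested`, `slice_copy` and `method_copy` are hypothetical and used only for illustration.

```python
# Hypothetical names, used only for illustration
nested = [[0, 1], [2, 3]]

slice_copy = nested[:]        # shallow copy via slicing
method_copy = nested.copy()   # shallow copy via the copy() method

# Both copies hold equal contents but are different list objects
print(slice_copy == nested, slice_copy is nested)

# Appending to a copy leaves the original list untouched
slice_copy.append([4, 5])
print(len(nested), len(slice_copy))

# The inner lists are shared, so mutating one is visible through the original
method_copy[0].append(99)
print(nested[0])
```

If fully independent nested lists are needed, `copy.deepcopy()` from the standard library is the usual alternative.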

In [3]:
#-----example 3----- 
list_c = ["apple", "bear", "donut", "elephant", "guava"] 

list_c.append("horse") # always adds the item to the last position in the list
list_c.insert(2, "cat") # inserts "cat" at position 2 of the new list 
list_c.insert(5, "french toast") # inserts "french toast" at position 5 of the new list

print(list_c)

In [4]:
#-----example 4----- 
list_d = list_c.copy() 
print("Original list: {}".format(list_d)) 

del list_d[0:1+1] # del removes objects in a list by index (accepts integers and slices)
print("Objects in positions 0 to 1 are deleted: {}".format(list_d)) 

list_d.append("cat") # append "cat" to last position in the list
list_d.remove("cat") # only removes the first reference to an object in the list
print("First reference to cat is removed: {}".format(list_d))  

In [5]:
#-----example 5-----  
french_toast = list_d.pop(2)
print("The pop() method does two things. It removes the object from the original list: {}. It also returns the removed item: {}."
      .format(list_d, french_toast))
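
The list properties above also mention `len()` and concatenation with `+`, which examples 1 to 5 do not cover. A minimal sketch follows; the names `list_e`, `list_f` and `list_g` are hypothetical and used only for illustration.

```python
# Hypothetical lists, used only for illustration
list_e = ["apple", "bear"]
list_f = ["cat", "donut", "elephant"]

# The + operator returns a new list and leaves both inputs unchanged
list_g = list_e + list_f

print(len(list_e), len(list_f), len(list_g))
print(list_g)
print(list_e)  # still contains only its original two items
```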

![](../02_figures/01_tuples-and-sets-header.jpg)

# Tuples    

A tuple is an object container that behaves like a list but is immutable.   

Properties of tuples include:  
+ Tuples can be created by enclosing objects inside round brackets `()`. 
+ You can subset tuples (tuples can be referenced by index i.e. `[n]`).   
+ You cannot alter tuples after they are created.      
+ You can check whether an item exists in a tuple using the keyword `in` i.e. `"apple" in tuple_1` returns a Boolean.  
+ You can iterate through a tuple using a `for` loop.     

**Note:** Tuples are not commonly used for data manipulation tasks (although their property of being immutable can make them more useful than lists in special circumstances).  

In [6]:
#-----example 1-----  
tuple_1 = ("apple", "bear", "cat", "donut", 1, 2, 3, 4)
print("Object tuple_1 is of type {} and can contain different primitive types in the same tuple: {}."
      .format(type(tuple_1), tuple_1)) 

In [7]:
#-----example 2----- 
list_1 = list(tuple_1)
print(tuple_1[0] == list_1[0])

# tuple_1[0] = "apple_red" produces an error

list_1[0] = "apple_red"
list_1

# lists are mutable but tuples are immutable  

In [8]:
#-----example 3-----
for index, item in enumerate(tuple_1, 1): # start index at 1 instead of 0  
    print("Item {}: {}".format(index, item)) # "item" avoids shadowing the built-in name "object"
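
The tuple properties listed earlier include membership checks with `in` and immutability. The sketch below illustrates both, catching the error that Python raises when an item assignment is attempted. The names `tuple_2` and `error_message` are hypothetical and used only for illustration.

```python
# Hypothetical tuple, used only for illustration
tuple_2 = ("apple", "bear", "cat")

# Membership checks return a Boolean
print("bear" in tuple_2)
print("horse" in tuple_2)

# Item assignment is not supported, so Python raises a TypeError
error_message = None
try:
    tuple_2[0] = "apple_red"
except TypeError as error:
    error_message = str(error)
    print("Tuples cannot be modified: {}".format(error_message))

print(tuple_2)  # the tuple is unchanged
```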

# Sets  

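A set is an unindexed container of unique objects that has no order (i.e. like a mathematical set). Unlike a tuple, a set is mutable, but the objects it contains must be hashable (e.g. strings, numbers or tuples).   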

Properties of sets include:   
+ Sets can be created by including objects inside `{}`. An empty set must be created with `set()`, as `{}` creates an empty dictionary.    
+ Duplicate objects enclosed inside a set will be removed.   
+ You cannot subset sets (sets cannot be referenced by index i.e. `[n]`).    
+ You can check if an item is in a set using `in`.    
+ You can iterate through a set using a `for` loop.    

**Note:** Sets are not commonly used for data manipulation tasks.  

In [9]:
#-----example 1-----  
set_1 = {"maths", "physics", "chemistry", "maths", "biology"}
print(set_1) # no duplicate values

In [10]:
#-----example 2-----  
for value in set_1:
    print(value)
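
The remaining set properties (membership checks, no indexing, and removal of duplicates when items are added) can be sketched as follows. The names `set_2` and `empty_set` are hypothetical and used only for illustration. As sets have no order, `sorted()` is used so that the printed result is predictable.

```python
# Hypothetical set, used only for illustration
set_2 = {"maths", "physics", "maths"}
print(len(set_2))  # the duplicate "maths" is stored only once

# Membership checks return a Boolean
print("physics" in set_2)

# Sets are unindexed, so subsetting by position raises a TypeError
try:
    set_2[0]
except TypeError as error:
    print("Sets cannot be indexed: {}".format(error))

# Sets are mutable: add() inserts a new item and ignores existing ones
set_2.add("biology")
set_2.add("maths")
print(sorted(set_2))  # sorted() returns a list in alphabetical order

# An empty set requires set(), as {} creates an empty dictionary
empty_set = set()
print(type(empty_set), type({}))
```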

![](../02_figures/01_dictionaries-header.jpg)

# Dictionary  

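A dictionary can be thought of as a list with a customised index. Since Python 3.7, dictionaries preserve the order in which items were inserted, but items are accessed by key rather than by position.  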
+ The index values are called **keys**.  
+ A dictionary therefore contains **key-value pairs** with the format `{index: value(s)}`.      

Properties of dictionaries include:  
+ Dictionaries can be created using `dict()` or by listing key-value pairs inside `{}` i.e. `{"key_1": ["value_1", "value_2"]}`.  
+ Dictionary values can be accessed by subsetting on the key i.e. `dict_1["key_1"]` or by using the `get()` method.   
+ You can check whether a key exists in a dictionary using the keyword `in` i.e. `"key_2" in dict_1` returns `True` if `"key_2"` is a key in `dict_1`.  
+ You can retrieve dictionary keys using `dict_1.keys()`.  
+ You can retrieve dictionary values using `dict_1.values()`.  
+ You can retrieve dictionary items using `dict_1.items()`.    
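
The properties above can be sketched as follows. The contents of `dict_1` extend the `key_1` example from the list above. The name `dict_2`, the key `"key_3"` and the value `"value_3"` are hypothetical and used only for illustration.

```python
# Create the same dictionary using {} and using dict()
dict_1 = {"key_1": ["value_1", "value_2"], "key_2": ["value_3"]}
dict_2 = dict(key_1=["value_1", "value_2"], key_2=["value_3"])
print(dict_1 == dict_2)

# Access values by subsetting on the key or with get()
print(dict_1["key_1"])
print(dict_1.get("key_1"))

# get() returns None (or a supplied default) when the key is missing,
# whereas dict_1["key_3"] would raise a KeyError
print(dict_1.get("key_3"))
print(dict_1.get("key_3", []))

# Check whether a key exists
print("key_2" in dict_1)
print("key_3" in dict_1)

# Retrieve keys, values and items (converted to lists for printing)
print(list(dict_1.keys()))
print(list(dict_1.values()))
print(list(dict_1.items()))

# items() is convenient for iterating over key-value pairs
for key, values in dict_1.items():
    print("{}: {}".format(key, values))
```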

You can modify key-value pairs in a dictionary.    
+ New items can be added through `dict_1["key_2"] = ["value_1", "value_2"]`   
+ You can delete items using the `del` keyword or the `pop()` method (both approaches will modify the dictionary in place).  
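
Adding and deleting items, as described above, can be sketched as follows. The names `dict_3`, `removed` and `missing` and the value `"value_9"` are hypothetical and used only for illustration.

```python
# Hypothetical dictionary, used only for illustration
dict_3 = {"key_1": ["value_1"]}

# Assigning to a new key adds an item
dict_3["key_2"] = ["value_1", "value_2"]
print(dict_3)

# Assigning to an existing key overwrites its value
dict_3["key_1"] = ["value_9"]
print(dict_3)

# del removes an item in place and returns nothing
del dict_3["key_1"]
print(dict_3)

# pop() removes an item in place and returns its value
removed = dict_3.pop("key_2")
print(removed, dict_3)

# pop() with a default avoids a KeyError when the key is missing
missing = dict_3.pop("key_2", None)
print(missing)
```

Like `pop()` on a list, `pop()` on a dictionary is useful when the removed value is still needed, whereas `del` simply discards it.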

In [11]:
#-----example 1-----

In [12]:
#-----example 2-----

In [13]:
#-----example 3-----

In [14]:
#-----example 4-----  

# Numpy arrays  

In [15]:
#-----example 1-----

In [16]:
#-----example 2-----

In [17]:
#-----example 3-----

In [18]:
#-----example 4-----

# Pandas DataFrame  

In [19]:
#-----example 1-----

In [20]:
#-----example 2-----

In [21]:
#-----example 3-----

In [22]:
#-----example 4-----