# Data Structures

```{tip}
**DOWNLOAD THE NOTEBOOK TO RUN LOCALLY**

Click the download button (![](../assets/img/site/dl-nb.png)) on the upper right to download the notebook and run it locally.
```

To analyze census data, we need more than simple variables. We need **containers** that hold collections of data.

**Source:** [Python 3 Documentation: Data Structures](https://docs.python.org/3/tutorial/datastructures.html)

## Lists
Think of a list as an array of items stored in sequential order. Lists are created using **square brackets [ ]**.

Lists are:
- **Ordered**: the items have a defined order, and that order will not change
- **Mutable**: they can be modified (change, add, remove items) after they are created
- **Allows duplicates**: list items can have the same value

A list can contain objects of different data types (e.g., int, float, str, list). In fact, a list can contain other lists. A list inside another list is known as a nested list.

In [1]:
# Creating a list of crops
crops = ["Palay", "Corn", "Coconut", "Mango"]

### Accessing list items/elements
List items are accessed using their index. The first item on the list always has an index of 0.

In [2]:
# Getting first item in list
crops[0]

'Palay'

A negative index (e.g. -1, -2) means to start from the end. -1 refers to the last item.

In [3]:
# Getting the 2nd to the last item in the list
crops[-2]

'Coconut'

You can also get a range of values with a slice, `list[start:end]`. The item at `start` is included and the item at `end` is excluded.

In [4]:
# Getting 2nd to 3rd item in list
crops[1:3]

['Corn', 'Coconut']
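
The slice bounds are optional. As a small sketch (the variable names below are only for illustration), leaving out a bound means "from the start" or "to the end", and a third number sets the step:

```python
crops = ["Palay", "Corn", "Coconut", "Mango"]

first_two = crops[:2]     # from the start up to, but not including, index 2
from_third = crops[2:]    # from index 2 to the end of the list
every_other = crops[::2]  # every 2nd item, starting at index 0

print(first_two)    # ['Palay', 'Corn']
print(from_third)   # ['Coconut', 'Mango']
print(every_other)  # ['Palay', 'Coconut']
```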

### Accessing items in nested lists

In [5]:
# Get the 2nd item inside the 4th item; here the 4th item is the string "Mango", so this is its 2nd character
crops[3][1]

'a'
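
In the cell above, `crops[3]` is the string `'Mango'`, so the second index picks out a character. The same double indexing works on an actual nested list. The sketch below uses a hypothetical `farm` list whose values are made up for illustration:

```python
# Hypothetical record: the 4th item is itself a list of crops
farm = ["Maligaya Farm", 3.0, True, ["Palay", "Corn", "Mango"]]

inner = farm[3]      # the whole inner list
second = farm[3][1]  # the 2nd item of the inner list

print(inner)   # ['Palay', 'Corn', 'Mango']
print(second)  # Corn
```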

### Modifying lists
Since lists are mutable, we can update the data.

In [6]:
# Changing an item (Correction)
crops[1] = "Yellow Corn"
print(crops)

['Palay', 'Yellow Corn', 'Coconut', 'Mango']


**Removing items in a list**
- `remove()`: removes the first occurrence of an item
- `pop()`: removes the item at the specified index (the last item if no index is given) and returns it

In [7]:
crops = ["Palay", "Corn", "Coconut", "Mango"]

crops.remove("Mango") # Removes the first 'Mango' it finds
print(crops)

['Palay', 'Corn', 'Coconut']


In [8]:
crops.pop(-1)
print(crops)

['Palay', 'Corn']
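
Unlike `remove()`, `pop()` hands back the item it removed, which is handy when you still need the value. A minimal sketch (the names `removed` and `last` are illustrative):

```python
crops = ["Palay", "Corn", "Coconut", "Mango"]

removed = crops.pop(1)  # removes and returns the item at index 1
last = crops.pop()      # with no index, removes and returns the last item

print(removed)  # Corn
print(last)     # Mango
print(crops)    # ['Palay', 'Coconut']
```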


### Sort list items
- `sort()`: sorts the list in ascending order by default (alphabetically for strings, numerically for numbers)

In [9]:
crops = ["Palay", "Corn", "Coconut", "Mango"]

crops.sort()
print(crops)

['Coconut', 'Corn', 'Mango', 'Palay']


**NOTICE THAT THE LIST IS EDITED IN PLACE**
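
If the original order needs to be kept, the built-in `sorted()` function is one alternative: it returns a new sorted list and leaves the original untouched. A brief sketch:

```python
crops = ["Palay", "Corn", "Coconut", "Mango"]

sorted_crops = sorted(crops)                    # new list; crops is unchanged
descending_crops = sorted(crops, reverse=True)  # same, in descending order

print(crops)             # ['Palay', 'Corn', 'Coconut', 'Mango']
print(sorted_crops)      # ['Coconut', 'Corn', 'Mango', 'Palay']
print(descending_crops)  # ['Palay', 'Mango', 'Corn', 'Coconut']
```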

### Joining lists
- `+`: concatenates two lists into a new list
- `append()`: adds a single item to the end of a list; use it in a for loop to add every item from another list
- `extend()`: adds all the items from one list to the end of another list

In [10]:
# +
crops1 = ["Palay", "Corn", "Coconut", "Mango"]
crops2 = ["Eggplant", "Ampalaya"]

crops3 = crops1 + crops2
print(crops3)

['Palay', 'Corn', 'Coconut', 'Mango', 'Eggplant', 'Ampalaya']


In [11]:
# append()
crops1 = ["Palay", "Corn", "Coconut", "Mango"]
crops2 = ["Eggplant", "Ampalaya"]

for x in crops2:
    crops1.append(x)

print(crops1)

['Palay', 'Corn', 'Coconut', 'Mango', 'Eggplant', 'Ampalaya']


**NOTICE THAT `crops1` IS EDITED IN PLACE**

In [12]:
# extend()
crops1 = ["Palay", "Corn", "Coconut", "Mango"]
crops2 = ["Eggplant", "Ampalaya"]

crops1.extend(crops2)
print(crops1)

['Palay', 'Corn', 'Coconut', 'Mango', 'Eggplant', 'Ampalaya']


**NOTICE THAT `crops1` IS EDITED IN PLACE**

### Looping lists and list comprehensions

In [13]:
# Using a for loop
crops_list = ["Palay", "Corn", "Coconut", "Mango"]

for crop in crops_list:
    print(crop)

Palay
Corn
Coconut
Mango


In [14]:
# Same loop written as a list comprehension
crops_list = ["Palay", "Corn", "Coconut", "Mango"]

# print() returns None, so the comprehension also builds a list of None values
[print(crop) for crop in crops_list]

Palay
Corn
Coconut
Mango


[None, None, None, None]
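
List comprehensions are more commonly used to build a new list than to print. The sketch below shows two typical patterns, transforming each item and filtering items; the variable names are illustrative:

```python
crops_list = ["Palay", "Corn", "Coconut", "Mango"]

# Transform: apply an expression to every item
upper_crops = [crop.upper() for crop in crops_list]

# Filter: keep only the items that pass a condition
c_crops = [crop for crop in crops_list if crop.startswith("C")]

print(upper_crops)  # ['PALAY', 'CORN', 'COCONUT', 'MANGO']
print(c_crops)      # ['Corn', 'Coconut']
```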

## CHALLENGE 01 

How do you change a range of values for a list?

In [15]:
this_list = ['Maligaya Farm', 3.0, 'Palay', 'Mango']

# try to replace the 3.0 and 'Palay' with 3.34 and 'Corn' using one command

## Tuples
Tuples are **immutable** sequences, meaning they cannot be modified once created. Items in a tuple are accessed the same way as items in a list. A tuple is created using **parentheses ( )**.

Tuples are:
- **Ordered**: the items have a defined order, and that order will not change
- **Immutable**: they cannot be modified after they are created
- **Allows duplicates**: tuple items can have the same value

You can convert between tuples and lists using the ***tuple*** and ***list*** functions.

**Why use tuples instead of lists?** Tuples generally use slightly less memory than lists, and their immutability protects the data from accidental changes, so it's best to use them for data that we know won't change.

In [16]:
# convert list to a tuple 
crops = ["Palay", "Corn", "Coconut", "Mango"]

crops_tuple = tuple(crops)
crops_tuple

('Palay', 'Corn', 'Coconut', 'Mango')
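
To see immutability in action, the sketch below tries to change a tuple item, which raises a `TypeError`. One way around it, shown here as an illustration, is to convert to a list, edit, and convert back:

```python
crops_tuple = ("Palay", "Corn", "Coconut", "Mango")

try:
    crops_tuple[1] = "Yellow Corn"  # item assignment is not allowed on tuples
except TypeError as err:
    print("Cannot modify a tuple:", err)

# Build a new tuple instead: list -> edit -> tuple
as_list = list(crops_tuple)
as_list[1] = "Yellow Corn"
updated_tuple = tuple(as_list)

print(updated_tuple)  # ('Palay', 'Yellow Corn', 'Coconut', 'Mango')
```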

### Unpacking a tuple

In [17]:
# pack values
crops_tuple = ('Palay', 'Corn', 'Coconut', 'Mango')

# unpack values
(a, b, c, d) = crops_tuple
print(a)

Palay


## Sets

Sets are:
- **Unordered**: the items in a set do not have a defined order. Set items can appear in a different order every time you use them, and cannot be referred to by index or key.
- **Mutable**: items can be added or removed after the set is created, but each item must itself be immutable (e.g., a number, string, or tuple)
- **Unique/Does not allow duplicates**: sets cannot have two items with the same value


Sets are an easy way to get unique values from a list.

In [18]:
# Sample Python list with duplicates
sample_list = [1, 2, 3, 4, 2, 5, 6, 3, 7, 8, 9, 1, 5, 10, 11, 12, 6, 13, 14, 15, 7, 16, 17, 18, 9, 19, 20]

# Create a set from the list
unique_set = set(sample_list)

# Print the elements of the set
print("Original List:", sample_list)
print("Set without Duplicates:", unique_set)

Original List: [1, 2, 3, 4, 2, 5, 6, 3, 7, 8, 9, 1, 5, 10, 11, 12, 6, 13, 14, 15, 7, 16, 17, 18, 9, 19, 20]
Set without Duplicates: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20}


In [19]:
# Add new element to a set
unique_set.add(21)
unique_set

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21}

In [20]:
# Adding an element that's already in the set will not change the set
unique_set.add(1)
unique_set

{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21}
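
Items can be removed from a set as well as added, and `in` checks membership. A small sketch using a made-up `crops_set`:

```python
crops_set = {"Palay", "Corn", "Coconut"}

crops_set.add("Mango")       # sets can grow after creation
crops_set.discard("Corn")    # removes the item if it is present
crops_set.discard("Banana")  # not in the set, so nothing happens (no error)

print("Mango" in crops_set)  # True
print("Corn" in crops_set)   # False
print(len(crops_set))        # 3
```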

## Dictionaries (dicts)

- Dictionaries are used to store data values in key:value pairs.
- As of Python version 3.7, dictionaries are ordered: they keep the order in which items were inserted. In Python 3.6 and earlier, dictionaries are unordered.
- Dicts consist of a set of keys and values that provide the ability to perform an indexed lookup.
- Dictionaries are identified/created using **curly brackets { }**.

Dicts are:
- **Mutable**: they can be modified after they are created
- **Does not allow duplicate keys**: dicts cannot have two items with the same key

In [21]:
# Create a dict
farm_record = {
    "farm_id": "F-10203",
    "operator": "Juan Dela Cruz",
    "hectares": 2.5,
    "crops": ["Palay", "Mongo"],
    "irrigated": True
}

In [22]:
# get value of dict item with key 'operator'
farm_record['operator']

'Juan Dela Cruz'

Dictionaries can also be created using the ***dict*** function.

In [23]:
farm_record = dict(farm_id='F-10203', operator='Juan Dela Cruz')
farm_record

{'farm_id': 'F-10203', 'operator': 'Juan Dela Cruz'}
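
Because dicts are mutable and keys are unique, assigning to a key either adds a new pair or overwrites the existing value. A minimal sketch (the `hectares` values are illustrative):

```python
farm_record = {"farm_id": "F-10203", "operator": "Juan Dela Cruz"}

farm_record["hectares"] = 2.5  # new key: adds a key:value pair
farm_record["hectares"] = 3.0  # existing key: overwrites the old value
del farm_record["operator"]    # removes the pair with this key

print(farm_record)  # {'farm_id': 'F-10203', 'hectares': 3.0}
```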

### Accessing items
Dict items are accessed using their keys.

In [24]:
# Key as index
op = farm_record["operator"]

print(op)

Juan Dela Cruz


In [25]:
# using the get function
op = farm_record.get("operator")

print(op)

Juan Dela Cruz
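
The two approaches differ when the key is missing: square brackets raise a `KeyError`, while `get()` returns `None` or a default value you supply. A short sketch, with `hectares` as an example of a key that is not in the dict:

```python
farm_record = {"farm_id": "F-10203", "operator": "Juan Dela Cruz"}

missing = farm_record.get("hectares")        # key absent: returns None
fallback = farm_record.get("hectares", 0.0)  # key absent: returns the default

print(missing)   # None
print(fallback)  # 0.0
```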


#### Get keys and values
- `keys()`: gets keys
- `values()`: gets values

In [26]:
farm_record.keys()

dict_keys(['farm_id', 'operator'])

In [27]:
farm_record.values()

dict_values(['F-10203', 'Juan Dela Cruz'])

## CHALLENGE 02

How would you iterate over the items in a dictionary?

You can check to see if a dict contains a certain key or value:

In [28]:
farm_record = dict(farm_id='F-10203', operator='Juan Dela Cruz')

'farm_id' in farm_record

True

In [29]:
'farm_id' in farm_record.keys()

True

In [30]:
'farm_id' in farm_record.values()

False

Try to access a non-existent key (e.g., 'palay'):

In [31]:
farm_record['palay']

KeyError: 'palay'

You can check if a key exists before accessing it:

In [32]:
# if the key 'farm_id' is in the dict, print its value; if not, print a prompt
if 'farm_id' in farm_record:
    print(farm_record['farm_id'])
else:
    print("Key 'farm_id' not found.")

F-10203


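You can also wrap the lookup in a try/except block, an approach often considered more Pythonic. Catching `KeyError` specifically, rather than using a bare `except`, avoids hiding unrelated errors.

In [33]:
try:
    print(farm_record['palay'])
except KeyError:
    print("Key not found")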

Key not found
