# Data Structures

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/giswqs/geog-510/blob/main/book/python/03_data_structures.ipynb)

## Overview

In this lecture, we will explore four fundamental Python data structures: tuples, lists, sets, and dictionaries. They are essential tools in geospatial programming for storing, managing, and manipulating data such as coordinates, paths, and feature attributes. Mastering them prepares you to handle more complex geospatial datasets in later analysis and processing tasks.

## Learning Objectives

By the end of this lecture, you should be able to:

- Understand the characteristics and use cases of Python tuples, lists, sets, and dictionaries.
- Apply these data structures to store and manipulate geospatial data, such as coordinates, paths, and attribute information.
- Differentiate between mutable and immutable data structures and choose the appropriate structure for different geospatial tasks.
- Perform common operations on these data structures, including indexing, slicing, adding/removing elements, and updating values.
- Utilize dictionaries to manage geospatial feature attributes and understand the importance of key-value pairs in geospatial data management.

## Tuples

Tuples are immutable sequences, meaning that once a tuple is created, its elements cannot be changed. Tuples are useful for storing fixed collections of items.

For example, a tuple can be used to store the coordinates of a geographic point (latitude, longitude).

In [1]:
point = (
    35.6895,
    139.6917,
)  # Tuple representing a geographic point (latitude, longitude)

You can access elements in a tuple using indexing:

In [2]:
latitude = point[0]
longitude = point[1]
print(f"Latitude: {latitude}, Longitude: {longitude}")

Latitude: 35.6895, Longitude: 139.6917
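
Immutability can be checked directly. The sketch below redefines `point` so that it runs on its own, unpacks the tuple into two variables, and shows that assigning to an element raises a `TypeError`. The names `lat`, `lon`, `error_message`, and `shifted_point` are illustrative and do not appear elsewhere in this lecture.

```python
point = (35.6895, 139.6917)  # (latitude, longitude), same values as above

# Tuple unpacking assigns each element to its own variable
lat, lon = point
print(f"Latitude: {lat}, Longitude: {lon}")

# Tuples are immutable, so item assignment raises a TypeError
error_message = None
try:
    point[0] = 0.0
except TypeError as exc:
    error_message = str(exc)
    print("Cannot modify a tuple:", error_message)

# To "change" a point, build a new tuple instead
shifted_point = (lat + 1.0, lon)
```

Because the original tuple is never modified, any other code holding a reference to `point` keeps seeing the same coordinates.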


In [3]:
first_name = "Desmond"
surname = "Kangah"
age = 34

print(f"Your name is {first_name} with last name {surname}")

Your name is Desmond with last name Kangah


## Lists

Lists are ordered, mutable sequences, meaning you can change, add, or remove elements after the list has been created. Lists are very flexible and can store multiple types of data, making them useful for various geospatial tasks.

For example, you can store a list of coordinates representing a path or boundary.

In [4]:
path = [
    (35.6895, 139.6917),
    (34.0522, -118.2437),
    (51.5074, -0.1278),
]  # List of tuples representing a path

In [5]:
print(f"the third tuple {path[2]}")  # Indexing is zero-based, so index 2 is the third element

the third tuple (51.5074, -0.1278)


In [6]:
path.append((48.8566, 2.3522))

In [7]:
print(f"the updated_path: {path}")

the updated_path: [(35.6895, 139.6917), (34.0522, -118.2437), (51.5074, -0.1278), (48.8566, 2.3522)]


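You can add a new point to the path with `append()`. Lists keep every element you add, including repeated or nearly identical points:

In [8]:
path.append((48.8566, 2.35234))  # Adding another point near Paris to the path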
print("Updated path:", path)

Updated path: [(35.6895, 139.6917), (34.0522, -118.2437), (51.5074, -0.1278), (48.8566, 2.3522), (48.8566, 2.35234)]


Lists allow you to perform various operations such as slicing, which lets you access a subset of the list:

In [9]:
sub_path = path[:2]  # Slicing the first two points from the path
print("Sub-path:", sub_path)

Sub-path: [(35.6895, 139.6917), (34.0522, -118.2437)]


In [10]:
last_path = path[-2:]  # Negative slicing: the last two points of the path
print(last_path)

[(48.8566, 2.3522), (48.8566, 2.35234)]
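
Because lists are mutable, elements can also be updated and removed in place. The following is a minimal sketch using the built-in list operations `insert()`, `remove()`, and `pop()`; it defines its own short `path` so that it does not depend on the cells above, and `last_point` is an illustrative name.

```python
# Standalone path for this sketch: Tokyo, Los Angeles, London
path = [(35.6895, 139.6917), (34.0522, -118.2437), (51.5074, -0.1278)]

# Update a value: replace the second point by index (here with Paris)
path[1] = (48.8566, 2.3522)

# Insert a point at a specific position (index 0 is the start of the list)
path.insert(0, (34.0522, -118.2437))

# Remove by value: remove() deletes the first matching element
path.remove((51.5074, -0.1278))

# Remove by position: pop() with no argument removes and returns the last element
last_point = path.pop()

print("Remaining path:", path)
print("Removed point:", last_point)
```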


## Sets

Sets are unordered collections of unique elements. Sets are useful when you need to store a collection of items but want to eliminate duplicates.

For example, you might want to store a set of unique geographic regions visited during a survey.

In [11]:
regions = ["North America", "Europe", "Asia"]  # List of regions
regions = set(regions)  # Convert the list to a set

In [12]:
regions

{'Asia', 'Europe', 'North America'}

In [13]:
names = ["Kangah", "Adwoa", "Esi", "Kua", "Maame"]
names = set(names)

In [14]:
names

{'Adwoa', 'Esi', 'Kangah', 'Kua', 'Maame'}

In [15]:
removed = names.pop()  # pop() removes and returns an arbitrary element; sets have no "last" item
removed

'Kangah'

You can add a new region to the set:

In [16]:
regions.add("Africa")
print("Updated regions:", regions)

Updated regions: {'Europe', 'Asia', 'Africa', 'North America'}


Since sets do not allow duplicates, adding an existing region will not change the set:

In [17]:
regions.add("Europe")  # Attempting to add a duplicate element
print("Regions after attempting to add duplicate:", regions)

Regions after attempting to add duplicate: {'Europe', 'Asia', 'Africa', 'North America'}
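
The duplicate-removal behavior is easiest to see when a set is built from a list that already contains repeats. The sketch below assumes a hypothetical survey log named `visited`; it is not data from this lecture.

```python
# Hypothetical survey log in which some regions were visited more than once
visited = ["Asia", "Europe", "Asia", "North America", "Europe"]

unique_regions = set(visited)  # Duplicates collapse to a single entry

print("Visits logged:", len(visited))
print("Unique regions:", len(unique_regions))

# Membership tests with the in operator
has_asia = "Asia" in unique_regions
has_africa = "Africa" in unique_regions

# Sets are unordered, so sort them when a stable display order matters
sorted_regions = sorted(unique_regions)
print(sorted_regions)
```

Sorting returns a new list and leaves the set itself unchanged.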


## Dictionaries

Dictionaries are collections of key-value pairs, where each key is unique. Dictionaries are extremely useful for storing data that is associated with specific identifiers, such as attribute data for geographic features.

For example, you can use a dictionary to store attributes of a geospatial feature, such as a city.

In [18]:
city_attributes = {
    "name": "Tokyo",
    "population": 13929286,
    "coordinates": (35.6895, 139.6917),
}  # Dictionary storing attributes of a city

You can access the values associated with specific keys:

In [19]:
city_name = city_attributes["name"]
city_population = city_attributes["population"]
print(f"City: {city_name}, Population: {city_population}")

City: Tokyo, Population: 13929286
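
Looking up a key that does not exist with square brackets raises a `KeyError`. A minimal sketch of two common ways to guard against this uses the built-in `dict.get()` method and the `in` operator. The `"elevation_m"` key is a hypothetical attribute that is deliberately absent from the dictionary.

```python
city_attributes = {
    "name": "Tokyo",
    "population": 13929286,
    "coordinates": (35.6895, 139.6917),
}

# get() returns a default (None unless specified) instead of raising KeyError
elevation = city_attributes.get("elevation_m")
elevation_label = city_attributes.get("elevation_m", "unknown")

# The in operator checks whether a key exists before accessing it
has_population = "population" in city_attributes

print(elevation, elevation_label, has_population)
```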


In [20]:
family = {
    "father": "Roland Kangah",
    "father_age": 58,
    "mother_age": 56,
    "senior": "Desmond Kangah",
    "senior_age": 34,
}

In [21]:
father = family["father"]
mother_age = family["mother_age"]
print(f"Father: {father}, Mother's age: {mother_age}")

Father: Roland Kangah, Mother's age: 56


In [22]:
family["mother_name"] = "Grace Kangah"

In [24]:
family.keys()

dict_keys(['father', 'father_age', 'mother_age', 'senior', 'senior_age', 'mother_name'])

You can also add or update key-value pairs in a dictionary:

In [25]:
city_attributes["area_km2"] = 2191  # Adding the area of the city in square kilometers
print("Updated city attributes:", city_attributes)

Updated city attributes: {'name': 'Tokyo', 'population': 13929286, 'coordinates': (35.6895, 139.6917), 'area_km2': 2191}
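
Updating works the same way as adding: assigning to a key that already exists overwrites its value. The sketch below also shows `update()`, `del`, and iteration with `items()`. It uses a shortened copy of the dictionary, and the revised population figure is a made-up value for illustration only.

```python
city_attributes = {"name": "Tokyo", "population": 13929286}

# Assigning to an existing key overwrites its value
city_attributes["population"] = 14000000  # Hypothetical revised figure

# update() adds or overwrites several pairs at once
city_attributes.update({"country": "Japan", "area_km2": 2191})

# del removes a key-value pair
del city_attributes["area_km2"]

# items() yields (key, value) pairs in insertion order
for key, value in city_attributes.items():
    print(f"{key}: {value}")
```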


## Exercises

Create a dictionary to store attributes of a geographic feature (e.g., a river or mountain). Include keys for the name, length, and location of the feature. Then, add an additional attribute (e.g., the source of the river or the height of the mountain) and print the dictionary.

In [26]:
mississippi_river = {
    "name": "Mississippi River",
    "length_km": 3766,  # Approximate; published figures vary
    "location": "United States",
}
mississippi_river["source"] = "Lake Itasca, Minnesota"  # Adding an additional attribute
print(mississippi_river)

{'name': 'Mississippi River', 'length_km': 3766, 'location': 'United States', 'source': 'Lake Itasca, Minnesota'}


## Summary

Understanding and using Python's core data structures (tuples, lists, sets, and dictionaries) is a fundamental skill in geospatial programming. Tuples hold fixed values such as coordinate pairs, lists hold ordered collections that change over time such as paths, sets keep only unique items, and dictionaries map keys to feature attributes.

Continue exploring these data structures by applying them to your geospatial projects and analyses.