# Session 3: Data Structures - Organizing Financial Data

**Objective:** Learn how to use Python's core data structures—lists, tuples, dictionaries, and sets—to organize and manage financial data effectively.

## Introduction

So far, we've worked with single pieces of data (a single price, a single name). But in finance, we deal with large collections of related data: a portfolio of stocks, a time series of prices, or the fundamentals of a company. Data structures are how we organize these collections.

## 1. Lists: Ordered Collections

A **list** is an ordered, changeable collection of items. Lists are one of the most versatile data structures in Python.

- **Ordered:** Items keep the order in which you place them. That order changes only when you modify the list yourself, for example by sorting it or inserting an item.
- **Changeable (Mutable):** You can add, remove, or change items in a list after it has been created.
- **Syntax:** Items are enclosed in square brackets `[]`, separated by commas.

In [1]:
# A list of S&P 500 sector names
sp500_sectors = ["Information Technology", "Health Care", "Financials", "Consumer Discretionary"]
print(sp500_sectors)

# Accessing items by index (starts at 0)
first_sector = sp500_sectors[0]
print(f"The first sector is: {first_sector}")

# Slicing: Getting a range of items
top_two_sectors = sp500_sectors[0:2] # Gets items at index 0 and 1
print(f"The top two sectors are: {top_two_sectors}")

# Adding an item to the end of the list
sp500_sectors.append("Communication Services")
print(f"List after adding an item: {sp500_sectors}")

# Removing an item
sp500_sectors.remove("Financials")
print(f"List after removing an item: {sp500_sectors}")

['Information Technology', 'Health Care', 'Financials', 'Consumer Discretionary']
The first sector is: Information Technology
The top two sectors are: ['Information Technology', 'Health Care']
List after adding an item: ['Information Technology', 'Health Care', 'Financials', 'Consumer Discretionary', 'Communication Services']
List after removing an item: ['Information Technology', 'Health Care', 'Consumer Discretionary', 'Communication Services']
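
To make "ordered" and "mutable" more concrete, here is a minimal sketch of a few more list operations. The `closing_prices` values are made up for illustration and are not real market data.

```python
# Hypothetical example data: a short list of daily closing prices
closing_prices = [101.5, 99.8, 103.2, 102.1]

# Negative indices count from the end of the list
latest_price = closing_prices[-1]
print(f"Latest price: {latest_price}")

# insert() places an item at a given position instead of at the end
closing_prices.insert(0, 100.0)
print(f"After insert: {closing_prices}")

# sorted() returns a new sorted list and leaves the original untouched
ascending = sorted(closing_prices)
print(f"Sorted copy: {ascending}")

# sort() reorders the list in place, so the original order is lost
closing_prices.sort(reverse=True)
print(f"Sorted in place (descending): {closing_prices}")
```

For a time series, the original order usually matters, so `sorted()` is often the safer choice because it leaves the source list as it was.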


## 2. Tuples: Immutable Collections

A **tuple** is an ordered, *unchangeable* collection of items.

- **Unchangeable (Immutable):** Once a tuple is created, you cannot add, remove, or change its items.
- **Use Case:** They are useful for data that should not change, like a fixed pair of coordinates or, in our case, a stock's unique identifiers.
- **Syntax:** Items are enclosed in parentheses `()`, separated by commas.

In [4]:
# A tuple representing a stock's CUSIP and Ticker
stock_identifier = ("037833100", "AAPL")
print(stock_identifier)

# You can access items just like a list
cusip = stock_identifier[0]
ticker = stock_identifier[1]
print(f"CUSIP: {cusip}, Ticker: {ticker}")

# But you CANNOT change an item (this line will cause an error if you run it)
# stock_identifier[1] = "MSFT" # TypeError: 'tuple' object does not support item assignment

('037833100', 'AAPL')
CUSIP: 037833100, Ticker: AAPL
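
Two tuple behaviours are worth seeing in action: unpacking a tuple into separate variables, and the error raised when you try to change an item. The sketch below reuses the `stock_identifier` tuple from the cell above and catches the error so the cell still runs to the end.

```python
stock_identifier = ("037833100", "AAPL")

# Tuple unpacking: assign each item to its own variable in one line
cusip, ticker = stock_identifier
print(f"CUSIP: {cusip}, Ticker: {ticker}")

# Trying to change an item raises a TypeError; we catch it here
# so that the rest of the code keeps running
try:
    stock_identifier[1] = "MSFT"
except TypeError as error:
    error_message = str(error)
    print(f"Could not modify the tuple: {error_message}")

# The tuple is unchanged
print(stock_identifier)
```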


## 3. Dictionaries: Key-Value Pairs

A **dictionary** is a collection of key-value pairs. Since Python 3.7, dictionaries preserve the order in which items were inserted; in older versions they were unordered. They are optimized for retrieving data when you know the key.

- **Key-Value:** Each item has a `key` and a corresponding `value`.
- **Changeable (Mutable):** You can add, remove, or change items.
- **Use Case:** Perfect for storing data with named attributes, like a company's financial metrics.
- **Syntax:** Items are enclosed in curly braces `{}`, with each key-value pair written as `key: value`.

In [5]:
# A dictionary of a company's fundamental data
company_data = {
    "ticker": "MSFT",
    "market_cap_billions": 2140,
    "pe_ratio": 28.6,
    "sector": "Information Technology"
}

print(company_data)

# Accessing a value by its key
market_cap = company_data["market_cap_billions"]
print(f"Market Cap: ${market_cap}B")

# Adding a new key-value pair
company_data["dividend_yield"] = 0.011
print(f"Data after adding dividend yield: {company_data}")

# Changing an existing value
company_data["pe_ratio"] = 29.1
print(f"Data after updating P/E ratio: {company_data}")

{'ticker': 'MSFT', 'market_cap_billions': 2140, 'pe_ratio': 28.6, 'sector': 'Information Technology'}
Market Cap: $2140B
Data after adding dividend yield: {'ticker': 'MSFT', 'market_cap_billions': 2140, 'pe_ratio': 28.6, 'sector': 'Information Technology', 'dividend_yield': 0.011}
Data after updating P/E ratio: {'ticker': 'MSFT', 'market_cap_billions': 2140, 'pe_ratio': 29.1, 'sector': 'Information Technology', 'dividend_yield': 0.011}
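
Accessing a missing key with square brackets raises a `KeyError`. The sketch below shows some common ways to handle that and to walk through a dictionary. The `"beta"` key is a hypothetical metric, chosen only because it is absent from the data.

```python
company_data = {
    "ticker": "MSFT",
    "pe_ratio": 29.1,
    "sector": "Information Technology",
}

# .get() returns None (or a default you choose) instead of raising KeyError
beta = company_data.get("beta")
beta_or_default = company_data.get("beta", "n/a")
print(f"Beta: {beta}, with default: {beta_or_default}")

# The `in` operator checks whether a key exists
has_pe_ratio = "pe_ratio" in company_data
print(f"Has P/E ratio: {has_pe_ratio}")

# .items() yields each key together with its value
for key, value in company_data.items():
    print(f"{key}: {value}")

# .pop() removes a key and returns its value
removed_sector = company_data.pop("sector")
print(f"Removed: {removed_sector}, remaining keys: {list(company_data)}")
```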


## 4. Sets: Unordered & Unique Items

A **set** is an unordered collection with no duplicate items.

- **Unique:** Sets automatically remove any duplicates.
- **Unordered:** Items in a set do not have a defined order, so the order in which they print can differ from one run to the next.
- **Use Case:** Great for membership testing and finding unique values in a collection.
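- **Syntax:** Items are enclosed in curly braces `{}`, separated by commas. To create an empty set, use `set()`, because `{}` on its own creates an empty dictionary.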

In [6]:
# A list of portfolio sectors with duplicates
portfolio_sectors_list = ["Technology", "Health Care", "Financials", "Technology", "Industrials"]
print(f"Original list of sectors: {portfolio_sectors_list}")

# Create a set to find the unique sectors
unique_sectors = set(portfolio_sectors_list)
print(f"Unique sectors in portfolio: {unique_sectors}")

Original list of sectors: ['Technology', 'Health Care', 'Financials', 'Technology', 'Industrials']
Unique sectors in portfolio: {'Industrials', 'Technology', 'Financials', 'Health Care'}
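
Sets also support membership tests and comparisons between collections. As a minimal sketch, the two portfolios below are hypothetical sets of tickers; `sorted()` is used when printing only to get a stable display order.

```python
# Hypothetical holdings of two portfolios, stored as sets of tickers
portfolio_a = {"AAPL", "MSFT", "JNJ"}
portfolio_b = {"MSFT", "V", "JNJ"}

# Membership testing with `in`
holds_visa = "V" in portfolio_a
print(f"Portfolio A holds V: {holds_visa}")

# Intersection: tickers held in both portfolios
shared = portfolio_a & portfolio_b
print(f"Held in both: {sorted(shared)}")

# Difference: tickers held in A but not in B
only_in_a = portfolio_a - portfolio_b
print(f"Only in A: {sorted(only_in_a)}")

# Union: every ticker held in either portfolio
all_tickers = portfolio_a | portfolio_b
print(f"All tickers: {sorted(all_tickers)}")

# An empty set needs set(); {} creates an empty dictionary
empty_set = set()
empty_dict = {}
print(type(empty_set), type(empty_dict))
```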


---

## Finance Exercise: Portfolio Management

Let's combine these data structures to build a simple portfolio tracker.

**Task:** You will create and manipulate a portfolio represented as a list of dictionaries.

In [7]:
# Part 1: Create the Portfolio
# A portfolio is a list, where each item is a dictionary representing a stock holding.
portfolio = [
    {"ticker": "AAPL", "shares": 100, "sector": "Technology"},
    {"ticker": "GOOGL", "shares": 50, "sector": "Technology"},
    {"ticker": "JNJ", "shares": 75, "sector": "Health Care"},
    {"ticker": "V", "shares": 120, "sector": "Financials"}
]

print("--- Initial Portfolio ---")
print(portfolio)

--- Initial Portfolio ---
[{'ticker': 'AAPL', 'shares': 100, 'sector': 'Technology'}, {'ticker': 'GOOGL', 'shares': 50, 'sector': 'Technology'}, {'ticker': 'JNJ', 'shares': 75, 'sector': 'Health Care'}, {'ticker': 'V', 'shares': 120, 'sector': 'Financials'}]


In [8]:
# Part 2: Analyze the Portfolio

# a) Find all the unique sectors in your portfolio.
# Hint: You'll need to create a list of all sectors first, then convert it to a set.
all_sectors = []
# We will learn about loops next session. For now, here is how you can do it:
for stock in portfolio:
    all_sectors.append(stock["sector"])

unique_sectors_in_portfolio = set(all_sectors)
print(f"\nUnique Sectors: {unique_sectors_in_portfolio}")

# b) Let's say you want to look up info on a specific stock.
# Access and print the dictionary for the second stock in your portfolio (GOOGL).
google_stock_info = portfolio[1] # Remember list indexing starts at 0
print(f"\nInformation for Google stock: {google_stock_info}")

# c) From that stock's info, print out how many shares you own.
google_shares = google_stock_info["shares"]
print(f"Shares of Google held: {google_shares}")


Unique Sectors: {'Technology', 'Financials', 'Health Care'}

Information for Google stock: {'ticker': 'GOOGL', 'shares': 50, 'sector': 'Technology'}
Shares of Google held: 50
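
Looking up a stock by position works only if you remember where it sits in the list. One possible alternative, sketched below, is to build a dictionary keyed by ticker. The name `holdings_by_ticker` is introduced here for illustration and is not part of the exercise.

```python
portfolio = [
    {"ticker": "AAPL", "shares": 100, "sector": "Technology"},
    {"ticker": "GOOGL", "shares": 50, "sector": "Technology"},
    {"ticker": "JNJ", "shares": 75, "sector": "Health Care"},
    {"ticker": "V", "shares": 120, "sector": "Financials"},
]

# Build a lookup table: ticker -> holding dictionary
holdings_by_ticker = {}
for stock in portfolio:
    holdings_by_ticker[stock["ticker"]] = stock

# Now a holding can be found by name rather than by index
google_shares = holdings_by_ticker["GOOGL"]["shares"]
print(f"Shares of Google held: {google_shares}")
```

The dictionary values are the same holding dictionaries stored in the list, not copies, so a change made through one is visible through the other.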


In [9]:
# Part 3: Update the Portfolio

# a) You buy a new stock: 200 shares of Microsoft ("MSFT") in the "Technology" sector.
# Create a new dictionary for this holding.
new_holding = {"ticker": "MSFT", "shares": 200, "sector": "Technology"}

# b) Add this new holding to your portfolio list.
# Hint: Use the .append() method.
portfolio.append(new_holding)

print("\n--- Updated Portfolio ---")
print(portfolio)


--- Updated Portfolio ---
[{'ticker': 'AAPL', 'shares': 100, 'sector': 'Technology'}, {'ticker': 'GOOGL', 'shares': 50, 'sector': 'Technology'}, {'ticker': 'JNJ', 'shares': 75, 'sector': 'Health Care'}, {'ticker': 'V', 'shares': 120, 'sector': 'Financials'}, {'ticker': 'MSFT', 'shares': 200, 'sector': 'Technology'}]
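
As a possible next step, the updated portfolio can be summarized by sector. This sketch is an optional extension rather than part of the exercise; it combines a dictionary with `.get()` to keep a running total of shares per sector, and the name `shares_by_sector` is made up for illustration.

```python
# The updated portfolio from Part 3, repeated so this cell runs on its own
portfolio = [
    {"ticker": "AAPL", "shares": 100, "sector": "Technology"},
    {"ticker": "GOOGL", "shares": 50, "sector": "Technology"},
    {"ticker": "JNJ", "shares": 75, "sector": "Health Care"},
    {"ticker": "V", "shares": 120, "sector": "Financials"},
    {"ticker": "MSFT", "shares": 200, "sector": "Technology"},
]

# Running total of shares per sector; .get(sector, 0) starts new sectors at 0
shares_by_sector = {}
for stock in portfolio:
    sector = stock["sector"]
    shares_by_sector[sector] = shares_by_sector.get(sector, 0) + stock["shares"]

print(shares_by_sector)
```

Share counts from different stocks are not directly comparable in value, so treat this as practice with the data structures rather than as a measure of sector exposure.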
