# Python Fundamentals II
---
- Collection of data that logically belongs together: **Data Set**
- Type of variable that we use to store it in Python: **Data Structure**

## Lists
- a collection of data that stores multiple items in a single variable.
- the items are ordered, can be changed after the list is created (mutable), and can be duplicated, which makes lists very flexible.
- can store data of multiple types (not all the items in the list need to be the same type)


#### Creating Lists
- written with square brackets (`[]`)

In [431]:
price_php = [17000, 8000, 22000, 15000]
print(price_php)

[17000, 8000, 22000, 15000]


#### Working with Lists
- access any item in the list by referring to the item's **index number**.
- to access an item from the end of the list, use **negative indexing** (-1: last item, -2: second to last, and so on)
- to add an item to a list that already exists, use the **`append`** method
- items in a list can be **aggregated** (for example, summed or averaged) to make analysis more useful
- **`len`** function: returns the size of the list, i.e. how many items it has (the count starts at 1, unlike indexing)
  
*Note: In Python, the index of the first item in a list is always zero (0)*

In [432]:
# Access the 2nd item of the price_php list
print(price_php[1])

8000


In [433]:
# Access the last item in the price_php list
print(price_php[-1])

15000


In [434]:
# Append 12500 price to the price_php list
price_php.append(12500)
print(price_php)

[17000, 8000, 22000, 15000, 12500]


In [435]:
# Retrieving a slice of items from a list
print(price_php[1:4])
print(price_php[1:])
print(price_php[:1])

[8000, 22000, 15000]
[8000, 22000, 15000, 12500]
[17000]


In [436]:
# Slice using a step size
nums = [1,2,3,4,5,6,7,8,9,10,11,12]

print(nums[::2])
print(nums[4:1:-1])

[1, 3, 5, 7, 9, 11]
[5, 4, 3]


In [437]:
# Aggregating the price_php list: total value in Peso of the houses in the list
total_php = sum(price_php)
total_php

74500

In [438]:
# Average Value in Peso of the houses on the list (ave = sum/total count)
average_php = sum(price_php)/len(price_php)
average_php

14900.0

In [439]:
# Can store any type of data
int_list = [2, 6, 3049, 18, 37]
float_list = [3.7, 8.2, 178.245, 63.1]
mixed_list = [26, False, 'some words', 1.264]

print(int_list)
print(float_list)
print(mixed_list)

[2, 6, 3049, 18, 37]
[3.7, 8.2, 178.245, 63.1]
[26, False, 'some words', 1.264]


In [440]:
# a list inside of a list
list_of_lists = [['a', 'list', 'of', 'words'], [1, 5, 209], [True, True, False]]
print(list_of_lists)

[['a', 'list', 'of', 'words'], [1, 5, 209], [True, True, False]]


In [441]:
grocery_list = ['chicken', 'onions', 'rice', 'peppers', 'bananas']

# Use a for loop to iterate over the items in a list
for item in grocery_list:
    print(item)

chicken
onions
rice
peppers
bananas


Sometimes we will combine a for loop with indexing.<br>
The `range` function is useful for this. It returns a sequence of integers from the 1st argument up to, but not including, the 2nd argument, using the 3rd argument as the step size.

In [442]:
for i in range(0, len(grocery_list)):
    print(i, grocery_list[i])

0 chicken
1 onions
2 rice
3 peppers
4 bananas


In [443]:
for i in range(0, len(grocery_list),2):
    print(i, grocery_list[i])

0 chicken
2 rice
4 bananas


In [444]:
print(range(0, 10, 3))
print(range(104, 100, -1))
print(range(5)) # starts at 0 and counts by 1 by default

range(0, 10, 3)
range(104, 100, -1)
range(0, 5)
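
Printing a `range` shows only its definition, not the integers it produces. As a minimal sketch, wrapping each range in `list()` makes the generated values visible (the variable names below are illustrative, not from the original notebook):

```python
# Wrap each range in list() to see the integers it generates.
every_third = list(range(0, 10, 3))      # start 0, stop before 10, step 3
countdown = list(range(104, 100, -1))    # a negative step counts down
first_five = list(range(5))              # start defaults to 0, step to 1

print(every_third)  # [0, 3, 6, 9]
print(countdown)    # [104, 103, 102, 101]
print(first_five)   # [0, 1, 2, 3, 4]
```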


In [445]:
# Using indexing/slicing to replace items in the list
grocery_list = ['chicken', 'onions', 'rice', 'peppers', 'bananas']
print(grocery_list)

grocery_list[-1] = 'grapes'
print(grocery_list)

grocery_list[1:3] = ['carrots', 'pasta']
print(grocery_list)

['chicken', 'onions', 'rice', 'peppers', 'bananas']
['chicken', 'onions', 'rice', 'peppers', 'grapes']
['chicken', 'carrots', 'pasta', 'peppers', 'grapes']


In [446]:
# Adding items to a list
grocery_list = ['chicken', 'onions', 'rice', 'peppers', 'bananas']
print(grocery_list)
grocery_list.append('squash')
print(grocery_list)
grocery_list.append(['bread', 'salt'])
print(grocery_list)

['chicken', 'onions', 'rice', 'peppers', 'bananas']
['chicken', 'onions', 'rice', 'peppers', 'bananas', 'squash']
['chicken', 'onions', 'rice', 'peppers', 'bananas', 'squash', ['bread', 'salt']]


In [447]:
grocery_list = ['chicken', 'onions', 'rice', 'peppers', 'bananas']
print(grocery_list)
grocery_list.extend(['bread', 'salt'])
print(grocery_list)

['chicken', 'onions', 'rice', 'peppers', 'bananas']
['chicken', 'onions', 'rice', 'peppers', 'bananas', 'bread', 'salt']


In [448]:
# Removing items from a list
print(grocery_list)
del grocery_list[-1]
print(grocery_list)

['chicken', 'onions', 'rice', 'peppers', 'bananas', 'bread', 'salt']
['chicken', 'onions', 'rice', 'peppers', 'bananas', 'bread']


In [449]:
print(grocery_list)
print(grocery_list.pop(-1)) # default: removes and returns last item of a list
print(grocery_list)

['chicken', 'onions', 'rice', 'peppers', 'bananas', 'bread']
bread
['chicken', 'onions', 'rice', 'peppers', 'bananas']


In [450]:
# Sorting a list
grocery_list.sort()
print(grocery_list)

['bananas', 'chicken', 'onions', 'peppers', 'rice']


### Exercises

1. Make a list of 10 elements and select only the last 2 elements
2. Take that same list of 10 elements and select every other element starting with the very first element.
3. Select every other element starting with the second element.

In [451]:
# Solution
lst = list(range(0,10))
print(f"Original List: {lst}", end="\n\n")
print(f"Solution 1: {lst[-2:]}")
print(f"Solution 2: {lst[::2]}")
print(f"Solution 3: {lst[1::2]}")

Original List: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Solution 1: [8, 9]
Solution 2: [0, 2, 4, 6, 8]
Solution 3: [1, 3, 5, 7, 9]


---
## Tuple
- very similar to a `list` with 1 major difference -- it is **immutable**
- we create a `tuple` using parentheses `()`
  

In [452]:
example_tuple = ('L', 26, 167.6, True)
print(example_tuple)

('L', 26, 167.6, True)


- While we can retrieve data through indexing (because `tuple` is ordered), we cannot modify it as it is immutable.

In [453]:
print(example_tuple[2])
print(example_tuple[1:3])
print(example_tuple[-2])

167.6
(26, 167.6)
167.6


In [454]:
# # EXPECT ERROR
# del example_tuple[-1]
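
To see the error without stopping the notebook, the attempt can be wrapped in `try`/`except`. This is only a sketch; `error_message` is a name introduced here for illustration:

```python
example_tuple = ('L', 26, 167.6, True)

try:
    del example_tuple[-1]            # tuples do not support item deletion
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# The tuple is unchanged after the failed attempt.
print(example_tuple)
```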

- While for clarity we should enclose tuples with `()`, Python will assume we want a `tuple` if we don't use any symbols to enclose comma separated values

In [455]:
example_tuple = 'Nate', 36, 162.3, True
print(example_tuple)
print(type(example_tuple))

('Nate', 36, 162.3, True)
<class 'tuple'>


- A common mistake with immutability, especially with tuples, is to assume that data structures inside a tuple are immutable because the tuple itself is immutable.

In [456]:
tup = tuple([[], 'a'])
print(tup)
tup[0].append(1)
print(tup)

([], 'a')
([1], 'a')


---
## Set
- Similar to a `list`, except it is **unordered**
- Removes duplicates
- Created by enclosing data with curly brackets `{}` (note that an empty `{}` creates a `dict`; use `set()` for an empty set)

In [457]:
example_set = {'L', 27, 3.14, True}
print(example_set)

{27, 'L', 3.14, True}


In [458]:
# Expect error
# print(example_set[0])
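
The sketch below, under the assumption that we only want to observe the behavior, catches the indexing error and also shows duplicate removal and how to create an empty set. The names `letters` and `error_message` are illustrative:

```python
example_set = {'L', 27, 3.14, True}

try:
    example_set[0]                   # sets have no positions, so indexing fails
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# Duplicates collapse when a set is built.
letters = set(['a', 'b', 'a'])
print(len(letters))                  # 2

# {} creates an empty dict; set() creates an empty set.
print(type({}))                      # <class 'dict'>
print(type(set()))                   # <class 'set'>
```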

- Items can still be added and deleted from a set

In [459]:
print(example_set)
print(example_set.pop())
print(example_set)

{27, 'L', 3.14, True}
27
{'L', 3.14, True}


In [460]:
example_set.add('True')
print(example_set)
example_set.update([58.1, 'brown'])
print(example_set)

{True, 3.14, 'L', 'True'}
{True, 3.14, 58.1, 'brown', 'L', 'True'}


- **`add`** method of a `set` is the counterpart of the **`append`** method of a `list`
- **`update`** method of a `set` is the counterpart of the **`extend`** method of a `list`

----
## Dictionaries (dict)
- a collection of data that keeps its insertion order (Python 3.7+), can be changed after it is created, and does not allow duplicate keys.
- data in a dictionary is always presented as **keys** and **values** *(key-value pair)*

#### Creating Dictionaries
- Dictionaries are written in curly brackets `{}`, with key-value pairs inside: `dct = {key1:value1}`

In [461]:
work_details = {
    "company": "GoTyme",
    "department": "Data and Analytics",
    "members": 10
}

#### Working with Dictionaries
- one can **access any item** in a dictionary by using its key name inside square brackets.
- use the **`get`** method to **retrieve a value** by its key (it returns `None` instead of raising an error when the key is missing)
- use **`keys`** method: to access all keys in a dictionary

In [462]:
dept = work_details['department']
print(dept)

Data and Analytics


In [463]:
work_details.get('department')

'Data and Analytics'

In [464]:
work_details.keys()

dict_keys(['company', 'department', 'members'])

In [465]:
# To use keys in a list
list(work_details.keys())

['company', 'department', 'members']

In [466]:
# Iterate over keys
for k in work_details.keys():
    print(k)

company
department
members


In [467]:
# Iterate over values
for v in work_details.values():
    print(v)

GoTyme
Data and Analytics
10


In [468]:
# Iterate over key-value pairs
for k, v in work_details.items():
    print(f'{k}: {v}')

company: GoTyme
department: Data and Analytics
members: 10


In [469]:
# using .get method to retrieve value of a specific key
work_details.get('department')

'Data and Analytics'
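
The difference between `get` and square brackets shows up when a key is missing. A minimal sketch, assuming a hypothetical key `"location"` that is not in the dictionary:

```python
work_details = {
    "company": "GoTyme",
    "department": "Data and Analytics",
    "members": 10,
}

# .get returns None (or a supplied default) when the key is missing.
missing = work_details.get("location")
fallback = work_details.get("location", "unknown")
print(missing)    # None
print(fallback)   # unknown

# Square brackets raise KeyError for a missing key.
try:
    work_details["location"]
except KeyError as err:
    print("KeyError:", err)
```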

#### Zipping items
Given `area_m2 = [235.0, 135.0, 260.0, 150.5]`
- it might be useful to combine, or **zip**, 2 lists together. For example, we might want to create a new list that pairs each house price in `price_php` with its corresponding area in the `area_m2` list. To do this, we use the **`zip`** function
- `zip` can be very handy for creating a dictionary
- dictionary keys must be immutable (hashable) and unique, similar to the elements of a set

In [470]:
price_php = [17000, 8000, 22000, 15000]
area_m2 = [235.0, 135.0, 260.0, 150.5]

In [471]:
new_list = zip(price_php, area_m2)
new_list

<zip at 0x7c1544abcdc0>

In [472]:
zipped_list = list(new_list) # convert the zip object into an actual list
zipped_list

[(17000, 235.0), (8000, 135.0), (22000, 260.0), (15000, 150.5)]
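
Two properties of `zip` are worth keeping in mind: the zip object can be consumed only once, and it stops at the shortest input. The following sketch illustrates both (the variable names are illustrative):

```python
price_php = [17000, 8000, 22000, 15000]
area_m2 = [235.0, 135.0, 260.0, 150.5]

pairs = zip(price_php, area_m2)
first_pass = list(pairs)     # consumes the iterator
second_pass = list(pairs)    # nothing is left to read

print(first_pass)
print(second_pass)           # []

# zip stops as soon as the shortest input runs out.
short = list(zip([1, 2, 3], ['a', 'b']))
print(short)                 # [(1, 'a'), (2, 'b')]
```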

In [473]:
person = ['L', 27, 164.5, 50.0, 'black', 'brown', True]

In [474]:
value_list = person
key_list = ['name', 'age', 'height', 'weight', 'hair', 'eyes', 'has dog']

print(value_list)
print(key_list)

['L', 27, 164.5, 50.0, 'black', 'brown', True]
['name', 'age', 'height', 'weight', 'hair', 'eyes', 'has dog']


In [475]:
key_value_pairs = list(zip(key_list, value_list))
print(key_value_pairs)

[('name', 'L'), ('age', 27), ('height', 164.5), ('weight', 50.0), ('hair', 'black'), ('eyes', 'brown'), ('has dog', True)]


In [476]:
me_dict = dict(key_value_pairs)
print(me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True}


In [477]:
# invalid key
# invalid_dict = {[1, 5]: 'a', 5: 23}
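
Running the invalid example inside `try`/`except` shows why it fails: a list is mutable and therefore cannot be hashed. This is a sketch only; `error_message` is a name introduced here:

```python
try:
    invalid_dict = {[1, 5]: 'a', 5: 23}   # a list cannot be used as a key
except TypeError as err:
    error_message = str(err)
    print("TypeError:", error_message)

# A tuple with the same contents is hashable, so it works as a key.
valid_dict = {(1, 5): 'a', 5: 23}
print(valid_dict[(1, 5)])                 # a
```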

In [478]:
valid_dict = {(1, 5): 'a', 5: [23, 6]}
print(valid_dict)

{(1, 5): 'a', 5: [23, 6]}


In [479]:
# Adding key-value pair in a dict.
print(me_dict)
me_dict['favorite book'] =  'The Little Prince'
print(me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True}
{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True, 'favorite book': 'The Little Prince'}


In [480]:
# Update/extend existing dictionary
print(me_dict)
me_dict.update({'favorite color': 'white/black', 'siblings': 2})
print(me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True, 'favorite book': 'The Little Prince'}
{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True, 'favorite book': 'The Little Prince', 'favorite color': 'white/black', 'siblings': 2}


In [481]:
# Replacing or deleting key-value pairs
print(me_dict)
me_dict['hair'] = 'blonde'
print(me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'black', 'eyes': 'brown', 'has dog': True, 'favorite book': 'The Little Prince', 'favorite color': 'white/black', 'siblings': 2}
{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'blonde', 'eyes': 'brown', 'has dog': True, 'favorite book': 'The Little Prince', 'favorite color': 'white/black', 'siblings': 2}


In [482]:
del me_dict['favorite book']
print(me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'blonde', 'eyes': 'brown', 'has dog': True, 'favorite color': 'white/black', 'siblings': 2}


In [483]:
print(me_dict.pop('siblings'))
print(me_dict)

2
{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'blonde', 'eyes': 'brown', 'has dog': True, 'favorite color': 'white/black'}


## Switching Data Structures

In [484]:
example_list = ['a', 'b', 23, 10, True, 'a', 10]
example_tuple = tuple(example_list)
example_set = set(example_tuple)
example_list = list(example_set)

print(example_tuple)
print(example_set)
print(example_list) # lost the duplicates because of set

('a', 'b', 23, 10, True, 'a', 10)
{True, 'b', 'a', 10, 23}
[True, 'b', 'a', 10, 23]


### Search

In [485]:
print(example_list)
print('a' in example_list)
print('c' in example_list)

[True, 'b', 'a', 10, 23]
True
False


- When dealing with a dictionary, the `in` operator searches the keys, not the values (use `.values()` to search the values)

In [486]:
print(me_dict)
print('hair' in me_dict)
print('has cat' in me_dict)
print('brown' in me_dict)

{'name': 'L', 'age': 27, 'height': 164.5, 'weight': 50.0, 'hair': 'blonde', 'eyes': 'brown', 'has dog': True, 'favorite color': 'white/black'}
True
False
False
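
Values can still be searched by asking for them explicitly. A minimal sketch, using a shortened copy of `me_dict` so the block runs on its own:

```python
me_dict = {'name': 'L', 'hair': 'blonde', 'eyes': 'brown'}  # shortened copy

print('brown' in me_dict)             # False: only keys are checked
print('brown' in me_dict.values())    # True: searches the values

# Find which keys hold a given value.
keys_with_brown = [k for k, v in me_dict.items() if v == 'brown']
print(keys_with_brown)                # ['eyes']
```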


## Comprehensions
- Python has a special syntax called **comprehension** for combining iteration with the creation of a data structure.
- essentially a `for` loop wrapped in the appropriate brackets for creating the data structure

In [487]:
squares = [x**2 for x in range(10)]
square_lut = {x: x**2 for x in range(10)}

print(squares)
print(square_lut)

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}


In [488]:
me_dict_dtypes = {k: type(v) for k, v in me_dict.items()}
print(me_dict_dtypes)

{'name': <class 'str'>, 'age': <class 'int'>, 'height': <class 'float'>, 'weight': <class 'float'>, 'hair': <class 'str'>, 'eyes': <class 'str'>, 'has dog': <class 'bool'>, 'favorite color': <class 'str'>}


- Comparing for loop implementation with comprehension

In [489]:
# For Loop
square_lut = {}
for x in range(10):
    square_lut[x] = x**2

print(square_lut)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}


In [490]:
# Comprehension
square_lut = {x: x**2 for x in range(10)}

print(square_lut)

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
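
Comprehensions can also carry an `if` condition, and curly brackets without a colon build a set. The sketch below shows both forms; the variable names are illustrative:

```python
# Keep only the even numbers before squaring them.
even_squares = [x**2 for x in range(10) if x % 2 == 0]
print(even_squares)          # [0, 4, 16, 36, 64]

# Set comprehension: duplicates are removed automatically.
remainders = {x % 3 for x in range(10)}
print(sorted(remainders))    # [0, 1, 2]
```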


In [491]:
# Parallel lists = toy "rows"
customers = ["Ana", "Ben", "Ana", "Cara"]
products  = ["Pen", "Pen", "Notebook", "Pen"]
qtys      = [2, 1, 3, 4]
unit_price = {"Pen": 12.5, "Notebook": 40.0}  # dictionary lookup

# 1) Compute line totals and aggregate revenue per product using a dict
revenue_per_product = {}          # e.g., {"Pen": 87.5, "Notebook": 120.0}
print("Line totals:")
for cust, prod, q in zip(customers, products, qtys):
    line_total = q * unit_price.get(prod, 0)
    print(cust, prod, q, "->", line_total)
    revenue_per_product[prod] = revenue_per_product.get(prod, 0) + line_total

# 2) Unique customers (set)
unique_customers = set(customers)

# 3) Tuple to represent an immutable (product, min_stock) rule
min_stock_rule = ("Pen", 5)   # don’t mutate this
rule_product, rule_min = min_stock_rule

print("Revenue per product:", revenue_per_product)
print("Unique customers:", unique_customers)
print("Rule product:", rule_product, "Min stock:", rule_min)


Line totals:
Ana Pen 2 -> 25.0
Ben Pen 1 -> 12.5
Ana Notebook 3 -> 120.0
Cara Pen 4 -> 50.0
Revenue per product: {'Pen': 87.5, 'Notebook': 120.0}
Unique customers: {'Ana', 'Cara', 'Ben'}
Rule product: Pen Min stock: 5


----
## JSON
- stands for JavaScript Object Notation
- a text format for storing and transporting data

#### Working with JSON
- works by creating key-value pairs, where the key is always a string.
- JSON values can be strings, numbers, objects, arrays, booleans, or null.
- usually comes as an array of objects, which maps to a list of dictionaries in Python

In [492]:
# JSON example (written here as a Python list of dictionaries)
[
    {"company": "GoTyme", "department": "Data and Analytics"},
    {"company": "GoTyme", "department": "Data and Analytics"},
    {"company": "GoTyme", "department": "Data and Analytics"},
]

[{'company': 'GoTyme', 'department': 'Data and Analytics'},
 {'company': 'GoTyme', 'department': 'Data and Analytics'},
 {'company': 'GoTyme', 'department': 'Data and Analytics'}]
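
Python's standard `json` module converts between JSON text and Python objects. The following is a minimal sketch; the keys `"active"` and `"manager"` are made up for illustration:

```python
import json

records = [
    {"company": "GoTyme", "department": "Data and Analytics", "members": 10},
]

text = json.dumps(records)     # Python objects -> JSON string
restored = json.loads(text)    # JSON string -> Python objects

print(type(text))              # <class 'str'>
print(restored == records)     # True

# JSON literals differ slightly from Python: true/false/null.
flags = json.loads('{"active": true, "manager": null}')
print(flags)                   # {'active': True, 'manager': None}
```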

----
## Exercises

#### 1. Clean product names to lowercase, get unique names, and count occurrences.

In [493]:
# Given data
raw = ["Pen", "pen", "PEN", "Notebook", "Pen"]

##### Function: `to_lower`
- Accepts a list as the argument.
- Uses a list comprehension to build a new list and returns it.
- The list comprehension iterates through `str_list` and collects the lowercase version of each element.

In [494]:
def to_lower(str_list: list) -> list:
  return [x.lower() for x in str_list]  # List comprehension

##### Function: `get_unique`
- Converts `str_list` to a set, leaving only the unique names.
- Converts the set back into a list and returns the list.

In [495]:
def get_unique(str_list) -> list:
  return list(set(str_list))

##### Function: `count_occurrences`
- Uses a dictionary comprehension and returns the dictionary.
- The comprehension iterates through `str_list`; each element `x` becomes a key, and its value is the number of times it appears, as returned by `str_list.count(x)`.

In [496]:
def count_occurrences(str_list) -> dict:
  return {x: str_list.count(x) for x in str_list} # Dictionary Comprehension

In [497]:
lowercase_data = to_lower(raw)
unique_data = get_unique(lowercase_data)
occurrences_dict = count_occurrences(lowercase_data)

print(f"Product Names changed to lowercase: {lowercase_data}")
print(f"Unique product names: {unique_data}")
print(f"Occurrences of each product: {occurrences_dict}")

Product Names changed to lowercase: ['pen', 'pen', 'pen', 'notebook', 'pen']
Unique product names: ['pen', 'notebook']
Occurrences of each product: {'pen': 4, 'notebook': 1}
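
As an alternative to the dictionary comprehension, which rescans the list once per element, the standard library's `collections.Counter` tallies everything in a single pass. This is offered as a sketch of another option, not as the intended solution:

```python
from collections import Counter

lowercase_data = ['pen', 'pen', 'pen', 'notebook', 'pen']

counts = Counter(lowercase_data)   # one pass over the list
print(counts)                      # Counter({'pen': 4, 'notebook': 1})
print(dict(counts))                # {'pen': 4, 'notebook': 1}
```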


#### 2. You have tuples (name, sku, price). Build a dict sku -> price and look up one SKU safely.
(sku: just a product code)

In [509]:
p1 = ("Pen", "SKU001", 12.5)
p2 = ("Notebook", "SKU002", 40.0)

In [523]:
def to_dict(p_tuples) -> dict:
  # sku_map = {}
  # for name, sku, price in p_tuples:
  #   sku_map[sku] = price
  # return sku_map
  return {sku : price for name, sku, price in p_tuples}

In [557]:
def lookup_price(sku_map, sku: str):
  # Returns the price, or None when the SKU is not in the map
  return sku_map.get(sku)

In [558]:
product_list = [p1, p2]

skumap = to_dict(product_list)

print("Product List")
print("-" * 50)

print(f"   {'Product Name':<20} {'Product Code':<15} {'Price':>10}")
for name, sku, price in product_list:
  print(f"-> {name:<20} {sku:<15} {price:>10}")

print("-" * 50)

# Single quotes inside the f-string keep this valid on Python versions before 3.12
print(f"Price of SKU001: {lookup_price(skumap, 'SKU001')}")

Product List
--------------------------------------------------
   Product Name         Product Code         Price
-> Pen                  SKU001                12.5
-> Notebook             SKU002                40.0
--------------------------------------------------
Price of SKU001: 12.5
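
The "safely" part of the exercise matters when the SKU does not exist. A small sketch, assuming a hypothetical code `"SKU999"` that is not in the map:

```python
sku_map = {"SKU001": 12.5, "SKU002": 40.0}

def lookup_price(sku_map, sku):
    # dict.get returns None instead of raising KeyError for a missing key
    return sku_map.get(sku)

found = lookup_price(sku_map, "SKU001")
missing = lookup_price(sku_map, "SKU999")   # hypothetical SKU, not in the map

print(found)      # 12.5
print(missing)    # None
if missing is None:
    print("SKU999 not found")
```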


#### 3. Use zip to walk parallel lists and compute per-row totals and a grand total.

In [578]:
items = ["Pen","Pen","Notebook"]
qty   = [2, 1, 3]
price = [12.5, 12.5, 40.0]

In [584]:
def compute_item_total(tuple_data):
  grand_total = 0
  for name, quantity, item_price in tuple_data:
    # Use the unpacked per-row values, not the global qty and price lists
    row_total = quantity * item_price
    grand_total += row_total
    print(f"{quantity} * {item_price} = {row_total}")
  return grand_total

In [575]:
testvar1 = list(zip(qty, price))
testdict1 = dict(zip(items, testvar1))
print(testdict1)  # the repeated key "Pen" keeps only its last (qty, price) pair
# print(data)

data_dict = {key : (q, p) for key, q, p in zip(items, qty, price)}
print(data_dict)
print(type(data_dict['Notebook']))

{'Pen': (1, 12.5), 'Notebook': (3, 40.0)}
{'Pen': (1, 12.5), 'Notebook': (3, 40.0)}
<class 'tuple'>


In [586]:
data = zip(items, qty, price)

print(f"   {'Product Name':<20} {'Product Quantity':<15} {'Price':>10}")
compute_item_total(data)

   Product Name         Product Quantity      Price
2 * 12.5 = 25.0
1 * 12.5 = 12.5
3 * 40.0 = 120.0

157.5

#### 4. Compare old vs new prices and label Up/Down/No change.

In [501]:
products = ["Pen", "Notebook", "Marker"]
old_p    = [10.0, 40.0, 20.0]
new_p    = [12.0, 40.0, 18.0]

In [502]:
# Sol'n
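
One possible solution sketch for this exercise, using `zip` and a small helper. The helper name `label_change` and the variable `labels` are assumptions made here, not part of the original notebook:

```python
products = ["Pen", "Notebook", "Marker"]
old_p    = [10.0, 40.0, 20.0]
new_p    = [12.0, 40.0, 18.0]

def label_change(old, new):
    # Compare one old/new price pair and return its label
    if new > old:
        return "Up"
    if new < old:
        return "Down"
    return "No change"

labels = {name: label_change(old, new)
          for name, old, new in zip(products, old_p, new_p)}
print(labels)  # {'Pen': 'Up', 'Notebook': 'No change', 'Marker': 'Down'}
```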

#### 5. Sum amounts by category

In [503]:
cats   = ["food","transport","food","other","food"]
amount = [120,50,80,30,60]

In [504]:
# Sol'n
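
One possible solution sketch, reusing the `dict.get` accumulation pattern from the revenue example earlier in this notebook. The variable name `totals` is an assumption:

```python
cats   = ["food", "transport", "food", "other", "food"]
amount = [120, 50, 80, 30, 60]

totals = {}
for cat, amt in zip(cats, amount):
    # Start from 0 the first time a category is seen
    totals[cat] = totals.get(cat, 0) + amt

print(totals)  # {'food': 260, 'transport': 50, 'other': 30}
```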

#### 6. Build a quick inventory dict from parallel lists

In [505]:
items = ["Mug", "T-Shirt", "Sticker"]
qtys  = [10, 4, 25]

In [506]:
# Sol'n
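
One possible solution sketch: `dict(zip(...))` pairs each item with its quantity. The name `inventory` and the missing item `"Cap"` are assumptions for illustration:

```python
items = ["Mug", "T-Shirt", "Sticker"]
qtys  = [10, 4, 25]

inventory = dict(zip(items, qtys))
print(inventory)                  # {'Mug': 10, 'T-Shirt': 4, 'Sticker': 25}

# Look up an item that is not stocked without raising KeyError.
print(inventory.get("Cap", 0))    # 0
```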

#### 7. Budget vs Actual (Category Labels). Label each category as "Under", "On", or "Over" budget.

In [507]:
cats   = ["Rent", "Groceries", "Transport", "Phone"]
budget = [8000,     3500,        1200,        600]
actual = [8000,     3700,        900,         650]

In [508]:
# Sol'n
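
One possible solution sketch, walking the three parallel lists with `zip`. The helper name `budget_label` and the variable `status` are assumptions introduced here:

```python
cats   = ["Rent", "Groceries", "Transport", "Phone"]
budget = [8000, 3500, 1200, 600]
actual = [8000, 3700, 900, 650]

def budget_label(planned, spent):
    # Compare actual spending with the planned budget
    if spent > planned:
        return "Over"
    if spent < planned:
        return "Under"
    return "On"

status = {cat: budget_label(b, a) for cat, b, a in zip(cats, budget, actual)}
print(status)
# {'Rent': 'On', 'Groceries': 'Over', 'Transport': 'Under', 'Phone': 'Over'}
```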

---
*End of Code*