<a href="https://colab.research.google.com/github/alirempel/cap-comp215/blob/main/lessons/week02-data-structures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sequence and Map data structures - Strings, Tuples, Lists, Dictionaries
Our week 2 lesson workbook, available on GitHub from the powderflask/cap-comp215 repository.

As usual, the first code block just imports the modules we will use.

In [None]:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pprint import pprint

## f-strings
A `string` is a sequence of characters / symbols.
This familiar data structure is quite powerful, and format strings (f-strings) take it to the next level by embedding expressions directly in the text.

In [None]:
today = datetime.date.today()
the_answer = 42
PI = 3.1415926535

f'{today} is not special, but {the_answer} and {PI} are!'

'2024-01-11 is not special, but 42 and 3.1415926535 are!'
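
Inside the braces, an optional format specifier after a colon controls how each value is rendered. The sketch below reuses the names from the cell above; the fixed date and the particular specifiers are assumptions chosen only to make the output repeatable.

```python
import datetime

today = datetime.date(2024, 1, 11)    # fixed date (an assumption) so the result is repeatable
the_answer = 42
PI = 3.1415926535

rounded = f'{PI:.2f}'                 # keep 2 digits after the decimal point
padded = f'{the_answer:>6}'           # right-align in a field 6 characters wide
slashed_date = f'{today:%Y/%m/%d}'    # date objects accept strftime codes

print(rounded, padded, slashed_date, sep=' | ')
```

The specifier never changes the underlying value; it only changes the text that ends up in the string.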

## List Comprehension
Provides a compact syntax for two very common sequence-processing algorithms: Map and Filter.

Basic syntax:

In [None]:
# [f(x) for x in C if g(x)]             # general form of a list comprehension
[i for i in range(100) if i % 5 == 0]   # filtering operation

[0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
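
To see what the compact syntax stands for, here is a sketch of the same filtering operation written as an ordinary loop. The names `multiples` and `comprehension` are introduced here only for illustration.

```python
# Longhand version of: [i for i in range(100) if i % 5 == 0]
multiples = []                  # the list being built
for i in range(100):            # "for x in C"
    if i % 5 == 0:              # "if g(x)"
        multiples.append(i)     # "f(x)" is just x itself in this case

comprehension = [i for i in range(100) if i % 5 == 0]
print(multiples == comprehension)
```

Both versions build the same list; the comprehension simply folds the loop, the test, and the append into one expression.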

### Map Algorithm
Apply the same function to every item in another sequence (i.e., provide a "mapping" from the source sequence to the target sequence).

In [None]:
# Problem:  compute the first 10 natural squares
[s**2 for s in range(1,11)]

[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
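
Python also has a built-in `map` function that applies a function to each item of a sequence. As a rough comparison (the helper `square` is a hypothetical name used only in this sketch), the two forms below should give the same list:

```python
def square(n):
    """Return n squared."""
    return n ** 2

# map() returns a lazy iterator, so wrap it in list() to see the values
squares_map = list(map(square, range(1, 11)))

# the list comprehension applies the same function to the same source
squares_comp = [square(s) for s in range(1, 11)]

print(squares_map == squares_comp)
```

The comprehension is usually easier to read when the function is a short expression like `s**2`.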

### Filter
Select a subset of the elements from another sequence based on some criteria.

In [None]:
VOWELS = 'aeiou'
text = '''
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
'''
# Problem:  create a string with just the vowels from the text, in order.
''.join([v for v in text if v in VOWELS])




'oeiuooiaeoeeuaiiieieoeiuoeoiiiuuaoeeooeaaaiuaeiaiieiauioueeiaiouaoaoiiiuaiuieeaooooeua'
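
Note that `VOWELS` holds only lowercase letters, so a capital vowel such as the "U" in "Ut" is filtered out. A small sketch of one way to make the test case-insensitive, using a short sample string chosen here for illustration:

```python
VOWELS = 'aeiou'
sample = 'Ut enim ad minim'   # short sample (an assumption) taken from the text above

# case-sensitive filter: the capital 'U' is not in VOWELS, so it is dropped
strict = ''.join([v for v in sample if v in VOWELS])

# case-insensitive filter: compare the lowercase form, but keep the original character
relaxed = ''.join([v for v in sample if v.lower() in VOWELS])

print(strict, relaxed)
```

Only the condition changes; the expression before `for` still decides what goes into the result.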

## Data Wrangling with List Comprehension
E-learn's Live Quiz module does track quiz scores for each student, but does not store them in the gradebook,
and it reports on them in the most useless way.

Let's do some "data wrangling" to make sense out of this mess!

### The Problem: Unstructured Data!
Notice it is just a single large string!  The real data set has 36 students, and I need to do this every week!

In [None]:
scores = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  3.         Ambily Piturbed scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%


  5.       Ayushma Jugernaugh scored  5/ 8 = 62%


  6.       Brayden Labaguette scored  7/ 8 = 87%
"""

### Goal
Turn this into structured data: a list of 2-tuples, each student's full name and their integer score.

In [None]:
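# split each non-blank line into tokens (avoid naming a variable `list`: it hides the built-in)
rows = [s.split() for s in scores.split('\n') if s.strip()]
# the name is every token between the line number and 'scored'; the score is the token ending in '/'
goal = [(' '.join(item[1:-5]), int(item[-4].rstrip('/'))) for item in rows]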
goal

[('Ali Oop', 7),
 ('Alison Ralison', 8),
 ('Ambily Piturbed', 8),
 ('Arshan Risnot Farquared', 5),
 ('Ayushma Jugernaugh', 5),
 ('Brayden Labaguette', 7)]
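
The slicing in that comprehension is easier to follow on a single line of input. A step-by-step sketch for the fourth student (the names `line`, `name`, `score`, and `record` are introduced here only for illustration):

```python
line = '  4.  Arshan Risnot Farquared scored  5/ 8 = 62%'

item = line.split()
# item is now:
# ['4.', 'Arshan', 'Risnot', 'Farquared', 'scored', '5/', '8', '=', '62%']

# The last five tokens are always 'scored', 'N/', 'M', '=', 'P%',
# so everything between the line number and those five is the name.
name = ' '.join(item[1:-5])

# The fourth token from the end is the score with a trailing slash.
score = int(item[-4].rstrip('/'))

record = (name, score)
print(record)
```

Counting from the end of the list is what lets the same slice work for names with two words or three.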

## Records
A *record* is a compound data value - a collection of simpler data values (fields) that all describe a single entity.
Common ways to represent a record in Python:

 * tuple
 * dictionary
 * object

Problem: develop the data representation for a `student` in a student record system,
where a `student` has a first and last name, a student id, and a date of birth.

In [None]:
# Tuple: fields are identified by position
("Bob", "Jones", "444444", "1910-01-31")
# Dictionary: fields are identified by key
{
    "FNAME": "Bob",
    "LNAME": "Jones",
    "ID": "444444",
    "BIRTH": "1910-01-31",
}
# Object
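
The object version is left blank in the cell above. One possible sketch uses a `dataclass` from the standard library; the class name `Student` and its field names are assumptions made for this example, not part of the original lesson.

```python
from dataclasses import dataclass
import datetime

@dataclass
class Student:
    """A student record; class and field names are hypothetical."""
    first_name: str
    last_name: str
    student_id: str
    birth_date: datetime.date

bob = Student("Bob", "Jones", "444444", datetime.date(1910, 1, 31))

# fields are identified by attribute name
print(bob.last_name, bob.birth_date.year)
```

Compared with the tuple and dictionary, the object names its fields once in the class definition, and storing the birth date as a `datetime.date` rather than a string makes date arithmetic possible later.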

In [8]:
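goal = "(a + d[i] / (a-b) + f{'data'})"   # example of a balanced expression
s = ")a- (3/6)*(4+7)("                     # unbalanced: starts with a closing bracket

obrackets = ("(", "[", "{")
cbrackets = (")", "]", "}")

def matches(opener, closer):
  """Return True if opener and closer form a bracket pair."""
  pairs = (("(", ")"), ("[", "]"), ("{", "}"))
  return (opener, closer) in pairs

stack = []
match = True
for char in s:
  if char in obrackets:
    stack.append(char)
  elif char in cbrackets:
    # a closer with nothing open, or with the wrong kind of opener, is a mismatch
    if not stack or not matches(stack.pop(), char):
      match = False
      break

# balanced only if every closer matched and no opener is left waiting
match = match and len(stack) == 0
match

# Simpler alternative that handles round brackets only:
# open = 0
# for char in s:
#   if char == "(":
#     open += 1
#   if char == ")":
#     open -= 1
#   if open < 0:
#     break
# match = open == 0
# match

False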