<a href="https://colab.research.google.com/github/powderflask/cap-comp215/blob/main/examples/week2-solution.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sequence and Map data structures - Strings, Tuples, Lists, Dictionaries
This is our week 2 examples notebook. It will be available on GitHub in the powderflask/cap-comp215 repository.

As usual, the first code block just imports the modules we will use.

In [28]:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pprint import pprint

## f-strings
A `string` is a sequence of characters / symbols.
This familiar data structure is quite powerful, and format strings (f-strings) take it to the next level by letting you embed expressions directly inside a string literal.

In [29]:
today = datetime.date.today()
the_answer = 42
PI = 3.1415926535

f'{today} is not special, but {the_answer} and {PI} are!'
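
As a small sketch of what the `f` prefix buys you: without it the braces are kept as literal text, and with it each expression is evaluated and substituted. A format specifier after a colon controls how the value is rendered. The fixed date `some_day` is an assumption made here so that the result is reproducible.

```python
import datetime

# A fixed date (an assumption for this sketch) so the result is reproducible
some_day = datetime.date(2023, 1, 16)
the_answer = 42
PI = 3.1415926535

# Without the f prefix, the braces are just ordinary characters
literal = '{the_answer} is the answer'

# With the f prefix, Python evaluates the expressions inside the braces
plain = f'{some_day} is not special, but {the_answer} and {PI} are!'

# Format specifiers after the colon control date layout, width, and precision
formatted = f'{some_day:%d/%m/%Y} | {the_answer:>5} | {PI:.2f}'

print(literal)     # {the_answer} is the answer
print(plain)       # 2023-01-16 is not special, but 42 and 3.1415926535 are!
print(formatted)   # 16/01/2023 |    42 | 3.14
```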

## List Comprehension
A list comprehension provides a compact syntax for two very common sequence-processing algorithms: Map and Filter.

Basic syntax:
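
A minimal sketch of the general shape, `[expression for item in iterable if condition]`, where the `if` clause is optional. The variable names are illustrative only.

```python
numbers = range(10)

# Map: apply an expression to every item
doubled = [2 * n for n in numbers]

# Filter: keep only the items that satisfy a condition
evens = [n for n in numbers if n % 2 == 0]

# Map and filter together: double only the even numbers
doubled_evens = [2 * n for n in numbers if n % 2 == 0]

print(doubled)        # [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
print(evens)          # [0, 2, 4, 6, 8]
print(doubled_evens)  # [0, 4, 8, 12, 16]
```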

In [30]:
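from collections import defaultdict

def calculate_it(val):
  # Double even values, triple odd multiples of 3, quadruple everything else
  return 2*val if val%2==0 else 3*val if val%3==0 else val*4

# [calculate_it(i) for i in range(0, 100)]

# def counter_factory():
#   return 0

# Missing keys start at 0, so each result can be tallied without checking first
counts = defaultdict(lambda : 0)
for i in range(0, 100):
  # Count how many inputs produce each result
  counts[calculate_it(i)] += 1

counts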

defaultdict(<function __main__.<lambda>()>,
            {0: 1,
             4: 2,
             9: 1,
             8: 1,
             20: 2,
             12: 1,
             28: 2,
             16: 1,
             27: 1,
             44: 2,
             24: 1,
             52: 2,
             45: 1,
             32: 1,
             68: 2,
             36: 1,
             76: 2,
             40: 1,
             63: 1,
             92: 2,
             48: 1,
             100: 2,
             81: 1,
             56: 1,
             116: 2,
             60: 1,
             124: 2,
             64: 1,
             99: 1,
             140: 2,
             72: 1,
             148: 2,
             117: 1,
             80: 1,
             164: 2,
             84: 1,
             172: 2,
             88: 1,
             135: 1,
             188: 2,
             96: 1,
             196: 2,
             153: 1,
             104: 1,
             212: 1,
             108: 1,
             220: 1,
             ...})
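
The same tally could also be sketched with `collections.Counter`, which counts the items of any iterable. This is an alternative to the `defaultdict` loop, not the approach the cell uses; `calculate_it` is repeated here only so the block runs on its own.

```python
from collections import Counter

def calculate_it(val):
    # Same rule as the cell above: double evens, triple odd multiples of 3, quadruple the rest
    return 2*val if val % 2 == 0 else 3*val if val % 3 == 0 else val*4

# Counter tallies how many inputs map to each result
counts = Counter(calculate_it(i) for i in range(0, 100))

print(counts[4])             # 2, because both 1 and 2 map to 4
print(counts[9])             # 1, because only 3 maps to 9
print(sum(counts.values()))  # 100, one tally per input
```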

### Map Algorithm
Apply the same function to every item in another sequence (i.e., provide a "mapping" from the source sequence to the target sequence).

In [31]:
# Problem:  compute the first 10 natural squares
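
One possible solution, as a sketch. It assumes the natural numbers start at 1; use `range(10)` instead if you count 0 as a natural number.

```python
# Map: square each of the first 10 natural numbers
squares = [n**2 for n in range(1, 11)]
print(squares)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```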

### Filter
Select a subset of the elements from another sequence based on some criterion.

In [32]:
VOWELS = 'aeiou'
text = '''
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
'''
# Problem:  create a string with just the vowels from the text, in order.
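
One possible solution, as a sketch. The text is shortened here so the block runs on its own, and treating uppercase vowels as vowels is an assumption about what the problem intends.

```python
VOWELS = 'aeiou'
text = 'Lorem ipsum dolor sit amet'  # shortened sample so the block runs on its own

# Filter: keep only the characters that are vowels, then join them back into a string
vowels_only = ''.join([c for c in text if c.lower() in VOWELS])
print(vowels_only)  # oeiuooiae
```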


## Data Wrangling with List Comprehension
E-learn's Live Quiz module does track quiz scores for each student, but does not store them in the gradebook,
and it reports on them in the most useless way.

Let's do some "data wrangling" to make sense out of this mess!

### The Problem: Unstructured Data!
Notice it is just a single large string!  The real data set has 36 students, and I need to do this every week!

In [33]:
"""
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  3.         Ambily Piturbed scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%


  5.       Ayushma Jugernaugh scored  5/ 8 = 62%


  6.       Brayden Labaguette scored  7/ 8 = 87%
"""

'\n  1.                 Ali Oop scored  7/ 8 = 87%\n\n\n  2.          Alison Ralison scored  8/ 8 = 100%\n\n\n  3.         Ambily Piturbed scored  8/ 8 = 100%\n\n\n  4.  Arshan Risnot Farquared scored  5/ 8 = 62%\n\n\n  5.       Ayushma Jugernaugh scored  5/ 8 = 62%\n\n\n  6.       Brayden Labaguette scored  7/ 8 = 87%\n'

### Goal
Turn this into structured data: a list of 2-tuples, each student's full name and their integer score.
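
A sketch of one way to get there with a filter (drop the blank lines) and a map (parse each remaining line). It assumes that "integer score" means the points earned, e.g. 7 in `7/ 8`, and that every non-blank line follows the `rank. name scored points/ total = percent%` layout. The names `quiz_report` and `parse_line` are hypothetical.

```python
# A shortened copy of the report, assigned to a (hypothetical) variable name
quiz_report = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%
"""

def parse_line(line):
    """Turn one report line into a (full name, score) 2-tuple."""
    # Drop the leading "1." rank, then split the name from the result
    _, rest = line.split('.', 1)
    name, result = rest.split(' scored ')
    # result looks like " 7/ 8 = 87%"; the score is the number before the slash
    score = int(result.split('/')[0])
    return name.strip(), score

# Filter out the blank lines, then map each remaining line to a record
scores = [parse_line(line) for line in quiz_report.split('\n') if line.strip()]
print(scores)
# [('Ali Oop', 7), ('Alison Ralison', 8), ('Arshan Risnot Farquared', 5)]
```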

## Records
A *record* is a compound data value: a collection of simpler data values (fields) that all describe a single entity. Python offers several ways to represent one:

 * tuple
 * dictionary
 * object

Problem: develop a data representation for a `student` in a student record system,
where a `student` has a first and last name, a student ID, and a date of birth.

In [53]:
# Tuple
tuple_students = [
  ('Bob', '', 'Squarepants', 123456789, datetime.date(year=1994, month=2, day=5)),
  ('Dora', 'The', 'Explora', 192837465, datetime.date(year=2000, month=8, day=14))
]
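s = tuple_students[-1]
# Fields are accessed by position: index 4 is the date of birth
age = datetime.date.today() - s[4]
age.days // 365  # approximate age in years (ignores leap days)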

# Dictionary
dict_students = [
    {
      'first': 'Bob',
      'last': 'Squarepants',
      'sn': 123456789,
      'dob': datetime.date(year=1994, month=2, day=5),
    },
    {
      'first': 'Dora',
      'middle': 'The',
      'last': 'Explora',
      'sn': 192837465,
      'dob': datetime.date(year=2000, month=8, day=14),
    },
]
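s = dict_students[-1]
s['dob']  # fields are accessed by name

# Convert the tuple records to dictionaries (the middle name is dropped)
students = [
    {'first': s[0], 'last': s[2], 'sn': s[3], 'dob': s[4]} for s in tuple_students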
]
students


[{'first': 'Bob',
  'last': 'Squarepants',
  'sn': 123456789,
  'dob': datetime.date(1994, 2, 5)},
 {'first': 'Dora',
  'last': 'Explora',
  'sn': 192837465,
  'dob': datetime.date(2000, 8, 14)}]

In [67]:
# Objects
from dataclasses import dataclass

@dataclass
class Student:
  first: str
  middle: str
  last: str
  sn: int
  dob: datetime.date

  def full_name(self):
    return f'{self.first} {self.last}'

students = [
    Student('Bob', '', 'Squarepants', 123456789, datetime.date(year=1994, month=2, day=5)),
    Student('Dora', 'The', 'Explora', 192837465, datetime.date(year=2000, month=8, day=14))
]
dora = [s for s in students if s.first == 'Dora'][0]
dora.full_name()

'Dora Explora'
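
A sketch of two possible refinements, assuming that a middle name is optional and that age should be counted in whole years. A default field value lets `middle` be omitted, which requires moving it after the fields without defaults. The `age` method is hypothetical and not part of the class above; it compares month and day instead of dividing a day count by 365.

```python
import datetime
from dataclasses import dataclass

@dataclass
class Student:
    first: str
    last: str
    sn: int
    dob: datetime.date
    middle: str = ''   # fields with defaults must come after fields without

    def full_name(self):
        # Join only the name parts that are not empty
        return ' '.join(part for part in (self.first, self.middle, self.last) if part)

    def age(self, on_date):
        # Subtract one year if the birthday has not yet happened in on_date's year
        had_birthday = (on_date.month, on_date.day) >= (self.dob.month, self.dob.day)
        return on_date.year - self.dob.year - (0 if had_birthday else 1)

dora = Student('Dora', 'Explora', 192837465, datetime.date(2000, 8, 14), middle='The')
bob = Student('Bob', 'Squarepants', 123456789, datetime.date(1994, 2, 5))

print(dora.full_name())                      # Dora The Explora
print(bob.full_name())                       # Bob Squarepants
print(dora.age(datetime.date(2023, 1, 16)))  # 22
```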

In [62]:
import math

math.sin(123)

-0.45990349068959124