# Sequence and Map data structures - Strings, Tuples, Lists, Dictionaries
Our week 2 lesson workbook, available on GitHub from the powderflask/cap-comp215 repository.

As usual, the first code block just imports the modules we will use.

In [None]:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pprint import pprint

## f-strings
A `string` is a sequence of characters / symbols.
This familiar data structure is quite powerful, and format-strings (f-strings) take it to the next level:
each `{expression}` inside an f-string is evaluated and its value is inserted into the string.

In [None]:
today = datetime.date.today()
the_answer = 42
PI = 3.1415926535

f'{today} is not special, but {the_answer} and {PI} are!'

'2026-01-12 is not special, but 42 and 3.1415926535 are!'
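An f-string placeholder can also carry a format specifier after a colon. The sketch below reuses the variables from the cell above (with a fixed date so the results are reproducible); the names `rounded`, `padded`, `slashed_date` and `debug` are just illustrative.

```python
import datetime

today = datetime.date(2026, 1, 12)    # fixed date so the output is reproducible
the_answer = 42
PI = 3.1415926535

rounded = f'{PI:.2f}'                  # fixed-point with 2 decimal places
padded = f'{the_answer:>6}'            # right-aligned in a field 6 characters wide
slashed_date = f'{today:%Y/%m/%d}'     # dates accept strftime codes
debug = f'{the_answer=}'               # Python 3.8+: shows the expression and its value

print(rounded, padded, slashed_date, debug, sep=' | ')
```

The part after the colon is handed to the value's own formatting logic, which is why numbers and dates understand different codes.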

## List Comprehension
A list comprehension provides a compact syntax for two very common sequence-processing algorithms: Map and Filter.

Basic syntax:

In [None]:
[2*x for x in range(100)]

[0,
 2,
 4,
 6,
 8,
 10,
 12,
 14,
 16,
 18,
 20,
 22,
 24,
 26,
 28,
 30,
 32,
 34,
 36,
 38,
 40,
 42,
 44,
 46,
 48,
 50,
 52,
 54,
 56,
 58,
 60,
 62,
 64,
 66,
 68,
 70,
 72,
 74,
 76,
 78,
 80,
 82,
 84,
 86,
 88,
 90,
 92,
 94,
 96,
 98,
 100,
 102,
 104,
 106,
 108,
 110,
 112,
 114,
 116,
 118,
 120,
 122,
 124,
 126,
 128,
 130,
 132,
 134,
 136,
 138,
 140,
 142,
 144,
 146,
 148,
 150,
 152,
 154,
 156,
 158,
 160,
 162,
 164,
 166,
 168,
 170,
 172,
 174,
 176,
 178,
 180,
 182,
 184,
 186,
 188,
 190,
 192,
 194,
 196,
 198]
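The general shape is `[expression for item in iterable if condition]`, where the `if` clause is optional. A small sketch (the variable names are illustrative) showing the transforming and selecting parts used together:

```python
# Transform every item: the expression on the left is applied to each x
doubles = [2 * x for x in range(5)]

# Select some items: the trailing `if` keeps only the x values that pass the test
evens = [x for x in range(10) if x % 2 == 0]

# Both at once: keep the even numbers, then square each one that was kept
even_squares = [x**2 for x in range(10) if x % 2 == 0]

print(doubles, evens, even_squares)
```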

### Map Algorithm
Apply the same function to every item in another sequence (i.e., provide a "mapping" from the source sequence to the target sequence).

In [None]:
# Problem:  compute the first 10 natural squares

[y**2 for y in range(10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
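It can help to see the same Map written three ways. This is only a sketch for comparison; the names `squares_loop`, `squares_comp` and `squares_map` are made up for the example.

```python
# Longhand: an accumulator loop that appends one transformed item per iteration
squares_loop = []
for y in range(10):
    squares_loop.append(y**2)

# The list comprehension says the same thing in one line
squares_comp = [y**2 for y in range(10)]

# The built-in map() applies a function to each item; list() collects the results
squares_map = list(map(lambda y: y**2, range(10)))

print(squares_loop == squares_comp == squares_map)
```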

### Filter
Select a subset of the elements from another sequence based on some criteria.

In [None]:
VOWELS = 'aeiou'
text = '''
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
'''
# Problem:  collect just the vowels from the text, in order (as a list of characters).
# Tip: write the loop out longhand first to help figure out what the comprehension should be.

vowels = [letter for letter in text if letter in VOWELS]
vowels


['o',
 'e',
 'i',
 'u',
 'o',
 'o',
 'i',
 'a',
 'e',
 'o',
 'e',
 'e',
 'u',
 'a',
 'i',
 'i',
 'i',
 'e',
 'i',
 'e',
 'o',
 'e',
 'i',
 'u',
 'o',
 'e',
 'o',
 'i',
 'i',
 'i',
 'u',
 'u',
 'a',
 'o',
 'e',
 'e',
 'o',
 'o',
 'e',
 'a',
 'a',
 'a',
 'i',
 'u',
 'a',
 'e',
 'i',
 'a',
 'i',
 'i',
 'e',
 'i',
 'a',
 'u',
 'i',
 'o',
 'u',
 'e',
 'e',
 'i',
 'a',
 'i',
 'o',
 'u',
 'a',
 'o',
 'a',
 'o',
 'i',
 'i',
 'i',
 'u',
 'a',
 'i',
 'u',
 'i',
 'e',
 'e',
 'a',
 'o',
 'o',
 'o',
 'o',
 'e',
 'u',
 'a']
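Written out longhand, the Filter is a loop with an `if` inside it. The sketch below uses a shortened stand-in for the text (`sample` is not part of the original cell) and also shows `str.join`, which turns the list of characters back into a single string.

```python
VOWELS = 'aeiou'
sample = 'Lorem ipsum dolor sit amet'   # shortened stand-in for the full text

# Longhand: keep a letter only if it passes the test
vowels_loop = []
for letter in sample:
    if letter in VOWELS:
        vowels_loop.append(letter)

# Same thing as a comprehension
vowels_comp = [letter for letter in sample if letter in VOWELS]

# Join the list of characters into one string
vowel_string = ''.join(vowels_comp)
print(vowel_string)
```

Note that `VOWELS` is all lowercase, so an uppercase vowel such as the `U` in `Ut` is skipped; testing `letter.lower() in VOWELS` would include it.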

## Data Wrangling with List Comprehension
E-learn's Live Quiz module does track quiz scores for each student, but does not store them in the gradebook,
and it reports on them in the most useless way.

Let's do some "data wrangling" to make sense out of this mess!

### The Problem: Unstructured Data!
Notice it is just a single large string!  The real data set has 36 students, and I need to do this every week!

In [None]:
"""
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  3.         Ambily Piturbed scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%


  5.       Ayushma Jugernaugh scored  5/ 8 = 62%


  6.       Brayden Labaguette scored  7/ 8 = 87%
"""

'\n  1.                 Ali Oop scored  7/ 8 = 87%\n\n\n  2.          Alison Ralison scored  8/ 8 = 100%\n\n\n  3.         Ambily Piturbed scored  8/ 8 = 100%\n\n\n  4.  Arshan Risnot Farquared scored  5/ 8 = 62%\n\n\n  5.       Ayushma Jugernaugh scored  5/ 8 = 62%\n\n\n  6.       Brayden Labaguette scored  7/ 8 = 87%\n'

### Goal
Turn this into structured data: a list of 2-tuples, each student's full name and their integer score.

In [None]:
[('Brayden Labaguette', 7)]

In [None]:
elearn_string = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  3.         Ambily Piturbed scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%


  5.       Ayushma Jugernaugh scored  5/ 8 = 62%


  6.       Brayden Labaguette scored  7/ 8 = 87%
"""

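# Step by step:
#students = [student for student in elearn_string.split("\n") if student]
#records = [student.split() for student in students]

# Drop the blank lines, then split each remaining line into whitespace-separated tokens,
# e.g. ['1.', 'Ali', 'Oop', 'scored', '7/', '8', '=', '87%']
compact_students = [student.split() for student in elearn_string.split("\n") if student]
# The name is every token between the rank and 'scored'; the raw score is the '7/' token
records = [(rec[1:-5], rec[-4]) for rec in compact_students]
# Join the name tokens, strip the trailing '/' from the score, and convert it to an int
records = [(" ".join(name), int(score[:-1])) for name, score in records]
records

[('Ali Oop', 7),
 ('Alison Ralison', 8),
 ('Ambily Piturbed', 8),
 ('Arshan Risnot Farquared', 5),
 ('Ayushma Jugernaugh', 5),
 ('Brayden Labaguette', 7)]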
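Since this has to be done every week, the wrangling could be wrapped in a function. The sketch below is one possible way to do it, not a definitive solution: `parse_quiz_scores` is a hypothetical helper name, and it assumes every non-blank line has exactly the shape `N. <name> scored X/ Y = Z%` seen in the sample.

```python
def parse_quiz_scores(report):
    """Return a list of (full_name, score) tuples parsed from a quiz report string."""
    # Filter out blank lines, then split each line into whitespace-separated tokens
    tokens_per_line = [line.split() for line in report.split('\n') if line.strip()]
    # Name: tokens between the rank and 'scored'.  Score: the 'X/' token, minus the slash.
    return [(' '.join(tokens[1:-5]), int(tokens[-4].rstrip('/')))
            for tokens in tokens_per_line]


sample_report = """
  1.                 Ali Oop scored  7/ 8 = 87%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%
"""
scores = parse_quiz_scores(sample_report)
print(scores)
```

Slicing from the end of the token list (`-5`, `-4`) is what lets two-word and three-word names both work, because the tail of every line has the same number of tokens.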

## Records
A *record* is a compound data value - a collection of simpler data values (fields) that all describe a single entity.

 * tuple
 * dictionary
 * object

Problem: develop the data representation for a `student` in a student record system,
where a `student` has a first and last name, student id, and date of birth

In [10]:
# Tuple
student = ('Monica', 'Illner', 1001, '05-13-1997') # Tuples are compact (less memory), immutable (can't be changed)

# Dictionary
# Two equivalent ways to build a dictionary; the literal with quoted keys is the 115 way
#student = {'name': 'Monica', 'last': 'Illner', 'id': 1001, 'birth': '05-13-1997'}

student = dict(
    name = 'Monica',
    last = 'Illner',
    id = 1001,
    birth = '05-13-1997'
)

# Object   -- Best for lots of students
from types import SimpleNamespace
student = SimpleNamespace(
    name = 'Monica',
    last = 'Illner',
    id = 1001,
    birth = '05-13-1997'
)
#student.name

from typing import NamedTuple  # gives each item in a tuple a name
StudentRecord = NamedTuple('StudentRecord', [('name', str), ('last', str), ('id', int), ('birth', str)])
monica = StudentRecord(
    name = 'Monica',
    last = 'Illner',
    id = 1001,
    birth = '05-13-1997'
)
monica.id

class Student:
    def __init__(self, name, last, id, birth):
        self.name = name
        self.last = last
        self.id = id
        self.birth = birth

monica = Student(
    name = 'Monica',
    last = 'Illner',
    id = 1001,
    birth = '05-13-1997'
)
monica.name

'Monica'
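The representations differ mainly in how a field is read and whether it can be changed afterwards. A short sketch of that contrast, using the class-style `NamedTuple` syntax (equivalent to the functional form above); the variable names are illustrative.

```python
from typing import NamedTuple


class StudentRecord(NamedTuple):
    name: str
    last: str
    id: int
    birth: str


as_tuple = ('Monica', 'Illner', 1001, '05-13-1997')
as_dict = dict(name='Monica', last='Illner', id=1001, birth='05-13-1997')
as_named = StudentRecord(*as_tuple)

# Reading the same field: by position, by key, by attribute
print(as_tuple[2], as_dict['id'], as_named.id)

# A dictionary is mutable: fields can be changed in place
as_dict['last'] = 'Smith'

# A (named) tuple is immutable: assigning to a field raises AttributeError
try:
    as_named.last = 'Smith'
    mutated = True
except AttributeError:
    mutated = False

# To "change" a named tuple, build a modified copy instead
updated = as_named._replace(last='Smith')
print(mutated, as_named.last, updated.last)
```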