# Sequence and Map data structures - Strings, Tuples, Lists, Dictionaries
Our week 2 lesson workbook, available on GitHub from the powderflask/cap-comp215 repository.

As usual, the first code block just imports the modules we will use.

In [None]:
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from pprint import pprint

## f-strings
A `string` is a sequence of characters / symbols.
This familiar data structure is quite powerful, and format strings (f-strings) take it to the next level.

In [None]:
today = datetime.date.today()
the_answer = 42
PI = 3.1415926535

'{today} is not special, but {the_answer} and {PI} are!'

'{today} is not special, but {the_answer} and {PI} are!'
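Without the `f` prefix, Python treats the braces as literal text, which is why the output above echoes the placeholders unchanged. A minimal sketch of the f-string version follows. The fixed date and the `:.2f` format specifier are illustrative assumptions, chosen so the result is reproducible.

```python
import datetime

# Fixed date (an assumption for this sketch) so the output does not change from day to day.
today = datetime.date(2024, 1, 15)
the_answer = 42
PI = 3.1415926535

# The f prefix makes Python evaluate each {expression} and insert its value.
# A format specifier after the colon controls presentation: .2f means 2 decimal places.
message = f'{today} is not special, but {the_answer} and {PI:.2f} are!'
print(message)  # 2024-01-15 is not special, but 42 and 3.14 are!
```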

## List Comprehension
Provides a compact syntax for two very common sequence-processing algorithms: Map and Filter.

Basic syntax:

In [1]:
# [f(x) for x in C if g(x)]
[i**2 for i in range(10) if i%2==0]

[0, 4, 16, 36, 64]
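To see what the comprehension is shorthand for, it may help to write out the equivalent loop. The sketch below builds the same list both ways; the variable names are illustrative only.

```python
# Long form: loop over the source, test each item (filter), transform it (map), collect it.
squares_of_evens = []
for i in range(10):
    if i % 2 == 0:                      # g(x): keep only even numbers
        squares_of_evens.append(i**2)   # f(x): square the number

# Comprehension form: the same three parts on one line.
comprehension = [i**2 for i in range(10) if i % 2 == 0]

print(squares_of_evens == comprehension)  # True
```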

### Map Algorithm
Apply the same function to every item in another sequence (i.e., provide a "mapping" from the source sequence to the target sequence).

In [None]:
# Problem:  compute the first 10 natural squares
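One possible solution, as a sketch: it assumes the natural numbers start at 1, so the "first 10" are 1 through 10. If you count from 0 instead, use `range(10)`.

```python
# Map: apply the squaring function to every item of the source sequence.
# Assumption: natural numbers start at 1, so the source is 1..10.
squares = [n**2 for n in range(1, 11)]
print(squares)  # [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
```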

### Filter
Select a sub-set of the elements from another sequence based on some criteria.

In [None]:
VOWELS = 'aeiou'
text = '''
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
'''
# Problem:  create a string with just the vowels from the text, in order.
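One way to approach this, sketched on a shorter sample string so the result is easy to check by hand. The name `sample` is hypothetical, and comparing `ch.lower()` is a choice made here so that capital vowels (like the `U` in `Ut`) are kept too.

```python
VOWELS = 'aeiou'
sample = 'Lorem ipsum dolor sit amet'  # hypothetical short stand-in for the full text

# Filter: keep only the characters that pass the vowel test, in their original order.
# str.join turns the filtered characters back into a single string.
vowels_only = ''.join(ch for ch in sample if ch.lower() in VOWELS)
print(vowels_only)  # oeiuooiae
```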


## Data Wrangling with List Comprehension
E-learn's Live Quiz module does track quiz scores for each student, but does not store them in the gradebook,
and it reports on them in the most useless way.

Let's do some "data wrangling" to make sense out of this mess!

### The Problem: Unstructured Data!
Notice it is just a single large string!  The real data set has 36 students, and I need to do this every week!

In [59]:
scores = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.          Alison Ralison scored  8/ 8 = 100%


  3.         Ambily Piturbed scored  8/ 8 = 100%


  4.  Arshan Risnot Farquared scored  5/ 8 = 62%


  5.       Ayushma Jugernaugh scored  5/ 8 = 62%


  6.       Brayden Labaguette scored  7/ 8 = 87%
"""

### Goal
Turn this into structured data: a list of 2-tuples, each student's full name and their integer score.

In [61]:
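# Split the text into lines, then split each non-blank line into whitespace-separated tokens.
# The results get new names so the original `scores` string is not overwritten:
# reassigning `scores` to a list makes this cell fail with
# "AttributeError: 'list' object has no attribute 'split'" the next time it is run.
rows = [line.split() for line in scores.split('\n') if line.strip()]
names = [' '.join(row[1:-5]) for row in rows]        # tokens between the rank and 'scored'
points = [int(row[-4].rstrip('/')) for row in rows]  # e.g. '7/' -> 7

# import re
# matches = re.findall(r'([\w+\ ]+) scored  (\d)', scores)
# matches

list(zip(names, points))

[('Ali Oop', 7),
 ('Alison Ralison', 8),
 ('Ambily Piturbed', 8),
 ('Arshan Risnot Farquared', 5),
 ('Ayushma Jugernaugh', 5),
 ('Brayden Labaguette', 7)]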
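The commented-out `re.findall` idea can also be developed into a full solution. The sketch below is one possible pattern, not the only one; the two-line `sample` string and the names `pattern` and `records` are assumptions made for illustration.

```python
import re

# Hypothetical two-line sample in the same layout as the quiz report.
sample = """
  1.                 Ali Oop scored  7/ 8 = 87%

  2.          Alison Ralison scored  8/ 8 = 100%
"""

# Rank number and dot, then the name (captured lazily), the word 'scored',
# and the points earned (captured) just before the slash.
pattern = re.compile(r'^\s*\d+\.\s+(.+?)\s+scored\s+(\d+)\s*/', re.MULTILINE)

# findall returns one (name, points) tuple of strings per match; convert points to int.
records = [(name, int(points)) for name, points in pattern.findall(sample)]
print(records)  # [('Ali Oop', 7), ('Alison Ralison', 8)]
```

A regular expression describes the layout of a whole line at once, so it is less sensitive to how many words a name contains than counting tokens from each end of the line.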

## Records
A *record* is a compound data value - a collection of simpler data values (fields) that all describe a single entity.

 * tuple
 * dictionary
 * object

Problem: develop the data representation for a `student` in a student record system,
where a `student` has a first and last name, a student id, and a date of birth.

In [None]:
# Tuple

tup = ('Baguette', 'Croissant')

# Dictionary

# Named `dct` rather than `dict` so the built-in dict type is not shadowed.
dct = {'key1': 'Baguette', 'key2': 'Croissant'}

# Object

class Student:

    attr1 = 'Baguette'
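Applied to the `student` problem, the three representations might look like the sketch below. The field names, the sample values (including the student id and date of birth), and the `full_name` method are all made up for illustration.

```python
import datetime

# Tuple: compact and immutable, but fields are identified only by position.
student_tuple = ('Ali', 'Oop', 100123, datetime.date(2001, 5, 17))

# Dictionary: fields are identified by key, so the record is self-describing.
student_dict = {
    'first_name': 'Ali',
    'last_name': 'Oop',
    'student_id': 100123,
    'date_of_birth': datetime.date(2001, 5, 17),
}

# Object: fields become attributes, and behaviour can live with the data.
class Student:
    def __init__(self, first_name, last_name, student_id, date_of_birth):
        self.first_name = first_name
        self.last_name = last_name
        self.student_id = student_id
        self.date_of_birth = date_of_birth

    def full_name(self):
        """Return the first and last name joined by a space."""
        return f'{self.first_name} {self.last_name}'

student = Student('Ali', 'Oop', 100123, datetime.date(2001, 5, 17))
print(student_tuple[0], student_dict['first_name'], student.full_name())  # Ali Ali Ali Oop
```

The tuple relies on remembering that index 2 is the student id, while the dictionary and the object name each field explicitly. That trade-off is usually what decides between them.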