<a href="https://colab.research.google.com/github/khwaishrana/comp215/blob/main/lessons/week02-data-structures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Sequence and Map data structures - Strings, Tuples, Lists, Dictionaries
Our week 2 lesson workbook, available on GitHub from the Hamilton-at-CapU/comp215 repository.

As usual, the first code block just imports the modules we will use.

In [None]:
import datetime
from pprint import pprint

## f-strings
A `string` is a sequence of characters (symbols).
This familiar data structure is quite powerful, and formatted string literals (f-strings) take it to the next level: any expression written inside `{}` is replaced by its value when the string is built.

In [2]:
today = datetime.date.today()
the_answer = 42
PI = 3.1415926535

# the f prefix on the next line makes it an f-string: the variable
# names today, the_answer and PI in braces are replaced with their values
my_string = f'{today} is not special, but {the_answer} and {PI} are!'
print(my_string)

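An f-string can also control how each value is displayed by adding a format spec after a colon. The sketch below is illustrative only: the fixed date and the variable names `pi_2dp`, `padded`, `short_date` and `debug` are assumptions made for the example, not part of the lesson.

```python
import datetime

today = datetime.date(2024, 1, 15)  # made-up fixed date so the result is repeatable
the_answer = 42
PI = 3.1415926535

# A format spec follows a colon inside the braces.
pi_2dp = f'{PI:.2f}'               # fixed-point with 2 decimal places
padded = f'{the_answer:>6}'        # right-aligned in a field 6 characters wide
short_date = f'{today:%d/%m/%Y}'   # dates accept strftime-style codes
debug = f'{the_answer=}'           # Python 3.8+: shows the name and the value

print(pi_2dp, padded, short_date, debug, sep=' | ')
```

Note that the import is repeated here so the block runs on its own; in the notebook, the first code cell must be run before any cell that uses `datetime`.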

## List Comprehension
A list comprehension provides a compact syntax for two very common sequence-processing algorithms: map and filter.

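Basic syntax: ```[f(x) for x in C if g(x)]``` where `C` is the source sequence, `f(x)` is the expression applied to each item (the map), and the optional `if g(x)` keeps only the items for which `g(x)` is true (the filter).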
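To make the pattern concrete, here is a small sketch that builds the same list twice, once with a loop and once with a comprehension. The `words` list and the result names are made up for this illustration.

```python
words = ['map', 'filter', 'list', 'comprehension']

# Loop form: keep words longer than 3 letters (filter), then upper-case them (map)
result_loop = []
for w in words:
    if len(w) > 3:
        result_loop.append(w.upper())

# Comprehension form: [f(x) for x in C if g(x)]
#   f(x) is w.upper(), C is words, g(x) is len(w) > 3
result_comp = [w.upper() for w in words if len(w) > 3]

print(result_loop == result_comp)
```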


### Map Algorithm
A *map* applies the same function to every item in a sequence and collects the results in a new sequence (i.e., it provides a "mapping" from the source sequence to the target).

In [None]:
# a basic map
data = [1, 2, 3, 4, 5, 6]
squares = []
for n in data:
    squares.append(n**2)
print(f'the squares of {data} are {squares}')

# Problem: write the above map loop as a list comprehension


the squares of [1, 2, 3, 4, 5, 6] are [1, 4, 9, 16, 25, 36]
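One possible way to write the map loop as a list comprehension is sketched here; try the problem yourself before reading it.

```python
data = [1, 2, 3, 4, 5, 6]

# the expression n**2 is evaluated once for every n in data
squares = [n**2 for n in data]
print(f'the squares of {data} are {squares}')
```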


### Filter
A *filter* selects a subset of the elements from another sequence based on some criterion.

In [None]:
# a filter

VOWELS = 'aeiou'
text = '''
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.
'''

# Problem: use a list comprehension to create a list with just the vowels from the text, in the order they appear in the text.
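A sketch of one possible filter follows. It uses a shortened version of the text to keep the example small, and it assumes that upper-case vowels such as the `U` in "Ut" should count too, which is why each character is lower-cased before the test. Drop `.lower()` if only lower-case vowels are wanted.

```python
VOWELS = 'aeiou'
# shortened sample text, for illustration only
text = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Ut enim ad minim veniam.'

# keep a character only if it is a vowel; the order of the text is preserved
vowels = [ch for ch in text if ch.lower() in VOWELS]
print(vowels[:5])
```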


### Functions

List comprehensions can also call user-defined functions, both in the map expression and in the filter condition.

In [None]:
# list comprehension with functions
data = [1, 2, 3, 4, 5, 6]
squares = []

def square(x):
    """Return the square of x."""
    return x * x

def even(x):
    """Return True iff x is even."""
    return x % 2 == 0

# a loop that makes a list of the squares of even numbers in data
for n in data:
    if even(n):
        squares.append(square(n))
print(f'the squares of even numbers in {data} are {squares}')


# Problem: write the above map-with-filter loop as a list comprehension using the square() and even() functions
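One possible comprehension that combines the map and the filter is sketched here. The two functions are repeated so the block runs on its own.

```python
def square(x):
    """Return the square of x."""
    return x * x

def even(x):
    """Return True iff x is even."""
    return x % 2 == 0

data = [1, 2, 3, 4, 5, 6]

# even(n) filters the items; square(n) maps the ones that pass
squares = [square(n) for n in data if even(n)]
print(f'the squares of even numbers in {data} are {squares}')
```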


## Data Wrangling with List Comprehension
E-learn's Live Quiz module tracks quiz scores for each student but does not store them in the gradebook,
and the report it produces is awkward to work with.  Let's do some "data wrangling" to make sense of this mess!



In [None]:
data = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.         Amolak Singh . scored  8/ 8 = 100%


  3.  Arshan Risnot Farquared scored  5/ 8 = 62%


  4.       Ayushma Jugernaugh scored  5/ 8 = 62%


  5.       Brayden Labaguette scored  7/ 8 = 87%
"""


Notice it is just a single large string (i.e., unstructured).  The real data set has 36 students and needs to be parsed every week, so some structure would be helpful...

Turn this into *structured* data: a list of 2-tuples, each student's full name and their integer score.

In [None]:
# Problem: turn the above unstructured data into structured data (a list of 2-tuples with full name and score), using a list comprehension in each of the following steps

# 1. make a list `lines`, where each item is the list of words in one non-blank line of the data
#    example: [['1.', 'Ali', 'Oop', 'scored', '7/', '8', '=', '87%'], ... ]


# 2. make a list of scores for the student in each line
#    example: [7, 8, 5, 5, 7]


# 3. make a list of names for the student in each line
#    example: ['Ali Oop', 'Amolak Singh .', 'Arshan Risnot Farquared', 'Ayushma Jugernaugh', 'Brayden Labaguette']


# 4. zip the names and scores lists (names first, to match the example)
#    example: [('Ali Oop', 7), ('Amolak Singh .', 8), ('Arshan Risnot Farquared', 5), ('Ayushma Jugernaugh', 5), ('Brayden Labaguette', 7)]
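The four steps could be sketched as follows. This is one possible approach, not the only one. It assumes every non-blank line has the layout `N. <name> scored S/ T = P%`, so that counting words from the end of the line always finds the score; the name `records` is made up for the example.

```python
data = """
  1.                 Ali Oop scored  7/ 8 = 87%


  2.         Amolak Singh . scored  8/ 8 = 100%


  3.  Arshan Risnot Farquared scored  5/ 8 = 62%


  4.       Ayushma Jugernaugh scored  5/ 8 = 62%


  5.       Brayden Labaguette scored  7/ 8 = 87%
"""

# 1. split each non-blank line into words
lines = [line.split() for line in data.splitlines() if line.strip()]

# 2. the score is the 4th word from the end, e.g. '7/'; drop the slash
scores = [int(words[-4].rstrip('/')) for words in lines]

# 3. the name is everything between the line number and the word 'scored'
names = [' '.join(words[1:-5]) for words in lines]

# 4. pair each name with its score
records = list(zip(names, scores))
print(records)
```

Indexing from the end of each line is a design choice: names vary in length, but the five words after the name always follow the same pattern in this sample.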


## Records
A *record* is a compound data value - a collection of simpler data values (fields) that all describe a single entity.  Each (name, score) tuple above is a record whose fields are identified only by position.  Dictionaries can create records that are easier to understand, because each value is looked up by a meaningful key.


In [None]:
# Problem: develop a dictionary data representation for the above data, where the name is the key and the grade is the value
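A sketch of one possible dictionary representation, starting from the list of 2-tuples shown in the earlier example. It assumes student names are unique, since a repeated key would silently overwrite the earlier score. The name `grades` is made up for the example.

```python
records = [('Ali Oop', 7), ('Amolak Singh .', 8), ('Arshan Risnot Farquared', 5),
           ('Ayushma Jugernaugh', 5), ('Brayden Labaguette', 7)]

# dictionary comprehension: name is the key, score is the value
grades = {name: score for name, score in records}

print(grades['Ali Oop'])
pass_list = [name for name, score in grades.items() if score >= 7]
print(pass_list)
```

Because the tuples are already (key, value) pairs, `dict(records)` would build the same dictionary.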


In [None]:
# Challenge Problem: develop an object data representation for the above data (i.e., define your own class)
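One way to approach the challenge is with a dataclass, sketched here. The class name `QuizResult`, the `out_of` field and the `percent()` method are all hypothetical choices for this illustration. The default of 8 for `out_of` and the truncated percentage are assumptions taken from the sample data, where 7/8 is reported as 87%.

```python
from dataclasses import dataclass

@dataclass
class QuizResult:
    """One student's result on a quiz (hypothetical design)."""
    name: str
    score: int
    out_of: int = 8   # assumption: the sample quiz is marked out of 8

    def percent(self):
        """Return the score as a whole-number percentage, truncated."""
        return 100 * self.score // self.out_of

records = [('Ali Oop', 7), ('Amolak Singh .', 8), ('Arshan Risnot Farquared', 5)]

# build one object per (name, score) record
results = [QuizResult(name, score) for name, score in records]
for r in results:
    print(f'{r.name}: {r.score}/{r.out_of} = {r.percent()}%')
```

Named fields such as `r.name` and `r.score` make the code easier to read than positional tuple indexes.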
