# Python Language Intro (Part 3)

## Agenda

1. Language overview
2. White space sensitivity
3. Basic Types and Operations
4. Statements & Control Structures
5. Functions
6. OOP (Classes, Methods, etc.)
7. Immutable Sequence Types (Strings, Ranges, Tuples)
    - TODAY
8. Mutable data structures: Lists, Sets, Dictionaries
    - TODAY

## 7. Immutable Sequence Types: Strings, Ranges, Tuples

Recall: All immutable sequences support the [common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations). For many sequence types, there are constructors that allow us to create them from other sequence types.

- Their values cannot be changed once they are created
- They support the common sequence operations
- Each is a self-standing data structure

### Strings

In [1]:
s = 'hello'

In [2]:
type(s)

str

In [4]:
# operations performed on strings
# note: there is no separate single-character type; a single character is just a string of length 1
#           (characters are Unicode code points, not only ASCII)
(
    s[0],     # indexing: returns a string
    s[1:3],   # slicing: returns a string, exclusive of index 3
    'e' in s, # membership: returns a boolean
    s + s,    # concatenation: computes a brand-new string; s itself is unchanged
)

('h', 'el', True, 'hellohello')

In [5]:
# trying to assign through an index fails, because strings are immutable
s[0] = 'j'

TypeError: 'str' object does not support item assignment

In [6]:
t = s
# `+=` computes s + s as a new string, then rebinds the name s to it
s += s # not mutating the string! t still refers to the original

In [8]:

t, s

('hello', 'hellohello')
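
The rebinding behaviour above can be checked with `is`. This is a minimal sketch; the names `same_before` and `same_after` are illustrative only and not part of the lecture.

```python
s = 'hello'
t = s                  # t and s name the same string object
same_before = t is s   # True

s += s                 # builds a new string and rebinds s to it
same_after = t is s    # False: t still names the original 'hello'

print(same_before, same_after, t, s)
```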

### Ranges
- A range is not a list of all the elements stored in memory!

In [9]:
# represents a HOMOGENEOUS sequence: every element is an integer
# 150 down toward 10 (exclusive) in steps of 8
r = range(150, 10, -8)

In [10]:
type(r)

range

In [11]:
# using sequence operations on a range
# a range represents a sequence of numbers without holding them all: it is lazy, and each value is computed on demand
#       range(10000000000000000): these numbers aren't all being held in memory
#       note: in a long numeric literal you can put underscores between digits (e.g. 10_000_000); Python ignores them, so a misplaced one goes unnoticed
(
    r[2],
    r[3:7], # returns a range
    94 in r
)
#r[2] = 9 # won't work, range is immutable

(134, range(126, 94, -8), True)
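
A small sketch of what "lazy" means in practice. The sizes are chosen arbitrarily for illustration, and the `sys.getsizeof` comparison reflects CPython behaviour, which is an assumption about the interpreter in use.

```python
import sys

big = range(1_000_000_000)     # created instantly; no elements are stored
small_list = list(range(1000)) # a list really does hold every element

print(len(big))                # length is computed from start/stop/step
print(big[100_000_000])        # indexing is arithmetic, not a memory lookup
print(999_999_999 in big)      # integer membership is computed as well

# In CPython the range object is smaller than even a 1000-element list.
print(sys.getsizeof(big) < sys.getsizeof(small_list))
```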

### Tuples

In [12]:
()

()

In [14]:
# the empty tuple exists
type(_) # note: '_' is the last result in a notebook, used here because we didn't assign to a variable

tuple

In [15]:
(1, 2, 3)

(1, 2, 3)

In [16]:
1, 2, 3 # tuples don't need parens; the commas make the tuple
# consider:
# a, b, c = 1, 2, 3  # the tuple on the right is unpacked into the names on the left, which is why this syntax works

(1, 2, 3)
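
Because the commas build the tuple, packing and unpacking can be combined freely. A short sketch with illustrative names (`point`, `first`, `rest` are not from the lecture):

```python
point = 3, 4             # packing: the commas build the tuple (3, 4)
x, y = point             # unpacking: one name per element
x, y = y, x              # swap: the right side is packed before assignment
first, *rest = 1, 2, 3   # a starred name collects the leftovers into a list

print(point, x, y, first, rest)
```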

In [17]:
(1) # not a tuple, IT'S A PARENTHESIZED EXPRESSION!

1

In [18]:
type(_)

int

In [19]:
(1,) #COMMAS MAKE THE TUPLE

(1,)

In [None]:
type(_)

In [None]:
1,

In [None]:
('a', 10, False, 'hello') # tuples are heterogeneous

In [21]:
# range(10) represents the numbers 0-9; tuple() gives us the actual sequence of numbers
# you are reifying the range: the tuple holds all ten values in memory, unlike the range
tuple(range(10))

(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

In [22]:
#take string hello and create a tuple from the sequence
# the tuple should be a sequence of the individual characters in hello
tuple('hello')

('h', 'e', 'l', 'l', 'o')

In [23]:
t = tuple('hello')
(
    'e' in t,
    t[::-1], # reversal
    t * 3 # repeat the tuple 3 times (builds a new tuple)
)
#t[0] = "tigers" # impossible, it's immutable

(True,
 ('o', 'l', 'l', 'e', 'h'),
 ('h', 'e', 'l', 'l', 'o', 'h', 'e', 'l', 'l', 'o', 'h', 'e', 'l', 'l', 'o'))

## 8. Mutable data structures: Lists, Sets, Dicts

### Lists

Lists support the [mutable sequence operations](https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types) in addition to the [common sequence operations](https://docs.python.org/3/library/stdtypes.html#common-sequence-operations).

- Examples of mutable operations:
    1. assign to an index
    2. assign to a slice
    3. delete a slice
    4. append, clear
    5. copy (shallow copy, same as `s[:]`)
- Lots of methods work well with sequences, convert between sequence types, etc.
- Note: mutable data structures are part of stateful, imperative programming
    - most modern languages are primarily imperative
    - it's possible to have and use data structures in a non-imperative way: PERSISTENT DATA STRUCTURES
        - they are often very efficient (ask if you want to know more)
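
The mutable operations listed above can be sketched on one throwaway list. The list contents and the names `copy_a` and `copy_b` are made up for illustration.

```python
nums = [0, 1, 2, 3, 4, 5]

nums[0] = 'zero'              # 1. assign to an index
nums[1:3] = ['a', 'b', 'c']   # 2. assign to a slice (the list grows by one)
del nums[-2:]                 # 3. delete a slice
nums.append('end')            # 4. append one element
copy_a = nums.copy()          # 5. shallow copy ...
copy_b = nums[:]              #    ... equivalent to slicing the whole list
nums.clear()                  # 4. clear empties the list in place

print(nums, copy_a, copy_b)
```

The copies survive the `clear()` because they are separate list objects.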
    

In [24]:
# lists can be heterogeneous
#   convenient, easy to play with
#   not great, because there's no reliable way to know the type of what you just pulled out

# square brackets: literal syntax
# empty lists are supported, best written as []
l = [1, 2, 1, 1, 2, 3, 3, 1]
l = ['lions', 5, True, 'hello']

In [25]:
type(l)

list

In [26]:
len(l)

4

In [28]:
l[3]

'hello'

In [29]:
l[1:-1] # doesnt include the last one

[5, True]

In [31]:
# `+` concatenates into a new list and doesn't alter the original: the common (immutable) sequence operations behave the same way on a mutable type!
l + ['tigers', 'bears']

['lions', 5, True, 'hello', 'tigers', 'bears']

In [None]:
l # `+` does *not* mutate the list!

In [None]:
l * 3 # creates a new list, copied three times

In [33]:
total = 0
for x in l:
    print(x)
    #total += x
total

lions
5
True
hello


0
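
The commented-out `total += x` would fail on the string elements. One possible way to sum only the numbers, shown as a sketch rather than as the lecture's approach; note that `bool` is a subclass of `int`, so `True` has to be excluded explicitly if it should not count as 1.

```python
l = ['lions', 5, True, 'hello']

# Naive filter: True sneaks in because isinstance(True, int) is True.
naive = sum(x for x in l if isinstance(x, int))

# Stricter filter: keep ints but leave out bools.
total = sum(x for x in l if isinstance(x, int) and not isinstance(x, bool))

print(naive, total)
```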

#### Lists from other things ...

In [None]:
#create list from other sequences
#   dont need to append from a for loop when you need to populate a list
list(range(10))

In [34]:
#this also works
list('hello!')

['h', 'e', 'l', 'l', 'o', '!']

In [35]:
# the argument is a tuple
#   3 elements, one of which is a nested tuple
#   list() copies the references over, so the nested tuple in the list is the same object in memory
list((1, 2, (3, 4)))

[1, 2, (3, 4)]

In [36]:
# splits the string on a delimiter (the default is any run of whitespace, which is thrown away, leaving the tokens)
'I love CS 331'.split()

['I', 'love', 'CS', '331']

In [37]:
# you can pass a custom delimiter, but then any surrounding spaces stay in the pieces
# call strip() on each piece to get rid of that whitespace
'apples, bananas, cats, dogs'.split(',')

['apples', ' bananas', ' cats', ' dogs']
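
A minimal sketch of the trimming step mentioned above, using `str.strip` on each piece; the variable names are illustrative.

```python
raw = 'apples, bananas, cats, dogs'

# Split on the comma, then strip the leftover whitespace from each piece.
items = [part.strip() for part in raw.split(',')]

# When the spacing is consistent, splitting on the full separator also works.
same = raw.split(', ')

print(items)
```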

In [38]:
# also, strings from lists of strings
'-'.join(['a', 'e', 'i', 'o', 'u'])
# takes elements of list, puts string between them and joins them into one string

'a-e-i-o-u'

In [None]:
# string not currently list-ified? split it first
' 👏 '.join('this is a beautiful day'.split())

#### Mutable list operations

In [45]:
# create a list from the string 'hell'
l = list('hell')

In [41]:
k=l


In [42]:
# FIRST MUTABLE LIST OPERATION
# NO RESULT: this is typical of mutating operations; the method returns None and does its work through a side effect
# we call this modifying in place
l.append('o')
k is l # these are still the same identical object

True

In [40]:
# if I want to see the change, I inspect the list
l

['h', 'e', 'l', 'l', 'o']

In [46]:
# start again from 'hell'; append adds the whole string to the list as a single element
l = list('hell')
l.append(' there')

In [47]:
l

['h', 'e', 'l', 'l', ' there']

In [48]:
# now let's get rid of ' there'
# del is a keyword, followed by a target that supports deletion
#   (under the hood it uses the special method __delitem__)
del l[-1]
# nothing is returned; it changes the actual object

In [49]:
l

['h', 'e', 'l', 'l']

In [50]:
# notice it takes a sequence and adds all elements of the sequence, not the sequence itself
l.extend(' there')

In [51]:
l

['h', 'e', 'l', 'l', ' ', 't', 'h', 'e', 'r', 'e']
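
The difference between `append` and `extend` side by side, as a small sketch on three separate throwaway lists:

```python
a = list('hell')
a.append(' there')   # adds the string as ONE element

b = list('hell')
b.extend(' there')   # iterates over the string, adding each character

c = list('hell')
c += ['o', '!']      # += on a list extends it in place, like extend

print(a, b, c, sep='\n')
```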

In [52]:
l[2:7]

['l', 'l', ' ', 't', 'h']

In [53]:
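del l[2:7]
# are the remaining elements shifted down to fill the gap? yes
# fixed-size arrays in lower-level languages don't usually support operations like this
# Python's built-in sequence is the higher-level list rather than a raw array, which speaks to its high-level philosophy
# so don't call a list an array!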

In [58]:
l

['h', 'e', 'e', 'r', 'e']

In [56]:
# when you assign to a slice, the right-hand side needs to be a sequence
# the list expands or contracts as necessary
l[0:2] = 'get '

In [59]:
l

['g', 'e', 't', ' ', 'e', 'r', 'e']

In [60]:
l[:]

['g', 'e', 't', ' ', 'e', 'r', 'e']

In [61]:
l == l[:]

True

In [63]:
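# a full slice is a brand-new list, not the same object in memory
l is l[:]  # False (not displayed: only the last expression in a cell is shown)
l1 = [[1,2], [3,4]]

# the copy is shallow: l2 is a new outer list whose elements are the same inner lists as l1's
l2 = l1[:]
l1[0][0] = 42             # visible through l2 too, because the inner list is shared
l2[1] = ['test', 'test']  # rebinds only l2's slot; l1 is unaffected
l1,l2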

([[42, 2], [3, 4]], [[42, 2], ['test', 'test']])
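
To contrast the three situations (alias, shallow copy, fully independent copy), here is a sketch that uses `copy.deepcopy` from the standard library. The deep copy is an addition for comparison and was not part of the original example.

```python
import copy

original = [[1, 2], [3, 4]]
alias = original                 # same object, no copy at all
shallow = original[:]            # new outer list, shared inner lists
deep = copy.deepcopy(original)   # new outer list AND new inner lists

original[0][0] = 42              # mutate one of the inner lists

print(alias[0][0])     # the alias sees the change
print(shallow[0][0])   # so does the shallow copy
print(deep[0][0])      # the deep copy does not
```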

#### Sorting lists

See <https://docs.python.org/3/library/stdtypes.html#list.sort>

In [64]:
import random

l = list(range(-10,10)) #-10 to 9
random.shuffle(l) # this does it in place, actually modifies the list
l

[-3, -8, 4, -6, -4, -10, -5, -2, 6, -9, 1, -1, 2, 8, 0, -7, 5, 9, 3, 7]

In [65]:
# the sort method accepts optional parameters (key, reverse); by default it compares the values directly, in ascending order
l.sort()
l

[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [66]:
random.shuffle(l)
# sorted is a function that works on any iterable; it is the non-mutating counterpart of list.sort and creates a new list
# it is less efficient, since it has to build a brand-new list
#       memory-wise the in-place sort method is a lot more efficient
sorted(l)

[-10, -9, -8, -7, -6, -5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [67]:
l

[3, 7, -6, 9, 5, 6, -10, -7, 1, -4, -8, 0, -2, -5, -9, 2, 4, 8, -3, -1]
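
A common pitfall follows from the in-place versus new-list distinction: `list.sort` returns `None`. A brief sketch with made-up values:

```python
nums = [3, 1, 2]
result = nums.sort()        # sorts in place and returns None
print(result, nums)

# sorted() accepts any iterable and always returns a new list.
letters = sorted('cab')
descending = sorted((3, 1, 2), reverse=True)
print(letters, descending)
```

Writing `nums = nums.sort()` would therefore replace the list with `None`.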

In [68]:
l.sort(reverse=True)
l

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0, -1, -2, -3, -4, -5, -6, -7, -8, -9, -10]

In [71]:
# can specify a key function that is applied to every element for comparison purposes, BUT WE AREN'T CHANGING THE VALUES THEMSELVES
#   a place where lambdas are great: a short anonymous function is quick to write and legible here
l.sort(key=lambda n: abs(n))
l

[0, 1, -1, 2, -2, 3, -3, 4, -4, 5, -5, 6, -6, 7, -7, 8, -8, 9, -9, -10]

In [72]:
# can pass built-in functions directly
# the sort is stable: elements with equal keys keep the relative order they had before sorting
l.sort(key=abs, reverse=True)
l

[-10, 9, -9, 8, -8, 7, -7, 6, -6, 5, -5, 4, -4, 3, -3, 2, -2, 1, -1, 0]
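
Stability is what makes multi-pass sorting work: sort by the secondary key first, then by the primary key. A sketch with an invented word list:

```python
words = ['pear', 'fig', 'apple', 'kiwi', 'date', 'plum']

words.sort()          # secondary key first: alphabetical order
words.sort(key=len)   # primary key second: length; ties stay alphabetical

# The same ordering in one pass, using a tuple as the key.
one_pass = sorted(words, key=lambda w: (len(w), w))

print(words)
```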

#### List comprehensions

In [78]:

# so far we created lists with literals, literally writing out the values
# the traditional loop-style way of populating a list:
l = []
for i in range(10):
    # 2*i + 5 is the generating expression; the range seeds it with values
    l.append(2*i + 5)

l


[5, 7, 9, 11, 13, 15, 17, 19, 21, 23]

In [81]:
# list comprehension
# inspired by mathematical set-builder notation, e.g. {2x | x ∈ N}, the set of even numbers; a better way to write the loop above
# with the loop you have to look through your code to find the generating expression, which is not as succinct as this
[(2*x + 5) for x in range(10)]
# the generating expression can be arbitrarily complicated
# e.g. a nested list comprehension
[[(2*n) for n in range(x)] for x in range(10)]

[[],
 [0],
 [0, 2],
 [0, 2, 4],
 [0, 2, 4, 6],
 [0, 2, 4, 6, 8],
 [0, 2, 4, 6, 8, 10],
 [0, 2, 4, 6, 8, 10, 12],
 [0, 2, 4, 6, 8, 10, 12, 14],
 [0, 2, 4, 6, 8, 10, 12, 14, 16]]

In [84]:
adjs = ('hot', 'blue', 'quick')
nouns = ('table', 'fox', 'sky')
# combine every adjective with every noun: this takes the Cartesian product of adjs and nouns
# two for clauses inside a single list comprehension behave like nested loops, with the first for as the outer loop
    # so the order of your for clauses matters
[(adj + ' ' + noun) for adj in adjs 
                    for noun in nouns]

['hot table',
 'hot fox',
 'hot sky',
 'blue table',
 'blue fox',
 'blue sky',
 'quick table',
 'quick fox',
 'quick sky']
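
The comprehension with two `for` clauses corresponds to ordinary nested loops, read left to right as outer to inner. A sketch showing the equivalence and the effect of swapping the clauses:

```python
adjs = ('hot', 'blue', 'quick')
nouns = ('table', 'fox', 'sky')

# Explicit nested loops: the first for clause is the outer loop.
phrases = []
for adj in adjs:
    for noun in nouns:
        phrases.append(adj + ' ' + noun)

comp = [adj + ' ' + noun for adj in adjs for noun in nouns]

# Swapping the clauses yields the same pairs in a different order.
swapped = [adj + ' ' + noun for noun in nouns for adj in adjs]

print(comp == phrases, swapped[:3])
```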

In [85]:
# Pythagorean triples
# this comprehension also includes a filter
# generate tuples (a, b, c), with each value drawn from its own range, but only keep those that pass the filter test
#       i.e. looking for integer right triangles
n = 50
[(a,b,c) for a in range(1,n)
         for b in range(a,n)
         for c in range(b,n)
         # the if clause limits which tuples are kept
         if a**2 + b**2 == c**2]

# if a list comprehension gets too big or hard to read, you can switch back to ordinary nested loops
# comprehensions are built-in syntax: we can't build data structures that magically get their own syntactic sugar

[(3, 4, 5),
 (5, 12, 13),
 (6, 8, 10),
 (7, 24, 25),
 (8, 15, 17),
 (9, 12, 15),
 (9, 40, 41),
 (10, 24, 26),
 (12, 16, 20),
 (12, 35, 37),
 (15, 20, 25),
 (15, 36, 39),
 (16, 30, 34),
 (18, 24, 30),
 (20, 21, 29),
 (21, 28, 35),
 (24, 32, 40),
 (27, 36, 45)]
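
When a comprehension grows this large, the same search can be written as explicit loops. A sketch, where the function name `pythagorean_triples` is hypothetical and chosen for illustration:

```python
def pythagorean_triples(n):
    """Return all (a, b, c) with 1 <= a <= b <= c < n and a^2 + b^2 == c^2."""
    triples = []
    for a in range(1, n):
        for b in range(a, n):
            for c in range(b, n):
                if a**2 + b**2 == c**2:   # the filter from the comprehension
                    triples.append((a, b, c))
    return triples

small = pythagorean_triples(20)
print(small)
```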

### Sets

A [set](https://docs.python.org/3.7/library/stdtypes.html#set-types-set-frozenset) is a data structure that represents an *unordered* collection of unique objects (like the mathematical set). 

- Sets are among the least-used data structures, but when they're applicable they're great

In [88]:
# represents a collection of UNIQUE objects with no defined order; the elements do sit somewhere in memory,
# but there is no index you can use to access them
nums = [1, 2, 1, 1, 2, 3, 3, 1]  # not named `list`, which would shadow the built-in list type
# using the set constructor
s = set(nums)

In [89]:
s
# removes duplicates, automatically unique-ifies

{1, 2, 3}

In [90]:
t = {2, 3, 4, 5}

In [91]:
# sets support the mathematical set operations
s.union(t)

{1, 2, 3, 4, 5}

In [92]:
# the | operator is supported for union too
s | t

{1, 2, 3, 4, 5}

In [93]:
# present in the first one but not in the second
s.difference(t)

{1}

In [None]:
s - t

In [94]:
s.intersection(t)

{2, 3}

In [None]:
s & t
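
The operator forms side by side, using the same two sets as above. Symmetric difference (`^`) and the subset test (`<=`) are additions not shown in the cells; they are standard set operators.

```python
s = {1, 2, 3}
t = {2, 3, 4, 5}

union = s | t            # same as s.union(t)
difference = s - t       # same as s.difference(t)
intersection = s & t     # same as s.intersection(t)
symmetric = s ^ t        # in exactly one of the two sets
is_subset = {2, 3} <= s  # subset test

print(union, difference, intersection, symmetric, is_subset)
```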

In [95]:
1 in s
# consider the performance difference versus a list: a list is scanned element by element, while a set is hash-based, so membership tests take roughly constant time on average
# a set is a great choice if you are often checking containment

True
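
A rough way to observe the membership-test difference is `timeit`. This is only a sketch: the collection size and repeat count are arbitrary, and the timings depend on the machine, so no particular numbers are claimed.

```python
import timeit

n = 10_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1   # worst case for the list: the last element

# Time 200 membership tests against each container.
list_time = timeit.timeit(lambda: target in as_list, number=200)
set_time = timeit.timeit(lambda: target in as_set, number=200)

print(f'list: {list_time:.6f}s  set: {set_time:.6f}s')
```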

In [None]:
for x in s:
    print(x) # will work fine
    # the order of iteration is not something to rely on: it may differ between runs or Python versions
    # iterating twice over the same unmodified set gives the same order both times
s[0] # raises TypeError: you CAN'T access set elements by position

### Dicts (aka maps)

A [dictionary](https://docs.python.org/3/library/stdtypes.html#mapping-types-dict) is a data structure that contains a set of unique key &rarr; value mappings. 

- maps from a set of unique keys to not necessarily unique values

In [96]:
# literal syntax
d = {
    'Superman':  'Clark Kent',
    'Batman':    'Bruce Wayne',
    'Spiderman': 'Peter Parker',
    'Ironman':   'Tony Stark'
}

In [97]:
# look up values by key, using the same bracket syntax as list indexing
d['Ironman']

'Tony Stark'

In [99]:
# dicts are mutable, so you can change a mapping
d['Ironman'] = 'James Rhodes'

In [100]:
d

{'Superman': 'Clark Kent',
 'Batman': 'Bruce Wayne',
 'Spiderman': 'Peter Parker',
 'Ironman': 'James Rhodes'}

In [104]:
# using the del statement
del d['Ironman']
d['Hulk'] = "Bruce Banner"
d["Foo"] = "Bruce Wayne"
d

# what is serviceable as a key?
#   keys must be hashable, so you typically want to pick something that is IMMUTABLE

{'Superman': 'Clark Kent',
 'Batman': 'Bruce Wayne',
 'Spiderman': 'Peter Parker',
 'Hulk': 'Bruce Banner',
 'Foo': 'Bruce Wayne'}
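
A sketch of what "serviceable as a key" means in practice: tuples of hashable items work, while lists raise `TypeError`. The `grid` dictionary is invented for illustration.

```python
grid = {}
grid[(0, 1)] = 'tuple keys are fine'    # a tuple of ints is hashable

error_message = None
try:
    grid[[0, 1]] = 'lists are not'      # lists are mutable and unhashable
except TypeError as err:
    error_message = str(err)

print(grid, error_message)
```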

In [105]:
# iterating over dicts
# iterating a dict walks its keys; since Python 3.7 dicts are guaranteed to preserve insertion order (older versions did not)
for k in d:
    print(f'{k} => {d[k]}')

Superman => Clark Kent
Batman => Bruce Wayne
Spiderman => Peter Parker
Hulk => Bruce Banner
Foo => Bruce Wayne


In [106]:
# d.keys() gives an iterable view of the keys
for k in d.keys():
    print(f'{k} => {d[k]}')

Superman => Clark Kent
Batman => Bruce Wayne
Spiderman => Peter Parker
Hulk => Bruce Banner
Foo => Bruce Wayne


In [None]:
# d.values() gives just the values, which can repeat; from a value alone you can't tell which key it belonged to
for v in d.values():
    print(v)
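
If the reverse direction is needed, one option is to build an inverted mapping while walking `items()`. This is a sketch on a cut-down copy of the dictionary; `keys_by_value` is a hypothetical name, and each value maps to a list because values can repeat.

```python
d = {'Batman': 'Bruce Wayne', 'Foo': 'Bruce Wayne', 'Hulk': 'Bruce Banner'}

keys_by_value = {}
for k, v in d.items():
    # setdefault returns the existing list for v, or installs a new empty one
    keys_by_value.setdefault(v, []).append(k)

print(keys_by_value)
```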

In [107]:
# items() yields (key, value) tuples
# here each tuple is unpacked into k, v
# THIS AVOIDS THE EXTRA d[k] LOOKUP PER KEY THAT `for k in d` NEEDS: here you simply walk through the pairs
#   (though dicts are fast lookup structures anyway)
for k,v in d.items():
    print(f'{k} => {v}')

Superman => Clark Kent
Batman => Bruce Wayne
Spiderman => Peter Parker
Hulk => Bruce Banner
Foo => Bruce Wayne


#### Dictionary comprehensions

In [110]:
# not as frequently used as list comprehensions; populating a dictionary is less often a good fit for this sort of pattern
# generating expression e: 2**e is a key: value pair, so the key is e and the value is 2**e
# e runs over the range 0 to 100 (exclusive) in steps of 10
es = {e:2**e for e in range(0,100,10)}
es[20]

1048576

In [None]:
# 30 (x, y) combinations, but each key x is overwritten on every pass, so every key ends up mapped to 9
{x:y for x in range(3)
     for y in range(10)}
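
The overwriting can be confirmed by comparing against the equivalent list comprehension, which keeps every pair. A minimal sketch:

```python
# The list keeps all 30 (x, y) pairs ...
pairs = [(x, y) for x in range(3) for y in range(10)]

# ... but the dict keeps only the last value written for each key.
overwritten = {x: y for x in range(3) for y in range(10)}

print(len(pairs), overwritten)
```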

In [111]:
sentence = 'a man a plan a canal panama'
# maps each word to the reverse of itself (the repeated word 'a' yields a single key)
{w:w[::-1] for w in sentence.split()}

{'a': 'a', 'man': 'nam', 'plan': 'nalp', 'canal': 'lanac', 'panama': 'amanap'}