## Sets

A set is a data structure that can only contain unique objects. Adding an item that is already in the set does nothing, and it does not raise an error.

Ref: https://realpython.com/python-sets/


In [None]:
numbers = set([1,1,1,1,1,3,3,3,3,3,2,2,2,2,3,3,3,4])
letters = set('TheQuickBrownFoxJumpedOverTheLazyDog'.lower())
numbers.add(4)
print(numbers)

{1, 2, 3, 4}


In [None]:
numbers.add(5)
print(numbers)

{1, 2, 3, 4, 5}


In [None]:
numbers.update([3,4,5,6,7])
print(numbers)

{1, 2, 3, 4, 5, 6, 7}


In [None]:
numbers.pop() # removes and returns an arbitrary element

1

In [None]:
print(numbers)

{2, 3, 4, 5, 6, 7}


In [None]:
numbers.remove(7)

In [None]:
print(numbers)

{2, 3, 4, 5, 6}
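The cells above use `remove`. As a small sketch that goes slightly beyond them: `remove` raises a `KeyError` when the item is missing, while the built-in `discard` method quietly does nothing in that case. The set name `digits` and the flag `removed` are made up for this illustration.

```python
digits = {2, 3, 4, 5, 6}  # hypothetical example set

digits.discard(6)    # 6 is present, so it is removed
digits.discard(42)   # 42 is not in the set: no error, nothing happens

try:
    digits.remove(42)  # remove() insists that the item exists
    removed = True
except KeyError:
    removed = False

print(digits == {2, 3, 4, 5})
print(removed)
```

Choosing between the two is mostly a question of whether a missing item should count as a bug in your program.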


Sets support union, intersection, difference, and symmetric difference:

In [None]:
house_pets = {'dog', 'cat', 'fish'}
farm_animals = {'cow', 'sheep', 'pig', 'dog', 'cat'}

house_pets & farm_animals

{'cat', 'dog'}

In [None]:
house_pets | farm_animals

{'cat', 'cow', 'dog', 'fish', 'pig', 'sheep'}

In [None]:
house_pets ^ farm_animals # symmetric difference

{'cow', 'fish', 'pig', 'sheep'}

In [None]:
house_pets - farm_animals # difference: in house_pets but not in farm_animals

{'fish'}
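Each operator above also has a named method on the set type, which some readers may find easier to scan. A short sketch using the same two sets (the result names are chosen here for illustration):

```python
house_pets = {'dog', 'cat', 'fish'}
farm_animals = {'cow', 'sheep', 'pig', 'dog', 'cat'}

both = house_pets.intersection(farm_animals)               # same as &
either = house_pets.union(farm_animals)                    # same as |
only_one = house_pets.symmetric_difference(farm_animals)   # same as ^
pets_only = house_pets.difference(farm_animals)            # same as -

# Each method agrees with its operator
print(both == (house_pets & farm_animals))
print(either == (house_pets | farm_animals))
print(only_one == (house_pets ^ farm_animals))
print(pets_only == (house_pets - farm_animals))
```

One practical difference: the methods accept any iterable as their argument, whereas the operators require both sides to be sets.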

### In Class Exercise

1. If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23. Find the sum of all the multiples of 3 or 5 below 1000.

In [None]:
total = 0  # named total so the built-in sum() is not shadowed

for i in range(1000):
    if i % 3 == 0 or i % 5 == 0:
        total += i

print(total)

233168
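Since this section is about sets, the same exercise can also be sketched with a union: build the multiples of 3 and the multiples of 5 as two sets and combine them, so that numbers such as 15 are only counted once. This is one possible approach, not the only one. It relies on the built-in `sum`, so it assumes that name has not been reassigned in the current session.

```python
multiples_of_3 = set(range(0, 1000, 3))
multiples_of_5 = set(range(0, 1000, 5))

# The union keeps numbers that appear in both sets (15, 30, ...) only once
total = sum(multiples_of_3 | multiples_of_5)

print(total)
```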


## File IO

Easy to do in Python!

In [None]:
# The with block closes the file automatically, even if an error occurs
with open('data.txt', 'w') as myfile:
    myfile.write("writing data to file")

data.txt should now be saved in the folder you are running this notebook from. Go check!

Note: opening an existing file in `w` mode erases its original contents!

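There are only a few commonly used modes:
1. w - writing (creates the file, or empties it if it already exists)
2. r - reading (the default)
3. a - appending
4. r+ - reading and writing
5. b - binary mode (combined with one of the others, e.g. rb)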
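A minimal sketch of how the `w`, `a`, and `r` modes behave in practice. It works inside a temporary directory so that no real file is touched; the file name `modes_demo.txt` and the variable names are made up for the example.

```python
import os
import tempfile

# Work in a throwaway directory that is deleted at the end of the block
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes_demo.txt')  # hypothetical file name

    with open(path, 'w') as f:   # 'w' creates the file (or empties it)
        f.write('first line\n')

    with open(path, 'a') as f:   # 'a' adds to the end of the file
        f.write('second line\n')

    with open(path, 'r') as f:   # 'r' reads what is there
        after_append = f.read()

    with open(path, 'w') as f:   # 'w' again: the old contents are gone
        f.write('replaced\n')

    with open(path, 'r') as f:
        after_rewrite = f.read()

print(repr(after_append))
print(repr(after_rewrite))
```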

### File-like Objects

With the amount of RAM available in modern computers, there could be a big performance gain in your application from using an in-memory file-like object instead of a file on disk.

In [None]:
import io
mystringfile = io.StringIO()
mystringfile.write("This is my data!")
mystringfile.read() # cursor is at the end!

''

In [None]:
mystringfile.seek(0) # put the cursor back at the start
mystringfile.read()

'This is my data!'
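What makes an object "file-like" is that code written for real files can use it unchanged. The sketch below assumes a made-up helper, `count_lines`, that only needs to loop over its argument line by line, and feeds it an `io.StringIO`. It also shows `getvalue()`, which returns the whole buffer without having to move the cursor first.

```python
import io

def count_lines(handle):
    """Hypothetical helper: count lines in anything iterable like a text file."""
    count = 0
    for _line in handle:
        count += 1
    return count

fake_file = io.StringIO("first\nsecond\nthird\n")
line_total = count_lines(fake_file)

buffer = io.StringIO()
buffer.write("one\ntwo\n")
contents = buffer.getvalue()  # whole buffer, regardless of cursor position

print(line_total)
print(repr(contents))
```

The same `count_lines` would work on the object returned by `open(...)`, which is the point of the file-like interface.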

### In Class Exercise

Write a word count function that takes a text file and returns a dictionary that contains the count for each word. Remove all punctuation except apostrophes. Lowercase all words.

In [None]:
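import string

def word_count(filename):
    # Every punctuation character except the apostrophe gets removed
    to_remove = string.punctuation.replace("'", "")

    with open(filename, 'r') as myfile:
        mydata = myfile.read().lower()
    mydata = mydata.translate(str.maketrans('', '', to_remove))

    mydict = {}
    for word in mydata.split():
        # get() returns 0 the first time a word is seen
        mydict[word] = mydict.get(word, 0) + 1
    return mydict


print(word_count('data.txt'))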

{'writing': 1, 'data': 1, 'to': 1, 'file': 1}


## Map Reduce

`map` is a function that takes two arguments: another function and a collection of items. It will:
1. Run the function on each item of the original collection
2. Return the results as a new collection (in Python 3, a lazy iterator that you can pass to `list`)
3. Leave the original collection unchanged

In Python the collection only needs to be iterable (for example a list, tuple, or string).
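A short sketch of these points, with example data made up for the purpose. It shows that the input is left alone, that a tuple or a string works as the collection, and that in Python 3 the object returned by `map` is consumed once it has been read.

```python
words = ("dog", "cat", "zebra")      # a tuple works as well as a list

shouted = map(str.upper, words)      # nothing has been computed yet
result = list(shouted)               # reading the iterator runs str.upper
second_pass = list(shouted)          # the iterator is now used up

codes = list(map(ord, "abc"))        # a string is iterated character by character

print(result)
print(words)
print(second_pass)
print(codes)
```

If the results are needed more than once, store them in a list as done with `result` here.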

In [None]:
#Example 1: String Lengths
string_lengths = map(len, ["dog", "cat", "zebra", "turtle"])
print(list(string_lengths))

[3, 3, 5, 6]


Example 2: Cubing

In [None]:
cubes = map(lambda x: x**3, [0,1,2,3,4])
print(list(cubes))

[0, 1, 8, 27, 64]


## Lambdas

A lambda lets you define and use an unnamed function. The arguments go between `lambda` and the colon, and the expression after the colon is implicitly returned (i.e. without an explicit return statement).

Lambdas are most useful when:
1. your function is simple
2. you only need to use it once

Consider the usual way:

In [None]:
def cube(x):
    return x**3

print(cube(4))

64


In [None]:
# we could have done this instead
cubes = map(cube, [0,1,2,3,4])
print(list(cubes))

[0, 1, 8, 27, 64]
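To make the comparison concrete, the sketch below puts the `def` version next to an equivalent lambda and checks that they agree. It then shows a typical one-off use, a sort key, and a lambda with two arguments. The names `cube_lambda`, `animals`, `by_length`, and `added` are made up for the example.

```python
def cube(x):
    return x**3

cube_lambda = lambda x: x**3   # given a name only so the two can be compared

same = all(cube(n) == cube_lambda(n) for n in range(5))

# A one-off function used as a sort key and needed nowhere else
animals = ["zebra", "dog", "turtle", "cat"]
by_length = sorted(animals, key=lambda word: len(word))

# Several arguments go between lambda and the colon, separated by commas
added = (lambda a, b: a + b)(2, 3)

print(same)
print(by_length)
print(added)
```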


### In Class Exercise

Write a lambda that adds 5 to each number in a list.

In [None]:
func = lambda x: x+5
print(list(map(func, [1,2,3,4,5,6])))

Now back to map. Here is a procedural way to take a list of real names and assign each one a random code name:

In [None]:
import random

names = ['Jared', 'Gavin', 'Walter', 'Mike', 'Brett', 'Hugh']
code_names = ['Eagle', 'Hawk', 'Seagull', 'Heron', 'Sparrow', 'Raven']

for i in range(len(names)):
    names[i] = random.choice(code_names)

print(names)

['Eagle', 'Heron', 'Eagle', 'Raven', 'Raven', 'Sparrow']


### In Class Exercise

Rewrite the above code functionally (use map and lambda).

In [None]:
names = ['Jared', 'Gavin', 'Walter', 'Mike', 'Brett', 'Hugh']

covernames = map(lambda x: random.choice(['Eagle', 'Hawk', 'Seagull', 'Heron', 'Sparrow', 'Raven']), names)

print(list(covernames))

## Reduce

Reduce is a counterpart to map. Given a function and a collection of items, it uses the function to combine them into a single value and returns that result.

The function passed to reduce has some restrictions. It must take two arguments: an accumulator and an update value. The update value works as it did with map; it is set to each item in the collection, one by one. The accumulator is new. It receives the output of the previous function call, thus "accumulating" the combined value from item to item through the collection.

In [None]:
import functools
# named total so the built-in sum() is not shadowed
total = functools.reduce(lambda a, x: a + x, [0,1,2,3,4])
print(total)

10
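To see the accumulator at work, the sketch below writes out roughly what `functools.reduce` does as an explicit loop and compares the two. It also uses the optional third argument of `reduce`, which supplies the starting accumulator. The variable names are chosen for illustration.

```python
import functools

numbers = [0, 1, 2, 3, 4]

# What reduce does, written as an explicit loop
accumulator = numbers[0]       # with no initial value, the first item seeds it
for x in numbers[1:]:
    accumulator = accumulator + x   # the "function" applied to (a, x)

total = functools.reduce(lambda a, x: a + x, numbers)

# An optional third argument sets the starting accumulator
total_from_100 = functools.reduce(lambda a, x: a + x, numbers, 100)

print(accumulator)
print(total)
print(total_from_100)
```

Passing an initial value also makes `reduce` safe to call on an empty collection, where it simply returns that value.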
