<p style="padding:10px;border-radius:4px;background-color:cornflowerblue;color:white;font-size: 28px;line-height:33px;">A compendium of comprehensions: an instructive review notebook.</p>
<hr />
<p style="font-size: 9px;">GB Feb 01, 2023</p>
<p>The purpose of this notebook is to work sequentially through the fundamentals of comprehensions, with side points along the way.  Experienced students may end up skipping parts because they know the topic already; others may want to work through each cell.  In any case, here&rsquo;s a compendium of comprehensions.</p>
<hr />
<h2>Table of Contents</h2>
<ul>
    <li>Lists and List Comprehensions</li>
    <li>Tuples</li>
    <li>Sets</li>
    <li>Dictionaries</li>
</ul>
<hr />

<p style="color:red;font-family:Baskerville;font-size:30px;">
    List Comprehensions</p>
 <blockquote>A list is roughly the equivalent of other languages' <i>array</i>: a mutable data structure that can hold a variety of data types in a single object - the list.<br /><br />
    Ultimately, the vital point is to <i>iterate</i> (or loop thru) all elements of the list and to build new versions that select the data meeting some criterion.
<br /><br />Examples:
</blockquote>

In [103]:
students = [] # an empty list
print(f"{students} type: {type(students)}  id: {id(students)}")

[] type: <class 'list'>  id: 4549616384


In [4]:
students = ['Tom', 25, 'Ming', 32, 'Toby', 18] # list created with elements
print(f"{students} \n type: {type(students)}  id: {id(students)}")

['Tom', 25, 'Ming', 32, 'Toby', 18] 
 type: <class 'list'>  id: 4427492096


<p style="color:red;font-family:Baskerville;font-size:16px;">
    Iterations: for loop and comprehension examples</p>

In [5]:
for student in students:
    print(student)
    
print("-"*30)
# let's find Ming:
for student in students:
    if student == "Ming":
        print(student)

Tom
25
Ming
32
Toby
18
------------------------------
Ming


<p style="color:red;font-family:Baskerville;font-size:16px;">
Creating a new list: namespace and object space
</p>
<blockquote>We want to use more than 1 list and think about whether that list and copies <i>point</i> to the same object (in RAM; aka object space) and what happens to that object when we manipulate it.
<br /><br />
Do any of these list objects share the same space in RAM/object space, even tho they have different &ldquo;name spaces&rdquo;?  Notice the result of the new_students list - a new object with new name &amp; object spaces.<br /><br />
Here is a baseline syntax: <br />
    <code>the_list = [expression <span style="color:green;">for</span> item <span style="color:green;">in</span> iterable <span style="color:green;">if</span> condition]</code>
</blockquote>

In [6]:
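# let's look for Ming and build a new list
all_students = []
new_students = []

all_students = students  # rebinds the name: all_students and students are now the same object

for student in students:
    # careful: 'in' tests the whole list and returns True/False, so this is
    # True on every pass and every element gets copied.
    # To keep only Ming, test the item instead: if student == "Ming"
    if "Ming" in students:
        new_students.append(student)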
        
print("-"*30)
print(f"students: {students} \n type: {type(students)}  id: {id(students)}\n")
print(f"all_students: {all_students} \n type: {type(all_students)}  id: {id(all_students)}\n")
print(f"new_students: {new_students} \n type: {type(new_students)}  id: {id(new_students)}")

------------------------------
students: ['Tom', 25, 'Ming', 32, 'Toby', 18] 
 type: <class 'list'>  id: 4427492096

all_students: ['Tom', 25, 'Ming', 32, 'Toby', 18] 
 type: <class 'list'>  id: 4427492096

new_students: ['Tom', 25, 'Ming', 32, 'Toby', 18] 
 type: <class 'list'>  id: 4427465536
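
<p>A minimal sketch of the same idea, assuming the goal was to keep only 'Ming': the comprehension tests each item, and <code>is</code> confirms which names share an object.  The names <code>alias</code> and <code>found</code> are illustrative, not from the notebook.</p>

```python
students = ['Tom', 25, 'Ming', 32, 'Toby', 18]

alias = students                                # a second name for the same object
found = [s for s in students if s == "Ming"]    # a new list object

print(alias is students)   # True: one object, two names
print(found is students)   # False: the comprehension built a new list
print(found)               # ['Ming']

alias.append('Lynne')      # a change made through one name ...
print(students[-1])        # ... is visible through the other: Lynne
```

<p>Comparing with <code>is</code> answers the same question as comparing the two id() values by eye.</p>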


<p style="color:red;font-family:Baskerville;font-size:16px;">Working from a <i>for loop</i> to a <i>list comprehension</i>.</p>

In [7]:
all_students = ['Ming', 'かちとくに', 'סר', 'Tom', 'José', 'Maria', 'Lynne', 'Dave']

new_student_list = [student for student in all_students if "סר" in student]

not_student_list = [student for student in all_students if student != 'Lynne']


new_student_list
print(f"all_students: {all_students} \n type: {type(all_students)}  id: {id(all_students)}\n")
print(f"new_student_list: {new_student_list} \n type: {type(new_student_list)}  id: {id(new_student_list)}\n")
print(f"not_student_list: {not_student_list} \n type: {type(not_student_list)}  id: {id(not_student_list)}\n")


all_students: ['Ming', 'かちとくに', 'סר', 'Tom', 'José', 'Maria', 'Lynne', 'Dave'] 
 type: <class 'list'>  id: 4427425792

new_student_list: ['סר'] 
 type: <class 'list'>  id: 4427431744

not_student_list: ['Ming', 'かちとくに', 'סר', 'Tom', 'José', 'Maria', 'Dave'] 
 type: <class 'list'>  id: 4427425472



<p style="font-size:24px; color:red; font-family:Baskerville;">Affecting the extracted data: the temp variable (e.g., "student") refers to each element in turn, so we can transform it in the expression - such as rendering the output in upper case.  Strings are immutable, so .upper() returns a new string and the original list is left unchanged.  Using the model of .upper(), we can practice all the other string manipulation methods.</p>

In [8]:
upper_case_output = [s.upper() for s in all_students]
print(upper_case_output)

['MING', 'かちとくに', 'סר', 'TOM', 'JOSÉ', 'MARIA', 'LYNNE', 'DAVE']
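
<p>Following the .upper() model, here is a short sketch of a few other string operations inside a comprehension.  The shortened list and the variable names are only illustrative.</p>

```python
all_students = ['Ming', 'Tom', 'José', 'Maria', 'Lynne', 'Dave']

lower_case = [s.lower() for s in all_students]                     # transform each item
initials = [s[0] for s in all_students]                            # indexing works, too
name_lengths = [len(s) for s in all_students]                      # any function call is fine
starts_with_m = [s for s in all_students if s.startswith('M')]     # a method used as the filter

print(lower_case)
print(initials)
print(name_lengths)
print(starts_with_m)   # ['Ming', 'Maria']
```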


<p style='font-size:24px;color:red;font-family:Baskerville;'>
More conditions: integrating the <i>if ... else</i> in a comprehension:</p>
<p>Compare how 'Tom' is replaced with 'José' against the original students list: the 25 was preceded by 'Tom', but now by 'José'.</p>

In [11]:
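# when the if ... else chooses the output value, it goes *before* the for
test_list = [student if student != 'Tom' else 'José' for student in students]
# without the else, that form is a SyntaxError; a bare if (a filter) belongs after the iterable:
#test_list = [student if student != 'Tom' for student in students]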
test_list

['José', 25, 'Ming', 32, 'Toby', 18]
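
<p>The two positions of <code>if</code> do different jobs.  This sketch puts them side by side, using the students list from earlier; the result names are illustrative.</p>

```python
students = ['Tom', 25, 'Ming', 32, 'Toby', 18]

# transform: if ... else sits before the for and picks a value for every item
replaced = [s if s != 'Tom' else 'José' for s in students]

# filter: a bare if sits after the iterable and drops items that fail the test
names_only = [s for s in students if isinstance(s, str)]

# both at once: the trailing if filters, the leading if ... else transforms what is kept
names_replaced = ['José' if s == 'Tom' else s for s in students if isinstance(s, str)]

print(replaced)        # ['José', 25, 'Ming', 32, 'Toby', 18]
print(names_only)      # ['Tom', 'Ming', 'Toby']
print(names_replaced)  # ['José', 'Ming', 'Toby']
```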

<p style="font-size:32px; color:red; font-family:Baskerville;">Using integers example</p>

In [145]:
# very common to use the range() object for testing - especially when running algorithm complexity tests.

my_ints = [i for i in range(12)] # notice the "12" is not included.
my_ints

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

<p style="font-size:24px; color:red; font-family:Baskerville;">Using integers example: adding conditions</p>

In [146]:
# we want numbers that are < 7
low_ints = [i for i in range(10) if i < 7]
print(low_ints)

[0, 1, 2, 3, 4, 5, 6]


<p><b>Experiment &amp; Practice</b>  Please be sure you can create and use <i>if</i> ... statements in list comprehensions.  Once you have a solid foundation, you&rsquo;ll be confident with the variations Python will offer.  Practice each of the cells: make variations, and deliberately try to cause an error to learn what kinds of errors (aka &lsquo;exceptions&rsquo;) can be created and how to address them.  Once you have a set of skills that work for you, it&rsquo;s recommended that you start to keep a record of them - that&rsquo;s both the beginning of your own how-to manual and your own library of code to import for your future projects.</p>
<hr />

<p style="color:red;font-size:36px;font-family:Baskerville;line-height:40px;">Sort and Sorting</p>
<blockquote><p>In real practice, it's common to get data [from a file, from a live stream, from a URL, from an SQL server ... doesn't matter - it's all just data] ... once you have access to data, you need to store them someplace in order to (a) remove data that aren't applicable, (b) check the domain and range of the data, (c) ensure the encoding of the data is the same throughout (for integration, sorting, testing, whatever else you'll do), (d) address missing values, and (e) make sure the amount of data coming in doesn't overwhelm your computer.  Consequently, depending on what you're doing, the rule-of-thumb is to grab a chunk of data (from something called <i>chunking theory</i>) and &ldquo;clean&rdquo; the data [as suggested above].  So we look for ways to explore the data and manipulate them, and when we're satisfied the data set is ready for real analysis, then we'll copy those data to a new object.</p>
<p>This means we'll probably have some <i>original</i> data, say "A", and we'll make a copy of A into B.  If we simply write B = A, then B is just a second name for the same object: what happens to A will happen to B and vice versa.  Important note: how the original data are stored in RAM is one thing; how those data are presented <i>on the screen</i> can be entirely different - the source data may not be affected at all - it's just their presentation that appears different.  That's why we'll manipulate the list (and other data structures) for our exploration of the data - then we can copy the manipulated data output to a <i>new</i> structure - a new object with its own new name (in name space) and in object space.</p></blockquote>
<p>Tech side note: if you studied CS you may be thinking of SLLs (singly linked lists) and DLLs (doubly linked lists).  CPython's list is neither: it is a dynamic array of references, so inserting or removing items in the middle shifts the references that follow.</p>

<p style='font-size:24px;color:red;font-family:Baskerville;'>
    Sort versus Sorted
</p>

In [147]:
# let's experiment: a list then sort it ... with sort()
list1 = ['a', 'b', 4, 'cat', 'alpha', 0, '99']
print(f"List 1: {list1} \n {id(list1)}")

List 1: ['a', 'b', 4, 'cat', 'alpha', 0, '99'] 
 4569243648


In [148]:
# NOTE! trying to *sort()* a list that combines different data types will
# throw an exception... not what we want.
list1.sort()

# could you write a list comprehension that'd convert 
# the ints to strings before sorting?

TypeError: '<' not supported between instances of 'int' and 'str'
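
<p>One possible answer to the question in the cell above, as a sketch: convert every item to a string first, or leave the items alone and only compare them as strings.  The name <code>as_strings</code> is illustrative.</p>

```python
list1 = ['a', 'b', 4, 'cat', 'alpha', 0, '99']

# option 1: a comprehension converts every item to str, then we sort the new list
as_strings = [str(item) for item in list1]
print(sorted(as_strings))       # ['0', '4', '99', 'a', 'alpha', 'b', 'cat']

# option 2: keep the original items and only *compare* them as strings
print(sorted(list1, key=str))   # [0, 4, '99', 'a', 'alpha', 'b', 'cat']
```

<p>Either way the comparison is alphabetical, not numeric: '99' sorts after '4' only because '9' comes after '4' as a character.</p>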

In [149]:
# let's try this again....
list1 = ['a', 'b', 'f', 'z', '6']
print(f"List 1: {list1} \n {id(list1)}")

list1.sort()
print(f"List 1: {list1} \n {id(list1)}")

List 1: ['a', 'b', 'f', 'z', '6'] 
 4569440704
List 1: ['6', 'a', 'b', 'f', 'z'] 
 4569440704


In [150]:
list1_sorted = sorted(list1)
print(list1_sorted)
print(f"{id(list1_sorted)}, Same data, same sort - but a different object, confirmed by the id()")

['6', 'a', 'b', 'f', 'z']
4569248896, Same data, same sort - but a different object, confirmed by the id()


<p style='font-size:24px;color:red;font-family:Baskerville;'>
Planning for more complicated list manipulations: adding a function.
</p>
<p>Here we're anticipating the use of more than 1 list, lambda expressions, and the built-in map() function.</p>
<p>Before moving to working with more than 1 list, sometimes we need to do more preparation or have some criterion we want to use to include or exclude some data.  Here's the idea of passing a function to a sort as a parameter.  When we move to 2+ lists, we'll want to use the map() function.</p>

In [151]:
# okay, so we can use ints and strings and whatnot ... 
# how we treat the -output- can be a method or a function.
# here are two examples: reversing in place and replacing an item by index

new_list = ['z','b','e','3','99','o','f']
print(f"new_list (before reversing) = {new_list}, id = {id(new_list)}")

new_list (before reversing) = ['z', 'b', 'e', '3', '99', 'o', 'f'], id = 4569249088


In [152]:
new_list.reverse()
print(f"Reversed list: {new_list}, id {id(new_list)}")

Reversed list: ['f', 'o', '99', '3', 'e', 'b', 'z'], id 4569249088


In [153]:
new_list[0] = '000A'
print(f" changing data at 0: {new_list}, id {id(new_list)}")

 changing data at 0: ['000A', 'o', '99', '3', 'e', 'b', 'z'], id 4569249088
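
<p>The idea of handing a function to a sort can be sketched with the <code>key</code> parameter, which both sorted() and .sort() accept.  The list of names here is made up for illustration.</p>

```python
names = ['tom', 'Ming', 'dave', 'Lynne']

# default ordering: upper-case letters sort before lower-case ones
print(sorted(names))                      # ['Lynne', 'Ming', 'dave', 'tom']

# key= takes a function that is applied to each item before comparing
print(sorted(names, key=str.lower))       # ['dave', 'Lynne', 'Ming', 'tom']

# a lambda is a small unnamed function; here we sort by length
by_length = sorted(names, key=lambda name: len(name))
print(by_length)                          # ['tom', 'Ming', 'dave', 'Lynne']

# .sort() takes the same parameters but changes the list in place
names.sort(key=str.lower, reverse=True)
print(names)                              # ['tom', 'Ming', 'Lynne', 'dave']
```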



<hr />
<p style='font-size:24px;color:red;font-family:Baskerville;'>Copying, Joining, and other List Methods</p>
<p>In this section we consider real-world practices.  Say you're to integrate data from different data sets (for instance, there are various offices, labs, and companies whose data you have to inspect and later integrate for analysis).</p>
<p style="font-size:18px;color:red;font-family:Baskerville;">Using the + sign.</p>

In [154]:
lab_1 = [1, 5, 3, 2, 4, 1, 2]
lab_2 = [9, 10, 30, 1, 4, 2, 1]

# let's test ...
lab_1_copy = lab_1.copy()
print(f"Lab_1 id is {id(lab_1)} and the copy is {id(lab_1_copy)}")

print(f"\nCool.  Now we have 3 different list objects.  Let's explore ... ")

# USING THE + SIGN
all_labs = lab_1 + lab_2
print(f"\nlab_1: {lab_1}, id = {id(lab_1)}")
print(f"\nlab_2: {lab_2}, id = {id(lab_2)}")
print(f"\nall_labs: {all_labs}, id = {id(all_labs)}\n")

Lab_1 id is 4568768064 and the copy is 4568825792

Cool.  Now we have 3 different list objects.  Let's explore ... 

lab_1: [1, 5, 3, 2, 4, 1, 2], id = 4568768064

lab_2: [9, 10, 30, 1, 4, 2, 1], id = 4568772288

all_labs: [1, 5, 3, 2, 4, 1, 2, 9, 10, 30, 1, 4, 2, 1], id = 4556374208



<p style="font-size:18px;color:red;font-family:Baskerville;">Using the extend() method.</p>

In [155]:
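# careful: extend() changes lab_1 in place and returns None,
# so berkeley_labs ends up as None, not as the combined list
berkeley_labs = lab_1.extend(lab_2)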
print(f"berkeley_labs = {berkeley_labs} with id {id(berkeley_labs)}")

print(f"lab_1 = {lab_1} with id {id(lab_1)}")
print(f"lab_2 = {lab_2} with id {id(lab_2)}")

berkeley_labs = None with id 4515636896
lab_1 = [1, 5, 3, 2, 4, 1, 2, 9, 10, 30, 1, 4, 2, 1] with id 4568768064
lab_2 = [9, 10, 30, 1, 4, 2, 1] with id 4568772288


In [156]:
print("A different technique - just extending lab_1 with lab_2's contents... ")
print(f"lab_1 = {lab_1} with id {id(lab_1)}")
print(f"lab_2 = {lab_2} with id {id(lab_2)}")

lab_1.extend(lab_2)

print(f"\n\nApplying the .extend() to lab_1: {lab_1}, id {id(lab_1)}")

A different technique - just extending lab_1 with lab_2's contents... 
lab_1 = [1, 5, 3, 2, 4, 1, 2, 9, 10, 30, 1, 4, 2, 1] with id 4568768064
lab_2 = [9, 10, 30, 1, 4, 2, 1] with id 4568772288


Applying the .extend() to lab_1: [1, 5, 3, 2, 4, 1, 2, 9, 10, 30, 1, 4, 2, 1, 9, 10, 30, 1, 4, 2, 1], id 4568768064


<p>Pay attention, then, to the creation of a <i>new</i> object that combines two lists versus a single list that is extended by a different list.  Notice that all elements in both lists are considered "siblings" - they're all in the same dimension and can be accessed by their location (index) number.</p>

<p style="font-size:18px;color:red;font-family:Baskerville;">Using the append() method.</p>
<p>What do you do when you want to integrate data from 2+ sets <i>but</i> want to preserve the integrity of the added lists?  In other words, you'll end up with a single list object - but that object will contain "sub-lists" - a multidimensional list.</p>

In [157]:
list_a = ['Paris', 'Antibes', 'Nice', 'Menton']
list_b = ['Rome', 'Naples', 'Bari']
print(f"list_a = {list_a}.  id {id(list_a)}")
print(f"list_b = {list_b}.  id {id(list_b)}\n\nNow appending ... ")

list_a.append(list_b)
print(f"list_a = {list_a}. \n\t id {id(list_a)}")

print(list_a[4][0])

list_a = ['Paris', 'Antibes', 'Nice', 'Menton'].  id 4569267776
list_b = ['Rome', 'Naples', 'Bari'].  id 4556347008

Now appending ... 
list_a = ['Paris', 'Antibes', 'Nice', 'Menton', ['Rome', 'Naples', 'Bari']]. 
	 id 4569267776
Rome


<p>A note about indexing styles ... Notice that the resulting list_a has a "sub-list" at position 4.  That means if we use multidimensional indexing (e.g. [4][0]) we'll return "Rome"; but if we use this on a non-sub-list (like elements 0-3), then using [3][0] will return the letter 'M'.</p>

In [158]:
print(list_a[4][0])
print(list_a[3][0])

Rome
M


<p style='font-size:24px;color:red;font-family:Baskerville;'>Other useful methods</p>
<p>We&rsquo;re still using the list object as our foundation.  So let&rsquo;s look at a few other options and discuss why they&rsquo;re useful.</p>
<ol>
    <li>How many items in my list? <code>len(<i>my_list</i>)</code></li>
    <li>How many times does 'x' appear in my list? <code><i>my_list</i>.count('x')</code></li>
    <li>Where <i>is</i> 'x' in my list? <code><i>my_list</i>.index('x')</code> returns the index of the first match.</li>
    <li>But now I need to insert 'Y' <i>before</i> 'x'. <code><i>my_list</i>.insert(<i>index</i>, 'Y')</code>.  Notice we have to know what the value of index is first.  So to save a step, use index() as the parameter (see below).</li>
    <li>Hmmm, I don't like 'Y' there.  Let's remove it.  <code><i>my_list</i>.remove('Y')</code>  Notice we don't need to find the index.  Python searches, and the first matching element is removed; if there is no match, a ValueError is raised.</li>
    <li>pop() is your friend.  pop() with no parameter removes (and returns) the <i>final</i> element; pass it an index number and that's your target ... <code><i>my_list</i>.pop(<i>index</i>)</code></li>
</ol>

In [159]:
testo = ['a','b','x','z']
print(testo)
testo.insert( testo.index('x'), 'Y')
print(testo)

print("-"*30)
for i in range(0, len(testo)):
    testo.pop()
    print(f"i = {i}, {testo}")

['a', 'b', 'x', 'z']
['a', 'b', 'Y', 'x', 'z']
------------------------------
i = 0, ['a', 'b', 'Y', 'x']
i = 1, ['a', 'b', 'Y']
i = 2, ['a', 'b']
i = 3, ['a']
i = 4, []
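
<p>The methods in the numbered list above can be tried in one go.  This is only a sketch with a throwaway list; the value 'q' is deliberately absent to show the error case.</p>

```python
my_list = ['a', 'x', 'b', 'x']

print(len(my_list))          # 4 items in the list
print(my_list.count('x'))    # 2 occurrences of 'x'
print(my_list.index('x'))    # 1, the index of the *first* 'x'

my_list.insert(my_list.index('x'), 'Y')   # put 'Y' before the first 'x'
print(my_list)               # ['a', 'Y', 'x', 'b', 'x']

my_list.remove('Y')          # removes the first match, no index needed
print(my_list)               # ['a', 'x', 'b', 'x']

last = my_list.pop()         # removes and returns the final element
first = my_list.pop(0)       # removes and returns the element at index 0
print(last, first, my_list)  # x a ['x', 'b']

try:
    my_list.remove('q')      # 'q' is not in the list
except ValueError as err:
    print(f"remove() failed: {err}")
```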


<p>&nbsp;&nbsp;&nbsp;</p>
    <hr />
<p style="color:red;font-size:36px;line-height:40px;font-family:Baskerville, serif;">
    Tuples</p>
<p>Tuples are created with <code>( )</code> parentheses and are intended to store multiple items ... and are <b>unchangeable</b> ... but if you stick a list (a mutable object) into a tuple, you <i>can</i> change the list's contents but <i>not</i> any of the tuple's.</p>
<p>Important: once a tuple has been created, it cannot be changed - no additions, no changes, no removals.  [This has to do with where and how the tuple object is stored in the heap in RAM.]  And since tuples use indexing (as lists do), <b>duplicate values are ok</b>.</p>


In [160]:
# number of times this artist hit #1 in 1980
music_tuple = ("Dianna Ross", 12, "Bee-Gees", 15, "Donna Summer", 4)
print(music_tuple)

('Dianna Ross', 12, 'Bee-Gees', 15, 'Donna Summer', 4)


<p style="font-family:Baskerville; color:red; font-size:18px;">
    Use indexing (including negative indexing; remember string slicing?) as well as <code>in</code> ... notice the Python syntax pattern emerging?</p>

In [161]:
if ('Dianna Ross') in music_tuple:
    print("Can you believe she's still performing?")

Can you believe she's still performing?


In [162]:
# [-1] is simply the final element; it happens to be the lowest count in this tuple
print(f"What's the lowest number of hits for this group?  {music_tuple[-1]}")

What's the lowest number of hits for this group?  4


In [163]:
print(f"But who is that?  {music_tuple[-2]}") 
print(f"But who is that with name?  {music_tuple[-2:]}")

print(f"What is the range of the #s?  {music_tuple[::-2]}")

But who is that?  Donna Summer
But who is that with name?  ('Donna Summer', 4)
What is the range of the #s?  (4, 15, 12)


<p style="font-family:Baskerville; color:red; font-size:18px;">
Adding data to a tuple?</p>
<p>Yes, by converting an immutable object to a mutable one ... tuple to list and then back to tuple.  Note that this is computationally pretty expensive, since every conversion copies all the items.</p>
<p>Or, incestuously it seems, you can <b>add a tuple to a tuple</b>.  [The += does not modify the tuple in place: it builds a new tuple and rebinds the name to it.]</p>

In [164]:
tuple_a = ("Switzerland","Greece","Russia","Austria") #places where too much smoking
tuple_b = ("Denmark",) # notice that a one-item tuple *must* have the trailing comma; ("Denmark") is just a string

tuple_a += tuple_b

print(tuple_a)

('Switzerland', 'Greece', 'Russia', 'Austria', 'Denmark')
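
<p>Both routes for &ldquo;changing&rdquo; a tuple can be sketched together.  The shortened tuple and the names <code>original</code>, <code>as_list</code> and <code>rebuilt</code> are illustrative.</p>

```python
tuple_a = ("Switzerland", "Greece")
original = tuple_a                 # keep a second name for the starting tuple

# route 1: += builds a *new* tuple and rebinds the name tuple_a
tuple_a += ("Denmark",)
print(tuple_a)                     # ('Switzerland', 'Greece', 'Denmark')
print(original)                    # ('Switzerland', 'Greece'), untouched
print(tuple_a is original)         # False

# route 2: tuple -> list -> tuple, when items must be changed or removed
as_list = list(original)           # a mutable copy
as_list[0] = "Sweden"              # item assignment works on the list
as_list.append("Austria")
rebuilt = tuple(as_list)           # a new, immutable tuple
print(rebuilt)                     # ('Sweden', 'Greece', 'Austria')
```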


<p style="font-family:Baskerville; color:red; font-size:18px;">
&ldquo;Unpacking&rdquo; or &ldquo;Right Tuples&rdquo; - attaching names to our data for easier access.
</p>
<p>Important to note that the names on the left-hand side <i>cannot</i> have quotes.  They're not strings but variable names: each one becomes a <i>reference</i> (like a pointer) to the corresponding item in the tuple, so the number of names must match the number of items.</p>

In [168]:
# start with a tuple
cats = ("Bix", "Bunny", "BabyKitty", "Suky") # yes, my actual cats

(tabby, siamese, orange_tabby, calico) = cats  # one name per cat, assigned in order

print(orange_tabby)

BabyKitty


<p>Clearly I'm trying to draw parallels to what you know about strings and Python's interesting way of viewing data.  If we can use + for two string vars, why not use other math symbols for tuples?</p>

In [171]:
print(f"Using the + sign - for curious additions!")
print((cats + tuple_a))

print(f"\nLike the Time Warp ... ")
rockyhorror = ("Time", "Warp")

print(f"Let's do the {rockyhorror * 3} one more time!")

Using the + sign - for curious additions!
('Bix', 'Bunny', 'BabyKitty', 'Suky', 'Switzerland', 'Greece', 'Russia', 'Austria', 'Denmark')

Like the Time Warp ... 
Let's do the ('Time', 'Warp', 'Time', 'Warp', 'Time', 'Warp') one more time!


<p>&nbsp;&nbsp;&nbsp;</p>
    <hr />
<p style="color:red;font-size:36px;line-height:40px;font-family:Baskerville, serif;">
    Sets</p>

<p>Sets are <b>mutable</b> (you can add and remove elements, although the elements themselves must be immutable, hashable values), unordered (which means the data may appear in a different order each time they are displayed), and <b>unindexed</b>.  They cannot hold duplicates because data are located by their unique values.</p>
<p>So if you have a list of lots of data and want to identify each unique value, convert your list into a set and there you go!</p>
<p>Sets use <code>{ }</code> around their elements.  Note that an empty <code>{}</code> creates a dictionary; use <code>set()</code> for an empty set.</p>

In [1]:
set_1 = {'vino', 'Italian', 'vin', 'French', 'wine', 'English', 'Wein', 'German'}
print(set_1)

for drinks in set_1:
    print(drinks)
    

# rats, forgot Polish
set_1.add("Wino")
set_1.add("Polish")
for drinks in set_1:
    print(drinks)

{'Wein', 'vin', 'French', 'German', 'wine', 'vino', 'English', 'Italian'}
Wein
vin
French
German
wine
vino
English
Italian
Wein
vin
French
German
Polish
wine
Wino
vino
English
Italian


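<p>Note that set uses slightly different commands: <code>discard()</code> for <b>deleting an element</b> (<code>remove()</code> also works, but raises a KeyError if the element is missing).  Tho like its sibling data structures, you can use <code>pop()</code> to remove items; with a set, <code>pop()</code> takes no argument and removes an arbitrary element.  <br />
    To <b>delete a set</b> use <code>del</code>, e.g., <code>del my_set</code>
    </p>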
<hr />
<p style="color:red;font-size:16px;font-family: Baskerville;">Examples</p>
<p>Here we want to explore sets and converting sets to other data types when we need indexing or a fixed order.  Note that <code>set</code> has features that are more akin to SQL and set theory (examples further below).</p>

In [2]:
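# careful: drinks is the leftover loop variable from the cell above (the last
# string visited), not the set.  A bare generator expression is lazy, so print()
# shows the generator object rather than its items.
print(x for x in drinks)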

<generator object <genexpr> at 0x10496b840>


In [5]:
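# drinks is a string, so iterating over it yields its characters;
# and despite the name, the [ ] make new_set a list, not a set
new_set = [x for x in drinks]
print(new_set)

# drinks[0] is the first character only
new_set_2 = [x for x in drinks[0]]
print(new_set_2)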


['I', 't', 'a', 'l', 'i', 'a', 'n']
['I']


In [6]:
new_set_3 = {x for x in drinks}
print(f"new set and type: {new_set_3}, {type(new_set_3)}")


new set and type: {'t', 'I', 'l', 'n', 'i', 'a'}, <class 'set'>
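
<p>For comparison, a sketch of a set comprehension that iterates over the set itself rather than over the leftover string.  The result names are illustrative, and sorted() is used only to get a stable display order.</p>

```python
set_1 = {'vino', 'Italian', 'vin', 'French', 'wine', 'English', 'Wein', 'German'}

# each x is now a whole word from the set
word_lengths = {len(word) for word in set_1}     # duplicates collapse automatically
short_words = {word.lower() for word in set_1 if len(word) <= 4}

print(sorted(word_lengths))   # [3, 4, 6, 7]
print(sorted(short_words))    # ['vin', 'vino', 'wein', 'wine']
print(type(short_words))      # <class 'set'>
```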


In [15]:
a_set = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
new_tuple_from_set = () # a tuple
new_list_from_set = [] # a list

print(type(new_tuple_from_set))
print(type(new_list_from_set))

# this won't work - tuples aren't mutable, and they have no add() method
for x in a_set:
    new_tuple_from_set.add(x)
    

<class 'tuple'>
<class 'list'>


AttributeError: 'tuple' object has no attribute 'add'

In [19]:
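# note: the output below shows the values four times, apparently because this cell
# was re-run without resetting new_list_from_set; set() then removes the duplicates
for x in a_set: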
    new_list_from_set.append(x)
print(f"new_list_from_set {new_list_from_set} \n{type(new_list_from_set)}")

new_a_set = set(new_list_from_set)
print(f"new_a_set {new_a_set} \n{type(new_a_set)}")

new_list_from_set [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10] 
<class 'list'>
new_a_set {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} 
<class 'set'>


<hr />
<p style="font-family:Baskerville; color:red; font-size:18px;">Set Comprehensions</p>

In [34]:
# start off with some lists ... 
students = ['Tom', 'Ming', 'Fifi', 'Pine']  # lists from some output source ... 
grades = [90,59,62,42]

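test1 =  {i for i in grades if i > 0}  # define test1 before printing it
print(f"Output for creating a new set with condition: {test1}\n")

# two for clauses pair every grade with every student (4 x 4 = 16 tuples)
test2 =  {(i,j) for i in grades for j in students if i > 0}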
print(f"\nOutput for creating a new set from two lists with condition:\n {test2}")

Output for creating a new set with condition: {90, 59, 42, 62}


Output for creating a new set from two lists with condition:
 {(42, 'Fifi'), (42, 'Pine'), (59, 'Fifi'), (59, 'Pine'), (42, 'Ming'), (59, 'Ming'), (62, 'Ming'), (90, 'Ming'), (42, 'Tom'), (59, 'Tom'), (62, 'Fifi'), (90, 'Fifi'), (90, 'Pine'), (62, 'Pine'), (62, 'Tom'), (90, 'Tom')}
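
<p>If the intent is to pair each grade with <i>its own</i> student rather than with every student, zip() does that.  This sketch assumes the two lists are in matching order; the pass mark of 60 is a made-up threshold for illustration.</p>

```python
students = ['Tom', 'Ming', 'Fifi', 'Pine']
grades = [90, 59, 62, 42]

# two for clauses: every combination (a Cartesian product)
all_pairs = {(g, s) for g in grades for s in students}
print(len(all_pairs))      # 16

# zip: first with first, second with second, ...
matched = {(g, s) for g, s in zip(grades, students)}
print(len(matched))        # 4

# a condition on the paired values (60 is an assumed pass mark)
passed = {s for g, s in zip(grades, students) if g >= 60}
print(sorted(passed))      # ['Fifi', 'Tom']
```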


<hr />
<p style="font-family:Baskerville; color:red; font-size:18px;">Set Intersection</p>
<p><code>intersection()</code> is specific to sets among our data structures.  Here&rsquo;s an example that returns a new set containing only the elements shared by 2 or more sets.  It's worth using this example and experimenting with other set methods.</p>

In [44]:
my_vacation_opts = {"Newport", "Antibes", "München", "Calabria", "Lake Como", "Monterey"}
daves_vacation_opts = {"St. John", "Québec", "Hong Kong", "Perth", "Newport"}
lynns_vacation_opts = {"Québec", "Warsaw", "München", "Newport"}

# a little pre-work test:
print(f"how many options in my set? {len(my_vacation_opts)}, {id(my_vacation_opts)}")

destination = my_vacation_opts.intersection(daves_vacation_opts)
print(destination)

final_dest = my_vacation_opts.intersection(daves_vacation_opts, lynns_vacation_opts)

print(f"\nWell, I guess we're off to destination {final_dest}\n")

# and a little housekeeping - clean-up unwanted data
my_vacation_opts.intersection_update(daves_vacation_opts, lynns_vacation_opts)
print(f"The only shared item is: {my_vacation_opts}\n")

print(f"\nBut notice the status now of my_vacation_opts: {my_vacation_opts}\n {len(my_vacation_opts)}")

# a little post-work test:
print(f"Now how many options in my set? {len(my_vacation_opts)}, {id(my_vacation_opts)}")
# so we see the original set is altered.  

how many options in my set? 6, 4377424416
{'Newport'}

Well, I guess we're off to destination {'Newport'}

The only shared item is: {'Newport'}


But notice the status now of my_vacation_opts: {'Newport'}
 1
Now how many options in my set? 1, 4377424416
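
<p>As a starting point for experimenting with the other set methods, here is a sketch using two smaller, made-up sets.  The operators shown are equivalent to the intersection(), union(), difference() and symmetric_difference() methods, and none of them alters the original sets.</p>

```python
mine = {"Newport", "Antibes", "München"}
daves = {"St. John", "Québec", "Newport"}

common = mine & daves          # intersection: in both
either = mine | daves          # union: in at least one
only_mine = mine - daves       # difference: in mine but not in daves
not_shared = mine ^ daves      # symmetric difference: in exactly one

print(common)                  # {'Newport'}
print(len(either))             # 5
print(sorted(only_mine))       # ['Antibes', 'München']
print(len(not_shared))         # 4
print(len(mine))               # 3, the original set is unchanged
```

<p>The <code>_update</code> variants (such as intersection_update() above) are the ones that modify the set in place.</p>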


<p>&nbsp;&nbsp;&nbsp;</p>
    <hr />
<p style="color:red;font-size:36px;line-height:40px;font-family:Baskerville, serif;">
Dictionaries</p>
<p>By this point, you&rsquo;re probably tired of these variations and slight differences in syntax.  So let&rsquo;s focus on dictionaries - an ultra-important feature.</p>
<p>The Dictionary <code>key:value</code> pair is the equivalent of the name:value pair used on the web - because sending data over a network usually means we can't preserve the data type.  So, we need to provide some means of associating a key (or variable name) with the data.  That's why in search engines you'll see ?q=xxx  in the URL search address.  The ? is the delimiter between the URL and the name/value pair; usually "q" is used for "query" and then the value to be sought follows the = sign.</p>
<p>Dictionaries are used <i>a lot</i> for sharing data between systems: data are exported from SQL into JSON (Excel, too?); Python scripts then read the .json file into a dictionary and we're off!</p>

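<p><code>new_dict = {<b>key:value</b> <span style="color:green;">for</span> (<b>key, value</b>) <span style="color:green;">in</span> iterable}</code></p>

<p><code>new_dict_conditions = {<b>key:value</b> <span style="color:green;">for</span> (<b>key, value</b>) <span style="color:green;">in</span> iterable <span style="color:green;">if</span> (key, value satisfy a condition)}</code></p>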

In [51]:
# creating a dictionary - and get familiar with its : , syntax:
students = { 
    "Tom": 90,
    "Jane": 90,
    "Ming": 90,
    "BabyKitty": 100
}
print(students)

# my cat, BabyKitty, is pretty smart.  How well is she doing?
# before Python 3.12, the quotes inside the { } must differ from the f-string's
# own quotes, and a backslash escape such as \" is not allowed there.
print(f"What is BabyKitty's score? {students['BabyKitty']}")

{'Tom': 90, 'Jane': 90, 'Ming': 90, 'BabyKitty': 100}
What is BabyKitty's score? 100


In [52]:
dict_1 = {} # create a dictionary

# adding data using the loop approach
for i in range(0, 10):
    # i (from the for line above) is the key - we need a value, too, 
    # so for convenience just double the value of i
    dict_1[i] = i*2
print(f"for version of adding data to {dict_1}")
    

for version of adding data to {0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}


In [53]:
# comprehension version:
dict_1 = {}

dict_1 = {i: i*2 for i in range(0, 10)}
print(dict_1)

# comprehension version WITH CONDITION:  using the "in" operator
primes = [2, 3, 5, 7, 11, 13]

dict_2 = {i: i*2 for i in range(0, 10) if i in primes}
print(dict_2)

{0: 0, 1: 2, 2: 4, 3: 6, 4: 8, 5: 10, 6: 12, 7: 14, 8: 16, 9: 18}
{2: 4, 3: 6, 5: 10, 7: 14}


<hr />
<p style="font-family:Baskerville; color:red; font-size:18px;">Using 2 input lists (for keys and values) with <code>zip</code> to create a new dictionary.
</p>

In [89]:
students = ['Tom', 'Ming', 'Fifi', 'Pine']  # lists from some output source ... 
grades = ['90','89','92','100']

students_and_grades = {} # EMPTY dictionary
for (key, value) in zip(students, grades):
    students_and_grades[key] = value
    
# now imagine we're printing a Registrar's grade list ... 
print(f"Here are the grades for course X100:\n {students_and_grades}")

Here are the grades for course X100:
 {'Tom': '90', 'Ming': '89', 'Fifi': '92', 'Pine': '100'}
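
<p>The loop above can be shortened, and the string grades converted on the way in.  This is a sketch; <code>as_text</code> and <code>as_numbers</code> are illustrative names.</p>

```python
students = ['Tom', 'Ming', 'Fifi', 'Pine']
grades = ['90', '89', '92', '100']

# dict() accepts the (key, value) pairs that zip() produces
as_text = dict(zip(students, grades))
print(as_text)                      # {'Tom': '90', 'Ming': '89', 'Fifi': '92', 'Pine': '100'}

# a comprehension can clean the data while building the dictionary
as_numbers = {name: int(grade) for name, grade in zip(students, grades)}
print(as_numbers)                   # {'Tom': 90, 'Ming': 89, 'Fifi': 92, 'Pine': 100}

# why the conversion matters: strings compare character by character
print(max(grades))                  # '92', not '100'
print(max(as_numbers.values()))     # 100
```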


<hr />
<p style="font-family:Baskerville; color:red; font-size:18px;">Understanding access to items.
</p>

In [58]:
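# two ways to read a value - notice the symbol differences [ ] versus get();
# they agree when the key exists, but for a missing key [ ] raises a KeyError
# while get() returns None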
stocks = {"APPL": 300,
         "UBER": 3,
         "TSLR": 2,
         "NCPL": 500}


print(stocks["APPL"])
print(stocks.get("APPL"))

300
300
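
<p>The difference between [ ] and get() only shows up when a key is missing.  A small sketch, where "MSFT" is a hypothetical key that is deliberately not in the dictionary:</p>

```python
stocks = {"APPL": 300, "UBER": 3}

print(stocks.get("MSFT"))        # None: no error, no value
print(stocks.get("MSFT", 0))     # 0: the second argument is the fallback
print("MSFT" in stocks)          # False: 'in' tests the keys

try:
    stocks["MSFT"]               # square brackets insist that the key exists
except KeyError as err:
    print(f"KeyError: {err}")

missing = stocks.get("MSFT", 0)
```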


In [60]:
# inspect my data by UNIQUE keys:
print(stocks.keys())

# hmmm... let's fix one of the values
stocks["APPL"] = 100
print(stocks)

dict_keys(['APPL', 'UBER', 'TSLR', 'NCPL'])
{'APPL': 100, 'UBER': 3, 'TSLR': 2, 'NCPL': 500}


<hr />
<p style="font-family:Baskerville; color:red; font-size:18px;">Building on the access issue ... all <i>values</i>, all <i>keys</i>, and both
</p>

In [69]:
# ALL VALUES
for money in stocks.values():
    print(f"The values are {money}")
    
# what stocks do I have?
print("\n")
for companies in stocks.keys():
    print(f"Remind me, Smithers, of my investments: {companies}")
    
# how are my investments doing {pretty badly, actually}
print("\n")
for company, money in stocks.items():
    print(f"The SEC is here! What do we got? {company, money}")

The values are 100
The values are 3
The values are 2
The values are 500


Remind me, Smithers, of my investments: APPL
Remind me, Smithers, of my investments: UBER
Remind me, Smithers, of my investments: TSLR
Remind me, Smithers, of my investments: NCPL


The SEC is here! What do we got? ('APPL', 100)
The SEC is here! What do we got? ('UBER', 3)
The SEC is here! What do we got? ('TSLR', 2)
The SEC is here! What do we got? ('NCPL', 500)


<hr />
<p style="font-family:Baskerville;font-size:20px;color:red;">
    Preparing for .json and data integration: nested dictionaries</p>
    <p>Here we imagine a library book catalogue of checked out items:</p>

In [64]:
onloan = {
  "book1" : {
    "name" : "Hamlet",
    "year" : 1559,
    "author": "Shakespeare"
  },
  "book2" : {
    "name" : "La vie en rose",
    "year" : 1977,
    "author": "Piaf, Edith"
  },
  "book3" : {
    "name" : "Italian Riviera",
    "year" : 2002,
    "author": "Rick Steeves"
  }
}

print(onloan)

{'book1': {'name': 'Hamlet', 'year': 1559, 'author': 'Shakespeare'}, 'book2': {'name': 'La vie en rose', 'year': 1977, 'author': 'Piaf, Edith'}, 'book3': {'name': 'Italian Riviera', 'year': 2002, 'author': 'Rick Steeves'}}


<p><code>pop</code> goes the dictionary ... Dictionaries share some methods with lists, like clear(), copy(), pop(), but differ with <b>fromkeys()</b>, <b>get()</b>, <b>items()</b>, <b>keys()</b>, <b>popitem()</b>, <b>setdefault()</b>, <b>update()</b> and finally <b>values()</b>.  This makes sense 'cause dictionaries require <i>unique</i> keys, so before manipulating data we must locate them by <i>key</i>, not by index ... and so usually require a known key and a new/updated value.</p>

In [68]:
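# careful: update() on the outer dictionary adds a new top-level "author" key;
# it does not reach into the nested book dictionaries
onloan.update({"author":"Shakespeare, William"})
onloan

# so the author must be disambiguated by naming the book first ...
onloan['book1']['author'] = "Shakespeare, William"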
print(f"\nProperly updated nested dictionary: {onloan}")


Properly updated nested dictionary: {'book1': {'name': 'Hamlet', 'year': 1559, 'author': 'Shakespeare, William'}, 'book2': {'name': 'La vie en rose', 'year': 1977, 'author': 'Piaf, Edith'}, 'book3': {'name': 'Italian Riviera', 'year': 2002, 'author': 'Rick Steeves'}, 'author': 'Shakespeare, William'}
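
<p>A sketch of tidying up and then comprehending over the nested dictionary.  It starts from the state shown above, including the stray top-level "author" key; the names <code>titles</code> and <code>recent</code> and the 1900 cut-off are illustrative.</p>

```python
onloan = {
    "book1": {"name": "Hamlet", "year": 1559, "author": "Shakespeare, William"},
    "book2": {"name": "La vie en rose", "year": 1977, "author": "Piaf, Edith"},
    "book3": {"name": "Italian Riviera", "year": 2002, "author": "Rick Steeves"},
    "author": "Shakespeare, William",   # the stray key added by update()
}

# pop() removes a key; the second argument avoids a KeyError if it is absent
onloan.pop("author", None)

# key -> one field of each nested record
titles = {book_id: record["name"] for book_id, record in onloan.items()}
print(titles)

# keep whole records that satisfy a condition on a nested value
recent = {book_id: record for book_id, record in onloan.items() if record["year"] > 1900}
print(list(recent))    # ['book2', 'book3']
```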


<hr />
<p style="font-family:Baskerville; color:red; font-size:22px;">Dictionary Comprehensions
</p>

In [83]:
# dict.fromkeys() builds a dictionary from an iterable of keys,
# giving every key the same starting value:
some_students = dict.fromkeys(range(4), True)
print(some_students)

{0: True, 1: True, 2: True, 3: True}


In [86]:
students = ['Tom', 'Ming', 'Fifi', 'Pine']  # lists from some output source ... 
grades = [50, 49, 45, 52]

report = {person:score for (person, score) in zip(students, grades)}
print(f"Here's the score report {report} \n")


# the same dictionary built with a loop, then filtered with a comprehension.
# note: the filter keeps every score above 50; it is not a search for the top score.
new_report = {} # EMPTY dictionary
for (key, value) in zip(students, grades):
    new_report[key] = value
    
finalreport = {person:score for (person,score) in new_report.items() if score > 50}
print(f"Here's the score report {finalreport}")


Here's the score report {'Tom': 50, 'Ming': 49, 'Fifi': 45, 'Pine': 52} 

Here's the score report {'Pine': 52}


<hr />
<p style="font-family:Baskerville;font-size:20px;color:red;">Finding the highest value in a dictionary; or other conditions you need.
</p>

In [88]:
students_test = {'Tom': 50, 'Ming': 49, 'Fifi': 83, 'Pine': 21, 'Dave': 38, 'Toby': 47}

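# a threshold filter: keeps every score above 50 - here only Fifi qualifies,
# which happens to be the top score as well
last_one = {person:score for (person, score) in students_test.items() if score > 50}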

print(last_one)

{'Fifi': 83}
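
<p>When the highest value itself is wanted, whatever it turns out to be, the built-in max() and sorted() do the job without a hard-coded threshold.  A sketch using the same dictionary; the result names are illustrative.</p>

```python
students_test = {'Tom': 50, 'Ming': 49, 'Fifi': 83, 'Pine': 21, 'Dave': 38, 'Toby': 47}

# max() over the keys, compared by their values
top_student = max(students_test, key=students_test.get)
top_score = students_test[top_student]
print(top_student, top_score)    # Fifi 83

# sort the (name, score) pairs by score, highest first, and keep the top three
ranked = sorted(students_test.items(), key=lambda pair: pair[1], reverse=True)
top_three = dict(ranked[:3])
print(top_three)                 # {'Fifi': 83, 'Tom': 50, 'Ming': 49}
```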


<hr />Done. Feb 2, 2023, GB