# Agenda

1. Q&A
2. Dispatch tables
3. Comprehensions (list, dict, set, nested)
4. Functional programming (passing functions as arguments)
5. `lambda`

In [1]:
def a():
    return 'A'

def b():
    return 'B'

while True:
    s = input('Enter function name: ').strip()
    
    if not s:
        break
        
    if s == 'a':
        print(a())
    elif s == 'b':
        print(b())
    else:
        print(f'{s} is not a valid option')

Enter function name: a
A
Enter function name: b
B
Enter function name: x
x is not a valid option
Enter function name: z
z is not a valid option
Enter function name: q
q is not a valid option
Enter function name: w
w is not a valid option
Enter function name: ab
ab is not a valid option
Enter function name: 


# Dispatch table

In [2]:
def a():
    return 'A'

def b():
    return 'B'

funcs = {'a':a,
         'b':b}

while True:
    s = input('Enter function name: ').strip()
    
    if not s:
        break
        
    if s in funcs:
        print(funcs[s]())
    else:
        print(f'{s} is not a valid option')

Enter function name: a
A
Enter function name: b
B
Enter function name: c
c is not a valid option
Enter function name: 
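
The dict above stores the function objects themselves (`a`, not `a()`), so nothing runs until we look one up and add the parentheses. As a rough sketch of the same idea without the `input` loop, the lookup can be wrapped in a helper; `dispatch` is a made-up name for illustration and was not part of the session.

```python
def a():
    return 'A'

def b():
    return 'B'

# values are the functions themselves, not their return values
funcs = {'a': a,
         'b': b}

def dispatch(name):
    # hypothetical helper: find the function for this name, then call it
    func = funcs.get(name)      # dict.get returns None if the key is missing
    if func is None:
        return f'{name} is not a valid option'
    return func()

print(dispatch('a'))
print(dispatch('x'))
```

Using `dict.get` replaces the `if s in funcs` test with a single lookup; either style works.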


# Exercise: Calculator

1. Write two functions, `add` and `mul`, that implement addition and multiplication.
2. Ask the user, repeatedly, to enter a simple math expression with two numbers and either `+` or `*`.
    - If the operator is known, then run the appropriate function, and display the expression and the result.
    - If not, then give the user a message that makes it clear the operator isn't known.
3. Use a dispatch table to implement this functionality.
4. How easy will it be to add new operators and functions?

Example:

    Enter expression: 2 + 5
    2 + 5 = 7
    Enter expression: 3 * 10
    3 * 10 = 30
    Enter expression: 10 / 2
    10 / 2 = (not implemented)

In [6]:
def add(first, second):
    return first + second

def mul(first, second):
    return first * second

def sub(first, second):
    return first - second

funcs = {'+':add,
         '*':mul,
         '-':sub}

while True:
    s = input('Enter expression: ').strip()
    
    if not s:
        break
        
    fields = s.split()
    if len(fields) != 3:
        print('Illegal input; enter an expression as X op Y')
        continue
        
    x, op, y = fields
    x = int(x)
    y = int(y)
    
    if op in funcs:
        result = funcs[op](x, y)
    else:
        result = f'({op} is not implemented)'
        
    print(f'{x} {op} {y} = {result}')
    

Enter expression: 2 + 3
2 + 3 = 5
Enter expression: 
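
On the question of how easy it is to add operators: with a dispatch table, each new operator is one more entry in the dict. The sketch below assumes we are happy to use the ready-made functions in the standard library's `operator` module instead of writing our own; `calc` is a hypothetical helper, not something from the session.

```python
import operator

# operator.add, operator.mul, etc. are ordinary two-argument functions
funcs = {'+': operator.add,
         '*': operator.mul,
         '-': operator.sub,
         '/': operator.truediv}   # a new operator is one more line here

def calc(expression):
    # hypothetical helper: evaluate a string of the form 'X op Y'
    x, op, y = expression.split()
    if op not in funcs:
        return f'({op} is not implemented)'
    return funcs[op](int(x), int(y))

print(calc('2 + 5'))     # 7
print(calc('10 / 2'))    # 5.0
print(calc('2 ** 3'))    # (** is not implemented)
```

The loop that reads input never has to change; only the dict grows.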


In [8]:
# prefix syntax
# + 2 3 4 5
# * 2 3 4 5

def add(*numbers):
    total = 0
    
    for one_number in numbers:
        total += one_number

    return total

def mul(*numbers):
    product = 1
    
    for one_number in numbers:
        product *= one_number
        
    return product

    
funcs = {'+':add,
         '*':mul}

while True:
    s = input('Enter expression: ').strip()
    
    if not s:
        break
        
    op, *str_numbers = s.split()     # str_numbers will be a list
        
    numbers = []
    for one_number in str_numbers:
        if one_number.isdigit():
            numbers.append(int(one_number))
        else:
            print(f'Ignoring {one_number}; not intable')

    if op in funcs:
        result = funcs[op](*numbers)  # turn the list numbers into many individual arguments, from its elements
    else:
        result = f'({op} is not implemented)'
        
    print(f'{op} {numbers} = {result}')   # x and y aren't defined in this version; show the parsed numbers
    

Enter expression: + 2 3 4
+ [2, 3, 4] = 9
Enter expression: * 2 3 4
* [2, 3, 4] = 24
Enter expression: 


# Uses for `*NAME`

- In the definition of a function, `*args` is a tuple, with all positional args no one else wanted
- In calling a function, `*name` must be an iterable, and its elements will be arguments to the function
- In unpacking, `*name` makes `name` a list, with all elements that other variables didn't get.

In [12]:
s = {10, 20, 30}

def mysum(*numbers):
    print(f'{numbers=}')
    total = 0
    
    for one_number in numbers:
        total += one_number
        
    return total

In [13]:
mysum(s)

numbers=({10, 20, 30},)


TypeError: unsupported operand type(s) for +=: 'int' and 'set'

In [14]:
mysum(*s)

numbers=(10, 20, 30)


60

In [15]:
x,*y,z = range(10, 20)

In [16]:
x

10

In [17]:
y

[11, 12, 13, 14, 15, 16, 17, 18]

In [18]:
z

19

In [19]:
w,*x,y,z = range(10, 20)

In [20]:
w

10

In [21]:
x

[11, 12, 13, 14, 15, 16, 17]

In [22]:
y

18

In [23]:
z

19

# Comprehensions

In [24]:
numbers = range(10)

# I want a new list whose elements are the elements of numbers, squared (** 2)
output = []

for one_number in numbers:
    output.append(one_number ** 2)
    
output    

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [26]:
str_numbers = '10 20 30 40 50'

numbers = []
for one_number in str_numbers.split():
    if one_number.isdigit():
        numbers.append(int(one_number))
    else:
        print(f'Ignoring {one_number}; not intable')

numbers        

[10, 20, 30, 40, 50]

In [27]:
# list comprehension
# creates a new list!

[one_number ** 2 for one_number in range(10)]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [28]:
[int(one_item) for one_item in str_numbers.split()]

[10, 20, 30, 40, 50]

In [29]:
[one_number ** 2                # expression 
 for one_number in range(10)]   # iteration

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [30]:
[int(one_item)                         # SELECT -- any Python expression can be here!
 for one_item in str_numbers.split()]  # FROM  -- any Python iterable can be here!

[10, 20, 30, 40, 50]

# Exercises: Comprehensions

1. Define a list of integers. Use `str.join` to join the integers together with spaces. Remember that `str.join` won't work on a list of int; you need to turn each int into a string.
2. Ask the user to enter a sentence.  Use a list comprehension to count the non-whitespace characters in all of the words.

In [35]:
mylist = [10, 20, 30, 40, 50]

' '.join(mylist)

TypeError: sequence item 0: expected str instance, int found

In [32]:
# comprehension check:

# - what do I have as input? list of integers
# - what do I want as output? list of strings
# - do I have an expression that translates from the first to the second? Yes, str

[str(one_item)
 for one_item in mylist]

['10', '20', '30', '40', '50']

In [34]:
' '.join([str(one_item)
             for one_item in mylist])

'10 20 30 40 50'

In [36]:
s = 'This is a fantastic test sentence for our exercise'

In [37]:
len(s)

50

In [38]:
s.split()

['This', 'is', 'a', 'fantastic', 'test', 'sentence', 'for', 'our', 'exercise']

In [40]:
[len(one_word)
 for one_word in s.split()]

[4, 2, 1, 9, 4, 8, 3, 3, 8]

In [41]:
sum([len(one_word)
 for one_word in s.split()])

42

In [42]:
# input is a list of strings
# output is a list of integers, the length of each word
# transform by running len

In [43]:
s = '  a  b  c  d   '

s.split(' ')

['', '', 'a', '', 'b', '', 'c', '', 'd', '', '', '']

In [44]:
s.split()  

['a', 'b', 'c', 'd']

In [45]:
s = 'this is a bunch of words for my class'

s.capitalize()

'This is a bunch of words for my class'

In [46]:
s.title()

'This Is A Bunch Of Words For My Class'

In [47]:
# how can I get the same output as title, but using only capitalize?

[one_word.capitalize()
 for one_word in s.split()]

['This', 'Is', 'A', 'Bunch', 'Of', 'Words', 'For', 'My', 'Class']

In [48]:
' '.join([one_word.capitalize()
         for one_word in s.split()])

'This Is A Bunch Of Words For My Class'

In [51]:
[one_line.split(':')[0]   # get the first field from the list we got from splitting on ':'
 for one_line in open('/etc/passwd')]

['##\n',
 '# User Database\n',
 '# \n',
 '# Note that this file is consulted directly only when the system is running\n',
 '# in single-user mode.  At other times this information is provided by\n',
 '# Open Directory.\n',
 '#\n',
 '# See the opendirectoryd(8) man page for additional information about\n',
 '# Open Directory.\n',
 '##\n',
 'nobody',
 'root',
 'daemon',
 '_uucp',
 '_taskgated',
 '_networkd',
 '_installassistant',
 '_lp',
 '_postfix',
 '_scsd',
 '_ces',
 '_appstore',
 '_mcxalr',
 '_appleevents',
 '_geod',
 '_devdocs',
 '_sandbox',
 '_mdnsresponder',
 '_ard',
 '_www',
 '_eppc',
 '_cvs',
 '_svn',
 '_mysql',
 '_sshd',
 '_qtss',
 '_cyrus',
 '_mailman',
 '_appserver',
 '_clamav',
 '_amavisd',
 '_jabber',
 '_appowner',
 '_windowserver',
 '_spotlight',
 '_tokend',
 '_securityagent',
 '_calendar',
 '_teamsserver',
 '_update_sharing',
 '_installer',
 '_atsserver',
 '_ftp',
 '_unknown',
 '_softwareupdate',
 '_coreaudiod',
 '_screensaver',
 '_locationd',
 '_trustevaluationagent',
 '

In [52]:
# ignore lines starting with #
[one_line.split(':')[0]                 # (3) SELECT
 for one_line in open('/etc/passwd')    # (1) FROM 
 if not one_line.startswith('#')]       # (2) WHERE

['nobody',
 'root',
 'daemon',
 '_uucp',
 '_taskgated',
 '_networkd',
 '_installassistant',
 '_lp',
 '_postfix',
 '_scsd',
 '_ces',
 '_appstore',
 '_mcxalr',
 '_appleevents',
 '_geod',
 '_devdocs',
 '_sandbox',
 '_mdnsresponder',
 '_ard',
 '_www',
 '_eppc',
 '_cvs',
 '_svn',
 '_mysql',
 '_sshd',
 '_qtss',
 '_cyrus',
 '_mailman',
 '_appserver',
 '_clamav',
 '_amavisd',
 '_jabber',
 '_appowner',
 '_windowserver',
 '_spotlight',
 '_tokend',
 '_securityagent',
 '_calendar',
 '_teamsserver',
 '_update_sharing',
 '_installer',
 '_atsserver',
 '_ftp',
 '_unknown',
 '_softwareupdate',
 '_coreaudiod',
 '_screensaver',
 '_locationd',
 '_trustevaluationagent',
 '_timezone',
 '_lda',
 '_cvmsroot',
 '_usbmuxd',
 '_dovecot',
 '_dpaudio',
 '_postgres',
 '_krbtgt',
 '_kadmin_admin',
 '_kadmin_changepw',
 '_devicemgr',
 '_webauthserver',
 '_netbios',
 '_warmd',
 '_dovenull',
 '_netstatistics',
 '_avbdeviced',
 '_krb_krbtgt',
 '_krb_kadmin',
 '_krb_changepw',
 '_krb_kerberos',
 '_krb_anonymous',
 '_asse
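
The numbers (1), (2), (3) in the comments describe the order in which the parts of the comprehension run. A rough equivalent as a plain `for` loop is sketched below; the three sample lines are made up so that the example runs without `/etc/passwd`.

```python
# sample lines standing in for the file (assumed data)
lines = ['# a comment\n',
         'root:x:0:0:root:/root:/bin/bash\n',
         'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n']

output = []
for one_line in lines:                           # (1) FROM
    if not one_line.startswith('#'):             # (2) WHERE
        output.append(one_line.split(':')[0])    # (3) SELECT

print(output)    # ['root', 'daemon']

# the same thing as a comprehension
same = [one_line.split(':')[0]
        for one_line in lines
        if not one_line.startswith('#')]
```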

In [53]:
for one_item in 'abcd':
    print(one_item)

a
b
c
d


In [54]:
for one_item in 5:
    print(one_item)

TypeError: 'int' object is not iterable

In [55]:
'c' in one_item

False

In [56]:
'c' in 5

TypeError: argument of type 'int' is not iterable

In [57]:
!ls *.txt

linux-etc-passwd.txt  myconf.txt  output.txt
mini-access-log.txt   nums.txt	  shoe-data.txt


In [58]:
!head shoe-data.txt

Adidas	orange	43
Nike	black	41
Adidas	black	39
New Balance	pink	41
Nike	white	44
New Balance	orange	38
Nike	pink	44
Adidas	pink	44
New Balance	orange	39
New Balance	black	43


# Exercise: Shoe dicts from shoe data

1. From the file `shoe-data.txt`, I want you to create a list of dicts.
2. Because there are 100 lines in the file (each with three columns -- `brand`, `color`, and `size` -- separated by tabs, `\t`), there will be 100 dicts in the list.
3. Each dict should look like

        {'brand':'Adidas',
         'color':'orange',
         'size':'43'}
     
4. I suggest writing a function for use in the list comprehension's expression.     

In [64]:
def line_to_dict(one_line):
    brand, color, size = one_line.strip().split('\t')
    
    return {'brand': brand,
           'color': color,
           'size': size}

[line_to_dict(one_line)
 for one_line in open('shoe-data.txt')]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [67]:
# other way to do this -- warning! short code

def line_to_dict(one_line):
    return dict(zip(['brand', 'color', 'size'],
                   one_line.strip().split('\t')))


[line_to_dict(one_line)
 for one_line in open('shoe-data.txt')]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [66]:
list(zip('abc', [10,20, 30]))

[('a', 10), ('b', 20), ('c', 30)]

In [68]:
# move the expression back into the comprehension... at your own professional risk

[ dict(zip(['brand', 'color', 'size'],
           one_line.strip().split('\t')))
 for one_line in open('shoe-data.txt')]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [70]:
list(zip([10, 20, 30], (100, 200, 300)))

[(10, 100), (20, 200), (30, 300)]

In [74]:
# normally, zip stops at the shortest argument's length
list(zip([10, 20, 30], (100, 200, 300), 'wxyz'))

[(10, 100, 'w'), (20, 200, 'x'), (30, 300, 'y')]

In [73]:
# new in Python 3.10: strict=True raises ValueError if the arguments have different lengths
list(zip([10, 20, 30], (100, 200, 300), 'wxyz', strict=True))

ValueError: zip() argument 3 is longer than arguments 1-2
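
Besides stopping at the shortest argument and raising with `strict=True`, there is a third option that was not shown in the session: `itertools.zip_longest` keeps going until the longest argument runs out and pads the others. A small sketch, using `0` as an arbitrary fill value:

```python
from itertools import zip_longest

# pad the shorter argument with fillvalue instead of stopping early
pairs = list(zip_longest([10, 20, 30], 'wxyz', fillvalue=0))

print(pairs)    # [(10, 'w'), (20, 'x'), (30, 'y'), (0, 'z')]
```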

# Next up

1. Comprehensions
    1. Set comprehension
    2. Dict comprehension
    3. Nested comprehensions
2. Sorting in Python, and passing functions as args

Resume at 11:05

In [75]:
with open('shoe-data.txt') as f:
    # f.__enter__()
    for one_line in f:
        print(len(one_line), end=' ')
    # f.__exit__()  # with a file -- flushes + closes

17 14 16 20 14 22 13 15 22 21 22 14 17 16 15 16 17 13 15 22 13 14 14 14 16 20 16 20 22 17 22 21 20 14 13 21 14 16 22 21 14 16 16 17 22 14 16 14 17 15 22 13 13 16 21 13 14 15 17 22 15 15 15 14 15 17 13 16 15 20 21 20 15 13 14 14 21 16 15 14 21 20 16 16 22 13 17 14 15 16 14 13 21 14 20 15 22 21 16 14 
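
The commented-out `__enter__` and `__exit__` calls above mark where `with` invokes those methods. To see that ordering without a file, here is a minimal sketch with a made-up class, `Announce`, that records each step; it is only an illustration and not how file objects are implemented.

```python
class Announce:
    # hypothetical class: records when each method runs
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self                  # this is what "as a" receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        return False                 # don't swallow exceptions from the block

with Announce() as a:
    a.events.append('body')

print(a.events)    # ['enter', 'body', 'exit']
```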

In [78]:
s = input('Enter numbers: ').strip()

sum([int(x)
 for x in s.split()])

Enter numbers: 10 20 30 10 20 30


120

In [79]:
# sum only the different/unique numbers

s = input('Enter numbers: ').strip()

sum(set([int(x)
 for x in s.split()]))

Enter numbers: 10 20 30 10 20 30


60

In [80]:
# set comprehension!
# use {} instead of [], and we get a set back
# all elements must be hashable (basically immutable)

{int(x)
for x in s.split()}


{10, 20, 30}
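
The "must be hashable" rule from the comment above can be sketched as follows: integers and tuples can go into a set, while lists cannot. The word list is borrowed from elsewhere in these notes; the variable names are only for illustration.

```python
words = 'this is a bunch of words and is a great example'.split()

# integers are hashable, so this works; duplicates collapse
lengths = {len(one_word) for one_word in words}
print(lengths)

# tuples of hashable values are hashable too
pairs = {(one_word, len(one_word)) for one_word in words}

# lists are mutable and not hashable, so this raises TypeError
try:
    {[one_word, len(one_word)] for one_word in words}
except TypeError as e:
    message = str(e)
    print(f'TypeError: {message}')
```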

In [81]:
!head -30 linux-etc-passwd.txt

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin
mail:x:8:8:mail:/var/mail:/usr/sbin/nologin



news:x:9:9:news:/var/spool/news:/usr/sbin/nologin
uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin
proxy:x:13:13:proxy:/bin:/usr/sbin/nologin
www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin
irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin
gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin

nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin
syslog:x:101:104::/home/syslog:/bin/false
messagebu

# Exercise: What shells?

1. On each line of a passwd file, we have a record describing a user.  The final field is the "shell," the command interpreter.
2. Use a set comprehension to find the different shells in `linux-etc-passwd.txt`.
3. Note that the file contains both comment lines (starting with `#`) and blank/empty lines!  Ignore these two types of lines.

In [85]:
[one_line
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith('#') and one_line.strip()]

['root:x:0:0:root:/root:/bin/bash\n',
 'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n',
 'bin:x:2:2:bin:/bin:/usr/sbin/nologin\n',
 'sys:x:3:3:sys:/dev:/usr/sbin/nologin\n',
 'sync:x:4:65534:sync:/bin:/bin/sync\n',
 'games:x:5:60:games:/usr/games:/usr/sbin/nologin\n',
 'man:x:6:12:man:/var/cache/man:/usr/sbin/nologin\n',
 'lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin\n',
 'mail:x:8:8:mail:/var/mail:/usr/sbin/nologin\n',
 'news:x:9:9:news:/var/spool/news:/usr/sbin/nologin\n',
 'uucp:x:10:10:uucp:/var/spool/uucp:/usr/sbin/nologin\n',
 'proxy:x:13:13:proxy:/bin:/usr/sbin/nologin\n',
 'www-data:x:33:33:www-data:/var/www:/usr/sbin/nologin\n',
 'backup:x:34:34:backup:/var/backups:/usr/sbin/nologin\n',
 'list:x:38:38:Mailing List Manager:/var/list:/usr/sbin/nologin\n',
 'irc:x:39:39:ircd:/var/run/ircd:/usr/sbin/nologin\n',
 'gnats:x:41:41:Gnats Bug-Reporting System (admin):/var/lib/gnats:/usr/sbin/nologin\n',
 'nobody:x:65534:65534:nobody:/nonexistent:/usr/sbin/nologin\n',
 'syslog:x:101:

In [94]:
# set comprehension showing different shells from linux-etc-passwd.txt
{one_line.split(':')[-1].strip()
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith(('#', '\n'))}  # startswith can take a string, or a tuple of strings

{'/bin/bash',
 '/bin/false',
 '/bin/nologin',
 '/bin/sh',
 '/bin/sync',
 '/usr/sbin/nologin'}

In [97]:
# dict comprehension
# creates one dict; the expression describes/defines one key-value pair

words = 'this is a bunch of words and is a great example'.split()

{ one_word : len(one_word)     # two separate expressions -- one for the key, one for the value
 for one_word in words }

{'this': 4,
 'is': 2,
 'a': 1,
 'bunch': 5,
 'of': 2,
 'words': 5,
 'and': 3,
 'great': 5,
 'example': 7}

In [98]:
# flip a dict

d = {'a':1, 'b':2, 'c':3}

{ value : key
  for key, value in d.items() 
}


{1: 'a', 2: 'b', 3: 'c'}

In [100]:
# flip a dict

d = {'a':1, 'b':2, 'c':3, 'd':4, 'e':4, 'f':3, 'g':2}

{ value : key
  for key, value in d.items() 
}


{1: 'a', 2: 'g', 3: 'f', 4: 'e'}
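
In the second flip, several keys share a value, so later keys overwrite earlier ones and only one key per value survives. If we want to keep all of them, one possible approach (a sketch, not something covered in the session) is to collect the keys into lists with a regular loop and `dict.setdefault`:

```python
d = {'a':1, 'b':2, 'c':3, 'd':4, 'e':4, 'f':3, 'g':2}

flipped = {}
for key, value in d.items():
    # create an empty list the first time we see a value, then append to it
    flipped.setdefault(value, []).append(key)

print(flipped)    # {1: ['a'], 2: ['b', 'g'], 3: ['c', 'f'], 4: ['d', 'e']}
```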

In [99]:
!ls *conf*

myconf.txt


In [102]:
!cat myconf.txt

a:10
b:20
c:30


In [103]:
# read from the config file via a dict comprehension

{ one_line.split(':')[0]  :  one_line.split(':')[1].strip()
 for one_line in open('myconf.txt')
}

{'a': '10', 'b': '20', 'c': '30'}

In [106]:
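# if you really insist, we can use := to split each line only once

# := is the walrus (assignment expression) operator
# the `if` here is always true (split with a separator never returns an empty list);
# it exists only to bind `fields`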


{ fields[0]  :  fields[1]
 for one_line in open('myconf.txt')
 if (fields := one_line.strip().split(':'))
}

{'a': '10', 'b': '20', 'c': '30'}
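
Two other ways to split each line only once, sketched here on in-memory lines that mirror `myconf.txt` (the list `lines` is an assumption, so the example runs without the file). The first iterates over a one-element list to bind `fields`; the second hands `dict` the key-value pairs directly.

```python
# stand-in for the contents of myconf.txt
lines = ['a:10\n', 'b:20\n', 'c:30\n']

# a second "for" over a one-element list binds fields without :=
config = {fields[0]: fields[1]
          for one_line in lines
          for fields in [one_line.strip().split(':')]}

# dict() accepts any iterable of two-element sequences
config2 = dict(one_line.strip().split(':')
               for one_line in lines)

print(config)     # {'a': '10', 'b': '20', 'c': '30'}
```

The `dict(...)` version only works when every line splits into exactly two fields.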

# Exercise: Usernames and user IDs

Use a dict comprehension on `linux-etc-passwd.txt` to create a dict where the usernames (index 0) are keys and the user ID (index 2) numbers are values.

In [107]:
!head linux-etc-passwd.txt

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin


In [112]:
{one_line.split(':')[0]     :   one_line.split(':')[2:4]   # slice for user ID + group ID
for one_line in open('linux-etc-passwd.txt')
if not one_line.startswith(('#', '\n'))}

{'root': ['0', '0'],
 'daemon': ['1', '1'],
 'bin': ['2', '2'],
 'sys': ['3', '3'],
 'sync': ['4', '65534'],
 'games': ['5', '60'],
 'man': ['6', '12'],
 'lp': ['7', '7'],
 'mail': ['8', '8'],
 'news': ['9', '9'],
 'uucp': ['10', '10'],
 'proxy': ['13', '13'],
 'www-data': ['33', '33'],
 'backup': ['34', '34'],
 'list': ['38', '38'],
 'irc': ['39', '39'],
 'gnats': ['41', '41'],
 'nobody': ['65534', '65534'],
 'syslog': ['101', '104'],
 'messagebus': ['102', '106'],
 'landscape': ['103', '109'],
 'jci': ['955', '955'],
 'sshd': ['104', '65534'],
 'user': ['1000', '1000'],
 'reuven': ['1001', '1001'],
 'postfix': ['105', '113'],
 'colord': ['106', '116'],
 'postgres': ['107', '117'],
 'dovecot': ['108', '119'],
 'dovenull': ['109', '120'],
 'postgrey': ['110', '121'],
 'debian-spamd': ['111', '122'],
 'memcache': ['113', '124'],
 'genadi': ['1002', '1003'],
 'shira': ['1003', '1004'],
 'atara': ['1004', '1005'],
 'shikma': ['1005', '1006'],
 'amotz': ['1006', '1007'],
 'mysql': ['114', 
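
The cell above stores a slice with both the user ID and the group ID. For the exercise as stated (user ID only), a variation might look like the sketch below, which also converts the ID to an integer. The sample lines are copied from the `head` output above plus one blank line, so that the example runs without the file.

```python
# stand-in for a few lines of linux-etc-passwd.txt
lines = ['# This is a comment\n',
         '\n',
         'root:x:0:0:root:/root:/bin/bash\n',
         'sync:x:4:65534:sync:/bin:/bin/sync\n']

uids = {one_line.split(':')[0]: int(one_line.split(':')[2])   # index 2 is the user ID
        for one_line in lines
        if not one_line.startswith(('#', '\n'))}

print(uids)    # {'root': 0, 'sync': 4}
```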

In [120]:
import random

random.seed(0)
mylist = [[random.randint(-50, 50) for i in range(random.randint(1, 5))],
         [random.randint(-50, 50) for i in range(random.randint(1, 5))],
         [random.randint(-50, 50) for i in range(random.randint(1, 5))],
         [random.randint(-50, 50) for i in range(random.randint(1, 5))],
         [random.randint(-50, 50) for i in range(random.randint(1, 5))]
         ]

In [121]:
mylist  # list of lists!

[[47, 3, -45, -17], [12, 1, 50, -12, 11], [24, -23, 14], [-14, -33], [29]]

In [122]:
# I want a flattened, 1D list -- with integers in it
# nested list comprehension

[one_number
 for one_sublist in mylist
 for one_number in one_sublist]


[47, 3, -45, -17, 12, 1, 50, -12, 11, 24, -23, 14, -14, -33, 29]

In [125]:
[*one_number
 for one_number in mylist]


SyntaxError: iterable unpacking cannot be used in comprehension (3722370174.py, line 1)

In [126]:
[one_number
 for one_sublist in mylist
 for one_number in one_sublist]


[47, 3, -45, -17, 12, 1, 50, -12, 11, 24, -23, 14, -14, -33, 29]

In [128]:
[one_number
 for one_sublist in mylist
    if len(one_sublist) > 3
 for one_number in one_sublist
    if one_number > 0]


[47, 3, 12, 1, 50, 11]
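
A nested comprehension reads top to bottom in the same order as nested `for` loops, with each `if` attached to the loop directly above it. The sketch below spells that out, using the list of lists that `random.seed(0)` produced earlier.

```python
mylist = [[47, 3, -45, -17], [12, 1, 50, -12, 11], [24, -23, 14], [-14, -33], [29]]

output = []
for one_sublist in mylist:
    if len(one_sublist) > 3:              # filter the sublists
        for one_number in one_sublist:
            if one_number > 0:            # filter the numbers
                output.append(one_number)

print(output)    # [47, 3, 12, 1, 50, 11]
```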

In [129]:
!head movies.dat

1::Toy Story (1995)::Animation|Children's|Comedy
2::Jumanji (1995)::Adventure|Children's|Fantasy
3::Grumpier Old Men (1995)::Comedy|Romance
4::Waiting to Exhale (1995)::Comedy|Drama
5::Father of the Bride Part II (1995)::Comedy
6::Heat (1995)::Action|Crime|Thriller
7::Sabrina (1995)::Comedy|Romance
8::Tom and Huck (1995)::Adventure|Children's
9::Sudden Death (1995)::Action
10::GoldenEye (1995)::Action|Adventure|Thriller


# Exercise: Popular movie categories

1. Use a nested comprehension to create a list of movie categories.
2. Hand this list to `Counter`, which will count the number of times each category appears.
3. What were the 5 most common categories for all of these movies?
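
One possible shape for the solution, sketched on three sample lines taken from `movies.dat` so that it runs without the file: split each line on `::`, take the last field, split that on `|`, and pass the flat list to `Counter`. With the real file, `most_common(5)` would answer the third question.

```python
from collections import Counter

# three sample lines in the movies.dat format
lines = ["1::Toy Story (1995)::Animation|Children's|Comedy\n",
         "2::Jumanji (1995)::Adventure|Children's|Fantasy\n",
         "3::Grumpier Old Men (1995)::Comedy|Romance\n"]

categories = [one_category
              for one_line in lines
              for one_category in one_line.strip().split('::')[-1].split('|')]

counts = Counter(categories)       # maps each category to how often it appears
print(counts.most_common(2))       # the two most common (category, count) pairs
```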

In [130]:
!file movies.dat

movies.dat: Unicode text, UTF-8 text


In [131]:
f = open('movies.dat', encoding='UTF-8')

In [133]:
# first attempt: splitting on '::' alone gives every field, not just the categories
[one_field
 for one_line in open('movies.dat', encoding='UTF-8')
 for one_field in one_line.split('::')]

['1',
 'Toy Story (1995)',
 "Animation|Children's|Comedy\n",
 '2',
 'Jumanji (1995)',
 "Adventure|Children's|Fantasy\n",
 '3',
 'Grumpier Old Men (1995)',
 'Comedy|Romance\n',
 '4',
 'Waiting to Exhale (1995)',
 'Comedy|Drama\n',
 '5',
 'Father of the Bride Part II (1995)',
 'Comedy\n',
 '6',
 'Heat (1995)',
 'Action|Crime|Thriller\n',
 '7',
 'Sabrina (1995)',
 'Comedy|Romance\n',
 '8',
 'Tom and Huck (1995)',
 "Adventure|Children's\n",
 '9',
 'Sudden Death (1995)',
 'Action\n',
 '10',
 'GoldenEye (1995)',
 'Action|Adventure|Thriller\n',
 '11',
 'American President, The (1995)',
 'Comedy|Drama|Romance\n',
 '12',
 'Dracula: Dead and Loving It (1995)',
 'Comedy|Horror\n',
 '13',
 'Balto (1995)',
 "Animation|Children's\n",
 '14',
 'Nixon (1995)',
 'Drama\n',
 '15',
 'Cutthroat Island (1995)',
 'Action|Adventure|Romance\n',
 '16',
 'Casino (1995)',
 'Drama|Thriller\n',
 '17',
 'Sense and Sensibility (1995)',
 'Drama|Romance\n',
 '18',
 'Four Rooms (1995)',
 'Thriller\n',
 '19',
 'Ace Ventu