Scoping in Python

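Python looks up a name in four scopes, in this order (LEGB):

1. Local -- names assigned inside the current function (is the variable in `__code__.co_varnames`?)
2. Enclosing -- names local to an outer function that wraps the current one (is the variable in `__code__.co_freevars`?)
3. Globals -- names defined at the top level of the module (is the variable in `globals()`?)
4. Builtins -- names that Python predefines (is the variable in `__builtins__`?)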
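
As a rough sketch of the four lookups, the following uses made-up names (`outer`, `inner`, `x`, `y`, `z`) that are not part of the course material. It checks the `builtins` module rather than `__builtins__`, since `__builtins__` is a CPython implementation detail.

```python
import builtins

x = 'global x'                # defined at the top level, so it lives in globals()

def outer():
    y = 'enclosing y'         # local to outer, enclosing for inner

    def inner():
        z = 'local z'         # local to inner
        # z is local, y is enclosing, x is global, len is a builtin
        return z, y, x, len(z)

    return inner

inner = outer()
result = inner()
print(result)                                 # ('local z', 'enclosing y', 'global x', 7)
print('z' in inner.__code__.co_varnames)      # local
print('y' in inner.__code__.co_freevars)      # enclosing
print(hasattr(builtins, 'len'))               # builtin
```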

In [1]:
dir(__builtins__)

['ArithmeticError',
 'AssertionError',
 'AttributeError',
 'BaseException',
 'BlockingIOError',
 'BrokenPipeError',
 'BufferError',
 'ChildProcessError',
 'ConnectionAbortedError',
 'ConnectionError',
 'ConnectionRefusedError',
 'ConnectionResetError',
 'EOFError',
 'Ellipsis',
 'EnvironmentError',
 'Exception',
 'False',
 'FileExistsError',
 'FileNotFoundError',
 'FloatingPointError',
 'GeneratorExit',
 'IOError',
 'ImportError',
 'IndentationError',
 'IndexError',
 'InterruptedError',
 'IsADirectoryError',
 'KeyError',
 'KeyboardInterrupt',
 'LookupError',
 'MemoryError',
 'ModuleNotFoundError',
 'NameError',
 'None',
 'NotADirectoryError',
 'NotImplemented',
 'NotImplementedError',
 'OSError',
 'OverflowError',
 'PermissionError',
 'ProcessLookupError',
 'RecursionError',
 'ReferenceError',
 'RuntimeError',
 'StopAsyncIteration',
 'StopIteration',
 'SyntaxError',
 'SystemError',
 'SystemExit',
 'TabError',
 'TimeoutError',
 'True',
 'TypeError',
 'UnboundLocalError',
 'UnicodeDecode

In [3]:
x = 100

def myfunc():
    x = 200  # defining a local variable x
    print(f'In myfunc, x = {x}')  # local? yes - we get 200
    
print(f'Before, x = {x}')  # global? yes -- we get 100
myfunc()
print(f'After, x = {x}')   # global? yes -- we get 100

Before, x = 100
In myfunc, x = 200
After, x = 100


In [5]:
x = [10, 20, 30]

def myfunc():
    x[1] = 3456   # we're not assigning to x, so x remains global
    print(f'In myfunc, x = {x}')  
    
print(f'Before, x = {x}')  
myfunc()
print(f'After, x = {x}')   

Before, x = [10, 20, 30]
In myfunc, x = [10, 3456, 30]
After, x = [10, 3456, 30]


In [6]:
myfunc.__code__.co_varnames

()

About functions:
    
1. When we use `def`, we're creating a function object and assigning it to a variable.
2. A function can return any type of data we want.
3. Any variable defined inside of a function is a local variable.
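
A small sketch of the first and third points, using hypothetical names (`greet`, `say_hi`, `message`): the function object can be bound to a second name, and the variable assigned inside it shows up in `co_varnames`.

```python
def greet():
    message = 'Hello!'        # assigned inside the function, so it is local
    return message

say_hi = greet                # no parentheses: bind a second name, don't call

print(say_hi is greet)        # True: one function object, two names
print(say_hi())               # Hello!
print(greet.__code__.co_varnames)
```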

In [7]:
def outside():
    def inside():
        return 'Hello from inside!'
    return inside

In [8]:
outside() 

<function __main__.outside.<locals>.inside()>

In [9]:
f = outside()

In [10]:
type(f)

function

In [11]:
f()

'Hello from inside!'

In [12]:
g = outside()

In [13]:
g()

'Hello from inside!'

In [14]:
id(f)

4594717456

In [15]:
id(g)

4594716736

In [16]:
# closure -- we have a function that has access to the local variables
# in its enclosing function.

def outside(x):    # x here is an enclosing variable!
    def inside(y): 
        return f'Hello from inside, {x=}, {y=}!'
   
    return inside

f = outside(10)
f(7)

'Hello from inside, x=10, y=7!'

In [21]:
def outside(x): 
    counter = 5
    
    def inside(y): 
        nonlocal counter  # it's local to the enclosing function
        counter = counter + 1
        return f'Hello from inside, {x=}, {y=}, {counter=}!'
   
    return inside

f = outside(10)

print(f(7))
print(f(3))
print(f(10))


Hello from inside, x=10, y=7, counter=6!
Hello from inside, x=10, y=3, counter=7!
Hello from inside, x=10, y=10, counter=8!


In [22]:
f.__code__.co_varnames

('y',)

In [23]:
f.__code__.co_freevars

('counter', 'x')

In [24]:
outside.__code__.co_cellvars

('counter', 'x')
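
The free variables listed above are stored in "cells" that the inner function carries around. As a sketch (the `captured` dict is only for illustration), the `__closure__` attribute can be paired with `co_freevars` to see the current values:

```python
def outside(x):
    counter = 0

    def inside(y):
        nonlocal counter
        counter += 1
        return x + y + counter

    return inside

f = outside(10)
f(1)
f(1)

# __closure__ is a tuple of cells, in the same order as co_freevars
captured = {name: cell.cell_contents
            for name, cell in zip(f.__code__.co_freevars, f.__closure__)}
print(captured)               # {'counter': 2, 'x': 10}
```

Each call to `outside` creates fresh cells, which is why separate closures keep separate counters.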

In [25]:
def outside(x): 
    counter = 0
    
    def inside(y): 
        nonlocal counter  # it's local to the enclosing function
        counter = counter + 1
        return f'Hello from inside, {x=}, {y=}, {counter=}!'
   
    return inside

f = outside(10)

print(f(7))
print(f(3))
print(f(10))

g = outside(15)

print(g(7))
print(g(3))


Hello from inside, x=10, y=7, counter=1!
Hello from inside, x=10, y=3, counter=2!
Hello from inside, x=10, y=10, counter=3!
Hello from inside, x=15, y=7, counter=1!
Hello from inside, x=15, y=3, counter=2!


# Exercise: Password generator generator

1. Write a function, `password_generator_generator`, which takes a single argument, a string containing different characters.
2. This function, when called, will return a function that takes an integer argument (the password length) and, when called, returns a new password.
3. The `random.choice` function, which returns a random element of a sequence, will come in handy here.

Example: 

    new_pw = password_generator_generator('abcde')
    print(new_pw(5))  # could be 'abddc', 5 characters chosen at random from 'abcde'
    


In [26]:
import random

def password_generator_generator(s):
    def password_generator(n):
        output = ''
        for i in range(n):
            output += random.choice(s)
        return output
    return password_generator

make_alpha_password = password_generator_generator('abcdefghij')
make_symbol_password = password_generator_generator('!@#$%^&*()')

In [27]:
make_alpha_password(10)

'faghchhhic'

In [28]:
make_symbol_password(10)

'!*$!^&!(()'

In [29]:
make_symbol_password(3)

'^*)'
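
One possible variation, not part of the original solution: `random.choices` with `k=n` draws all the characters in one call, so the loop and the string concatenation disappear. For real passwords, the standard library's `secrets` module is the better source of randomness; `random` is kept here only to match the exercise.

```python
import random

def password_generator_generator(s):
    def password_generator(n):
        # random.choices draws n elements from s, with replacement
        return ''.join(random.choices(s, k=n))
    return password_generator

make_alpha_password = password_generator_generator('abcdefghij')
pw = make_alpha_password(10)
print(pw)
```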

In [30]:
def a():
    return 'Hello from a!'

def b():
    return 'Hello from b!'

while True:
    choice = input("Enter a choice: ").strip()
    
    if not choice:
        break
    
    if choice == 'a':
        print(a())
    elif choice == 'b':
        print(b())
    else:
        print(f'No such choice {choice}')

Enter a choice: a
Hello from a!
Enter a choice: b
Hello from b!
Enter a choice: c
No such choice c
Enter a choice: 


In [32]:
def a():
    return 'Hello from a!'

def b():
    return 'Hello from b!'

functions = {'a':a,       # dispatch table
             'b':b}

while True:
    choice = input("Enter a choice: ").strip()
    
    if not choice:
        break
    
    if choice in functions:
        print(functions[choice]())
    else:
        print(f'No such choice {choice}')

Enter a choice: a
Hello from a!
Enter a choice: b
Hello from b!
Enter a choice: c
No such choice c
Enter a choice: 


In [34]:
def hello():
    return 'Hello!'

In [35]:
hello()

'Hello!'

In [37]:
globals()['hello']()

'Hello!'

# Exercise: Calculator

1. Ask the user to enter a simple math expression (NUMBER OP NUMBER), where OP is +, -, /, or *.
2. Use a dispatch table and one or more functions to implement this calculator.
3. When the user enters an empty string, stop asking.

Example:

    Enter expression: 2 + 2
    4
    Enter expression: 10 * 5
    50
    Enter expression: 10 * a
    Not a number

In [39]:
def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def mul(a, b):
    return a * b

def div(a, b):
    return a / b

functions = {'+': add,
             '-': sub,
             '*': mul,
             '/': div}

while True:
    s = input("Enter expression: ").strip()
    
    if not s:
        break
        
    first, op, second = s.split()
    
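    # check both operands here: a `continue` inside a nested for loop
    # would only advance that loop, not restart the outer while loop
    if not first.isdigit() or not second.isdigit():
        print(f'{first} and {second} must both be numeric; try again')
        continue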
            
    if op in functions:
        print(functions[op](int(first), int(second)))
    else:
        print(f'Operator {op} is unknown.')


Enter expression: 10 / 3
3.3333333333333335
Enter expression: 


In [40]:
import operator

functions = {'+': operator.add,
             '-': operator.sub,
             '*': operator.mul,
             '/': operator.truediv}

while True:
    s = input("Enter expression: ").strip()
    
    if not s:
        break
        
    first, op, second = s.split()
    
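    # check both operands here: a `continue` inside a nested for loop
    # would only advance that loop, not restart the outer while loop
    if not first.isdigit() or not second.isdigit():
        print(f'{first} and {second} must both be numeric; try again')
        continue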
            
    if op in functions:
        print(functions[op](int(first), int(second)))
    else:
        print(f'Operator {op} is unknown.')


Enter expression: 3 + 5
8
Enter expression: 10 / 6
1.6666666666666667
Enter expression: 


In [None]:
2 + 2  # infix notation

+ 2 2  # prefix notation (Polish notation)
2 2 +  # postfix notation (Reverse Polish notation == RPN)

In [43]:
import operator

functions = {'+': operator.add,
             '-': operator.sub,
             '*': operator.mul,
             '/': operator.truediv}

while True:
    s = input("Enter expression: ").strip()
    
    if not s:
        break
        
    op, *numbers = s.split()
    
    if op in functions:
        output = int(numbers[0])  # initialize output with numbers[0]
        
        for one_number in numbers[1:]:  # skip numbers[0], which we already have in output
            output = functions[op](output, int(one_number))
            
        print(output)
    else:
        print(f'Operator {op} is unknown.')


Enter expression: + 2 2
4
Enter expression: + 2 3 4 5
14
Enter expression: * 10 20
200
Enter expression: * 10 20 30
6000
Enter expression: 
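
The accumulation loop in the prefix calculator is a fold, so it could also be sketched with `functools.reduce`. The helper name `evaluate_prefix` is invented for this example, and it assumes at least one number follows the operator (with none, `reduce` raises `TypeError`).

```python
import operator
from functools import reduce

functions = {'+': operator.add,
             '-': operator.sub,
             '*': operator.mul,
             '/': operator.truediv}

def evaluate_prefix(s):
    op, *numbers = s.split()
    if op not in functions:
        raise ValueError(f'Operator {op} is unknown.')
    # reduce applies the operator left to right: ((n0 op n1) op n2) ...
    return reduce(functions[op], [int(one_number) for one_number in numbers])

print(evaluate_prefix('+ 2 3 4 5'))    # 14
print(evaluate_prefix('* 10 20 30'))   # 6000
```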


In [44]:
f(x) = x**2

f(3) = 9

SyntaxError: cannot assign to function call (<ipython-input-44-d11fb8d80b96>, line 1)

Functional programming

1. Functions should contain no assignment (i.e., no changes of state).
2. We should treat our data as immutable as much as possible.
3. We can treat functions as data.
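
A minimal sketch of these three ideas together, with made-up names (`square`, `apply_to_all`): a function is passed as an argument like any other value, and the input tuple is never modified.

```python
def square(n):
    return n ** 2

def apply_to_all(func, items):
    # build and return a new tuple; the input is left untouched
    return tuple(func(item) for item in items)

numbers = (1, 2, 3)                       # a tuple, so it cannot be mutated
squares = apply_to_all(square, numbers)   # square is passed as data

print(numbers)    # (1, 2, 3)
print(squares)    # (1, 4, 9)
```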

In [45]:
numbers = list(range(10))
numbers

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [46]:
# I want to get a list of the elements of numbers squared (**2)

output = []   # un-Pythonic
for one_number in numbers:
    output.append(one_number ** 2)
    
output

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [47]:
# list comprehension

[one_number ** 2 for one_number in numbers]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [48]:
# a list comprehension creates a new list

[one_number ** 2 for one_number in numbers]

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [52]:
def normal_loop():
    output = []   
    for one_number in range(100000):
        output.append(one_number ** 2)

    return output

def comprehension():
    return [one_number ** 2 for one_number in range(100000)]

In [53]:
%timeit normal_loop()

39.9 ms ± 1.48 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [54]:
%timeit comprehension()

32 ms ± 145 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


In [55]:
32 / 40

0.8
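
`%timeit` is an IPython magic command, so it only works in a notebook or IPython shell. Outside of them, a comparable (if cruder) measurement can be sketched with the standard `timeit` module. The range and repeat counts below are arbitrary choices for a quick run, and the timings will differ from machine to machine.

```python
import timeit

def normal_loop():
    output = []
    for one_number in range(1000):
        output.append(one_number ** 2)
    return output

def comprehension():
    return [one_number ** 2 for one_number in range(1000)]

# timeit.timeit accepts a callable and returns the total seconds for `number` calls
loop_time = timeit.timeit(normal_loop, number=200)
comp_time = timeit.timeit(comprehension, number=200)

print(f'loop: {loop_time:.4f}s  comprehension: {comp_time:.4f}s')
```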

In [57]:
[one_number ** 2              # expression -- SELECT
 for one_number in numbers]   # iteration  -- FROM 

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [58]:
words = 'this is a bunch of words'.split()
'*'.join(words)

'this*is*a*bunch*of*words'

In [59]:
numbers = [10, 20, 30, 40, 50]

'*'.join(numbers)

TypeError: sequence item 0: expected str instance, int found

In [60]:
# how can I get a list of strings back, based on "numbers"?

[str(one_number)
for one_number in numbers]

['10', '20', '30', '40', '50']

In [61]:
numbers

[10, 20, 30, 40, 50]

In [62]:
# I have a bunch of numbers as strings
# I want to get a list of numbers (integers)

numstrings = ['10', '20', '30', '40', '50']
[int(one_number)
for one_number in numstrings]

[10, 20, 30, 40, 50]

In [63]:
# I have a bunch of numbers as strings
# I want to get a list of numbers (integers)
# oh, these strings are actually representing HEX numbers

numstrings = ['10', '20', '30', '40', '50']
[int(one_number, 16) # interpret the numbers as hex
for one_number in numstrings]

[16, 32, 48, 64, 80]

In [64]:
int('10')

10

In [65]:
int('10', 16)  # interpret the string as 0x10, and turn into an int

16

In [66]:
s = 'abcde'

for one_character in s:
    print(one_character)

a
b
c
d
e


In [67]:
for index, one_character in enumerate(s):
    print(f'{index}: {one_character}')

0: a
1: b
2: c
3: d
4: e


In [68]:
[(index, one_character)
for index, one_character in enumerate(s)]

[(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e')]

In [69]:
[one_number / 3
for one_number in range(5)]

[0.0, 0.3333333333333333, 0.6666666666666666, 1.0, 1.3333333333333333]

In [70]:
len([one_number
     for one_number in range(5)])

5

In [71]:
def hello(name):
    return f'Hello, {name}!'

[hello(one_name)
for one_name in 'Tom Dick Harry'.split()]

['Hello, Tom!', 'Hello, Dick!', 'Hello, Harry!']

In [72]:
def hello(name):
    return f'Hello, {name}!'

[print(hello(one_name))
for one_name in 'Tom Dick Harry'.split()]

Hello, Tom!
Hello, Dick!
Hello, Harry!


[None, None, None]

# Exercises: Comprehensions

1. Ask the user to enter a sentence.  Use a comprehension to count the number of non-space characters.
2. Get a list of floats. Use the `round` function to round each to the nearest integer.

In [74]:
s = input("Enter a sentence: ")

Enter a sentence: this is a very interesting test


In [75]:
len(s)

31

In [79]:
# sum the lengths of the words

sum([len(one_word)
 for one_word in s.split()])

26

In [80]:
numbers = [1.5, 2.3, 8.6, 10.7, 4.1]

[round(one_number)
for one_number in numbers]

[2, 2, 9, 11, 4]

In [83]:
[one_line.split(":")[0]              # SELECT
for one_line in open('/etc/passwd')  # FROM
if not one_line.startswith("#")]     # WHERE

['nobody',
 'root',
 'daemon',
 '_uucp',
 '_taskgated',
 '_networkd',
 '_installassistant',
 '_lp',
 '_postfix',
 '_scsd',
 '_ces',
 '_appstore',
 '_mcxalr',
 '_appleevents',
 '_geod',
 '_devdocs',
 '_sandbox',
 '_mdnsresponder',
 '_ard',
 '_www',
 '_eppc',
 '_cvs',
 '_svn',
 '_mysql',
 '_sshd',
 '_qtss',
 '_cyrus',
 '_mailman',
 '_appserver',
 '_clamav',
 '_amavisd',
 '_jabber',
 '_appowner',
 '_windowserver',
 '_spotlight',
 '_tokend',
 '_securityagent',
 '_calendar',
 '_teamsserver',
 '_update_sharing',
 '_installer',
 '_atsserver',
 '_ftp',
 '_unknown',
 '_softwareupdate',
 '_coreaudiod',
 '_screensaver',
 '_locationd',
 '_trustevaluationagent',
 '_timezone',
 '_lda',
 '_cvmsroot',
 '_usbmuxd',
 '_dovecot',
 '_dpaudio',
 '_postgres',
 '_krbtgt',
 '_kadmin_admin',
 '_kadmin_changepw',
 '_devicemgr',
 '_webauthserver',
 '_netbios',
 '_warmd',
 '_dovenull',
 '_netstatistics',
 '_avbdeviced',
 '_krb_krbtgt',
 '_krb_kadmin',
 '_krb_changepw',
 '_krb_kerberos',
 '_krb_anonymous',
 '_asse

In [84]:
for one_line in open('/etc/passwd'):
    print(len(one_line))

3
16
3
76
71
18
2
70
18
3
59
50
54
72
62
64
70
61
71
70
70
72
56
66
62
67
52
63
60
69
58
50
50
54
66
67
59
63
64
61
62
61
62
61
55
55
74
53
65
65
55
56
50
56
88
66
61
70
81
65
62
56
75
65
54
64
75
72
85
72
67
53
55
69
77
74
94
85
97
73
84
68
71
70
63
55
82
82
74
64
66
76
55
78
80
56
63
82
76
63
55
69
61
99
73
55
63
79
100
57


In [87]:
one_line.split(':')[0]

'_driverkit'

# Exercises: Comprehensions with conditions

1. Read through the file `nums.txt`, and sum the integers that it contains.  Each line in the file contains either one integer, or nothing (whitespace).
2. Ask the user to enter a sentence. Count the number of vowels in the sentence.

In [88]:
!cat nums.txt

5
	10     
	20
  	3
		   	20        

 25


In [95]:
[int(one_line)
for one_line in open('nums.txt')
if one_line.strip()]

[5, 10, 20, 3, 20, 25]

In [91]:
int('123')

123

In [92]:
int('    123    ')

123

In [93]:
int('')

ValueError: invalid literal for int() with base 10: ''

In [94]:
int()

0

In [97]:
sum([int(one_line)
for one_line in open('nums.txt')
if one_line.strip()])

83

In [101]:
s = input("Enter a sentence: ").strip()

sum([1
for one_letter in s
if one_letter.lower() in 'aeiou'])

Enter a sentence: this is a test


4

In [102]:
# create a list of 1s, with a 1 for each vowel in s
[1
for one_letter in s
if one_letter.lower() in 'aeiou']

[1, 1, 1, 1]

In [104]:
len([one_letter
for one_letter in s
if one_letter.lower() in 'aeiou'])

4

In [105]:
!head shoe-data.txt

Adidas	orange	43
Nike	black	41
Adidas	black	39
New Balance	pink	41
Nike	white	44
New Balance	orange	38
Nike	pink	44
Adidas	pink	44
New Balance	orange	39
New Balance	black	43


# Exercise: Shoes

1. Use a list comprehension to read from `shoe-data.txt`, and create a list of dictionaries.
2. Each dict will have three keys: `brand`, `color`, and `size`.
3. The file contains 100 shoe descriptions, with columns separated by tabs (`\t`).
4. Assign the output from the comprehension to a list called `shoes`.
5. You will probably want to write a function to handle each line of the file, rather than do it inline in the comprehension.

Note: You want to get one list with many dicts, not one dict with many shoes.  Each line of the file will be turned into one dictionary.

In [108]:
filename = 'shoe-data.txt'

def line_to_dict(one_line):
    brand, color, size = one_line.strip().split('\t')
    return {'brand': brand,
            'color': color,
            'size': size}

[line_to_dict(one_line)
for one_line in open(filename)]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [109]:
dict(a=1, b=2, c=3)

{'a': 1, 'b': 2, 'c': 3}

In [110]:
dict([('a', 1), ('b', 2), ('c',3)])

{'a': 1, 'b': 2, 'c': 3}

In [112]:
list(zip('abc', [1,2,3]))

[('a', 1), ('b', 2), ('c', 3)]

In [113]:
filename = 'shoe-data.txt'

def line_to_dict(one_line):
    return dict(zip(['brand', 'color', 'size'],
                   one_line.strip().split('\t')))

[line_to_dict(one_line)
for one_line in open(filename)]

[{'brand': 'Adidas', 'color': 'orange', 'size': '43'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'black', 'size': '39'},
 {'brand': 'New Balance', 'color': 'pink', 'size': '41'},
 {'brand': 'Nike', 'color': 'white', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '38'},
 {'brand': 'Nike', 'color': 'pink', 'size': '44'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '44'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '39'},
 {'brand': 'New Balance', 'color': 'black', 'size': '43'},
 {'brand': 'New Balance', 'color': 'orange', 'size': '44'},
 {'brand': 'Nike', 'color': 'black', 'size': '41'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '37'},
 {'brand': 'Adidas', 'color': 'black', 'size': '38'},
 {'brand': 'Adidas', 'color': 'pink', 'size': '41'},
 {'brand': 'Adidas', 'color': 'white', 'size': '36'},
 {'brand': 'Adidas', 'color': 'orange', 'size': '36'},
 {'brand': 'Nike', 'color': 'pink', 'size': '41'},
 {'brand': '

In [114]:
words = 'this is a bunch of words for my course'.split()

words

['this', 'is', 'a', 'bunch', 'of', 'words', 'for', 'my', 'course']

In [115]:
# create a dict based on "words" where the keys are "words" and the values
# are the word lengths

[(one_word, len(one_word))
 for one_word in words]

[('this', 4),
 ('is', 2),
 ('a', 1),
 ('bunch', 5),
 ('of', 2),
 ('words', 5),
 ('for', 3),
 ('my', 2),
 ('course', 6)]

In [116]:
dict([(one_word, len(one_word))
 for one_word in words])

{'this': 4,
 'is': 2,
 'a': 1,
 'bunch': 5,
 'of': 2,
 'words': 5,
 'for': 3,
 'my': 2,
 'course': 6}

In [117]:
# dict comprehension -- creates one dict based on its input source

{ one_word : len(one_word)
  for one_word in words
}

{'this': 4,
 'is': 2,
 'a': 1,
 'bunch': 5,
 'of': 2,
 'words': 5,
 'for': 3,
 'my': 2,
 'course': 6}

In [118]:
!head linux-etc-passwd.txt

# This is a comment
# You should ignore me
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
bin:x:2:2:bin:/bin:/usr/sbin/nologin
sys:x:3:3:sys:/dev:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
games:x:5:60:games:/usr/games:/usr/sbin/nologin
man:x:6:12:man:/var/cache/man:/usr/sbin/nologin
lp:x:7:7:lp:/var/spool/lpd:/usr/sbin/nologin


# Exercise: passwd to dict

1. Read `linux-etc-passwd.txt` using a dict comprehension.
2. The output dictionary should have usernames (index 0) as keys, and the user's ID number (index 2) as values.
3. Note that you'll need to ignore comment lines and blank lines.

In [125]:
{ one_line.split(':')[0]  : one_line.split(':')[2]
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith(("#", '\n'))  }

{'root': '0',
 'daemon': '1',
 'bin': '2',
 'sys': '3',
 'sync': '4',
 'games': '5',
 'man': '6',
 'lp': '7',
 'mail': '8',
 'news': '9',
 'uucp': '10',
 'proxy': '13',
 'www-data': '33',
 'backup': '34',
 'list': '38',
 'irc': '39',
 'gnats': '41',
 'nobody': '65534',
 'syslog': '101',
 'messagebus': '102',
 'landscape': '103',
 'jci': '955',
 'sshd': '104',
 'user': '1000',
 'reuven': '1001',
 'postfix': '105',
 'colord': '106',
 'postgres': '107',
 'dovecot': '108',
 'dovenull': '109',
 'postgrey': '110',
 'debian-spamd': '111',
 'memcache': '113',
 'genadi': '1002',
 'shira': '1003',
 'atara': '1004',
 'shikma': '1005',
 'amotz': '1006',
 'mysql': '114',
 'clamav': '115',
 'amavis': '116',
 'opendkim': '117',
 'gitlab-redis': '999',
 'gitlab-psql': '998',
 'git': '1007',
 'opendmarc': '118',
 'dkim-milter-python': '119',
 'deploy': '1008',
 'redis': '112'}

In [128]:
{ fields[0]  : fields[2]
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith(("#", '\n')) 
  if (fields := one_line.split(":")) }

{'root': '0',
 'daemon': '1',
 'bin': '2',
 'sys': '3',
 'sync': '4',
 'games': '5',
 'man': '6',
 'lp': '7',
 'mail': '8',
 'news': '9',
 'uucp': '10',
 'proxy': '13',
 'www-data': '33',
 'backup': '34',
 'list': '38',
 'irc': '39',
 'gnats': '41',
 'nobody': '65534',
 'syslog': '101',
 'messagebus': '102',
 'landscape': '103',
 'jci': '955',
 'sshd': '104',
 'user': '1000',
 'reuven': '1001',
 'postfix': '105',
 'colord': '106',
 'postgres': '107',
 'dovecot': '108',
 'dovenull': '109',
 'postgrey': '110',
 'debian-spamd': '111',
 'memcache': '113',
 'genadi': '1002',
 'shira': '1003',
 'atara': '1004',
 'shikma': '1005',
 'amotz': '1006',
 'mysql': '114',
 'clamav': '115',
 'amavis': '116',
 'opendkim': '117',
 'gitlab-redis': '999',
 'gitlab-psql': '998',
 'git': '1007',
 'opendmarc': '118',
 'dkim-milter-python': '119',
 'deploy': '1008',
 'redis': '112'}

In [130]:
{ username : user_id
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith(("#", '\n')) 
  if ((username, junk, user_id, *rest) := one_line.split(":")) }

SyntaxError: cannot use assignment expressions with tuple (<ipython-input-130-b64883afe35a>, line 4)
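
The walrus operator only accepts a single name as its target, but a `for` clause can unpack. One workaround, sketched here with a small in-memory `lines` list standing in for the file, is to iterate over a one-element list containing the split fields:

```python
# stand-in for the lines of linux-etc-passwd.txt (sample data for illustration)
lines = ['# This is a comment\n',
         '\n',
         'root:x:0:0:root:/root:/bin/bash\n',
         'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin\n']

users = {username: user_id
         for one_line in lines
         if not one_line.startswith(('#', '\n'))
         # a one-element list lets the for clause do the tuple unpacking
         for username, _, user_id, *rest in [one_line.split(':')]}

print(users)    # {'root': '0', 'daemon': '1'}
```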

In [132]:
{ username  : user_id
  for one_line in open('linux-etc-passwd.txt')
  if not one_line.startswith(("#", '\n')) 
 
  if (fields := one_line.split(":")) 
  if (username := fields[0])
  if (user_id := fields[2])
}

{'root': '0',
 'daemon': '1',
 'bin': '2',
 'sys': '3',
 'sync': '4',
 'games': '5',
 'man': '6',
 'lp': '7',
 'mail': '8',
 'news': '9',
 'uucp': '10',
 'proxy': '13',
 'www-data': '33',
 'backup': '34',
 'list': '38',
 'irc': '39',
 'gnats': '41',
 'nobody': '65534',
 'syslog': '101',
 'messagebus': '102',
 'landscape': '103',
 'jci': '955',
 'sshd': '104',
 'user': '1000',
 'reuven': '1001',
 'postfix': '105',
 'colord': '106',
 'postgres': '107',
 'dovecot': '108',
 'dovenull': '109',
 'postgrey': '110',
 'debian-spamd': '111',
 'memcache': '113',
 'genadi': '1002',
 'shira': '1003',
 'atara': '1004',
 'shikma': '1005',
 'amotz': '1006',
 'mysql': '114',
 'clamav': '115',
 'amavis': '116',
 'opendkim': '117',
 'gitlab-redis': '999',
 'gitlab-psql': '998',
 'git': '1007',
 'opendmarc': '118',
 'dkim-milter-python': '119',
 'deploy': '1008',
 'redis': '112'}

In [134]:
usernames = [one_line.split(':')[0]
for one_line in open('linux-etc-passwd.txt')
if not one_line.startswith(('#', '\n'))]

In [135]:
'root' in usernames

True

In [136]:
'reuven' in usernames

True

In [137]:
'whoever' in usernames

False

In [138]:
# set -- like a dictionary that has only keys and no values

# set comprehension!  
# - creates a set
# - based on an iterable
# - guarantees that the elements are unique (and hashable)
# - searching is O(1)

# {...} with no colon == set, including in set comprehensions (but empty {} is a dict)

usernames = {one_line.split(':')[0]
for one_line in open('linux-etc-passwd.txt')
if not one_line.startswith(('#', '\n'))}

In [140]:
'root' in usernames

True
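
Two details about sets are easy to trip over, sketched below with sample data: empty braces create a dict rather than a set, and every element must be hashable.

```python
empty = {}                    # empty braces create a dict, not a set
usernames = {name for name in ['root', 'daemon', 'root']}   # duplicates collapse

print(type(empty).__name__)   # dict
print(type(set()).__name__)   # set
print(len(usernames))         # 2

try:
    {['root', 'daemon']}      # a list is unhashable, so it cannot be a set element
except TypeError as e:
    error_message = str(e)
    print(error_message)
```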

In [142]:
# which IP addresses accessed my server?

[one_line.split()[0]
 for one_line in open('mini-access-log.txt')]

['67.218.116.165',
 '66.249.71.65',
 '65.55.106.183',
 '65.55.106.183',
 '66.249.71.65',
 '66.249.71.65',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '65.55.106.131',
 '65.55.106.131',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '65.55.106.186',
 '65.55.106.186',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '74.52.245.146',
 '74.52.245.146',
 '66.249.65.43',
 '66.249.65.43',
 '66.249.65.43',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '65.55.207.25',
 '65.55.207.25',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '65.55.207.94',
 '65.55.207.94',
 '66.249.65.12',
 '65.55.207.71',
 '66.249.65.12',
 '66.249.65.12',
 '66.249.65.12',
 '98.242.170.241',
 '66.249.65.38',
 '66.249.65.38',
 '66.249.65.38',
 '66.249.65.38',
 '66.249.65.38',
 '

In [143]:
# which *different* IP addresses accessed my server?

{one_line.split()[0]
 for one_line in open('mini-access-log.txt')}

{'208.80.193.28',
 '65.55.106.131',
 '65.55.106.155',
 '65.55.106.183',
 '65.55.106.186',
 '65.55.207.126',
 '65.55.207.25',
 '65.55.207.50',
 '65.55.207.71',
 '65.55.207.77',
 '65.55.207.94',
 '65.55.215.75',
 '66.249.65.12',
 '66.249.65.38',
 '66.249.65.43',
 '66.249.71.65',
 '67.195.112.35',
 '67.218.116.165',
 '74.52.245.146',
 '82.34.9.20',
 '89.248.172.58',
 '98.242.170.241'}

In [144]:
# list comprehensions
# dict comprehensions
# set comprehensions

[one_number ** 2
 for one_number in range(-5, 5)
if one_number % 2]

[25, 9, 1, 1, 9]

In [145]:
# list comprehensions
# dict comprehensions
# set comprehensions

{one_number ** 2
 for one_number in range(-5, 5)
if one_number % 2}

{1, 9, 25}

In [148]:
# combining comprehensions with other collections

from collections import Counter

c = Counter([one_line.split()[0]
         for one_line in open('mini-access-log.txt')])

c

Counter({'67.218.116.165': 2,
         '66.249.71.65': 3,
         '65.55.106.183': 2,
         '66.249.65.12': 32,
         '65.55.106.131': 2,
         '65.55.106.186': 2,
         '74.52.245.146': 2,
         '66.249.65.43': 3,
         '65.55.207.25': 2,
         '65.55.207.94': 2,
         '65.55.207.71': 1,
         '98.242.170.241': 1,
         '66.249.65.38': 100,
         '65.55.207.126': 2,
         '82.34.9.20': 2,
         '65.55.106.155': 2,
         '65.55.207.77': 2,
         '208.80.193.28': 1,
         '89.248.172.58': 22,
         '67.195.112.35': 16,
         '65.55.207.50': 3,
         '65.55.215.75': 2})

In [149]:
c.most_common(5)

[('66.249.65.38', 100),
 ('66.249.65.12', 32),
 ('89.248.172.58', 22),
 ('67.195.112.35', 16),
 ('66.249.71.65', 3)]

In [152]:
for ip_address, count in c.items():
    print(f'{ip_address:18}{count}')

67.218.116.165    2
66.249.71.65      3
65.55.106.183     2
66.249.65.12      32
65.55.106.131     2
65.55.106.186     2
74.52.245.146     2
66.249.65.43      3
65.55.207.25      2
65.55.207.94      2
65.55.207.71      1
98.242.170.241    1
66.249.65.38      100
65.55.207.126     2
82.34.9.20        2
65.55.106.155     2
65.55.207.77      2
208.80.193.28     1
89.248.172.58     22
67.195.112.35     16
65.55.207.50      3
65.55.215.75      2


In [155]:
for ip_address, count in c.items():
    print(f'{ip_address:18}{count // 2 * "x"}')

67.218.116.165    x
66.249.71.65      x
65.55.106.183     x
66.249.65.12      xxxxxxxxxxxxxxxx
65.55.106.131     x
65.55.106.186     x
74.52.245.146     x
66.249.65.43      x
65.55.207.25      x
65.55.207.94      x
65.55.207.71      
98.242.170.241    
66.249.65.38      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
65.55.207.126     x
82.34.9.20        x
65.55.106.155     x
65.55.207.77      x
208.80.193.28     
89.248.172.58     xxxxxxxxxxx
67.195.112.35     xxxxxxxx
65.55.207.50      x
65.55.215.75      x


In [156]:
mylist = ['abcd', 'efgh', 'ijklm']

'**'.join(mylist)

'abcd**efgh**ijklm'

In [158]:
mylist = [10, 20, 30, 40, 50]

'**'.join([str(one_item)
           for one_item in mylist])

'10**20**30**40**50'

In [159]:
# nested comprehensions

[one_number ** 2
 for one_number in range(5)]


[0, 1, 4, 9, 16]

In [160]:
[one_number ** 2
for one_number in range(5)
if one_number % 2]

[1, 9]

In [161]:
mylist = [[10, 20, 30], [40, 45, 50, 55, 60], [70, 80], [90, 92, 95, 97, 99, 100]]

In [162]:
[one_item
for one_item in mylist]

[[10, 20, 30], [40, 45, 50, 55, 60], [70, 80], [90, 92, 95, 97, 99, 100]]

In [163]:
[one_item
for one_sublist in mylist  # mylist must be an iterable of iterables
for one_item in one_sublist]

[10, 20, 30, 40, 45, 50, 55, 60, 70, 80, 90, 92, 95, 97, 99, 100]
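
The `for` clauses in a nested comprehension run in the same order as nested `for` loops, outermost first. As a sketch with a shorter sample list, here is the equivalent loop, along with `itertools.chain.from_iterable` as one alternative for flattening a single level:

```python
from itertools import chain

mylist = [[10, 20, 30], [40, 45], [70, 80]]

# the nested comprehension, spelled out as loops
flat_loop = []
for one_sublist in mylist:
    for one_item in one_sublist:
        flat_loop.append(one_item)

flat_comp = [one_item
             for one_sublist in mylist
             for one_item in one_sublist]

flat_chain = list(chain.from_iterable(mylist))

print(flat_loop == flat_comp == flat_chain)   # True
```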

In [None]:
# sorting with custom sort functions
# lambda
# map, filter, reduce

# modules 