# Python Language Essentials

<p>We will learn about</p>
<ul>
<li>Python Interpreter</li>
<li>The Basics</li>
<li>Data Structures and Sequences</li>
<li>Functions</li>
<li>Files and the OS</li>
</ul>

## Python Interpreter

<ul>
<li>Python is an interpreted language.</li>
<li>The Python interpreter runs a program by executing one statement at a time.</li>
<li>Running a Python program is as simple as calling <code>python</code> with a <code>.py</code> file as its first argument.</li>
<li>Suppose we had created <code>hello_world.py</code> with the contents shown below.</li>
</ul>

<img src="Pics4PythonEssentials/picture_0_0.png">

## The Basics

_The Basics_ >
### Language Semantics

<p>The Python language design is distinguished by its emphasis on readability, simplicity, and explicitness.</p>

_The Basics_ > _Language Semantics_ >
#### Indentation, not braces

Python uses whitespace (tabs or spaces) to structure code instead of braces.

Python statements also do not need to be terminated by semicolons. Semicolons can be used, however, to separate multiple statements on a single line:
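As a small illustration (the variable names are made up for this sketch), the three assignments below share one line, separated by semicolons, while the loop and `if` bodies are delimited purely by indentation:

```python
# Three statements on one line, separated by semicolons
a = 5; b = 6; c = 7

# The indented lines form the body of the for loop; no braces are needed
total = 0
for value in (a, b, c):
    if value > 5:
        total += value

print(total)  # 13
```

Putting several statements on one line is generally discouraged, since it tends to make code harder to read.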

_The Basics_ > _Language Semantics_ >
####  Everything is an object

An important characteristic of the Python language is the consistency of its _object model_.

_The Basics_ > _Language Semantics_ >
#### Comments

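Any text preceded by the hash mark (pound sign) <code>#</code> is ignored by the Python interpreter. This is often used to add comments to code.

At times you may also want to exclude certain blocks of code without deleting them. An easy solution is to comment out the code: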
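A minimal sketch of both uses of `#`, assuming a made-up `lines` list: one comment explains the code, and two commented-out lines temporarily disable a check without deleting it.

```python
lines = ['first', '', 'third']  # hypothetical sample data

results = []
for line in lines:
    # keep empty lines for now
    # if len(line) == 0:
    #     continue
    results.append(line.upper())  # a comment can also follow code on the same line

print(results)  # ['FIRST', '', 'THIRD']
```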

_The Basics_ > _Language Semantics_ >
#### Function and object method calls

**Functions** are called using parentheses and passing zero or more arguments.

Almost every object in Python has attached functions, known as methods.

Functions can take both positional and keyword arguments:
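The sketch below shows a function call with positional and keyword arguments, followed by a method call. The function `describe` and its parameters are hypothetical names chosen for illustration.

```python
def describe(name, greeting='Hello', punctuation='!'):
    """Build a greeting; `greeting` and `punctuation` have default values."""
    return greeting + ', ' + name + punctuation

# Positional argument only; the defaults fill in the rest
print(describe('Ada'))                   # Hello, Ada!

# Mixing a positional argument with a keyword argument
print(describe('Ada', punctuation='?'))  # Hello, Ada?

# Methods are functions attached to an object, called with dot syntax
words = 'a b c'.split(' ')
print(words)                             # ['a', 'b', 'c']
```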

_The Basics_ > _Language Semantics_ >
#### Variables and pass-by-reference

* When assigning a variable (or name) in Python, you are creating a reference to the object on the right hand side of the equals sign.

In [1]:
a =[1, 2, 3]

In [2]:
b = a

![](Pics4PythonEssentials/picture_0_1.png)

In [3]:
a.append(4)

In [4]:
b

[1, 2, 3, 4]

* Assignment is also referred to as **binding**.
* When you pass objects as arguments to a function, you are only passing references; no copying occurs. This is known as passing **by reference**.

In [5]:
def append_element(some_list, element):
    some_list.append(element)

In [6]:
data = [1, 2, 3]

In [7]:
append_element(data, 4)

In [8]:
data

[1, 2, 3, 4]

_The Basics_ > _Language Semantics_ >
#### Dynamic references, strong types

* Object references in Python have **no type** associated with them.

In [9]:
a = 5

In [10]:
type(a)

int

In [11]:
a = "foo"

In [12]:
type(a)

str

Because a name can be rebound to objects of different types, some conclude that Python is **not** a “typed language”. **This is not true!**

In [13]:
"5" + 5

TypeError: must be str, not int

* Python is considered a **strongly-typed** language, which means that every object has a specific type (or class), and implicit conversions will occur only in certain obvious circumstances.

In [14]:
a = 4.5; b = 2

In [15]:
print('a is {}, b is {}' .format(type(a), type(b)))

a is <class 'float'>, b is <class 'int'>


In [16]:
a/b

2.25

_The Basics_ > _Language Semantics_ >
#### Attributes and methods

Objects in Python typically have both 
* **attributes**, other Python objects stored “inside” the object, and 
* **methods**, functions associated with an object which can have access to the object’s internal data.

In [17]:
a = 'foo'

In [18]:
getattr(a, 'split')

<function str.split>

_The Basics_ > _Language Semantics_ >
#### "Duck" typing

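With duck typing you care about what an object can do rather than what type it is. For example, an object is iterable if it implements the **iterator protocol**, which you can verify by calling <code>iter</code> on it: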

In [19]:
def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError: # not iterable
        return False

In [20]:
isiterable('a string')

True

In [21]:
isiterable([1, 2, 3])

True

In [22]:
isiterable(3)

False

_The Basics_ > _Language Semantics_ >
#### Imports

A module is simply a <code>.py</code> file containing function and variable definitions along with such things imported from other <code>.py</code> files.

In [23]:
import some_module as sm
from some_module import PI as pi, g as gf

In [24]:
r1 = sm.f(pi)

In [25]:
r1

5.14159

In [26]:
r2 = gf(6, pi); r2

9.14159

_The Basics_ > _Language Semantics_ >
#### Binary operators and comparisons

In [27]:
5 - 7

-2

In [28]:
5 <= 2

False

![](Pics4PythonEssentials/picture_0_2.png)

<ul>
<li>To check if two references refer to the same object, use the <b>is</b> keyword.</li>
</ul>

In [29]:
a = [1, 2, 3]

In [30]:
b = a

In [31]:
# Note, the list function always creates a new list
c = list(a)

In [32]:
a is b

True

In [33]:
a is c

False

<ul>
<li>This is not the same thing as comparing with ==.</li>
</ul>

In [34]:
a == c

True

<ul>
<li>A very common use of <b>is</b> is to check if a variable is <b>None</b>, since there is only one instance of <b>None</b>.</li>
</ul>

In [35]:
a = None

In [36]:
a is None

True

_The Basics_ > _Language Semantics_ >
#### Strictness versus laziness

**When** are expressions evaluated?
* **Strict (eager) evaluation**: as soon as a statement is evaluated, the calculation is carried out immediately.
* **Lazy evaluation**: the value is not computed until it is actually used elsewhere.

Python is a very **strict** (or eager) language. Nearly all of the time, computations and expressions are evaluated immediately. There are Python techniques, especially using iterators and generators, which can be used to achieve laziness. When performing very expensive computations which are only necessary some of the time, this can be an important technique in data-intensive applications.
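As a rough sketch of laziness, the generator below (the name `squares` is an illustrative assumption) does no work when it is created; each value is computed only when it is requested.

```python
def squares(n):
    """Generator: yields squares one at a time instead of building a list."""
    for i in range(n):
        print('computing', i)
        yield i * i

lazy = squares(3)   # nothing is computed yet, nothing is printed
first = next(lazy)  # prints "computing 0" and produces 0
rest = list(lazy)   # computes the remaining values: [1, 4]
```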

_The Basics_ > _Language Semantics_ >
#### Mutable and immutable objects

Most objects in Python are **mutable**, such as lists, dicts, NumPy arrays, or most user-defined types (classes). This means that the object or values that they contain **can be modified**.

In [37]:
a_list = ['foo', 2, [4, 5]]

In [38]:
a_list[2] = [3, 4]

In [39]:
a_list

['foo', 2, [3, 4]]

Others, like strings and tuples, are **immutable**:

In [40]:
a_tuple = (3, 5, (4, 5))

In [41]:
a_tuple[1] = 'four'

TypeError: 'tuple' object does not support item assignment

Just because you **can** mutate an object does not mean that you always should. Such actions are known in programming as **side effects**.

_The Basics_ > 
### Scalar Types

Python has a small set of built-in types for handling numerical data, strings, Boolean (True or False) values, and dates and times.

![](Pics4PythonEssentials/picture_0_3.png)

_The Basics_ > _Scalar Type_ >
#### Numeric types

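The primary Python types for numbers are <code>int</code> and <code>float</code>. In Python 3, an <code>int</code> can store arbitrarily large values (the separate <code>long</code> type exists only in Python 2).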

In [42]:
ival = 17239871

In [43]:
ival ** 6

26254519291092456596965462913230729701102721

Floating point numbers are represented with the Python <code>float</code> type. Under the hood each one is a double-precision (64 bits) value.

In [44]:
fval = 7.243

In [45]:
fval2 = 6.78e-5

Integer division not resulting in a whole number will always yield a floating point number:

In [46]:
3/2

1.5

To get C-style integer division, use the floor division operator //:

In [47]:
3//2

1

Use j for the imaginary part:

In [48]:
cval = 1 + 2j

In [49]:
cval * (1- 2j)

(5+0j)

_The Basics_ > _Scalar Type_ >
#### Strings

You can write a string literal using either single quotes <code>'</code> or double quotes <code>"</code>.

In [50]:
a = 'one way of writing a string'

In [51]:
a

'one way of writing a string'

In [52]:
b = "another way"

In [53]:
b

'another way'

For multiline strings with line breaks, you can use triple quotes, either ''' or """ 

In [54]:
c = """
This is a longer string that
spans multiple lines
"""

In [55]:
c

'\nThis is a longer string that\nspans multiple lines\n'

Python strings are **immutable**

In [56]:
a = 'this is a string'

In [57]:
a[10] = 'f'

TypeError: 'str' object does not support item assignment

In [58]:
b = a.replace('string', 'longer string')

In [59]:
b

'this is a longer string'

Many Python objects can be converted to a string using the <code>str</code> function

In [60]:
a = 5.6

In [61]:
s = str(a)

In [62]:
s

'5.6'

Strings are a sequence of characters

In [63]:
s = 'python'

In [64]:
list(s)

['p', 'y', 't', 'h', 'o', 'n']

In [65]:
s[:3]

'pyt'

The backslash character <code>\</code> is an escape character

In [66]:
s = '12\\34'

In [67]:
print(s)

12\34


If a string contains many backslashes and no special characters, preface the leading quote of the string with <code>r</code>, which means that the characters should be interpreted as is.

In [68]:
s = r'this\has\no\special\characters'

In [69]:
s

'this\\has\\no\\special\\characters'

Adding two strings together concatenates them and produces a new string

In [70]:
a = 'this is the first half'

In [71]:
b = ' and this is the second half'

In [72]:
a + b

'this is the first half and this is the second half'

**String templating or formatting**

A string containing a <code>%</code> followed by one or more format characters is a target for inserting a value into that string.

In [73]:
template = '%.2f %s are worth $%d'

In [74]:
template % (4.5560, 'Argentine Peso', 1)

'4.56 Argentine Peso are worth $1'

_The Basics_ > _Scalar Type_ >
#### Booleans
<code>True</code> and <code>False</code>

In [75]:
a = [1, 2, 3]
if a:
    print('I found something')

I found something


In [76]:
b = []
if not b:
    print('Empty!')

Empty!


Empty collections (lists, dicts, tuples, strings, etc.) are treated as <code>False</code> when used in a condition, as is the number 0.

In [77]:
bool([]), bool([1, 2, 3])

(False, True)

In [78]:
bool('Hello World!'), bool('')

(True, False)

In [79]:
bool(0), bool(1)

(False, True)

_The Basics_ > _Scalar Type_ >
#### Type Casting
The <code>str</code>, <code>bool</code>, <code>int</code> and <code>float</code> types are also functions which can be used to cast values to those types:

In [80]:
s = '3.14159'

In [81]:
fval = float(s)

In [82]:
type(fval)

float

In [83]:
int(fval)

3

In [84]:
bool(fval)

True

In [85]:
bool(0)

False

_The Basics_ > _Scalar Type_ >
#### None
<code>None</code> is the Python null value type.

In [86]:
a = None

In [87]:
a is None

True

In [88]:
b = 5

In [89]:
b is not None

True

In Python 3, <code>None</code> is a reserved keyword and also the unique instance of <code>NoneType</code>.

_The Basics_ > _Scalar Type_ >
#### Date and times
The built-in Python <code>datetime</code> module provides <code>datetime</code>, <code>date</code>, and <code>time</code> types

In [90]:
from datetime import datetime, date, time

In [91]:
dt = datetime(2017, 3, 3, 10, 30, 19)

In [92]:
dt.day

3

In [93]:
dt.minute

30

In [94]:
dt.date()

datetime.date(2017, 3, 3)

In [95]:
dt.time()

datetime.time(10, 30, 19)

<code>datetime.timedelta</code> type:

In [96]:
dt2 = datetime(2017, 5, 5, 14, 31)

In [97]:
delta = dt2 - dt

In [98]:
delta

datetime.timedelta(63, 14441)

In [99]:
type(delta)

datetime.timedelta

In [100]:
dt

datetime.datetime(2017, 3, 3, 10, 30, 19)

In [101]:
dt + delta

datetime.datetime(2017, 5, 5, 14, 31)

_The Basics_ > 
### Control Flow

_The Basics_ > _Control Flow_ >
#### if, elif and else
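The <code>if</code> statement checks a condition which, if <code>True</code>, evaluates the code in the block that follows.

An <code>if</code> statement can optionally be followed by one or more <code>elif</code> blocks and a catch-all <code>else</code> block that runs if all of the conditions are <code>False</code>.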
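A short sketch of the three kinds of branch, wrapped in a hypothetical `classify` function so the result can be inspected. Only the first branch whose condition is `True` runs.

```python
def classify(x):
    if x < 0:
        return 'negative'
    elif x == 0:
        return 'zero'
    elif x < 5:
        return 'positive but smaller than 5'
    else:
        return 'positive and larger than or equal to 5'

print(classify(-2))  # negative
print(classify(3))   # positive but smaller than 5
print(classify(9))   # positive and larger than or equal to 5
```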

_The Basics_ > _Control Flow_ >
#### for loops
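A <code>for</code> loop iterates over a collection (like a list or tuple) or an iterator.

A <code>for</code> loop can be advanced to the next iteration, skipping the remainder of the block, using the <code>continue</code> keyword.

A <code>for</code> loop can be exited altogether using the <code>break</code> keyword.

If the elements of the collection or iterator are sequences (tuples or lists, say), they can be conveniently <i>unpacked</i> into variables in the <code>for</code> statement.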
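The following sketch exercises `continue`, `break`, and unpacking in turn. The sample lists are invented for the illustration.

```python
sequence = [1, 2, None, 4, None, 5]

# continue: skip None values and move on to the next element
total = 0
for value in sequence:
    if value is None:
        continue
    total += value
print(total)  # 12

# break: stop summing as soon as a 5 is reached
sequence2 = [1, 2, 0, 4, 6, 5, 2, 1]
total_until_5 = 0
for value in sequence2:
    if value == 5:
        break
    total_until_5 += value
print(total_until_5)  # 13

# unpacking: each element is a tuple that is split into three names
row_sums = []
for a, b, c in [(1, 2, 3), (4, 5, 6)]:
    row_sums.append(a + b + c)
print(row_sums)  # [6, 15]
```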

_The Basics_ > _Control Flow_ >
#### while loops
A <code>while</code> loop specifies a condition and a block of code that is to be executed until the condition evaluates to <code>False</code> or the loop is explicitly ended with <code>break</code>.

x = 256
total = 0
while x > 0:
    if total > 500:
        break
    total += x
    x = x // 2

_The Basics_ > _Control Flow_ >
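#### Pass
<code>pass</code> is the "no-op" statement in Python. It can be used in blocks where no action is to be taken.

It is also common as a placeholder in code that has not been implemented yet; it is required only because Python uses whitespace to delimit blocks.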
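In the sketch below (the function name `sign_handler` is an assumption for illustration), `pass` fills a branch that has no code yet, so the block is still syntactically valid.

```python
def sign_handler(x):
    if x < 0:
        return 'negative!'
    elif x == 0:
        # TODO: put something smart here
        pass
    else:
        return 'positive!'

print(sign_handler(-1))  # negative!
print(sign_handler(0))   # None, because the pass branch returns nothing
```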

_The Basics_ > _Control Flow_ >
#### Exception handling
Handling Python errors or exceptions gracefully is an important part of building robust programs.

In [102]:
float('1.2345')

1.2345

In [103]:
float('something')

ValueError: could not convert string to float: 'something'

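Suppose we wanted a version of <code>float</code> that fails gracefully, returning the input argument. We can do this by writing a function that encloses the call to <code>float</code> in a <code>try</code>/<code>except</code> block.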

In [104]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x

In [105]:
attempt_float('1.2345')

1.2345

In [106]:
attempt_float('something')

'something'

In [107]:
float((1, 2))

TypeError: float() argument must be a string or a number, not 'tuple'

You might want to only suppress <code>ValueError</code>, since a <code>TypeError</code> might indicate a legitimate bug in your program.

In [108]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x

In [109]:
attempt_float((1, 2))

TypeError: float() argument must be a string or a number, not 'tuple'

In [110]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

In [111]:
attempt_float((1, 2))

(1, 2)

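In some cases you may not want to suppress an exception, but you want some code to be executed regardless of whether the code in the <code>try</code> block succeeds or not. To do this, use <code>finally</code>.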
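A minimal sketch of `else` and `finally`, assuming a hypothetical helper `attempt_float_logged` that records its progress in a list: the `else` block runs only when no exception was raised, and the `finally` block runs in every case.

```python
def attempt_float_logged(x, log):
    """Like attempt_float, but always records that an attempt was made."""
    try:
        result = float(x)
    except ValueError:
        result = x
    else:
        # runs only if the try block raised no exception
        log.append('converted')
    finally:
        # runs whether or not an exception occurred
        log.append('done')
    return result

log = []
print(attempt_float_logged('1.5', log))        # 1.5
print(attempt_float_logged('something', log))  # something
print(log)  # ['converted', 'done', 'done']
```

A typical real use of `finally` is releasing a resource, such as closing a file handle, no matter how the block ends.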

_The Basics_ > _Control Flow_ >
#### range
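The <code>range</code> function produces a sequence of evenly-spaced integers; a start, end, and step can be given, and the end point is not included.
In Python 3, <code>range</code> returns a lazy <code>range</code> object that generates integers one by one rather than building a list, so it is not necessary to use the Python 2 <code>xrange</code> function.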

In [112]:
total = 0  # named total to avoid shadowing the built-in sum function
for i in range(10000):
    # % is the modulo operator
    if i % 3 == 0 or i % 5 == 0:
        total += i

In [113]:
total

23331668

_The Basics_ > _Control Flow_ >
#### Ternary Expressions
A ternary expression in Python allows you to combine an if-else block which produces a value into a single line or expression.

In [114]:
x = 5

In [117]:
'Non-negative' if x >=0 else 'Negative'

'Non-negative'

## Data Structures and Sequence

Mastering the use of Python's built-in data structures is a critical part of becoming a proficient Python programmer.

_Data Structures and Sequence_ > 
### Tuple

A tuple is a **one-dimensional**, **fixed-length**, **immutable** sequence of Python objects. The easiest way to create one is with a comma-separated sequence of values:

In [119]:
tup = 4, 5, 6

In [120]:
tup

(4, 5, 6)

When defining tuples in more complicated expressions, it’s often necessary to enclose the values in parentheses.

In [121]:
nested_tup = (4, 5, 6), (7, 8)

In [122]:
nested_tup

((4, 5, 6), (7, 8))

Any sequence or iterator can be converted to a tuple.

In [123]:
tuple([4, 0, 2])

(4, 0, 2)

In [124]:
tup = tuple('string')

In [125]:
tup

('s', 't', 'r', 'i', 'n', 'g')

Elements can be accessed with square brackets <code>[]</code>.

In [126]:
tup[0]

's'

In [127]:
tup = tuple(['foo', [1, 2], True])

In [128]:
tup[2]

True

In [129]:
tup[2] = False

TypeError: 'tuple' object does not support item assignment

In [130]:
# however
tup[1].append(3)

In [131]:
tup

('foo', [1, 2, 3], True)

Tuples can be concatenated using the <code>+</code> operator to produce longer tuples.

In [132]:
(4, None) + (6, 0)

(4, None, 6, 0)

In [140]:
# ('bar') is just a parenthesized string; a one-element tuple needs a trailing comma: ('bar',)
(4, None, 'foo') + (6, 0) + ('bar')

TypeError: can only concatenate tuple (not "str") to tuple

Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple.

In [141]:
('foo', 'bar') * 4

('foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'bar')

_Data Structures and Sequence_ > _Tuples_ >
#### Unpacking tuples

If you try to _assign_ to a tuple-like expression of variables, Python will attempt to _unpack_ the value on the right-hand side of the equals sign.

In [142]:
tup = (4, 5, 6)

In [143]:
a, b, c = tup

In [145]:
b

5

Even sequences with nested tuples can be unpacked.

In [146]:
tup = 4, 5, (6, 7)

In [147]:
a, b, (c, d) = tup

In [148]:
d

7

Unpacking makes it easy to swap variable names.

In [150]:
b, a = a, b

In [151]:
a

5

A common use of variable unpacking is iterating over sequences of tuples or lists.

Another common use is returning multiple values from a function.
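Both uses can be sketched briefly. The sample data is invented; `divmod` is a built-in function that returns a two-element tuple (quotient, remainder).

```python
# Unpacking while iterating over a sequence of tuples
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
sums = []
for a, b, c in seq:
    sums.append(a + b + c)
print(sums)  # [6, 15, 24]

# Unpacking the tuple returned by a function
quotient, remainder = divmod(17, 5)
print(quotient, remainder)  # 3 2
```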

_Data Structures and Sequence_ > _Tuples_ >
#### Tuple methods
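Since the size and contents of a tuple cannot be modified, it has very few instance methods. One useful method is <code>count</code>, which counts the number of occurrences of a value.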

In [152]:
a = (1, 2, 2, 2, 3, 4, 2)

In [154]:
a.count(2)

4

_Data Structures and Sequence_ >
### List

* Variable-length
* Contents can be modified.
* Defined using square brackets <code>[]</code> or using the list type function.

In [155]:
a_list = [2, 3, 7, None]

In [156]:
tup = ('foo', 'bar', 'bar')

In [178]:
b_list = list(tup)

In [179]:
b_list

['foo', 'bar', 'bar']

In [180]:
b_list[1] = 'peekaboo'

In [181]:
b_list

['foo', 'peekaboo', 'bar']

_Data Structures and Sequence_ > _List_ >
#### Adding and removing elements

**<code>append</code>** method

In [182]:
b_list.append('dwarf')

In [183]:
b_list

['foo', 'peekaboo', 'bar', 'dwarf']

**<code>insert</code>** method

In [184]:
b_list.insert(1, 'red')

In [185]:
b_list

['foo', 'red', 'peekaboo', 'bar', 'dwarf']

The inverse operation to <code>insert</code> is <code>pop</code>, which removes and returns an element.

In [186]:
b_list.pop(2)

'peekaboo'

In [187]:
b_list

['foo', 'red', 'bar', 'dwarf']

In [188]:
b_list.append('foo')

In [189]:
b_list

['foo', 'red', 'bar', 'dwarf', 'foo']

In [190]:
b_list.remove('foo')

In [191]:
b_list

['red', 'bar', 'dwarf', 'foo']

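As shown above, <code>remove</code> locates the first matching value and removes it from the list. If performance is not a concern, by using <code>append</code> and <code>remove</code>, a Python list can be used as a perfectly suitable “multi-set” data structure. You can check whether a list contains a value using the <code>in</code> keyword: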

In [192]:
'dwarf' in b_list

True

_Data Structures and Sequence_ > _List_ >
#### Concatenating and combining lists

Adding two lists together with <code>+</code> concatenates them.

In [193]:
[4, None, 'foo'] + [7, 8, (2, 3)]

[4, None, 'foo', 7, 8, (2, 3)]

**<code>extend</code>** method

In [194]:
x = [4, None, 'foo']

In [195]:
x.extend([7, 8, (2, 3)])

In [196]:
x

[4, None, 'foo', 7, 8, (2, 3)]

List concatenation is a comparatively expensive operation, since a new list must be created and the objects copied over.
Using <code>extend</code> to append elements to an existing list is usually preferable, especially if you are building up a large list.
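The two approaches can be compared side by side. This is only a sketch with a tiny, made-up `list_of_lists`; the cost difference matters when the lists are large.

```python
list_of_lists = [[1, 2], [3, 4], [5, 6]]

# Preferred: extend grows the existing list in place
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)

# Concatenative alternative: each + creates a brand-new list and copies into it
everything_slow = []
for chunk in list_of_lists:
    everything_slow = everything_slow + chunk

print(everything)       # [1, 2, 3, 4, 5, 6]
print(everything_slow)  # [1, 2, 3, 4, 5, 6]
```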

_Data Structures and Sequence_ > _List_ >
#### Sorting

A list can be sorted in-place (without creating a new object) by calling its <code>sort</code> method.

In [197]:
a = [7, 2, 5, 1, 3]

In [198]:
a.sort()

In [199]:
a

[1, 2, 3, 5, 7]

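<code>sort</code> has the ability to take a secondary sort key, i.e. a function that produces a value to use to sort the objects. Compare the default ordering of a list of strings with the ordering by string length: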

In [203]:
b = ['saw', 'small', 'He', 'foxes', 'six']

In [204]:
b.sort(); b

['He', 'foxes', 'saw', 'six', 'small']

In [201]:
b = ['saw', 'small', 'He', 'foxes', 'six']

In [205]:
b.sort(key=len); b

['He', 'saw', 'six', 'foxes', 'small']

_Data Structures and Sequence_ > _List_ >
#### Slicing

Select sections of list-like types by using slice notation with indexing operator <code>[]</code>.

![](Pics4PythonEssentials/picture_0_4.png)

In [206]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]

In [207]:
seq[1:5]

[2, 3, 7, 5]

In [208]:
seq[3:4] = [6,3]

In [209]:
seq

[7, 2, 3, 6, 3, 5, 6, 0, 1]

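While the element at the start index is included, the stop index is not included, so the number of elements in the result is <code>stop - start</code>. Either the start or the stop can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively. Negative indices slice the sequence relative to the end.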

In [210]:
seq

[7, 2, 3, 6, 3, 5, 6, 0, 1]

In [211]:
seq[:5]

[7, 2, 3, 6, 3]

In [212]:
seq[3:]

[6, 3, 5, 6, 0, 1]

In [213]:
seq[-4:]

[5, 6, 0, 1]

In [214]:
seq[-6:-2]

[6, 3, 5, 6]

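A **step** can also be given after a second colon to, say, take every other element. Passing a step of -1 reverses the sequence.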

In [215]:
seq[::2]

[7, 3, 3, 6, 1]

In [216]:
seq[::-1]

[1, 0, 6, 5, 3, 6, 3, 2, 7]

_Data Structures and Sequence_ > 
### Built-in Sequence Functions

_Data Structures and Sequence_ > _Built-in Sequence Functions_ >
#### Enumerate

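It is common when iterating over a sequence to want to keep track of the index of the current item. A do-it-yourself approach would look like:

i = 0
for value in collection:
    # do something with value
    i += 1

Since this is so common, Python has a built-in function <code>enumerate</code> which returns a sequence of <code>(i, value)</code> tuples:

for i, value in enumerate(collection):
    # do something with value
    pass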

A useful pattern that uses <code>enumerate</code> is computing a <code>dict</code> mapping the values of a sequence (which are assumed to be unique) to their locations in the sequence.

In [218]:
some_list = ['foo', 'bar', 'baz']

In [219]:
mapping = dict((v, i) for i, v in enumerate(some_list))

In [220]:
mapping

{'bar': 1, 'baz': 2, 'foo': 0}

_Data Structures and Sequence_ > _Built-in Sequence Functions_ >
#### sorted
The sorted function returns a new sorted list from the elements of any sequence.

In [221]:
sorted([7, 1, 2, 6, 0, 3, 2])

[0, 1, 2, 2, 3, 6, 7]

In [222]:
sorted('horse race')

[' ', 'a', 'c', 'e', 'e', 'h', 'o', 'r', 'r', 's']

In [223]:
sorted(set('this is just some string'))

[' ', 'e', 'g', 'h', 'i', 'j', 'm', 'n', 'o', 'r', 's', 't', 'u']

_Data Structures and Sequence_ > _Built-in Sequence Functions_ >
#### zip
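<code>zip</code> “pairs” up the elements of a number of lists, tuples, or other sequences. In Python 3 it returns an iterator of tuples, which can be passed to <code>list</code> to create a list of tuples.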

In [224]:
seq1 = ['foo', 'bar', 'baz']

In [225]:
seq2 = ['one', 'two', 'three']

In [229]:
z = zip(seq1, seq2)

In [230]:
z

<zip at 0x106904808>

In [231]:
list(z)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

A common use of <code>zip</code> is simultaneously iterating over multiple sequences, possibly also combined with <code>enumerate</code>.

In [232]:
for i, (a, b) in enumerate(zip(seq1, seq2)):
    print('%d: %s, %s' % (i, a, b))

0: foo, one
1: bar, two
2: baz, three


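Given a “zipped” sequence, <code>zip</code> can be applied in a clever way to “unzip” the sequence. Another way to think about this is converting a list of rows into a list of columns: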

In [233]:
pitchers = [('Nolan', 'Ryan'), ('Roger', 'Clemens'), ('Schilling', 'Curt')]

In [234]:
first_names, last_names = zip(*pitchers)

In [235]:
first_names

('Nolan', 'Roger', 'Schilling')

In [236]:
last_names

('Ryan', 'Clemens', 'Curt')

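Note the use of <code>*</code> in the call: <code>zip(*pitchers)</code> is equivalent to <code>zip(pitchers[0], pitchers[1], pitchers[2])</code>.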

_Data Structures and Sequence_ > _Built-in Sequence Functions_ >
#### reversed
<code>reversed</code> iterates over the elements of a sequence in reverse order.

In [237]:
list(reversed(range(10)))

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

_Data Structures and Sequence_ > 
### Dict

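A dict is also known as a _hash map_ or _associative array_.
It is a flexibly-sized collection of _key-value_ pairs, where _key_ and _value_ are Python objects. One way to create one is by using curly braces <code>{}</code> and colons to separate keys and values.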

In [238]:
empty_dict = {}

In [239]:
d1 = {'a': 'some values', 'b':[1, 2, 3, 4]}

In [240]:
d1

{'a': 'some values', 'b': [1, 2, 3, 4]}

Elements can be accessed and inserted or set

In [241]:
d1[7] = 'an integer'

In [242]:
d1

{'a': 'some values', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [243]:
d1['b']

[1, 2, 3, 4]

Check if a dict contains a key

In [244]:
'b' in d1

True

Values can be deleted either using the <code>del</code> keyword or the <code>pop</code> method

In [247]:
d1[5] = 'some values'; d1['dummy'] = 'another value'; d1

{'a': 'some values',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 5: 'some values',
 'dummy': 'another value'}

In [249]:
del d1[5]

In [250]:
d1

{'a': 'some values',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 'dummy': 'another value'}

In [251]:
ret = d1.pop('dummy'); ret

'another value'

The <code>keys</code> and <code>values</code> methods give you iterable views of the dict's keys and values, respectively.

In [252]:
d1.keys()

dict_keys(['a', 'b', 7])

In [253]:
d1.values()

dict_values(['some values', [1, 2, 3, 4], 'an integer'])

In Python 3, <code>dict.keys()</code> and <code>dict.values()</code> return view objects instead of lists; pass them to <code>list</code> if you need an actual list.

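#### <code>update</code> method

One dict can be merged into another using <code>update</code>. Keys that already exist have their old values overwritten.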

In [254]:
d1.update({'b': 'foo', 'c': 12})

In [255]:
d1

{'a': 'some values', 'b': 'foo', 7: 'an integer', 'c': 12}

_Data Structures and Sequence_ > _Dict_ >
#### Creating dicts from sequences

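It is common to end up with two sequences that you want to pair up element-wise in a dict. Since a dict is essentially a collection of 2-tuples, the <code>dict</code> function accepts a sequence of 2-tuples: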

In [256]:
mapping = dict(zip(range(5), reversed(range(5)))); mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

_Data Structures and Sequence_ > _Dict_ >
#### Default values
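It’s very common to look up a key and fall back to a default value when the key is missing, which would otherwise require an explicit if-else block.

The dict methods <code>get</code> and <code>pop</code> can take a default value to be returned, which makes that block unnecessary.

<code>get</code> by default will return <code>None</code> if the key is not present, while <code>pop</code> will raise a <code>KeyError</code>.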
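The lookup-with-fallback logic can be sketched as follows. `some_dict` and the fallback values are made up for the example.

```python
some_dict = {'a': 1, 'b': 2}

# The verbose pattern: look the key up, fall back to a default otherwise
if 'c' in some_dict:
    value = some_dict['c']
else:
    value = 0

# The same logic with get and an explicit default
value2 = some_dict.get('c', 0)

# Without a default, get returns None for a missing key
missing = some_dict.get('c')

# pop with a default returns it instead of raising KeyError
popped = some_dict.pop('c', 'not found')

print(value, value2, missing, popped)  # 0 0 None not found
```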

In [257]:
words = ['apple', 'bat', 'bar', 'atom', 'book']

In [258]:
by_letter = {}

In [259]:
for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter] = [word]
    else:
        by_letter[letter].append(word)

In [260]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

In [261]:
by_letter = {}

In [262]:
for word in words:
    letter = word[0]
    by_letter.setdefault(letter, []).append(word)

In [263]:
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

The built-in <code>collections</code> module has a useful class, <code>defaultdict</code>, which makes this grouping pattern even easier.

The initializer to <code>defaultdict</code> only needs to be a callable object (e.g. any function), not necessarily a type.
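A short sketch of `defaultdict`, reusing the word list from the grouping example. The second part, with a lambda that returns 4, is an arbitrary illustration of passing a callable that is not a type.

```python
from collections import defaultdict

words = ['apple', 'bat', 'bar', 'atom', 'book']

# Missing keys are initialised by calling list(), i.e. with an empty list
by_letter = defaultdict(list)
for word in words:
    by_letter[word[0]].append(word)

print(dict(by_letter))  # {'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book']}

# Any callable works as the default factory, not only a type
counts = defaultdict(lambda: 4)
counts['x'] += 1
print(counts['x'])  # 5
```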

_Data Structures and Sequence_ > _Dict_ >
#### Valid dict key types

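While the values of a dict can be any Python object, the keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too).
The technical term here is _hashability_. You can check whether an object is hashable with the <code>hash</code> function.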

In [264]:
hash('string')

1559144700222220181

In [265]:
hash((1, 2, (2, 3)))

1097636502276347782

In [266]:
hash((1, 2, [2, 3])) # fails because lists are mutable

TypeError: unhashable type: 'list'

In [267]:
d = {}

In [268]:
d[tuple([1, 2, 3])] = 5

In [269]:
d

{(1, 2, 3): 5}

_Data Structures and Sequence_ > 
### Set

* An unordered collection of unique elements
* Like <code>dicts</code>, but keys only, no values
* Created in two ways: via the set function or using a set literal with curly braces

In [270]:
set([2, 2, 2, 1, 3, 3])

{1, 2, 3}

In [271]:
{2, 2, 2, 1, 3, 3}

{1, 2, 3}

Sets support mathematical _set operations_ like <code>union</code>, <code>intersection</code>, <code>difference</code>, and <code>symmetric_difference</code>.

In [272]:
a = {1, 2, 3, 4, 5}

In [273]:
b = {3, 4, 5, 6, 7, 8}

In [274]:
a | b

{1, 2, 3, 4, 5, 6, 7, 8}

In [275]:
a & b

{3, 4, 5}

In [276]:
a - b

{1, 2}

In [277]:
a ^ b

{1, 2, 6, 7, 8}

![](Pics4PythonEssentials/picture_0_5.png)

Check if a set is a subset of (is contained in) or a superset of (contains all elements of) another set:

In [279]:
a_set = {1, 2, 3, 4, 5}

In [280]:
{1, 2, 3}.issubset(a_set)

True

In [281]:
a_set.issuperset({1, 2, 3})

True

Sets are equal if their contents are equal.

In [282]:
{1, 2, 3} == {3, 2, 1}

True

_Data Structures and Sequence_ > 
### List, Set, and Dict Comprehensions

**_List comprehensions_** are one of the most-loved Python language features.
They allow you to form a new list by filtering the elements of a collection and transforming the elements passing the filter in one concise expression. They take the basic form <code>[expr for val in collection if condition]</code>.

The filter condition can be omitted, leaving only the expression.

In [283]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']

In [284]:
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

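A **<code>dict</code>** comprehension looks like <code>{key_expr : value_expr for value in collection if condition}</code>.

A **<code>set</code>** comprehension looks like the equivalent list comprehension, except with curly braces instead of square brackets: <code>{expr for value in collection if condition}</code>.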

In [286]:
unique_length = {len(x) for x in strings}; unique_length

{1, 2, 3, 4, 6}

In [287]:
loc_mapping = {val : index for index, val in enumerate(strings)}; loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

In [289]:
loc_mapping = dict((val, idx) for idx, val in enumerate(strings)); loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

_Data Structures and Sequence_ > _List, Set, and Dict Comprehensions_ >
#### Nested list comprehensions
Suppose we have a list of lists containing some boy and girl names:

In [290]:
all_data = [['Tom', 'Billy', 'Jefferson', 'Andrew', 'Wesley', 'Steven', 'Joe'],
            ['Susie', 'Casey', 'Jill', 'Ana', 'Eva', 'Jennifer', 'Stephanie']]

You might have gotten these names from a couple of files and decided to keep the boy and girl names separate. Now, suppose we wanted to get a single list containing all names with two or more e’s in them.

Wrap this whole operation up in a single _nested list comprehension_.

In [291]:
result = [name for names in all_data for name in names
         if name.count('e') >= 2] ; result

['Jefferson', 'Wesley', 'Steven', 'Jennifer', 'Stephanie']

Here is another example, where we “flatten” a list of tuples of integers into a simple list of integers:

In [292]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [294]:
flattened = [x for tup in some_tuples for x in tup]; flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

The order of the <code>for</code> expressions would be the same if you wrote a nested <code>for</code> loop instead of a list comprehension.
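To make the ordering concrete, here is the flattening comprehension written as an explicit nested loop (a sketch reusing the `some_tuples` data from the example):

```python
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

flattened = []
for tup in some_tuples:  # outer loop comes first, as in the comprehension
    for x in tup:        # inner loop comes second
        flattened.append(x)

print(flattened)  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```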

It’s important to distinguish the above syntax from a list comprehension inside a list comprehension, which is also perfectly valid.

In [295]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

## Functions

Functions are the primary and most important method of code organization and reuse in Python. They are declared using the <code>def</code> keyword and returned from using the <code>return</code> keyword.

If the end of a function is reached without encountering a <code>return</code> statement, <code>None</code> is returned.

Each function can have some number of _positional_ arguments and some number of _keyword_ arguments. Keyword arguments are most commonly used to specify default values or optional arguments. The main restriction is that the keyword arguments must follow the positional arguments (if any).
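As an illustrative sketch (the names `my_function` and `no_return` are assumptions, not taken from the text above), the first function has two positional arguments and one keyword argument with a default, and the second shows the implicit `None` return.

```python
def my_function(x, y, z=1.5):
    """x and y are positional arguments; z is a keyword argument with a default."""
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

def no_return():
    """Reaches the end without a return statement, so None is returned."""
    pass

print(my_function(5, 6, z=0.7))   # z given by name
print(my_function(3.14, 7, 3.5))  # z passed positionally
print(my_function(10, 20))        # z falls back to its default: 45.0
print(no_return())                # None
```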

_Functions_ >
### Namespaces, Scope, and Local Functions

Functions can access variables in two different scopes: _global_ and _local_. An alternate and more descriptive name describing a variable scope in Python is a _namespace_.

Assigning global variables within a function is possible, but those variables must be declared as global using the <code>global</code> keyword:

In [296]:
a = None

In [297]:
def bind_a_variable():
    global a
    a = []

bind_a_variable()

In [298]:
print(a)

[]


Functions can be declared anywhere, and there is no problem with having local functions that are dynamically created when a function is called:
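The sketch below illustrates both ideas; all function names are hypothetical. `inner_function` is created anew each time `outer_function` is called, and the list `a` inside `func` lives in the local namespace of a single call.

```python
def outer_function(x, y, z):
    # inner_function exists only while outer_function is running
    def inner_function(a, b, c):
        return a + b + c
    return inner_function(x, y, z) * 2

def func():
    a = []  # local: created on each call, discarded when func returns
    for i in range(5):
        a.append(i)
    return a

print(outer_function(1, 2, 3))  # 12
print(func())                   # [0, 1, 2, 3, 4]
```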

_Functions_ >
### Returning Multiple Values

A function can return multiple values by listing them, separated by commas, in the <code>return</code> statement.

You may realize that what’s happening here is that the function is actually just returning _one object_, namely a tuple, which is then being unpacked into the result variables.

If the result of a call to a function returning three values is instead assigned to a single name, say **return_value**, it would be, as you may guess, a 3-tuple with the three returned variables. A potentially attractive alternative to returning multiple values like above might be to return a <code>dict</code> instead.
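A minimal sketch of the three variants discussed here. The functions `f` and `f_dict` and their values are illustrative assumptions.

```python
def f():
    a = 5
    b = 6
    c = 7
    return a, b, c  # actually returns one tuple: (5, 6, 7)

a, b, c = f()       # the tuple is unpacked into three names
return_value = f()  # without unpacking, we keep the 3-tuple itself

def f_dict():
    a = 5
    b = 6
    c = 7
    return {'a': a, 'b': b, 'c': c}  # alternative: named results in a dict

print(return_value)   # (5, 6, 7)
print(f_dict()['b'])  # 6
```

Returning a dict lets callers pick results by name rather than by position, at the cost of a slightly more verbose call site.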

_Functions_ >
### Functions Are Objects
Since Python functions are objects, many constructs can be easily expressed that are difficult to do in other languages.

In [299]:
states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda',
          'south carolina##', 'West virginia?']

In [300]:
import re # Regular expression module

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value) # remove punctuation
        value = value.title()
        result.append(value)
        
    return result

In [301]:
clean_strings(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

In [302]:
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

In [303]:
clean_ops = [str.strip, remove_punctuation, str.title]

In [304]:
def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

In [306]:
clean_strings(states, clean_ops)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

A more functional pattern enables you to easily modify how the strings are transformed at a very high level.  You can naturally use functions as arguments to other functions.

In [307]:
list(map(remove_punctuation, states))

[' Alabama ',
 'Georgia',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina',
 'West virginia']

_Functions_ >
### Anonymous (lambda) Functions

Anonymous, or lambda, functions are simple functions consisting of a single expression, the result of which is the return value. They are defined using the <code>lambda</code> keyword.

It’s often less typing (and clearer) to pass a lambda function as opposed to writing a full-out function declaration or even assigning the lambda function to a local variable.

In [308]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]

In [309]:
ints = [4, 0, 1, 5, 6]

In [310]:
apply_to_list(ints, lambda x: x * 2)

[8, 0, 2, 10, 12]

Sort a collection of strings by the number of distinct letters in each string.

In [311]:
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

In [312]:
strings.sort(key=lambda x: len(set(list(x)))); strings

['aaaa', 'foo', 'abab', 'bar', 'card']

_Functions_ >
### Closures: Functions that Return Functions
A closure is any _dynamically-generated_ function returned by another function. The returned function has access to the variables in the local namespace where it was created.

The internal state of a closure can just as easily be a mutable object like a dict, set, or list that is modified on each call.
For example, here is a function that returns a function that keeps track of the arguments it has been called with:

In [313]:
def make_watcher():
    have_seen = {}
    
    def has_been_seen(x):
        if x in have_seen:
            return True
        else:
            have_seen[x] = True
            return False
        
    return has_been_seen

In [314]:
watcher = make_watcher()

In [315]:
?watcher

In [316]:
vals = [5, 6, 1, 5, 1, 6, 3, 5]

In [317]:
[watcher(x) for x in vals]

[False, False, False, True, True, True, False, True]

One technical limitation to keep in mind is that while you can mutate any internal state objects, a plain assignment inside the inner function cannot rebind variables in the enclosing function scope (in Python 3, the <code>nonlocal</code> statement lifts this restriction).
One way to work around this is to modify a dict or list rather than rebinding variables.
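A minimal sketch of both approaches, using a hypothetical call counter: the first version mutates a list held in the enclosing scope, the second uses the Python 3 `nonlocal` statement to rebind the variable directly.

```python
def make_counter_with_list():
    count = [0]  # mutable container holding the state

    def counter():
        count[0] += 1  # mutates the list; the name count is never rebound
        return count[0]

    return counter

def make_counter_with_nonlocal():
    count = 0

    def counter():
        nonlocal count  # Python 3 only: allow rebinding the enclosing variable
        count += 1
        return count

    return counter

c1 = make_counter_with_list()
c2 = make_counter_with_nonlocal()
print(c1(), c1(), c1())
print(c2(), c2())
```

Without the `nonlocal` line, `count += 1` in the second version would raise `UnboundLocalError`, because the assignment makes `count` local to `counter`.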

Here is another example: creating a string formatting function.

In [318]:
def format_and_pad(template, space):
    def formatter(x):
        return (template % x).rjust(space)
    return formatter

Create a floating point formatter that always returns a length-15 string.

In [319]:
fmt = format_and_pad('%.4f', 15)

In [320]:
fmt(1.756)

'         1.7560'

_Functions_ >
### Extended Call Syntax with \*args, \*\*kwargs
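When you call a function, the positional and keyword arguments are packed up into a tuple and a dict, respectively. A function declared with <code>*args</code> and <code>**kwargs</code> receives a <code>tuple</code> **args** and a <code>dict</code> **kwargs**, which it can inspect or forward to another function: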

In [321]:
def say_hello_then_call_f(f, *args, **kwargs):
    print('args is', args)
    print('kwargs is', kwargs)
    print("Hello! Now I'm going to call %s" % f)
    return f(*args, **kwargs)

def g(x, y, z=1):
    return (x + y) / z

In [322]:
say_hello_then_call_f(g, 1, 2, z=5.)

args is (1, 2)
kwargs is {'z': 5.0}
Hello! Now I'm going to call <function g at 0x106b1e400>


0.6
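The forwarding step `f(*args, **kwargs)` can be looked at in isolation. This sketch repeats the definition of `g` from above so that it runs on its own; the names `result` and `same` are only for illustration.

```python
def g(x, y, z=1):
    return (x + y) / z

args = (1, 2)
kwargs = {'z': 5.0}

# * unpacks the tuple into positional arguments,
# ** unpacks the dict into keyword arguments
result = g(*args, **kwargs)

# Equivalent call written out by hand
same = g(1, 2, z=5.0)

print(result, same)
```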

_Functions_ >
### Currying: Partial Argument Application
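_Currying_ is a computer science term which means deriving new functions from existing ones by _partial argument application_.
For example, fixing the first argument of the two-argument function <code>add_numbers</code> defined below yields a new function of one variable; the second argument to <code>add_numbers</code> is said to be curried.
The built-in **functools** module can simplify this process using the **partial** function: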

In [323]:
def add_numbers(x, y):
    return x + y

In [324]:
from functools import partial
add_five = partial(add_numbers, 5)

In [325]:
add_five(3)

8

The same function of one variable, <code>add_five</code>, which adds 5 to its argument, can also be derived from <code>add_numbers</code> with a lambda:

In [326]:
add_five = lambda y: add_numbers(5, y)

In [327]:
add_five(3)

8
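`partial` can also fix keyword arguments. As a small sketch, the built-in `int` accepts a `base` keyword, so a binary parser can be derived from it; the name `parse_binary` is made up for this example.

```python
from functools import partial

# Fix the keyword argument base=2 of the built-in int()
parse_binary = partial(int, base=2)
print(parse_binary('1010'))

# A partial object records the wrapped function and the fixed arguments
print(parse_binary.func, parse_binary.args, parse_binary.keywords)
```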

_Functions_ >
### Generators
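Having a consistent way to iterate over sequences, like objects in a list or lines in a file, is an important Python feature. This is accomplished by means of the _iterator protocol_, a generic way to make objects iterable. For example, iterating over a dict yields the dict keys; when you write <code>for key in some_dict</code>, the interpreter first creates an iterator from <code>some_dict</code>, as <code>iter</code> does explicitly.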

In [328]:
some_dict = {'a': 1, 'b': 2, 'c': 3}

In [329]:
for key in some_dict:
    print(key)

a
b
c


In [330]:
dict_iterator = iter(some_dict)

In [331]:
dict_iterator

<dict_keyiterator at 0x106c92b38>

In [332]:
list(dict_iterator)

['a', 'b', 'c']
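What a `for` loop does with that iterator can be written out by hand. This is only a sketch of the protocol, not how you would normally iterate: `next` is called repeatedly until the iterator raises `StopIteration`.

```python
some_dict = {'a': 1, 'b': 2, 'c': 3}

it = iter(some_dict)  # ask the dict for an iterator over its keys
keys = []
while True:
    try:
        key = next(it)  # fetch the next key
    except StopIteration:
        break           # the iterator is exhausted
    keys.append(key)

print(keys)
```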

A generator is a simple way to construct a new iterable object.
Generators return a sequence of values lazily, pausing after each one until the next one is requested.
To create a generator, use the <code>yield</code> keyword instead of <code>return</code> in a function:

In [339]:
def squares(n=10):
    for i in range(1, n + 1):
        print('Generating squares from 1 to %d' % (n ** 2))
        yield i ** 2

When you actually call the generator, no code is immediately executed.

In [340]:
gen = squares(); gen

<generator object squares at 0x106ca1620>

It is not until you request elements from the generator that it begins executing its code:

In [341]:
for x in gen:
    print(x)

Generating squares from 1 to 100
1
Generating squares from 1 to 100
4
Generating squares from 1 to 100
9
Generating squares from 1 to 100
16
Generating squares from 1 to 100
25
Generating squares from 1 to 100
36
Generating squares from 1 to 100
49
Generating squares from 1 to 100
64
Generating squares from 1 to 100
81
Generating squares from 1 to 100
100
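The pause-and-resume behaviour can also be driven by hand with `next`. The sketch below uses `squares_quiet`, a hypothetical variant of `squares` without the `print` call, and `itertools.islice` to take only a few values.

```python
from itertools import islice

def squares_quiet(n=10):
    # Hypothetical variant of squares() without the print call
    for i in range(1, n + 1):
        yield i ** 2

gen = squares_quiet()
first = next(gen)                   # runs the body up to the first yield
second = next(gen)                  # resumes right after the first yield
next_three = list(islice(gen, 3))   # takes the next three values only
rest = list(gen)                    # drains whatever is left

print(first, second, next_three, rest)
```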


Find all unique ways to make change for $1 (100 cents) using an arbitrary set of coins.

In [342]:
def make_change(amount, coins=[1, 5, 10, 25], hand=None): 
    hand = [] if hand is None else hand
    if amount == 0:
        yield hand
    for coin in coins:
        # ensures we don't give too much change, and combinations are unique 
        if coin > amount or (len(hand) > 0 and hand[-1] < coin):
            continue
        
        for result in make_change(amount - coin, coins=coins, hand=hand + [coin]):
            yield result

In [343]:
for way in make_change(100, coins=[10, 25, 50]):
    print(way)

[10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
[25, 25, 10, 10, 10, 10, 10]
[25, 25, 25, 25]
[50, 10, 10, 10, 10, 10]
[50, 25, 25]
[50, 50]


_Functions_ > _Generators_ >
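#### Generator expressions
Another way to make a generator is by using a _generator expression_. This is the generator analogue of list, dict, and set comprehensions: enclose what would otherwise be a list comprehension within parentheses instead of brackets.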

In [344]:
gen = (x**2 for x in range(100)); gen

<generator object <genexpr> at 0x106c94af0>

In [345]:
?gen

In [346]:
list(gen)

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529,
 576,
 625,
 676,
 729,
 784,
 841,
 900,
 961,
 1024,
 1089,
 1156,
 1225,
 1296,
 1369,
 1444,
 1521,
 1600,
 1681,
 1764,
 1849,
 1936,
 2025,
 2116,
 2209,
 2304,
 2401,
 2500,
 2601,
 2704,
 2809,
 2916,
 3025,
 3136,
 3249,
 3364,
 3481,
 3600,
 3721,
 3844,
 3969,
 4096,
 4225,
 4356,
 4489,
 4624,
 4761,
 4900,
 5041,
 5184,
 5329,
 5476,
 5625,
 5776,
 5929,
 6084,
 6241,
 6400,
 6561,
 6724,
 6889,
 7056,
 7225,
 7396,
 7569,
 7744,
 7921,
 8100,
 8281,
 8464,
 8649,
 8836,
 9025,
 9216,
 9409,
 9604,
 9801]

This is completely equivalent to the following more verbose generator:

In [347]:
def _make_gen():
    for x in range(100):
        yield x**2

In [348]:
gen = _make_gen()

In [349]:
gen?

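Generator expressions can be used instead of list comprehensions as function arguments in many cases. The built-in <code>sum</code> consumes one directly, with no need for NumPy:

In [364]:
sum(x ** 2 for x in range(100))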

328350

In [358]:
dict((i, i ** 2) for i in range(5))

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}
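One property worth keeping in mind is that a generator can be consumed only once. A short sketch, with illustrative variable names:

```python
gen = (x ** 2 for x in range(5))

first_pass = list(gen)   # consumes every value
second_pass = list(gen)  # the generator is already exhausted

# Passing a fresh generator expression avoids building an intermediate list
total = sum(x ** 2 for x in range(5))
largest = max(x ** 2 for x in range(5))

print(first_pass, second_pass, total, largest)
```

If the values are needed more than once, materialize them with `list` first.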

_Functions_ > _Generators_ >
#### itertools module
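The standard library _itertools_ module has a collection of generators for many common data algorithms. For example, <code>groupby</code> takes any sequence and a function, grouping consecutive elements in the sequence by the return value of the function.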

In [365]:
import itertools

In [366]:
first_letter = lambda x: x[0]

In [367]:
names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']

In [368]:
for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group)) # group is a lazy iterator, not a list

A ['Alan', 'Adam']
W ['Wes', 'Will']
A ['Albert']
S ['Steven']
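Because `groupby` only merges consecutive elements, `'A'` appears twice in the output above. A common pattern, sketched here with the same names, is to sort by the grouping key first so that each key appears once.

```python
import itertools

names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
first_letter = lambda x: x[0]

# Sort by the same key so that equal keys become adjacent
sorted_names = sorted(names, key=first_letter)

grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted_names, first_letter)
}
print(grouped)
```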


![](Pics4PythonEssentials/picture_0_6.png)

## Files and the operating system

To open a file for reading or writing, use the built-in <code>open</code> function with either a relative or absolute file path:

In [369]:
path = 'segismundo.txt'

In [370]:
f = open(path)

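By default, the file is opened in read-only mode <code>'r'</code>. An open file can be iterated over line by line; the lines keep their end-of-line markers, so they are often stripped: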

In [371]:
lines = [x.rstrip() for x in open(path)]

In [372]:
lines

['Sueña el rico en su riqueza,',
 'que más cuidados le ofrece;',
 '',
 'sueña el pobre que padece',
 'su miseria y su pobreza;',
 '',
 'sueña el que a medrar empieza,',
 'sueña el que afana y pretende,',
 'sueña el que agravia y ofende,',
 '',
 'y en el mundo, en conclusión,',
 'todos sueñan lo que son,',
 'aunque ninguno lo entiende.',
 '']

If we had typed <code>f = open(path, 'w')</code>, a _new file_ at the path would have been created, overwriting any existing file at that location.

![](Pics4PythonEssentials/picture_0_7.png)
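The difference between the `'w'` and `'a'` modes can be tried safely in a temporary directory. This is a minimal sketch: the file name `demo.txt` is hypothetical, and the `with` statement closes each file automatically when its block ends.

```python
import os
import tempfile

# Work in a throwaway directory so no real file is overwritten
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write('first\n')

    # 'w' truncates: the previous contents are gone
    with open(demo_path, 'w', encoding='utf-8') as f:
        f.write('second\n')

    # 'a' appends to the end of the existing file
    with open(demo_path, 'a', encoding='utf-8') as f:
        f.write('third\n')

    with open(demo_path, encoding='utf-8') as f:
        contents = f.read()

print(repr(contents))
```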

To write text to a file, you can use either the file’s <code>write</code> or <code>writelines</code> methods.

In [374]:
with open('tmp.txt', 'w') as handle:
    handle.writelines(x for x in open(path) if len(x) > 1)

In [375]:
open('tmp.txt').readlines()

['Sueña el rico en su riqueza,\n',
 'que más cuidados le ofrece;\n',
 'sueña el pobre que padece\n',
 'su miseria y su pobreza;\n',
 'sueña el que a medrar empieza,\n',
 'sueña el que afana y pretende,\n',
 'sueña el que agravia y ofende,\n',
 'y en el mundo, en conclusión,\n',
 'todos sueñan lo que son,\n',
 'aunque ninguno lo entiende.\n']

![](Pics4PythonEssentials/picture_0_8.png)
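Note that `writelines` does not add line endings for you. The sketch below, which writes to a temporary file with a made-up name, combines `write` for a single string with `writelines` for an iterable of strings.

```python
import os
import tempfile

lines = ['uno', 'dos', 'tres']  # sample data without trailing newlines

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'out.txt')  # hypothetical file name

    with open(out_path, 'w', encoding='utf-8') as handle:
        handle.write('header\n')
        # writelines writes each string as-is, so newlines are added here
        handle.writelines(line + '\n' for line in lines)

    with open(out_path, encoding='utf-8') as handle:
        result = handle.readlines()

print(result)
```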