# Built-in Data Structures, Functions, and Files

## Data Structures and Sequences

### Tuple
* A tuple is a fixed-length, immutable sequence of Python objects.
* The easiest way to create one is with a comma-separated sequence of values:

In [2]:
t = 1,2,3,4,5
t

(1, 2, 3, 4, 5)

#### Creating a Nested Tuple
* When you’re defining tuples in more complicated expressions, it’s often necessary to enclose the values in parentheses

In [3]:
nested_tuple = (1,2,3),(4,5)
nested_tuple

((1, 2, 3), (4, 5))

#### Converting to a Tuple
* You can convert any sequence or iterator to a tuple by invoking tuple:

In [4]:
tuple([1,2,3,4])

(1, 2, 3, 4)

In [5]:
tuple('Naga')

('N', 'a', 'g', 'a')

#### Modifying a mutable object inside a Tuple
* If an object inside a tuple is mutable, such as a list, you can modify it in-place:

In [10]:
t = tuple(['naga',[1,2,3],True])
t

('naga', [1, 2, 3], True)

In [11]:
t[1].append(4)
t

('naga', [1, 2, 3, 4], True)

#### Concatenating Tuples
* You can concatenate tuples using the + operator to produce longer tuples:

In [12]:
(1,None,'Naga')+(6,1)+('Snigdha',)

(1, None, 'Naga', 6, 1, 'Snigdha')

* Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple:

`When we perform this operation, note that the objects themselves are not copied, only the references to them.`

In [13]:
('Naga','Snigdha')*5

('Naga',
 'Snigdha',
 'Naga',
 'Snigdha',
 'Naga',
 'Snigdha',
 'Naga',
 'Snigdha',
 'Naga',
 'Snigdha')

In [14]:
(1,2)*4

(1, 2, 1, 2, 1, 2, 1, 2)
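
* The note above about references can be checked with a small sketch. The names `inner` and `repeated` are illustrative and not part of the original notebook; the repeated tuple holds three references to one list, so a change to that list shows up in every slot:

```python
inner = []                      # one mutable list
repeated = (inner,) * 3         # three references to the same list, not three copies
inner.append('x')

print(repeated)                                  # (['x'], ['x'], ['x'])
print(all(item is inner for item in repeated))   # True
```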

#### Unpacking tuples
* If you try to assign to a tuple-like expression of variables, Python will attempt to unpack the value on the right-hand side of the equals sign:

In [17]:
t = (1,2,3)
a,b,c=t
a

1

* Even `sequences with nested tuples can be unpacked`:

In [22]:
t = 1,2,(3,4)
a,b,(c,d)=t
d

4

* Using this functionality you can easily swap variable names, a task which in many
languages might look like:
```
tmp = a
a = b
b = tmp
```
* But, in Python, the swap can be done like this:

In [23]:
a,b=1,2
a,b

(1, 2)

In [24]:
b,a=a,b
a,b

(2, 1)

#### When and where do we use tuple unpacking?
* A common use of variable unpacking is iterating over sequences of tuples or lists
* Another common use is returning multiple values from a function

In [25]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
seq

[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [27]:
for a,b,c in seq:
    print("a={0}, b={1}, c={2}".format(a,b,c))

a=1, b=2, c=3
a=4, b=5, c=6
a=7, b=8, c=9
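
* The second use mentioned above, returning multiple values from a function, can be sketched as follows. The helper `min_max` is a made-up example; it returns a tuple that the caller unpacks in one step:

```python
def min_max(values):
    """Return the smallest and largest item as a 2-tuple."""
    return min(values), max(values)

lowest, highest = min_max([3, 1, 4, 1, 5])   # unpack the returned tuple
print(lowest, highest)   # 1 5
```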


#### Advanced `tuple` unpacking
* Sometimes you want to “pluck” a few elements from the beginning of a tuple. For this there is a special syntax, `*rest`, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [28]:
values = 1,2,3,4,5,6
values

(1, 2, 3, 4, 5, 6)

In [29]:
a,b,*rest= values
a,b

(1, 2)

##### Important Note on advanced `tuple` unpacking
* Sometimes the `rest` part is something you want to discard; there is nothing special about the name `rest`.
* As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [32]:
a, b, *_ = values
a,b

(1, 2)
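
* As a small sketch of two further details: the starred name always receives a `list` (possibly empty), and it does not have to come last. The variable names here are illustrative:

```python
values = 1, 2, 3, 4, 5, 6

first, *middle, last = values
print(first, middle, last)   # 1 [2, 3, 4, 5] 6

a, b, *rest = (1, 2)         # nothing left over for the starred name
print(rest)                  # []
```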

### Tuple Methods
1. `count()` - counts the number of occurrences of a value
2. `index()` - gets the index of the first occurrence of a value; raises `ValueError` if the element is not present in the tuple

In [36]:
t = 1,2,3,4

In [36]:
t.count(1)

1

In [37]:
t.index(2)

1

In [35]:
t.index(5)

ValueError: tuple.index(x): x not in tuple

### List
* In contrast with tuples, lists are variable-length and their contents can be modified in-place.
* You can define them using square brackets [] or using the list type function:

In [38]:
a = [1,2,4,None]
a

[1, 2, 4, None]

In [40]:
tup = (1,2,4,"Naga")
a = list(tup)
a

[1, 2, 4, 'Naga']

* Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.
* The `list` function is frequently used in data processing as a `way to materialize an iterator or generator expression`:

In [42]:
gen = range(10)
gen

range(0, 10)

In [43]:
list(gen)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

#### Adding and removing elements

In [82]:
a = [1,2,4,5]

* Elements can be appended to the end of the list with the `append` method:

In [83]:
a.append(6)
a

[1, 2, 4, 5, 6]

* Using the `insert` method, you can insert an element at a specific location in the list. An index past the end appends, and a negative index counts from the end:

In [84]:
a.insert(0,9)
a

[9, 1, 2, 4, 5, 6]

In [85]:
a.insert(-3,-3)
a

[9, 1, 2, -3, 4, 5, 6]

In [86]:
len(a)

7

In [87]:
a.insert(1000,"Naga")
a

[9, 1, 2, -3, 4, 5, 6, 'Naga']

In [88]:
a.insert(-1,"Snigdha")
a

[9, 1, 2, -3, 4, 5, 6, 'Snigdha', 'Naga']

#### Important Note on `insert` method
* `insert` is computationally expensive compared with `append`, because references to subsequent elements have to be shifted internally to make room for the new element.
* If you need to insert elements at both the beginning and end of a sequence, you may wish to explore `collections.deque`, a double-ended queue, for this purpose.
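
* A minimal sketch of the `collections.deque` suggestion above; the variable names are illustrative. Adding and removing at either end avoids the element shifting that `insert(0, ...)` and `pop(0)` cause on a list:

```python
from collections import deque

queue = deque([1, 2, 3])
queue.appendleft(0)      # add at the front
queue.append(4)          # add at the back
front = queue.popleft()  # remove from the front

print(front, list(queue))   # 0 [1, 2, 3, 4]
```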

#### Remove elements at a specified index - `pop`
* The inverse operation to `insert` is `pop`, which removes and returns an element at a particular index.

In [92]:
a

[1, 2, -3, 4, 5, 6, 'Snigdha', 'Naga']

In [93]:
a.pop(0)

1

In [94]:
a

[2, -3, 4, 5, 6, 'Snigdha', 'Naga']

#### Remove elements by value - `remove`
* Elements can be removed by value with `remove`, which locates the first such value and removes it from the list:

In [97]:
b_list=['foo', 'red', 'baz', 'dwarf', 'foo']
b_list

['foo', 'red', 'baz', 'dwarf', 'foo']

In [98]:
b_list.remove('foo')

In [99]:
b_list

['red', 'baz', 'dwarf', 'foo']

* If performance is not a concern, by using append and remove, you can use a Python
list as a perfectly suitable “multiset” data structure.

#### Membership test
* Check if a list contains a value using the `in` keyword
* The keyword `not` can be used to negate `in`

In [100]:
b_list

['red', 'baz', 'dwarf', 'foo']

In [101]:
'dwarf' in b_list

True

In [102]:
'dwarf' not in b_list

False

#### Important Note on membership test
* Checking whether a list contains a value is a lot slower than doing so with dicts and sets, as Python makes a linear scan across the values of the list, whereas it can check the others (based on hash tables) in constant time.
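
* When the same list is checked many times, one option (a sketch, with illustrative names) is to build a `set` from it once and test against that instead:

```python
words = ['red', 'baz', 'dwarf', 'foo']
lookup = set(words)          # built once; each later `in` test is a hash lookup

queries = ['dwarf', 'blue', 'foo']
found = [q for q in queries if q in lookup]
print(found)                 # ['dwarf', 'foo']
```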

#### Concatenating and combining lists
* Similar to tuples, adding two lists together with `+` concatenates them
* If you have a list already defined, you can append multiple elements to it using the `extend` method

In [104]:
a = [2, -3, 4, 5, 6, 'Snigdha', 'Naga']
b=['Hello world']
a+b

[2, -3, 4, 5, 6, 'Snigdha', 'Naga', 'Hello world']

In [105]:
a

[2, -3, 4, 5, 6, 'Snigdha', 'Naga']

In [108]:
x = [4,5,None]
a.extend(x)

In [109]:
a

[2, -3, 4, 5, 6, 'Snigdha', 'Naga', 4, 5, None]

#### Which is better `+` or `extend`?
* List concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. 
* Using `extend` to append elements to an existing list, especially if you are building up a large list, is usually preferable.
* Thus,

```
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)
```
is faster than the concatenative alternative:

```
everything = []
for chunk in list_of_lists:
    everything = everything + chunk
```
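
* A runnable sketch of the difference, using a small sample `list_of_lists` that is assumed here purely for illustration. `extend` keeps growing the same list object, while `+` rebinds the name to a new list on every pass:

```python
list_of_lists = [[1, 2], [3], [4, 5, 6]]   # sample input, assumed for illustration

everything = []
original = everything
for chunk in list_of_lists:
    everything.extend(chunk)     # grows the same list object in place
print(everything, everything is original)   # [1, 2, 3, 4, 5, 6] True

combined = []
start = combined
for chunk in list_of_lists:
    combined = combined + chunk  # builds a brand-new list on every pass
print(combined, combined is start)          # [1, 2, 3, 4, 5, 6] False
```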

#### Sorting
* You can sort a list in-place (without creating a new object) by calling its `sort` method.
* `sort` has a few options that will occasionally come in handy.
    * One is the ability to pass a secondary sort `key`, that is, a function that produces a value to use to sort the objects.
    * For example, we could sort a collection of strings by their lengths or, as in the cells that follow, by a custom function.

In [111]:
a=[4,5,2,3,1,0,9]
a

[4, 5, 2, 3, 1, 0, 9]

In [112]:
a.sort()
a

[0, 1, 2, 3, 4, 5, 9]

In [130]:
b = ['saw Tekion', 'Ms.small Tekion', 'Mr.He TCS', 'Ms.foxes Delloite', 'Mr.six']
b

['saw Tekion', 'Ms.small Tekion', 'Mr.He TCS', 'Ms.foxes Delloite', 'Mr.six']

In [131]:
def contains(x):
    if x.endswith('Tekion'):
        return 0
    else:
        return 1

In [133]:
b.sort(key=contains,reverse=True)
b

['Mr.He TCS', 'Ms.foxes Delloite', 'Mr.six', 'saw Tekion', 'Ms.small Tekion']

* `sorted` is a built-in function that produces a sorted copy of a general sequence, leaving the original untouched.
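
* The sort-by-length case mentioned earlier can be sketched with `sorted` and `key=len`. The sample strings are illustrative; note that the input list is left unchanged and ties keep their original relative order:

```python
strings = ['saw', 'small', 'He', 'foxes', 'six']

by_length = sorted(strings, key=len)   # new list; ties keep their original order
print(by_length)   # ['He', 'saw', 'six', 'small', 'foxes']
print(strings)     # ['saw', 'small', 'He', 'foxes', 'six'] (unchanged)
```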

#### Binary search and maintaining a sorted list
* The built-in `bisect` module implements binary search and insertion into a sorted list.
* `bisect.bisect` finds the location where an element should be inserted to keep the list sorted.
* `bisect.insort` actually inserts the element into that location.

In [135]:
import bisect
c = [1, 2, 2, 2, 3, 4, 7]
c

[1, 2, 2, 2, 3, 4, 7]

In [139]:
bisect.bisect(c,5)

6

In [138]:
bisect.bisect(c,2)

4

In [140]:
bisect.insort(c,6)
c

[1, 2, 2, 2, 3, 4, 6, 7]

#### Important Note on `bisect` module
* The bisect module functions do not check whether the list is sorted, as doing so would be computationally expensive.
* Thus, using them with an unsorted list will succeed without error but may lead to incorrect results.
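
* A short sketch of two related points. When the value is already present, `bisect_left` and `bisect_right` give the positions before and after the run of equal elements (`bisect.bisect` is the same function as `bisect_right`). With an unsorted list the call still returns a number, but it should not be trusted:

```python
import bisect

c = [1, 2, 2, 2, 3, 4, 7]
left = bisect.bisect_left(c, 2)     # position before the existing 2s
right = bisect.bisect_right(c, 2)   # position after them
print(left, right)   # 1 4

unsorted = [3, 1, 2]
position = bisect.bisect(unsorted, 2)   # no error, but the result is not meaningful
```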

#### Slicing
* You can select sections of most sequence types by using slice notation, which in its basic form consists of `start:stop:step` passed to the indexing operator []:
* While the element at the start index is included, the stop index is not included, so that the number of elements in the result is `stop - start`.
* Either the `start` or `stop` can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively

In [147]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq

[7, 2, 3, 7, 5, 6, 0, 1]

In [149]:
seq[1:4]

[2, 3, 7]

##### Slices can also be assigned to with a sequence:

In [151]:
seq[3:4]=[6,3]
seq

[7, 2, 3, 6, 3, 5, 6, 0, 1]

In [152]:
seq[1:]

[2, 3, 6, 3, 5, 6, 0, 1]

In [153]:
seq[:4]

[7, 2, 3, 6]

##### Negative indices slice the sequence relative to the end:

In [155]:
seq[-4:]

[5, 6, 0, 1]

In [156]:
seq[-6:-2]

[6, 3, 5, 6]

##### A step can also be used after a second colon to, say, take every other element

In [157]:
seq[::2]

[7, 3, 3, 6, 1]

##### A clever use of this is to pass -1, which has the useful effect of reversing a list or tuple:

In [159]:
seq[::-1]

[1, 0, 6, 5, 3, 6, 3, 2, 7]

#### Illustration of Python slicing conventions

<img src='.//images//python_slicing_conventions.PNG' />

### Built-in Sequence Functions

#### `enumerate`
* It’s common when iterating over a sequence to want to keep track of the index of the
current item. 
* A do-it-yourself approach would look like:

```python
i = 0
for value in collection:
    # do something with value
    i += 1
```
* Since this is so common, Python has a built-in function, enumerate, which returns a sequence of (i, value) tuples:

```python
for i, value in enumerate(collection):
    # do something with value
    pass
```

* When you are indexing data, a helpful pattern that uses `enumerate` is computing a dict mapping the values of a sequence (which are assumed to be unique) to their locations in the sequence:

In [162]:
some_list = ['foo', 'bar', 'baz']
mapping={}
for i, v in enumerate(some_list):
    mapping[v] = i

mapping

{'foo': 0, 'bar': 1, 'baz': 2}
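
* `enumerate` also accepts an optional `start` argument, which is handy when the numbering should not begin at zero. A minimal sketch:

```python
some_list = ['foo', 'bar', 'baz']
numbered = list(enumerate(some_list, start=1))   # count from 1 instead of 0
print(numbered)   # [(1, 'foo'), (2, 'bar'), (3, 'baz')]
```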

#### `sorted`
* The `sorted` function returns a new sorted list from the elements of any sequence:
* `The sorted function accepts the same arguments as the sort method on lists.`

In [164]:
sorted([3,4,5,2,6,23,5,31,5])

[2, 3, 4, 5, 5, 5, 6, 23, 31]

In [166]:
sorted('nagaraju budigam')

[' ',
 'a',
 'a',
 'a',
 'a',
 'b',
 'd',
 'g',
 'g',
 'i',
 'j',
 'm',
 'n',
 'r',
 'u',
 'u']

#### `zip`
* `zip` “pairs” up the elements of a number of lists, tuples, or other sequences. In Python 3 it returns a lazy iterator of tuples rather than a list:

In [167]:
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
zipped = (zip(seq1,seq2))
zipped

<zip at 0x109c3adc8>

* Materialize it using `list`:

In [168]:
list(zipped)

[('foo', 'one'), ('bar', 'two'), ('baz', 'three')]

* `zip` can take an arbitrary number of sequences, and the `number of elements it produces is determined by the shortest sequence`:

In [169]:
seq3 = [False, True]

In [170]:
list(zip(seq1,seq2,seq3))

[('foo', 'one', False), ('bar', 'two', True)]
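
* If truncating to the shortest sequence is not what you want, the standard library's `itertools.zip_longest` pads the shorter inputs instead. A sketch using sequences like the ones above:

```python
from itertools import zip_longest

seq1 = ['foo', 'bar', 'baz']
seq3 = [False, True]

padded = list(zip_longest(seq1, seq3, fillvalue=None))   # missing slots get fillvalue
print(padded)   # [('foo', False), ('bar', True), ('baz', None)]
```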

* A very common use of `zip` is simultaneously iterating over multiple sequences, possibly also combined with `enumerate`:

In [171]:
for i, (a, b) in enumerate(zip(seq1, seq2)):
    print('{0}: {1}, {2}'.format(i, a, b))

0: foo, one
1: bar, two
2: baz, three


* Given a “zipped” sequence, `zip` can be applied in a clever way to “unzip” the sequence. 
* Another way to think about this is converting a list of rows into a list of columns. The syntax, which looks a bit magical, is:

In [173]:
pitchers = [('Naga', 'Budigam'), ('Snigdha', 'Budigam'),('Mani', 'Boina')]
pitchers

[('Naga', 'Budigam'), ('Snigdha', 'Budigam'), ('Mani', 'Boina')]

In [174]:
first_names,last_names=zip(*pitchers)

In [175]:
first_names

('Naga', 'Snigdha', 'Mani')

In [176]:
last_names

('Budigam', 'Budigam', 'Boina')

#### `reversed`
* `reversed` iterates over the elements of a sequence in reverse order.
* `reversed` returns a lazy iterator, so it does not create the reversed sequence until materialized (e.g., with `list` or a `for` loop).

In [180]:
reversed_data = reversed(range(10))
reversed_data

<range_iterator at 0x1095994e0>

In [181]:
list(reversed_data)

[9, 8, 7, 6, 5, 4, 3, 2, 1, 0]

### `dict`
* A more common name for a `dict` is *hash map* or *associative array*.
* It is a flexibly sized collection of key-value pairs, where key and value are Python objects.
* One approach for creating one is to use curly braces `{}` and colons to separate keys and values:

In [182]:
empty_dict={}
empty_dict

{}

In [183]:
d={'name':'Naga','age':25}
d

{'name': 'Naga', 'age': 25}

* We can access, insert, or set elements using the same syntax as for accessing elements of a `list` or `tuple`:

In [184]:
d['location']="Bangalore"
d

{'name': 'Naga', 'age': 25, 'location': 'Bangalore'}

In [185]:
d['age']

25

* We can check if a `dict` contains a key using the same syntax used for checking whether a `list` or `tuple` contains a value:

In [186]:
'age' in d

True

* We can delete values either using the `del` keyword or the `pop` method (which simultaneously returns the value and deletes the key):

In [187]:
d['edu']='B.Tech Dual Degree'
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree'}

In [188]:
d['dummy']='Dummy value'
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree',
 'dummy': 'Dummy value'}

In [189]:
del d['dummy']
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree'}

In [190]:
d['5']="Some value"
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree',
 '5': 'Some value'}

In [191]:
returned_value =d.pop('5')
returned_value

'Some value'

In [192]:
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree'}

* The `keys` and `values` methods give you iterable views of the dict’s keys and values, respectively.

In [194]:
d.keys()

dict_keys(['name', 'age', 'location', 'edu'])

In [195]:
d.values()

dict_values(['Naga', 25, 'Bangalore', 'B.Tech Dual Degree'])

* You can merge one `dict` into another using the `update` method:
* The `update` method changes dicts in-place, so any existing keys in the data passed to update will have their old values discarded.

In [196]:
d.update({'b':'foo','c':12})
d

{'name': 'Naga',
 'age': 25,
 'location': 'Bangalore',
 'edu': 'B.Tech Dual Degree',
 'b': 'foo',
 'c': 12}
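
* The overwrite behavior described above is easier to see when one of the incoming keys already exists. A small sketch with an illustrative `profile` dict:

```python
profile = {'name': 'Naga', 'age': 25}
profile.update({'age': 26, 'city': 'Bangalore'})   # 'age' exists, 'city' is new
print(profile)   # {'name': 'Naga', 'age': 26, 'city': 'Bangalore'}
```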

#### Creating dicts from sequences
* It’s common to end up with two sequences that you want to pair up element-wise in a dict. As a first cut, you might write code like this:

```python
mapping = {}
for key, value in zip(key_list, value_list):
    mapping[key] = value
```

* Since a dict is essentially a collection of 2-tuples, the `dict` function accepts a list of 2-tuples (or any iterable of pairs, such as the result of `zip`):

In [199]:
mapping = dict(zip(range(5),reversed(range(5))))
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

#### `dict` Default values
* It’s very common to have logic like:

```python
if key in some_dict:
    value = some_dict[key]
else:
    value = default_value
```

* Thus, the dict methods `get` and `pop` can take a default value to be returned, so that the above if-else block can be written simply as:

```python
value = some_dict.get(key, default_value)
```

* `get` by default will return None if the key is not present, while `pop` will raise an exception.
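
* A brief sketch of these behaviors side by side, using an illustrative `some_dict`:

```python
some_dict = {'a': 1}

print(some_dict.get('b'))        # None
print(some_dict.get('b', 0))     # 0
print(some_dict.pop('b', 0))     # 0 (default returned, nothing removed)

try:
    some_dict.pop('b')           # no default given
except KeyError as exc:
    print('KeyError:', exc)      # KeyError: 'b'
```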

In [209]:
words = ['apple', 'bat', 'bar', 'atom', 'book','cat']
words

['apple', 'bat', 'bar', 'atom', 'book', 'cat']

In [207]:
by_letter = {}
for word in words:
    letter = word[0]
    if letter not in by_letter:
        by_letter[letter]=[word]
    else:
        by_letter[letter].append(word)
by_letter

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book'], 'c': ['cat']}

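* The loop above groups words by their first letter, creating a new list the first time a letter is seen. The `setdefault` dict method is for precisely this purpose. The preceding for loop can be rewritten as: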

In [210]:
by_letter1={}
for word in words:
    letter=word[0]
    by_letter1.setdefault(letter,[]).append(word)
by_letter1

{'a': ['apple', 'atom'], 'b': ['bat', 'bar', 'book'], 'c': ['cat']}

* The built-in `collections` module has a useful class, `defaultdict`, which makes this even easier. To create one, you pass a type or function for generating the default value for each slot in the dict:

In [212]:
from collections import defaultdict
by_letter2 = defaultdict(list)
for word in words:
    by_letter2[word[0]].append(word)
by_letter2

defaultdict(list,
            {'a': ['apple', 'atom'],
             'b': ['bat', 'bar', 'book'],
             'c': ['cat']})

#### Valid dict key types
* While the values of a `dict` can be any Python object, the keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too).
* The technical term here is `hashability`.
* We can check whether an object is hashable (can be used as a key in a dict) with the `hash` function:

In [213]:
hash('naga')

8443744479547293412

In [215]:
hash((1,2,(3,4)))

-2725224101759650258

#### Fails because lists are mutable

In [216]:
hash([1,2,3])

TypeError: unhashable type: 'list'

* To use a `list` as a key, one option is to convert it to a `tuple`, which can be hashed as
long as its elements also can:

In [218]:
d={}
d[tuple([1,2,3])]=5
d

{(1, 2, 3): 5}
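
* The caveat "as long as its elements also can" matters in practice: a tuple that contains a list is still unhashable. The helper `is_hashable` below is a hypothetical convenience function written for this sketch:

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable((1, 2, (3, 4))))   # True
print(is_hashable((1, [2, 3])))      # False, the inner list is mutable
```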

### `set`
* A `set` is an unordered collection of unique elements.
* You can think of them like dicts, but keys only, no values.
* A `set` can be created in two ways: via the `set` function or via a set literal with curly braces:

In [220]:
s = set([2,3,1,1,42,1])
s

{1, 2, 3, 42}

In [221]:
s = {1,2,4,5,6,1,2,3,4,5}
s

{1, 2, 3, 4, 5, 6}

* Sets support mathematical set operations like union, intersection, difference, and symmetric difference. Consider these two example sets:

In [222]:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

##### `union`
* The union of these two sets is the set of distinct elements occurring in either set. 
* This can be computed with either the `union` method or the `|` binary operator:

In [223]:
a.union(b)

{1, 2, 3, 4, 5, 6, 7, 8}

In [224]:
a|b

{1, 2, 3, 4, 5, 6, 7, 8}

##### `intersection`
* The intersection contains the elements occurring in both sets. The & operator or the `intersection` method can be used:

In [225]:
a.intersection(b)

{3, 4, 5}
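
* The remaining operations mentioned above can be sketched the same way, using the same two example sets. The comments list the resulting elements; since sets are unordered, the printed order is not guaranteed:

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

print(a & b)                        # {3, 4, 5}
print(a - b)                        # {1, 2}
print(a.difference(b))              # {1, 2}
print(a ^ b)                        # {1, 2, 6, 7, 8}
print(a.symmetric_difference(b))    # {1, 2, 6, 7, 8}
```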

### Python Set Operations
<img src='./images/python_set_operations.PNG'/>

* All of the logical set operations have in-place counterparts, which enable you to replace the contents of the set on the left side of the operation with the result. 

* For very large sets, this may be more efficient:

In [226]:
c = a.copy()
c

{1, 2, 3, 4, 5}

In [227]:
c |= b
c

{1, 2, 3, 4, 5, 6, 7, 8}

In [228]:
d = a.copy()
d

{1, 2, 3, 4, 5}

In [229]:
d &= b
d

{3, 4, 5}

* Like dicts, `set` elements generally must be immutable. 
* To have list-like elements, you must convert them to a `tuple`:

In [232]:
my_data = [1, 2, 3, 4]
my_set = {tuple(my_data)}
my_set

{(1, 2, 3, 4)}

* Sets are equal if and only if their contents are equal:

In [233]:
{1, 2, 3} == {3, 2, 1}

True

* We can also check if a set is a subset of (is contained in) or a superset of (contains all
elements of) another set:

In [234]:
a_set = {1, 2, 3, 4, 5}
a_set

{1, 2, 3, 4, 5}

In [235]:
{1,2}.issubset(a_set)

True

In [236]:
a_set.issuperset({1,2,4})

True

## List, Set, and Dict Comprehensions
* List comprehensions are one of the most-loved Python language features. 
* They allow us to concisely form a new list by filtering the elements of a collection, transforming the elements passing the filter in one concise expression.

#### List Comprehension
* They take the basic form:

```python
[expr for val in collection if condition]
```
* This is equivalent to the following for loop:

```python
result = []
for val in collection:
    if condition:
        result.append(expr)
```

In [237]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
strings

['a', 'as', 'bat', 'car', 'dove', 'python']

In [238]:
[x.upper() for x in strings if len(x)>2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

#### `dict` comprehension
* A dict comprehension looks like this:

```python
dict_comp = {key_expr: value_expr for value in collection if condition}
```
#### `set` comprehension
* A set comprehension looks like the equivalent list comprehension except with curly braces instead of square brackets:

```python
set_comp = {expr for value in collection if condition}
```

* Like list comprehensions, set and dict comprehensions are mostly conveniences, but they similarly can make code both easier to write and read.

In [240]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
strings

['a', 'as', 'bat', 'car', 'dove', 'python']

* As a simple dict comprehension example, we could create a lookup map of these
strings to their locations in the list:

In [241]:
loc_mapping = {val:index for index,val in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}
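
* For the set comprehension form, a comparable sketch collects the distinct lengths of the same strings (`unique_lengths` is an illustrative name):

```python
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
unique_lengths = {len(x) for x in strings}   # duplicates (two strings of length 3) collapse
print(unique_lengths)   # {1, 2, 3, 4, 6}
```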

#### Nested list comprehensions

In [242]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]
all_data

[['John', 'Emily', 'Michael', 'Mary', 'Steven'],
 ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

* Now, suppose we wanted to get a single list containing all names with two or more e’s in them. We could certainly do this with a simple for loop:

```python
names_of_interest = []
for names in all_data:
    enough_es = [name for name in names if name.count('e') >= 2]
    names_of_interest.extend(enough_es)
```
* We can actually wrap this whole operation up in a single nested list comprehension, which will look like:

In [243]:
result = [name for names in all_data for name in names if name.count('e')>=2]
result

['Steven']

In [245]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
some_tuples

[(1, 2, 3), (4, 5, 6), (7, 8, 9)]

In [246]:
[x for tup in some_tuples for x in tup]

[1, 2, 3, 4, 5, 6, 7, 8, 9]

In [248]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [250]:
d = {'name':'Naga','age':25,'gender':'Male','profession':'MLE'}
d

{'name': 'Naga', 'age': 25, 'gender': 'Male', 'profession': 'MLE'}

In [252]:
{key:value for key,value in d.items() if str(value).isnumeric()}

{'age': 25}

## Functions
* Functions are declared with the `def` keyword and returned from with the `return` keyword:

```python
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)
```
* There is no issue with having multiple return statements.
* If Python reaches the end of a function without encountering a return statement, `None` is returned automatically.
* Each function can have positional arguments and keyword arguments. 
* Keyword arguments are most commonly used to specify default values or optional arguments.
* `The main restriction on function arguments is that the keyword arguments must follow the positional arguments (if any).`
* You can specify keyword arguments in any order; this frees you from having to remember the order in which the function arguments were specified. You only need to remember their names.
* It is possible to use keywords for passing positional arguments as well.
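
* The calling conventions described above can be sketched with the same `my_function`; the argument values are arbitrary:

```python
def my_function(x, y, z=1.5):
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

print(my_function(10, 20))           # 45.0, z falls back to its default
print(my_function(5, 6, z=7))        # 77
print(my_function(x=5, y=6, z=7))    # 77
print(my_function(y=6, x=5, z=7))    # 77, keyword order does not matter
```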

### Namespaces, Scope, and Local Functions
* Functions can access variables in two different scopes: *global* and *local*. 
* An alternative and more descriptive name describing a variable scope in Python is a *namespace.*
* Any variables that are assigned within a function by default are assigned to the local namespace. 
* The local namespace is created when the function is called and immediately populated by the function’s arguments. 
* After the function is finished, the local namespace is destroyed.

##### local namespace example

In [255]:
def func():
    a = []
    for i in range(5):
        a.append(i)

# When func() is called, the empty list a is created, five elements are appended, and then a is destroyed when the function exits.

* Assigning variables outside of the function’s scope is possible, but those variables
must be declared as `global` via the `global` keyword:

In [260]:
a = None
def bind_a_variable():
    global a
    a = []

bind_a_variable()
a

[]

* We generally discourage use of the global keyword.
* Typically global variables are used to store some kind of state in a system.
* If you find yourself using a lot of them, it may indicate a need for object-oriented programming (using classes).

### Functions Are Objects

In [261]:
states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda','south carolina##', 'West virginia?']
states

[' Alabama ',
 'Georgia!',
 'Georgia',
 'georgia',
 'FlOrIda',
 'south carolina##',
 'West virginia?']

In [264]:
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

clean_strings(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

* An alternative approach that you may find useful is to make a list of the operations
you want to apply to a particular set of strings:

In [265]:
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip,remove_punctuation, str.title]

def clean_strings(strings,ops):
    result=[]
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

clean_strings(states,clean_ops)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South Carolina',
 'West Virginia']

* A more functional pattern like this enables you to easily modify how the strings are
transformed at a very high level. The clean_strings function is also now more reusable
and generic.
* You can use functions as arguments to other functions

In [266]:
for x in map(remove_punctuation,states):
    print(x)

 Alabama 
Georgia
Georgia
georgia
FlOrIda
south carolina
West virginia


### Anonymous (Lambda) Functions
* Python has support for so-called anonymous or lambda functions, which are a way of writing functions consisting of a single expression, the result of which is the return value.
* They are defined with the `lambda` keyword, which has no meaning other than “we are declaring an anonymous function”:

```python
def short_function(x):
    return x * 2
equiv_anon = lambda x: x * 2
```

* Sort a collection of strings by the number of distinct letters in each string:

In [267]:
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']
strings

['foo', 'card', 'bar', 'aaaa', 'abab']

In [270]:
strings.sort(key=lambda x: len(list(set(x))))
strings

['aaaa', 'foo', 'abab', 'bar', 'card']

* One reason `lambda` functions are called anonymous functions is that, unlike functions declared with the `def` keyword, the function object itself is never given an explicit name; its `__name__` attribute is simply `'<lambda>'`.
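
* This can be checked directly. In the sketch below, the `def` version carries its own name, while the lambda only has the generic placeholder `'<lambda>'`, even after being assigned to a variable:

```python
def short_function(x):
    return x * 2

equiv_anon = lambda x: x * 2

print(short_function.__name__)   # short_function
print(equiv_anon.__name__)       # <lambda>
```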

### Currying: Partial Argument Application

* Currying is computer science jargon (named after the mathematician Haskell Curry) that means deriving new functions from existing ones by partial argument application.

* For example, suppose we had a trivial function that adds two numbers together:

```python
def add_numbers(x, y):
    return x + y
```
* Using this function, we could derive a new function of one variable, add_five, that adds 5 to its argument:

```python
add_five = lambda y: add_numbers(5, y)
```

* The second argument to add_numbers is said to be **curried**. 
* There’s nothing very fancy here, as all we’ve really done is define a new function that calls an existing function.

* The built-in `functools` module can simplify this process using the `partial` function:

In [271]:
def add_numbers(x, y):
    return x + y

In [272]:
from functools import partial
add_five = partial(add_numbers, 5)

In [273]:
add_five(2)

7
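
* `partial` can also fix keyword arguments. As a sketch, `parse_binary` (an illustrative name) is the built-in `int` with `base=2` filled in ahead of time:

```python
from functools import partial

parse_binary = partial(int, base=2)   # fixes the keyword argument `base`
print(parse_binary('10010'))          # 18
print(parse_binary.func is int, parse_binary.keywords)   # True {'base': 2}
```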

### Generators
* An iterator is any object that will yield objects to the Python interpreter when used in a context like a for loop.
* A generator is a concise way to construct a new iterable object. 
* Whereas normal functions execute and return a single result at a time, generators return a sequence of multiple results lazily, pausing after each one until the next one is requested.
* To create a generator, use the `yield` keyword instead of `return` in a function

In [274]:
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2

In [276]:
gen = squares()
gen

<generator object squares at 0x10a6a7048>

* It is not until you request elements from the generator that it begins executing its code:

In [277]:
for x in gen:
    print(x)

Generating squares from 1 to 100
1
4
9
16
25
36
49
64
81
100
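
* The pausing behavior can be seen by pulling values one at a time with `next`. This sketch uses a smaller variant of `squares` (default `n=3`, no print call) and also shows that a generator can only be consumed once:

```python
def squares(n=3):
    for i in range(1, n + 1):
        yield i ** 2

gen = squares()
print(next(gen))    # 1
print(next(gen))    # 4
print(list(gen))    # [9], only the remaining values
print(list(gen))    # [], the generator is exhausted
```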


#### Generator expressions
* Another, even more concise way to make a generator is by using a generator expression.
* This is a generator analogue to list, dict, and set comprehensions; to create one, enclose what would otherwise be a list comprehension within parentheses instead of brackets:

In [278]:
gen = (x**2 for x in range(100))
gen

<generator object <genexpr> at 0x10a6a7200>

In [279]:
def _make_gen():
    for x in range(100):
        yield x**2
gen = _make_gen()

In [None]:
# The two snippets above are equivalent; the second is simply more verbose.

* Generator expressions can be used instead of list comprehensions as function arguments in many cases:

In [280]:
# generator expression
sum(x**2 for x in range(100))

328350

In [282]:
# list comprehension
sum([x**2 for x in range(100)])

328350

In [283]:
dict((i,i**2) for i in range(5))

{0: 0, 1: 1, 2: 4, 3: 9, 4: 16}

### itertools module
* The standard library `itertools` module has a collection of generators for many common data algorithms.
* For example, `groupby` takes any sequence and a function, grouping consecutive elements in the sequence by the return value of the function.

In [284]:
import itertools
first_letter = lambda x: x[0]
names = ['Naga','Snigdha','Ravinder','Mani','Ramakrisha']
names

['Naga', 'Snigdha', 'Ravinder', 'Mani', 'Ramakrisha']

In [285]:
for letter, group in itertools.groupby(names,first_letter):
    print(letter,list(group)) # group is a lazy iterator

N ['Naga']
S ['Snigdha']
R ['Ravinder']
M ['Mani']
R ['Ramakrisha']
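
* Because `groupby` only groups consecutive elements, the two names starting with `R` end up in separate groups above. One common remedy, sketched here, is to sort by the same key first (`grouped` is an illustrative name):

```python
import itertools

first_letter = lambda x: x[0]
names = ['Naga', 'Snigdha', 'Ravinder', 'Mani', 'Ramakrisha']

# Sorting by the grouping key makes equal keys adjacent before grouping.
grouped = {
    letter: list(group)
    for letter, group in itertools.groupby(sorted(names, key=first_letter), first_letter)
}
print(grouped)
# {'M': ['Mani'], 'N': ['Naga'], 'R': ['Ravinder', 'Ramakrisha'], 'S': ['Snigdha']}
```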


#### Some useful itertools functions
<img src="./images/itertools_functions.PNG"/>

In [292]:
a = [1,2,3]
for i in itertools.combinations(a,2):
    print(i)

(1, 2)
(1, 3)
(2, 3)


In [293]:
for i in itertools.combinations_with_replacement(a,2):
    print(i)

(1, 1)
(1, 2)
(1, 3)
(2, 2)
(2, 3)
(3, 3)


In [294]:
for i in itertools.permutations(a,2):
    print(i)

(1, 2)
(1, 3)
(2, 1)
(2, 3)
(3, 1)
(3, 2)


In [296]:
for i in itertools.product(a,a):
    print(i)

(1, 1)
(1, 2)
(1, 3)
(2, 1)
(2, 2)
(2, 3)
(3, 1)
(3, 2)
(3, 3)


### Errors and Exception Handling
* Errors are handled with `try`/`except` blocks. You can catch multiple exception types by writing a tuple of exception types in the `except` clause (the parentheses are required):

```python
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x
```
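
* A quick sketch of how `attempt_float` behaves for the two exception types it catches; the inputs are arbitrary examples:

```python
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

print(attempt_float('1.2345'))      # 1.2345
print(attempt_float('something'))   # something  (ValueError caught)
print(attempt_float((1, 2)))        # (1, 2)     (TypeError caught)
```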
* In some cases, you may not want to suppress an exception, but you want some code to be executed regardless of whether the code in the try block succeeds or not. To do this, use `finally`:

```python
f = open(path, 'w')
try:
    write_to_file(f)
finally:
    f.close()
```

* Here, the file handle `f` will always get closed. Similarly, you can have code that executes only if the `try:` block succeeds using `else`:

```python
f = open(path, 'w')
try:
    write_to_file(f)
except:
    print('Failed')
else:
    print('Succeeded')
finally:
    f.close()
```
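
* The snippets above assume that `path` and `write_to_file` already exist. A self-contained sketch follows, with a stand-in `write_to_file` and a throwaway temporary path (both are assumptions made for illustration), and with a narrower `except OSError` in place of the bare `except`:

```python
import os
import tempfile

def write_to_file(f):
    # hypothetical stand-in for the helper used above
    f.write('hello')

path = os.path.join(tempfile.mkdtemp(), 'example.txt')   # throwaway location

f = open(path, 'w')
try:
    write_to_file(f)
except OSError:
    status = 'Failed'
else:
    status = 'Succeeded'      # runs only when the try block raised nothing
finally:
    f.close()                 # runs in every case

print(status, f.closed)   # Succeeded True
```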