# Built-in Data Structures and Functions

## Data Structures and Sequences

> List, Tuple, Set, Dictionary

### Tuple

In [None]:
tup = 4,5,6
tup

In [None]:
tup = (4, 5, 6)
tup

In [None]:
tup = 2

In [None]:
tup

When you define tuples inside more complicated expressions, it's often necessary to enclose the values in parentheses, as in this tuple of tuples:

In [None]:
nested_tup = (4, 5, 6), (7, 8)
nested_tup

You can convert any sequence or iterator to a tuple by invoking tuple:



In [None]:
tup = tuple('string')
tup

In [None]:
tuple([4, 0, 2])


Elements can be accessed with square brackets [], as with most other sequence types. Sequences are 0-indexed in Python:

In [None]:
tup

In [None]:
tup[0]

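Tuples are immutable: once a tuple is created, you cannot change which object is stored in each slot. The assignment below raises a TypeError: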

In [None]:
tup = tuple(['foo', [1, 2], True])
tup[2] = False

If an object inside a tuple is mutable, such as a list, you can modify it in-place:



In [None]:
tup[1].append(4)
tup

You can concatenate tuples using the + operator to produce longer tuples:



In [None]:
(4, None, 'foo') + (6, 0) + ('bar',)

Multiplying a tuple by an integer, as with lists, has the effect of concatenating together that many copies of the tuple:



In [None]:
('foo', 'bar') * 4
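Repetition copies references to the elements, not the elements themselves. A minimal sketch of what that means in practice (the names `inner` and `repeated` are illustrative only):

```python
# Repeating a tuple copies references to its elements, not the elements.
inner = [1, 2]
repeated = (inner,) * 3

# All three slots refer to the same list object, so one append shows up everywhere.
inner.append(3)
print(repeated)                    # ([1, 2, 3], [1, 2, 3], [1, 2, 3])
print(repeated[0] is repeated[2])  # True
```

This only matters when the repeated elements are mutable; with strings or numbers the sharing is invisible.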

#### Unpacking tuples

In [None]:
tup = (4, 5, 6)
a, b, c = tup
b

In [None]:
tup = 4, 5, (6, 7)
a, b, (c, d) = tup
d

In [None]:
a = 3
b = 4

In [None]:
tmp = a
a = b
b = tmp

In [None]:
a

In [None]:
a, b = 1, 2


In [None]:

b, a = a, b


In [None]:
b

A common use of variable unpacking is iterating over sequences of tuples or lists:



In [None]:
seq = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
for a, b, c in seq:
    print('a={0}, b={1}, c={2}'.format(a, b, c))

The Python language recently acquired some more advanced tuple unpacking to help with situations where you may want to "pluck" a few elements from the beginning of a tuple. This uses the special syntax *rest, which is also used in function signatures to capture an arbitrarily long list of positional arguments:

In [None]:
values = 1, 2, 3, 4, 5
a, b, *rest = values  # a=1, b=2, rest=[3, 4, 5]

In [None]:
values = 1, 2, 3, 6, 5
*rest, a, b , d = values

In [None]:
rest

In [None]:
a, b


In [None]:
rest

This rest bit is sometimes something you want to discard; there is nothing special about the rest name. As a matter of convention, many Python programmers will use the underscore (_) for unwanted variables:

In [None]:
a, b, *_ = values
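The starred name does not have to come last. A small sketch, using made-up variable names, of how it behaves in the middle of the target list and when nothing is left over:

```python
values = 1, 2, 3, 4, 5

# The starred name may appear anywhere; it always collects a list.
first, *middle, last = values
print(first, middle, last)  # 1 [2, 3, 4] 5

# With nothing left over, the starred name is bound to an empty list.
x, y, *extra = (10, 20)
print(extra)                # []
```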

#### Tuple methods

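Since the size and contents of a tuple cannot be modified, it is very light on instance methods. A particularly useful one (also available on lists) is count, which counts the number of occurrences of a value: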

In [None]:
a = (1, 2, 2, 2, 3, 4, 2)
a.count(2)

In [None]:
a.index(2)  # index, the other tuple method, returns the position of the first 2

### List

In contrast with tuples, lists are variable-length and their contents can be modified in-place. You can define them using square brackets [] or using the list type function:

- []
- list()

In [None]:
a_list = [2, 3, 7, "isa"]
tup = ('foo', 1, 'baz')

In [None]:
b_list = list(tup)
b_list

In [None]:
b_list[1] = 'peekaboo'
b_list

> Lists and tuples are semantically similar (though tuples cannot be modified) and can be used interchangeably in many functions.

In [None]:
gen = range(1,10)
gen


In [None]:
type(gen)

#### Adding and removing elements

Elements can be appended to the end of the list with the append method:



In [None]:
b_list

In [None]:
b_list.append('dwarf')
b_list

Using insert you can insert an element at a specific location in the list:



In [None]:
b_list.insert(1, 'red')
b_list

> insert is computationally expensive compared with append, because references to subsequent elements have to be shifted internally to make room for the new element. If you need to insert elements at both the beginning and end of a sequence, you may wish to explore collections.deque, a double-ended queue, for this purpose.
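As a rough illustration of the deque alternative mentioned above, the following sketch adds and removes items at both ends. The variable names are illustrative.

```python
from collections import deque

# A deque supports fast appends and pops at both ends.
queue = deque(['b', 'c'])
queue.appendleft('a')  # add at the beginning without shifting other elements
queue.append('d')      # add at the end

print(list(queue))     # ['a', 'b', 'c', 'd']

first = queue.popleft()  # remove from the beginning
last = queue.pop()       # remove from the end
print(first, last, list(queue))  # a d ['b', 'c']
```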



The inverse operation to insert is pop, which removes and returns an element at a particular index:



In [None]:
b_list.pop(2)


In [None]:
b_list

Elements can be removed by value with remove, which locates the first such value and removes it from the list:



In [None]:
b_list.append('foo')
b_list


In [None]:
b_list.remove('foo')
b_list

Check if a list contains a value using the in keyword:



In [None]:
'dwarf' in b_list

In [None]:
'dwarf' not in b_list

#### Concatenating and combining lists

Similar to tuples, adding two lists together with + concatenates them:



In [None]:
[4, None, 'foo'] + [7, 8, (2, 3)]

If you have a list already defined, you can append multiple elements to it using the extend method:



In [None]:
x = [4, None, 'foo']
x.extend([7, 8, (2, 3)])
x

> Note that list concatenation by addition is a comparatively expensive operation since a new list must be created and the objects copied over. Using extend to append elements to an existing list, especially if you are building up a large list, is usually preferable. Thus,



In [None]:
list_of_lists = [[1, 2], [3, 4], [5]]  # example data (not from the original text)
everything = []
for chunk in list_of_lists:
    everything.extend(chunk)

is faster than the concatenative alternative:



In [None]:
list_of_lists = [[1, 2], [3, 4], [5]]  # example data (not from the original text)
everything = []
for chunk in list_of_lists:
    everything = everything + chunk

#### Sorting

You can sort a list in-place (without creating a new object) by calling its sort method:



In [None]:
a = [7, 2, 5, 1, 3]
a.sort()
a

In [None]:
result = a.sort()  # sort() works in place and returns None

In [None]:
print(result)

sort has a few options that will occasionally come in handy. One is the ability to pass a secondary sort key—that is, a function that produces a value to use to sort the objects. For example, we could sort a collection of strings by their lengths:

In [None]:
b = ['saw', 'small', 'He', 'foxes', 'six']
b.sort(key=len)
b

#### Binary search and maintaining a sorted list

The built-in bisect module implements binary search and insertion into a sorted list. bisect.bisect finds the location where an element should be inserted to keep the list sorted, while bisect.insort actually inserts the element at that location:

In [None]:
import bisect
c = [1, 2, 2, 2, 3, 4, 7]
bisect.bisect(c, 2)

In [None]:
bisect.insort(c, 6)
c

> The bisect module functions do not check whether the list is sorted, as doing so would be computationally expensive. Thus, using them with an unsorted list will succeed without error but may lead to incorrect results.
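The sketch below illustrates both points: where bisect places a value among equal items, and how an unsorted list silently produces a meaningless position. The list contents and variable names are made up for illustration.

```python
import bisect

sorted_values = [1, 2, 2, 2, 3, 4, 7]

# bisect.bisect is an alias of bisect_right: it returns the slot after any
# equal items, while bisect_left returns the slot before them.
left_slot = bisect.bisect_left(sorted_values, 2)
right_slot = bisect.bisect_right(sorted_values, 2)
print(left_slot, right_slot)  # 1 4

# On an unsorted list there is no error, but the answer is not useful:
# inserting 3 at this position does not produce a sorted list.
unsorted_values = [7, 1, 4, 2]
position = bisect.bisect(unsorted_values, 3)
print(position)               # 2
```

If there is any doubt about the input, sorting it first (for example with sorted) is the safer choice.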


#### Slicing

You can select sections of most sequence types by using slice notation, which in its basic form consists of start:stop passed to the indexing operator []. The element at the start index is included and the element at the stop index is not, so the number of elements in the result is stop - start.

In [None]:
# general form: seq[start:stop]

In [None]:
seq = [7, 2, 3, 7, 5, 6, 0, 1]
seq[1:5]

Slices can also be assigned to with a sequence:

In [None]:
seq[3:4] = [6, 3]
seq

> Either start or stop can be omitted, in which case they default to the start of the sequence and the end of the sequence, respectively:

In [None]:
seq[:5]

In [None]:
seq

In [None]:
seq[3:]

Negative indices slice the sequence relative to the end:

In [None]:
seq

In [None]:
seq[-4:]

In [None]:
seq[-6:-2]

A step can also be used after a second colon to, say, take every other element: _**start: stop: step**_

In [None]:
seq

In [None]:
seq[::2]

A clever use of this is to pass -1 as the step, which reverses a list or tuple:

In [None]:
seq[::-1]

### Built-in Sequence Functions

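Python has a handful of built-in sequence functions that are worth knowing and using at every opportunity.

#### enumerate

enumerate keeps track of the index of the current item while iterating over a sequence. It yields (index, value) pairs: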

In [None]:
some_list = ['foo', 'bar', 'baz']
for x,y in enumerate(some_list):
    print((x,y))


In [None]:
some_list = ['foo', 'bar', 'baz']
mapping = {}
for i, v in enumerate(some_list):
    mapping[v] = i
mapping

#### sorted

The sorted function returns a new sorted list from the elements of any sequence:



In [None]:
s = sorted([7, 1, 2, 6, 0, 3, 2])

In [None]:
s

In [None]:
sorted([7, 1, 2, 6, 0, 3, 2], reverse = True)

In [None]:
sorted('horse race')

#### zip

zip “pairs” up the elements of a number of lists, tuples, or other sequences to create tuples. In Python 3 it returns a lazy zip object, which you can materialize with list:



In [None]:
seq1 = ['foo', 'bar', 'baz']
seq2 = ['one', 'two', 'three']
zipped = zip(seq1, seq2)
zipped

In [None]:
list(zipped)

zip can take an arbitrary number of sequences, and the number of elements it produces is determined by the shortest sequence

In [None]:
seq3 = [False, True]
list(zip(seq1, seq2, seq3))

A common use of zip is simultaneously iterating over multiple sequences, possibly also combined with enumerate

In [None]:
list(zip(seq1, seq2))

In [None]:
for i, (a, b) in enumerate(zip(seq1, seq2)):
    print('{0}: {1}, {2}'.format(i, a, b))

Given a “zipped” sequence, zip can be applied in a clever way to “unzip” the sequence

In [None]:
pitchers = [('Nolan', 'Ryan'), ('Roger', 'Clemens'),
            ('Curt', 'Schilling')]
first_names, last_names = zip(*pitchers)

In [None]:
first_names

In [None]:
last_names
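Another way to think about the unzip trick is as converting a list of rows into a list of columns. A minimal sketch with made-up data:

```python
rows = [(1, 'a', True), (2, 'b', False), (3, 'c', True)]

# zip(*rows) passes each row as a separate argument, so zip pairs up
# the first items, then the second items, and so on.
columns = list(zip(*rows))
print(columns)  # [(1, 2, 3), ('a', 'b', 'c'), (True, False, True)]

# Applying the same trick again restores the original rows.
restored = list(zip(*columns))
print(restored == rows)  # True
```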

#### reversed

reversed iterates over the elements of a sequence in reverse order:



In [None]:
reversed(range(10))

reversed returns a lazy iterator (similar in spirit to the generators discussed in more detail later), so it does not create the reversed sequence until materialized (e.g., with list or a for loop).



In [None]:
list(reversed(range(10)))

### dict

- dict may be the most important built-in Python data structure.

- A dict is a collection of key-value pairs, where key and value are Python objects. Since Python 3.7, dicts preserve the order in which keys were inserted. Each key is associated with a value so that a value can be conveniently retrieved, inserted, modified, or deleted given a particular key.
  
- One approach for creating one is to use curly braces {} and colons to separate keys and values:

In [1]:
empty_dict = {}
d1 = {'a' : 'some value', 'b' : [1, 2, 3, 4]}
d1

{'a': 'some value', 'b': [1, 2, 3, 4]}

You can access, insert, or set elements using the same syntax as for accessing elements of a list or tuple:



In [2]:
a = [1,3,4]

a[2] = 6  # overwrite an existing list element

a

[1, 3, 6]

In [3]:
d1[7] = 'an integer'  # add a new key-value pair to the dict
d1


{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}

In [4]:
d1['b']

[1, 2, 3, 4]

You can check if a dict contains a key using the same syntax used for checking whether a list or tuple contains a value

In [5]:
'b' in d1

True

You can delete values either using the del keyword or the pop method (which simultaneously returns the value and deletes the key):

In [6]:
d1[5] = 'some value'
d1


{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer', 5: 'some value'}

In [7]:
d1['dummy'] = 'another value'
d1


{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 5: 'some value',
 'dummy': 'another value'}

In [8]:
del d1[5]
d1


{'a': 'some value',
 'b': [1, 2, 3, 4],
 7: 'an integer',
 'dummy': 'another value'}

In [9]:
ret = d1.pop('dummy')
ret


'another value'

In [10]:
d1

{'a': 'some value', 'b': [1, 2, 3, 4], 7: 'an integer'}
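Both del and pop raise a KeyError when the key is missing. The sketch below, which uses a hypothetical `inventory` dict, shows a few ways to handle an absent key without an exception:

```python
inventory = {'apples': 3, 'pears': 5}

# pop with a second argument returns that default instead of raising KeyError.
removed = inventory.pop('plums', 0)
print(removed)                     # 0

# get looks a key up without modifying the dict; the default is None.
print(inventory.get('plums'))      # None
print(inventory.get('apples', 0))  # 3

# del on a missing key raises KeyError, so guard it if the key may be absent.
if 'plums' in inventory:
    del inventory['plums']
print(inventory)                   # {'apples': 3, 'pears': 5}
```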

The keys and values methods give you iterable view objects over the dict's keys and values, respectively. The keys are returned in insertion order, and both methods output the keys and values in the same respective order:

In [11]:
d1.keys()

dict_keys(['a', 'b', 7])

In [12]:
type(d1.keys())  # a view object over the dict's keys

dict_keys

In [13]:
list(d1.keys())

['a', 'b', 7]

In [14]:
tuple(d1.keys())

('a', 'b', 7)

In [15]:
list(d1.values())

['some value', [1, 2, 3, 4], 'an integer']

You can merge one dict into another using the update method:



In [18]:
d1.update({'b' : 'foo', 'c' : 12})  # analogous to extend for lists
d1

{'a': 'some value', 'b': 'foo', 7: 'an integer', 'c': 12}

> The update method changes dicts in-place, so any existing keys in the data passed to update will have their old values discarded.

#### Creating dicts from sequences

It’s common to end up with two sequences that you want to pair up element-wise in a dict. As a first cut, you might write code like this:



In [None]:
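key_list = ['a', 'b', 'c']  # example data (not from the original text)
value_list = [1, 2, 3]      # example data (not from the original text)
mapping = {}
for key, value in zip(key_list, value_list):
    mapping[key] = value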

Since a dict is essentially a collection of 2-tuples, the dict function accepts a list of 2-tuples:



In [19]:
mapping = dict(zip(range(5), reversed(range(5))))  # dict accepts an iterable of 2-tuples
mapping

{0: 4, 1: 3, 2: 2, 3: 1, 4: 0}

#### Valid dict key types

- The values of a dict can be any Python object.

- The keys generally have to be immutable objects like scalar types (int, float, string) or tuples (all the objects in the tuple need to be immutable, too). The technical term here is hashability. 

- You can check whether an object is hashable (can be used as a key in a dict) with the hash function:

In [30]:
{'string' : [2,1], 5: "b" }

{'string': [2, 1], 5: 'b'}

In [29]:
hash('string')

-2908809811889502488

In [31]:
hash((1, 2, (2, 3)))

-9209053662355515447

In [32]:
hash((1, 2, [2, 3])) # fails because lists are mutable

TypeError: unhashable type: 'list'

To use a list as a key, one option is to convert it to a tuple, which can be hashed as long as its elements also can:



In [33]:
d = {}
d[tuple([1, 2, 3])] = 5
d

{(1, 2, 3): 5}

In [34]:
hash((1, 2, 3))

529344067295497451

In [35]:
hash(5)

5
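The hash check can be wrapped in a small helper. This is only a sketch; `is_hashable` is a hypothetical name, not a built-in:

```python
def is_hashable(obj):
    """Return True if obj can be used as a dict key or set element."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable('string'))        # True
print(is_hashable((1, 2, (2, 3))))  # True
print(is_hashable((1, 2, [2, 3])))  # False: the tuple contains a list
print(is_hashable([1, 2, 3]))       # False
```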

### set

> A set is an unordered collection of unique elements. You can think of them like dict keys, but keys only, no values. 

A set can be created in two ways: via the set function or via a set literal with curly braces:

In [37]:
list((1, 2))  # list, dict, tuple, and set are all type functions

[1, 2]

In [38]:
set([2, 2, 2, 1, 3, 3])

{1, 2, 3}

In [39]:
{2, 2, 2, 1, 3, 3}

{1, 2, 3}

Sets support mathematical set operations like union, intersection, difference, and symmetric difference.

In [40]:
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

The union of these two sets is the set of distinct elements occurring in either set. This can be computed with either the union method or the | binary operator

In [41]:
a.union(b)

{1, 2, 3, 4, 5, 6, 7, 8}

In [47]:
m = a | b
m

{1, 2, 3, 4, 5, 6, 7, 8}

The intersection contains the elements occurring in both sets. The & operator or the intersection method can be used:


In [43]:
a.intersection(b)


{3, 4, 5}

In [44]:
a & b

{3, 4, 5}
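Difference and symmetric difference, mentioned earlier but not demonstrated, follow the same method-or-operator pattern. A short sketch using the same two sets (results are wrapped in sorted only to make the printed order predictable):

```python
a = {1, 2, 3, 4, 5}
b = {3, 4, 5, 6, 7, 8}

# Difference: elements of a that are not in b.
only_in_a = a.difference(b)  # same as a - b
print(sorted(only_in_a))     # [1, 2]

# Symmetric difference: elements in exactly one of the two sets.
in_one_only = a.symmetric_difference(b)  # same as a ^ b
print(sorted(in_one_only))   # [1, 2, 6, 7, 8]
```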

All of the logical set operations have in-place counterparts, which enable you to replace the contents of the set on the left side of the operation with the result. For very large sets, this may be more efficient:

In [45]:
c = a.copy()
c

{1, 2, 3, 4, 5}

In [48]:
c |= b
c

{1, 2, 3, 4, 5, 6, 7, 8}

In [None]:
d = a.copy()
d &= b
d

Like a dict's keys, a set's elements generally must be immutable, and they must be hashable (which means that calling hash on a value does not raise an exception). In order to store list-like elements (or other mutable sequences) in a set, you can convert them to tuples:

In [49]:
my_data = {[1, 2, 3, 4]}  # fails: a list is mutable and therefore unhashable
my_data

TypeError: unhashable type: 'list'

In [50]:
my_data = {(1, 2, 3, 4)}  # works: a tuple of hashable items is hashable
my_data

{(1, 2, 3, 4)}

In [None]:
my_data = [1, 2, 3, 4]
my_set = {tuple(my_data)}  # convert the list to a tuple so that it becomes hashable
my_set

You can also check if a set is a subset of (is contained in) or a superset of (contains all elements of) another set

In [51]:
a_set = {1, 2, 3, 4, 5}
{1, 2, 3}.issubset(a_set)

True

In [52]:
a_set.issuperset({1, 2, 3})

True

Sets are equal if and only if their contents are equal:



In [53]:
{1, 2, 3} == {3, 2, 1}

True

### List, Set, and Dict Comprehensions

#### Motivation: creating a list using a for loop

Suppose you want to create a list containing the first ten perfect squares. You can do this in three lines of code:

In [55]:
list(range(10))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [56]:
squares = []
for i in range(10): # 0,1,....9
    squares.append(i * i)
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

> Here, you instantiate an empty list, squares. Then, you use a for loop to iterate over range(10). Finally, you multiply each number by itself and append the result to the end of the list.

#### Motivation: creating a list using the map function

> map() provides an alternative approach that’s based in functional programming. You pass in a function and an iterable, and map() returns a map object, a lazy iterator that applies the function to each element as it is consumed.

As an example, consider a situation in which you need to calculate the price after tax for a list of transactions:

In [57]:
txns = [1.09, 23.56, 57.84, 4.56, 6.78]
TAX_RATE = .08

def get_price_with_tax(txn):
    return txn * (1 + TAX_RATE)


In [58]:
final_prices = map(get_price_with_tax, txns)
list(final_prices)

[1.1772000000000002, 25.4448, 62.467200000000005, 4.9248, 7.322400000000001]

- List comprehensions are a third way of making lists. With this elegant approach, you can rewrite the for loop and the map approach in a single line of code.

- A list comprehension builds a new list by transforming (and, optionally, filtering) the elements of an iterable in one concise expression:

             new_list = [expression for member in iterable]


In [59]:
squares = []
for i in range(10): # 0,1,....9
    squares.append(i * i)
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

In [61]:
import math

In [63]:
squares = [i * i for i in range(10)]
squares

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Every list comprehension in Python includes three elements:

1. **expression** is the member itself, a call to a method, or any other valid expression that returns a value. In the example above, the expression i * i is the square of the member value.

2. **member** is the object or value in the list or iterable. In the example above, the member value is i.

3. **iterable** is a list, set, sequence, generator, or any other object that can return its elements one at a time. In the example above, the iterable is range(10)

#### List comprehension for filtering the elements of a collection


             new_list = [expression for member in iterable (if conditional)]


> Conditionals are important because they allow list comprehensions to filter out unwanted values, which would normally require a call to filter().


For example, given a list of strings, we could filter out strings with length 2 or less and also convert them to uppercase like this:



In [64]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
[x.upper() for x in strings if len(x) > 2]

['BAT', 'CAR', 'DOVE', 'PYTHON']

As another example, the following filters out any characters in sentence that aren’t vowels:

In [65]:
sentence = 'the rocket came back from mars'
vowels = [i for i in sentence if i in 'aeiou']
vowels

['e', 'o', 'e', 'a', 'e', 'a', 'o', 'a']

Set and dict comprehensions are a natural extension, producing sets and dicts in an idiomatically similar way instead of lists. A dict comprehension looks like this:



In [None]:
# General form (a template, not runnable code):
# dict_comp = {key_expr: value_expr for value in collection if condition}

A set comprehension looks like the equivalent list comprehension, except with curly braces instead of square brackets:

In [None]:
# General form (a template, not runnable code):
# set_comp = {expr for value in collection if condition}

> Like list comprehensions, set and dict comprehensions are mostly conveniences, but they similarly can make code both easier to write and read. 

In [70]:
strings = ['a', 'as', 'bat', 'car', 'dove', 'python']
unique_lengths = {len(x) for x in strings}
unique_lengths

{1, 2, 3, 4, 6}

We could also express this more functionally using the map function:



In [None]:
set(map(len, strings))

As a simple dict comprehension example, we could create a lookup map of these strings to their locations in the list:



In [71]:
loc_mapping = {val : index for index, val in enumerate(strings)}
loc_mapping

{'a': 0, 'as': 1, 'bat': 2, 'car': 3, 'dove': 4, 'python': 5}

##### List comprehension for updating values

> You can place the conditional at the end of the statement for simple filtering, but what if you want to change a member value instead of filtering it out? In this case, it’s useful to place the conditional near the beginning of the expression:

                new_list = [expression if conditional else alternative for member in iterable]

For example, if you have a list of prices, then you may want to replace negative prices with 0 and leave the positive values unchanged:

In [72]:
original_prices = [1.25, -9.45, 10.22, 3.78, -5.92, 1.16]
prices = [i if i > 0 else 0 for i in original_prices]
prices

[1.25, 0, 10.22, 3.78, 0, 1.16]

#### Nested list comprehensions

Suppose we have a list of lists containing some English and Spanish names:

In [73]:
all_data = [['John', 'Emily', 'Michael', 'Mary', 'Steven'],
            ['Maria', 'Juan', 'Javier', 'Natalia', 'Pilar']]

Suppose we wanted to get a single list containing all names with two or more a’s in them. We could certainly do this with a simple for loop:

In [None]:
names_of_interest = []
for names in all_data:
    enough_as = [name for name in names if name.count('a') >= 2]
    names_of_interest.extend(enough_as)

names_of_interest

You can actually wrap this whole operation up in a single nested list comprehension, which will look like:



In [75]:
result = [name for names in all_data for name in names if name.count('a') >= 2]
result

['Maria', 'Natalia']

> At first, nested list comprehensions are a bit hard to wrap your head around. The for parts of the list comprehension are arranged according to the order of nesting, and any filter condition is put at the end as before. 

Here is another example where we “flatten” a list of tuples of integers into a simple list of integers:

In [2]:
some_tuples = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
flattened = [x for tup in some_tuples for x in tup]
flattened

[1, 2, 3, 4, 5, 6, 7, 8, 9]

Keep in mind that the order of the for expressions would be the same if you wrote a nested for loop instead of a list comprehension:

In [None]:
flattened = []

for tup in some_tuples:
    for x in tup:
        flattened.append(x)

It’s important to distinguish the syntax just shown from a list comprehension inside a list comprehension, which is also perfectly valid:

In [3]:
[[x for x in tup] for tup in some_tuples]

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

This produces a list of lists, rather than a flattened list of all of the inner elements.


Suppose you have this nested list of strings and you want to convert each element to a float:

In [None]:
l = [['40', '20'], ['20', '20', '20'], ['30', '20', '30']]

In [None]:
[[float(y) for y in x] for x in l]

In [None]:
l = [['40', '20'], ['20', '20', '20'], ['30', '20', '30']]

In [None]:
[float(y) for x in l for y in x]  # flatten the list while converting

## Functions

> You should consider writing a function whenever you've copied and pasted a block of code more than twice (i.e. you now have three copies of the same code)... Hadley Wickham

- Functions are the primary and most important method of code organization and reuse in Python. 

- As a rule of thumb, if you anticipate needing to repeat the same or very similar code more than once, it may be worth writing a reusable function. 

- Functions can also help make your code more readable by giving a name to a group of Python statements.

> Functions are declared with the def keyword. A function contains a block of code, with optional use of the return keyword:

In [83]:
def add(a, b):  # named add rather than sum, to avoid shadowing the built-in sum
    """Return the sum of two numbers.

    Args:
        a (int): first number
        b (int): second number

    Returns:
        int: the sum of a and b
    """
    return a + b

add(3, 4)

7

In [84]:
add?

Signature: add(a, b)
Docstring:
Return the sum of two numbers.

Args:
    a (int): first number
    b (int): second number

Returns:
    int: the sum of a and b
File:      /var/folders/l_/499_8k_5387b9tz3wp5g8lg00000gn/T/ipykernel_30799/3104519110.py
Type:      function


In [90]:
def my_function(x, y, z=1.5):  # x, y, and z are parameters
    if z > 1:
        return z * (x + y)
    else:
        return z / (x + y)

In [92]:
my_function(5, 6, z=0.7)  # 5, 6, and 0.7 are arguments


0.06363636363636363

In [87]:
my_function(10, 20)

45.0

> The terms parameter and argument are often used interchangeably for the information that is passed into a function. Strictly speaking, a parameter is the variable listed inside the parentheses in the function definition, while an argument is the value that is sent to the function when it is called.

If Python reaches the end of a function without encountering a return statement, None is returned automatically. For example:

In [98]:
def function_without_return(x):
    print(x)


In [99]:
function_without_return(4)

4


In [100]:
result = function_without_return('hello!')

hello!


In [102]:
result

In [101]:
print(result)


None


### Namespaces, Scope, and Local Functions

Functions can access variables created inside the function as well as those outside the function in higher (or even global) scopes

> An alternative and more descriptive name describing a variable scope in Python is a namespace. Any variables that are assigned within a function by default are assigned to the local namespace

In [106]:
def func():
    z = []  # z is local to the function
    for i in range(5):
        z.append(i)

In [107]:
func()

In [108]:
z

NameError: name 'z' is not defined

> When func() is called, the empty list z is created, five elements are appended, and then z is destroyed when the function exits.


- Suppose instead we had declared a list a outside the function, as follows:

In [109]:
a = [] # a is global
def func():
    for i in range(5):
        a.append(i)

Each call to func will modify the list a:

In [110]:
func()

In [111]:
a

[0, 1, 2, 3, 4]

Assigning variables outside of the function's scope is possible, but those variables must be declared explicitly using either the global or nonlocal keyword:

In [None]:
a = None
def bind_a_variable():
    global a
    a = [1,2]


In [None]:
bind_a_variable()
print(a)
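The nonlocal keyword plays the same role for variables in an enclosing function rather than the global scope. A minimal sketch, with the hypothetical names `make_counter` and `increment`:

```python
def make_counter():
    count = 0  # lives in make_counter's scope, not the global scope

    def increment():
        nonlocal count  # rebind the enclosing function's variable
        count += 1
        return count

    return increment

counter = make_counter()
print(counter())  # 1
print(counter())  # 2
```

Without the nonlocal declaration, `count += 1` would raise UnboundLocalError, because the assignment would make `count` local to `increment`.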

### Returning Multiple Values

In [115]:
def f():
    a = 5
    b = 6
    c = 7
    return (a, b, c)

In [113]:
a, b, c = f()

In [114]:
a

5

 What’s happening here is that the function is actually just returning one object, a tuple, which is then being unpacked into the result variables. In the preceding example, we could have done this instead:

In [116]:
return_value = f()
return_value

(5, 6, 7)

This is equivalent to the earlier unpacking example:

In [117]:
def f():
    a = 5
    b = 6
    c = 7
    return (a, b, c)

a, b, c = f()

A potentially attractive alternative to returning multiple values as a tuple is to return a dict instead:

In [118]:
def f():
    a = 5
    b = 6
    c = 7
    return {'a' : a, 'b' : b, 'c' : c}

### Functions Are Objects

- Because Python functions are objects, many constructs can be easily expressed that are difficult to do in other languages.

- Suppose we were doing some data cleaning and needed to apply a bunch of transformations to the following list of strings:

In [119]:
states = [' Alabama ', 'Georgia!', 'Georgia', 'georgia', 'FlOrIda',
          'south   carolina##', 'West virginia?']

In [120]:
import re

def clean_strings(strings):
    result = []
    for value in strings:
        value = value.strip()
        value = re.sub('[!#?]', '', value)
        value = value.title()
        result.append(value)
    return result

In [121]:
clean_strings(states)

['Alabama',
 'Georgia',
 'Georgia',
 'Georgia',
 'Florida',
 'South   Carolina',
 'West Virginia']

> An alternative approach that you may find useful is to make a list of the operations you want to apply to a particular set of strings:

In [None]:
def remove_punctuation(value):
    return re.sub('[!#?]', '', value)

clean_ops = [str.strip, remove_punctuation, str.title]

def clean_strings(strings, ops):
    result = []
    for value in strings:
        for function in ops:
            value = function(value)
        result.append(value)
    return result

In [None]:
clean_strings(states, clean_ops)

A more functional pattern like this enables you to easily modify how the strings are transformed at a very high level. The clean_strings function is also now more reusable and generic.

You can use functions as arguments to other functions like the built-in map function, which applies a function to a sequence of some kind:

In [None]:
for x in map(remove_punctuation, states):
    print(x)

### Anonymous (Lambda) Functions

> Python has support for so-called anonymous or lambda functions, which are a way of writing functions consisting of a single expression, the result of which is the return value. They are defined with the **lambda** keyword, which has no meaning other than “we are declaring an anonymous function”:

In [126]:
def short_function(x,y):
    return x * y

In [127]:
short_function(3,4)

12

An equivalent implementation using lambda:

In [128]:
equiv_anon = lambda x,y: x * y

In [129]:
equiv_anon(3,4)

12

Lambda functions are especially convenient in data analysis because many data transformation functions take functions as arguments. It’s often less typing (and clearer) to pass a lambda function than to write a full function declaration or even to assign the lambda function to a local variable.

Consider this silly example:

In [130]:
def apply_to_list(some_list, f):
    return [f(x) for x in some_list]


In [131]:

ints = [4, 0, 1, 5, 6]
apply_to_list(ints, lambda x: x * 2)

[8, 0, 2, 10, 12]

As another example, suppose you wanted to sort a collection of strings by the number of distinct letters in each string:

In [None]:
strings = ['foo', 'card', 'bar', 'aaaa', 'abab']

Here we could pass a lambda function to the list’s sort method:

In [None]:
strings.sort(key=lambda x: len(set(list(x))))
strings

### Currying: Partial Argument Application

> Currying is computer science jargon (named after the mathematician Haskell Curry) that means deriving new functions from existing ones by partial argument application. For example, suppose we had a trivial function that adds two numbers together:

In [None]:
def add_numbers(x, y):
    return x + y

Using this function, we could derive a new function of one variable, add_five, that adds 5 to its argument:

In [None]:
add_five = lambda y: add_numbers(5, y)

> The second argument to add_numbers is said to be curried. There’s nothing very fancy here, as all we have done is define a new function that calls an existing function. 

The built-in functools module can simplify this process using the partial function:

In [None]:
from functools import partial
add_five = partial(add_numbers, 5)
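A short usage sketch of partial follows. `add_numbers` is repeated so the example runs on its own, and `parse_binary` is a made-up name used to show that keyword arguments can be fixed as well:

```python
from functools import partial

def add_numbers(x, y):
    return x + y

# partial fixes the first positional argument (x=5).
add_five = partial(add_numbers, 5)
print(add_five(10))          # 15

# partial can also fix keyword arguments, e.g. the base of int().
parse_binary = partial(int, base=2)
print(parse_binary('1010'))  # 10
```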

### Generators

> Generator functions are a special kind of function that returns a lazy iterator. Lazy iterators are objects that you can loop over like a list. However, unlike lists, they do not store their contents in memory.

> This is accomplished by means of the iterator protocol, a generic way to make objects iterable.

 For example, iterating over a dict yields the dict keys:

In [None]:
some_dict = {'a': 1, 'b': 2, 'c': 3}
for key in some_dict:
    print(key)

When you write for key in some_dict, the Python interpreter first attempts to create an iterator out of some_dict:

In [None]:
dict_iterator = iter(some_dict)
dict_iterator

In [None]:
next(dict_iterator)

In [None]:
next(dict_iterator)

In [None]:
next(dict_iterator)

> An iterator is any object that will yield objects to the Python interpreter when used in a context like a for loop. Most methods expecting a list or list-like object will also accept any iterable object. This includes built-in methods such as min, max, and sum, and type constructors like list and tuple:

In [None]:
dict_iterator = iter(some_dict)
dict_iterator

In [None]:
list(dict_iterator)

In [None]:
dict_iterator = iter(some_dict)
dict_iterator
tuple(dict_iterator)

In [None]:
dict_iterator = iter(some_dict)
dict_iterator
max(dict_iterator)

In [None]:
dict_iterator = iter(some_dict)
dict_iterator
min(dict_iterator)
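To make the iterator protocol concrete, the following sketch spells out roughly what a for loop does with iter and next. It is an approximation for illustration, not the interpreter's actual implementation.

```python
some_dict = {'a': 1, 'b': 2, 'c': 3}

# Roughly what `for key in some_dict: ...` does under the hood.
iterator = iter(some_dict)
keys_seen = []
while True:
    try:
        key = next(iterator)
    except StopIteration:
        break  # the iterator is exhausted, so the loop ends
    keys_seen.append(key)

print(keys_seen)  # ['a', 'b', 'c']

# An exhausted iterator yields nothing more, which is why the cells above
# call iter(some_dict) again before each use.
leftover = list(iterator)
print(leftover)   # []
```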

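> Generators do not hold the entire result in memory. The built-in range is lazy in a similar way (strictly speaking, it is a lazy sequence rather than a generator):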

In [None]:
a = range(5)
a

> A generator is a convenient way, similar to writing a normal function, to construct a new iterable object. Whereas normal functions execute and return a single result at a time, generators return a sequence of multiple results lazily, pausing after each one until the next one is requested. To create a generator, use the yield keyword instead of return in a function. For comparison, here is a normal function that builds and returns the whole list at once:

In [None]:
def square(nums):
    result = []
    for i in nums:
        result.append(i * i)
    return result

my_num = square([1,2,3,4])

print(my_num)

How can we convert this to a generator?

In [None]:
def square(nums):
    for i in nums:
        yield i * i  # yield is what makes this function a generator

my_num = square([1,2,3,4])

print(my_num)

Now the output is a generator object, not a list.

> The reason is that generators don't hold the entire result in memory. A generator yields one result at a time, so at this point it is waiting for us to ask for the next result and hasn't actually computed anything yet.

In [None]:
def square(nums):
    for i in nums:
        yield i * i  # yield is what makes this a generator

my_num = square([1,2,3,4])



In [None]:
print(next(my_num))

In [None]:
print(next(my_num))

In [None]:
print(next(my_num))

In [None]:
print(next(my_num))

What if we call next one more time? We get a StopIteration error. This means the generator has been exhausted: it has no more values to yield.


In [None]:
print(next(my_num))
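As a small sketch of how exhaustion can be handled explicitly (the names `results`, `exhausted`, and `fallback` are illustrative only), `StopIteration` can be caught like any other exception, and `next` also accepts a default value to return instead of raising:

```python
def square(nums):
    for i in nums:
        yield i * i

my_num = square([1, 2])
results = [next(my_num), next(my_num)]  # the generator is now out of values

# Catch the exception raised by an exhausted generator
try:
    next(my_num)
    exhausted = False
except StopIteration:
    exhausted = True

# Or pass a default to next() so that nothing is raised
fallback = next(my_num, None)

print(results, exhausted, fallback)
```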

We can still use a for loop on a generator, and this is how generators are typically used. The loop stops automatically when the generator is exhausted.

In [None]:
def square(nums):
    for i in nums:
        yield i * i  # yield is what makes this a generator

my_num = square([1,2,3,4])

for num in my_num:
    print(num)

In [None]:
def squares(n=10):
    print('Generating squares from 1 to {0}'.format(n ** 2))
    for i in range(1, n + 1):
        yield i ** 2

In [None]:
gen = squares()
gen

In [None]:
for x in gen:
    print(x, end=' ')
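In the cells above, the message inside `squares` appears only when the loop starts, not when `squares()` is called. A minimal sketch of that behaviour, using a hypothetical `log` list in place of `print` so the timing can be inspected:

```python
log = []  # hypothetical stand-in for the print call

def squares(n=3):
    log.append('started')  # runs only once iteration begins
    for i in range(1, n + 1):
        yield i ** 2

gen = squares()
log_before = list(log)  # calling squares() has not run the body yet

first = next(gen)       # the body now runs up to the first yield
log_after = list(log)

print(log_before, first, log_after)
```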

#### Generator expressions

In [None]:
gen = (x ** 2 for x in range(100))
gen

The generator expression above is equivalent to the following more verbose generator:

In [None]:
def _make_gen():
    for x in range(100):
        yield x ** 2
gen = _make_gen()

In [None]:
sum(x ** 2 for x in range(100))
dict((i, i **2) for i in range(5))
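One reason to pass a generator expression to a function such as `sum` is that no intermediate list is built. A rough sketch, assuming CPython and using `sys.getsizeof`, which reports only the size of the container object itself and not of the integers it refers to:

```python
import sys

squares_list = [x ** 2 for x in range(100000)]  # stores all 100000 results
squares_gen = (x ** 2 for x in range(100000))   # stores only its current state

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)

# Both produce the same values, so the totals agree
total_from_gen = sum(squares_gen)
total_from_list = sum(squares_list)
print(total_from_gen == total_from_list)
```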

#### itertools module

In [None]:
import itertools
first_letter = lambda x: x[0]
names = ['Alan', 'Adam', 'Wes', 'Will', 'Albert', 'Steven']
for letter, group in itertools.groupby(names, first_letter):
    print(letter, list(group)) # group is an iterator over the matching names
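Note that `groupby` only groups consecutive elements that share a key, so the example above works because the names are already arranged by first letter. A small sketch with a deliberately shuffled list (an illustrative reordering of the same names) shows the difference that sorting by the same key makes:

```python
import itertools

first_letter = lambda x: x[0]
names = ['Alan', 'Wes', 'Adam', 'Will', 'Albert', 'Steven']

# Without sorting, each run of equal keys becomes its own group
unsorted_groups = [(letter, list(group))
                   for letter, group in itertools.groupby(names, first_letter)]

# Sorting by the same key first collects all matching names together
sorted_names = sorted(names, key=first_letter)
sorted_groups = [(letter, list(group))
                 for letter, group in itertools.groupby(sorted_names, first_letter)]

print(unsorted_groups)
print(sorted_groups)
```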

### Errors and Exception Handling

In [None]:
float('1.2345')
float('something')

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except:
        return x

In [None]:
attempt_float('1.2345')
attempt_float('something')

In [None]:
float((1, 2))

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except ValueError:
        return x

In [None]:
attempt_float((1, 2))

In [None]:
def attempt_float(x):
    try:
        return float(x)
    except (TypeError, ValueError):
        return x

f = open(path, 'w')

try:
    write_to_file(f)
finally:
    f.close()

f = open(path, 'w')

try:
    write_to_file(f)
except:
    print('Failed')
else:
    print('Succeeded')
finally:
    f.close()
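The two file snippets above rely on `path` and `write_to_file` being defined elsewhere. As a self-contained sketch of the same control flow, the hypothetical helper `run` below records which blocks execute: `else` runs only when the `try` block succeeds, and `finally` runs in both cases.

```python
def run(action):
    """Call action() and record which blocks of the try statement ran."""
    events = []
    try:
        action()
    except ZeroDivisionError:
        events.append('except')
    else:
        events.append('else')     # only when no exception was raised
    finally:
        events.append('finally')  # always runs
    return events

ok_events = run(lambda: 1 / 1)
failed_events = run(lambda: 1 / 0)

print(ok_events)
print(failed_events)
```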

#### Exceptions in IPython

In [10]: %run examples/ipython_bug.py
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
/home/wesm/code/pydata-book/examples/ipython_bug.py in <module>()
     13     throws_an_exception()
     14
---> 15 calling_things()

/home/wesm/code/pydata-book/examples/ipython_bug.py in calling_things()
     11 def calling_things():
     12     works_fine()
---> 13     throws_an_exception()
     14
     15 calling_things()

/home/wesm/code/pydata-book/examples/ipython_bug.py in throws_an_exception()
      7     a = 5
      8     b = 6
----> 9     assert(a + b == 10)
     10
     11 def calling_things():

AssertionError:

## Files and the Operating System

In [None]:
%pushd book-materials

In [None]:
path = 'examples/segismundo.txt'
f = open(path)

for line in f:
    pass

In [None]:
lines = [x.rstrip() for x in open(path)]
lines

In [None]:
f.close()

In [None]:
with open(path) as f:
    lines = [x.rstrip() for x in f]

In [None]:
f = open(path)
f.read(10)
f2 = open(path, 'rb')  # Binary mode
f2.read(10)

In [None]:
f.tell()
f2.tell()
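The two handles above can report different positions because text mode reads characters while binary mode reads bytes. A self-contained sketch, assuming a temporary UTF-8 file with illustrative content (the file name `demo.txt` and the sample text are not the notebook's `examples/segismundo.txt`):

```python
import os
import tempfile

text = 'Sueña el rico en su riqueza'  # 'ñ' takes two bytes in UTF-8

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, 'demo.txt')
    with open(demo_path, 'w', encoding='utf8') as f:
        f.write(text)

    # Text mode: read(10) returns 10 decoded characters
    with open(demo_path, encoding='utf8') as f:
        chars = f.read(10)

    # Binary mode: read(10) returns exactly 10 raw bytes
    with open(demo_path, 'rb') as f2:
        data = f2.read(10)
        binary_pos = f2.tell()

print(repr(chars))
print(data, binary_pos)
```

In text mode, `tell()` returns an opaque position rather than a character count, which is why the sketch inspects only the binary handle's position.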

In [None]:
import sys
sys.getdefaultencoding()

In [None]:
f.seek(3)
f.read(1)

In [None]:
f.close()
f2.close()

In [None]:
with open('tmp.txt', 'w') as handle:
    handle.writelines(x for x in open(path) if len(x) > 1)
with open('tmp.txt') as f:
    lines = f.readlines()
lines

In [None]:
import os
os.remove('tmp.txt')

### Bytes and Unicode with Files

In [None]:
with open(path) as f:
    chars = f.read(10)
chars

In [None]:
with open(path, 'rb') as f:
    data = f.read(10)
data

In [None]:
data.decode('utf8')
data[:4].decode('utf8')
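Decoding a slice of bytes can fail when the slice ends in the middle of a multibyte character. A minimal sketch with an illustrative string (not read from the notebook's file) in which the fourth byte is the first half of a two-byte character:

```python
data = 'Sueña'.encode('utf8')  # b'Sue\xc3\xb1a': 5 characters, 6 bytes
whole = data.decode('utf8')    # decoding the complete bytes works

try:
    data[:4].decode('utf8')    # b'Sue\xc3' ends halfway through 'ñ'
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

print(len(data), whole, error_name)
```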

In [None]:
sink_path = 'sink.txt'
with open(path) as source:
    with open(sink_path, 'xt', encoding='iso-8859-1') as sink:
        sink.write(source.read())
with open(sink_path, encoding='iso-8859-1') as f:
    print(f.read(10))

In [None]:
os.remove(sink_path)

In [None]:
f = open(path)
f.read(5)
f.seek(4)
f.read(1)
f.close()

In [None]:
%popd

## Conclusion