## 03 - 00 Python in Practice

- In the previous modules, we learnt about Python's built-in 'simple' types. In this module, we will learn about the different ways in which data of those types can be organized so that it can be used efficiently by algorithms.
> We have already covered `str`, `int`, `float`, and `complex`. Here, we will also introduce the remaining two built-in types: `bool` and `None`.

- We will also take a look at how to reduce code duplication and improve code readability by using Functions.

- Finally, we will look at Exception Handling, where we will learn ways of making our code handle errors without crashing.

## Data Structures

Python's most basic data structure is the `sequence`. Each element of a sequence is assigned a number, known as its index. The different types of sequences in Python are:
- `List`
- `Tuple`
- `Range`
- Text sequence types e.g. `str`
- Binary sequence types e.g. `bytes`, `bytearray`, `memoryview`

Python has several built-in data structures (or compound types) that act as containers and hold the other types. They are:
- List
- Tuples
- Sets
- Dictionaries

## 03 - 01 Lists
The `list` is the most versatile datatype available in Python. It is written as comma-separated values (items) between square brackets, and the items need not be of the same type.

Example:

In [1]:
# Notice the mixed data types
my_list = ['Python', 'Julia', 1, 3.1415]
print("Contents of my_list: {} \nType: {} ".format(my_list, type(my_list)))

Contents of my_list: ['Python', 'Julia', 1, 3.1415] 
Type: <class 'list'> 


> `\n` is the escape sequence for a line break.

Python uses zero-based indexing, so to get the first item we ask for the element at index 0.

In [2]:
my_list[0]

'Python'

We will now briefly go over the basic list manipulations (which are similar to those of strings) and then look at some more methods that make lists unique.

#### .. 01.01 Slicing the List

A shallow copy of the list is performed and a new list is created containing the requested elements.

Example:

In [3]:
my_list = ['Python', 'Julia', 1, 3.1415]
# Performs shallow copy and returns a **new** list with first two elements
my_list[:2] # [:2] means between start=0 and stop=2 index values (excluding stop)

['Python', 'Julia']

Slicing can also pick every **n-th** value from a list by passing n as the third (step) argument.

In [4]:
my_list[1::2]   

['Julia', 3.1415]

In [5]:
my_list[0::2]

['Python', 1]

In [6]:
range(10)[1::3]

range(1, 10, 3)

In [7]:
# Remember, the element at the stop index is not included in the slice.
my_list[-3:-2:1]  # Take every element from index -3 up to (but not including) index -2

['Julia']

In [8]:
my_list[-3:-1]

['Julia', 1]
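
The slicing behaviour described above can be checked with a minimal sketch. The names `first_two` and `reversed_copy` are illustrative and not part of the original notebook; the point is that a slice is a new list, so changing it leaves the original untouched.

```python
my_list = ['Python', 'Julia', 1, 3.1415]

# Slicing builds a new list object; the original is left untouched.
first_two = my_list[:2]
first_two[0] = 'Rust'          # modify the slice only

# A negative step walks the list from right to left.
reversed_copy = my_list[::-1]

print(first_two)      # ['Rust', 'Julia']
print(my_list)        # ['Python', 'Julia', 1, 3.1415]
print(reversed_copy)  # [3.1415, 1, 'Julia', 'Python']
```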

#### .. 01.02 Updating the List

Unlike strings and tuples, which are immutable, the elements of a list can be changed without having to create a new list; this makes lists mutable.

Example:

In [2]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list[2] = 'Java'
print('Original Contents: \n', my_list)
print('Original Length of array: \n', len(my_list))
# Remove some elements/ changing the size
my_list[2:4] = []
print('Modified Contents: \n', my_list)
print('Modified Length of array: \n', len(my_list))

Original Contents: 
 ['Python', 'Julia', 'Java', 3.1415]
Original Length of array: 
 4
Modified Contents: 
 ['Python', 'Julia']
Modified Length of array: 
 2


In [11]:
my_list = ['Python', 'Julia', 1, 3.1415]
print('Original\n {} ' .format(my_list))

Original
 ['Python', 'Julia', 1, 3.1415] 


In [14]:
my_list = ['Python', 'Julia', 1, 3.1415]    # format() needs a {} placeholder; without one, my_list is not inserted
print('original:\n' .format(my_list))

original:



#### .. 01.03 Appending to the List

New items can be easily added to the list by using the `append()` method.

Example:

In [15]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.append('C++') # It will append the item to the end of the list
print(my_list)

['Python', 'Julia', 1, 3.1415, 'C++']


#### .. 01.04 Copying Lists

There are many ways to create a copy of a list in Python. Let's take a look at a few techniques:
##### using copy module

The `copy` module ships with Python's standard library, so you don't have to install anything to use it. Why does a separate module exist? From our first week's session on variables, we know that assignment statements (`=`) do not copy objects; they merely create a binding between the target name and the object. For collections that are mutable or contain mutable items, a real copy is sometimes needed so that one can be changed without changing the other. There are two ways of copying an object: a shallow copy and a deep copy. A shallow copy constructs a new object and then inserts into it references to the items found in the original, whereas a deep copy constructs a new object and recursively copies the items found in the original. (If you are curious to know more, head over to the [`official documentation`](https://docs.python.org/3.4/library/copy.html).)

In [16]:
import copy
my_list  = ['Python', 'Julia', 1, 3.1415]
my_list1 = copy.copy(my_list)  # Shallow copy.. Fast
my_list2 = copy.deepcopy(my_list)  # Deep copy.. Slower
print(my_list1,my_list2)

['Python', 'Julia', 1, 3.1415] ['Python', 'Julia', 1, 3.1415]
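
With a flat list the two copies look identical, so the difference only shows up once the list contains a mutable item. The following is a small sketch under that assumption; the nested list `original` is made up for illustration.

```python
import copy

# A nested list: the inner list is itself a mutable object.
original = ['Python', ['Julia', 'R']]

shallow = copy.copy(original)      # new outer list, shared inner list
deep = copy.deepcopy(original)     # new outer list and new inner list

original[1].append('Go')           # mutate the inner list of the original

print(shallow)  # ['Python', ['Julia', 'R', 'Go']]
print(deep)     # ['Python', ['Julia', 'R']]
```

The shallow copy sees the change because it holds a reference to the same inner list; the deep copy does not.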


##### using slice

In [17]:
my_list = ['Python', 'Julia', 1, 3.1415]
# A slice with no start or stop copies the whole list
my_list1 = my_list[:]
print(my_list1)

['Python', 'Julia', 1, 3.1415]


##### using list constructor

In [18]:
my_list = ['Python', 'Julia', 1, 3.1415]
# when the list constructor is given a list, it creates a (shallow) copy of that list
my_list1 = list(my_list)
print(my_list1)

['Python', 'Julia', 1, 3.1415]


#### .. 01.05 Delete List Elements
To remove list elements, you can use either the `del` statement or the list's `remove` method (discussed later).

Example:

In [19]:
my_list = [1, 2, 3, 4, 5]
del my_list[4]
my_list

[1, 2, 3, 4]

#### .. 01.06 Nested Lists
It is also possible to create a list of lists.

Example:

In [20]:
my_list1 = [1, 2, 3, 4, 5]
my_list2 = ['a', 'b', 'c', 'd', 'e']
my_list3 = [my_list1, my_list2]
my_list3

[[1, 2, 3, 4, 5], ['a', 'b', 'c', 'd', 'e']]

#### .. 01.07 List Concatenation, Repetition, Membership:

These list operations work the same way as they do for strings. Take a look at the following examples:

In [21]:
my_list1 = [1, 2, 3, 4, 5]
my_list2 = ['a', 'b', 'c', 'd', 'e']
my_list1 + my_list2  # List Concatenation

[1, 2, 3, 4, 5, 'a', 'b', 'c', 'd', 'e']

In [22]:
# List Repetition
my_list1 * 2

[1, 2, 3, 4, 5, 1, 2, 3, 4, 5]

In [23]:
# Membership operator, returns True if the value is a member of the list
3 in my_list1

True

#### .. 01.08 Traversing a list:

The most straightforward way to traverse a list is using loops:
> We will look at loops in detail in loops module

##### for loop:
Example:

In [24]:
my_list = [1, 2, 3, 4, 5]
for element in my_list:
    print(element)

1
2
3
4
5


Let's traverse the list using the index numbers.

Example:

In [25]:
for index in range(len(my_list)): # start from 0 and go till the length of the list.
    print("my_list[{}] : {}".format(index, my_list[index]))

my_list[0] : 1
my_list[1] : 2
my_list[2] : 3
my_list[3] : 4
my_list[4] : 5


##### While loop:
Just like for loop, we can traverse the list based on its index numbers (again, we'll learn about loops in next module):

Example:

In [26]:
index = 0
# till index is less than length of list
while index < len(my_list):
    print("my_list[{}] : {}".format(index, my_list[index]))
    # increment index by 1 at every iteration
    index += 1

my_list[0] : 1
my_list[1] : 2
my_list[2] : 3
my_list[3] : 4
my_list[4] : 5


#### .. 01.09 enumerate( )
Python has a built-in function called `enumerate` which yields both the index and the value of each item in a list (or any other iterable object).

Example:

In [27]:
for ind, val in enumerate(my_list):      # more idiomatic than indexing with range(len(...))
    print("my_list[{}] : {}".format(ind, val))

my_list[0] : 1
my_list[1] : 2
my_list[2] : 3
my_list[3] : 4
my_list[4] : 5


#### .. 01.10 List Comprehension
A list comprehension is a concise syntax for creating a new list from an existing list (or any other iterable), much like the copies we created above. 
The basic structure includes a `for` clause that traverses the list and an optional `if` condition that filters the elements; the value of the leading expression for each element that passes the filter is stored in the new list. Let's take a look at a quick example.

Example:

In [29]:
my_list1 = [x for x in my_list if x % 2 == 0] 
print(my_list1)

[2, 4]


We simply create a list `my_list1` from the elements in `my_list` that are evenly divisible by 2.

There are many ways in which list comprehensions can be used. They are a shorthand for writing compact, more readable code.
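
A few common shapes of a comprehension are sketched below. The list `numbers` and the result names are illustrative only; the last part shows the explicit loop that the filtering comprehension stands for.

```python
numbers = [1, 2, 3, 4, 5]

# Transform every element: square each number.
squares = [n * n for n in numbers]

# Filter, then transform: squares of the even numbers only.
even_squares = [n * n for n in numbers if n % 2 == 0]

# A conditional expression (if/else) goes before the `for`.
labels = ['even' if n % 2 == 0 else 'odd' for n in numbers]

# The filtering comprehension is equivalent to this explicit loop.
even_squares_loop = []
for n in numbers:
    if n % 2 == 0:
        even_squares_loop.append(n * n)

print(squares)       # [1, 4, 9, 16, 25]
print(even_squares)  # [4, 16]
print(labels)        # ['odd', 'even', 'odd', 'even', 'odd']
```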

#### .. 01.11 Built-in List Functions and Methods:

Python provides the following built-in functions and methods for lists:

##### max:

This function returns the element of the list with the maximum value.

Example:

In [11]:
my_list1 = [1, 2, 3]
max(my_list1)

3

In [20]:
my_list1=[1,2,3,4]
my_list2=[1,2,3,4,5]
k=[my_list1,my_list2]
k

[[1, 2, 3, 4], [1, 2, 3, 4, 5]]

In [21]:
max(k)   # lists are compared element by element; when one list is a prefix of the other, the longer one is larger

[1, 2, 3, 4, 5]

In [23]:
min(k)

[1, 2, 3, 4]


> What do you think will happen if we compare a list of the lists (nested list)?
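
To see that nested lists are not compared by length, here is a small sketch with two made-up lists; the names `short_but_large` and `long_but_small` are purely illustrative.

```python
short_but_large = [9]
long_but_small = [1, 2, 3, 4, 5]

# Lists are compared element by element, left to right (lexicographically).
# 9 > 1 decides the comparison immediately; length is not considered.
print(max([short_but_large, long_but_small]))   # [9]

# Length only matters when one list is a prefix of the other:
# the shorter list is then considered the smaller one.
print([1, 2, 3, 4] < [1, 2, 3, 4, 5])           # True
```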

##### min:

This function returns the element of the list with the minimum value.

In [22]:
min(my_list)

1

##### list:

This function takes any sequence (or other iterable) and converts it to a list. It can also be used to convert a tuple to a list.

Example:

In [28]:
my_list = list(('Python', 'Julia', 1, 3.1415))  # iterable as a tuple
my_list

['Python', 'Julia', 1, 3.1415]

In [29]:
len(my_list)

4

In [31]:
my_list.count(3.1415)

1

The `list(...)` call above might be a little confusing. `list` is a built-in function which either creates an empty list if it is called with no arguments, or creates a new list from the iterable/ sequence that it is given as input. That means that `list` takes at most one argument. Thus we have to put our elements in parentheses (which makes them a `tuple`, btw) and then pass that as the argument to `list`. 

##### list.count(obj):
This method returns the number of times the object, that is passed as a parameter, occurs in the list.

Example:

In [32]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.count('Python')

1

##### list.extend(seq):
This method appends the contents of a sequence to a list.
> Equivalent operation using slicing -- `my_list1[len(my_list1):] = my_list2` and print `my_list1`.

In [33]:
my_list1 = ['Python', 'Julia', 1, 3.1415]
my_list2 = ['C++', 'Java', 2, 2.7182]
my_list1.extend(my_list2)
print(my_list1)

['Python', 'Julia', 1, 3.1415, 'C++', 'Java', 2, 2.7182]


In [30]:
my_list1 = ['Python', 'Julia', 1, 3.1415]    # append() adds my_list2 as a single nested element, unlike extend()
my_list2 = ['C++', 'Java', 2, 2.7182]
my_list1.append(my_list2)
my_list1

['Python', 'Julia', 1, 3.1415, ['C++', 'Java', 2, 2.7182]]

In [34]:
my_list1[len(my_list1):]= my_list2
my_list1

['Python',
 'Julia',
 1,
 3.1415,
 'C++',
 'Java',
 2,
 2.7182,
 'C++',
 'Java',
 2,
 2.7182]

##### list.index(obj):

This method returns the lowest index in the list at which the object appears.

In [36]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.index('Julia')

1

##### list.insert(index, obj):

This method is used to insert the object at the offset index.

In [37]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.insert(3, 2.7182)
print(my_list)

['Python', 'Julia', 1, 2.7182, 3.1415]


##### list.pop([index]):

This method removes the item at the given index and returns it. If you don't pass an index, it will by default `pop` the last element from the list.

In [43]:
my_list = ['Python', 'Julia', 1, 3.1415]
# pop the element at index 3 (here, the last element)
my_list.pop(3)

3.1415

##### list.remove(obj):

This method removes the first occurrence of the object from the list. The `object` to be removed should be passed as an argument. Unlike `pop`, it returns `None`, and it raises a `ValueError` if the object is not in the list.

In [44]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.remove('Julia')
print(my_list)

['Python', 1, 3.1415]


##### list.reverse():

This method reverses the objects of list in place

In [45]:
my_list = ['Python', 'Julia', 1, 3.1415]
my_list.reverse()
print(my_list)

[3.1415, 1, 'Julia', 'Python']


##### list.sort(key=None, reverse=False):

This method sorts the objects of the list in place. The optional `key` parameter takes a function that computes the value to sort by for each element. You can also sort the list in reverse by passing the optional parameter `reverse=True`.

In [41]:
my_list = [7, 6, 1, 9, 2]
my_list.sort()
print("Sorted:         ", my_list)
my_list.sort(reverse=True)
print("Reverse Sorted: ", my_list)

Sorted:          [1, 2, 6, 7, 9]
Reverse Sorted:  [9, 7, 6, 2, 1]


In [48]:
x = [1,9,4,6,8,3,6]
x.sort()
x

[1, 3, 4, 6, 6, 8, 9]
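
As a brief sketch of the `key` parameter (the list `languages` is made up for illustration), the built-in `len` can serve as the key function to sort strings by length:

```python
languages = ['Python', 'C', 'Julia', 'Go']

# key= receives each element and returns the value to sort by.
languages.sort(key=len)
print(languages)   # ['C', 'Go', 'Julia', 'Python']

# key= and reverse= can be combined.
languages.sort(key=len, reverse=True)
print(languages)   # ['Python', 'Julia', 'Go', 'C']

# sorted() returns a new list and leaves the original unchanged.
by_lowercase = sorted(['b', 'A', 'c'], key=str.lower)
print(by_lowercase)  # ['A', 'b', 'c']
```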

#### .. 01.12 Performance Characteristics:

The list has following performance characteristics:

- The list object stores pointers to objects, not the actual objects themselves. The size of a list in memory depends on the number of objects in the list, not the size of the objects.
- The time needed to get or set an individual item is constant, no matter what the size of the list is (also known as “`O(1)`” behaviour).
- The time needed to append an item to the list is “amortized constant”; whenever the list needs to allocate more memory, it allocates room for a few items more than it actually needs, to avoid having to reallocate on each call (this assumes that the memory allocator is fast; for huge lists, the allocation overhead may push the behaviour towards `O(n*n)`).
- The time needed to insert an item depends on the size of the list, or more exactly, on how many items are to the right of the inserted item (`O(n)`). In other words, inserting items at the end is fast, but inserting items at the beginning can be relatively slow if the list is large.
- The time needed to remove an item is about the same as the time needed to insert an item at the same location; removing items at the end is fast, removing items at the beginning is slow.
- The time needed to reverse a list is proportional to the list size (`O(n)`).
- The time needed to sort a list varies; the worst case is `O(n log n)`, but typical cases are often a lot better than that.
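
The difference between appending at the end and inserting at the beginning can be measured with the standard `timeit` module. This is only a rough sketch: the list size `n` and the helper function names are arbitrary choices, and the absolute timings depend on the machine, so no expected output is shown.

```python
import timeit

n = 5000  # hypothetical list size, chosen only to keep the run short

def append_at_end():
    items = []
    for i in range(n):
        items.append(i)        # amortized O(1) per call
    return items

def insert_at_front():
    items = []
    for i in range(n):
        items.insert(0, i)     # O(len(items)) per call: every element shifts
    return items

append_time = timeit.timeit(append_at_end, number=10)
insert_time = timeit.timeit(insert_at_front, number=10)
print('append at end  : {:.4f} s'.format(append_time))
print('insert at front: {:.4f} s'.format(insert_time))
```

On most machines the second figure should be noticeably larger, and the gap should widen as `n` grows.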

## 03 - 02 Tuples
A Python tuple is much like a list except that it is immutable, i.e. unchangeable once created. Tuples are written with parentheses, and creating one is as easy as putting different items separated by commas between parentheses.

In [54]:
my_tup = ('Python', 'Julia', 1, 3.1415)
type(my_tup)

tuple

Pretty easy. So the next question is: why do we need a new datatype? The answer can be summed up in three points:

- Tuples are typically a little faster and more compact than lists. If you define a set of constant values and all you ever want to do is read those values, you should use a tuple instead of a list.
- Safer code. Tuples are like 'write-protected' lists, so the data cannot be changed by accident.
- Tuples are used in string formatting (we will see this in some examples below).
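
Two of these points can be illustrated with a short sketch. The tuple `point` is hypothetical, and the `try`/`except` construct used to catch the error is covered properly in the Exception Handling part of this module.

```python
point = (3, 4)

# Tuples are immutable: item assignment raises TypeError.
try:
    point[0] = 10
except TypeError as err:
    print('Cannot modify a tuple:', err)

# Tuples supply the values for %-style string formatting.
message = 'x = %d, y = %d' % point
print(message)   # x = 3, y = 4
```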

### .. 02.01 Creating a Tuple
We already saw one example above on how to create tuples with multiple items but to create a tuple with a single item, you need to include a comma after the first item.
> See what happens when you don't enter comma. What is the `type` of such object?

In [59]:
my_tup = ('test',)
type(my_tup)

tuple

In [60]:
my_tup = ('test')
type(my_tup)

str

### .. 02.02 Slicing the Tuple

Slicing a tuple is similar to slicing a list.

In [62]:
my_tup = ('Python', 'Julia', 1, 3.1415)
# A negative step traverses the tuple from right to left;
# a step of -2 takes every second element, starting from the last one
print(my_tup[::-2])

(3.1415, 'Julia')


From above example we can observe that just like lists, slicing a tuple returns a new shallow copied tuple containing the requested items.

### .. 02.03 Tuple Concatenation, Repetition, Membership

Tuples are immutable objects, which means that you cannot update, append, remove or modify the items in a tuple. What you can do, however, is take items from different tuples and create new tuples from them. Let's take a look at some examples:

In [31]:
tuple1 = (1, 2, 3, 4, 5)
tuple2 = ('a', 'b' , 'c' ,'d' , 'e')
# Tuple Concatenation
tuple1 + tuple2

(1, 2, 3, 4, 5, 'a', 'b', 'c', 'd', 'e')

In [32]:
# Tuple Repetition
tuple1 * 2

(1, 2, 3, 4, 5, 1, 2, 3, 4, 5)

In [33]:
# Membership operator, returns True if the value is a member of the tuple
'a' in tuple2

True

### .. 02.04 Nested Tuples

It is also possible to create a tuple of tuples or tuple of lists. 

In [35]:
list1 = ['Python', 'Julia', 1, 3.1415]
 # List of tuples is possible too!
list2 = [('a', 'b'), ('c', 'd')]
tuple1 = (1, 2, 3, 4, 5)
# Concatenating the list and converting to tuple. 
# Then adding two tuples and storing it in another tuple
tuple2 = tuple(list1 + list2)  + tuple1
tuple2

('Python', 'Julia', 1, 3.1415, ('a', 'b'), ('c', 'd'), 1, 2, 3, 4, 5)

> - Remember, we cannot concatenate a `list` and a `tuple` so we concatenate two lists and convert the new list into tuple by using the tuple built-in function. Then we concatenate that to tuple1 and store the new tuple as tuple2.

### .. 02.05 Traversing a Tuple

Tuples can be traversed using the index value of the items. The most straightforward way of traversing a tuple is by using loops.
Example: (extending the above example and using the items of tuple2)

In [36]:
for ind, val in enumerate(tuple2):
    print('tuple2[{}]: {}'.format(ind, val))

tuple2[0]: Python
tuple2[1]: Julia
tuple2[2]: 1
tuple2[3]: 3.1415
tuple2[4]: ('a', 'b')
tuple2[5]: ('c', 'd')
tuple2[6]: 1
tuple2[7]: 2
tuple2[8]: 3
tuple2[9]: 4
tuple2[10]: 5


Almost everything that we did with lists applies here. Of course, the exception is that we can modify the items of a list but not those of a tuple. Remember, tuples are immutable.

### .. 02.06 Tuple Comprehension
We know that a list comprehension uses a `for` clause that traverses the list, optionally filters the elements with an `if` condition, and creates a new list with the output. So for tuples it should be the same as for lists, right? Let's see:

In [32]:
tuple1 = (1, 2, 3, 4, 5)
# Same example as in list comprehension
tuple2 = (elem for elem in tuple1 if elem%2 == 0)
type(tuple2)

generator

In [33]:
next(tuple2)


2

Remember when we talked about list comprehensions and got all happy looking at such an easy way to create new lists? Conceptually, a comprehension for a list or a dictionary behaves like a generator expression whose output is collected into a specific type. 

A generator expression produces its values lazily, one at a time, much like the values of `range` are produced on demand. A list comprehension gives the same result as passing the equivalent generator expression to `list()`. Knowing this, you might feel that you don't need list comprehensions, but they are awfully handy once you start writing code that uses lists. If you use the comprehension syntax with parentheses, you get a generator expression rather than a tuple, and you can obtain its values one at a time using the built-in `next()` function. This also means Python did not need to invent another brace or bracket for tuples.

> - How do you obtain all the elements of generator without using loops? 
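
One possible answer, sketched with the same `tuple1` as above: pass the generator expression to a constructor such as `tuple()` or `list()`, which consumes it for you. The names `evens` and `gen` are illustrative.

```python
tuple1 = (1, 2, 3, 4, 5)

# Passing a generator expression to tuple() consumes it and builds a tuple.
evens = tuple(elem for elem in tuple1 if elem % 2 == 0)
print(evens)         # (2, 4)

# A generator can only be consumed once.
gen = (elem for elem in tuple1 if elem % 2 == 0)
first_pass = list(gen)
second_pass = list(gen)
print(first_pass)    # [2, 4]
print(second_pass)   # []
```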

### .. 02.07 Built-in Tuple Functions and Methods

Python provides following methods for tuples:

##### tuple.count()
This method returns the number of times the object, that is passed as a parameter, occurs in the tuple.

In [39]:
tuple1.count(3)

1

##### tuple.index()
This method returns the lowest index in the tuple that object appears at.

In [40]:
tuple1.index(3)

2

## 03 - 03 Dictionaries
A Python dictionary is an interesting and useful data structure. It is a container of key-value pairs. Just like lists, dictionaries are mutable and can contain mixed types; however, each key **must** be [hashable](https://docs.python.org/3/glossary.html#term-hashable), which in practice means an immutable object such as a string, a number, or a tuple of immutable items, and must be unique within a dictionary. 

Python dictionaries are known as hash tables (or hash maps) in other programming languages. Each key is separated from its value by a colon ( `:` ) and, just like lists, the items are separated by commas; the whole thing is enclosed in curly braces (`{}`).

An important thing to remember is that dictionaries are accessed by key, not by position. Since Python 3.7 a dictionary preserves the order in which its items were inserted; older versions made no such guarantee. Because lookups are based on the hash of the key, an element can be accessed quickly regardless of the size of the dictionary.

> For curious minds who want to know more about `hash`ing (it is not *required*), [wikipedia](https://en.wikipedia.org/wiki/Hash_table) has a good writeup, and this [StackOverflow](http://stackoverflow.com/questions/2061222/what-is-the-true-difference-between-a-dictionary-and-a-hash-table) link is a useful follow-up.
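
The requirement that keys be hashable can be tried out directly. In this sketch the dictionary `capitals` and its tuple keys are made up for illustration; the `try`/`except` construct is explained later in this module.

```python
# Immutable, hashable objects such as strings, numbers and tuples work as keys.
capitals = {('IN', 'WB'): 'Kolkata', ('IN', 'MH'): 'Mumbai'}
print(capitals[('IN', 'WB')])   # Kolkata

# A list is mutable and therefore unhashable, so it is rejected as a key.
try:
    bad = {['IN', 'WB']: 'Kolkata'}
except TypeError as err:
    print('TypeError:', err)
```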

### .. 03.01 Creating Dictionaries

In [41]:
student = {'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}
print(student)
print(type(student))

{'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}
<class 'dict'>


In [42]:
# Creating empty dictionary
states = {}
# Keys are inside square brackets and values on the right side of assignment
states['WB'] = 'West Bengal'
states['AP'] = 'Andhra Pradesh'
print(states)

{'WB': 'West Bengal', 'AP': 'Andhra Pradesh'}


You can convert other data structures like lists and tuples to dictionaries too. Let's look at some of the ways to achieve that:

#### .. 03.01.01 fromkeys()

In [43]:
states_key_list   = ['WB', 'AP']
# The value is optional; if omitted, every key maps to None
states_dict       = dict.fromkeys(states_key_list, 0)
print("Just added keys with default values as 0: ", states_dict)
states_dict['WB'] = 'West Bengal'
states_dict['AP'] = 'Andhra Pradesh'
print(states_dict)

Just added keys with default values as 0:  {'WB': 0, 'AP': 0}
{'WB': 'West Bengal', 'AP': 'Andhra Pradesh'}


#### .. 03.01.02 zip()
`zip` is a Python built-in function that pairs up the elements of the iterables passed as arguments, position by position.
`zip` stops at the end of the shortest iterable, so the number of groups equals the length of the shortest one.

In [44]:
states_key_list = ['WB', 'AP']
states_val_list = ['West Bengal', 'Andhra Pradesh']
states_dict     = dict(zip(states_key_list, states_val_list))
print(states_dict)

{'WB': 'West Bengal', 'AP': 'Andhra Pradesh'}


In [1]:
# what happens if we have more keys than values
states_key_tup  = ('WB', 'AP', 'MH', 'DL')
states_val_list = ['West Bengal', 'Andhra Pradesh', 'Maharashtra']
states_dict     = dict(zip(states_key_tup, states_val_list))
print(states_dict)

{'WB': 'West Bengal', 'AP': 'Andhra Pradesh', 'MH': 'Maharashtra'}


### .. 03.03 Accessing Dictionary Items

#### .. 03.03.01 Passing key as index to the dictionary

In [2]:
student = {'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}
student['Name']

'Achilles'

#### .. 03.03.02 get(key [, default])

In [6]:
# If the key is in the dictionary, get() returns its value; otherwise it
# returns the default, in this case "Not Found".
# This will print Not Found since there is no key 'Location'
print("Location: ", student.get('Location', 'Not Found'))

Location:  Not Found


In [5]:
print("Course:   ", student.get('Course'  , 'Not Found'))

Course:    Python
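
For comparison, here is a short sketch of what happens when a missing key is accessed in the different ways. It reuses the `student` dictionary from above; the `try`/`except` construct is covered in the Exception Handling part of this module.

```python
student = {'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}

# Indexing with a missing key raises KeyError.
try:
    student['Location']
except KeyError as err:
    print('KeyError:', err)        # KeyError: 'Location'

# get() returns None when no default is given.
missing = student.get('Location')
print(missing)                     # None

# The `in` operator tests for a key without raising.
print('Course' in student)         # True
```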


### .. 03.04 Updating Dictionary:

The dictionary can be updated by adding a new entry or a new key-value pair, modifying existing entry and/ or deleting an entry.

#### .. 03.04.01 Passing key as index and assigning value

In [7]:
student = {'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}
student['Degree']   = 'Data Science Certification'
student['Location'] = 'Kolkata'
student

{'Name': 'Achilles',
 'Class': 'Python Skills',
 'Course': 'Python',
 'Degree': 'Data Science Certification',
 'Location': 'Kolkata'}

#### .. 03.04.02 setdefault( )

In Python, the value of a key-value pair can be overwritten by assigning to the key again. However, at times you might not want to overwrite the value if the key already exists. 
You can achieve this by using the `setdefault()` method. It returns the existing value if the key is present; otherwise it inserts the key with the specified value and returns that value.

In [8]:
student = {'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python'}
# This will add the 'Degree': 'Data Science Certification' key-value pair
# since the key doesn't exist, and return the value added
student.setdefault('Degree', 'Data Science Certification')

'Data Science Certification'

In [9]:
# This will return the existing value for key = "Class"
student.setdefault('Class','Data Analysis')

'Python Skills'

In [10]:
print(student)

{'Name': 'Achilles', 'Class': 'Python Skills', 'Course': 'Python', 'Degree': 'Data Science Certification'}


#### .. 03.04.03 update( )

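The `update()` method merges another dictionary into the dictionary it is called on: new keys are added, and keys that already exist get their values overwritten.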

In [11]:
states_dict = {'WB': 'West Bengal', 'AP': 'Andhra Pradesh'}
states_dict2 = {'MH': 'Maharashtra', 'DL': 'Delhi'}
states_dict.update(states_dict2)
states_dict

{'WB': 'West Bengal',
 'AP': 'Andhra Pradesh',
 'MH': 'Maharashtra',
 'DL': 'Delhi'}
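
The example above has no overlapping keys. A minimal sketch of the overlapping case follows; the `corrections` dictionary and the deliberately incomplete value `'Andhra'` are made up for illustration.

```python
states_dict = {'WB': 'West Bengal', 'AP': 'Andhra'}
corrections = {'AP': 'Andhra Pradesh', 'DL': 'Delhi'}

# Keys present in both dictionaries take the value from the argument;
# keys that are new are added.
states_dict.update(corrections)
print(states_dict)
# {'WB': 'West Bengal', 'AP': 'Andhra Pradesh', 'DL': 'Delhi'}

# update() modifies the dictionary in place and returns None.
result = states_dict.update({'MH': 'Maharashtra'})
print(result)   # None
```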

### .. 03.05 Removing elements from Dictionary

#### .. 03.05.01 pop( )

The `pop()` method removes the key-value pair for the key passed as an argument. It returns the value that is being 'popped' from the dictionary.

In [12]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
states_dict.pop('WB')

'West Bengal'

In [13]:
states_dict

{'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}

#### .. 03.05.02 del()
The `del` statement can be used to perform the above operation and can also be used to remove an entire dictionary. It does not return anything.

In [14]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
del states_dict['WB']
states_dict

{'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}

In [15]:
# delete the whole dictionary
del states_dict             # the name is deleted first, so the print below fails
print(states_dict)

NameError: name 'states_dict' is not defined

#### .. 03.05.03 clear( )

The `clear()` method removes all items from the dictionary but does not delete the dictionary itself.

In [16]:
states_dict = {'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
states_dict.clear()
states_dict

{}

### .. 03.06 Traversing a Dictionary

#### .. 03.06.01 for loop:

A dictionary can be traversed using for loops.

In [17]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
for k in states_dict:
    print("{key}: {val}".format(key=k, val=states_dict[k]))

WB: West Bengal
MH: Maharashtra
DL: Delhi
AP: Andhra Pradesh


In [18]:
# We will see the items() method in the next sub-topic
for k, v in states_dict.items():
    print(': '.join((k, v)))

WB: West Bengal
MH: Maharashtra
DL: Delhi
AP: Andhra Pradesh


> Compare how the output string is built in the two examples above: `format()` with named placeholders versus `join()` on a tuple.

#### .. 03.06.02 keys( ) , values( ) and items( )

The `keys()` method returns the keys of the dictionary, the `values()` method returns all the values, and `items()` returns all the key-value pairs as tuples.

> - For python2 users: The above methods directly return a list (note: a `list`!). 
> - For python3 users: The above methods return `dict_keys`, `dict_values` and `dict_items` respectively, which are `view` objects (note: `view`!). 
>     - View objects are cheap to create: they require a small, fixed amount of memory and processor time. (The python2 equivalents are `viewkeys()`, `viewvalues()` and `viewitems()`.)
>     - `views` are a *dynamic view* of the dictionary: they reflect the contents of the dictionary even after it changes. A list of keys contains a copy of the dictionary keys at a given point in time, while a view is dynamic and much faster to obtain, as it does not have to copy any data (keys or values) in order to be created.
>     - These `views` can be converted to lists by passing them to the list constructor, e.g. `list(states_dict.keys())`. This technique also works for python2 users.
> - Python2 users: wherever you see `.keys()`, `.items()` or `.values()` you can also use `.viewkeys()`, `.viewitems()` and `.viewvalues()` respectively. These return `dict_keys`, `dict_items` and `dict_values` objects.

In [19]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 
               'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
list(states_dict.keys())

['WB', 'MH', 'DL', 'AP']

In [20]:
list(states_dict.values())

['West Bengal', 'Maharashtra', 'Delhi', 'Andhra Pradesh']

In [21]:
list(states_dict.items())

[('WB', 'West Bengal'),
 ('MH', 'Maharashtra'),
 ('DL', 'Delhi'),
 ('AP', 'Andhra Pradesh')]
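
The difference between a dynamic view and a list snapshot can be seen in a short Python 3 sketch; the names `keys_view` and `keys_snapshot` are illustrative.

```python
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra'}

keys_view = states_dict.keys()        # a live view (Python 3)
keys_snapshot = list(states_dict)     # a copy taken at this moment

states_dict['DL'] = 'Delhi'           # change the dictionary afterwards

print(keys_view)       # dict_keys(['WB', 'MH', 'DL'])
print(keys_snapshot)   # ['WB', 'MH']
```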

### .. 03.07 Sorting

Dictionaries in Python can be sorted by keys or by values, in ascending or descending order. First let us look at the built-in function `sorted()` and then at the `sort()` method of lists.

#### .. 03.07.01 sorted( )

The `sorted()` function returns a new list containing the sorted items of an iterable (in our case, the keys of a dictionary). It also takes a boolean `reverse` argument which, if set to `True`, sorts the iterable in descending order.

In this example, we will sort the dictionary by keys.

In [23]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 
               'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
sorted_keys = sorted(list(states_dict.keys()), reverse=False)
for key in sorted_keys:
    print('{} : {}'.format(key, states_dict[key]))

AP : Andhra Pradesh
DL : Delhi
MH : Maharashtra
WB : West Bengal
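
The example above sorts by key. Sorting by value can be sketched with the `key` argument of `sorted()` and `operator.itemgetter` from the standard library. The `scores` dictionary is hypothetical data, used only for illustration.

```python
from operator import itemgetter

# Hypothetical data, used only for illustration.
scores = {'alice': 82, 'bob': 95, 'carol': 78}

# items() yields (key, value) tuples; itemgetter(1) selects the value.
by_score = sorted(scores.items(), key=itemgetter(1), reverse=True)
print(by_score)         # [('bob', 95), ('alice', 82), ('carol', 78)]

# dict() rebuilds a dictionary in the sorted order (preserved in Python 3.7+).
print(dict(by_score))   # {'bob': 95, 'alice': 82, 'carol': 78}
```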


#### .. 03.07.02 sort()
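The `sort()` method sorts a list in place and returns `None`. A dictionary has no `sort()` method, so we first collect its keys into a list.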

> Remember: Creating a copy of lists is expensive. You should always try to reduce the copies that you create.

In [34]:
states_dict = {'WB': 'West Bengal', 'MH': 'Maharashtra', 
               'DL': 'Delhi', 'AP': 'Andhra Pradesh'}
k = list(states_dict.keys())
k.sort(reverse=False)
for key in k:
    print('{key} : {val}'.format(key=key, val=states_dict[key]))

AP : Andhra Pradesh
DL : Delhi
MH : Maharashtra
WB : West Bengal


*Remember, If you have any difficulty using any function or method in Jupyter notebook, type the function or method name followed by a '?' or `shift + Tab` and it will print the docstring/ help manual for you.*

## 03 - 04 Functions
Imagine that you have to open a file, read the contents of the file and close it. Pretty trivial, right? Now imagine that you have to read ten files, print their output or perform some computation on the contents and then close it. Now you don't want to sit there and type file i/o operations for every file. What if there are over 500 files? 

This is where the functions come in. A function is a block of organized and reusable code in a program that performs a specific task which can be incorporated into a larger program or reused by passing different set of parameters. 

The advantages of using functions are:
- Allowing code reuse.
- Reducing code duplication.
- Improving readability while reducing the complexity of the code.

There are two basic types of functions: 
- Built-in functions 
- User defined functions. 

We have been using built-in functions for quite some time without actually understanding how a function works. This is the beauty of Python. According to Guido van Rossum, all objects in Python are first-class citizens, meaning that all objects (functions, strings, integers, etc.) can be assigned to variables, placed in lists, stored in dictionaries, passed as arguments and so forth. We have been doing this the whole time, right? Now let's see how you can create your own functions and call them in your code.
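
What "first-class" means for functions can be sketched using only built-in functions that we have already met; the names `measure` and `operations` are illustrative.

```python
# A function can be bound to another name...
measure = len
print(measure('Python'))                  # 6

# ...stored in a dictionary...
operations = {'smallest': min, 'largest': max}
print(operations['largest']([7, 6, 1]))   # 7

# ...and passed as an argument to another function.
by_length = sorted(['Julia', 'C', 'Python'], key=len)
print(by_length)                          # ['C', 'Julia', 'Python']
```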

### 03.04.01 Defining Functions

A function is defined using the `def` keyword followed by the name of the function. The parameters (the names that will receive the arguments) are placed within the parentheses that follow the function name. The header line ends with a colon, and the code block that forms the function body must be indented.

In [35]:
def mul(a, b):
    return('{} * {} = {}'.format(a, b, a*b))
    
print(mul(4, 5))

4 * 5 = 20


In the above code we have used the keyword `return`. A function may or may not have a return value. The job of `return` is to hand the value of an expression back to the caller; a function that does not execute a `return` statement returns `None`.
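
The difference between a function with and without a `return` can be sketched as follows. The functions `greet` and `add` are hypothetical examples, not taken from the cells above.

```python
def greet(name):
    # No return statement: the function only prints
    print('Hello {}'.format(name))

def add(a, b):
    # return hands the result back to the caller
    return a + b

result = greet('Achilles')   # prints the greeting
print(result)                # the call itself evaluated to None
total = add(4, 5)
print(total)
```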

### 03.04.02 Function Arguments

A function can be called using the following types of formal arguments:

- Required arguments
- Keyword arguments
- Default arguments
- Variable-length arguments

#### .. 04.02.01 Required Arguments:

Required arguments are passed to a function in the correct positional order. The number of arguments passed must equal the number of arguments expected by the function definition. Let's take a look at an example:

In [38]:
# Lame example
def info(name, sem):
    print('My name is: ',name)
    print('This is the Year: ',int(sem))

In [39]:
info('Achilles', 2018)

My name is:  Achilles
This is the Year:  2018


In [None]:
# What if we change the order in which we are passing the arguments?
info(2018, 'Achilles')

> The call above fails with a `ValueError`, because `int('Achilles')` cannot be converted to an integer. We'll learn how to prevent such errors from breaking our code in the section on Exception Handling.

#### .. 04.02.02 Keyword Arguments:

Keyword arguments are related to the function call. When you use keyword arguments in a function call, the caller identifies each argument by its parameter name. This allows you to place the arguments out of order, because Python's interpreter matches the values to the parameters by name. The definition below also gives `year` a default value (`year=2018`), which makes it a default argument: the caller may skip it and the default is used instead. Let's modify the above function and the way we call it.

In [40]:
def info(name, year=2018):
    print('My name is: ',name)
    print('This is the Year: ',int(year))

In [41]:
# The order of the parameter does not matter.
info(year=2018, name='Achilles')

My name is:  Achilles
This is the Year:  2018


In [42]:
# Not providing second argument.
info('Achilles')

My name is:  Achilles
This is the Year:  2018
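
As a side-by-side sketch of the calling styles, consider a hypothetical `describe` function that returns a string instead of printing. It also shows what happens when a required argument is left out.

```python
def describe(name, year=2018):
    return '{} ({})'.format(name, year)

print(describe('Achilles'))                    # default used for year
print(describe('Achilles', 2019))              # both passed by position
print(describe(year=2020, name='Achilles'))    # both passed by keyword

try:
    describe()                                 # name has no default
except TypeError as err:
    print('TypeError:', err)
```

Parameters with default values must come after the required ones in the definition.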


#### .. 04.02.03 Variable Length Arguments: 

At some point, you may need to call a function with more arguments than you specified when you defined it. These extra arguments can vary in number and are not named in the function definition, unlike required and default arguments. So how do you handle this?

In [96]:
def check(A, *B):
    print('Name of course: ', *A)
    print('Name of students in the course:')
    for b in B:
        print(*b, sep='\n')

In [97]:
course = ['Python']
names = ['Saha', 'Rohit', 'Raj', 'Anup', 'Achilles', 'Sayan', 'Ansar', 'Deep', 'Joy', 'Dutta', 'Abhishek']

In [98]:
check(course, names)

Name of course:  Python
Name of students in the course:
Saha
Rohit
Raj
Anup
Achilles
Sayan
Ansar
Deep
Joy
Dutta
Abhishek


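An asterisk (`*`) is placed before the name of the parameter that collects all the extra non-keyword (positional) arguments into a tuple. This tuple remains empty if no additional arguments are specified during the function call. In the example above, `names` is passed as a single extra argument, so `B` is a tuple containing one list, and `print(*b, sep='\n')` unpacks that list to print one name per line.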
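
A small sketch of how the tuple is filled, using a hypothetical `roster` function that simply returns whatever was collected:

```python
def roster(course, *students):
    # students is always a tuple, possibly empty
    return students

names = ['Saha', 'Rohit', 'Raj']

print(roster('Python'))                    # no extra arguments
print(roster('Python', 'Saha', 'Rohit'))   # two extra arguments
print(roster('Python', names))             # one extra argument: the list itself
print(roster('Python', *names))            # list unpacked into three arguments
```

The `*` in the call (`*names`) is the mirror image of the `*` in the definition: it spreads a sequence out into separate positional arguments.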

### 03.04.03 Anonymous Functions

Anonymous functions do not have a name! They are not declared in the standard manner (using the `def` keyword). To create an anonymous function you use the `lambda` keyword. They are part of the functional paradigm incorporated in Python.

- Lambda forms can take any number of arguments but they return just one value, the result of a single expression. They cannot contain statements or multiple expressions.
- Lambda functions have their own local namespace (just like regular functions). Like regular functions, they can read their own parameters as well as names from any enclosing function scope and from the global namespace.
- In Python 2, a lambda cannot call `print` directly, because `print` is a statement there. In Python 3, `print` is an ordinary function and can be called from a lambda.

In [99]:
mul = lambda a, b: a*b

In [100]:
print(mul(4, 5))

20
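
Lambdas are most useful when a small function is needed only once, for example as the `key` of a sort. The sketch below reuses a few entries in the spirit of `states_dict`; the list `states` is defined here only for illustration.

```python
states = [('WB', 'West Bengal'), ('MH', 'Maharashtra'), ('DL', 'Delhi')]

# Sort the pairs by full state name instead of by code
by_name = sorted(states, key=lambda pair: pair[1])
print(by_name)

# Pick the pair with the longest state name
longest = max(states, key=lambda pair: len(pair[1]))
print(longest)
```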


### 03.04.04 Map function
**Syntax:** 

`map(function, iterable)`

As the name suggests, `map` applies the `function` to every element in the `iterable`. 

In [101]:
numbers = range(1, 10)
# numbers = 1, 2, ..., 9
def square(num):
    return num**2

In [103]:
list(map(square, numbers))
# The equivalent of this function is:
# result = []
# for i in range(1, 10):
#     result.append(i**2)

[1, 4, 9, 16, 25, 36, 49, 64, 81]

Even better, we can write the whole thing in a single line

In [104]:
list(map(lambda x: x**2, range(1, 10)))

[1, 4, 9, 16, 25, 36, 49, 64, 81]

### 03.04.05 Filter function
**Syntax:** 

`filter(function, iterable)`

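Just like `map`, `filter` applies the function to every element of the `iterable`, but instead of returning the output of the function, it returns only those elements for which the function returns a true value. In Python 3 the result is a lazy iterator, which is why the examples wrap it in `list()`; in Python 2, `filter` on a list returns a `list` directly.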

In [106]:
# map returns x % 2 for every value: 1 for odd numbers and 0 for even numbers
list(map(lambda x: x%2, range(1, 10)))

[1, 0, 1, 0, 1, 0, 1, 0, 1]

In [107]:
list(filter(lambda x: x%2, range(1, 10)))     # filter keeps the values for which x % 2 is non-zero, i.e. the odd numbers


[1, 3, 5, 7, 9]
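
For comparison, both `map` and `filter` can be expressed as list comprehensions. The following is a sketch of the equivalence, not a claim that one style is always preferable.

```python
numbers = range(1, 10)

squares_map = list(map(lambda x: x**2, numbers))
squares_comp = [x**2 for x in numbers]

odds_filter = list(filter(lambda x: x % 2, numbers))
odds_comp = [x for x in numbers if x % 2]

print(squares_map == squares_comp)
print(odds_filter == odds_comp)
```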

## 03 - 05 Exception Handling
An exception is a Python object that represents an error. It is an event which occurs during the execution of a program and disrupts the normal flow of the program's instructions. When such a situation occurs and Python is not able to cope with it, it raises an exception. We have been seeing errors like `TypeError`, `NameError` or `IndentationError` throughout our tutorial, which caused our code to stop executing. To prevent this from happening, we have to handle such exceptions.
The following is the hierarchy of built-in exceptions in Python 2. Python 3 keeps the same overall shape but removes `StandardError` (its children derive directly from `Exception`) and merges `EnvironmentError` and `IOError` into `OSError`:

```
BaseException
 +-- SystemExit
 +-- KeyboardInterrupt
 +-- GeneratorExit
 +-- Exception
      +-- StopIteration
      +-- StandardError
      |    +-- BufferError
      |    +-- ArithmeticError
      |    |    +-- FloatingPointError
      |    |    +-- OverflowError
      |    |    +-- ZeroDivisionError
      |    +-- AssertionError
      |    +-- AttributeError
      |    +-- EnvironmentError
      |    |    +-- IOError
      |    |    +-- OSError
      |    |         +-- WindowsError (Windows)
      |    |         +-- VMSError (VMS)
      |    +-- EOFError
      |    +-- ImportError
      |    +-- LookupError
      |    |    +-- IndexError
      |    |    +-- KeyError
      |    +-- MemoryError
      |    +-- NameError
      |    |    +-- UnboundLocalError
      |    +-- ReferenceError
      |    +-- RuntimeError
      |    |    +-- NotImplementedError
      |    +-- SyntaxError
      |    |    +-- IndentationError
      |    |         +-- TabError
      |    +-- SystemError
      |    +-- TypeError
      |    +-- ValueError
      |         +-- UnicodeError
      |              +-- UnicodeDecodeError
      |              +-- UnicodeEncodeError
      |              +-- UnicodeTranslateError
      +-- Warning
           +-- DeprecationWarning
           +-- PendingDeprecationWarning
           +-- RuntimeWarning
           +-- SyntaxWarning
           +-- UserWarning
           +-- FutureWarning
           +-- ImportWarning
           +-- UnicodeWarning
           +-- BytesWarning
```

Let's take a look at an example

In [108]:
1 / 0

ZeroDivisionError: division by zero

This is a straightforward example where we try to divide a number by 0. Python raises a `ZeroDivisionError` and the execution halts. There are basically two ways to handle this error.
The first way is to check that the divisor is not zero before dividing. This is left as an exercise for the students. The other way is to use a `try`/`except` block, where we place the code to be executed inside the `try` block and the handling of the exception in the `except` block.

In [109]:
for i in range(3, -3, -1):
    try:
        print(1.0 / i)
    except ZeroDivisionError:
        print("So, you're trying to divide by zero huh?")

0.3333333333333333
0.5
1.0
So, you're trying to divide by zero huh?
-1.0
-0.5


As observed from the above example, our execution continued even after we tried dividing 1.0 by zero.
> Python 2 users: check the division that is being performed. With two integer operands, `/` performs integer division, so to obtain a floating point answer you have to convert at least one of the operands to `float`.

> Python 3 users: you don't HAVE to do this. The `/` operator always performs true division, so dividing two integers already gives a floating point result.

### 03.05.01 Argument of an Exception

An exception can have an argument, which is a value that gives additional information about the problem that caused the exception. The contents of argument vary by exception.

In [110]:
for i in range(3, -3, -1):
    try:
        print(1.0 / i)
    except ZeroDivisionError as err:
        print('Zero Division Error: ', str(err.args[0]))

0.3333333333333333
0.5
1.0
Zero Division Error:  float division by zero
-1.0
-0.5


### 03.05.02 Hierarchy of Exceptions

The exceptions are organized in a hierarchy, as can be seen from the tree above. This means that we can handle different exceptions with separate `except` clauses, and that a handler for a parent class also catches all of its subclasses.

In [111]:
# time module has a sleep method which will help slow down the execution of loop
import time 
i = -2
while i < 5:
    i = i + 1
    try:
        print(1.0 / i)
        # This will halt the code for 2 seconds
        time.sleep(2)
    except KeyboardInterrupt:
        # Lets raise the exception that we just caught!
        raise KeyboardInterrupt('Ctrl C pressed')
    except ZeroDivisionError:
        print("So, you're trying to divide by zero huh?")

-1.0
So, you're trying to divide by zero huh?
1.0
0.5
0.3333333333333333
0.25
0.2


After starting the execution of the code, wait for a few seconds and then press the `i` key twice on your keyboard (with the cell in command mode), and you should get a `KeyboardInterrupt` error instead of the full output shown above (another way is to click on `Kernel -> Interrupt`). The new thing that we can observe in this code is the keyword `raise`. It raises a particular exception, and you can pass the message that you want to display as an argument to the exception.

The `raise` statement does two things: it creates an exception object, and it immediately leaves the expected program execution sequence to search the enclosing `try` statements for a matching `except` clause. The effect of a `raise` statement is either to divert execution to a matching `except` suite, or to stop the program (if no proper exception handling was performed).
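
A minimal sketch of raising and catching an exception yourself. The function `set_age` and its rule are invented for this example.

```python
def set_age(age):
    if age < 0:
        # Create a ValueError with a message and raise it
        raise ValueError('age cannot be negative: {}'.format(age))
    return age

try:
    set_age(-5)
    print('this line is skipped')
except ValueError as err:
    message = str(err)
    print('Caught:', message)
```

Execution jumps from the `raise` straight to the matching `except` clause, so the `print` inside the `try` block never runs.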

Now let's see the earlier `KeyboardInterrupt` loop with the hierarchy in action:

In [112]:
import time
i = -2
while i < 5:
    i = i + 1
    try:
        print(1.0 / i)
        time.sleep(2)
    except BaseException:
        # Nothing is re-raised here: every exception is swallowed and only a message is printed
        print('Some Exception occurred..')

-1.0
Some Exception occurred..
1.0
0.5
0.3333333333333333
0.25
0.2


In the above example we can see that we did implement a handler (a very bad kind, I must say). If you check the hierarchy list, you can observe that `KeyboardInterrupt` and `ArithmeticError` (which includes `ZeroDivisionError`) are subclasses of the `BaseException` class. Since we implemented a `BaseException` handler, all the errors under the base class are handled, which means that even interrupting the kernel only prints the message and the loop carries on.

Avoid catching a generic exception like we did in this example, because you will not be able to tell what actually caused the exception, and it allows bugs to pass through unnoticed. Instead, catch (and raise) the most specific exception class that semantically fits your issue.
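
As a sketch of the more specific style, the hypothetical `safe_divide` below handles each kind of failure in its own `except` clause. The `issubclass` checks confirm the parent/child relationships from the tree.

```python
def safe_divide(a, b):
    try:
        return a / b
    except ZeroDivisionError:
        return 'cannot divide by zero'
    except TypeError as err:
        return 'bad operand types: {}'.format(err)

print(safe_divide(1, 2))
print(safe_divide(1, 0))
print(safe_divide(1, 'two'))

# A handler for a parent class also catches its subclasses
print(issubclass(ZeroDivisionError, ArithmeticError))
print(issubclass(KeyboardInterrupt, Exception))
```

The last line prints `False`: `KeyboardInterrupt` sits directly under `BaseException`, so an `except Exception` handler lets it through.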

### 03.05.03 Finally

`finally` is a clause containing a block of code that will always be executed, regardless of whether there was an exception in the code or not. It is generally used to clean up resources in a program, especially when using file I/O operations.

In [1]:
fhandler = None
try:
    # Open file in read-only mode. Try renaming file to test1.txt
    fhandler = open('./Python_Studies/test.txt', 'r')
    # Read all lines
    print(fhandler.readlines())
except IOError:
    print('Error Opening File')
finally:
    # If the file was opened
    if fhandler:
        # Close the file
        fhandler.close()

Error Opening File


In the above example we can observe that we are trying to open a file and read its contents. If the file doesn't exist, `open` raises an `IOError` exception. We are handling that, no worries. However, once the file has been read, we need to close the file so that other processes or other functions in our code can access it. (Remember: an open file holds operating system resources, and on some platforms it may be locked by the process performing I/O on it. Until the file is closed, other processes may not be able to modify it.)

To make sure we release the resources, in the `finally` block we check that `fhandler` is not `None` and close it.
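
The order in which the clauses run can be traced with a small sketch. The function `run` is hypothetical, and it also uses the optional `else` clause (not covered above), which runs only when the `try` block raised nothing.

```python
def run(divisor):
    steps = []
    try:
        steps.append('try')
        1 / divisor
    except ZeroDivisionError:
        steps.append('except')
    else:
        steps.append('else')
    finally:
        # Runs in both cases
        steps.append('finally')
    return steps

print(run(1))
print(run(0))
```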

## 03 - 06 File I/O
Before we jump into file I/O functions, let's first look at some basic I/O functions that are available in Python.
In Python, there are three basic I/O connections: Standard Input, Standard Output and Standard Error. Standard Input is the data that goes into the program, by default from the keyboard. Standard Output is the terminal console, unless redirected (guess where?!!), and Standard Error is the stream where programs write their error messages, which again goes to the terminal unless redirected.

### 03.06.01 Standard Input, Output and Error:

The `input( )` function reads one line from the standard input and returns it as a string.

In [113]:
name = input("Enter your name: ")
print("Hello {name}".format(name=name))

Enter your name: ansar
Hello ansar


In [114]:
# input() returns whatever is typed as a string; try typing a list comprehension too
some_val = input("Enter something: ")
print("You entered: {}".format(some_val))

Enter something: 2
You entered: 2


If you type the list comprehension `[x for x in range(10) if x%2 == 0]` at the prompt above, Python 2 users would observe the output `You entered: [0, 2, 4, 6, 8]`, and Python 3 users would observe `You entered: [x for x in range(10) if x%2 == 0]`.

Python 2 had two functions for accepting input: `raw_input`, which returns the typed text as a string, and `input`, which evaluates the typed text as a Python expression. The behavior that Python 3 users are seeing is the behavior of Python 2's `raw_input`: in Python 3, `raw_input` was renamed to `input` and the old evaluating `input` was dropped.
> For the sake of staying on topic, we cannot discuss more on this.. but for the curious souls.. head here: https://www.python.org/dev/peps/pep-3111/

We saw the Standard Error in the Exception Handling module.

We've also been using Standard Output since `module 1`, I guess.. using the `print` function (for python3) or `print` statement (for python2), right?

### 03.06.02 File I/O
Until now you have been reading from the standard input and writing to the standard output. Let's now perform the same operations on files.

#### .. 06.02.01 Opening Files: 
Files can be opened using Python's built-in `open()` function. This function creates a file object which is used for performing different operations on the file. It will become much clearer when we look at a complete example. For now, just remember that we need to create a file object before performing any file I/O, and try to remember the syntax.

`fhandler = open(file_name, access_mode, encoding)`

- `file_name`: The name of the file that you would like to perform your I/O operations on.
- `access_mode`: The mode which determines whether the file is to be opened as read only, read-write, write only, etc. The ways in which a file can be opened are listed below.
- `encoding`: Tells Python which encoding scheme to use to convert the stream of bytes to text. In Python 3 it is best passed by keyword (`encoding='utf-8'`), because the third positional parameter of `open` is `buffering`.

|access_mode | Its Function|
|:------|------------:|
|r	|Opens a file as read only|
|rb	|Opens a file as read only in binary format|
|r+	|Opens a file for reading and writing|
|rb+	|Opens a file for reading and writing in binary format|
|w	|Opens a file for writing only|
|wb	|Opens a file for writing only in binary format|
|w+	|Opens a file for both reading and writing|
|wb+	|Opens a file for writing and reading in binary format|
|a	|Opens a file for appending|
|ab	|Opens a file for appending in binary|
|a+	|Opens a file for appending and reading|
|ab+	|Opens a file for appending and reading in binary format|
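
The difference between `w` and `a` is easiest to see by running them one after another. The sketch below writes to a throwaway file in a temporary directory (the file name `modes_demo.txt` is arbitrary) so that it does not touch your own files.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')

fh = open(path, 'w')      # creates the file (or empties an existing one)
fh.write('first\n')
fh.close()

fh = open(path, 'a')      # keeps the existing content, writes at the end
fh.write('second\n')
fh.close()

fh = open(path, 'r')
after_append = fh.read()
fh.close()

fh = open(path, 'w')      # 'w' again: the previous content is discarded
fh.write('third\n')
fh.close()

fh = open(path, 'r')
after_rewrite = fh.read()
fh.close()

print(repr(after_append))
print(repr(after_rewrite))
```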

#### .. 06.02.02 Reading and Writing
Once we have created a file object we can perform many operations on it; like all objects, it has methods that take care of the nitty-gritty details and perform the operations on the file. Before we jump into the functions, let's take a look at a complete File I/O example.



In [3]:
# Start with None so the finally block works even if open() fails
fhandler = None
try:
    fhandler = open('test.txt', 'w')
    fhandler.write('Hello World')
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)
finally:
    if fhandler:
        fhandler.close()

> If the above code ran without any exception, then it should have created a file test.txt (and if it existed before, it will overwrite it because of access_mode as `w`)

Now before we proceed ahead, don't you think the total number of lines that we wrote to achieve just a small objective (of writing a string to file) is too much effort? Let's look at an alternative way of writing to a file

In [4]:
try:
    with open('test.txt', 'w') as fhandler:
        fhandler.write('Hello World')
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

Much better, eh?

In fact, it is good practice to use the `with` keyword when dealing with file objects. This has the advantage that the file is properly closed (`fhandler.close()`) after its suite finishes, even if an exception is raised on the way. It is also much shorter than writing equivalent `try-except-finally` blocks.

Want another reason for using the `with` statement? OK, if you want to open multiple files together, you can do that with a single `with` statement by separating the different file handlers with a comma, something like this:
```
with open('test0.txt', 'w') as fh0, open('test1.txt', 'w') as fh1 ...
```

So what is the `with` statement? Putting it very, very simply: the object that `with` works on (here, the file object) has `__enter__()` and `__exit__()` methods, where stuff like opening and closing the file handler can take place. The `with` statement guarantees that if the `__enter__()` method returns without an error, then `__exit__()` will always be called.
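
To see the guarantee in action, here is a sketch of a tiny class with its own `__enter__()` and `__exit__()` methods. The class `Tracker` is invented for this illustration; it records the order of events instead of managing a real resource.

```python
class Tracker:
    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('enter')
        return self          # this is what `as t` receives

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('exit')
        return False         # False: do not swallow the exception

tracker = Tracker()
try:
    with tracker as t:
        t.events.append('body')
        raise ValueError('something went wrong')
except ValueError:
    pass

print(tracker.events)
```

Even though the body raised an exception, `'exit'` is recorded, which is exactly how a file object manages to close itself.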

> We cannot discuss more about `with` statement here as it can take up a whole module but again for curious minds out there, read this `pep`: https://www.python.org/dev/peps/pep-0343/

Alright, coming back to our discussion on File I/O, let's now look at some of the functions that you may end up using.

##### .. 02.01 file_object.close()

This method will close the file that we have currently open. You should always call this method once you are done performing I/O operations on the file using the file object unless you are using `with` statement. `with` statement already does that for you.

##### .. 02.02 file_object.mode

This is a read-only attribute that is the value of the mode string used in the open call that created the file_object

##### .. 02.03 file_object.readline([size])

This method reads from the file until it reaches a newline character ( `\n` ) if the `size` parameter is omitted, and returns the line including the newline. If an integer is provided as the `size` parameter, the method returns at most `size` characters of the line.

##### .. 02.04 file_object.readlines([size])

This method reads lines (as `readline()` does) until it reaches the end of the file and returns them as a list of strings.

##### .. 02.05 file_object.seek(pos, how=0)

Sets the file_object's current position to the signed integer byte offset `pos` from the reference point. The `how` parameter (named `whence` in the official documentation), which is 0 by default, indicates the reference point: `how`=0 is the start of the file, `how`=1 is the current position and `how`=2 is the end of the file.

##### .. 02.06 file_object.tell()

This method tells the current file position when you are reading from or writing to a file.
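
The three reference points of `seek` can be tried out safely on an in-memory binary buffer. `io.BytesIO` is used here only as a stand-in for a file opened in binary mode; in Python 3, files opened in text mode only support seeking relative to the start (apart from `seek(0, 2)`).

```python
import io

buf = io.BytesIO(b'Hello World')

buf.seek(6)                  # 6 bytes from the start (how=0)
pos_from_start = buf.tell()
tail = buf.read()

buf.seek(-5, 2)              # 5 bytes back from the end (how=2)
tail_from_end = buf.read()

buf.seek(0)
buf.seek(2, 1)               # 2 bytes forward from the current position (how=1)
pos_from_current = buf.tell()

print(pos_from_start, tail, tail_from_end, pos_from_current)
```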

##### .. 02.07 file_object.truncate([size])

This method truncates the file to at most `size` bytes. If you don't specify the size, the current position (as reported by `tell()`) is used as the new size.

##### .. 02.08 file_object.write(str)

Writes the bytes of string str to the file.

##### .. 02.09 file_object.writelines(lst)

Writes sequence of strings to file. No new line is added automatically.

In [None]:
try:
    with open('test.txt', 'r+') as fhandler:
        print(fhandler.readline())
        fhandler.writelines(['.', 'This is', ' Python'])
        # Go to the starting of file
        fhandler.seek(0)
        # Print the content of file
        print(fhandler.readlines())
        fhandler.truncate(20)
        fhandler.seek(0)
        print('After truncate: ',fhandler.readlines())
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

Great! So now we opened the file, read its contents, added multiple strings, truncated and closed it! This covers pretty much everything that you will need when you are working with almost any kind of file that has some text.

### 03.06.03 CSV File I/O
In the above examples, we saw how to perform read-write operations on a file. This is generally used for files that have multiple lines of strings. However if you have data like this:

|Data1|Data2|Data3|
|:--|:--|:--|
|Example1|Example2|Example3|

It is stored in a file with this format:
```
Data1, Data2, Data3
Example01, Example02, Example03
Example11, Example12, Example13
```

As can be seen in the above example, each row is a new line, and each column is separated by a comma. Many online services allow their users to export tabular data from the website into a CSV file. These files can then be opened and viewed offline using a spreadsheet program such as Google Sheets, Numbers or Microsoft Excel.

#### So why do we need such CSV files? 
There are two primary reasons for the existence of this format:

- CSV are plain-text files which makes them easy to store and read from
- CSV files are stored as sequence of human readable characters, thus making it easy for humans to interpret the data without requiring any format conversion.

CSV is a delimited text file that uses a comma to separate values (many implementations of CSV import/export tools allow other separators to be used). Simple CSV implementations may prohibit field values that contain a comma or other special characters such as newlines. More sophisticated CSV implementations permit them, often by requiring " (double quote) characters around values that contain reserved characters (such as commas, double quotes, or less commonly, newlines). Embedded double quote characters may then be represented by a pair of consecutive double quotes, or by prefixing an escape character such as a backslash (for example in Sybase Central). The name "CSV" indicates the use of the comma to separate data fields. Nevertheless, the term "CSV" is widely used to refer to a large family of formats, which differ in many ways. Some implementations allow or require single or double quotation marks around some or all fields; and some reserve the very first record as a header containing a list of field names. There is no single formal standard that all CSV files follow, although RFC 4180 documents a commonly used variant.
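
The quoting rules described above can be observed with Python's own `csv` module. This sketch writes to an in-memory `io.StringIO` buffer rather than a real file, and sets `lineterminator='\n'` only to keep the printed text simple.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, lineterminator='\n')
writer.writerow(['plain', 'has, comma', 'has "quotes"'])

text = buf.getvalue()
print(text)

# Reading the text back recovers the original three values
row = next(csv.reader(io.StringIO(text)))
print(row)
```

Only the fields that need it are wrapped in double quotes, and each embedded double quote is doubled.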

Download a Sample\* CSV file from [`HERE`](./sample_datasets/sample_names.csv 'Sample CSV') and save it in your current folder location. 

>Disclaimer: The data generated is completely random using a third party website [`https://www.fakenamegenerator.com`](https://www.fakenamegenerator.com 'FakeName Generator')

#### .. 06.03.01 Reading CSV files

`reader()` can be used to create an object that reads the data from a csv file. The reader can be used as an iterator to process the rows of the file in order. Let's take a look at an example:

In [5]:
import csv
row = []
try:
    with open('./sample_datasets/sample_names.csv', 'r') as fh:
        reader = csv.reader(fh)
        for info in reader:
            row.append(info)
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

print(row[0:10])

Error performing I/O operations on the file:  [Errno 2] No such file or directory: './sample_datasets/sample_names.csv'
[]


`reader()` is a function available in the `csv` module, so the first line imports the `csv` module. The `reader()` function takes a sequence of lines or an iterable file object, and returns an iterator. As the csv file is being read, each row of the input data is converted to a list of strings. The parser handles line breaks embedded within quoted strings, which is why a row is not always the same as a single line read from the file. (In the run shown above the sample file was not found at the given path, so the handler printed the error and `row` stayed empty.)
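
The difference between a line and a row shows up as soon as a quoted field contains a line break. The data below is made up for this sketch and is read from an in-memory buffer instead of a file.

```python
import csv
import io

data = 'name,address\n"Achilles","12 Main St\nKolkata"\n'

# Splitting on line breaks sees three lines...
lines = data.splitlines()

# ...but the csv parser sees two rows, because the second
# line break is inside a quoted field
rows = list(csv.reader(io.StringIO(data, newline='')))

print(len(lines), len(rows))
print(rows[1])
```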

#### .. 06.03.02 Writing CSV files

Writing csv files is just as easy as reading them. To write to a csv file, we can use `writer()` method to create an object for writing and then iterate over the rows using csv's `writerow()` method to write it.

In [None]:
import csv
try:
    # newline='' (Python 3) lets the csv module control line endings itself
    with open('test.csv', 'w', newline='') as fh:
        writer = csv.writer(fh)
        for num in range(10):
            writer.writerow((num, num**1, num**2))
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

Now try opening the file just like we did before and see the contents

#### .. 06.03.03 DictReader

In addition to working with sequences of data, the `csv` module includes classes for working with rows as dictionaries so that the fields can be named. The `DictReader` and `DictWriter` classes translate rows to dictionaries instead of lists. Keys for the dictionary can be passed in, or inferred from the first row in the input.

In [None]:
import csv
row = []
try:
    with open('./sample_datasets/sample_names.csv', 'r') as fh:
        reader = csv.DictReader(fh)
        for info in reader:
            row.append(info)
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

print(row[0:10])

> Python 2 users should see a list of dictionaries. Depending on the version, Python 3 users will see a list of `OrderedDict` (Python 3.6 and 3.7) or a list of plain `dict` (other versions). The `OrderedDict`, as the name suggests, preserves the order in which the entries were inserted, so the columns stay in file order.

#### .. 06.03.04 DictWriter

Similar to `DictReader`, we also have `DictWriter`, which needs to be given a list of field names so that it knows how to order the columns in the output file.

In [None]:
import csv
try:
    fieldnm = ('Title1', 'Title2', 'Title3')
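    # newline='' (Python 3) lets the csv module control line endings itself
    with open('test_dict.csv', 'w', newline='') as fh:
        writer = csv.DictWriter(fh, fieldnames=fieldnm)
        # Write the field names as the first row of the file
        writer.writeheader()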
        for num in range(10):
            writer.writerow({'Title1':num, 'Title2':num+1, 'Title3':num+2})
except IOError as ex:
    print("Error performing I/O operations on the file: ",ex)

The above `DictReader` and `DictWriter` techniques are good when the file size (the number of rows and columns) is not very big. When the number of rows starts scaling up, the list that we build by appending every row from the reader keeps growing in memory and makes the process very, very slow.
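
When only a summary is needed, one way to keep memory use flat is to process each row as it is read instead of appending it to a list. The sketch below uses made-up data in an in-memory buffer with the same column names as `test_dict.csv`; with a real file you would pass the open file handler to `DictReader` instead.

```python
import csv
import io

data = 'Title1,Title2,Title3\n0,1,2\n1,2,3\n2,3,4\n'

total = 0
count = 0
for record in csv.DictReader(io.StringIO(data)):
    # Only the running totals are kept, not the rows themselves
    total += int(record['Title2'])
    count += 1

print(count, total)
```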

We will generally be dealing with files that have over a million rows, and this method is not the most efficient way of dealing with such files. To handle such 'Big Data', we will study Python packages like `NumPy` and `Pandas` in the next few modules.