#### pd.read_csv chunksize

A large CSV file **may not fit in memory when read into a DataFrame all at once**. If the entire dataset does not need to be in memory at the same time, one way to avoid the problem is to process the CSV in chunks by specifying the chunksize parameter.

**The chunksize parameter specifies the number of rows per chunk. (The last chunk may contain fewer than chunksize rows.)**

In this case, the pd.read_csv call does not return a DataFrame object. Instead, it **returns a TextFileReader object, which is an iterator** of DataFrame objects, each with at most chunksize rows. We can't call df.index on it because the iterator has no index attribute. The DataFrames inside the iterator are still accessible: either loop through the iterator to get one DataFrame at a time, or concatenate all of them into one large DataFrame.

If you only need to **work with one DataFrame at a time**, fetch a single chunk with next() or loop over the iterator, as in the following cells:

In [283]:
file = 'op_firstrun_raw.csv'
chunks = pd.read_csv(file, chunksize=1000)
df = next(chunks)
type(df)

pandas.core.frame.DataFrame

In [284]:
df.head()

Unnamed: 0,deal_uuid,deal_key,sf_id,deal_year,deal_month,actual_start_date,actual_end_date,voucher_age,days_ran,division_key,...,market_ly_60_merchant_count,market_ly_60_avg_discount,market_ly_60_avg_price,sh_365_avg_deal_gb,sh_365_avg_udv_30,sh_365_avg_udv_90,sh_365_deal_count,sh_365_merchant_count,sh_365_avg_discount,sh_365_avg_price
0,b8f33fe6-4776-4326-984a-071823c0d7dd,43563263,001C000001YfzvGIAR,2017,1,2017-01-15,2017-01-18,,4,3500034,...,,,,,,,,,,
1,b8fb9088-2dcf-42b1-aa05-ab09018a43e6,43609398,001C000001ZT9XnIAL,2017,2,2017-02-10,2017-02-13,,4,3500116,...,,,,123.0,64.0,95.0,1.0,1.0,96.0,60.666668
2,b8fcbb48-4093-4af2-8b2d-ef0a71700b8b,43624780,001C000001ZalFqIAJ,2017,2,2017-02-16,2017-02-19,,4,3500179,...,,,,2888.0,1550.0,2495.0,1.0,1.0,22.333334,66.666664
3,b9043a56-193a-4cbf-bd6a-e608acdd36af,43632667,001C000001ZTkrZIAT,2017,2,2017-02-22,2017-02-21,,0,3500021,...,,,,292.0,623.0,624.0,1.0,1.0,34.875,36.375
4,b90c37fa-4375-4f52-b406-5722717152c9,43614626,001C000001ZZwFjIAL,2017,2,2017-02-08,2017-02-11,,4,3500161,...,,,,26195.0,1815.0,3240.0,1.0,1.0,183.5,51.0


In [280]:
chunks

<pandas.io.parsers.TextFileReader at 0x10e74dac8>

In [285]:
#for loop to access all dataframes
import pandas as pd
file = 'op_firstrun_raw.csv'
chunks = pd.read_csv(file, chunksize=1000)

for chunk in chunks:
#     print(chunk.index)
    # do something
    chunk.to_csv('output_file.csv', mode='a', index=False)
# With mode='a', each chunk is appended to the file, so earlier chunks are not overwritten.
# By default to_csv also writes the header row again with every chunk; pass header=False after the first chunk to avoid that.

In [286]:
test_data = pd.read_csv('output_file.csv')
test_data.shape

(222981, 160)
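
One way to avoid repeated header rows when appending chunk by chunk is to write the header only for the first chunk. The following is a minimal sketch under stated assumptions: the sample data, the column names, and the temporary output path are made up for illustration and stand in for the real file.

```python
import io
import os
import tempfile

import pandas as pd

# Made-up sample data standing in for the real CSV file
csv_text = "deal_month,price\n1,10.0\n2,12.5\n1,9.0\n3,20.0\n2,11.0\n"

# Hypothetical output path inside a temporary directory
out_path = os.path.join(tempfile.mkdtemp(), "output_file.csv")

chunks = pd.read_csv(io.StringIO(csv_text), chunksize=2)
for i, chunk in enumerate(chunks):
    # First chunk: create the file and write the header.
    # Later chunks: append rows only, without repeating the header.
    chunk.to_csv(out_path, mode="w" if i == 0 else "a", header=(i == 0), index=False)

result = pd.read_csv(out_path)
print(result.shape)
```

Opening the file with mode="w" for the first chunk also means that re-running the cell starts from an empty file instead of appending the same rows a second time.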

In [288]:
# Initialize an empty dictionary: counts_dict
counts_dict = {}
chunks = pd.read_csv('op_firstrun_raw.csv', chunksize=10000)
# Iterate over the file chunk by chunk
for chunk in chunks:
    # Iterate over the column in DataFrame
    for entry in chunk['deal_month']:
        if entry in counts_dict:
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1

# Print the populated dictionary
print(counts_dict)

{1: 14543, 2: 14552, 3: 16353, 6: 6948, 4: 11560, 5: 12166, 7: 5887, 8: 6059, 9: 5805, 12: 5393, 10: 5973, 11: 6140}


If the goal is to **concatenate all the DataFrames** into one large DataFrame, pass the iterator to pd.concat. Note that the combined DataFrame must still fit in memory:

In [289]:
file = 'op_firstrun_raw.csv'
chunks = pd.read_csv(file, chunksize=10000)
df = pd.concat(chunks)

In [290]:
type(chunks)

pandas.io.parsers.TextFileReader

In [291]:
type(df)

pandas.core.frame.DataFrame

In [292]:
df.shape

(111379, 160)

#### args and kwargs

The special syntax *args in a Python function definition lets the function accept a variable number of positional (non-keyword) arguments.

* The syntax uses the symbol * to collect a variable number of arguments; by convention, the parameter is named args.
* With *args, a function can take more arguments than the formal parameters it defines. Any number of extra arguments (including zero) can follow the formal parameters.
* For example, a multiply function that takes any number of arguments and multiplies them all together can be written with *args.
* Inside the function, the name after the * is bound to a tuple, so you can iterate over it or pass it to higher-order functions such as map and filter.
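
The multiply function mentioned in the list above can be sketched as follows. This is a minimal illustration; the function name and the choice to return 1 when no arguments are given are assumptions, not part of the original notes.

```python
def multiply(*args):
    """Return the product of all positional arguments (1 if none are given)."""
    product = 1
    for number in args:  # args is a tuple of the positional arguments
        product *= number
    return product

print(multiply(2, 3))        # 6
print(multiply(2, 3, 4, 5))  # 120
print(multiply())            # 1
```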

In [347]:
# Python program to illustrate   
# *args for variable number of arguments 
def myFun(*argv):  
    for arg in argv:  
        print (arg) 
        
args = ("Geeks", "for", "Geeks") 
myFun(*args) 

Geeks
for
Geeks


**kwargs

The special syntax **kwargs in a Python function definition lets the function accept a variable number of keyword arguments. By convention, the parameter is named kwargs; the double star is what collects the keyword arguments (any number of them).

* A keyword argument is one where you provide a name for the value as you pass it into the function.
* Inside the function, kwargs is a dictionary that maps each keyword to the value passed with it. Since Python 3.6, the dictionary preserves the order in which the keyword arguments were passed, as the output below shows.

In [352]:

# Python program to illustrate   
# **kwargs for variable number of keyword arguments 
  
def myFun(**kwargs):  
    for key, value in kwargs.items(): 
        print ("%s == %s" %(key, value)) 

kwargs = {'first' :'Geeks', 'mid' :'for', 'last': 'Geeks'}
myFun(**kwargs)   

first == Geeks
mid == for
last == Geeks
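
Keyword arguments can also be written directly in the call as name=value pairs instead of being unpacked from a dictionary. A minimal sketch, where the helper name describe is hypothetical:

```python
def describe(**kwargs):
    """Return the keyword arguments as a list of 'name == value' strings."""
    return ["%s == %s" % (key, value) for key, value in kwargs.items()]

# Each name=value pair becomes one entry in the kwargs dictionary
lines = describe(first="Geeks", mid="for", last="Geeks")
for line in lines:
    print(line)
```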


#### lambda functions

In [19]:
#lambda can take two arguments
# Define echo_word as a lambda function: echo_word
echo_word = (lambda word1, echo: word1 * echo)

# Call echo_word: result
result = echo_word('hey', 5)

# Print result
print(result)

heyheyheyheyhey


#### map() and lambda functions

In [293]:
# Take a list of numbers.  
my_list = [12, 65, 54, 39, 102, 339, 221, 50, 70] 

result = list(map(lambda x: x % 13, my_list))
  
# printing the result 
print(result)  

[12, 0, 2, 0, 11, 1, 0, 11, 5]


#### filter() and lambda functions

The function filter() offers a way to filter out elements from a list that don't satisfy certain criteria.

In [294]:
# Take a list of numbers.  
my_list = [12, 65, 54, 39, 102, 339, 221, 50, 70, ] 
  
# use anonymous function to filter and comparing  
# if divisible or not 
result = list(filter(lambda x: (x % 13 == 0), my_list))  
  
# printing the result 
print(result)  

[65, 39, 221]


#### reduce() and lambda functions

The reduce() function is useful for performing some computation on a list and, unlike map() and filter(), returns a **single value** as a result. To use reduce(), you must **import it from the functools module**.

**reduce() as a replacement for a for loop**

In [295]:
# Import reduce from functools
from functools import reduce 

# Create a list of strings: stark
stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Use reduce() to apply a lambda function over stark: result
result = reduce(lambda item1, item2: item1 + item2, stark)

# Print the result
print(result)

robbsansaaryabrandonrickon
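
For comparison, the reduce() call above can be written as an explicit loop. This is a sketch of the equivalent accumulation; the variable name accumulated is an assumption introduced for illustration.

```python
from functools import reduce

stark = ['robb', 'sansa', 'arya', 'brandon', 'rickon']

# Explicit loop: start from the first item and fold in the rest one by one
accumulated = stark[0]
for name in stark[1:]:
    accumulated = accumulated + name

# Both approaches produce the same single value
print(accumulated == reduce(lambda item1, item2: item1 + item2, stark))
print(accumulated)
```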


In [296]:
# Import reduce from functools
from functools import reduce 

l = [[1,2],[3,4],[5,6]]

reduce(lambda x1, x2: x1 + x2,map(lambda x: x[0], l))

9

#### Error handling with try-except

The try block lets you test a block of code for errors.

The except block lets you handle the error.

The finally block lets you execute code, regardless of the result of the try and except blocks.

https://docs.python.org/3/tutorial/errors.html
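
A minimal sketch of the three blocks working together. The function name safe_divide and the choice to return None on failure are assumptions made for this illustration.

```python
def safe_divide(numerator, denominator):
    """Divide two numbers, returning None if the division fails."""
    result = None
    try:
        result = numerator / denominator
    except ZeroDivisionError as e:
        # Handle only the error we expect
        print("Cannot divide:", e)
    finally:
        # Runs whether or not an exception was raised
        print("Finished dividing", numerator, "by", denominator)
    return result

print(safe_divide(10, 2))
print(safe_divide(10, 0))
```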

In [48]:
try:
    # print(y)
    raise  # a bare raise with no active exception triggers a RuntimeError
except NameError as e:
    print("Variable y is not defined:", e)
except Exception as e:
    print("Something else went wrong:", e)

Something else went wrong: No active exception to reraise


In [75]:
# constant = 1.6  (the output below implies that constant was None in the session, so the generic handler ran)
def convert_to_km(miles: float) -> float:
    km = None
    try:
        km = miles * constant
    except NameError as e: # error
        print('NameError:', e)
        km = miles * 1.6
    except Exception as e:
        print('Error', e)
    return km

convert_to_km(10)

Error unsupported operand type(s) for *: 'int' and 'NoneType'



#### Error handling by raising an error
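
A minimal sketch of raising an error explicitly with the raise statement. The validation rule (rejecting negative distances) and the error message are illustrative assumptions; the 1.6 factor is the same rough value used in the previous section.

```python
def convert_to_km(miles: float) -> float:
    """Convert miles to kilometres, rejecting negative distances."""
    if miles < 0:
        # Signal invalid input to the caller instead of returning a wrong value
        raise ValueError("miles must be non-negative, got %s" % miles)
    return miles * 1.6

try:
    convert_to_km(-5)
except ValueError as e:
    print("ValueError:", e)
```

Raising an exception lets the caller decide how to react, rather than hiding the problem inside the function.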

#### Iterators vs Iterables

An iterable is an object that one can iterate over. It produces an iterator when passed to the built-in iter() function.

An iterator is an object that yields the items of an iterable one at a time.

An iterator has a __next__() method, called by the built-in next() function, which returns the next item and raises StopIteration when no items are left.
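
The distinction can be sketched with a small hand-written iterator. The class name Countdown is hypothetical and exists only for this illustration.

```python
class Countdown:
    """Iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration  # signals that there are no more items
        value = self.current
        self.current -= 1
        return value

numbers = [1, 2, 3]  # a list is iterable but is not itself an iterator
print(hasattr(numbers, "__next__"))        # False
print(hasattr(iter(numbers), "__next__"))  # True
print(list(Countdown(3)))                  # [3, 2, 1]
```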



In [97]:
for city in ["Berlin", "Vienna", "Zurich"]: 
    print(city) 
  
print("\n") 
      
for language in ("Python", "Perl", "Ruby"): 
    print(language) 
  
print("\n") 
      
for char in "Iteration is easy": 
    print(char, end = " ") 

Berlin
Vienna
Zurich


Python
Perl
Ruby


I t e r a t i o n   i s   e a s y 

In [96]:
# list of cities 
cities = ["Berlin", "Vienna", "Zurich"] 
  
# initialize the iterator object 
iterator_obj = iter(cities) 
  
print(next(iterator_obj)) 
print(next(iterator_obj)) 
print(next(iterator_obj))
# print(next(iterator_obj))

Berlin
Vienna
Zurich


In [106]:
# Check object is iterable or not
def is_iterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # iter() raises TypeError for objects that are not iterable
        return False

In [107]:
for element in [34, [4, 5], (4, 5), 
             {"a":4}, "dfsdf", 4.5]: 
                   
    print(element, " is iterable : ", is_iterable(element)) 

34  is iterable :  False
[4, 5]  is iterable :  True
(4, 5)  is iterable :  True
{'a': 4}  is iterable :  True
dfsdf  is iterable :  True
4.5  is iterable :  False


In [94]:
a = {'a': 4} 
b = iter(a)
b

<dict_keyiterator at 0x117948c78>

In [95]:
next(b)

'a'

**range() doesn't actually create the list; instead, it creates a range object with an iterator that produces the values until it reaches the limit**

In [108]:
# Create an iterator for range(3): small_value
small_value = iter(range(3))
# small_value

# Print the values in small_value
print(next(small_value))
print(next(small_value))
print(next(small_value))

0
1
2


#### Enumerate

When looping over an iterable, we often also need to **keep a count of iterations**. Python provides the built-in function enumerate() for this task.

enumerate() **returns an enumerate object** that produces **a sequence of tuples**, and each of the tuples is an index-value pair.

**Syntax**:

enumerate(iterable, start=0)

Parameters:
* iterable: any object that supports iteration
* start: the value at which the counter starts; the default is 0 

In [304]:
# Create a list of strings: mutants
mutants = ['charles xavier', 
            'bobby drake', 
            'kurt wagner', 
            'max eisenhardt', 
            'kitty pryde']

enumerate(mutants)

<enumerate at 0x195b575e8>

In [305]:
# Create a list of tuples: mutant_list
mutant_list = list(enumerate(mutants))

# Print the list of tuples
print(mutant_list)

[(0, 'charles xavier'), (1, 'bobby drake'), (2, 'kurt wagner'), (3, 'max eisenhardt'), (4, 'kitty pryde')]


In [306]:
# Unpack and print the tuple pairs
for index1, value1 in enumerate(mutants):
    print(index1, value1)

0 charles xavier
1 bobby drake
2 kurt wagner
3 max eisenhardt
4 kitty pryde


In [307]:
# Change the start index
for index2, value2 in enumerate(mutants, start=1):
    print(index2, value2)

1 charles xavier
2 bobby drake
3 kurt wagner
4 max eisenhardt
5 kitty pryde


In [308]:
# Python program to illustrate 
# enumerate function in loops 
l1 = ["eat","sleep","repeat"] 
  
# printing the tuples in object directly 
for ele in enumerate(l1): 
    print(ele)

(0, 'eat')
(1, 'sleep')
(2, 'repeat')


In [309]:
# changing index and printing separately 
#starting index is 100
for count,ele in enumerate(l1,100): 
    print (count,ele )

100 eat
101 sleep
102 repeat


#### Using zip

zip() takes any number of iterables and **returns a zip object**, which is an iterator of tuples. To print the values of a zip object, convert it into a list first.

The purpose of zip() is to pair up the elements at the same index of multiple containers so that they can be used together as a single entity.

**Syntax :**

zip(*iterables)

Parameters : 
     Python iterables or containers (list, string, etc.)

Return Value : 
     Returns a single iterator object whose items are tuples of the corresponding values from all the containers.
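
One detail worth knowing is what happens when the iterables have different lengths. A minimal sketch, using shortened made-up lists: zip() stops at the shortest input, while itertools.zip_longest() from the standard library pads the missing positions.

```python
from itertools import zip_longest

names = ["Manjeet", "Nikhil", "Shambhavi"]
marks = [40, 50]  # one value short on purpose

# zip() stops as soon as the shortest iterable is exhausted
short = list(zip(names, marks))
print(short)

# zip_longest() pads the missing positions with fillvalue instead
padded = list(zip_longest(names, marks, fillvalue=None))
print(padded)
```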



In [139]:
# Python code to demonstrate the working of  
# zip() 
  
# initializing lists 
name = [ "Manjeet", "Nikhil", "Shambhavi", "Astha" ] 
roll_no = [ 4, 1, 3, 2 ] 
marks = [ 40, 50, 60, 70 ] 
  
# using zip() to map values 
mapped = zip(name, roll_no, marks) 

In [140]:
print(mapped)

<zip object at 0x117dc9608>


In [141]:
# converting values to print as list 
mapped = list(mapped) 
  
# printing resultant values  
print(mapped) 

[('Manjeet', 4, 40), ('Nikhil', 1, 50), ('Shambhavi', 3, 60), ('Astha', 2, 70)]


In [142]:
for x, y, z in mapped:
    print(x, y, z)

Manjeet 4 40
Nikhil 1 50
Shambhavi 3 60
Astha 2 70


In [143]:
# converting values to print as set 
mapped = set(mapped) 
  
# printing resultant values   
print (mapped) 

{('Astha', 2, 70), ('Manjeet', 4, 40), ('Shambhavi', 3, 60), ('Nikhil', 1, 50)}


In [144]:
for x, y, z in mapped:
    print(x, y, z)

Astha 2 70
Manjeet 4 40
Shambhavi 3 60
Nikhil 1 50


##### **list to dictionary function**

In [312]:
feature_names = ['CountryName',
 'CountryCode',
 'IndicatorName',
 'IndicatorCode',
 'Year',
 'Value']
row_vals = ['Arab World',
 'ARB',
 'Adolescent fertility rate (births per 1,000 women ages 15-19)',
 'SP.ADO.TFRT',
 '1960',
 '133.56090740552298']
# Zip lists: zipped_lists
zipped_lists = zip(feature_names, row_vals)

# Create a dictionary: rs_dict
rs_dict = dict(zipped_lists)

# Print the dictionary
print(rs_dict)

{'CountryName': 'Arab World', 'CountryCode': 'ARB', 'IndicatorName': 'Adolescent fertility rate (births per 1,000 women ages 15-19)', 'IndicatorCode': 'SP.ADO.TFRT', 'Year': '1960', 'Value': '133.56090740552298'}


In [313]:
# Define lists2dict()
def lists2dict(list1, list2):
    """Return a dictionary where list1 provides
    the keys and list2 provides the values."""

    # Zip lists: zipped_lists
    zipped_lists = zip(list1, list2)

    # Create a dictionary: rs_dict
    rs_dict = dict(zipped_lists)

    # Return the dictionary
    return(rs_dict)
    

# Call lists2dict: rs_fxn
rs_fxn = lists2dict(feature_names, row_vals)

# Print rs_fxn
print(rs_fxn)

{'CountryName': 'Arab World', 'CountryCode': 'ARB', 'IndicatorName': 'Adolescent fertility rate (births per 1,000 women ages 15-19)', 'IndicatorCode': 'SP.ADO.TFRT', 'Year': '1960', 'Value': '133.56090740552298'}


**unzip**

Unzipping means converting the zipped values back into the individual sequences they came from. This is done by passing the zipped values to zip() with the * operator.

In [154]:
# unzip 
  
# initializing lists 
  
name = [ "Manjeet", "Nikhil", "Shambhavi", "Astha" ] 
roll_no = [ 4, 1, 3, 2 ] 
marks = [ 40, 50, 60, 70 ] 
  
# using zip() to map values 
mapped = zip(name, roll_no, marks) 
  
# converting values to print as list 
mapped = list(mapped) 
  
# printing resultant values  
print ("The zipped result is : ",end="") 
print (mapped) 
  
print("\n") 

The zipped result is : [('Manjeet', 4, 40), ('Nikhil', 1, 50), ('Shambhavi', 3, 60), ('Astha', 2, 70)]




In [155]:
# unzipping values 
namz, roll_noz, marksz = zip(*mapped)
  
print ("The unzipped result: \n",end="") 
  
# printing initial lists 
print ("The name list is : ",end="") 
print (namz) 
  
print ("The roll_no list is : ",end="") 
print (roll_noz) 
  
print ("The marks list is : ",end="") 
print (marksz) 

The unzipped result: 
The name list is : ('Manjeet', 'Nikhil', 'Shambhavi', 'Astha')
The roll_no list is : (4, 1, 3, 2)
The marks list is : (40, 50, 60, 70)


In [314]:
# Python code to demonstrate the application of zip() 

# initializing list of players. 
players = [ "Sachin", "Sehwag", "Gambhir", "Dravid", "Raina" ] 
  
# initializing their scores 
scores = [100, 15, 17, 28, 43 ] 
  
# printing players and scores. 
for pl, sc in zip(players, scores): 
    print ("Player :  %s     Score : %d" %(pl, sc)) 

Player :  Sachin     Score : 100
Player :  Sehwag     Score : 15
Player :  Gambhir     Score : 17
Player :  Dravid     Score : 28
Player :  Raina     Score : 43


#### list comprehension

##### Nested list comprehensions

In [211]:
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)

[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]


In [209]:
[col for col in range(5)]

[0, 1, 2, 3, 4]

In [210]:
for col in range(5):
    print(col)

0
1
2
3
4


##### Using conditionals in comprehensions (1)

[ output expression for iterator variable in iterable if predicate expression ].

In [212]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)


['samwise', 'aragorn', 'legolas', 'boromir']


##### Using conditionals in comprehensions (2)

In [213]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)

['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']


#### Dict comprehensions

In [319]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create dict comprehension: new_fellowship
new_fellowship = {member: len(member) for member in fellowship}

# Print the new dictionary
print(new_fellowship)


{'frodo': 5, 'samwise': 7, 'merry': 5, 'aragorn': 7, 'legolas': 7, 'boromir': 7, 'gimli': 5}


#### Python Generators

https://www.programiz.com/python-programming/generator

https://www.pythoncentral.io/python-generators-and-yield-keyword/

The idea of generators is to calculate a series of results one by one, on demand (on the fly). In the simplest case, a generator can be used like a list whose elements are calculated lazily.

Simply speaking, a generator is a function that returns an iterator object, which we can iterate over one value at a time.

However, although a generator is iterable, it is not a collection and thus has no length. **Collections (lists, tuples, sets, etc.) keep all values in memory**, and we can access them whenever needed. A generator calculates its values on the fly and forgets them, so it has no overview of its own result set.

Generators are especially useful for memory-intensive tasks, where there is no need to keep all of the elements of a memory-heavy list accessible at the same time. Calculating a series of values one-by-one can also be useful in situations where the complete result is never needed, yielding intermediate results to the caller until some requirement is satisfied and further processing stops.
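
The last point, stopping once some requirement is satisfied, can be sketched as follows. The generator name squares and the threshold of 50 are assumptions chosen for illustration; the yield keyword it relies on is covered in the next section.

```python
def squares():
    """Yield 1, 4, 9, ... indefinitely, computing each value on demand."""
    n = 1
    while True:
        yield n * n
        n += 1

# Consume values only until the first square above 50, then stop
seen = []
for value in squares():
    if value > 50:
        break
    seen.append(value)

print(seen)
```

Because values are produced lazily, the loop never computes more squares than it actually inspects.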

##### How to create a generator in Python?

It is fairly simple to create a generator in Python. It is as easy as defining a normal function **with yield statement instead of a return statement.**

In [321]:
def test_function():
    yield 1 
    yield 2 
    yield 3

a = test_function()

In [322]:
next(a)

1

In [323]:
next(a)

2

In [324]:
next(a)

3

In [326]:
a = test_function()
for i in a:
    print(i)

1
2
3


In [325]:
# Create a list of strings
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']

# Define generator function get_lengths
def get_lengths(input_list):
    """Generator function that yields the
    length of the strings in input_list."""

    # Yield the length of a string
    for person in input_list:
        yield len(person)
        

# Print the values generated by get_lengths()
for value in get_lengths(lannister):
    print(value)

6
5
5
6
7


In [246]:
a = get_lengths(lannister)
next(a)

6

In [247]:
next(a)

5

In [223]:
# A simple generator function
def my_gen():
    n = 1
    print('This is printed first')
    # Generator function contains yield statements
    yield n

    n += 1
    print('This is printed second')
    yield n

    n += 1
    print('This is printed at last')
    yield n

In [224]:
# It returns an object but does not start execution immediately.
a = my_gen()
# We can iterate through the items using next().
next(a)

This is printed first


1

In [226]:
# Once the function yields, the function is paused and the control is transferred to the caller.
# Local variables and their states are remembered between successive calls.
next(a)

This is printed second


2

In [227]:
next(a)

This is printed at last


3

In [228]:
# Finally, when the function terminates, StopIteration is raised automatically on further calls.
next(a)

StopIteration: 

In [230]:
# Using for loop
for item in my_gen():
    print(item)    

This is printed first
1
This is printed second
2
This is printed at last
3


In [327]:
len('hello')

5

In [328]:
for i in range(5):
    print(i)

0
1
2
3
4


In [333]:
for i in range(5, 0, -1):
    print(i)
#         yield my_str[i]

5
4
3
2
1


In [334]:
#Let's take an example of a generator that reverses a string.
def rev_str(my_str):
    length = len(my_str)
    for i in range(length - 1, -1, -1):
        yield my_str[i]


# For loop to reverse the string
# Output:
# o
# l
# l
# e
# h
for char in rev_str("hello"):
     print(char)

o
l
l
e
h


##### Why generators are used in Python?

1. Easy to Implement
Generators can be implemented in a clear and concise way compared to their iterator class counterparts. The PowTwoGen generator below, for example, produces a sequence of powers of 2 in a few lines, whereas an iterator class would have to define __iter__() and __next__() and track its own state.

2. Memory Efficient
A normal function that returns a sequence creates the entire sequence in memory before returning the result. This is overkill if the number of items in the sequence is very large. A generator implementation of such a sequence is memory friendly and is preferred, since it produces only one item at a time.

In [335]:
def PowTwoGen(max = 10):
    n = 0
    while n <= max:
        yield 2 ** n
        n += 1

In [336]:
a = PowTwoGen()

In [337]:
next(a)
next(a)
next(a)
next(a)
next(a)

16
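
For comparison with the PowTwoGen generator above, the same sequence written as an iterator class might look like the sketch below. The class name PowTwo and the parameter name max_exp are assumptions introduced here.

```python
class PowTwo:
    """Iterator class producing 2**0, 2**1, ..., 2**max_exp."""

    def __init__(self, max_exp=10):
        self.n = 0
        self.max_exp = max_exp

    def __iter__(self):
        return self

    def __next__(self):
        # The class has to track its own position and signal the end itself
        if self.n > self.max_exp:
            raise StopIteration
        result = 2 ** self.n
        self.n += 1
        return result

print(list(PowTwo(4)))  # [1, 2, 4, 8, 16]
```

The generator version gets the state tracking and the StopIteration signal for free, which is why it is noticeably shorter.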

3. Represent Infinite Stream
Generators are an excellent medium for representing an **infinite stream of data**. An infinite stream cannot be stored in memory, but since a generator produces only one item at a time, it can represent one.


In [258]:
#The following example can generate all the even numbers (at least in theory).
def all_even():
    n = 0
    while True:
        yield n
        n += 2

In [259]:
even = all_even()
next(even)

0

In [260]:
next(even)

2

In [261]:
next(even)

4

In [255]:
def hold_client(name):
    yield 'Hello, %s! You will be connected soon' % name
    yield 'Dear %s, could you please wait a bit.' % name
    yield 'Sorry %s, we will play a nice music for you!' % name
    yield '%s, your call is extremely important to us!' % name

In [256]:
a = hold_client('ruby')

In [250]:
next(a)

'Hello, ruby! You will be connected soon'

In [251]:
next(a)

'Dear ruby, could you please wait a bit.'

In [252]:
next(a)

'Sorry ruby, we will play a nice music for you!'

In [253]:
next(a)

'ruby, your call is extremely important to us!'

In [257]:
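# Loop over a fresh generator; one that has already been exhausted with next() yields nothing
for item in hold_client('ruby'):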
    print(item)

Hello, ruby! You will be connected soon
Dear ruby, could you please wait a bit.
Sorry ruby, we will play a nice music for you!
ruby, your call is extremely important to us!


#### List comprehensions vs generators

**The major difference between a list comprehension and a generator expression is that while list comprehension produces the entire list, generator expression produces one item at a time.**

In [215]:
# List of strings
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# List comprehension
fellow1 = [member for member in fellowship if len(member) >= 7]

# Generator expression
fellow2 = (member for member in fellowship if len(member) >= 7)

In [216]:
type(fellow1)

list

In [217]:
type(fellow2)

generator
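
The practical consequences of this difference can be sketched as follows. The variable names and the size of 10,000 items are arbitrary choices for illustration.

```python
import sys

squares_list = [n * n for n in range(10000)]  # all 10,000 values stored at once
squares_gen = (n * n for n in range(10000))   # values produced on demand

# The generator object stays small no matter how many values it will produce
print(sys.getsizeof(squares_list) > sys.getsizeof(squares_gen))  # True

# Both can be consumed by functions that accept any iterable
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)  # True

# A generator can be consumed only once; it is now exhausted
leftover = sum(squares_gen)
print(leftover)  # 0
```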

In [219]:
# Create generator object: result
result = (num for num in range(31))

# Print the first 5 values
print(next(result))
print(next(result))
print(next(result))
print(next(result))
print(next(result))

# Print the rest of the values
# for value in result:
#     print(value)


0
1
2
3
4


In [221]:
# Create a list of strings
lannister = ['cersei', 'jaime', 'tywin', 'tyrion', 'joffrey']

# Define get_lengths with return instead of yield
def get_lengths(input_list):
    """Regular function (not a generator): return exits on the
    first iteration, so only the first length is returned."""

    # Return the length of the first string only
    for person in input_list:
        return len(person)
        

# # Print the values generated by get_lengths()
# for value in get_lengths(lannister):
#     print(value)

In [222]:
get_lengths(lannister)

6

#### Context manager: open a connection to the file

In [None]:

with open('world_dev_ind.csv') as file:
    print(file.readline())  # read the first line; the file is closed automatically on exit
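
A minimal sketch of what the with statement guarantees. It uses a small temporary file created on the spot, so the path and the file contents are made up for illustration and do not correspond to world_dev_ind.csv.

```python
import os
import tempfile

# Hypothetical small file created only for this illustration
path = os.path.join(tempfile.mkdtemp(), "example.csv")
with open(path, "w") as out_file:
    out_file.write("CountryName,Year\nArab World,1960\n")

with open(path) as file:
    header = file.readline().strip()
    print(header)
    print(file.closed)  # False: still inside the with block

print(file.closed)      # True: closed automatically on exit
```

The file is closed when the block exits, even if an exception is raised inside it, so no explicit file.close() call is needed.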
