### Iterators vs. iterables

We can loop over **iterables**: lists, strings, range objects, dictionaries, and file connections.

> Not all iterables are lists.

Formally, an iterable is an object that has an associated `__iter__()` method. Applying the `iter()` function to an iterable calls that method and returns an iterator object.

> A `for` loop takes an iterable, creates the associated iterator object, and iterates over it.
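
The loop mechanics described above can be sketched by hand. This is only a rough equivalent of what a `for` loop does, not the interpreter's actual implementation; the names `word`, `seen`, and `iterator` are made up for illustration.

```python
# A rough, hand-written equivalent of: for letter in word: seen.append(letter)
word = "Da"
seen = []

iterator = iter(word)            # step 1: ask the iterable for an iterator
while True:
    try:
        letter = next(iterator)  # step 2: fetch the next value
    except StopIteration:
        break                    # step 3: stop once the iterator is exhausted
    seen.append(letter)

print(seen)
```

The `for` statement hides the `iter()` call, the repeated `next()` calls, and the `StopIteration` handling.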

An **iterator** is an object that has an associated `__next__()` method, which produces the consecutive values.

> You can use the `iter()` function to get an iterator object, and the `next()` function to retrieve the values one by one from that iterator.

Once we have the iterator, we pass it to the function `next()`, which returns the first value. Each further call to `next()` on the iterator returns the following value, until no values are left, at which point it raises a `StopIteration` exception.
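
A minimal sketch of what happens at the end of an iterator. The list contents are arbitrary sample data, and the variable names are assumptions made for this example.

```python
# Arbitrary sample data for illustration
colors = iter(["red", "green"])

first = next(colors)
second = next(colors)

# A third call has nothing left to return, so StopIteration is raised
try:
    next(colors)
    exhausted = False
except StopIteration:
    exhausted = True

# next() also accepts a default that is returned instead of raising
fallback = next(colors, None)

print(first, second, exhausted, fallback)
```

Passing a default to `next()` is handy when running out of values is an expected case rather than an error.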

> The `*` (star) operator unpacks all elements of an iterator or an iterable.
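
A short sketch of star unpacking, using a made-up string `word`. It also shows a detail worth knowing: unpacking an iterator consumes it, while the underlying iterable can be unpacked again.

```python
word = "Data"
it = iter(word)

# Unpack every element of the iterator as separate arguments to print()
print(*it)            # D a t a

# The iterator is now exhausted, so nothing is left
leftover = list(it)
print(leftover)       # []

# The iterable itself (the string) can be unpacked as often as needed
letters = [*word]
print(letters)        # ['D', 'a', 't', 'a']
```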

> To iterate over the key-value pairs of a Python dictionary, apply the `items()` method to the dictionary and unpack each pair into two loop variables.
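
As a small illustration, with a hypothetical `inventory` dictionary as sample data:

```python
# Hypothetical sample data
inventory = {"apples": 3, "pears": 5}

# items() yields (key, value) tuples, unpacked here into two loop variables
pairs = []
for key, value in inventory.items():
    print(key, value)
    pairs.append((key, value))

# Looping over the dictionary directly yields only its keys
keys_only = [key for key in inventory]
print(keys_only)
```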


There are also functions that take iterators and iterables as arguments. For example, the `list()` and `sum()` functions return a list and the sum of elements, respectively.

```
# Create an iterator for range(3): small_value
small_value = iter(range(3))

# Print the values in small_value
print(next(small_value))
print(next(small_value))
print(next(small_value))
```

Output:

```
0
1
2
```


```
# Loop over range(3) and print the values
for num in range(3):
    print(num)
```

Output:

```
0
1
2
```


```
# Create an iterator for range(10 ** 100): googol
googol = iter(range(10 ** 100))

# Print the first 5 values from googol
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
print(next(googol))
```

Output:

```
0
1
2
3
4
```


```
# Create a range object: values
values = range(10, 21)

# Print the range object
print(values)

# Create a list of integers: values_list
values_list = list(values)

# Print values_list
print(values_list)

# Get the sum of values: values_sum
values_sum = sum(values)

# Print values_sum
print(values_sum)
```

Output:

```
range(10, 21)
[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]
165
```


```
# Create a list of strings: mutants
mutants = ['charles xavier',
           'bobby drake',
           'kurt wagner',
           'max eisenhardt',
           'kitty pryde']

# Create a list of tuples: mutant_list
mutant_list = list(enumerate(mutants))

# Print the list of tuples
print(mutant_list)

# Unpack and print the tuple pairs
for index1, value1 in enumerate(mutants):
    print(index1, value1)

# Change the start index
for index2, value2 in enumerate(mutants, start=1):
    print(index2, value2)
```

Output:

```
[(0, 'charles xavier'), (1, 'bobby drake'), (2, 'kurt wagner'), (3, 'max eisenhardt'), (4, 'kitty pryde')]
0 charles xavier
1 bobby drake
2 kurt wagner
3 max eisenhardt
4 kitty pryde
1 charles xavier
2 bobby drake
3 kurt wagner
4 max eisenhardt
5 kitty pryde
```


`zip()` takes any number of iterables and returns a zip object that is an iterator of tuples. 

If you want to print the values of a zip object, convert it into a list and print that. Printing the zip object itself shows only its representation (something like `<zip object at 0x...>`), not its values; alternatively, unpack it with the `*` operator first.
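
A brief sketch of these `zip()` behaviors. The `names` come from the `mutants` list above; the `aliases` list is sample data added for this example.

```python
names = ["charles xavier", "bobby drake", "kurt wagner"]
aliases = ["prof x", "iceman", "nightcrawler"]  # sample data for illustration

pairs = zip(names, aliases)

# Printing the zip object shows its representation, not the tuples
print(pairs)

# Converting to a list materializes the tuples
pair_list = list(pairs)
print(pair_list)

# A zip object is an iterator: after one full pass it is exhausted
second_pass = list(pairs)

# Unpacking a fresh zip object with * prints the tuples directly
print(*zip(names, aliases))

# zip(*...) reverses the operation and recovers the original columns
unzipped_names, unzipped_aliases = zip(*pair_list)
```

Because the zip object is exhausted after one pass, create a new one (or keep the list) if the pairs are needed more than once.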

If you are pulling huge amounts of data, one solution is to load the data in chunks: perform the desired operation or operations on each chunk, store the result, discard the chunk, and then load the next chunk.

```
import pandas as pd

# Initialize an empty dictionary: counts_dict
counts_dict = {}

# Iterate over the file chunk by chunk (each chunk is a DataFrame of 10 rows)
for chunk in pd.read_csv('xx.csv', chunksize=10):

    # Iterate over the 'lang' column of the chunk
    for entry in chunk['lang']:
        if entry in counts_dict:
            counts_dict[entry] += 1
        else:
            counts_dict[entry] = 1

# Print the populated dictionary
print(counts_dict)
```

**List comprehensions** collapse `for` loops that build lists into a single line. The required components are:

1) an iterable
2) an iterator variable that represents the members of the iterable
3) an output expression
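
To make the three components concrete, here is a small sketch that builds the same list twice, first with a `for` loop and then with a list comprehension. The list `nums` is arbitrary sample data.

```python
nums = [12, 8, 21, 3, 16]  # arbitrary sample data

# for-loop version: create an empty list and append to it
new_nums = []
for num in nums:
    new_nums.append(num + 1)

# List comprehension version:
#   iterable          -> nums
#   iterator variable -> num
#   output expression -> num + 1
new_nums_comp = [num + 1 for num in nums]

print(new_nums)
print(new_nums_comp)
```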

```
# Create a 5 x 5 matrix using a list of lists: matrix
matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)
```

Output:

```
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
[0, 1, 2, 3, 4]
```


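A comprehension can also filter its input with a condition. The general form, written between `[]`, is:

`[` output expression `for` iterator variable `in` iterable `if` predicate expression `]`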
 


```
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)
```

Output:

```
['samwise', 'aragorn', 'legolas', 'boromir']
```


```
# Create list comprehension with a conditional output expression: new_fellowship
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]

# Print the new list
print(new_fellowship)
```

Output:

```
['', 'samwise', '', 'aragorn', 'legolas', 'boromir', '']
```


```
# Create dict comprehension: new_fellowship
new_fellowship = {member: len(member) for member in fellowship}

# Print the new dictionary
print(new_fellowship)
```

Output:

```
{'frodo': 5, 'samwise': 7, 'merry': 5, 'aragorn': 7, 'legolas': 7, 'boromir': 7, 'gimli': 5}
```


A **generator** expression looks like a list comprehension written with parentheses `()` instead of square brackets `[]`, but it does not store the list in memory: it does not construct the list, and instead creates a generator object that we can iterate over to produce the elements as required.

> Like any other iterator, a generator can be passed to the function `next()` in order to step through its elements.
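
A minimal sketch of stepping through a generator with `next()`. The short list of names is taken from the `fellowship` example; the variable names are assumptions made for illustration.

```python
# Parentheses instead of square brackets create a generator, not a list
lengths = (len(member) for member in ["frodo", "samwise", "merry"])

first = next(lengths)    # length of 'frodo'
second = next(lengths)   # length of 'samwise'

# Whatever has not been consumed yet can still be collected
rest = list(lengths)

print(first, second, rest)
```

Values are computed only when requested, and each value is produced once.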

> This helps a great deal when working with extremely large sequences, because you avoid storing the entire list in memory, which is what a list comprehension would do.
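
The memory difference can be glimpsed with `sys.getsizeof()`. This is only a rough indication: for a list it reports the size of the list object itself, not of the integers it refers to, and exact numbers vary between Python versions. The sequence length of 100,000 is an arbitrary choice for this sketch.

```python
import sys

# The list comprehension builds all 100,000 values up front
squares_list = [n * n for n in range(100_000)]

# The generator expression only stores how to produce the next value
squares_gen = (n * n for n in range(100_000))

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size > gen_size)

# Functions such as sum() can consume the generator one value at a time
total = sum(squares_gen)
print(total)
```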



Container sequences, such as lists and tuples, are iterable.