# Python Data Science Toolbox (Part 2)

## Chapter I 

### Using iterators in PythonLand 

#### Iterating with a for loop

* We can iterate over a list using a for loop

```python
employees = ['Nick', 'Lore', 'Hugo']
for employee in employees:
    print(employee)

# Output (one name per line):
# Nick
# Lore
# Hugo
```

* We can also iterate over a string using a for loop

```python
for letter in 'DataCamp':
    print(letter)
```

* We can iterate over a range object using a for loop

```python
for i in range(4):
    print(i)
```

#### Iterators vs. Iterables

- Iterable

    * Examples: lists, strings, dictionaries, file connections
    * An object with an associated iter() method
    * Applying iter() to an iterable creates an iterator

- Iterator

    * Produces the next value with next()
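
A for loop can be thought of as calling iter() once on the iterable and then calling next() on the resulting iterator until StopIteration is raised. The following is a minimal sketch of that equivalence; the helper name `manual_for` is hypothetical and exists only for illustration.

```python
def manual_for(iterable):
    """Collect items the way a for loop would: iter() once, then next() until exhausted."""
    collected = []
    it = iter(iterable)          # iterable -> iterator
    while True:
        try:
            item = next(it)      # ask the iterator for its next value
        except StopIteration:    # raised when no values remain
            break
        collected.append(item)
    return collected


print(manual_for('Da'))          # ['D', 'a']
print(manual_for(range(3)))      # [0, 1, 2]
```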




In [None]:
## Iterating Over iterables: next()

word = 'Da'
it = iter(word)
next(it)
next(it)
# next(it) ==> a third call would raise a StopIteration exception

In [None]:
## Iterating at once with *

word = 'Data'
it = iter(word)
print(*it)

print(*it)  # ==> prints an empty line: the iterator is already exhausted

In [None]:
## Iterating over dictionaries

pythonistas = {'hugo': 'bowne-anderson', 'francis': 'castro'}

for key,val in pythonistas.items():
    print(key, val)

In [None]:
## Iterating over file connections

# The with block closes the file automatically when it ends
with open('DataSets\\tweets.csv') as file:
    it = iter(file)
    print(next(it))  # first line of the file

In [None]:
# Create a list of strings: flash

flash = ['jay garrick', 'barry allen', 'wally west', 'bart allen']

# Print each list item in flash using a for loop
for person in flash:
    print(person)

# Create an iterator for flash: superhero
superhero = iter(flash)

# Print each item from the iterator
print(next(superhero))
print(next(superhero))
print(next(superhero))
print(next(superhero))

In [None]:
# Create a range object: values
values = range(10,21)

# Print the range object
print(values)

# Create a list of integers: values_list
values_list = list(values)

# Print values_list
print(values_list)

# Get the sum of values: values_sum
values_sum = sum(values)

# Print values_sum
print(values_sum)

### Playing with Iterators


In [None]:
## Using enumerate()

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
e = enumerate(avengers)
print(type(e))

e_list = list(e)
print(e_list)


In [None]:
## Enumerate() and unpack

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
for index, val in enumerate(avengers):
    print(index, val)
print('\n-----Starting from 1-------\n')
for index, val in enumerate(avengers, start=1):
    print(index, val)

In [None]:
## Using zip()

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)
print(type(z))

print('\n--------\n')
z_list = list(z)
print(z_list)

In [None]:
## zip() and unpack

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']

for z1, z2 in zip(avengers, names):
    print(z1, z2)

In [None]:
## Print zip with *

avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']
z = zip(avengers, names)

print(*z)
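
The same * operator can also be used to reverse a zip. As a small sketch (the variable names `heroes` and `surnames` are illustrative assumptions), unpacking a zip object into another zip() call gives back the original sequences as tuples, and the zip object is exhausted afterwards:

```python
avengers = ['hawkeye', 'iron man', 'thor', 'quicksilver']
names = ['barton', 'stark', 'odinson', 'maximoff']

z = zip(avengers, names)
heroes, surnames = zip(*z)   # unpack the pairs and zip them back into two tuples

print(heroes == tuple(avengers))   # True
print(surnames == tuple(names))    # True
print(list(z))                     # [] because z was already consumed
```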


### Using iterators to load large files into memory

* There can be too much data to hold in memory

* Solution: load data in **chunks!**

- Pandas function: read_csv()
    * Specify the chunk size: chunksize

```python
import pandas as pd

result = []

# Each chunk is a DataFrame holding up to 10,000 rows
for chunk in pd.read_csv('data.csv', chunksize=10000):
    result.append(sum(chunk['x']))

total = sum(result)
print(total)
```
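
The chunking idea itself does not depend on pandas. The sketch below reproduces it with only the standard library so that it runs without a data.csv file. The in-memory data and the helper `read_in_chunks` are assumptions made for illustration, not part of pandas.

```python
import csv
import io
from itertools import islice

# Stand-in for a large file on disk (hypothetical data: a column 'x' holding 1..10)
fake_file = io.StringIO('x\n' + '\n'.join(str(n) for n in range(1, 11)))


def read_in_chunks(reader, chunk_size):
    """Yield lists of at most chunk_size rows from an iterator of rows."""
    while True:
        chunk = list(islice(reader, chunk_size))  # pull the next few rows only
        if not chunk:                             # iterator exhausted
            return
        yield chunk


reader = csv.DictReader(fake_file)
partial_sums = []
for chunk in read_in_chunks(reader, chunk_size=4):
    partial_sums.append(sum(int(row['x']) for row in chunk))

total = sum(partial_sums)
print(partial_sums)   # [10, 26, 19]
print(total)          # 55
```

Only one chunk of rows is held in memory at a time, and the partial sums are combined at the end.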


## Chapter II 

### List comprehensions and generators

#### List Comprehensions

* Collapse for loops for building lists into a single line

- **Components**
    * Iterable
    * Iterator variable (represents members of the iterable)
    * Output expression

In [None]:
## Populate a list with a for loop

nums = [12, 8, 21, 3, 16]
new_nums = []

for num in nums:
    new_nums.append(num+1)

print(new_nums)

In [None]:
## A list Comprehension

nums = [12, 8, 21, 3, 16]
new_nums = [num + 1 for num in nums]
print(new_nums)

In [None]:
## List Comprehension with range()

result = [num for num in range(11)]
print(result)

In [None]:
## Nested Loops

pairs_1 = []
for num1 in range(0, 2):
    for num2 in range(6, 8):
        pairs_1.append((num1, num2))

print(pairs_1)

In [None]:
## Nested Loops with list Comprehension

pairs_2 = [(num1, num2) for num1 in range(0, 2) for num2 in range(6, 8)]
print(pairs_2)

# Tradeoff: Readability 

In [None]:
# Create a 5 x 5 matrix using a list of lists:

matrix = [[col for col in range(5)] for row in range(5)]

# Print the matrix
for row in matrix:
    print(row)

### Advanced Comprehensions

#### Conditionals in comprehensions

```
[ output expression for iterator variable in iterable if predicate expression ]
```

* Conditionals on the iterable

```python
[num ** 2 for num in range(10) if num % 2 == 0]

# Output 
'''
[0, 4, 16, 36, 64]
'''
```
* Conditionals on the output expression

```python
[num ** 2 if num % 2 == 0 else 0 for num in range(10)]

# Output
'''
[0, 0, 4, 0, 16, 0, 36, 0, 64, 0]
'''
```

#### Dict Comprehensions

* They create dictionaries

* Use curly {} instead of square [] brackets

```python
pos_neg = {num: -num for num in range(9)}
print(pos_neg)

# Output
'''
{0: 0, 1: -1, 2: -2, 3: -3, 4: -4, ....}
'''
```


In [None]:
# Create a list of strings: fellowship
fellowship = ['frodo', 'samwise', 'merry', 'aragorn', 'legolas', 'boromir', 'gimli']

# Create list comprehension: new_fellowship
new_fellowship = [member for member in fellowship if len(member) >= 7]

# Print the new list
print(new_fellowship)

print('\n--------\n')
new_fellowship = [member if len(member) >= 7 else '' for member in fellowship]
print(new_fellowship)

### Introduction to Generator Expressions

#### Generator expressions

* List comprehension - returns a list

* Generator expression - returns a generator object

* Both can be iterated over

* Use () instead of [] to create a generator object

```python
(2 * num for num in range(10))
```

##### Printing values from generators

```python
result = (num for num in range(6))
for num in result:
    print(num)
```
* Lazy evaluation

```python
result = (num for num in range(6))

print(next(result)) # ==> 0
print(next(result)) # ==> 1
print(next(result)) # ==> 2
print(next(result)) # ==> 3
print(next(result)) # ==> 4
print(next(result)) # ==> 5
```
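
To make the effect of lazy evaluation more concrete, the sketch below compares a list comprehension with the equivalent generator expression. The variable names and the range size are arbitrary choices for illustration, and exact object sizes vary between Python versions, so only the comparison is shown.

```python
import sys

squares_list = [num ** 2 for num in range(100_000)]   # every value is built immediately
squares_gen = (num ** 2 for num in range(100_000))    # values are produced only on request

# The generator object stays small because it stores no results
print(sys.getsizeof(squares_gen) < sys.getsizeof(squares_list))   # True

# A generator can be consumed only once
print(sum(squares_gen) == sum(squares_list))   # True
print(next(squares_gen, 'exhausted'))          # exhausted
```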

##### Conditionals in generator expressions

```python
even_nums = (num for num in range(10) if num % 2 == 0)
print(list(even_nums))

# Output
'''
[0, 2, 4, 6, 8]
'''
```

#### Generator Functions

* Produce generator objects when called

* Defined like a regular function, using def

* **Yield** a sequence of values instead of returning a single value

* Generate each value with the yield keyword

```python
def num_sequence(n):
    """Generate values from 0 up to, but not including, n"""
    i = 0
    while i < n:
        yield i
        i += 1
```
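
A short usage sketch may help here. It repeats the definition so that it runs on its own, then shows that calling the function only creates a generator object and that values are produced as they are requested:

```python
def num_sequence(n):
    """Generate values from 0 up to, but not including, n."""
    i = 0
    while i < n:
        yield i
        i += 1


gen = num_sequence(3)        # no code inside the function has run yet
print(type(gen).__name__)    # generator
print(next(gen))             # 0
print(list(gen))             # [1, 2]  (the remaining values)

# A fresh call gives a fresh generator that can be looped over
for value in num_sequence(3):
    print(value)
```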

#### Wrap-up

* Basic Syntax

```
[output expression for iterator variable in iterable]
```

* Advanced Syntax

```
[output expression + conditional on output for iterator variable in iterable + conditional on iterable]
```
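
As an illustration of the advanced syntax, the sketch below uses both kinds of conditional in one comprehension. The specific conditions are arbitrary and chosen only to show where each one goes.

```python
# Conditional on the output: square even numbers, replace odd ones with 0
# Conditional on the iterable: skip multiples of 3 entirely
result = [num ** 2 if num % 2 == 0 else 0
          for num in range(10)
          if num % 3 != 0]

print(result)   # [0, 4, 16, 0, 0, 64]
```

The trailing if filters which values reach the output expression, while the if/else at the front decides what is stored for each value that passes the filter.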