## Iterables
This section covers comprehensions, iterable objects, and iterators, with a brief look at the lazy evaluation model of generators.

### List Comprehensions
A comprehension is a concise shorthand for building a collection from an iterable. Used well, it makes code more readable and expressive.

General form:
`[expr(item) for item in iterable]`

In [19]:
# Split a sentence into a list of words
words = "Today I am very happy to learn about Comprehensions".split()
words

['Today', 'I', 'am', 'very', 'happy', 'to', 'learn', 'about', 'Comprehensions']

In [20]:
# Create a new list with the length of each string from words
lengths = []
for word in words:
    lengths.append(len(word))

print(words)
print(lengths)

['Today', 'I', 'am', 'very', 'happy', 'to', 'learn', 'about', 'Comprehensions']
[5, 1, 2, 4, 5, 2, 5, 5, 14]


In [21]:
# Use a list comprehension instead
lengths = [len(word) for word in words]
print(lengths)

[5, 1, 2, 4, 5, 2, 5, 5, 14]


In [27]:
# Using a list comprehension, count the digits in each of the first 20 factorials
from math import factorial as fact
factorial_lengths = [len(str(fact(number))) for number in range(1,21)]
print(factorial_lengths)
print([fact(number) for number in range(1,21)])

[1, 1, 1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 13, 14, 15, 16, 18, 19]
[1, 2, 6, 24, 120, 720, 5040, 40320, 362880, 3628800, 39916800, 479001600, 6227020800, 87178291200, 1307674368000, 20922789888000, 355687428096000, 6402373705728000, 121645100408832000, 2432902008176640000]


### Set Comprehensions
General form:
`{expr(item) for item in iterable}`

In [31]:
# Task: Create a set of the unique digit counts of the first 20 factorials
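
One possible solution to the task above, as a minimal sketch. It reuses the digit-counting expression from the list comprehension example and only swaps the square brackets for curly braces; the name `unique_lengths` is chosen here for illustration.

```python
from math import factorial as fact

# A set comprehension keeps each digit count only once.
unique_lengths = {len(str(fact(number))) for number in range(1, 21)}

# Sets are unordered, so sort before printing for a stable display.
print(sorted(unique_lengths))
```

The list version has 20 entries, while the set has fewer because repeated digit counts (such as the three 1s) collapse into one element.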


### Dictionary Comprehensions
General form:
`{key_expr(item): value_expr(item) for item in iterable}`

In [34]:
element_protons = {'Hydrogen':'1','Helium':'2','Lithium':'3'}
# Invert the mapping (swap keys and values) with a dictionary comprehension
protons_element = {protons:element for element, protons in element_protons.items()}
print(protons_element)
print(element_protons)

{'1': 'Hydrogen', '2': 'Helium', '3': 'Lithium'}
{'Hydrogen': '1', 'Helium': '2', 'Lithium': '3'}


### Filtering Predicates
A comprehension may include an `optional` filtering predicate. Only items for which the predicate is true end up in the result.

General form:
`[expr(item) for item in iterable if predicate(item)]`

In [1]:
from math import sqrt

def is_prime(n):
    if n <= 1:
        return False
    return all(n % i != 0 for i in range(2, int(sqrt(n)) + 1))


prime_list = [i for i in range(101) if is_prime(i)]
print(prime_list)

[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]


### TODO:
- Iteration protocols
- Generators

### Correlation Analysis
- What is correlation analysis?
- How does correlation analysis help with data cleaning?
- Coding Example

### Correlation Analysis
Correlation analysis is a statistical technique used to examine the strength and direction of the relationship between two or more variables.

It analyzes the degree to which changes in one variable are associated with changes in another variable.

### How to do it?
Use a `correlation coefficient`, which measures the strength and the direction of the relationship between two variables:

$$
\{X,Y\}
$$

### Popular Correlation Coefficients

#### Pearson
- Use for continuous data
- Measures the strength of the `linear relationship` between the variables
- Sensitive to outliers
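
For two samples $x_1,\dots,x_n$ and $y_1,\dots,y_n$ with means $\bar{x}$ and $\bar{y}$, the Pearson coefficient is commonly written as

$$
r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}
$$

The sketch below implements this formula in plain Python to make the outlier sensitivity visible. The function name `pearson` and the `hours`/`scores` data are made up for illustration; in practice a library routine would normally be used instead.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Numerator: how the two variables vary together.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: how much each variable varies on its own.
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

# Hypothetical data: study hours vs. test scores, roughly linear.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]
print(round(pearson(hours, scores), 3))

# Same data with one corrupted score (an outlier in the last position).
scores_with_outlier = [52, 55, 61, 64, 5]
print(round(pearson(hours, scores_with_outlier), 3))
```

With the clean data the coefficient is close to 1. Replacing a single value with an outlier is enough to pull it below zero, which is what "sensitive to outliers" means in practice.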

#### Spearman
- Use for ordinal or ranked data
- Measures the strength of the `monotonic relationship` between the variables, which can be linear or non-linear
- More robust toward outliers
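
Spearman's coefficient can be computed as the Pearson coefficient of the ranks of the data rather than the raw values. The following is a minimal sketch under that definition; the helper names `pearson`, `average_ranks`, and `spearman` are illustrative, and tied values are given the average of their positions.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / sqrt(spread_x * spread_y)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1  # average of the 1-based positions
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # monotonic, but not linear

print(round(pearson(xs, ys), 3))
print(round(spearman(xs, ys), 3))
```

Because `ys` always increases with `xs`, the ranks agree perfectly and the Spearman coefficient is 1, while the Pearson coefficient stays below 1 because the relationship is not a straight line.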

Correlation analysis can identify variables that are highly correlated with each other.

A very high correlation may indicate that one of the variables is `redundant` and can be eliminated.
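
As a sketch of how this can support data cleaning, the example below builds a small pandas DataFrame and lists the column pairs whose absolute Pearson correlation exceeds a cutoff. The data, the column names, and the `threshold` of 0.95 are all assumptions made for illustration; a suitable cutoff depends on the data set.

```python
import pandas as pd

# Hypothetical data: the same height recorded in two units, plus an unrelated column.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],
    "shoe_color_code": [3, 1, 4, 1, 5],
})

# Pairwise Pearson correlations; the sign does not matter for redundancy.
corr = df.corr(method="pearson").abs()

threshold = 0.95  # assumed cutoff, not a universal rule
columns = list(corr.columns)

# Visit each unordered pair once and keep the highly correlated ones.
redundant_pairs = [
    (a, b)
    for i, a in enumerate(columns)
    for b in columns[i + 1:]
    if corr.loc[a, b] > threshold
]
print(redundant_pairs)
```

Here the two height columns carry the same information, so one of them is a candidate for removal. Passing `method="spearman"` to `corr` would compare ranks instead of raw values.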

### Dealing with Categorical Data
- Data often has `non-numeric` features. Most learning models cannot take them as input directly, so they need to be converted to numbers
- Use the `dtypes` attribute of a pandas DataFrame (for example, `df.dtypes`) to see the data types
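
A short sketch of inspecting column types with pandas. The DataFrame contents are hypothetical; `dtypes` is an attribute (no parentheses), and `select_dtypes` is used here to pick out the columns that still need encoding.

```python
import pandas as pd

# Hypothetical data set mixing numeric and non-numeric features.
df = pd.DataFrame({
    "city": ["Paris", "Lima", "Oslo"],
    "size": ["small", "large", "medium"],
    "population_millions": [2.1, 10.0, 0.7],
    "year": [2020, 2021, 2022],
})

# One dtype per column.
print(df.dtypes)

# Names of the columns that are not numeric.
non_numeric = list(df.select_dtypes(exclude="number").columns)
print(non_numeric)
```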

### Two Main Types
- Label encoding
- One-hot encoding

### Label encoding
Each `unique category` in the categorical variable is assigned a numerical label, typically starting at 0 and counting up (0, 1, 2, and so on).
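
A minimal sketch of label encoding in plain Python, using a dictionary comprehension to build the category-to-label mapping. The `colors` data is hypothetical, and sorting the categories is an assumption made only so that the labels are reproducible.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green", "red"]

# Map each unique category to an integer label, starting at 0.
label_of = {category: label for label, category in enumerate(sorted(set(colors)))}

# Replace every value with its label.
encoded = [label_of[color] for color in colors]

print(label_of)
print(encoded)
```

Note that the integers imply an order (here blue < green < red) that the original categories may not have, which some models can misread as meaningful.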


### One-hot Encoding
A new binary feature is created `for each` category. The value of that feature is set to 1 if the observation belongs to that category and to 0 otherwise.
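
A minimal sketch of one-hot encoding in plain Python, with one dictionary per observation. The `colors` data and the `color_` prefix on the new feature names are illustrative assumptions.

```python
# Hypothetical categorical column.
colors = ["red", "green", "blue", "green", "red"]

# One new binary feature per unique category.
categories = sorted(set(colors))

# For each observation, set the matching feature to 1 and all others to 0.
one_hot = [
    {f"color_{category}": int(color == category) for category in categories}
    for color in colors
]

for row in one_hot:
    print(row)
```

Each row contains exactly one 1. With pandas, `pd.get_dummies` produces a comparable set of indicator columns directly from a DataFrame column.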