## What We Looked At Last Time
* We wrapped up our discussion of sets.
* We introduced NumPy arrays.

## What We'll Look At Today
* We'll work through a few exercises relevant to lists, dictionaries, and sets.
* We'll take a look at _one way_ to create animations in Python.
* We'll continue to discuss NumPy array concepts.


# Some exercises

# Exercise 5.4

Create a 2-by-3 list then use a nested loop to:
1. Set each element's value to an integer indicating the order in which it was processed by the nested loop. 
2. Display the elements in tabular format. Use the column indices as headings across the top, and the row indices to the left of each row.

In [None]:
values = [[0, 0, 0],[0, 0, 0]]
print(values)
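
A side note on creating the 2-by-3 list: writing the rows out by hand works here, but for larger sizes a nested comprehension is one option. The sketch below uses the illustrative names `grid` and `aliased` (not part of the exercise) and also shows a common pitfall, where multiplying the outer list copies references to a single inner list.

```python
rows, columns = 2, 3

# Each pass of the outer comprehension builds a separate inner list.
grid = [[0 for _ in range(columns)] for _ in range(rows)]

# Pitfall: this repeats a reference to the SAME inner list `rows` times.
aliased = [[0] * columns] * rows

grid[0][0] = 99     # changes only the first row
aliased[0][0] = 99  # shows up in every row, because the rows are one object

print(grid)     # [[99, 0, 0], [0, 0, 0]]
print(aliased)  # [[99, 0, 0], [99, 0, 0]]
```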

In [None]:
count = 0
for row in range(len(values)):
    for col in range(len(values[row])):
        count += 1
        values[row][col] = count
        
print(values)

In [None]:
print('   ', end = '')

for col in range(len(values[0])): #Print a line for column indices
    print(f'{[col]}', end = ' ')
print()

for i, row in enumerate(values):
    print(f'{[i]}', end = ' ')
    for value in row:
        print(f'{value}   ', end = '')
    print()

# Exercise 5.5
Create a string called alphabet containing 'abcdefghijklmnopqrstuvwxyz', then perform the following separate slice operations to obtain:

In [None]:
alphabet = 'abcdefghijklmnopqrstuvwxyz'

## The first half of the string using starting and ending indices.

In [None]:
alphabet[0:len(alphabet)//2]

## The first half of the string using only the ending index.

In [None]:
alphabet[:len(alphabet) // 2]

## The second half of the string using starting and ending indices.

In [None]:
alphabet[len(alphabet) //2: len(alphabet)]

## The second half of the string using only the starting index.

In [None]:
alphabet[len(alphabet) //2:]

## Every second letter in the string starting with 'a'.

In [None]:
alphabet[::2]

## The entire string in reverse. 

In [None]:
alphabet[::-1]

## Every third letter of the string in reverse starting with 'z'.

In [None]:
alphabet[::-3]
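
As a quick check of the slices above, the following sketch stores a few of the results in variables (the variable names are illustrative only) and prints them.

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'
middle = len(alphabet) // 2  # 26 // 2 is 13

first_half = alphabet[:middle]         # indices 0 through 12
second_half = alphabet[middle:]        # indices 13 through 25
every_second = alphabet[::2]           # step of 2 starting at 'a'
every_third_reversed = alphabet[::-3]  # step of -3 starting at 'z'

print(first_half)            # abcdefghijklm
print(second_half)           # nopqrstuvwxyz
print(every_second)          # acegikmoqsuwy
print(every_third_reversed)  # zwtqnkheb
```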

# Exercise 5.7
Create a function that receives a list and returns a list containing only the unique values in sorted order. Test your function with a list of numbers and a list of strings.  

In [None]:
def uniquesorted(values):
    non_duplicates = []
    for value in values:
        if value not in non_duplicates:
            non_duplicates.append(value)
    non_duplicates.sort()        
    return non_duplicates
                

In [None]:
numbers = [11, 11, 2, 2, 7, 7 , 5, 5, 3, 3]
print(uniquesorted(numbers))

In [None]:
colors = ['red','red','orange','orange','yellow','green','green','yellow']
print(uniquesorted(colors))
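
For comparison, here is a shorter sketch of the same idea that uses a set to drop duplicates. The function name `unique_sorted_with_set` is made up for this example, and the approach assumes the list elements are hashable and mutually comparable (true for the numbers and strings used here).

```python
def unique_sorted_with_set(values):
    """Return the unique values in sorted order (illustrative helper)."""
    return sorted(set(values))  # set removes duplicates; sorted returns a new list


print(unique_sorted_with_set([11, 11, 2, 2, 7, 7, 5, 5, 3, 3]))
# [2, 3, 5, 7, 11]
print(unique_sorted_with_set(['red', 'red', 'orange', 'yellow', 'green', 'green']))
# ['green', 'orange', 'red', 'yellow']
```

The loop-based version checks `value not in non_duplicates` against a list on every iteration, which gets slower as the list grows; the set-based version avoids that repeated search.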

# Exercise 6.6
Write a function that receives a list of words, then determines and displays in alphabetical order only the unique words. Treat uppercase and lowercase the same. The function should use a set to get the unique words in the list. Test your function with several sentences.

In [None]:
def unique_words(words):
    uniques = set(word.lower() for word in words)
    print(sorted(uniques))
    
text = ('This is sample text with several words '
        'This is more sample text with some different words')

unique_words(text.split())

# Intro to Data Science: Dynamic Visualizations
* In this section, we make things “come alive” with _dynamic visualizations_. 

### The Law of Large Numbers
* For a six-sided die, each value 1 through 6 should occur one-sixth of the time, so the probability of any one of these values occurring is 1/6th or about 16.667%.  
* In the dynamic visualization, the more rolls we perform, the closer each die value’s percentage of the total rolls gets to 16.667% and the heights of the bars gradually become about the same. 
* This is a manifestation of the _law of large numbers_. 
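
The effect can also be seen without any plotting. The sketch below is a text-only approximation of the idea, not the animated script: the helper `face_percentages` and the `seed` parameter are assumptions introduced here so the run is repeatable.

```python
import random


def face_percentages(number_of_rolls, seed=None):
    """Roll a six-sided die and return each face's share of the rolls."""
    rng = random.Random(seed)  # separate generator so the seed is local
    frequencies = [0] * 6
    for _ in range(number_of_rolls):
        frequencies[rng.randrange(1, 7) - 1] += 1
    return [frequency / number_of_rolls for frequency in frequencies]


# More rolls should generally shrink the largest gap from 1/6.
for number_of_rolls in (60, 6_000, 600_000):
    percentages = face_percentages(number_of_rolls, seed=1)
    largest_gap = max(abs(p - 1 / 6) for p in percentages)
    print(f'{number_of_rolls:>7,} rolls: largest gap from 1/6 = {largest_gap:.4%}')
```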

## How Dynamic Visualization Works 
* The Matplotlib **`animation`** module’s **`FuncAnimation`** function updates a visualization _dynamically_.
* Each **animation frame** specifies what to change during one plot update (animation goes from one frame to the next). 
* Stringing together many updates over time creates an animation. 
* This example displays an animation frame every 33 milliseconds—yielding approximately 30 (1000 / 33) frames-per-second. 

### Running `RollDieDynamic.py`
1. Access the command line in Jupyter with **File > New > Terminal**.
2. `cd /yourfolder`.
3. Execute

>```
>python RollDieDynamic.py 6000 1
>```

>* 6000 is the number of animation frames to display. 
>* 1 is the number of die rolls to summarize in each animation frame.


* To see the law of large numbers in action, increase the execution speed by rolling the die more times per animation frame: 
```
python RollDieDynamic.py 10000 600
```
*  In this case, `FuncAnimation` performs 10,000 updates, with 600 rolls per frame, for a total of 6,000,000 rolls. 

## Implementing a Dynamic Visualization 

### Importing the Matplotlib `animation` Module
* We focus primarily on the new features used in this example. 
* We import the Matplotlib `animation` module to access `FuncAnimation`.

```python 
# RollDieDynamic.py
"""Dynamically graphing frequencies of die rolls."""
from matplotlib import animation
import matplotlib.pyplot as plt
import random 
import seaborn as sns
import sys
```

### Function `update`
```python
def update(frame_number, rolls, faces, frequencies):
    """Configures bar plot contents for each animation frame."""

```
* `FuncAnimation` calls the `update` function once per animation frame. 
* This function must receive at least one argument. 
* Parameters:
    * `frame_number`—The next value from `FuncAnimation`’s `frames` argument. Though `FuncAnimation` requires the `update` function to have this parameter, we do not use it in this `update` function.
    * `rolls`—The number of die rolls per animation frame.
    * `faces`—The die face values used as labels along the graph’s _x_-axis.
    * `frequencies`—The list in which we summarize the die frequencies.


### Function `update`: Rolling the Die and Updating the `frequencies` List
* Roll the die `rolls` times and increment the appropriate `frequencies` element for each roll. 

```python
    # roll die and update frequencies
    for i in range(rolls):
        frequencies[random.randrange(1, 7) - 1] += 1 
```

### Function `update`: Configuring the Bar Plot and Text 
* The `matplotlib.pyplot` module’s **`cla`** (clear axes) function removes the existing bar plot elements before drawing new ones for the current animation frame. 
```python
    # reconfigure plot for updated die frequencies
    plt.cla()  # clear old contents of the current Axes
    axes = sns.barplot(x=faces, y=frequencies, palette='bright')  # new bars
    axes.set_title(f'Die Frequencies for {sum(frequencies):,} Rolls')
    axes.set(xlabel='Die Value', ylabel='Frequency')  
    axes.set_ylim(top=max(frequencies) * 1.10)  # scale y-axis by 10%

    # display frequency & percentage above each patch (bar)
    for bar, frequency in zip(axes.patches, frequencies):
        text_x = bar.get_x() + bar.get_width() / 2.0  
        text_y = bar.get_height() 
        text = f'{frequency:,}\n{frequency / sum(frequencies):.3%}'
        axes.text(text_x, text_y, text, ha='center', va='bottom')
```
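
The label text above relies on two format specifiers: `,` inserts thousands separators and `.3%` multiplies by 100 and shows three decimal places with a percent sign. Here is a small sketch with made-up counts (chosen for easy arithmetic, not realistic die results).

```python
frequencies = [1000, 2000, 3000, 1500, 1500, 1000]  # made-up counts
total = sum(frequencies)  # 10,000; computed once instead of once per bar

labels = [f'{frequency:,}\n{frequency / total:.3%}'
          for frequency in frequencies]

print(labels[0])  # prints 1,000 then 10.000% on the next line
print(labels[3])  # prints 1,500 then 15.000% on the next line
```

Computing the total once before the loop avoids calling `sum(frequencies)` again for every bar; the result is the same.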

### Variables Used to Configure the Graph and Maintain State
* The `sys` module’s `argv` list contains the script’s command-line arguments. 
* The `matplotlib.pyplot` module’s **`figure`** function gets a `Figure` object in which `FuncAnimation` displays the animation. This `Figure` is one of `FuncAnimation`’s required arguments. 

```python
# read command-line arguments for number of frames and rolls per frame
number_of_frames = int(sys.argv[1])  
rolls_per_frame = int(sys.argv[2])  

sns.set_style('whitegrid')  # white background with gray grid lines
figure = plt.figure('Rolling a Six-Sided Die')  # Figure for animation
values = list(range(1, 7))  # die faces for display on x-axis
frequencies = [0] * 6  # six-element list of die frequencies
```
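
The script above assumes both command-line arguments are present and valid. One possible way to guard against a missing or non-positive argument is sketched below; `read_arguments` is a hypothetical helper, not part of `RollDieDynamic.py`, and in the script it would be called with `sys.argv`.

```python
def read_arguments(argv):
    """Return (number_of_frames, rolls_per_frame) from an argv-style list."""
    if len(argv) != 3:  # script name plus two arguments
        raise SystemExit(f'Usage: python {argv[0]} <frames> <rolls-per-frame>')

    number_of_frames = int(argv[1])
    rolls_per_frame = int(argv[2])

    if number_of_frames < 1 or rolls_per_frame < 1:
        raise SystemExit('Both arguments must be positive integers.')

    return number_of_frames, rolls_per_frame


# A hand-written list stands in for sys.argv in this example.
print(read_arguments(['RollDieDynamic.py', '6000', '1']))  # (6000, 1)
```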

### Calling the `animation` Module’s `FuncAnimation` Function
* `FuncAnimation` returns an object representing the animation. 
* You _must_ store the reference to the animation; if you do _not_, Python immediately assumes the object is no longer needed (remember, no references point to it), and returns its memory to the system. 
* Because you are running the script from the command line, you also _must_ call `plt.show()`, or the window will never display.

```python
# configure and start animation that calls function update
die_animation = animation.FuncAnimation(
    figure, update, repeat=False, frames=number_of_frames, interval=33,
    fargs=(rolls_per_frame, values, frequencies))

plt.show()  # display window

```

`FuncAnimation` has two required arguments:
* `figure`—the `Figure` object in which to display the animation, and
* `update`—the function to call once per animation frame.

* Optional keyword arguments:
    * **`repeat`**—If `True` (the default), when the animation completes it restarts from the beginning.
    * **`frames`**—The total number of animation frames, which controls how many times `FuncAnimation` calls `update`. 
    * **`interval`**—The number of milliseconds between animation frames (the default is 200).
    * **`fargs`** (short for “function arguments”)—A tuple of other arguments to pass to the function you specified in `FuncAnimation`’s second argument. 
* [`FuncAnimation`’s other optional arguments](https://matplotlib.org/api/_as_gen/matplotlib.animation.FuncAnimation.html)
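
To make the roles of `frames` and `fargs` concrete, here is a plotting-free sketch. The `run_frames` function is a hypothetical stand-in for the animation loop (it is not how Matplotlib is implemented, and it ignores `interval` and `repeat`); it only shows that the update function is called once per frame with the frame number followed by the unpacked `fargs` tuple.

```python
import random


def update(frame_number, rolls, faces, frequencies):
    """Roll the die `rolls` times and update the tallies (no plotting)."""
    for _ in range(rolls):
        frequencies[random.randrange(1, 7) - 1] += 1


def run_frames(func, frames, fargs):
    """Hypothetical stand-in for the animation loop: one call per frame."""
    for frame_number in range(frames):
        func(frame_number, *fargs)  # frame number first, then fargs


faces = list(range(1, 7))
frequencies = [0] * 6
run_frames(update, frames=10, fargs=(600, faces, frequencies))

print(sum(frequencies))  # 6000 (10 frames * 600 rolls per frame)
```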


# `array` Operators
* `array` operators perform operations on **entire `array`s**. 
* We can perform arithmetic **between `array`s and scalar numeric values**, and **between `array`s of the same shape**.

In [None]:
import numpy as np

In [None]:
numbers = np.arange(1, 6)
print(numbers)
print(numbers*2)
print(numbers**3)

In [None]:
numbers += 10  # We can also use augmented assignment operators with arrays.
print(numbers)

### Broadcasting 
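* Normally, arithmetic operations require as operands two `array`s of the **same size and shape**. 
* When one operand is a scalar, NumPy performs the calculation as if the scalar were an `array` of the same shape as the other operand: **`numbers * 2`** is equivalent to **`numbers * [2, 2, 2, 2, 2]`** for a 5-element array.
* Applying the scalar to every element in this way is called **broadcasting**. 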
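
The equivalence between multiplying by a scalar and multiplying by a same-shape array of that scalar can be checked directly. In this sketch, `twos` is an illustrative name for an array built with `np.full`.

```python
import numpy as np

numbers = np.arange(1, 6)         # 1 through 5
twos = np.full(numbers.shape, 2)  # same shape as numbers, every element 2

scalar_result = numbers * 2
array_result = numbers * twos

print(scalar_result.tolist())                       # [2, 4, 6, 8, 10]
print(np.array_equal(scalar_result, array_result))  # True
```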


In [None]:
numbers2 = np.linspace(1.1, 5.5, 5)
numbers3 = [10,10,10,10,10]
print(numbers2*10)
print(numbers2*numbers3)

## Comparing `array`s
* We can compare `array`s with individual values and with other `array`s
* Comparisons are performed **element-wise**
* The result is an `array` of Boolean values in which each element’s `True` or `False` value indicates the comparison result

In [None]:
print(numbers)
print(numbers2)
print(numbers>=13)
print(numbers2>=13)


In [None]:
print(numbers2 < numbers)
print(numbers == numbers)
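
To see the Boolean results without relying on earlier cells, the sketch below rebuilds the two arrays with the values they hold at this point (`numbers` is 11 through 15 after the `+= 10` above) and converts the comparison results to plain lists.

```python
import numpy as np

numbers = np.arange(11, 16)          # 11, 12, 13, 14, 15
numbers2 = np.linspace(1.1, 5.5, 5)  # 1.1, 2.2, 3.3, 4.4, 5.5

at_least_13 = numbers >= 13     # compare each element with a scalar
smaller = numbers2 < numbers    # compare element by element

print(at_least_13.dtype)     # bool
print(at_least_13.tolist())  # [False, False, True, True, True]
print(smaller.tolist())      # [True, True, True, True, True]
```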

## NumPy Calculation Methods
* Many calculations are applicable regardless of the number of elements, and thus can be used on arrays of any shape.
* Consider an `array` representing four students’ grades on three exams: we can compute various information to summarize the data (such calculations are known as **reductions** in functional-style programming).
* The most common methods are **`sum`**, **`min`**, **`max`**, **`mean`**, **`std`** (standard deviation) and **`var`** (variance)
* [Other NumPy `array` Calculation Methods](https://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html)

In [None]:
grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])
print(grades.sum())
print(grades.min())
print(grades.mean())
print(grades.std())


## Calculations by Row or Column
* You can perform calculations by column or row (or other dimensions in arrays with more than two dimensions)
* Each 2D+ array has [**one axis per dimension**](https://docs.scipy.org/doc/numpy-1.16.0/glossary.html)
* In a 2D array, **`axis=0`** indicates calculations should be **column-by-column**, while **`axis=1`** indicates calculations should be **row-by-row** 

In [None]:
grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])
print(grades)
print(grades.max(axis=0)) #highest grade per exam
print(grades.mean(axis=1)) #student averages
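
One way to keep the two axes straight is to look at the shape of each result: reducing along `axis=0` of a 4-by-3 array leaves one value per column, and reducing along `axis=1` leaves one value per row. The variable names in this sketch are illustrative.

```python
import numpy as np

grades = np.array([[87, 96, 70], [100, 87, 90],
                   [94, 77, 90], [100, 81, 82]])

exam_means = grades.mean(axis=0)     # one value per column (exam)
student_totals = grades.sum(axis=1)  # one value per row (student)

print(grades.shape)             # (4, 3)
print(exam_means.shape)         # (3,)
print(student_totals.shape)     # (4,)
print(exam_means.tolist())      # [95.25, 85.25, 83.0]
print(student_totals.tolist())  # [253, 277, 261, 263]
```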

# Universal Functions
* Standalone [**universal functions** (**ufuncs**)](https://docs.scipy.org/doc/numpy/reference/ufuncs.html) perform **element-wise operations** using one or two `array` or array-like arguments (like lists)
* Each returns a **new `array`** containing the results
* Some ufuncs are called implicitly when you use `array` operators like `+` and `*`

In [None]:
numbers = np.array([1, 4, 9, 16, 25, 36])
print(np.sqrt(numbers))
print(np.log2(numbers))
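
The link between operators and ufuncs can be checked directly: for NumPy arrays, `+` and `*` give the same results as `np.add` and `np.multiply`. The sketch below also passes a plain list to `np.sqrt` to show an array-like argument; the names `a` and `b` are illustrative.

```python
import numpy as np

a = np.array([1, 4, 9])
b = np.array([10, 20, 30])

sums = np.add(a, b)           # same result as a + b
products = np.multiply(a, b)  # same result as a * b
roots = np.sqrt([1, 4, 9])    # a list is converted to an array first

print(np.array_equal(sums, a + b))      # True
print(np.array_equal(products, a * b))  # True
print(roots.tolist())                   # [1.0, 2.0, 3.0]
```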
