### NumPy First Steps
### University of Virginia
### Programming for Data Science
### Last Updated: March 22, 2021
---  

### PREREQUISITES
- Part I
  - import
  - functions
  - for ... in
- Part II
  - lists (for part II)
  - hist (for part II)
  - indexing
- Part III
  - [it's like Luke going into the cave on Dagobah](https://youtu.be/wTXV59f6m0g?t=70)

### SOURCES 
- **Python for Data Analysis, Chapter 4 (be sure to read this)**
- https://numpy.org/
- https://en.wikipedia.org/wiki/NumPy
- https://www.scipy.org/
- https://en.wikipedia.org/wiki/SciPy
- https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
- https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html
- https://numpy.org/doc/stable/reference/random/index.html
- https://docs.scipy.org/doc/numpy-1.15.1/reference/generated/numpy.random.randint.html




### OBJECTIVES
- Take your first steps with numpy
 


### CONCEPTS

- The numpy package contains useful functions for math operations
- The ndarray is the workhorse of the package


---

### numpy

In this lesson we will explore the Python package numpy. This package is super powerful, but we will withhold all of that for now. Our focus will be on importing and calling a couple of functions to get oriented.

### Motivation

I need a random number! I heard numpy can do that!  

Be sure numpy is installed, or you'll see the error below:

In [None]:
import numpy as np
np.random.randint(1,7)

In [None]:
# this is the remedy for the above error (NB: "module not found ---> pip install or conda install")
!pip install numpy

After installing the numpy package, everything is business as usual:

In [None]:
import numpy as np
np.random.randint(1,7)

Learn what this function does:

In [None]:
help(np.random.randint)
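
To connect the help text to the dice example, here is a small sketch of how the bounds behave: the low value is included and the high value is excluded, which is why a six-sided die is written as randint(1, 7). The sample size of 10,000 is an arbitrary choice for illustration.

```python
import numpy as np

# Draw many die rolls; 10_000 is an arbitrary sample size chosen for illustration.
rolls = np.random.randint(1, 7, 10_000)

# The lower bound (1) is included, the upper bound (7) is excluded.
print(rolls.min(), rolls.max())

# np.unique lists the distinct values that actually occurred.
print(np.unique(rolls))
```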

### Motivation - (musical, click image for youtube)

[![Dazed and Confused](http://i3.ytimg.com/vi/yO2n7QoyieM/hqdefault.jpg)](https://www.youtube.com/watch?v=yO2n7QoyieM "Dazed and Confused")

---

## Lesson: Part I - reading
As always we start by reading the manual. Head over to https://numpy.org/, and complete the following checklist:
* skim the page to see what's there, note any questions you have
* go to the section with case studies, pick one, and read about it, note any questions you have
* try to answer your questions
  * reading the manual
  * checking message boards
  * check with your fellow students
  * reach out to the professor during office hours
* continue until you have no questions
* explain to your grandmother what numpy is (or someone else as long as they aren't in this program, I'm assuming your grandmother isn't in our program)
  * if your grandmother happens to be someone as cool as [Grace Hopper](https://en.wikipedia.org/wiki/Grace_Hopper) then please invite her to come and speak
  * explaining something to your grandmother is a technique often attributed to Albert Einstein; it shows you understand what you are talking about, because we often have more difficulty lying to our grandmothers than to ourselves
  
* NB: we are going to do this process a lot: read, question, answer, explain. Learn it, love it.

---

# Lab Exercises
TRY FOR YOURSELF (UNGRADED EXERCISES)

### Exercise 1: first steps in generating data **(beginner)**

Prompt: create 10 random integers ranging from 1 to 6 inclusive

In [None]:
# your solution here


In [None]:
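# one solution (NB: a notebook cell displays only the last expression, so wrap each call in print() to see all ten)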
import numpy as np
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)
np.random.randint(1,7)

In [None]:
# another solution
import numpy as np
for i in range(10): print(np.random.randint(1,7)

In [None]:
# another solution, again, but without the syntax error
import numpy as np
for i in range(10): print(np.random.randint(1,7))

In [None]:
# another solution
import numpy as np
np.random.randint(1,7,10)

#### end of exercise wrap up
* Explain to your grandmother the differences in the example solutions.
* Think about when you would use each one.
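
One way to see the difference between the solutions is to look at what you end up holding afterwards. This is a minimal sketch, and the names rolls_loop and rolls_array are made up for illustration: repeated calls give you ten separate Python integers, while the size argument gives you one array.

```python
import numpy as np

# Ten separate calls: each call returns one integer, collected here in a list.
rolls_loop = [np.random.randint(1, 7) for _ in range(10)]

# One call with a size argument: returns a single ndarray holding ten values.
rolls_array = np.random.randint(1, 7, 10)

print(type(rolls_loop), len(rolls_loop))
print(type(rolls_array), rolls_array.shape)
```

The array version is usually the one to reach for when the numbers feed into more numpy work, since the result is ready for elementwise math.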

### Exercise 2: **(intermediate)**

In this exercise we are going to explore the object that the function np.random.randint(...) returns. The array...

In [None]:
# import
import numpy as np

# first step, call our function and wrap it in the 'type' function to see what it returns
print(np.random.randint(10))
print(type(np.random.randint(10))) # this should print <class 'int'>

# now we change the input to the randint function
print()
print(np.random.randint(1,21,5))
print(type(np.random.randint(1,21,5)))

OK, let's break that down.

The first call produces a single number, and we learn it is an 'int', aka integer.

The second call produces something that looks like a list, but we see that it is actually a 'numpy.ndarray'.

Read the [documentation](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html), write down questions, get those questions answered, explain what an ndarray is to your grandmother.
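
While reading the documentation, it can help to poke at a real array. The sketch below prints a few of the attributes described there; the variable name a is arbitrary, and the exact dtype you see may depend on your platform.

```python
import numpy as np

a = np.random.randint(1, 21, 5)

print(type(a))   # the class of the object
print(a.ndim)    # number of dimensions
print(a.shape)   # length along each dimension
print(a.size)    # total number of elements
print(a.dtype)   # element type, shared by every element in the array

# Converting back to a plain Python list is possible when needed.
print(a.tolist())
```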


**Prompt:** Make a [histogram](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.hist.html) of many randint draws to confirm that the values are actually uniformly distributed random integers.

In [None]:
#!pip install matplotlib

In [None]:
import matplotlib
import matplotlib.pyplot as plt

%matplotlib inline

#[INSERT CALL TO HISTOGRAM HERE]

In [None]:
# SOLUTION

import matplotlib
import matplotlib.pyplot as plt

%matplotlib inline

plt.hist(np.random.randint(1,21,100000),bins=20)

#### end of exercise wrap up
* Explain to your grandmother how your plot demonstrates the randint function is doing its job.
* Try checking a different distribution like np.random.normal
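
If you want a numerical check to go with the plot, here is one possible sketch. It counts how often each integer shows up and looks at the mean and standard deviation of normal draws. The sample size of 100,000 matches the histogram solution; the variable names are made up for illustration.

```python
import numpy as np

# Uniform integers from 1 to 20 inclusive.
draws = np.random.randint(1, 21, 100_000)

# np.bincount counts occurrences of each non-negative integer.
# Index 0 never occurs here, so it is dropped with [1:].
counts = np.bincount(draws, minlength=21)[1:]
print(counts)  # each of the 20 counts should be near 100_000 / 20 = 5_000

# Normal draws: loc is the mean, scale is the standard deviation.
normal_draws = np.random.normal(loc=0.0, scale=1.0, size=100_000)
print(normal_draws.mean(), normal_draws.std())  # should be near 0 and 1
```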

### Exercise 3: More NumPy ndarray Operations


In [None]:
# generate matrix of random normals

x = np.random.randn(2, 3)
x

In [None]:
help(np.random.randn)

In [None]:
x

In [None]:
# scale the matrix - notice how this operates elementwise

x * 2

In [None]:
# addition

x + x

In [None]:
# reciprocal

1 / x
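
To see why "elementwise" matters, compare an array with a plain Python list. This is a small sketch using fixed values (chosen arbitrarily) so the results are easy to read.

```python
import numpy as np

values = [1.0, 2.0, 4.0]
arr = np.array(values)

print(values * 2)       # list repetition: [1.0, 2.0, 4.0, 1.0, 2.0, 4.0]
print(arr * 2)          # elementwise scaling
print(values + values)  # list concatenation
print(arr + arr)        # elementwise addition
print(1 / arr)          # elementwise reciprocal; 1 / values would raise a TypeError
```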

#### Many ways to create new arrays

In [None]:
x=np.zeros(5)
print(x)

In [None]:
x.shape

In [None]:
y=np.zeros([5,1])

In [None]:
y

In [None]:
y.shape

In [None]:
z=np.zeros([1,5])
print(z)

In [None]:
z.shape

In [None]:
q=np.zeros((2,3))
print(q)

In [None]:
q.shape

In [None]:
np.ones((2,3))

In [None]:
np.identity(4)
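
The cells above differ mostly in shape. Here is a short sketch that puts the three zeros shapes side by side and adds a few other constructors you are likely to meet; the particular arguments are arbitrary examples.

```python
import numpy as np

print(np.zeros(5).shape)       # 1-D array: (5,)
print(np.zeros((5, 1)).shape)  # 2-D column: (5, 1)
print(np.zeros((1, 5)).shape)  # 2-D row: (1, 5)

# A few other constructors
print(np.array([1, 2, 3]))       # from a Python list
print(np.arange(0, 10, 2))       # like range: start, stop (excluded), step
print(np.linspace(0.0, 1.0, 5))  # 5 evenly spaced points, both ends included
print(np.full((2, 3), 7))        # every element set to the same value
```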

#### Indexing and Slicing

In [None]:
z = np.random.randn(5)
z

In [None]:
# select from index 1 up to, but not including, index 4
z[1:4]

In [None]:
# does this make sense?
z[1:-1]

In [None]:
# boolean indexing
z[z>0.15]

In [None]:
# assignment
z[1] = 3
z

In [None]:
# 2D
w = np.random.randn(3,3)
w

In [None]:
# subset rows and columns
w[1:, :2]
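
Random values make it hard to check your reading of a slice, so here is the same set of operations on fixed values. This is only a sketch; the arrays built with np.arange are stand-ins for the random ones above.

```python
import numpy as np

z = np.arange(10, 15)  # the values 10, 11, 12, 13, 14

print(z[1:4])      # indices 1, 2, 3
print(z[1:-1])     # -1 refers to the last index, and the stop is excluded
print(z > 12)      # a boolean mask, one True/False per element
print(z[z > 12])   # keeps only the elements where the mask is True

w = np.arange(9).reshape(3, 3)  # 3x3 array holding 0 through 8
print(w[1:, :2])   # rows 1 onward, first two columns
```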

#### TRY FOR YOURSELF
Write code to generate a new array *w2* that starts as a copy of *w* and then sets all negative values to 0.
Then print *w2* and *w*.  

Be sure to use:  
w2 = w.copy()  
Plain assignment (w2 = w) only gives the same array a second name, so w would get updated as well.

In [None]:
w2 = w.copy()
w2[w2 < 0] = 0
print('w2:\n', w2)
print('')
print('w:\n', w)
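
Here is a small sketch of why the copy matters. It uses a fixed 2x2 array instead of random values, and the names alias, view, and copied are made up for illustration.

```python
import numpy as np

w = np.array([[-1.0, 2.0], [3.0, -4.0]])  # small fixed example

alias = w          # same array, second name
view = w[0, :]     # a slice is a view onto the same data
copied = w.copy()  # independent data

copied[copied < 0] = 0  # does not touch w
print(w)

view[0] = 99.0          # changes w as well
print(w)

# Check which objects share data with w.
print(alias is w, np.shares_memory(view, w), np.shares_memory(copied, w))
```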

### Exercise 4 (OPTIONAL): let's simulate risk **(advanced)**

**Prompt:** In the classic board game Risk, players control armies in a quest for [world domination](https://www.linuxjournal.com/article/3676). This catastrophic scenario is governed by the roll of dice. In a battle, each side rolls dice and then casualties are assigned. The goal of this exercise is to simulate a battle in the game of Risk. An army of _N_ units is attacking another army of _N_ units. The question to answer is: who is more likely to win, for each value of _N_ between 5 and 50? Assume the attacker always attacks with the full number of armies and the defender defends with the full number of armies, with no retreating.

* NB: Please share your solutions on github. Collaboration is strongly encouraged.

For full details of the game, please consult: [The Classic Board Game Risk](https://en.wikipedia.org/wiki/Risk_(game))

If you would like to discuss further, please contact the author, Pete Alonzi: lpa2a@virginia.edu

##### solution

As if I were going to give a solution here; where would the fun be? :) However, I will offer some tips. I found it useful to write some functions. For example:

In [None]:
def rollDice(n):
    ''' Return a list of n integers sorted in descending order. Each integer is from 1 to 6 inclusive. '''
    x = np.random.randint(1,7,n)
    x = x.tolist()
    return sorted(x,reverse=True)

print(rollDice(3))

#### end of exercise wrap up
* Make a plot showing the probability of victory for the attacker as a function of _N_.
* Explain the plot to your grandmother.

* NB: the wrap up has nothing to do with the coding or numpy, but everything to do with solving the problem in the prompt.

### victory lap (click image for youtube)
[![Everything Is AWESOME](https://img.youtube.com/vi/StTqXEQ2l-Y/0.jpg)](https://www.youtube.com/watch?v=StTqXEQ2l-Y "Everything Is AWESOME")

---

## What to focus on for proficiency

To develop better practice with numpy, ask yourself the following question while you are coding.


* **Is there a numpy routine for that?**
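
As one illustration of that question, take the earlier task of setting negative values to 0. The sketch below writes it as a hand-made loop and then with three numpy routines that give the same result; the small fixed array is a stand-in for real data.

```python
import numpy as np

w = np.array([[-1.5, 2.0], [0.5, -3.0]])  # small fixed example

# Hand-written loop over every element
w_loop = w.copy()
for i in range(w_loop.shape[0]):
    for j in range(w_loop.shape[1]):
        if w_loop[i, j] < 0:
            w_loop[i, j] = 0

# numpy routines that do the same job in one line each
w_max = np.maximum(w, 0)         # elementwise maximum against 0
w_clip = np.clip(w, 0, None)     # clip from below at 0, no upper limit
w_where = np.where(w < 0, 0, w)  # 0 where the condition holds, else the value from w

print(np.array_equal(w_loop, w_max))
print(np.array_equal(w_loop, w_clip))
print(np.array_equal(w_loop, w_where))
```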


If you are unfamiliar with the inner game focus approach, check out: [The Inner Game of Tennis by Gallwey](https://search.lib.virginia.edu/sources/uva_library/items/u619947)