# Homework 2 - Numpy Array Methods and Functions (45 pts)

* In order for automatic grading and validation to work, it is very important that you make use of the exact variable names specified in the problem statement.
* I will always place these variable names that must be used for automatic grading in **bold**
* One of the goals of this assignment is for you to explore some numpy functions on your own.
* Specifically, I will be asking you to look up how to use some functions in numpy rather than providing examples.  
* For these problems, I have hidden the tests (otherwise you would not have to research the method/function).
* To look up a function, I usually type `numpy` and the function I am interested in (e.g., `mean`) into the browser search box, and then look at the numpy docs.
* You don't have to know the exact name; a few keywords are enough to find it.
* You can also type in complete sentences like, "How do I compute mean in numpy?"

* Don't forget to import the numpy module. For this exercise, let's also import the random submodule as a separate import.

In [1]:
import numpy as np
from numpy import random

* Initialize the random number generator with a seed of your choice.  Enter any integer after the equal sign

In [2]:
rng = random.default_rng(seed = 108)
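
Why the seed matters can be sketched as follows. This is a minimal illustration, assuming two hypothetical generators `rng_a` and `rng_b` (names not used elsewhere in this assignment) built from the same seed; they should produce the same sequence of numbers.

```python
import numpy as np

# Two generators created from the same seed (108 is only an example value).
rng_a = np.random.default_rng(seed=108)
rng_b = np.random.default_rng(seed=108)

draws_a = rng_a.uniform(0, 1, 3)
draws_b = rng_b.uniform(0, 1, 3)

# Same seed and same sequence of calls: the draws are identical.
same_seed_matches = np.array_equal(draws_a, draws_b)
print(same_seed_matches)
```

This is why a fixed seed makes the graded results reproducible from run to run.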

1. Generate a vector **v** with 20 random floating point numbers ranging from 0 to 100. 

In [3]:
### BEGIN SOLUTION
v = rng.uniform(0,100,20)
### END SOLUTION

In [4]:
assert v.size == 20
assert v.dtype == 'float64'

2. Create a variable **max_v** which contains the maximum value in **v**, and a variable **index_max_v** which contains the index of the element of **v** with the maximum value.

In [5]:
### BEGIN SOLUTION
max_v = np.max(v)
index_max_v = np.argmax(v)
### END SOLUTION

In [6]:
assert v[index_max_v] == max_v

3. Generate a matrix **M** with 5 rows and 3 columns containing random integers with values ranging from -10 to 10. Make sure your range includes -10 and 10.    

In [7]:
### BEGIN SOLUTION
M = rng.integers(-10,11,(5,3))
### END SOLUTION

In [8]:
assert M.shape[0] == 5
assert M.shape[1] == 3
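
The upper bound of `integers` is exclusive by default, which is why the solution passes 11 to include 10. As a sketch of an alternative, assuming a hypothetical generator `rng_demo` with an arbitrary seed, the `endpoint=True` argument makes the upper bound inclusive instead.

```python
import numpy as np

rng_demo = np.random.default_rng(seed=0)  # hypothetical generator, arbitrary seed

# Default behaviour: `high` is exclusive, so 11 is needed to allow the value 10.
a = rng_demo.integers(-10, 11, size=(5, 3))

# With endpoint=True, `high` is inclusive, so 10 itself can be drawn.
b = rng_demo.integers(-10, 10, size=(5, 3), endpoint=True)

print(a.shape, b.shape)
```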

4. Create a variable **M_colmin** which is an array that contains the minimum value of each column of M.  Create a variable **M2_min** which contains the minimum value of the 2nd column of M 

In [9]:
### BEGIN SOLUTION
M_colmin = np.min(M,axis=0)
M2_min = np.min(M[:,1])
print(M_colmin)
#alternatively
#M2_min = min(M[:,1])
### END SOLUTION

[-8 -9 -8]


In [10]:
assert M_colmin.size == 3
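
The effect of the `axis` argument can be seen on a small hand-written array. This is only an illustration; the array `A` below is made up and is not part of the graded variables.

```python
import numpy as np

# A small example matrix with 2 rows and 3 columns.
A = np.array([[3, -1, 7],
              [0,  5, -2]])

col_min = np.min(A, axis=0)   # collapse the rows: one value per column
row_min = np.min(A, axis=1)   # collapse the columns: one value per row
overall_min = np.min(A)       # no axis: minimum over the whole array

print(col_min)      # [ 0 -1 -2]
print(row_min)      # [-1 -2]
print(overall_min)  # -2
```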

### Computing Summary Statistics using numpy

One of the goals of this assignment is for you to explore some numpy functions on your own.  Specifically, I will be asking you to look up, either in the numpy documentation or by simply searching the web, how to use some numpy functions that compute simple summary statistics of data.

* `mean`, compute the mean or average of data array 
* `median`, compute the middle value of a data array  
* `percentile`, compute the value of the q-th percentile of an array  

I am not going to provide any instruction on how to use them here.  The challenge for you is to figure it out
for yourself (you can ask for help after you have tried, if it doesn't work).  The internet is your friend.


5.  Make a random number generator object called `rng` with seed = 77.  This is critical or the tests will fail!

In [20]:
### BEGIN SOLUTION
rng = random.default_rng(seed = 77)
### END SOLUTION

In [21]:
assert rng.integers(-9,9,1) == -8
### BEGIN HIDDEN TESTS
rng = random.default_rng(seed = 77)
### END HIDDEN TESTS

6.  Generate a vector **u** containing 50 random numbers from a uniform distribution ranging from 0 to 100.  Compute variables called **mean_u** and **median_u** that contain the mean and median of u using numpy functions `mean` and `median`. 

In [22]:
### BEGIN SOLUTION
u = rng.uniform(0,100,50)
mean_u = np.mean(u)
median_u = np.median(u) 
### END SOLUTION

In [23]:
assert np.size(u) == 50
### BEGIN HIDDEN TESTS
assert mean_u == np.mean(u)
assert median_u == np.median(u) 
### END HIDDEN TESTS
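
To see how the mean and median can differ, here is a small sketch using a made-up array `data` (not one of the graded variables) that contains a single large outlier.

```python
import numpy as np

# Hypothetical data with one outlier.
data = np.array([1, 2, 3, 4, 100])

data_mean = np.mean(data)      # pulled upward by the outlier
data_median = np.median(data)  # the middle value of the sorted data

print(data_mean)    # 22.0
print(data_median)  # 3.0
```

For uniformly distributed data such as **u**, the two values should be much closer to each other.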

7.  Generate a vector **v** containing 1000 random floating point numbers from a uniform distribution ranging from 0 to 100.  The objective of this exercise is to compute the 10th, 25th, 50th, 75th, and 90th percentiles of **v** using `percentile`. The results should be placed in either an array or a list of length 5 called **v_percentiles**.  So, if I look at v_percentiles[0] it should be the 10th percentile, v_percentiles[1] should be the 25th percentile, etc. The answers should make sense to you: each should be close to the corresponding fraction of 100 (for example, the 25th percentile should be near 25).


In [24]:
### BEGIN SOLUTION
v = rng.uniform(0,100,1000)
prct = np.array([10,25,50,75,90])
v_percentiles = np.percentile(v,prct)
#Alternate Solution 
#v_percentiles = list() 
#v_percentiles.append(np.percentile(v,10))
#v_percentiles.append(np.percentile(v,25))
#v_percentiles.append(np.percentile(v,50))
#v_percentiles.append(np.percentile(v,75))
#v_percentiles.append(np.percentile(v,90))
### END SOLUTION

In [25]:
assert np.size(v) == 1000
### BEGIN HIDDEN TESTS
assert np.size(v_percentiles) == 5
### END HIDDEN TESTS
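
As a sanity check on what `percentile` returns, the sketch below uses a made-up array `grid` holding the integers 0 through 100, where the percentiles are easy to predict. It assumes NumPy's default (linear interpolation) method.

```python
import numpy as np

# Evenly spaced values 0, 1, ..., 100 (hypothetical example data).
grid = np.arange(101)
q = [10, 25, 50, 75, 90]

# Passing a list of percentiles returns one result per entry.
grid_percentiles = np.percentile(grid, q)

# The 50th percentile is the median.
median_matches = np.isclose(grid_percentiles[2], np.median(grid))

print(grid_percentiles)  # approximately 10, 25, 50, 75, 90
print(median_matches)
```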

8.  Generate a matrix **dice** containing 100 random integers ranging from 1 to 6 for each of 9 dice.  There should be 100 rows and 9 columns, mimicking 100 rolls of 9 dice.  Compute the mean and median for each die (column) in variables named **dice_mean** and **dice_median**. Hint: you need to specify the *axis*, as explained in the documentation for these functions.

In [26]:
### BEGIN SOLUTION
dice = rng.integers(1,7,(100,9))
dice_mean = np.mean(dice,axis =0)
dice_median = np.median(dice,axis =0)
### END SOLUTION

In [27]:
assert np.size(dice_mean) == 9 
### BEGIN HIDDEN TESTS
assert np.all(dice_mean == np.mean(dice,axis =0))
assert np.all(dice_median == np.median(dice,axis =0))
### END HIDDEN TESTS

9. Using matrix **dice** generated in the problem above, compute the mean and median for each roll (row) of 9 dice as **roll_mean** and **roll_median**.   

In [28]:
### BEGIN SOLUTION
roll_mean = np.mean(dice,axis =1)
roll_median = np.median(dice,axis =1)
### END SOLUTION

In [29]:
assert np.size(roll_mean) == 100 
### BEGIN HIDDEN TESTS
assert np.all(roll_mean == np.mean(dice,axis =1))
assert np.all(roll_median == np.median(dice,axis =1))
### END HIDDEN TESTS

10. Create an array **v** containing 20 random integers between 10 and 99.  Find the index of the largest element of **v** and call it **imax**.  Replace the largest element of **v** with the value -1000.    

In [30]:
### BEGIN SOLUTION
v = rng.integers(10,100,20)
imax = np.argmax(v)
v[imax] = -1000
### END SOLUTION

In [31]:
assert imax >= 0
### BEGIN HIDDEN TESTS
assert np.min(v) == -1000
assert np.argmin(v) == imax
### END HIDDEN TESTS


11.  Create a list **x** containing the following numbers: 6, -5, -3.5, 10, 21.  Convert the list **x** into a numpy array **x**.  Sort the array **x** into ascending (increasing) order.  The sorted result should be stored in the array **x**. (Remember that `=` means: place the result in the variable to the left of the equals sign.)

In [32]:
### BEGIN SOLUTION
x = [6,-5,-3.5,10,21]
x = np.array(x)
x = np.sort(x)
### END SOLUTION

In [33]:
assert x.size == 5
assert x[-1] == 21
assert x[0] == -5

12. The arrays **height** and **weight** provide the height (in inches) and weight (in lbs) of 5 males.  Sort **height** in increasing order and sort **weight** in the same order that was used to sort **height**, so that each subject is in the same position in both arrays.  The resulting sorted arrays should be stored in **height** and **weight**.

In [34]:
height = np.array([70,60,66,77,65])
weight = np.array([160,120,150,190,165])
### BEGIN SOLUTION 
horder = np.argsort(height)
height = np.sort(height)
weight = weight[horder]
### END SOLUTION

In [35]:
assert height[0] < height[-1]

In [36]:
assert weight[1] == 165
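
The idea of sorting one array by the order of another can be sketched with `argsort` on made-up data. The arrays `ages` and `scores` below are hypothetical and unrelated to the graded variables.

```python
import numpy as np

# Hypothetical paired data: ages[i] and scores[i] belong to the same person.
ages = np.array([30, 22, 41])
scores = np.array([88, 95, 70])

# argsort returns the indices that would sort `ages`.
order = np.argsort(ages)

# Indexing both arrays with the same order keeps the pairs aligned.
ages_sorted = ages[order]
scores_sorted = scores[order]

print(order)          # [1 0 2]
print(ages_sorted)    # [22 30 41]
print(scores_sorted)  # [95 88 70]
```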

13. Create a function called **logn** which computes the log of a number **x** with any base **n**.  
Recall that you can compute the log with any base using the base 10 log as 

$\log_n(x) = \frac{\log_{10}(x)}{\log_{10}(n)}$

The function **logn** should have input arguments **x** and **n** and should return the value of the logarithm. The base 10 log function can be accessed as `np.log10`

In [37]:
### BEGIN SOLUTION
def logn(x,n):
    log_n = np.log10(x)/np.log10(n)
    return log_n
### END SOLUTION

In [38]:
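# Compare with np.isclose, since floating point division may not give an exact integer
log_n = logn(4,2)
assert np.isclose(log_n, 2)
log_n = logn(27,3)
assert np.isclose(log_n, 3)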
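
A note on comparing floating point results: exact equality with `==` can be fragile, because rounding may leave a result a tiny distance from the value you expect. The sketch below restates `logn` so that it runs on its own, checks it against the natural log version of the same change-of-base formula, and uses `np.isclose` for the comparison. The inputs 1000 and 7 are arbitrary.

```python
import numpy as np

def logn(x, n):
    """Log of x in base n, using the change-of-base formula with log10."""
    return np.log10(x) / np.log10(n)

# A classic example of rounding: this exact comparison is False.
exact_equal = (0.1 + 0.2) == 0.3
# np.isclose compares within a small tolerance instead.
close_enough = np.isclose(0.1 + 0.2, 0.3)

# The change-of-base formula works with any base, e.g. the natural log.
value = logn(1000, 7)
reference = np.log(1000) / np.log(7)
agrees = np.isclose(value, reference)

print(exact_equal, close_enough, agrees)
```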

14.  Write a function called **circle_area** that computes the area of a circle given its radius **r**, using the formula
$$ A = \pi r^2 $$
The function should have one input argument (**r**) and return the area **A**.
The input argument **r** should take a default value of 1.
The value of pi can be obtained from `np.pi`.
After writing and executing the definition of the function, test that it works with the three tests below.

In [39]:
### BEGIN SOLUTION
def circle_area(r=1):
    A = np.pi*r**2
    return A
### END SOLUTION

In [40]:
r = 2
Area  = circle_area(r) 
assert Area == np.pi*r**2

In [41]:
Area = circle_area()
assert Area == np.pi*1**2 

In [42]:
Area = circle_area(r=3)
assert Area == np.pi*3**2

15.  Write a function called **cylinder_area** that computes the surface area of a cylinder given two input arguments, radius (**r**) and height (**h**).

The surface area of the cylinder is the area of the top and the bottom (both circles of radius **r**) added to the area of the side of the cylinder.

The area of the side of the cylinder is given as $2 \pi rh$ where **h** is the height of the cylinder 

Your new function **cylinder_area** should make use of the function **circle_area** to get the area of the top and bottom of the cylinder.  

In [43]:
### BEGIN SOLUTION
def cylinder_area(r,h):
    A = 2*circle_area(r)+2*np.pi*r*h
    return A
### END SOLUTION

In [44]:
assert np.isclose(cylinder_area(r=2,h=1), 12*np.pi)
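
As a worked check of the value used in the test, assuming $r = 2$ and $h = 1$ as in the assertion, the two circles plus the side give

$$ A = 2\pi r^2 + 2\pi r h = 2\pi (2)^2 + 2\pi (2)(1) = 8\pi + 4\pi = 12\pi $$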