### Numpy basics

1\. Find the row, column and overall means for the following matrix:

```python
m = np.arange(12).reshape((3,4))
```

In [5]:
import numpy as np
m = np.arange(12).reshape((3,4))
rows = [np.mean(x) for x in m]
cols = [np.mean(x) for x in m.T]
overall = np.mean(m.flatten())
print('row mean:', rows)
print('column mean:', cols)
print('overall mean:', overall)

row mean: [1.5, 5.5, 9.5]
column mean: [4.0, 5.0, 6.0, 7.0]
overall mean: 5.5
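
The same means can also be computed without Python-level loops by passing the `axis` argument to `mean`. A minimal sketch, equivalent to the list comprehensions above (the names `row_means` and `col_means` are only illustrative):

```python
import numpy as np

m = np.arange(12).reshape((3, 4))

row_means = m.mean(axis=1)   # average over the columns within each row
col_means = m.mean(axis=0)   # average over the rows within each column
overall = m.mean()           # average of all 12 entries

print('row mean:', row_means)
print('column mean:', col_means)
print('overall mean:', overall)
```

Here the results are NumPy arrays rather than Python lists, so they print without commas.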


2\. Find the outer product of the following two vectors:

```python
u = np.array([1,3,5,7])
v = np.array([2,4,6,8])
```

Do this in the following ways:

   * Using the function outer in numpy
   * Using a nested for loop or list comprehension
   * Using numpy broadcasting operations


In [12]:
u = np.array([1,3,5,7])
v = np.array([2,4,6,8])
print(np.outer(u,v),'\n')

ep1 = []
for i in u:
    ep1.append([i*j for j in v])
print(np.array(ep1),'\n')

ep2 = []
for i in u:
    ep2.append(i*v)
print(np.array(ep2))

[[ 2  4  6  8]
 [ 6 12 18 24]
 [10 20 30 40]
 [14 28 42 56]] 

[[ 2  4  6  8]
 [ 6 12 18 24]
 [10 20 30 40]
 [14 28 42 56]] 

[[ 2  4  6  8]
 [ 6 12 18 24]
 [10 20 30 40]
 [14 28 42 56]]
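
The third variant above still loops over `u` in Python. A loop-free sketch of the broadcasting approach gives `u` a trailing axis, so that a (4, 1) array is multiplied by a (4,) array (the name `outer_bc` is illustrative):

```python
import numpy as np

u = np.array([1, 3, 5, 7])
v = np.array([2, 4, 6, 8])

# u[:, np.newaxis] has shape (4, 1); v has shape (4,) and is treated as (1, 4).
# Broadcasting stretches both to (4, 4), so outer_bc[i, j] = u[i] * v[j].
outer_bc = u[:, np.newaxis] * v
print(outer_bc)
```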


3\. Create a 10 by 6 matrix of random uniform numbers. Set every row that contains any entry less than 0.1 to zero.

Hint: Use the following numpy functions - np.random.random, np.any as well as Boolean indexing and the axis argument.

In [24]:
from numpy import random as npr
m = npr.rand(10,6)
print('matrix:\n',m)
mask = m < 0.1  # elementwise comparison; same result as a row-by-row list comprehension
print('mask:\n',mask)
print('\n', np.any(mask,1))

matrix:
 [[0.02763658 0.61800045 0.84615372 0.420171   0.30795003 0.86457292]
 [0.02645259 0.61022742 0.45997494 0.87479438 0.2477955  0.26302524]
 [0.71213374 0.04217039 0.81575721 0.02608726 0.75636381 0.89883005]
 [0.41784378 0.76542112 0.70388301 0.26740971 0.84079326 0.01747332]
 [0.85532064 0.18487919 0.03825182 0.66761144 0.19653586 0.42221113]
 [0.75396342 0.56564692 0.13121157 0.94894467 0.47148994 0.9022558 ]
 [0.07924143 0.15312197 0.19870221 0.76630206 0.34256343 0.86493057]
 [0.3412305  0.93956116 0.03679299 0.70659938 0.0544951  0.66268637]
 [0.38595269 0.89901364 0.36533272 0.78234084 0.67683899 0.22684219]
 [0.26298921 0.28772265 0.30201947 0.61636763 0.47812819 0.97302451]]
mask:
 [[ True False False False False False]
 [ True False False False False False]
 [False  True False  True False False]
 [False False False False False  True]
 [False False  True False False False]
 [False False False False False False]
 [ True False False False False False]
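 [False False  True False  True False]
 [False False False False False False]
 [False False False False False False]]

 [ True  True  True  True  True False  True  True False False]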
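
The cell above builds the mask but does not yet set the selected rows to zero. One possible completion, assuming a fresh random matrix (so the values differ from the output shown) and using `np.random.random` as the hint suggests; the name `rows_to_zero` is illustrative:

```python
import numpy as np

m = np.random.random((10, 6))      # uniform samples in [0, 1)

# True for each row that contains at least one entry below 0.1
rows_to_zero = np.any(m < 0.1, axis=1)

m[rows_to_zero] = 0                # Boolean indexing selects whole rows
print(rows_to_zero)
print(m)
```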

4\. Use np.linspace to create an array of 100 numbers between 0 and 2π (inclusive).

  * Extract every 10th element using slice notation
  * Reverse the array using slice notation
  * Extract elements where the absolute difference between the sine and cosine functions evaluated at that element is less than 0.1
  * Make a plot showing the sin and cos functions and indicate where they are close
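
A sketch of the numerical parts of this exercise, under the assumption that "close" means the mask described in the third bullet. The plot is left out; `x_close` holds the points one would mark on it. Variable names are illustrative.

```python
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)   # endpoint=True by default, so 2π is included

every_10th = x[::10]                 # elements 0, 10, ..., 90
reversed_x = x[::-1]                 # a negative step walks the array backwards

# Boolean mask: points where sin and cos differ by less than 0.1
close = np.abs(np.sin(x) - np.cos(x)) < 0.1
x_close = x[close]
print(x_close)
```

The selected points cluster around π/4 and 5π/4, where the sine and cosine curves cross.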

5\. Create a matrix that shows the 10 by 10 multiplication table.

 * Find the trace of the matrix
 * Extract the anti-diagonal (this should be ```array([10, 18, 24, 28, 30, 30, 28, 24, 18, 10])```)
 * Extract the diagonal offset by 1 upwards (this should be ```array([ 2,  6, 12, 20, 30, 42, 56, 72, 90])```)
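
One way to approach this, sketched with broadcasting for the table and `np.diag` for the diagonals (variable names are illustrative):

```python
import numpy as np

n = np.arange(1, 11)
table = n[:, np.newaxis] * n            # table[i, j] = (i + 1) * (j + 1)

trace = np.trace(table)                 # sum of the main diagonal
anti_diag = np.diag(np.fliplr(table))   # flip the columns, then take the main diagonal
upper_diag = np.diag(table, k=1)        # first diagonal above the main one

print(trace)
print(anti_diag)
print(upper_diag)
```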

6\. Use broadcasting to create a grid of distances

Route 66 crosses the following cities in the US: Chicago, Springfield, Saint-Louis, Tulsa, Oklahoma City, Amarillo, Santa Fe, Albuquerque, Flagstaff, Los Angeles.

The corresponding positions in miles are: 0, 198, 303, 736, 871, 1175, 1475, 1544, 1913, 2448.

  * Construct a 2D grid of distances between every pair of cities along Route 66
  * Convert the distances to km (those savages...)
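
A minimal sketch of the distance grid, assuming the distance between two cities is the absolute difference of their positions along the route. The constant name `KM_PER_MILE` is an illustrative choice.

```python
import numpy as np

cities = ['Chicago', 'Springfield', 'Saint-Louis', 'Tulsa', 'Oklahoma City',
          'Amarillo', 'Santa Fe', 'Albuquerque', 'Flagstaff', 'Los Angeles']
miles = np.array([0, 198, 303, 736, 871, 1175, 1475, 1544, 1913, 2448])

# (10, 1) minus (10,) broadcasts to (10, 10): entry [i, j] is the distance
# between city i and city j along the route.
dist_miles = np.abs(miles[:, np.newaxis] - miles)

KM_PER_MILE = 1.609344   # exact, by the international definition of the mile
dist_km = dist_miles * KM_PER_MILE

print(dist_miles)
print(dist_km.round(1))
```

The grid is symmetric with zeros on the diagonal, so row `i` can be read as the distances from `cities[i]` to every other city.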

7\. Prime numbers sieve: compute the prime numbers in the 0-N (N=99 to start with) range with a sieve (mask).
  * Construct a shape (100,) boolean array, the mask
  * Identify the multiples of each number starting from 2 and set the corresponding mask elements accordingly
  * Apply the mask to obtain an array of ordered prime numbers
  * Check the performance (timeit); how does the runtime scale with N?
  * Implement the optimization suggested in the [sieve of Eratosthenes](https://en.wikipedia.org/wiki/Sieve_of_Eratosthenes)
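
The steps above can be sketched as two functions: a plain mask-based sieve, and a variant with the usual Eratosthenes shortcuts (only prime `j` up to √N, starting at `j*j`). The function names are hypothetical, and this is one possible implementation rather than a reference solution.

```python
import math
import numpy as np

def primes_basic(n):
    """Primes in [0, n]: cross out the multiples of every j from 2 to n."""
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False                  # 0 and 1 are not prime
    for j in range(2, n + 1):
        is_prime[2 * j::j] = False        # 2j, 3j, 4j, ...
    return np.flatnonzero(is_prime)       # indices where the mask is True

def primes_eratosthenes(n):
    """Same mask, but only for prime j up to sqrt(n), starting at j*j."""
    is_prime = np.ones(n + 1, dtype=bool)
    is_prime[:2] = False
    for j in range(2, math.isqrt(n) + 1):
        if is_prime[j]:                   # composites were already handled
            is_prime[j * j::j] = False    # smaller multiples are already crossed out
    return np.flatnonzero(is_prime)

print(primes_basic(99))
print(primes_eratosthenes(99))
```

In a notebook, `%timeit primes_basic(N)` and `%timeit primes_eratosthenes(N)` for a few values of `N` give a quick view of how the two versions scale.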

8\. Diffusion using random walk

Consider a simple random walk process: at each step in time, a walker jumps right or left (+1 or -1) with equal probability. The goal is to find the typical distance from the origin of a random walker after a given amount of time. 
To do that, let's simulate many walkers and create a 2D array with each walker as a row and the time evolution along the columns

  * Take 1000 walkers and let them walk for 200 steps
  * Use randint to create a 2D array of size walkers x steps with values -1 or 1
  * Build the actual walking distances for each walker (i.e. another 2D array "summing on each row")
  * Take the square of that 2D array (elementwise)
  * Compute the mean of the squared distances at each step (i.e. the mean along the columns)
  * Plot the average distances (sqrt(distance\*\*2)) as a function of time (step)
  
Did you get what you expected?
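
A sketch of the simulation, assuming the legacy `np.random.randint` interface that the exercise refers to and reading "summing on each row" as a cumulative sum. The plot is left out; `rms_dist` is the quantity one would plot against `t`.

```python
import numpy as np

n_walkers, n_steps = 1000, 200

# randint(0, 2) draws 0 or 1 (upper bound exclusive); map to -1 or +1
steps = 2 * np.random.randint(0, 2, size=(n_walkers, n_steps)) - 1

positions = np.cumsum(steps, axis=1)     # running sum along each row (one walker)
sq_dist = positions ** 2                 # elementwise square
mean_sq_dist = sq_dist.mean(axis=0)      # average over walkers at each step
rms_dist = np.sqrt(mean_sq_dist)

t = np.arange(1, n_steps + 1)
print(rms_dist[-1], np.sqrt(t[-1]))      # simulated vs. sqrt(t) at the final step
```

For an unbiased walk with unit steps, the expected squared distance after `t` steps is `t`, so `rms_dist` should follow `np.sqrt(t)` up to statistical noise.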

9\. Analyze a data file 
  * Download the population data for hares, lynxes and carrots from the beginning of the last century.
    ```python
    ! wget https://www.dropbox.com/s/3vigxoqayo389uc/populations.txt
    ```

  * Check the content by looking within the file
  * Load the data (use an appropriate numpy method) into a 2D array
  * Create arrays out of the columns, the arrays being (in order): *year*, *hares*, *lynxes*, *carrots* 
  * Plot the 3 populations over the years
  * Compute the main statistical properties of the dataset (mean, std, correlations, etc.)
  * Which species has the highest population each year?

Do you see an evident correlation here? [Studies](https://www.enr.gov.nt.ca/en/services/lynx/lynx-snowshoe-hare-cycle) suggest there is one.
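
A possible outline for the loading and statistics steps, wrapped in a hypothetical helper `analyze_populations`. It assumes the file holds whitespace-separated numeric columns in the order year, hares, lynxes, carrots, with any header line starting with `#`; check the file contents first, as the exercise suggests. Plotting is left out.

```python
import numpy as np

def analyze_populations(path):
    """Load the table and return the columns plus basic statistics.

    Assumed layout: year, hares, lynxes, carrots (one row per year).
    """
    data = np.loadtxt(path)                   # lines starting with '#' are skipped
    year, hares, lynxes, carrots = data.T     # one array per column
    populations = data[:, 1:]                 # drop the year column
    species = np.array(['hares', 'lynxes', 'carrots'])
    return {
        'year': year, 'hares': hares, 'lynxes': lynxes, 'carrots': carrots,
        'mean': populations.mean(axis=0),
        'std': populations.std(axis=0),
        'corr': np.corrcoef(populations, rowvar=False),    # 3 x 3 matrix
        'dominant': species[populations.argmax(axis=1)],   # top species per year
    }

# Usage, once populations.txt has been downloaded:
# stats = analyze_populations('populations.txt')
# print(stats['mean'], stats['std'])
# print(stats['corr'])
```

The off-diagonal entries of `stats['corr']` are the pairwise correlation coefficients, a natural starting point for the question above.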