# More numpy arrays

## Concatenating arrays in multiple dimensions
Let's take two 2D arrays:

In [None]:
import numpy as np

arr1 = np.arange(15).reshape(5,3)
arr2 = arr1 * 10
print(arr1)
print(arr2)
print(arr1.shape, arr2.shape)

By default, concatenation happens along the first dimension (i.e. axis 0):

In [None]:
conc1 = np.concatenate([arr1, arr2]) # same as passing axis=0
print(conc1)
print(conc1.shape)

To concatenate arrays along another axis, pass the `axis` parameter, which tells the function along which axis to join the arrays:

In [None]:
conc2 = np.concatenate([arr1, arr2], axis=1)
print(conc2)
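A quick way to see what the `axis` argument does is to compare the shapes of the two results. The sketch below rebuilds the same two arrays under the shorter, illustrative names `a` and `b` (they are not used elsewhere in this notebook):

```python
import numpy as np

a = np.arange(15).reshape(5, 3)
b = a * 10

rows = np.concatenate([a, b], axis=0)  # b is stacked below a
cols = np.concatenate([a, b], axis=1)  # b is placed to the right of a

print(rows.shape)  # (10, 3): the number of rows doubles
print(cols.shape)  # (5, 6): the number of columns doubles
```

Only the size of the dimension you concatenate along changes.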

Think-pair-share: which of these two operations raises an error?

In [None]:
arr15 = np.arange(15).reshape(5,3)
arr10 = np.arange(10).reshape(5,2)
print(arr15.shape, arr10.shape)
print(arr15)
print(arr10)

## This:
# np.concatenate((arr15, arr10), axis=0)
## or this:
# np.concatenate((arr15, arr10), axis=1)
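Once you have discussed it, one possible way to check your answer is to try both calls and catch the exception. This is only a sketch: the `outcome` dictionary is an illustrative name, and the code assumes a shape mismatch is reported as a `ValueError`.

```python
import numpy as np

arr15 = np.arange(15).reshape(5, 3)
arr10 = np.arange(10).reshape(5, 2)

outcome = {}
for axis in (0, 1):
    try:
        result = np.concatenate((arr15, arr10), axis=axis)
        outcome[axis] = f"ok, shape {result.shape}"
    except ValueError as err:
        # numpy explains which dimensions failed to match
        outcome[axis] = f"error: {err}"

for axis, message in outcome.items():
    print(f"axis={axis}: {message}")
```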

## Multiplying matrices of different shapes

In [None]:
mat1 = np.arange(20).reshape(2,10)
vec1 = np.arange(10)
vec2 = np.arange(2)

print(mat1)
print(vec1)
print(vec2)

For an element-wise operation between a matrix and a vector, the vector has to have the same size as the last dimension of your matrix!

In [None]:
mat1 > vec1

The same goes for other operations like addition, division, comparison, and so on.

In [None]:
# This won't work: 
# mat1 > vec2

# But this will:
print(mat1.transpose())
mat1.transpose() > vec2 # vec2 is compared with each row
                        # of the transposed matrix
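Transposing is not the only option. As a hedged alternative, you can turn `vec2` into a column of shape `(2, 1)` with `np.newaxis`, so that it lines up with the first dimension of `mat1`. The names `col`, `result` and `same` are illustrative.

```python
import numpy as np

mat1 = np.arange(20).reshape(2, 10)
vec2 = np.arange(2)

col = vec2[:, np.newaxis]  # shape (2, 1): one value per row of mat1
result = mat1 > col        # each row of mat1 is compared with its own value
print(col.shape, result.shape)

# Same answer as the transpose approach, once transposed back
same = np.array_equal(result, (mat1.transpose() > vec2).transpose())
print(same)
```

The result keeps the original `(2, 10)` layout of `mat1`, which can be easier to read than the transposed version.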

# More numpy practice 

In [None]:
from matplotlib import pyplot as plt

# Saving and loading arrays

In [None]:
energy_arr = np.linspace(3,9,15)              
spectrum_arr = 100 - energy_arr * 3

plt.scatter(energy_arr, spectrum_arr)

The next line creates a file called `myspectrum.csv` and saves the content of the arrays into it.

By zipping the two arrays I'm simply writing them into the file as columns rather than rows, which makes the file easier to understand if you open it with a text editor.

In [None]:
np.savetxt("myspectrum.csv", list(zip(energy_arr, spectrum_arr)), fmt='%.3e', delimiter=',')
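If zipping feels opaque, `np.column_stack` is another way to build the same two-column table before saving it. The sketch below only checks that both approaches give the same array; `zipped` and `stacked` are illustrative names.

```python
import numpy as np

energy_arr = np.linspace(3, 9, 15)
spectrum_arr = 100 - energy_arr * 3

zipped = np.array(list(zip(energy_arr, spectrum_arr)))
stacked = np.column_stack((energy_arr, spectrum_arr))

print(stacked.shape)                    # one row per point, two columns
print(np.array_equal(zipped, stacked))  # both approaches agree
```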

The next line reads the data from the file `myspectrum.csv` and loads it back into numpy arrays.

Setting `unpack=True` is a useful trick that allows me to save the two columns of the file separately into different numpy arrays, in this case called `specx` and `specy`.

In [None]:
specx, specy = np.loadtxt("myspectrum.csv", delimiter=',', unpack=True)

plt.scatter(specx, specy)
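Keep in mind that `fmt='%.3e'` writes only four significant digits, so the arrays you load back are rounded copies of the originals. The sketch below repeats the save and load in a temporary directory (an assumption made here so the example leaves no file behind) and compares the result with the original arrays.

```python
import os
import tempfile

import numpy as np

energy_arr = np.linspace(3, 9, 15)
spectrum_arr = 100 - energy_arr * 3

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "myspectrum.csv")
    np.savetxt(path, list(zip(energy_arr, spectrum_arr)), fmt='%.3e', delimiter=',')
    specx, specy = np.loadtxt(path, delimiter=',', unpack=True)

# Exact equality fails because of the rounding in the file...
exact = np.array_equal(specx, energy_arr)
# ...but the values agree within the precision that was written
close = (np.allclose(specx, energy_arr, rtol=1e-3)
         and np.allclose(specy, spectrum_arr, rtol=1e-3))
print(exact, close)
```

If you need more digits in the file, increase the precision in `fmt`, for example `'%.8e'`.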

# Random numbers with numpy

In [1]:
import numpy as np

random_integers = np.random.randint(400,500,15) # 15 random integers from 400 to 499
                                                # (the upper bound, 500, is excluded)
print(random_integers)

[455 437 418 465 455 437 481 478 464 483 470 435 493 464 494]


In [9]:
random_floats_1d = np.random.rand(15) # 1D array of random floating point 
                                      # numbers between 0 and 1.
                                      # Every time you re-run this cell,
                                      # you'll be creating a new set of 
                                      # pseudo-random numbers
print(random_floats_1d)

[0.58444778 0.49704002 0.22229817 0.73342475 0.09442711 0.49928806
 0.63804571 0.78439638 0.4933939  0.30147879 0.77483631 0.65216507
 0.9178812  0.30901098 0.4313561 ]


In [14]:
random_floats_2d = np.random.rand(3,6) # 2D array of random floating point 
                                       # numbers between 0 and 1
print(random_floats_2d)

[[0.92573543 0.218826   0.72487616 0.29309899 0.74711794 0.20828531]
 [0.48636809 0.14818377 0.5473694  0.57211237 0.15184377 0.04498987]
 [0.33240254 0.06183079 0.89772828 0.3446951  0.93632928 0.91019161]]
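`np.random.rand` always draws from the interval between 0 and 1, so to cover another range you can rescale the result. A minimal sketch, with `low`, `high` and `samples` as illustrative names:

```python
import numpy as np

low, high = 0.0, 2.0
# Stretch [0, 1) to the width (high - low), then shift it to start at low
samples = low + (high - low) * np.random.rand(1000)

print(samples.min() >= low, samples.max() < high)
```

The same rescaling is used in the exercise further down to place points in a 2x2 square.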


### Seeding:

To make your experiment reproducible, you may want to *seed* the random numbers you produce:

In [18]:
np.random.seed(10) # This fixes the random numbers that are produced next

random_but_predictable = np.random.rand(3,3) # If you re-run this cell, these 
                                             # values won't change
print(random_but_predictable)

[[0.77132064 0.02075195 0.63364823]
 [0.74880388 0.49850701 0.22479665]
 [0.19806286 0.76053071 0.16911084]]
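Recent versions of numpy also offer a generator object, created with `np.random.default_rng`, as an alternative to the global `np.random.seed`. This notebook does not use it, so treat the following as an optional sketch; it assumes numpy 1.17 or newer, and it will not reproduce the values printed above because it uses a different algorithm.

```python
import numpy as np

rng_a = np.random.default_rng(10)  # two independent generators,
rng_b = np.random.default_rng(10)  # both seeded with the same value

first = rng_a.random((3, 3))
second = rng_b.random((3, 3))

print(np.array_equal(first, second))  # same seed, same numbers
```

Because each generator carries its own state, seeding one part of your code does not affect random numbers drawn elsewhere.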


### Exercise with random numbers:  

Write a function that estimates the value of $\pi$ with a Monte Carlo approach:

- First, we create a set of $n$ points distributed uniformly at random inside a 2x2 square, i.e. with x- and y-coordinates between 0 and 2.
- Then we count how many of these points lie inside a circle of radius 1 centered in the middle of the square.
- Because the points are uniformly distributed, the fraction of points that lie inside the circle gives an estimate of the ratio between the area of the circle and the total area of the square:

$$
\frac{N_\mathrm{inside}}{N_\mathrm{total}} \approx \frac{A_\bigcirc}{A_\square} = \frac{\pi}{4}
$$
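The value $\pi/4$ follows from the two areas: the circle has radius 1 and the square has side 2. Solving for $\pi$ gives the quantity the function has to return:

$$
A_\bigcirc = \pi \cdot 1^2 = \pi, \qquad A_\square = 2 \cdot 2 = 4, \qquad \pi \approx 4 \, \frac{N_\mathrm{inside}}{N_\mathrm{total}}
$$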

In [19]:
def approxpi(n, seed=0):
    """Estimates an approximate value of pi with a Monte Carlo approach.
    
    Args:
        n (int): number of points to scatter
        seed (int): random seed  
        
    Returns:
        piapprox (float): estimate of the value of pi.
    """
    np.random.seed(seed)            # Ensures result is reproducible
    xarr = np.random.rand(n) * 2    # x coordinates
    yarr = np.random.rand(n) * 2    # y coordinates
    dist2arr = ((xarr - 1.0) ** 2 + # Distance (squared) of each point to the center
                (yarr - 1.0) ** 2
               )
    isincirc = dist2arr <= 1        # True for points in the circle, False for out
    incount = np.sum(isincirc)      # Number of points inside the circle 
                                    # (remember that True acts as 1, False as 0)
    piapprox = 4 * incount / n      
    return piapprox

print("With a million points, pi~", approxpi(1000000))

With a million points, pi~ 3.141688
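How good is the estimate for smaller $n$? For a Monte Carlo estimate like this one, the statistical error is expected to shrink roughly as $1/\sqrt{n}$. The sketch below prints the absolute error for increasing $n$; `estimate_pi` is a compact, hypothetical restatement of the function above so that the block runs on its own.

```python
import numpy as np

def estimate_pi(n, seed=0):
    """Compact restatement of the Monte Carlo estimate (illustrative helper)."""
    np.random.seed(seed)
    x = np.random.rand(n) * 2
    y = np.random.rand(n) * 2
    inside = (x - 1.0) ** 2 + (y - 1.0) ** 2 <= 1
    return 4 * np.sum(inside) / n

sizes = [10**k for k in range(2, 7)]  # 100 up to 1,000,000 points
errors = [abs(estimate_pi(n) - np.pi) for n in sizes]

for n, err in zip(sizes, errors):
    print(f"n = {n:>7d}   |estimate - pi| = {err:.5f}")
```

With a single seed the error does not have to decrease at every step; the $1/\sqrt{n}$ trend only shows up on average over many seeds.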
