# Sources

- [Mastering NumPy: Tips and Tricks for Efficient Numerical Computing](https://moez-62905.medium.com/mastering-numpy-tips-and-tricks-for-efficient-numerical-computing-624d44b4bebd)
- [NumPy User Guide](https://numpy.org/doc/stable/user/quickstart.html)

# The Basics

In NumPy, dimensions are called **axes**. For example, the array for the coordinates of a point in 3D space, `[1, 2, 1]`, has one axis, and that axis has length 3. `[[1., 0., 0.], [0., 1., 2.]]` has two axes, the first having length 2 and the second having length 3.


In [13]:
import numpy as np

arr = np.array([[1,0,0],[0,1,2]])
print(arr.ndim) # number of dimensions/axes
print(arr.shape) # tuple of length of each dimension
print(arr.size) # overall number of elements

2
(2, 3)
6
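
As a small illustration of what an axis means in practice, the sketch below sums the same array along each axis. The names `col_sums` and `row_sums` are chosen here for illustration only.

```python
import numpy as np

arr = np.array([[1, 0, 0], [0, 1, 2]])  # shape (2, 3)

# Summing along axis 0 collapses the 2 rows, leaving one value per column.
col_sums = arr.sum(axis=0)
# Summing along axis 1 collapses the 3 columns, leaving one value per row.
row_sums = arr.sum(axis=1)

print(col_sums)  # [1 1 2]
print(row_sums)  # [1 3]
```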


Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with initial placeholder content. These minimize the necessity of growing arrays, an expensive operation.

In [77]:
print(np.zeros((3,4)))
print(np.ones((2, 2)))
# See https://stackoverflow.com/questions/28363447/what-are-the-advantages-of-using-numpy-identity-over-numpy-eye for diff with np.identity
print(np.eye(2))

[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[[1. 1.]
 [1. 1.]]
[[1. 0.]
 [0. 1.]]
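
To make the point about growing arrays concrete, here is a minimal sketch that fills a preallocated array and compares it with one built by repeated `np.append`. The size `n` and the variable names are hypothetical and chosen only for this example.

```python
import numpy as np

n = 5  # hypothetical, known-in-advance number of results

# Allocate the full array once, then fill it in place.
squares = np.zeros(n)
for i in range(n):
    squares[i] = i ** 2

# By contrast, np.append returns a new, larger array on every call,
# so growing an array this way copies all existing elements each time.
grown = np.array([])
for i in range(n):
    grown = np.append(grown, i ** 2)

print(np.array_equal(squares, grown))  # True
```

Both approaches give the same values; the preallocated version simply avoids the repeated copying.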


To create sequences of numbers, NumPy provides the `arange` function, which is analogous to the Python built-in `range` but returns an array. With floating point arguments, the number of elements `arange` produces is hard to predict because of finite floating point precision, so for floating point ranges it is usually better to use `linspace`, which takes the number of elements we want as an argument instead of the step.

In [78]:
print(np.arange(0,10))
print(np.arange(10,20,5))

print(np.linspace(0, 2, 9))

[0 1 2 3 4 5 6 7 8 9]
[10 15]
[0.   0.25 0.5  0.75 1.   1.25 1.5  1.75 2.  ]
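
The difference between the two functions can be sketched as follows. The step `0.25` is used because it is exactly representable in binary floating point, so the `arange` result here is predictable; with steps such as `0.1` that is not guaranteed.

```python
import numpy as np

# arange uses a half-open interval: the stop value 2 is excluded.
stepped = np.arange(0, 2, 0.25)

# linspace takes the element count directly and includes both endpoints by default.
spaced = np.linspace(0, 2, 9)

# endpoint=False excludes the stop value, mirroring arange's half-open interval.
half_open = np.linspace(0, 2, 8, endpoint=False)

print(len(stepped), len(spaced), len(half_open))  # 8 9 8
print(np.allclose(stepped, half_open))  # True
```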


NumPy provides the constants `pi` and `e`.

In [79]:
print(np.pi)
print(np.e)

3.141592653589793
2.718281828459045


In [80]:
# Shape a 1d array into the specified dimensions
print(np.arange(9).reshape(3, 3))

[[0 1 2]
 [3 4 5]
 [6 7 8]]


## Basic operations

Arithmetic and logic operators on arrays apply elementwise.

In [138]:
a = np.array([20, 30, 40, 50])
print(a < 35)

# np.all tests whether all array elements along a given axis evaluate to True.
print(np.all([[True,False],[True,True]]))

# np.any tests whether any array element along a given axis evaluates to True.
print(np.any([[True, False], [True, True]]))

# np.where returns elements chosen from x or y depending on condition.
a = np.arange(10)
print(np.where(a < 5, a, 10*a))

# argmax and argmin give the indices of the max or min (along a given axis)

# prod is product of elements
print(np.prod([1.,2.]))


[ True  True False False]
False
True
[ 0  1  2  3  4 50 60 70 80 90]
2.0
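
The cell above mentions `argmax` and `argmin` without running them, so here is a short sketch of how they behave with and without an `axis` argument. The array `m` is made up for illustration.

```python
import numpy as np

m = np.array([[3, 8, 1],
              [9, 2, 7]])

# Without an axis, the index refers to the flattened array.
print(np.argmax(m))          # 3 (the value 9)
print(np.argmin(m))          # 2 (the value 1)

# With an axis, one index is returned per column (axis=0) or per row (axis=1).
print(np.argmax(m, axis=0))  # [1 0 1]
print(np.argmax(m, axis=1))  # [1 0]
```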


# Stacking and splitting

[Explanation](https://stackoverflow.com/questions/33356442/when-should-i-use-hstack-vstack-vs-append-vs-concatenate-vs-column-stack) of when to use concatenate vs other options.

In [139]:
a = np.array([[9, 7], [5, 2]])
b = np.array([[1, 9], [5, 1]])

print("a:", a)
print("b:", b)
print("vstack:", np.vstack([a, b]))
print("hstack:", np.hstack([a, b]))
a = np.array([9, 7])
b = np.array([1, 9])
print("a:", a)
print("b:", b)
print("column_stack:", np.column_stack([a,b]))
x = np.arange(8.0)
print("x:", x)
print("array_split:", np.array_split(x, 3))


a: [[9 7]
 [5 2]]
b: [[1 9]
 [5 1]]
vstack: [[9 7]
 [5 2]
 [1 9]
 [5 1]]
hstack: [[9 7 1 9]
 [5 2 5 1]]
a: [9 7]
b: [1 9]
column_stack: [[9 1]
 [7 9]]
x: [0. 1. 2. 3. 4. 5. 6. 7.]
array_split: [array([0., 1., 2.]), array([3., 4., 5.]), array([6., 7.])]


In [140]:
a = np.array([[9, 7], [5, 2]])
b = np.array([[1, 9], [5, 1]])

print(np.hsplit(a, 2))
print(np.vsplit(a, 2))

[array([[9],
       [5]]), array([[7],
       [2]])]
[array([[9, 7]]), array([[5, 2]])]
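
As a rough guide to how these helpers relate to `np.concatenate`, the sketch below checks that, for 2D inputs, `vstack` and `hstack` match concatenation along axis 0 and axis 1, and that splitting followed by stacking recovers the original array. The names `same_v`, `same_h`, `left`, and `right` are illustrative only.

```python
import numpy as np

a = np.array([[9, 7], [5, 2]])
b = np.array([[1, 9], [5, 1]])

# For 2D inputs, vstack joins along axis 0 and hstack joins along axis 1.
same_v = np.array_equal(np.vstack([a, b]), np.concatenate([a, b], axis=0))
same_h = np.array_equal(np.hstack([a, b]), np.concatenate([a, b], axis=1))
print(same_v, same_h)  # True True

# Splitting and re-stacking recovers the original array.
left, right = np.hsplit(a, 2)
print(np.array_equal(np.hstack([left, right]), a))  # True
```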


# Copies and views

The NumPy quickstart has an [overview of how mutability and copying work](https://numpy.org/doc/stable/user/quickstart.html#no-copy-at-all).
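
A minimal sketch of the three cases that overview covers (plain assignment, views, and copies) might look like this. The variable names are chosen for illustration.

```python
import numpy as np

a = np.arange(6)

alias = a                 # plain assignment: no new object is created
view = a[1:4]             # slicing returns a view that shares a's data
copied = a[1:4].copy()    # copy() makes an independent array

view[0] = 100             # also changes a[1]
copied[1] = -1            # leaves a untouched

print(alias is a)         # True
print(view.base is a)     # True
print(a)                  # [  0 100   2   3   4   5]
```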

# Advanced indexing and index tricks

In addition to indexing by integers and slices, arrays can be indexed by arrays of integers and arrays of booleans.

In [149]:
a = np.arange(12)**2  # the first 12 square numbers
i = np.array([1, 1, 3, 8, 5])  # an array of indices
print(a[i])  # the elements of `a` at the positions `i`
j = np.array([[3, 4], [9, 7]])  # a bidimensional array of indices
print(a[j])  # the same shape as `j`

# You can also use indexing with arrays as a target to assign to:
a = np.arange(5)
print(a)
a[[1, 3, 4]] = 0
print(a)

a = np.arange(4).reshape(2,2)
print(a)
print(a.diagonal())

# np.put replaces specified elements of an array with given values. It mutates the array in place.
a = np.arange(5)
print(a)
np.put(a, [0, 2], [-44, -55])
print(a)

[ 1  1  9 64 25]
[[ 9 16]
 [81 49]]
[0 1 2 3 4]
[0 0 2 0 0]
[[0 1]
 [2 3]]
[0 3]
[0 1 2 3 4]
[-44   1 -55   3   4]
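
Index arrays also work on arrays with more than one axis. The sketch below, using a made-up `grid`, passes one index array per axis and also shows that the result is a copy, so modifying it does not change the original.

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)

rows = np.array([0, 1, 2])
cols = np.array([3, 0, 2])

# Index arrays for each axis are paired up: (0, 3), (1, 0), (2, 2).
picked = grid[rows, cols]
print(picked)  # [ 3  4 10]

# Indexing with integer arrays returns a copy, not a view.
picked[0] = -1
print(grid[0, 3])  # 3
```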


## Indexing with boolean arrays

In [150]:
a = np.arange(12).reshape(3, 4)
b = a > 4
print(b)  # `b` is a boolean array with `a`'s shape
print(a[b])  # 1d array with the selected elements
a[b] = 0  # All elements of `a` greater than 4 become 0
print(a)

# You can combine conditions using the & (and) and | (or) operators:
a = np.array([[1 , 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
five_up = (a > 5) | (a == 5)
print(five_up)


[[False False False False]
 [False  True  True  True]
 [ True  True  True  True]]
[ 5  6  7  8  9 10 11]
[[0 1 2 3]
 [4 0 0 0]
 [0 0 0 0]]
[[False False False False]
 [ True  True  True  True]
 [ True  True  True  True]]
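
The cell above only exercises `|`, so here is a short sketch of `&` and of negating a mask with `~`. The name `mask` is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Parentheses are required: & and | bind more tightly than comparisons.
mask = (a > 2) & (a < 11)
print(a[mask])   # [ 3  4  5  6  7  8  9 10]

# ~ negates a boolean mask.
print(a[~mask])  # [ 1  2 11 12]
```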


# Random Sampling

See the [docs](https://numpy.org/doc/stable/reference/random/index.html) and this [explanation](https://stackoverflow.com/questions/7029993/differences-between-numpy-random-and-random-random-in-python) of the differences from the built-in `random.random`. The pseudo-random number generators implemented in this module, as well as the built-in one, are designed for statistical modeling and simulation. They are not suitable for security or cryptographic purposes; see the `secrets` module from the standard library for such use cases.




In [151]:
import numpy as np
rng = np.random.default_rng()
# Generate one random float uniformly distributed over the range [0, 1)
print(rng.random())
# Generate an array of 10 numbers according to a unit Gaussian distribution.
rng.standard_normal(10)  
# Generate an array of 5 integers uniformly over the range [0, 10).
rng.integers(low=0, high=10, size=5)  




0.9317779295075374


array([9, 1, 2, 8, 3])
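
Because the generator above is created without a seed, its output differs on every run. If reproducible results are needed, one option is to pass a seed to `default_rng`, as in this sketch. The seed value `42` and the variable names are arbitrary choices for illustration.

```python
import numpy as np

# 42 is an arbitrary seed chosen for illustration.
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

# Two generators created with the same seed produce the same sequence.
draws_a = rng_a.integers(low=0, high=10, size=5)
draws_b = rng_b.integers(low=0, high=10, size=5)
print(np.array_equal(draws_a, draws_b))  # True

# integers() excludes `high` by default; endpoint=True makes it inclusive.
dice = rng_a.integers(low=1, high=6, size=10, endpoint=True)
print(dice.min() >= 1 and dice.max() <= 6)  # True
```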

# Vectorize your code

Vectorization is the process of transforming a scalar operation into a vector operation. In NumPy, you can apply an operation to a whole array at once instead of looping over its elements in Python, which can significantly improve the performance of your code.


In [152]:
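import numpy as np
import timeit  # Part of the Python standard library; see https://stackoverflow.com/a/17579466 for why it is used here.

a = np.arange(1000)

def square_with_loop():
    # Calculate the square of each element using a for loop
    squared = []
    for i in a:
        squared.append(i ** 2)
    return squared

def square_with_vector():
    # Square every element in a single vectorized operation
    return a ** 2

# timeit.timeit runs the statement 1,000,000 times by default, so pass a smaller `number`.
# The timings depend on the machine, so no output is shown here.
print(timeit.timeit('square_with_loop()', globals=globals(), number=1000))
print(timeit.timeit('square_with_vector()', globals=globals(), number=1000))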

