# NumPy Functions to Make your Life Easier


### Useful NumPy Functions 

NumPy is an open-source Python package for working with arrays. It also provides basic linear algebra functions and comprehensive random number generation.  Its central feature is the N-dimensional array object (ndarray), which makes multi-dimensional arrays flexible to work with and efficient in memory.  Below is a small collection of array functions that illustrate some of NumPy's core strengths.

- numpy.reshape()
- numpy.resize()
- numpy.linalg.norm()
- numpy.hypot()
- numpy.cumsum()


# List of functions explained 
### function1 = np.reshape()
    numpy.reshape(a, newshape, order='C')
    
        Gives a new shape to an array without changing its data.
        
### function2 = np.resize()
    numpy.resize(a, new_shape)
    
        Return a new array with the specified shape.  If the new array is larger than the original array, then the new array is filled with repeated copies of a. Note that this behavior is different from a.resize(new_shape) which fills with zeros instead of repeated copies of a.
        
### function3 = np.linalg.norm()
    numpy.linalg.norm(x, ord=None, axis=None, keepdims=False)
    Matrix or vector norm.

    This function is able to return one of eight different matrix norms, or one of an infinite number of vector norms, depending on the value of the ord parameter.
    
### function4 = np.hypot()
    numpy.hypot(x1, x2, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj]) = <ufunc 'hypot'>
    
    Given the “legs” of a right triangle, return its hypotenuse.

    Equivalent to sqrt(x1**2 + x2**2), element-wise. If x1 or x2 is scalar_like (i.e., unambiguously cast-able to a scalar type), it is broadcast for use with each element of the other argument.
    
### function5 = np.cumsum()
    numpy.cumsum(a, axis=None, dtype=None, out=None)
    
    Return the cumulative sum of the elements along a given axis.

In [1]:
import numpy as np

## Function 1 - np.reshape()

Allows for the reshaping of a NumPy N-dimensional array into a new shape, as long as the new shape is compatible.

In [2]:
# Example 1 - working
x = np.linspace(0,1000, dtype = np.int64)
print("Our original 1-dimensional array:\n{}".format(x))
print()
print("The current shape of our 1d array:\n{}".format(x.shape))
print()
x = np.reshape(x, (2, 5, 5))
print("Our reshaped array:\n{}".format(x))
print()
print("We can verify that our new array has the proper shape:\n{}".format(x.shape))

Our original 1-dimensional array:
[   0   20   40   61   81  102  122  142  163  183  204  224  244  265
  285  306  326  346  367  387  408  428  448  469  489  510  530  551
  571  591  612  632  653  673  693  714  734  755  775  795  816  836
  857  877  897  918  938  959  979 1000]

The current shape of our 1d array:
(50,)

Our reshaped array:
[[[   0   20   40   61   81]
  [ 102  122  142  163  183]
  [ 204  224  244  265  285]
  [ 306  326  346  367  387]
  [ 408  428  448  469  489]]

 [[ 510  530  551  571  591]
  [ 612  632  653  673  693]
  [ 714  734  755  775  795]
  [ 816  836  857  877  897]
  [ 918  938  959  979 1000]]]

We can verify that our new array has the proper shape:
(2, 5, 5)


Example 1 demonstrates the flexibility of the np.reshape() function.  A 1-dimensional array is reshaped into a 3-dimensional array.

In [3]:
# Example 2 - working
x = np.random.randint(-10, 10, 12)
print("Original 1D Array:\n{}".format(x))
print()
x = np.reshape(x, (4, -1)) # The second argument in the reshape function call contains -1 as an element
print("New 2D Array:\n{}".format(x))

Original 1D Array:
[-1 -1 -6 -1  5 -4  0  1  0  6  0 -1]

New 2D Array:
[[-1 -1 -6]
 [-1  5 -4]
 [ 0  1  0]
 [ 6  0 -1]]


Example 2 shows how easy NumPy makes it to create multi-dimensional arrays.  One of the target dimensions passed to np.reshape() can be -1, in which case NumPy infers its value from the total number of elements and the other dimensions.  Here, 12 elements and 4 rows leave 3 columns.
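As a minimal sketch of how the -1 placeholder is resolved, the snippet below uses a fixed 12-element array (chosen here only for illustration) and lets NumPy fill in one dimension in two different target shapes.

```python
import numpy as np

x = np.arange(12)                 # 12 elements in total

# One known dimension: the missing one is 12 / 4 = 3.
a = np.reshape(x, (4, -1))

# Two known dimensions: the missing one is 12 / (2 * 3) = 2.
b = np.reshape(x, (2, -1, 3))

print(a.shape)  # (4, 3)
print(b.shape)  # (2, 2, 3)
```

Only one dimension may be -1 per call, since NumPy needs all the others to solve for it.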

In [4]:
# Example 3 - breaking (to illustrate when it breaks)
x = np.random.randint(-10, 10, 13)
print(x)
print(x.shape)
# x.resize(16) would zero-fill x up to 16 elements, which would make the (4, -1) reshape below possible
x = np.reshape(x, (4, -1))
print(x)

[ 8 -6  4 -2 -2  4 -6 -3  7 -3  2  1  3]
(13,)


ValueError: cannot reshape array of size 13 into shape (4,newaxis)

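The reshape fails because 13 elements cannot be divided evenly into 4 rows.  Without first zero-filling the array to a compatible size (for example 16, with x.resize(16)), the desired shape is not possible.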
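One possible workaround, sketched here under the assumption that trailing zeros are acceptable padding for the data, is to copy the values into a zero-filled buffer whose size is the next multiple of the row count. The names `rows`, `cols`, `padded` and `grid` are illustrative only.

```python
import numpy as np

x = np.arange(1, 14)              # 13 elements, not divisible by 4
rows = 4
cols = -(-x.size // rows)         # ceiling division, here 4

# Zero-filled buffer of a compatible size, with x copied to the front.
padded = np.zeros(rows * cols, dtype=x.dtype)
padded[:x.size] = x

grid = padded.reshape(rows, cols)
print(grid.shape)  # (4, 4)
```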

np.reshape() is a fantastic function when building datasets, as it allows for easy translation between the shape of the raw data and the desired tensor shape.

## Function 2 - np.ndarray.resize()

np.ndarray.resize() changes the shape and size of an array in place.  If the new size is larger than the original, the existing elements are kept in order and the added elements are filled with zeros.
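The difference between the function np.resize() and the method np.ndarray.resize() is easy to miss, so here is a small side-by-side sketch. Passing refcheck=False is a choice made for this example so that it also runs in interactive sessions, where extra references to the array can otherwise block an in-place resize.

```python
import numpy as np

a = np.array([1, 2, 3])

# np.resize returns a new array, filled by repeating a.
repeated = np.resize(a, 5)

# ndarray.resize works in place and fills the new slots with zeros.
b = a.copy()
b.resize(5, refcheck=False)

print(repeated)  # [1 2 3 1 2]
print(b)         # [1 2 3 0 0]
```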

In [5]:
# Example 1 - working
x = np.array([1,2,3,4,5])
y = np.array([6,7,8])
z = np.array([9,10,11,12])
print("Original Arrays:\n{}\n{}\n{}".format(x,y,z))
print()

a = np.empty((3,5))

y.resize(5)
z.resize(5)

for i in range(3):
    if i == 0:
        a[i] = x
    elif i == 1:
        a[i] = y
    else:
        a[i] = z
print("Resized Arrays Combined into 2D Matrix:\n{}".format(a))


Original Arrays:
[1 2 3 4 5]
[6 7 8]
[ 9 10 11 12]

Resized Arrays Combined into 2D Matrix:
[[ 1.  2.  3.  4.  5.]
 [ 6.  7.  8.  0.  0.]
 [ 9. 10. 11. 12.  0.]]


The arrays in the above example were not of the same size, so they could not be combined into a 2-dimensional array without first padding the shorter ones.
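The same padding idea can be written without resizing the inputs at all. The sketch below is an alternative to the cell above, not a replacement for it: it allocates a zero matrix wide enough for the longest array and copies each array into the front of its row. The names `arrays`, `width` and `combined` are illustrative.

```python
import numpy as np

arrays = [
    np.array([1, 2, 3, 4, 5]),
    np.array([6, 7, 8]),
    np.array([9, 10, 11, 12]),
]

# The widest input decides the number of columns.
width = max(arr.size for arr in arrays)
combined = np.zeros((len(arrays), width), dtype=arrays[0].dtype)

# Each row of `combined` is a view, so slice assignment writes into the matrix.
for row, arr in zip(combined, arrays):
    row[:arr.size] = arr

print(combined)
```

This version leaves the original arrays untouched and keeps the integer dtype.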

In [6]:
# Example 2 - working
x = np.random.randint(-10, 10, (5,4))
print("Original Array:\n{}".format(x))
print()
x.resize((3,3))
print("Shrunken Array:\n{}".format(x))

Original Array:
[[ -7  -4   0   1]
 [ -1  -8   7  -4]
 [  9  -8  -7 -10]
 [  7  -7  -1  -8]
 [ -8  -5  -2   5]]

Shrunken Array:
[[-7 -4  0]
 [ 1 -1 -8]
 [ 7 -4  9]]


np.ndarray.resize() can also be used to shrink N-dimensional arrays.  The elements are taken in C-like (row-major) order, so the first 9 values of the flattened original array fill the new 3x3 array and the rest are discarded.
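To make the row-major behaviour concrete, this sketch shrinks a 5x4 array of consecutive integers and compares the result with the first 9 flattened values. The copy and refcheck=False are assumptions made so the example runs in any session.

```python
import numpy as np

x = np.arange(20).reshape(5, 4)

# Work on a copy that owns its data, since resize modifies the array in place.
shrunk = x.copy()
shrunk.resize((3, 3), refcheck=False)

# The same values, taken explicitly from the flattened original.
expected = x.ravel()[:9].reshape(3, 3)

print(np.array_equal(shrunk, expected))  # True
```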

In [7]:
# Example 3 - breaking (to illustrate when it breaks)
x = np.random.randint(-10, 10, (5,4))
y = np.random.randint(-10, 10, (7,3))

print(x)
print()
print(y)

x.resize((10,10))
y.resize(10)
#y.resize((10,10)) would be the correct way to resize y.

print(x)
print(y)

x = np.reshape(x, (4,25))
print(x)
y = np.reshape(y, (4,25))
print(y)

[[ -2  -3   0   8]
 [ -9 -10   8   8]
 [-10  -8  -3   6]
 [  2   4   4 -10]
 [  5   2  -7  -6]]

[[-5 -2 -5]
 [-6 -5  3]
 [-2 -2 -8]
 [ 9 -8 -9]
 [-7 -8  3]
 [ 4 -4 -7]
 [ 0  8  0]]
[[ -2  -3   0   8  -9 -10   8   8 -10  -8]
 [ -3   6   2   4   4 -10   5   2  -7  -6]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0]]
[-5 -2 -5 -6 -5  3 -2 -2 -8  9]
[[ -2  -3   0   8  -9 -10   8   8 -10  -8  -3   6   2   4   4 -10   5   2
   -7  -6   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0
    0   0   0   0   0   0   0]
 [  0   0   0   0   0   0   0   0   0   

ValueError: cannot reshape array of size 10 into shape (4,25)

When array y is resized in the above example, the target shape is mistakenly entered as 10, which turns y into a 1-dimensional array of shape (10,).  Its 10 elements cannot be reshaped into (4, 25), which needs 100.  Passing the shape as a tuple, y.resize((10,10)), fixes this.
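A corrected version of the failing step might look like the sketch below. It uses a deterministic 7x3 array instead of random values, and refcheck=False is again an assumption made for interactive use.

```python
import numpy as np

# .copy() makes y own its data, which in-place resize requires.
y = np.arange(21).reshape(7, 3).copy()

# Resize to 100 elements so that a (4, 25) reshape becomes possible.
y.resize((10, 10), refcheck=False)
reshaped = np.reshape(y, (4, 25))

print(reshaped.shape)  # (4, 25)
```

The first 21 entries keep the original values and the remaining 79 are zeros.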

np.ndarray.resize() is useful for fitting large amounts of irregularly shaped data together into N-dimensional arrays which can then be further refined together.

## Function 3 - np.linalg.norm()

This function returns the norm of a vector or matrix, chosen by the 'ord' argument.  With the default ord=None, it returns the 2-norm for vectors and the Frobenius norm for matrices; both are the square root of the sum of the squared absolute values of the elements.  When 'axis' is an integer, the norm is computed for each vector along that axis.

In [8]:
# Example 1 - working
from numpy import linalg as LA
x = np.random.randint(-10, 10, size=(10,10))
print("Original Vector:\n{}".format(x))
print()
print("Frobenius Norms of Vectors Along Axis 1 (inner arrays):\n{}".format(LA.norm(x, axis=1)))

Original Vector:
[[ -7   3  -2   7   3  -5  -6  -8   6   5]
 [ -4  -6  -1  -9  -4 -10  -1   0  -3  -2]
 [ -6 -10  -9   9   8   8  -1  -3  -1   4]
 [ -8   5  -1  -1   0   4   7  -2  -6   3]
 [ -2  -1  -6   0  -1  -9   6   0   7   9]
 [-10   4  -5  -3  -3   5  -2   9   1   6]
 [-10 -10  -7   2  -6   9   3   0   7  -4]
 [ -1   0  -7  -3   1  -2  -1  -5  -2  -2]
 [  4  -5   7  -7  -8  -1 -10  -5  -5  -5]
 [  6   7   9  -7  -7   1  -5  -5   5  -2]]

Frobenius Norms of Vectors Along Axis 1 (inner arrays):
[17.49285568 16.24807681 21.28379665 14.31782106 17.         17.49285568
 21.07130751  9.89949494 19.46792233 18.54723699]


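Because axis=1 is given, each row is treated as a vector and its 2-norm (the vector counterpart of the Frobenius norm) is returned.  The values can be verified by summing the squares of the absolute values in each row and then taking the square root of each sum.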
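The manual check described above can be sketched as follows, using a small matrix whose rows are Pythagorean triples so the expected norms are easy to read.

```python
import numpy as np
from numpy import linalg as LA

x = np.array([[3, 4], [5, 12], [8, 15]])

# Norm of each row, computed by NumPy.
row_norms = LA.norm(x, axis=1)

# The same quantity by hand: square, sum along each row, take the root.
manual = np.sqrt(np.sum(np.abs(x) ** 2, axis=1))

print(row_norms)                      # [ 5. 13. 17.]
print(np.allclose(row_norms, manual)) # True
```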

In [9]:
# Example 2 - working
from numpy import linalg as LA
x = np.random.randint(-10, 10, size=(10,10))
print("Original Vector:\n{}".format(x))
print()
print("Frobenius Norm of Entire 2D Vector:\n{}".format(LA.norm(x)))

Original Vector:
[[  9   9   8  -4   7   0  -9   2   5  -8]
 [ -2  -2  -7   4  -2  -6  -5 -10  -4   9]
 [ -3   0  -5   8  -1   7   1 -10  -9   4]
 [  2  -5  -7  -2   0  -9 -10 -10  -6   3]
 [  9   9   7   6  -9   4   9   5   6  -9]
 [ -2   5   8   4   0  -8   6   8  -7  -8]
 [  4  -8 -10   1   8   8   8  -8   4  -6]
 [  5  -9  -7  -1   6  -6   1   5  -4  -5]
 [ -9  -4   9  -6   1   9   3  -5   7   3]
 [  7  -3  -8   5   5  -4  -5   0  -9  -7]]

Frobenius Norm of Entire 2D Vector:
63.419239982831705


Without specifying the 'axis' argument when calling norm(), the Frobenius norm of the entire 2D matrix is calculated and returned as a single value.
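A quick way to see what this single value represents is to compute it by hand on a small matrix, as in this sketch.

```python
import numpy as np
from numpy import linalg as LA

x = np.array([[1, 2], [3, 4]])

# Default call on a 2D input: the Frobenius norm of the whole matrix.
fro = LA.norm(x)

# By hand: sqrt(1 + 4 + 9 + 16) = sqrt(30).
manual = np.sqrt(np.sum(x ** 2))

# Flattening the matrix and taking the vector 2-norm gives the same number.
flat = LA.norm(x.ravel())

print(np.isclose(fro, manual), np.isclose(fro, flat))  # True True
```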

In [10]:
# Example 3 - breaking (to illustrate when it breaks)
from numpy import linalg as LA
x = np.random.randint(-10, 10, size=(10,10))
print("Original Vector:\n{}".format(x))
print()
print("Count of Non-Zero Element Norm Along Axis 1:\n{}".format(LA.norm(x, ord=0))) #No axis is specified with 'axis='

Original Vector:
[[ -5   4  -4   6   3  -3  -8   4   6   2]
 [-10  -6  -4   0   5 -10  -6  -5   1   9]
 [  8  -3  -9  -2  -3   7  -8   8  -4   8]
 [  4   0   5  -3   5  -2  -1  -4   0  -5]
 [ -6   7  -2   3  -8   3   4  -9  -9   6]
 [ -8  -3   6  -2   0   8   8  -6   2   1]
 [  6  -2  -1   1  -9  -1   8 -10   9   2]
 [  4   7  -2  -4  -3   5  -8   3 -10 -10]
 [ -4  -2   8  -3   1   8   9  -4 -10   5]
 [ -9  -1  -5   5  -3  -7  -8   9  -8  -7]]



ValueError: Invalid norm order for matrices.

With ord=0, norm() returns the count of non-zero elements, but this order is only defined for vectors.  In this example no axis was specified, so the 2D input is treated as a matrix, for which ord=0 is not a valid norm order.  Adding the argument 'axis=n' to the function call makes norm() count the non-zero elements of each vector along that axis.
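A working variant of the call, sketched on a small matrix with known zero positions, passes axis=1 so that each row is treated as a vector. np.count_nonzero() is shown alongside as a cross-check.

```python
import numpy as np
from numpy import linalg as LA

x = np.array([[0, 2, 0],
              [1, 1, 1],
              [0, 0, 0]])

# ord=0 with an axis: number of non-zero entries in each row (as floats).
counts = LA.norm(x, ord=0, axis=1)

# The same counts as integers, for comparison.
check = np.count_nonzero(x, axis=1)

print(counts)  # [1. 3. 0.]
print(check)   # [1 3 0]
```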

This function offers multiple norms for measuring and normalizing data, and with keepdims=True it can preserve the dimensions of the result so that the norms can be broadcast against the original array or matrix.
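As an illustration of the keepdims option, this sketch scales every row of a small matrix to unit length. The variable names are illustrative, and rows of all zeros would need separate handling to avoid division by zero.

```python
import numpy as np
from numpy import linalg as LA

x = np.array([[3.0, 4.0],
              [6.0, 8.0]])

# keepdims=True keeps the reduced axis, giving shape (2, 1) instead of (2,).
norms = LA.norm(x, axis=1, keepdims=True)

# The (2, 1) norms broadcast against the (2, 2) matrix, row by row.
unit_rows = x / norms

print(norms.shape)                 # (2, 1)
print(LA.norm(unit_rows, axis=1))  # [1. 1.]
```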

## Function 4 - np.hypot()

np.hypot() calculates the hypotenuse of a right triangle, element-wise, given the lengths of the two legs in two arrays.

In [11]:
# Example 1 - working
x = np.array([3, 5, 48])
y = np.array([4, 12, 55])
print("Array 1:\n{}".format(x))
print()
print("Array 2:\n{}".format(y))
print()
print("Hypotenuse Results:\n{}".format(np.hypot(x,y)))

Array 1:
[ 3  5 48]

Array 2:
[ 4 12 55]

Hypotenuse Results:
[ 5. 13. 73.]


The Pythagorean theorem is applied element-wise to Arrays 1 and 2.  The hypotenuse values populate the returned array.

In [12]:
# Example 2 - working
x = np.random.randint(10, size=(3,3,3))
y = np.random.randint(10, size=(3,3,3))
print("Array 1:\n{}".format(x))
print()
print("Array 2:\n{}".format(y))
print()
print("Hypotenuse Results:\n{}".format(np.hypot(x,y)))

Array 1:
[[[1 3 6]
  [1 3 9]
  [1 3 1]]

 [[6 4 2]
  [8 1 9]
  [2 4 5]]

 [[8 6 2]
  [7 6 1]
  [4 6 1]]]

Array 2:
[[[0 2 3]
  [2 3 6]
  [1 0 9]]

 [[2 3 7]
  [1 7 0]
  [7 2 8]]

 [[1 3 3]
  [5 7 6]
  [6 5 9]]]

Hypotenuse Results:
[[[ 1.          3.60555128  6.70820393]
  [ 2.23606798  4.24264069 10.81665383]
  [ 1.41421356  3.          9.05538514]]

 [[ 6.32455532  5.          7.28010989]
  [ 8.06225775  7.07106781  9.        ]
  [ 7.28010989  4.47213595  9.43398113]]

 [[ 8.06225775  6.70820393  3.60555128]
  [ 8.60232527  9.21954446  6.08276253]
  [ 7.21110255  7.81024968  9.05538514]]]


The hypotenuse function can be used on arrays of any number of dimensions, as long as it is given exactly two inputs whose shapes are compatible (equal or broadcastable).
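Compatible shapes do not have to be identical. In the sketch below, a column of shape (3, 1) is combined with a row of shape (3,), and with a plain scalar, to show broadcasting at work.

```python
import numpy as np

x = np.array([[3], [5], [8]])     # shape (3, 1)
y = np.array([4, 12, 15])         # shape (3,)

# Broadcasting pairs every x with every y, giving a (3, 3) result.
grid = np.hypot(x, y)

# A scalar is broadcast against every element of x.
with_scalar = np.hypot(x, 4)

print(grid.shape)         # (3, 3)
print(np.diag(grid))      # [ 5. 13. 17.]
print(with_scalar.shape)  # (3, 1)
```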

In [13]:
# Example 3 - breaking (to illustrate when it breaks)
x = np.random.randint(10, size=(3,3,2))
y = np.random.randint(10, size=(3,3,3))
print("Array 1:\n{}".format(x))
print()
print("Array 2:\n{}".format(y))
print()
print("Hypotenuse Results:\n{}".format(np.hypot(x,y)))

Array 1:
[[[8 1]
  [1 5]
  [8 1]]

 [[9 5]
  [0 8]
  [4 1]]

 [[9 5]
  [8 2]
  [1 2]]]

Array 2:
[[[9 9 6]
  [4 0 1]
  [5 6 4]]

 [[7 5 3]
  [0 7 4]
  [0 6 0]]

 [[2 9 6]
  [6 7 7]
  [2 8 2]]]



ValueError: operands could not be broadcast together with shapes (3,3,2) (3,3,3) 

The two input arrays do not have compatible shapes: the last axis has length 2 in x and 3 in y, and neither is 1, so they cannot be broadcast together.  Array x could be resized to (3,3,3) using np.resize() so that the two inputs share a common shape.
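Besides np.resize(), which fills the enlarged array with repeated copies of the flattened data, another option is np.pad(), which leaves existing values in place and appends zeros. The sketch below uses np.pad() as a swapped-in alternative, assuming a missing leg of length 0 is a sensible default for the data at hand.

```python
import numpy as np

x = np.full((3, 3, 2), 3)         # last axis has length 2
y = np.full((3, 3, 3), 4)         # last axis has length 3

# Append one zero at the end of the last axis only, so x becomes (3, 3, 3).
x_padded = np.pad(x, ((0, 0), (0, 0), (0, 1)))

result = np.hypot(x_padded, y)

print(result.shape)     # (3, 3, 3)
print(result[0, 0])     # [5. 5. 4.]
```

Where x was padded with 0, the result is simply the corresponding value from y.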

When x and y coordinates are given in the form of two input arrays, the distance from the origin to each point (x[i], y[i]) can be computed as the hypotenuse of a right triangle and returned in a single array.

## Function 5 - np.cumsum()

This function will return the cumulative sum of elements along an axis.

In [14]:
# Example 1 - working
x = np.arange(1, 6)
print("Original Arrays:\n{}".format(x))
print()
print("Cumulative Sums:\n{}".format(np.cumsum(x)))

Original Arrays:
[1 2 3 4 5]

Cumulative Sums:
[ 1  3  6 10 15]


Each element of the output array is the running total of the input elements up to and including that position.
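For readers who prefer to see the running total spelled out, this sketch compares np.cumsum() with an explicit loop over the same values.

```python
import numpy as np

x = np.arange(1, 6)               # [1 2 3 4 5]

# Explicit running total, one element at a time.
running = []
total = 0
for value in x:
    total += value
    running.append(total)

result = np.cumsum(x)

print(result)                               # [ 1  3  6 10 15]
print(np.array_equal(result, running))      # True
```

The last element of the cumulative sum always equals the plain sum of the input.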

In [15]:
# Example 2 - working
x = np.empty(shape=(5,5))
for i in range(5):
    x[i] = np.arange(1, 6)
print("Original 2D Matrix, Values 1-5:\n{}".format(x))
print()
print("Cumulative Sums along Axis 0:\n{}".format(np.cumsum(x, axis=0)))

Original 2D Matrix, Values 1-5:
[[1. 2. 3. 4. 5.]
 [1. 2. 3. 4. 5.]
 [1. 2. 3. 4. 5.]
 [1. 2. 3. 4. 5.]
 [1. 2. 3. 4. 5.]]

Cumulative Sums along Axis 0:
[[ 1.  2.  3.  4.  5.]
 [ 2.  4.  6.  8. 10.]
 [ 3.  6.  9. 12. 15.]
 [ 4.  8. 12. 16. 20.]
 [ 5. 10. 15. 20. 25.]]


With axis=0, each column of the output matrix is a running total of the corresponding column of the input matrix.  Here, the columns count up by 1, 2, 3, 4 and 5, respectively; the running totals are read vertically.
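The effect of the axis argument is easiest to see by running the same small matrix through all three variants, as in this sketch.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

down = np.cumsum(x, axis=0)       # running totals down each column
across = np.cumsum(x, axis=1)     # running totals along each row
flat = np.cumsum(x)               # axis=None: the array is flattened first

print(down)    # [[1 2 3]
               #  [5 7 9]]
print(across)  # [[ 1  3  6]
               #  [ 4  9 15]]
print(flat)    # [ 1  3  6 10 15 21]
```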

In [16]:
# Example 3 - breaking (to illustrate when it breaks)
x = np.arange(10)
print("1D Array:\n{}".format(x))
print()
print("Cumulative Sums of Axis 1:\n{}".format(np.cumsum(x, axis=1)))

1D Array:
[0 1 2 3 4 5 6 7 8 9]



AxisError: axis 1 is out of bounds for array of dimension 1

The above example breaks because the function call requests that the second axis in a 1-dimensional array be summed; there is no second axis in a 1-dimensional array.  This could be fixed by changing axis to 0 or leaving it as default.
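The failure and the fix can be placed side by side, as in this sketch. The error is caught as ValueError because AxisError is a subclass of it; that choice is made here only so the example does not depend on where a given NumPy version exposes AxisError.

```python
import numpy as np

x = np.arange(10)                 # 1-dimensional, so the only valid axis is 0

try:
    np.cumsum(x, axis=1)
except ValueError as err:         # AxisError derives from ValueError
    message = str(err)

# Either of these works for a 1D array.
fixed = np.cumsum(x, axis=0)
default = np.cumsum(x)

print(message)
print(fixed)  # [ 0  1  3  6 10 15 21 28 36 45]
```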


## Conclusion

Whether calculating the magnitudes of vectors, normalizing data, or arranging your data into a tensor for use in machine learning, NumPy has many useful functions that are worth becoming familiar with.  Next, it's time to combine my NumPy and pandas knowledge to create useful representations of real-world data.

## Reference Links

* NumPy official tutorial: https://numpy.org/doc/stable/user/quickstart.html
* Python documentation: https://www.python.org/