![](https://upload.wikimedia.org/wikipedia/commons/thumb/3/31/NumPy_logo_2020.svg/1280px-NumPy_logo_2020.svg.png)


# NumPy Tutorial


## Content:
1. [Introduction to NumPy](#introduction_to_numpy)
2. [The Basics of NumPy Arrays](#basics_of_numpy_array)
    * [NumPy Array Attributes](#numpy_array_attributes)
    * [NumPy Data types](#numpy_data_types)
    * [Creating Arrays from Scratch](#creating_arrays_from_scratch)
    * [Array Indexing: Accessing Single Elements](#array_indexing)
    * [Array Slicing: Accessing Subarrays](#array_slicing)
    * [Reshaping of NumPy Arrays](#array_reshaping)
    * [Array Concatenation and Splitting](#array_concatenation_splitting)
3. [Computation on NumPy Arrays: Universal Functions](#computation_on_numpy_arrays)
      * [Exploring NumPy’s UFuncs](#numpy_ufuncs)
         * [Array arithmetic](#array_arithmetic)
         * [Exponents and logarithms](#exponenets_and_logarithms)
         * [Absolute value](#absolute_value)
         * [Trigonometric functions](#trigonometric_functions)
4. [Aggregations: Min, Max, and Everything in Between](#aggregation_min_max)
      * [Summing the Values in an Array](#summing_values_in_array)
      * [Minimum and Maximum](#minimum_and_maximum)
      * [Multidimensional aggregates](#multidimensional_aggregates)
5. [Computation on Arrays: Broadcasting](#broadcasting)
      * [Introducing Broadcasting](#introducing_broadcasting)
      * [Visualization of NumPy broadcasting](#visulization_of_numpy_broadcasting)
      * [Broadcasting examples](#broadcasting_examples)
6. [Comparisons, Masks, and Boolean Logic](#comparisions_masks_booleanlogic)
      * [Comparison Operators as ufuncs](#comparison_operators_as_ufuncs)
      * [Comparison operators and their equivalent](#comparision_operators_and_equivalent)
      * [Working with Boolean Arrays](#working_with_boolean_arrays)
7. [Sorting Arrays](#sorting_arrays)
      * [Fast Sorting in NumPy: np.sort and np.argsort](#fast_sorting)
      * [Sorting along rows or columns](#sorting_rows_columns)
8. [References](#references)

### Learning Objectives:

* Understand the difference between one-, two- and n-dimensional arrays in NumPy.
* Understand axis and shape properties for n-dimensional arrays.
* Understand how to apply some linear algebra operations to n-dimensional arrays without using for-loops.

<a id="introduction_to_numpy"></a> <br>
# Introduction to NumPy
* NumPy is the core library for scientific computing in Python. 
* It provides a high-performance multi-dimensional array object, and tools for working with these arrays.
* NumPy is also fast, because its core array operations run in compiled C code rather than in the Python interpreter.

The statement below imports the ***numpy*** package and names it ***np***.

In [None]:
import numpy as np

<a id="basics_of_numpy_array"></a> <br>
# Basics of NumPy Array

* At the core of the NumPy package is the ***ndarray*** object, also known by the alias ***array***.
* ***ndarray*** is an N-dimensional array, which describes a collection of “items” of the same type.
* All ***ndarrays*** are ***homogeneous***: every item takes up the same size block of memory, and all blocks are interpreted in exactly the same way. 
* An ***ndarray*** is indexed by a tuple of integers, one per axis; negative integers count from the end of an axis.
* ***numpy.array*** is not the same as the Standard Python Library class array.array, which only handles one-dimensional arrays and offers less functionality.
* Many operations performed on an ***ndarray*** use compiled code for better performance.

 
**There are several important differences between NumPy arrays and the standard Python sequences:**

1. NumPy arrays have a fixed size at creation, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
2. The elements in a NumPy array are all required to be of the same data type, and thus will be the same size in memory. The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.
3. NumPy arrays facilitate advanced mathematical and other types of operations on large numbers of data. Typically, such operations are executed more efficiently and with less code than is possible using Python’s built-in sequences.
4. A growing plethora of scientific and mathematical Python-based packages are using NumPy arrays; though these typically support Python-sequence input, they convert such input to NumPy arrays prior to processing, and they often output NumPy arrays. In other words, in order to efficiently use much (perhaps even most) of today’s scientific/mathematical Python-based software, just knowing how to use Python’s built-in sequence types is insufficient - one also needs to know how to use NumPy arrays.
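As a small illustration of points 2 and 3, the sketch below doubles every value in a Python list and in a NumPy array. The variable names are illustrative only.

```python
import numpy as np

values = [1, 2, 3, 4]

# With a list, element-wise arithmetic needs an explicit loop or comprehension.
doubled_list = [v * 2 for v in values]

# With an array, the same operation is a single vectorized expression.
arr = np.array(values)
doubled_arr = arr * 2

print(doubled_list)   # [2, 4, 6, 8]
print(doubled_arr)    # [2 4 6 8]

# Note that `values * 2` on a list repeats the list instead of doubling it.
print(values * 2)     # [1, 2, 3, 4, 1, 2, 3, 4]
```

The array version also stores all elements with a single dtype, which is what makes the vectorized expression possible.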

We can create an array by directly converting a list or a list of lists. The statement below creates a NumPy array with 3 elements.

In [None]:
a = np.array([1, 2, 3])
print("a:\n", a)

NumPy arrays can be multi-dimensional too.

In [None]:
b = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("b:\n", b)

<a id="numpy_array_attributes"></a> <br>
## NumPy Array Attributes
The more important attributes of an ***ndarray*** object are:

#### ***ndarray.ndim***: 
> The number of axes (dimensions) of the array.
#### ***ndarray.shape***: 
> The dimensions of the array. This is a tuple of integers indicating the size of the array in each dimension. For a matrix with n rows and m columns, shape will be (n,m). The length of the shape tuple is therefore the number of axes, ndim.
#### ***ndarray.size***: 
> The total number of elements of the array. It is equal to the product of the elements of the shape.
#### ***ndarray.dtype***: 
> The type of the elements in the array. One can create or specify dtype’s using standard Python types. Additionally NumPy provides types of its own like numpy.int32, numpy.int16, and numpy.float64.
#### ***ndarray.itemsize***: 
> The size in bytes of each element of the array. For example, an array of elements of type float64 has itemsize 8 (=64/8), while one of type int32 has itemsize 4 (=32/8). It is equivalent to ndarray.dtype.itemsize.
#### ***ndarray.data***: 
> The buffer containing the actual elements of the array. Normally, we won’t need to use this attribute because we will access the elements in an array using indexing facilities.


In [None]:
a1 = np.array([1, 2, 3])
print("a1: \n", a1)

print("\nNumber of axes/dimensions in the array: ", a1.ndim)
print("Shape of the array: ", a1.shape)
print("Number of elements in the array: ", a1.size)
print("Type of the elements in the array: ", a1.dtype)
print("Size in bytes of each element: ", a1.itemsize)
print("Buffer containing the array: ", a1.data)

In NumPy dimensions are called axes. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

In the example above, the array has one axis. That axis has *3* elements, so it has a length of 3 and the array has a shape of *(3,)*.

In [None]:
a2 = np.array([[1., 2, 3],
               [4, 5, 6]])
print("a2: \n", a2)

print("\nNumber of axes/dimensions in the array: ", a2.ndim)
print("Shape of the array: ", a2.shape)
print("Number of elements in the array: ", a2.size)
print("Type of the elements in the array: ", a2.dtype)
print("Size in bytes of each element: ", a2.itemsize)
print("Buffer containing the array: ", a2.data)

* Since a NumPy array holds elements of a single data type, values are upcast when the input types do not match. In the example above, one element of the nested list is a float and the remaining elements are integers, so every element is converted to float64.

* In the example above, the array has *2* axes. The first axis has a length of *2* and the second axis has a length of *3*, so shape of the array is *(2, 3)*.
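The two points above can be checked directly. The sketch below is only an illustration with made-up variable names: it shows upcasting for a mixed list and then the axes of a three-dimensional array built from a nested list.

```python
import numpy as np

# Mixing ints and a float upcasts every element to float64.
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)                               # float64

# An all-integer list keeps an integer dtype (the exact width is platform-dependent).
ints = np.array([1, 2, 3])
print(np.issubdtype(ints.dtype, np.integer))     # True

# A nested list with 2 blocks of 3 rows and 4 columns gives an array with 3 axes.
cube = np.array([[[0] * 4] * 3] * 2)
print(cube.ndim, cube.shape, cube.size)          # 3 (2, 3, 4) 24
```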

<a id="numpy_data_types"></a> <br>
## NumPy Data types:
NumPy supports most of the Python data types. By default, Python has these data types:

1. ***strings*** - used to represent text data; the text is enclosed in quotation marks. e.g. *"ABCD"*
2. ***integer*** - used to represent integer numbers. e.g. *-1, -2, -3*
3. ***float*** - used to represent real numbers. e.g. *1.2, 42.42*
4. ***boolean*** - used to represent *True* or *False*.
5. ***complex*** - used to represent complex numbers. e.g. *1.0 + 2.0j, 1.5 + 2.5j*

Apart from the default Python data types, NumPy supports additional data types. The most commonly used numeric and boolean data types in NumPy are:

1. ***int8, int16, int32, int64*** - signed integer types with different bit sizes
2. ***uint8, uint16, uint32, uint64*** - unsigned integer types with different bit sizes
3. ***float32, float64*** - floating-point types with different precision levels
4. ***complex64, complex128*** - complex number types with different precision levels
5. ***bool*** - boolean type stored in one byte

<center><img src="https://1.bp.blogspot.com/-xrnk-4zD2Ac/W1LCbjrSjjI/AAAAAAAAXUY/8QJ0AZBxsQ0h6BgUV4kml0NqIO_hOs2KwCLcBGAs/s1600/4212_t2-1.PNG" alt="cce" border="0" width="600px"></center>

In [None]:
# You can specify the type of data inside the array.
np.array([1, 2, 3, 4], dtype=np.float32)

In [None]:
a3 = np.array([[ 0,  1,  2,  3,  4],
               [ 5,  6,  7,  8,  9],
               [10, 11, 12, 13, 14]], dtype="int32")
print("a3: \n", a3)

print("Shape of the array: ", a3.shape)
print("Type of the elements in the array: ", a3.dtype)

<a id='creating_arrays_from_scratch'></a><br>
## Creating Arrays from Scratch Using Built-in Methods

#### ***numpy.zeros(shape, dtype=float, order='C', *, like=None)***
* Return a new array of given shape and type, filled with zeros.

In [None]:
# Create an array of 10 elements filled with zeros of type integer
a = np.zeros(10, dtype="int")
print("a: ", a)

In [None]:
# Create a 4x5 array/matrix filled with zeros of type float
b = np.zeros((4, 5), dtype="float32")
print("b: \n", b)

#### ***numpy.ones(shape, dtype=None, order='C', *, like=None)***
* Return a new array of given shape and type, filled with ones.

In [None]:
# Create an array of 15 elements filled with ones of type integer
a = np.ones(15, dtype="int")
print("a: ", a)

In [None]:
# Create a 3x6 array/matrix filled with ones of type float
b = np.ones((3, 6), dtype="float32")
print("b: \n", b)

#### ***numpy.full(shape, fill_value, dtype=None, order='C', *, like=None)***
* Return a new array of given shape and type, filled with fill_value.

In [None]:
# Create a 3x5 array/matrix filled with 3.14
c = np.full((3, 5), 3.14)
print("c: \n", c)

#### ***numpy.arange([start, ]stop, [step, ]dtype=None, *, like=None)***
* Return evenly spaced values within a given interval.

In [None]:
# Create an array filled with a linear sequence starting at 0, stepping by 2, and stopping before 20
# (this is similar to the built-in range() function)
d = np.arange(0, 20, 2)
print("d: ", d)

#### ***numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)***
* Return evenly spaced numbers over a specified interval.
* Returns num evenly spaced samples, calculated over the interval [start, stop].
* The endpoint of the interval can optionally be excluded.

In [None]:
# Create an array of five values evenly spaced between 0 and 1
e = np.linspace(0, 1, 5)
print("e: ", e)
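The optional `endpoint` and `retstep` parameters listed in the signature above change what `linspace` returns. A minimal sketch, with illustrative variable names:

```python
import numpy as np

# endpoint=False leaves out the stop value, so the spacing becomes (stop - start) / num.
no_end = np.linspace(0, 1, 5, endpoint=False)
print(no_end)            # [0.  0.2 0.4 0.6 0.8]

# retstep=True additionally returns the spacing between samples.
samples, step = np.linspace(0, 1, 5, retstep=True)
print(step)              # 0.25
```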

#### ***random.random(size=None)***
* Return random floats uniformly distributed over the range [0.0, 1.0). 


In [None]:
# Create a 3x3 array of uniformly distributed random values between 0 and 1
f = np.random.random((3, 3))
print("f: \n", f)

#### ***random.normal(loc=0.0, scale=1.0, size=None)***
* Draw random samples from a normal (Gaussian) distribution.

In [None]:
# Create a 3x3 array of normally distributed random values with mean 0 and standard deviation 1
g = np.random.normal(0, 1,(3, 3))
print("g: \n", g)

#### ***random.randint(low, high=None, size=None, dtype=int)***
* Return random integers from low (inclusive) to high (exclusive). 
* Return random integers from the “discrete uniform” distribution of the specified dtype in the “half-open” interval [low, high). If high is None (the default), then results are from [0, low).

In [None]:
# Create a 3x3 array of random integers in the interval [0, 10)
h = np.random.randint(0, 10,(3, 3))
print("h: \n", h)
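The three functions above draw from NumPy's global random state, so their output changes from run to run. When repeatable results matter, one option is the `Generator` interface returned by `np.random.default_rng`. The sketch below assumes an arbitrary seed of 42 and illustrative variable names.

```python
import numpy as np

rng = np.random.default_rng(42)

uniform = rng.random((3, 3))             # floats in [0.0, 1.0)
normal = rng.normal(0, 1, (3, 3))        # mean 0, standard deviation 1
integers = rng.integers(0, 10, (3, 3))   # integers in [0, 10)

# Re-creating the generator with the same seed reproduces the same draws.
rng_again = np.random.default_rng(42)
print(np.array_equal(uniform, rng_again.random((3, 3))))   # True
```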

#### ***numpy.eye(N, M=None, k=0, dtype=<class 'float'>, order='C', *, like=None)***
* Return a 2-D array with ones on the diagonal and zeros elsewhere.

In [None]:
# Create a 4x4 identity matrix
i = np.eye(4)
print("i: \n", i)

#### ***numpy.empty(shape, dtype=float, order='C', *, like=None)***
* Return a new array of given shape and type, without initializing entries.

In [None]:
# Create an uninitialized 4x3 integer array; its contents are arbitrary until assigned
j = np.empty((4, 3), dtype="int")
print("j: \n", j)

### Appending into NumPy Array

##### ***numpy.append(arr, values, axis=None)***
The numpy append() function appends values to the end of an array. It returns a new array, and the original array remains unchanged.

In [None]:
# Appending at the end of 1-d array
a1 = np.array([1, 3, 5, 7, 9])
print('Original Array : ', a1)
 
# Appending to the array
a2 = np.append(a1, [11, 13])
print('Array after appending : ', a2)

In [None]:
# Appending at the end of 2-d array
a3 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])  
a4 = np.array([[11, 21, 31], [42, 52, 62], [73, 83, 93]])  

# Appending with axis = None => flatten the array and append
a5 = np.append(a3, a4)
print("After appending with axis=None:\n", a5)

# Appending with axis = 0
a6 = np.append(a3, a4, axis=0)
print("\nAfter appending with axis=0:\n", a6)

# Appending with axis = 1
a7 = np.append(a3, a4, axis=1)
print("\nAfter appending with axis=1:\n", a7)
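Two properties of `np.append` are easy to overlook: the input array is left untouched, and with an explicit `axis` the remaining dimensions must match. A short sketch with illustrative names:

```python
import numpy as np

grid = np.array([[1, 2, 3], [4, 5, 6]])

# np.append returns a new array; `grid` keeps its original shape.
extended = np.append(grid, [[7, 8, 9]], axis=0)
print(grid.shape, extended.shape)    # (2, 3) (3, 3)

# With an explicit axis, the other dimensions must match.
try:
    np.append(grid, [[7, 8]], axis=0)
except ValueError as err:
    print("ValueError:", err)
```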

<a id="array_indexing"></a><br>
## Array Indexing: Accessing Single Elements

Indexing in NumPy is quite similar to Python’s standard list indexing. 

### Accessing one-dimensional array:
* You can access the ith value (counting from zero) by specifying the desired index in square brackets, just as with Python lists:

In [None]:
x1 = np.arange(1, 25, 3)
print(x1)

# Accessing the element at index 3 (the fourth element)
print("x1[3]: ", x1[3])

# To index from the end of the array, you can use negative indices:
print("x1[-1]: ", x1[-1])
print("x1[-4]: ", x1[-4])

### Accessing multi-dimensional array:
* In a multi-dimensional array, you access items using a comma-separated tuple of indices.

In [None]:
# Create a two-dimensional 3x4 array of random integers from 0 to 19
x2 = np.random.randint(20, size=(3, 4)) 
print("x2: \n", x2)

print("x2[2,1]: ", x2[2,1])

print("x2[1,-2]: ", x2[1,-2])

In [None]:
print("x2 before update: \n", x2)

# You can also modify values using any of the above index notation:
x2[0,0] = 10
print("x2 after update: \n", x2)

In [None]:
# You can also use python list syntax to access individual element in the NumPy array
print("x2[1][3]: ", x2[1][3])

<a id="array_slicing"></a><br>
## Array Slicing: Accessing Subarrays

* Just as we can use square brackets to access individual array elements, we can also use them to access subarrays with the slice notation, marked by the colon (:) character. 
* The NumPy slicing syntax follows that of the standard Python list; to access a slice of an array x, use this:
> ***x[start : stop : step]***
* If any of these are unspecified, they default to the values start=0, stop=size of dimension, step=1. 

In [None]:
x = np.arange(10)
print("x: ", x)

# first four elements
print("x[:4]: ", x[:4])

# elements after index 4
print("x[4:]: ", x[4:])

# middle subarray
print("x[4:7]: ", x[4:7])

# every other element
print("x[::2]: ", x[::2])

# every other element, starting at index 1
print("x[1::2]: ", x[1::2])

# Negative indexes with steps
print("x[-7:-2:2]:", x[-7:-2:2])

A potentially confusing case is when the step value is negative. In this case, the defaults for start and stop are swapped. This becomes a convenient way to reverse an array:


In [None]:
# all elements, reversed
print("x[::-1]: ", x[::-1])

# reversed every other from index 5
print("x[5::-2]: ", x[5::-2])

# from index 5 down to, but not including, index -8 (i.e. index 2)
print("x[5:-8:-1]: ", x[5:-8:-1])

In [None]:
# Create a two-dimensional 3x4 array of random integers from 0 to 19
x = np.random.randint(20, size=(3, 4))

# Multidimensional slices work in the same way, with multiple slices separated by commas.
# For example:
print("x: \n", x)

# two rows, three columns
print("\nx[:2, :3]: \n", x[:2, :3])

# all rows, every other column
print("\nx[:3, ::2]: \n", x[:3, ::2])

# Subarray dimensions can even be reversed together
print("\nx[::-1, ::-1]: \n", x[::-1, ::-1])

### Subarrays as no-copy views
* One important—and extremely useful—thing to know about array slices is that they return views rather than copies of the array data. This is one area in which NumPy array slicing differs from Python list slicing: in lists, slices will be copies. 

Consider a new two-dimensional array:

In [None]:
# Create a two-dimensional 3x4 array of random integers from 0 to 19
x1 = np.random.randint(20, size=(3, 4))
print("x1: \n", x1)

# Extract a 2×2 subarray from this:
x1_sub = x1[:2, :2]
print("\nx1_sub before update: \n", x1_sub)

# Now if we modify this subarray, we’ll see that the original array is changed! Observe:
x1_sub[0, 0] = 99
print("\nx1_sub after update: \n", x1_sub)

print("x1 after update of x1_sub: \n", x1)

##### Note: 
This default behavior is actually quite useful: it means that when we work with large datasets, we can access and process pieces of these datasets without the need to copy the underlying data buffer.

### Creating copies of arrays
Despite the nice features of array views, it is sometimes useful to instead explicitly copy the data within an array or a subarray. This can be most easily done with the copy() method:

In [None]:
print("\nx1 before update of x1_sub_copy: \n", x1)

x1_sub_copy = x1[:2, :2].copy()
print("\nx1_sub_copy before update: \n", x1_sub_copy)

# If we now modify this subarray, the original array is not touched:
x1_sub_copy[0, 0] = 42

print("\nx1_sub_copy after update: \n", x1_sub_copy)

print("\nx1 after update of x1_sub_copy: \n", x1)
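Whether two arrays refer to the same underlying buffer can also be checked without modifying anything. The sketch below uses `np.shares_memory` for that, with illustrative variable names, and contrasts the result with Python list slicing.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

view = arr[:2, :2]                  # slice -> view on the same buffer
independent = arr[:2, :2].copy()    # explicit copy -> separate buffer

print(np.shares_memory(arr, view))          # True
print(np.shares_memory(arr, independent))   # False

# Python lists behave differently: slicing a list always copies.
items = [0, 1, 2, 3]
part = items[:2]
part[0] = 99
print(items)                        # [0, 1, 2, 3]
```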

<a id="array_reshaping"></a> <br>
## Reshaping of NumPy Arrays

Another useful type of operation is reshaping of arrays. The most flexible way of doing this is with the reshape() method. 

#### ***numpy.reshape(a, newshape, order='C')***
* Gives a new shape to an array without changing its data.

For example, if you want to put the numbers 1 through 9 in a 3×3 grid, you can do the following:

In [None]:
z = np.arange(1, 10, 1)
print("Original z: ", z)

z_new = z.reshape(3, 3)
print("\nNew z of shape 3x3:\n", z_new)

#### Note:
* Where possible, the reshape method will use a no-copy view of the initial array, but with noncontiguous memory buffers this is not always the case.
* Another common reshaping pattern is the conversion of a one-dimensional array into a two-dimensional row or column matrix. You can do this with the reshape method, or more easily by making use of the newaxis keyword within a slice operation.
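The first note can be made concrete with a small sketch. It assumes a 2x3 array named `base` (an illustrative name) and uses `np.shares_memory` to tell a view from a copy.

```python
import numpy as np

base = np.arange(6).reshape(2, 3)

# A contiguous array can be reshaped without copying.
reshaped = base.reshape(3, 2)
print(np.shares_memory(base, reshaped))   # True

# The transpose is a non-contiguous view; flattening it forces a copy.
flat_t = base.T.reshape(6)
print(np.shares_memory(base, flat_t))     # False
print(flat_t)                             # [0 3 1 4 2 5]
```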

In [None]:
a = np.array([1, 2, 3])
print("a: ", a)
print("Shape of a: ", a.shape)

In [None]:
# Convert into row vector via reshape
b = a.reshape(1,3)
print("b: ", b)
print("Shape of b: ", b.shape)

In [None]:
# Convert into row vector via newaxis
c = a[np.newaxis, :]
print("c: ", c)
print("Shape of c: ", c.shape)

In [None]:
# Convert into row vector via reshape, using -1 to infer the number of columns
d = a.reshape(1,-1)
print("d: ", d)
print("Shape of d: ", d.shape)

In [None]:
# Convert into column vector via reshape
e = a.reshape((3, 1))
print("e: ", e)
print("Shape of e: ", e.shape)

In [None]:
# Convert into column vector via newaxis
f = a[:, np.newaxis]
print("f: ", f)
print("Shape of f: ", f.shape)

In [None]:
# Convert into column vector via reshape, using -1 to infer the number of rows
g = a.reshape(-1,1)
print("g: ", g)
print("Shape of g: ", g.shape)

<img src="https://i.stack.imgur.com/zkMBy.png" alt="cce" border="0">

In [None]:
# Create 1-d array of size 20
a = np.arange(20)
print("a: ", a)

# Reshape into 4x5 matrix
b = a.reshape(4, 5)
print("\nb: ", b)

In [None]:
# Reshape into 2x10 matrix
c = a.reshape(2, 10)
print("c: ", c)

In [None]:
# Reshape into matrix with 5 rows. It will automatically decide number of columns.
d = a.reshape(5, -1)
print("\nd: ", d)

### Flattening of Array
#### ***ndarray.flatten(order='C')***
* Return a copy of the array collapsed into one dimension.
* ***order = {‘C’, ‘F’, ‘A’, ‘K’}***
    * ‘C’ means to flatten in row-major (C-style) order. 
    * ‘F’ means to flatten in column-major (Fortran-style) order.
    * ‘A’ means to flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. 
    * ‘K’ means to flatten a in the order the elements occur in memory.

In [None]:
a = np.array([[1, 2, 3], [4, 5, 6]])
print("a: \n", a)

b = a.flatten()
print("\nFlatten a row-major order: \n", b)

c = a.flatten('F')
print("\nFlatten a column-major order: \n", c)

### Array Transpose

#### ***numpy.transpose(a, axes=None)***
* Returns an array with axes transposed.

In [None]:
a = np.random.randint(1, 9, size=(3, 5))
print("a: \n", a)

b = np.transpose(a)
print("b: \n", b)

#  another way to transpose
c = a.T
print("c: \n", c)

<a id="array_concatenation_splitting"></a><br>
## Array Concatenation and Splitting

### Concatenation of arrays
The concatenation operation joins two or more arrays along a given axis.

#### ***numpy.concatenate((a1, a2, ...), axis=0, out=None, dtype=None, casting="same_kind")***
* Join a sequence of arrays along an existing axis.

In [None]:
a = np.array([1, 2, 3])
print("a: ", a)

b = np.array([4, 5, 6])
print("\nb: ", b)

d = np.concatenate((a, b))
print("\nd: \n", d)

In [None]:
c = np.array([7, 8, 9]) 
print("\nc: ", c)

# Concatenate three arrays
e = np.concatenate((a, b, c))
print("\ne: ", e)

In [None]:
a = np.array([[1, 2, 3],
              [4, 5, 6]])
print("a: \n", a)

# Concatenate along the first axis
b = np.concatenate((a, a))
print("\nb: \n", b)

In [None]:
# concatenate along the second axis (zero-indexed)
c = np.concatenate((a, a), axis=1)
print("\nc: \n", c)

#### ***numpy.vstack(tup, *, dtype=None, casting='same_kind')***
* Stack arrays in sequence vertically (row wise).
* This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). Rebuilds arrays divided by vsplit.

In [None]:
# For working with arrays of mixed dimensions, it can be clearer to use the np.vstack
# (vertical stack) and np.hstack (horizontal stack) functions:
a = np.array([[1, 2, 3]])
b = np.array([[9, 8, 7], [6, 5, 4]])
print("\na: \n", a)
print("\nb: \n", b)

# vertically stack the arrays (axis = 0)
c = np.vstack([a, b])
print("\nc: \n", c)

#### ***numpy.hstack(tup, *, dtype=None, casting='same_kind')***
* Stack arrays in sequence horizontally (column wise).
* This is equivalent to concatenation along the second axis, except for 1-D arrays where it concatenates along the first axis. Rebuilds arrays divided by hsplit.

In [None]:
a = np.array([[99],
              [99]])
b = np.array([[9, 8, 7],
              [6, 5, 4]])
print("\na: \n", a)
print("\nb: \n", b)

# horizontally stack the arrays (axis = 1)
c = np.hstack([a, b])
print("\nc: \n", c)
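The special handling of 1-D arrays mentioned in the `vstack` and `hstack` descriptions can be seen by stacking the same two vectors both ways. The names `p` and `q` are illustrative.

```python
import numpy as np

p = np.array([1, 2, 3])
q = np.array([4, 5, 6])

# vstack treats each 1-D array of shape (3,) as a row of shape (1, 3).
print(np.vstack([p, q]).shape)   # (2, 3)

# hstack joins 1-D arrays end to end along their only axis.
print(np.hstack([p, q]).shape)   # (6,)
```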

### Splitting of arrays

* The opposite of concatenation is splitting, which is implemented by the functions ***np.split***, ***np.hsplit***, and ***np.vsplit***. For each of these, we can pass a list of indices giving the split points:

#### ***numpy.split(ary, indices_or_sections, axis=0)***
* Split an array into multiple sub-arrays as views into ary.

In [None]:
# Split at indices 3 and 5 -> three subarrays
x = np.array([1,2,3,99,99,3,2,1])
x1, x2, x3 = np.split(x, [3,5])
print(x1, x2, x3)

In [None]:
# Split at indices 3, 5 and 7 -> four subarrays
x = np.array([1,2,3,99,99,3,2,1])
x1, x2, x3, x4 = np.split(x, [3,5,7])
print(x1, x2, x3, x4)

* Notice that N split points lead to N + 1 subarrays. The related functions ***np.hsplit*** and ***np.vsplit*** are similar:


#### ***numpy.vsplit(ary, indices_or_sections)***
* Split an array into multiple sub-arrays vertically (row-wise).
* vsplit is equivalent to split with axis=0 (the default); the array is always split along the first axis regardless of the array dimension.

In [None]:
x = np.arange(36, dtype=np.float32).reshape((6,6))
print("x: \n", x)

# Vertically splitting the array into 2 parts
upper, lower = np.vsplit(x, [2])
print("\nUpper: \n", upper)
print("\nLower: \n", lower)

In [None]:
# Vertically splitting the array into 3 parts
upper, middle, lower = np.vsplit(x, [2,3])
print("\nUpper: \n", upper)
print("\nMiddle: \n", middle)
print("\nLower: \n", lower)

#### ***numpy.hsplit(ary, indices_or_sections)***
* Split an array into multiple sub-arrays horizontally (column-wise).
* hsplit is equivalent to split with axis=1; the array is always split along the second axis, except for 1-D arrays, where it is split at axis=0.

In [None]:
x = np.arange(36, dtype=np.float32).reshape((6,6))
print("x: \n", x)

# Horizontally splitting the array into 2 parts
left, right = np.hsplit(x, [2])
print("\nLeft: \n", left)
print("\nRight: \n", right)

In [None]:
# Horizontally splitting the array into 2 equal parts
left, right = np.hsplit(x, 2)
print("\nLeft: \n", left)
print("\nRight: \n", right)
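Passing an integer instead of a list of indices asks for that many equal parts, which only works when the axis length divides evenly. The sketch below shows the failure case and, as one possible workaround not otherwise covered in this tutorial, `np.array_split`, which accepts uneven parts. Variable names are illustrative.

```python
import numpy as np

row = np.arange(7)

# An integer asks for that many equal parts; 7 elements cannot form 3 equal parts.
try:
    np.split(row, 3)
except ValueError as err:
    print("ValueError:", err)

# np.array_split relaxes the requirement and allows parts of unequal length.
parts = np.array_split(row, 3)
print([p.tolist() for p in parts])   # [[0, 1, 2], [3, 4], [5, 6]]
```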

<a id="computation_on_numpy_arrays"></a> <br>
# Computation on NumPy Arrays: Universal Functions

<a id="numpy_ufuncs"></a> <br>
## Exploring NumPy’s UFuncs

* Ufuncs exist in two flavors: 
    * unary ufuncs, which operate on a single input
    * binary ufuncs, which operate on two inputs. 
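The unary/binary distinction is recorded on each ufunc object itself. A minimal sketch using the `nin` attribute (number of inputs):

```python
import numpy as np

# Every ufunc records how many inputs it takes in its `nin` attribute.
print(np.negative.nin)   # 1 -> unary
print(np.add.nin)        # 2 -> binary

x = np.array([1, 2, 3])
print(np.negative(x))    # [-1 -2 -3]
print(np.add(x, x))      # [2 4 6]
```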

The standard addition, subtraction, multiplication, and division can all be used with NumPy arrays:

<a id="array_arithmetic"></a> <br>
### Array arithmetic

In [None]:
x = np.arange(4)
print("x =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2)    # floor division
print("x ** 2 = ", x ** 2)   # exponent
print("x % 2 = ", x % 2)     # modulo

In [None]:
# There is also a unary ufunc for negation
print("-x = ", -x)

In [None]:
# In addition, these can be strung together however you wish, and the standard order of operations is respected:
-(0.5 * x + 1) ** 2

* NumPy’s ufuncs feel very natural to use because they make use of Python’s native arithmetic operators.
* All of these arithmetic operations are simply convenient wrappers around specific functions built into NumPy; for example, the + operator is a wrapper for the add function.

In [None]:
print("np.add(3, 2):", np.add(3, 2))

x = np.arange(6)
print("\nx: \n", x)
print("\nnp.add(x, 2):\n", np.add(x, 2))                    # Addition +
print("\nnp.subtract(x, 5):\n", np.subtract(x, 5))          # Subtraction -
print("\nnp.negative(x):\n", np.negative(x))                # Unary negation -
print("\nnp.multiply(x, 3):\n", np.multiply(x, 3))          # Multiplication *
print("\nnp.divide(x, 2):\n", np.divide(x, 2))              # Division /
print("\nnp.floor_divide(x, 2):\n", np.floor_divide(x, 2))  # Floor division //
print("\nnp.power(x, 2):\n", np.power(x, 2))                # Exponentiation **
print("\nnp.mod(x, 2):\n", np.mod(x, 2))                    # Modulus/remainder %
print("\nnp.multiply(x, x):\n", np.multiply(x, x))          # Element-wise product x * x

#### ***numpy.average(a, axis=None, weights=None, returned=False, *, keepdims=<no value>)***
* Compute the weighted average along the specified axis.

In [None]:
a = np.arange(12).reshape((3, 4))
print("a:\n", a)

print("\nAverage of a: ", np.average(a))
print("\nAverage of a along axis=0: ", np.average(a, axis=0))
print("\nAverage of a along axis=1: ", np.average(a, axis=1))
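The cell above calls `np.average` without weights, in which case it behaves like a plain mean. The sketch below adds a `weights` argument; the values are made up for illustration.

```python
import numpy as np

scores = np.array([1, 2, 3, 4])
weights = np.array([4, 3, 2, 1])

# Weighted average = sum(scores * weights) / sum(weights) = 20 / 10
print(np.average(scores, weights=weights))   # 2.0

# Without weights, np.average matches the plain arithmetic mean.
print(np.average(scores))                    # 2.5
```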

### Mean, Variance and Standard Deviation

#### ***numpy.mean(a, axis=None, dtype=None, out=None, keepdims=<no value>, *, where=<no value>)***
* Compute the arithmetic mean along the specified axis.
* Returns the average of the array elements. The average is taken over the flattened array by default, otherwise over the specified axis. float64 intermediate and return values are used for integer inputs.

#### ***numpy.std(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)***
* Compute the standard deviation along the specified axis.
* Returns the standard deviation, a measure of the spread of a distribution, of the array elements. The standard deviation is computed for the flattened array by default, otherwise over the specified axis.

#### ***numpy.var(a, axis=None, dtype=None, out=None, ddof=0, keepdims=<no value>, *, where=<no value>)***
* Compute the variance along the specified axis.
* Returns the variance of the array elements, a measure of the spread of a distribution. The variance is computed for the flattened array by default, otherwise over the specified axis.
    
<center><img src="https://images.squarespace-cdn.com/content/v1/533db07de4b0d9f7ba7f1e77/1552186395207-9OQVIKUBI0G3TJPEJJ7R/summary+table+of+mean%2C+variance%2C+and+standard+deviation+formulas" alt="cce" border="0" width="500px"></center>

In [None]:
a = np.array([[1, 2], [3, 4]])
print("a:\n", a)

print("Mean of a: ", np.mean(a))
print("Mean of a along axis=0: ", np.mean(a, axis=0))
print("Mean of a along axis=1: ", np.mean(a, axis=1))

In [None]:
a = np.arange(12).reshape(3, 4)
print("a: \n", a)
print("Standard Deviation: \n", np.std(a))
print("Standard Deviation along axis=0: \n", np.std(a, axis=0))
print("Standard Deviation along axis=1: \n", np.std(a, axis=1))

In [None]:

a = np.random.randint(0, 12, (4, 3))
print("a: \n", a)
print("Variance: \n", np.var(a))
print("Variance along axis=0: \n", np.var(a, axis=0))
print("Variance along axis=1: \n", np.var(a, axis=1))
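Both `np.std` and `np.var` accept the `ddof` parameter shown in their signatures, which controls the divisor `N - ddof`. A small sketch on made-up data:

```python
import numpy as np

data = np.array([1, 2, 3, 4])

# Population variance (ddof=0, the default): divide by N.
print(np.var(data))            # 1.25

# Sample variance (ddof=1): divide by N - 1, giving 5 / 3 here.
print(np.var(data, ddof=1))

# The standard deviation is the square root of the variance.
print(np.isclose(np.std(data), np.sqrt(np.var(data))))   # True
```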

#### ***numpy.round(a, decimals=0, out=None)***
* Evenly round to the given number of decimals.
* For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc.
* np.around() is an alias of np.round(). np.rint() is a separate ufunc that rounds to the nearest integer, so it gives the same result as np.round() with decimals=0.

In [None]:
np.round([.5, 1.5, 2.5, 3.5, 4.6, 5.4, 6.5, 7.8])
#np.around([.5, 1.5, 2.5, 3.5, 4.6, 5.4, 6.5, 7.8])            # alias of np.round()
#np.rint([.5, 1.5, 2.5, 3.5, 4.6, 5.4, 6.5, 7.8])              # same result here, since decimals=0

#### ***numpy.ediff1d(ary, to_end=None, to_begin=None)***
* The differences between consecutive elements of an array.
* The input is flattened before the differences are taken, so out[i] = a[i+1] - a[i] over the flattened array. The optional to_begin and to_end values are prepended and appended to the result.

In [None]:
x = np.array([1, 2, 4, 7, 0])
np.ediff1d(x)

In [None]:
np.ediff1d(x, to_begin=-99, to_end=np.array([88, 99]))

In [None]:
# ediff1d treats a multi-dimensional array as a 1-d array and always returns a 1-d array
y = [[1, 2, 4], [1, 6, 24]]
np.ediff1d(y)

#### ***numpy.diff(a, n=1, axis=-1, prepend=<no value>, append=<no value>)***
* Calculate the n-th discrete difference along the given axis.
* The first difference is given by out[i] = a[i+1] - a[i] along the given axis, higher differences are calculated by using diff recursively.

In [None]:
x = np.array([1, 2, 4, 7, 0])
np.diff(x)

In [None]:
x = np.array([[1, 3, 6, 10], [0, 5, 6, 8]])
np.diff(x)

In [None]:
np.diff(x, axis=0)
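The statement that higher differences are computed by applying `diff` recursively can be verified with the `n` parameter. A short sketch reusing the 1-D values from the earlier cell:

```python
import numpy as np

x = np.array([1, 2, 4, 7, 0])

first = np.diff(x)          # [ 1  2  3 -7]
second = np.diff(x, n=2)    # [  1   1 -10]

# The second difference is the first difference applied twice.
print(np.array_equal(second, np.diff(first)))   # True
```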

<a id="exponenets_and_logarithms"></a> <br>
### Exponents and logarithms

* Another common group of NumPy ufuncs covers exponentials and logarithms:

In [None]:
x = np.array([1, 2, 3])
print("x: ", x)
print("\ne^x: ", np.exp(x))
print("\n2^x: ", np.exp2(x))
print("\n3^x: ", np.power(3, x))

In [None]:
# The inverse of the exponentials, the logarithms, are also available. The basic np.log gives the natural logarithm; 
# if you prefer to compute the base-2 logarithm or the base-10 logarithm, these are available as well.

x = [1, 2, 4, 10]
print("x: ", x)
print("\nln(x) :", np.log(x))
print("\nlog2(x): ", np.log2(x))
print("\nlog10(x): ", np.log10(x))

In [None]:
# There are also some specialized versions that are useful for maintaining precision with very small input.
x = [0, 0.001, 0.01, 0.1]
print("exp(x)-1: ", np.expm1(x))
print("\nlog(1+x): ", np.log1p(x))
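The precision benefit is easiest to see with a much smaller input than the ones above. The sketch below assumes an illustrative value of 1e-10 and compares both approaches against the series approximation exp(x) - 1 ≈ x + x²/2.

```python
import numpy as np

tiny = 1e-10

naive = np.exp(tiny) - 1     # loses digits: exp(tiny) is rounded to a value near 1
stable = np.expm1(tiny)      # computed without forming exp(tiny) first

# For tiny inputs, exp(x) - 1 is approximately x + x**2 / 2.
reference = tiny + tiny**2 / 2

naive_error = abs(naive - reference) / reference
stable_error = abs(stable - reference) / reference

print(naive_error)    # far above machine precision
print(stable_error)   # close to machine precision
```

The same reasoning applies to `np.log1p(x)` versus `np.log(1 + x)`.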

<a id="absolute_value"></a> <br>
### Absolute value

In [None]:
# Just as NumPy understands Python’s built-in arithmetic operators, it also understands
# Python’s built-in absolute value function:
x = np.array([-2, -1, 0, 1, 2])
abs(x)

The corresponding NumPy ufunc is ***np.absolute***, which is also available under the alias ***np.abs***.
#### ***numpy.absolute(x, /, out=None, *, where=True, casting='same_kind', order='K', dtype=None, subok=True[, signature, extobj])***
* Calculate the absolute value element-wise.

In [None]:
print("x: ", x)
print("\nnp.absolute(x): ", np.absolute(x))
print("\nnp.abs(x): ", np.abs(x))

In [None]:
# This ufunc can also handle complex data, in which the absolute value returns the magnitude:
x = np.array([7-24j,4-3j,2+0j,1+3j])
print("\nnp.absolute(x): ", np.absolute(x))

<a id="trigonometric_functions"></a> <br>
### Trigonometric functions

* NumPy provides a large number of useful ufuncs, and some of the most useful for the data scientist are the trigonometric functions. 

In [None]:
# define an array of angles:
theta = np.linspace(0,np.pi,3)

# now we can compute some trigonometric functions on these values:
print("theta: ",theta)
print("\nsin(theta): ",np.sin(theta))
print("\ncos(theta): ",np.cos(theta))
print("\ntan(theta): ",np.tan(theta))

In [None]:
x = np.array([-1, 0, 1])

print("x = ", x)
print("\narcsin(x): ", np.arcsin(x))
print("\narccos(x): ", np.arccos(x))
print("\narctan(x): ", np.arctan(x))
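
The values are computed to within machine precision, which is why results that should be exactly zero or infinite may not appear that way. A small sketch of two sanity checks, plus a degree-to-radian conversion with `np.deg2rad`:

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)          # 0, pi/2, pi

# sin^2 + cos^2 should equal 1 up to floating-point rounding
identity_holds = np.allclose(np.sin(theta) ** 2 + np.cos(theta) ** 2, 1.0)

# the trigonometric ufuncs expect radians; convert degrees first
degrees = np.array([0, 90, 180])
radians = np.deg2rad(degrees)

# pi/2 is not exactly representable, so tan returns a huge finite number, not inf
tan_half_pi = np.tan(np.pi / 2)

print("identity holds: ", identity_holds)
print("radians:        ", radians)
print("tan(pi/2):      ", tan_half_pi)
```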

<a id="aggregation_min_max"></a> <br>
# Aggregations: Min, Max, and Everything in Between

NumPy has fast built-in aggregation functions for working on arrays.

<a id="summing_values_in_array"></a> <br>
## Summing the Values in an Array

In [None]:
# As a quick example, consider computing the sum of all values in an array. 
# Python itself can do this using the built-in sum function:
x = np.random.random(100)
print("sum(x): ", sum(x))

# NumPy supports np.sum() to perform summation on array
print("\nnp.sum(x): ", np.sum(x))

In [None]:
# Because NumPy executes the operation in compiled code, its version of the
# operation is computed much more quickly.
big_array = np.random.rand(1000000)

%timeit sum(big_array)
%timeit np.sum(big_array)
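
`%timeit` is an IPython magic and only works inside a notebook or IPython shell. As a rough sketch, the same comparison can be made in a plain Python script with the standard-library `timeit` module. The array name `sample` and the repetition count are arbitrary choices for illustration.

```python
import timeit

import numpy as np

sample = np.random.rand(100_000)   # smaller than big_array so the sketch runs quickly

# total time in seconds for 5 calls of each version
python_time = timeit.timeit(lambda: sum(sample), number=5)
numpy_time = timeit.timeit(lambda: np.sum(sample), number=5)

print("built-in sum: ", python_time)
print("np.sum:       ", numpy_time)
```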

<a id="minimum_and_maximum"></a> <br>
## Minimum and Maximum

In [None]:
# Similarly, Python has built-in min and max functions, used to find the minimum value
# and maximum value of any given array:

print("min(big_array): ", min(big_array))
print("max(big_array): ", max(big_array)) 

#NumPy’s corresponding functions have similar syntax, and again operate much more quickly:
print("\nnp.min(big_array): ",np.min(big_array))
print("np.max(big_array): ", np.max(big_array))

In [None]:
# Because NumPy executes the operation in compiled code, its version of the
# operation is computed much more quickly.

%timeit min(big_array)
%timeit np.min(big_array)

For min, max, sum, and several other NumPy aggregates, a shorter syntax is to use methods of the array object itself.

In [None]:

print("Minimum: ", big_array.min())
print("Maximum: ", big_array.max())
print("Sum: ", big_array.sum())

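# Whenever possible, make sure that you are using the NumPy version of these aggregates
# when operating on NumPy arrays! The function form and the method form are both
# NumPy aggregates, so their timings should be similar: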
%timeit np.min(big_array)
%timeit big_array.min()

<a id="multidimensional_aggregates"></a> <br>
## Multidimensional aggregates

* One common type of aggregation operation is an aggregate along a row or column.

In [None]:
# get 2-dimensional array
a = np.random.random((3,4))
print("a: \n", a)

print("\na.sum(): \n", a.sum())

* Aggregation functions take an additional argument specifying the axis along which the aggregate is computed. For example, we can find the minimum value within each column by specifying axis=0.

In [None]:
print("a: \n", a)

print("\nMinimum using a.min(axis=0): ", a.min(axis=0))
# or equivalently, with the function form
print("Minimum using np.min(a, axis=0): ", np.min(a, axis=0))


In [None]:
# Similarly, we can find the maximum value within each row by specifying axis=1:
print("a: \n", a)

print("\nMaximum using a.max(axis=1): ", a.max(axis=1))
# or equivalently, with the function form
print("Maximum using np.max(a, axis=1): ", np.max(a, axis=1))
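
The axis keyword names the dimension that is collapsed, not the one that is kept. A small deterministic sketch (the array `m` is made up for illustration) makes this easier to verify by hand than random data:

```python
import numpy as np

m = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

col_sums = m.sum(axis=0)   # collapses the rows: one value per column
row_sums = m.sum(axis=1)   # collapses the columns: one value per row

print("shape:    ", m.shape)   # (3, 4)
print("axis=0 -> ", col_sums)  # [12 15 18 21]
print("axis=1 -> ", row_sums)  # [ 6 22 38]
```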

In [None]:
# Most aggregates have a NaN-safe counterpart that ignores missing (NaN) values.
# Note that some of these NaN-safe functions were not added until NumPy 1.8,
# so they will not be available in older NumPy versions.
x = np.array([1, 2, np.nan, 4, 5])
print("x: ", x)
print("\nnp.sum(x): ", np.sum(x))
print("np.nansum(x): ", np.nansum(x))

print("\nnp.mean(x): ", np.mean(x))
print("np.nanmean(x): ", np.nanmean(x))

print("\nnp.std(x): ", np.std(x))
print("np.nanstd(x): ", np.nanstd(x))

# Be careful with argmin(): if the array contains a NaN value, it returns the index of
# the NaN rather than the index of the true minimum. Use nanargmin() to ignore NaN.
print("\nnp.argmin(x): ", np.argmin(x)) 
print("np.nanargmin(x): ", np.nanargmin(x))

<img src="https://3.bp.blogspot.com/-2pjqt9Ga6IM/W20-sIVK0II/AAAAAAAAXVM/BB74tRGTiwgcYTgezVLD3LKH7NFj4pjpgCLcBGAs/s1600/4214_t2-3.PNG" alt="cce" border="0">

<a id="broadcasting"></a> <br>
# Computation on Arrays: Broadcasting

Broadcasting is simply a set of rules for applying binary ufuncs (addition, subtraction, multiplication, etc.) on arrays of different sizes.

<a id="introducing_broadcasting"></a> <br>
## Introducing Broadcasting

In [None]:
a = np.array([0,1,2])
b = np.array([5,5,5])
print("a: ", a)
print("b: ", b)

c = a + b
print("c: ", c)

In [None]:
d = a + 5
print("d: ", d)

<a id="visulization_of_numpy_broadcasting"></a> <br>
## Visualization of NumPy broadcasting

<img src="https://jakevdp.github.io/PythonDataScienceHandbook/figures/02.05-broadcasting.png" alt="broadcasting" border="0">

#### Rules of Broadcasting
***Broadcasting in NumPy follows a strict set of rules to determine the interaction between the two arrays:***
    
* ***Rule 1***: If the two arrays differ in their number of dimensions, the shape of the one with fewer dimensions is padded with ones on its leading (left) side.
* ***Rule 2***: If the shape of the two arrays does not match in any dimension, the array with shape equal to 1 in that dimension is stretched to match the other shape.
* ***Rule 3***: If in any dimension the sizes disagree and neither is equal to 1, an error is raised.
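
The three rules can be written out as a short function. This is only an illustrative sketch: `broadcast_shape` is a hypothetical helper, not part of NumPy, and it is compared against the shape NumPy itself reports through `np.broadcast`.

```python
import numpy as np


def broadcast_shape(shape_a, shape_b):
    """Return the broadcast result shape of two shapes (hypothetical helper)."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with ones on its leading (left) side
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)

    result = []
    for size_a, size_b in zip(a, b):
        if size_a == size_b:
            result.append(size_a)
        elif size_a == 1:
            # Rule 2: the dimension of size 1 is stretched to the other size
            result.append(size_b)
        elif size_b == 1:
            result.append(size_a)
        else:
            # Rule 3: sizes disagree and neither is 1
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(result)


print(broadcast_shape((3,), (3, 3)))    # (3, 3)
print(broadcast_shape((3, 1), (3,)))    # (3, 3)
print(np.broadcast(np.empty((3, 1)), np.empty((3,))).shape)   # NumPy's own answer
```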

In [None]:
# We can similarly extend this to arrays of higher dimension. Observe the result when
# we add a one-dimensional array to a two-dimensional array:

a = np.array([1, 2, 3])
b = np.ones((3,3))

print("a: ", a)
print("b: \n", b)

# Here the one-dimensional array a is stretched, or broadcast, across the rows
# of b in order to match its shape.
c = b + a
print("\nc: \n", c)

In [None]:
# Here both a and b are stretched to match a common shape, and the result is a
# two-dimensional array:

a = np.arange(3) #(3,) 1 dimensional
b = np.arange(3)[:,np.newaxis] #(3,1) 2 dimensional

print("a: ", a)
print("b: \n", b)

c = a + b
print("\nc: \n", c)

<a id="broadcasting_examples"></a> <br>
## Broadcasting examples

In [None]:
#Let’s look at adding a two-dimensional array to a one-dimensional array:
a = np.ones((2, 3))
b = np.arange(3)

# Let’s consider an operation on these two arrays. The shapes of the arrays are:
# a.shape = (2, 3)
# b.shape = (3,)
# We see by rule 1 that the array b has fewer dimensions, so we pad it on the left with
# ones:
# a.shape -> (2, 3)
# b.shape -> (1, 3)
# By rule 2, we now see that the first dimension disagrees, so we stretch this dimension
# to match:
# a.shape -> (2, 3)
# b.shape -> (2, 3)
# The shapes match, and we see that the final shape will be (2, 3) :

print("a: \n", a)
print("b: \n", b)

print("a+b: \n", a+b)

In [None]:
# Let’s take a look at an example where both arrays need to be broadcast:
a = np.arange(3).reshape((3,1))
b = np.arange(3)
# Again, we’ll start by writing out the shape of the arrays:

# a.shape = (3, 1)
# b.shape = (3,)
# Rule 1 says we must pad the shape of b with ones:
# a.shape -> (3, 1)
# b.shape -> (1, 3)
# And rule 2 tells us that we upgrade each of these ones to match the corresponding
# size of the other array:
# a.shape -> (3, 3)
# b.shape -> (3, 3)
# Because the result matches, these shapes are compatible. We can see this here:

print("a: \n", a)
print("b: \n", b)

print("a+b: \n", a+b)

In [None]:
# Now let’s take a look at an example in which the two arrays are not compatible:

a = np.ones((3,2))
b = np.arange(3)

print("a: \n", a)
print("b: \n", b)

# This is just a slightly different situation than in the first example: the array a is
# transposed. How does this affect the calculation? The shapes of the arrays are:
# a.shape = (3, 2)
# b.shape = (3,)
# Again, rule 1 tells us that we must pad the shape of b with ones:
# a.shape -> (3, 2)
# b.shape -> (1, 3)
# By rule 2, the first dimension of b is stretched to match that of a:
# a.shape -> (3, 2)
# b.shape -> (3, 3)
# Now we hit rule 3: the final shapes do not match, so these two arrays are
# incompatible, as we can observe by attempting this operation:

# print(a+b) #ERROR! operands could not be broadcast together with shapes

In [None]:
# Reshaping b to (3, 1) makes the shapes compatible: (3, 2) and (3, 1) broadcast to (3, 2)
print(b[:, np.newaxis].shape)
a + b[:, np.newaxis]
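
The failure and the fix can be seen side by side in the sketch below. It catches the `ValueError` that NumPy raises for incompatible shapes, then adds a trailing axis to `b` so the operation succeeds.

```python
import numpy as np

a = np.ones((3, 2))
b = np.arange(3)

try:
    a + b                       # shapes (3, 2) and (3,) are not compatible
    message = None
except ValueError as err:
    message = str(err)
print("error: ", message)

# adding a trailing axis gives b the shape (3, 1), which broadcasts against (3, 2)
fixed = a + b[:, np.newaxis]
print("fixed shape: ", fixed.shape)   # (3, 2)
```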

<a id="comparisions_masks_booleanlogic"></a> <br>
# Comparisons, Masks, and Boolean Logic

<a id="comparison_operators_as_ufuncs"></a> <br>
## Comparison Operators as ufuncs

* Comparison operators let you examine and manipulate values in an array based on some criterion. For example, you might wish to count all values greater than a certain value, or perhaps remove all outliers that are above some threshold. In NumPy, Boolean masking is often the most efficient way to accomplish these types of tasks.

* The result of these comparison operators is always an array with a Boolean data type. All six of the standard comparison operations are available:

In [None]:
x = np.array([1, 2, 3, 4, 5])

print("x<3: ", x<3)       # less than
print("x>3: ", x>3)       # greater than
print("x<=3: ", x<=3)     # less than or equal
print("x>=3: ", x>=3)     # greater than or equal
print("x==3: ", x==3)     # equal
print("x!=3: ", x!=3)     # not equal

In [None]:
# It is also possible to do an element-by-element comparison of two arrays, and to
# include compound expressions:
a = 2*x
b = 2**x
print("x: ", x)
print("a: ", a)
print("b: ", b)
c = a == b
print("c: ", c)

<a id="comparision_operators_and_equivalent"></a> <br>
## Comparison operators and their equivalent

As in the case of arithmetic operators, the comparison operators are implemented as ufuncs in NumPy; for example, when you write x < 3, internally NumPy uses np.less(x, 3). A summary of the comparison operators and their equivalent ufuncs is shown here:

<img src="https://3.bp.blogspot.com/-ePv8m0F9BaI/W4nwyN2vb2I/AAAAAAAAXWs/zF0LYfQGYzI4u4JILeHSnH4-jRoUgk-TwCLcBGAs/s1600/4229_2.PNG" alt="Comparison operators and their equivalent" border="0">

In [None]:
x = np.random.randint(10, size=(3,4))
print("x: \n", x)

#x<6
print("\nnp.less(x, 6): \n", np.less(x, 6))

In [None]:
print("x: \n", x)

# To count the number of True entries in a Boolean array, np.count_nonzero is useful:
# number of values less than 6
print("1-: ", np.count_nonzero(x<6))

# Another way to get at this information is to use np.sum; in this case, False is
# interpreted as 0, and True is interpreted as 1:

print("2-: ", np.sum(x<6))

# Caution: NaN compares unequal to every value, so x != np.nan is True everywhere
# and the two lines below simply count all elements of x. Use np.isnan(x) to test for NaN.
print("3-: ", np.sum(x!=np.nan))
print("4-: ", np.count_nonzero(x!=np.nan))
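
Because NaN compares unequal to everything, including itself, a comparison against `np.nan` cannot be used to find missing values. A minimal sketch of the usual alternative, `np.isnan`, on a made-up float array:

```python
import numpy as np

values = np.array([1.0, np.nan, 3.0, np.nan])

always_true = values != np.nan        # True for every element, even the NaN entries
nan_mask = np.isnan(values)           # True only where the value is NaN

n_missing = np.count_nonzero(nan_mask)
n_valid = np.count_nonzero(~nan_mask)

print("values != np.nan: ", always_true)
print("np.isnan(values): ", nan_mask)
print("missing:", n_missing, " valid:", n_valid)
```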

In [None]:
print("x: \n", x)

# number of values less than 6 in each row
print("Number of values <6 in each row: ", np.sum(x<6, axis=1))

# number of values less than 6 in each column
print("Number of values <6 in each column: ", np.sum(x<6, axis=0))

In [None]:
print("x: \n", x)

# If we’re interested in quickly checking whether any or all the values are true, we can
# use (you guessed it) np.any() or np.all() :
# are there any values greater than 8?
print("np.any(x>8): ", np.any(x>8))

# are there any values less than zero?
print("np.any(x<0): ", np.any(x<0))

# are all values less than 10?
print("np.all(x<10): ", np.all(x<10))

# are all values equal to 6?
print("np.all(x==6): ", np.all(x==6))

In [None]:
print("x: \n", x)

# are all values in each row less than 8?
print("np.all(x<8, axis=1): ", np.all(x<8, axis=1))

# are all values in each column less than 3?
print("np.all(x<3, axis=0): ", np.all(x<3, axis=0))

In [None]:
print("x: \n", x)

print("x<5: \n", x<5)
print("x[x<5]: ", x[x<5])
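
Masking also covers the outlier case mentioned earlier. The sketch below uses made-up readings and an arbitrary threshold, counts the values above it, and keeps only the rest.

```python
import numpy as np

readings = np.array([12, 15, 11, 98, 14, 13, 105])   # hypothetical data
threshold = 50                                        # arbitrary cutoff for illustration

is_outlier = readings > threshold          # Boolean mask
n_outliers = np.count_nonzero(is_outlier)  # how many values exceed the threshold
cleaned = readings[~is_outlier]            # ~ inverts the mask, keeping the non-outliers

print("outliers: ", n_outliers)        # 2
print("cleaned:  ", cleaned)           # [12 15 11 14 13]
print("mean:     ", cleaned.mean())    # 13.0
```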

<a id="working_with_boolean_arrays"></a> <br>
## Working with Boolean Arrays 

In [None]:
# In Python, all nonzero integers evaluate as True.
bool(42), bool(0), bool(-1)

In [None]:
bool(42 and 0)

In [None]:
bool(42 or 0)

In [None]:
# When you have an array of Boolean values in NumPy, this can be thought of as a
# string of bits where 1 = True and 0 = False, and the operators & and | are applied
# to each pair of elements in turn:

a = np.array([1, 0, 1, 0, 1, 0], dtype=bool)
b = np.array([1, 1, 1, 0, 1, 1], dtype=bool)
c = a | b
print("a: ", a)
print("b: ", b)
print("c: ", c)

In [None]:
x = np.arange(10)
print("x: ", x)
print("(x > 4) & (x < 8): ", (x > 4) & (x < 8))
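
The keywords `and` and `or` evaluate the truth of a whole object, whereas `&` and `|` work element by element. As a sketch of the difference, the code below shows the ufunc equivalent `np.logical_and`, the inversion operator `~`, and the error that `and` raises on arrays.

```python
import numpy as np

x = np.arange(10)

mask = (x > 4) & (x < 8)                 # element-wise AND; the parentheses are required
same = np.logical_and(x > 4, x < 8)      # equivalent ufunc
inverse = ~mask                          # element-wise NOT

try:
    (x > 4) and (x < 8)                  # asks for the truth value of a whole array
    message = None
except ValueError as err:
    message = str(err)

print("mask:    ", mask)
print("inverse: ", inverse)
print("error:   ", message)
```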

<a id="sorting_arrays"></a> <br>
# Sorting Arrays

<a id="fast_sorting"></a> <br>
## Fast Sorting in NumPy: np.sort and np.argsort

In [None]:
x = np.array([2, 1, 4, 3, 5])
print("x before sorting: ", x)

# np.sort returns a sorted copy and leaves the input array unchanged
y = np.sort(x)
print("x after sorting: ", y)

In [None]:
x = np.array([2, 1, 4, 3, 5])
print("x before sorting: ", x)

# the sort method of the array sorts it in place
x.sort()
print("x after sorting: ", x)

In [None]:
# np.argsort returns the indices that would sort the array
x = np.array([2,1,4,3,5])
print("x: ", x)

y = np.argsort(x)
print("y: ", y)

print("x[y]: ", x[y])
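
The indices from `np.argsort` are most useful for reordering a second array in step with the first. A short sketch with made-up names and scores:

```python
import numpy as np

names = np.array(["Ana", "Ben", "Cy", "Dee"])   # hypothetical data
scores = np.array([72, 95, 60, 88])

order = np.argsort(scores)          # indices from lowest to highest score
ranked_names = names[order[::-1]]   # reverse the order to list the highest score first
ranked_scores = scores[order[::-1]]

print("order:  ", order)            # [2 0 3 1]
print("names:  ", ranked_names)     # ['Ben' 'Dee' 'Ana' 'Cy']
print("scores: ", ranked_scores)    # [95 88 72 60]
```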

<a id="sorting_rows_columns"></a> <br>
## Sorting along rows or columns

In [None]:
# A useful feature of NumPy’s sorting algorithms is the ability to sort along specific
# rows or columns of a multidimensional array using the axis argument. For example:
rand = np.random.RandomState(42)
x = rand.randint(0,10,(4,6))
print("x before sorting: \n", x)

# sort each column of x
y = np.sort(x, axis=0)
print("x after sorting along axis=0: \n", y)

In [None]:
print("x before sorting: \n", x)

# sort each row of x
y = np.sort(x, axis=1)
print("x after sorting along axis=1: \n", y)
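
Keep in mind that sorting along an axis treats each row or column as an independent array, so any relationship between values in the same row is lost. When whole rows should stay together, one option is to index with `np.argsort` of a single column, as in this sketch with a made-up table:

```python
import numpy as np

table = np.array([[3, 10],
                  [1, 30],
                  [2, 20]])

independent = np.sort(table, axis=0)            # each column sorted on its own
by_first_col = table[np.argsort(table[:, 0])]   # rows reordered by the first column

print("columns sorted independently: \n", independent)
# [[ 1 10]
#  [ 2 20]
#  [ 3 30]]
print("rows sorted by first column: \n", by_first_col)
# [[ 1 30]
#  [ 2 20]
#  [ 3 10]]
```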

<a id="references"></a> <br>
# References
* https://numpy.org/devdocs/user/quickstart.html
* https://numpy.org/doc/stable/reference/index.html