# <center>Introduction to NumPy</center>
![](http://m.memegen.com/o6i6hi.jpg)

![](https://bids.berkeley.edu/sites/default/files/styles/400x225/public/projects/numpy_project_page.jpg?itok=flrdydei)


# What is NumPy?
---

NumPy is a general-purpose array-processing package. It provides a high-performance multidimensional array object, and tools for working with these arrays.

It is the fundamental package for scientific computing with Python. It contains among other things:
- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data.
Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

# Installation
---

![](https://i.imgflip.com/21yk3f.jpg)

- **Mac** and **Linux** users can install NumPy via pip command:
    ```
    pip install numpy
    ```

- **Windows** users can usually run the same `pip install numpy` command, since pre-built NumPy wheels are published on PyPI. If that does not work for your setup, pre-built Windows installers for NumPy are also available from [here](http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy) (pick the one matching your system configuration and Python version) and can be installed manually.


Once you are done, just type this in the Python interpreter:
```python
import numpy as np
```

If no errors appear, congratulations! You have successfully installed NumPy.
If you are still experiencing issues, Stack Overflow is your friend!

Let's move ahead...


## Arrays in NumPy
---
NumPy’s main object is the homogeneous multidimensional array.
- It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers.
- In NumPy, dimensions are called *axes*. The number of axes is called the *rank* (available as the `ndim` attribute).
- NumPy’s array class is called **ndarray**. It is also known by the alias **array**. 

For example:
```python
[[ 1, 2, 3],
 [ 4, 2, 5]]
```  
This array has:
- rank = 2 (as it is 2-dimensional or it has 2 axes)
- first dimension (axis) has length 2, second dimension has length 3.
- overall shape can be expressed as: (2, 3)

In [None]:
import numpy as np

In [None]:
arr = np.array([[ 1, 2, 3],
                [ 4, 2, 5]])

In [None]:
# type of arr
type(arr)

In [None]:
# shape of arr
arr.shape

In [None]:
# type of elements inside array
arr.dtype
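
The rank and shape described above can also be read directly from the array. The following is a small sketch that reuses the same example array; the attributes `ndim`, `size`, and `itemsize` are not used elsewhere in this notebook.

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 2, 5]])

print(arr.ndim)      # number of axes (the rank)
print(arr.shape)     # length of each axis
print(arr.size)      # total number of elements
print(arr.itemsize)  # bytes per element (depends on the dtype)
```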

![](https://memegenerator.net/img/instances/400x/74259368.jpg)

## Array creation
---
There are various ways to create arrays in NumPy.

- For example, you can create an array from a regular Python **list** or **tuple** using the **array** function. The type of the resulting array is deduced from the type of the elements in the sequences.

In [None]:
mylist = [[1,2,3,4],
          [5,6,7,8]]

In [None]:
myarr = np.array(mylist, dtype='float')

In [None]:
myarr

- Often, the elements of an array are originally unknown, but its size is known. Hence, NumPy offers several functions to create arrays with **initial placeholder content**. These minimize the necessity of growing arrays, an expensive operation. **For example:** np.zeros, np.ones, np.full, np.empty, etc.

In [None]:
# create an array of size 3x4 filled with 0s
c = np.zeros((3,4))

In [None]:
c

In [None]:
# create an array of size 3x3 filled with 6s of complex type
d = np.full((3, 3), 6, dtype = 'complex')

In [None]:
d

In [None]:
# 2x2 array with random values
e = np.random.random((2,2))

In [None]:
e
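
The list above also mentions `np.ones` and `np.empty`, which are not shown in the cells. Here is a brief sketch of both; the variable names `ones` and `scratch` are only illustrative.

```python
import numpy as np

ones = np.ones((2, 3), dtype=int)   # 2x3 array filled with 1s
scratch = np.empty((2, 3))          # 2x3 array with arbitrary, uninitialized values
scratch[:] = 7.5                    # fill it before reading from it

print(ones)
print(scratch)
```

Since `np.empty` skips initialization, its contents should be treated as garbage until they are overwritten.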

- To create sequences of numbers, NumPy provides functions analogous to Python's `range` that return arrays instead of lists.
   - **arange:** returns evenly spaced values within a given interval. The **step** size is specified, and the stop value is excluded.
   - **linspace:** returns evenly spaced values within a given interval. The number of elements, **num**, is specified, and the stop value is included by default.

In [None]:
# create a sequence of integers from 0 up to (but not including) 30 in steps of 5
f = np.arange(0, 30, 5)

In [None]:
f

In [None]:
# create a sequence of 10 values in range 0 to 5
g = np.linspace(0, 5, 10)

In [None]:
g

In [None]:
# sequence of 10 random integers in range 0 to 9 (the upper bound 10 is excluded)
h = np.random.randint(0, 10, 10)
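
A common source of confusion with these functions is whether the end of the interval is included. The sketch below compares the two; the names `stepped`, `spaced`, and `half_open` are illustrative only.

```python
import numpy as np

stepped = np.arange(0, 30, 5)                       # stop value 30 is excluded
spaced = np.linspace(0, 5, 10)                      # stop value 5 is included by default
half_open = np.linspace(0, 5, 10, endpoint=False)   # same count, stop value excluded

print(stepped)        # [ 0  5 10 15 20 25]
print(spaced[-1])     # 5.0
print(half_open[-1])  # 4.5
```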

- **Reshaping array:** We can use the **reshape** method to reshape an array. Consider an array with shape (a1, a2, a3, ..., aN). We can reshape it into another array with shape (b1, b2, b3, ..., bM). The only required condition is:

    *a1 x a2 x a3 ... x aN = b1 x b2 x b3 ... x bM*

    (i.e. the total number of elements in the array remains unchanged.)

In [None]:
# reshaping 3X4 array to 2X2X3 array
arr = np.array([[1, 2, 3, 4],
                [5, 2, 4, 2],
                [1, 2, 0, 1]])
newarr = arr.reshape(2, 2, 3)

In [None]:
newarr
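
The size condition for reshaping can be tried out directly. This sketch uses a 12-element array built with `np.arange` (an assumption made for brevity) and also shows `-1`, which asks NumPy to infer one axis length.

```python
import numpy as np

arr = np.arange(12)            # 12 elements: 0, 1, ..., 11
cube = arr.reshape(2, 2, 3)    # 2 * 2 * 3 == 12, so this works
auto = arr.reshape(3, -1)      # -1 lets NumPy infer the length: 12 / 3 == 4

print(cube.shape)   # (2, 2, 3)
print(auto.shape)   # (3, 4)

try:
    arr.reshape(5, 3)          # 5 * 3 == 15 != 12, so this fails
except ValueError as err:
    print("reshape failed:", err)
```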

- **Flatten array:** We can use the **flatten** method to get a copy of the array collapsed into **one dimension**. It accepts an *order* argument. The default value is 'C' (row-major order). Use 'F' for column-major order.

In [None]:
arr = np.array([[1, 2, 3], [4, 5, 6]])

In [None]:
arr

In [None]:
flarr = arr.flatten()

In [None]:
flarr
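
To make the effect of the *order* argument concrete, here is a short sketch on the same array. It also checks that `flatten` returns a copy; the names `row_major` and `col_major` are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

row_major = arr.flatten()            # default order='C': [1 2 3 4 5 6]
col_major = arr.flatten(order='F')   # column by column:  [1 4 2 5 3 6]

row_major[0] = 99                    # flatten returns a copy ...
print(arr[0, 0])                     # ... so the original still holds 1
```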

## Array Indexing
---

Knowing the basics of array indexing is important for analysing and manipulating the array object.
NumPy offers many ways to do array indexing.

- **Slicing:** Just like lists in python, NumPy arrays can be sliced. As arrays can be multidimensional, you need to specify a slice for each dimension of the array.

In [None]:
# an exemplar array
arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

In [None]:
temp = arr[:2, ::2]

In [None]:
temp
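
One detail worth knowing about slicing is that a basic slice is a view onto the original data rather than a copy. The sketch below repeats the slice from the cell above and shows the consequence; the name `safe` is illustrative.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

temp = arr[:2, ::2]     # rows 0 and 1, every second column (columns 0 and 2)
print(temp)

temp[0, 0] = 100        # writing through the view ...
print(arr[0, 0])        # ... changes the original array as well

safe = arr[:2, ::2].copy()   # take an explicit copy to decouple the two
```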

- **Integer array indexing:** In this method, lists are passed for indexing for each dimension. One to one mapping of corresponding elements is done to construct a new arbitrary array.

In [None]:
temp = arr[[0, 1, 2, 3], [3, 2, 1, 0]]

In [None]:
temp
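
The one-to-one mapping can be spelled out by pairing the row and column lists by hand. This is only a sketch for comparison; `rows`, `cols`, `picked`, and `by_hand` are illustrative names.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

rows = [0, 1, 2, 3]
cols = [3, 2, 1, 0]

picked = arr[rows, cols]   # elements at (0,3), (1,2), (2,1), (3,0)
by_hand = np.array([arr[r, c] for r, c in zip(rows, cols)])

print(picked)    # [4. 6. 0. 3.]
```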

- **Boolean array indexing:** This method is used when we want to pick the elements of an array that satisfy some condition.

In [None]:
cond = arr > 0

In [None]:
cond

In [None]:
# array elements which satisfy the condition
temp = arr[cond]

In [None]:
temp
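
Boolean masks can also be combined and used for assignment. The following sketch assumes the same example array; `mask` and `clipped` are illustrative names.

```python
import numpy as np

arr = np.array([[-1, 2, 0, 4],
                [4, -0.5, 6, 0],
                [2.6, 0, 7, 8],
                [3, -7, 4, 2.0]])

# combine conditions with & (and) or | (or); each condition needs parentheses
mask = (arr > 0) & (arr < 5)
print(arr[mask])            # elements strictly between 0 and 5

# a mask on the left-hand side assigns to the selected elements
clipped = arr.copy()
clipped[clipped < 0] = 0    # replace negative values with 0
print(clipped)
```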

## Basic operations
---

A plethora of built-in arithmetic functions is provided in NumPy.

- **Operations on single array:** We can use overloaded arithmetic operators to do element-wise operations on an array to create a new array. In the case of the +=, -=, *= operators, the existing array is modified.

**Here are some examples:**

In [None]:
a = np.array([1, 2, 5, 3])

In [None]:
# add 1 to every element
a+1

In [None]:
# subtract 3 from each element
a-3

In [None]:
# multiply each element by 10
a*10

In [None]:
# square each element
a**2

In [None]:
# modify existing array
a *= 2

In [None]:
a
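
The difference between creating a new array and modifying the existing one shows up when a second name refers to the same array. A minimal sketch, with `alias` and `b` as illustrative names:

```python
import numpy as np

a = np.array([1, 2, 5, 3])
alias = a          # second name for the same array object

b = a * 2          # creates a new array; a is untouched
print(a)           # [1 2 5 3]

a *= 2             # modifies the existing array in place
print(alias)       # [ 2  4 10  6], the alias sees the change
```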

In [None]:
# sample array
a = np.array([[1, 2, 3], [3, 4, 5], [9, 6, 0]])

In [None]:
a

In [None]:
# transpose of array
a.T


- **Unary operators:** Many unary operations are provided as methods of the **ndarray** class. These include sum, min, max, etc. These functions can also be applied row-wise or column-wise by setting the axis parameter.

**Here are some examples:**

In [None]:
arr = np.array([[1, 5, 6], 
                [4, 7, 2], 
                [3, 1, 9]])

In [None]:
# maximum element of array
arr.max()

In [None]:
# row-wise maximum elements
arr.max(axis=1)

In [None]:
# column wise minimum elements
arr.min(axis=0)

In [None]:
# sum of all array elements
arr.sum()

In [None]:
# sum of each row
arr.sum(axis=1)

In [None]:
# cumulative sum along each row
arr.cumsum(axis=1)
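
The axis parameter names the axis that gets collapsed, which is easy to mix up. This sketch reuses the array above; the `keepdims` argument is an addition not used elsewhere in this notebook.

```python
import numpy as np

arr = np.array([[1, 5, 6],
                [4, 7, 2],
                [3, 1, 9]])

col_sums = arr.sum(axis=0)   # collapse the rows: one value per column
row_sums = arr.sum(axis=1)   # collapse the columns: one value per row

print(col_sums)   # [ 8 13 17]
print(row_sums)   # [12 13 13]

# keepdims=True keeps the collapsed axis with length 1
print(arr.sum(axis=1, keepdims=True).shape)   # (3, 1)
```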

---
- **Binary operators:** These operations apply to arrays elementwise, and a new array is created. You can use all basic arithmetic operators like +, -, /, *, etc. In the case of the +=, -=, *= operators, the existing array is modified.

**Here are some examples:**

In [None]:
a = np.array([[1, 2], 
              [3, 4]])
b = np.array([[4, 3], 
              [2, 1]])

In [None]:
# sum of arrays
a + b

In [None]:
# multiply arrays (elementwise multiplication)
a*b

In [None]:
# matrix multiplication
a.dot(b)
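
Elementwise and matrix multiplication are easy to confuse, so here is a side-by-side sketch. It also uses the `@` operator, which is an alternative spelling of matrix multiplication not used elsewhere in this notebook.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[4, 3],
              [2, 1]])

elementwise = a * b    # multiplies matching entries: [[4, 6], [6, 4]]
matrix = a @ b         # matrix product, same as a.dot(b): [[8, 5], [20, 13]]

print(elementwise)
print(matrix)
```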


- **Universal functions (ufunc):** NumPy provides familiar mathematical functions such as sin, cos, exp, etc. These functions also operate elementwise on an array, producing an array as output.

**Note:** All the operations we did above using overloaded operators can be done using ufuncs like np.add, np.subtract, np.multiply, np.divide, etc. Reductions such as np.sum are ordinary functions built on top of ufuncs (for example, np.add.reduce).

In [None]:
a = np.array([0, np.pi/2, np.pi])

In [None]:
a

In [None]:
np.sin(a)

In [None]:
np.exp(a)

In [None]:
np.sqrt(a)
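
The correspondence between operators and ufuncs mentioned in the note can be checked directly. A small sketch, with `x` and `y` as illustrative arrays:

```python
import numpy as np

x = np.array([1.0, 4.0, 9.0])
y = np.array([1.0, 2.0, 3.0])

print(np.add(x, y))        # same as x + y
print(np.multiply(x, y))   # same as x * y
print(np.sqrt(x))          # [1. 2. 3.]

# np.sum gives the same result as reducing with the np.add ufunc
print(np.add.reduce(x), np.sum(x))   # 14.0 14.0
```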

## Sorting array
There is a simple **np.sort** function for sorting NumPy arrays. It returns a sorted copy and leaves the original array unchanged.
Let's explore it a bit.

In [None]:
a = np.array([[1, 4, 2],
              [3, 4, 6],
              [0, -1, 5]])

In [None]:
# array elements in sorted order
np.sort(a, axis=None)

In [None]:
# sort array row wise
np.sort(a, axis=1)

In [None]:
# specify sort algorithm
np.sort(a, axis = 0, kind = 'mergesort')
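
Two related tools are worth a quick sketch here: the in-place `sort` method of an array and `np.argsort`, which returns the indices that would sort an array. Neither is used elsewhere in this notebook, and the names `v`, `order`, and `inplace` are illustrative.

```python
import numpy as np

a = np.array([[1, 4, 2],
              [3, 4, 6],
              [0, -1, 5]])

sorted_copy = np.sort(a, axis=1)   # new array; a is unchanged
inplace = a.copy()
inplace.sort(axis=1)               # sorts this array itself, returns None

v = np.array([3, 1, 2])
order = np.argsort(v)              # indices that sort v: [1 2 0]
print(v[order])                    # [1 2 3]
```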

In [None]:
# example to show sorting of structured array
## set alias names for dtypes
dtypes = [('name', 'S10'), ('grad_year', int), ('cgpa', float)]
## values to be put in array
values = [('Hrithik', 2009, 8.5), ('Ajay', 2008, 8.7), ('Pankaj', 2008, 7.9), ('Aakash', 2009, 9.0)]
## creating array
arr = np.array(values, dtype = dtypes)
print("\nArray sorted by names:\n", np.sort(arr, order='name'))
print("Array sorted by graduation year and then cgpa:\n", np.sort(arr, order=['grad_year', 'cgpa']))

In [None]:
# array sorted by name
np.sort(arr, order = 'name')

In [None]:
# array sorted by graduation year and then cgpa
np.sort(arr, order = ['grad_year', 'cgpa'])

# Stacking and Splitting

Several arrays can be stacked together along different axes.

- **np.vstack:** To stack arrays along vertical axis.

- **np.hstack:** To stack arrays along horizontal axis.

- **np.column_stack:** To stack 1-D arrays as columns into 2-D arrays.

- **np.concatenate:** To stack arrays along specified axis (axis is passed as argument).

In [None]:
a = np.array([[1, 2],
              [3, 4]])

b = np.array([[5, 6],
              [7, 8]])

In [None]:
# vertical stacking
np.vstack((a, b))

In [None]:
# horizontal stacking
np.hstack((a, b))

In [None]:
# new array
c = [5, 6]

In [None]:
# stacking array c as a column to array a
np.column_stack((a, c))

In [None]:
# stacking array c as a row to array a
# (np.vstack is used here because np.row_stack is a deprecated alias in recent NumPy releases)
np.vstack((a, c))

In [None]:
# concatenation along axis 1 (same result as horizontal stacking)
np.concatenate((a, b), axis=1)
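
The stacking helpers can be related to `np.concatenate` by choosing the axis explicitly. The sketch below checks that relationship on the same two arrays.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6],
              [7, 8]])

down = np.concatenate((a, b), axis=0)     # shape (4, 2), like np.vstack
across = np.concatenate((a, b), axis=1)   # shape (2, 4), like np.hstack

print(np.array_equal(down, np.vstack((a, b))))     # True
print(np.array_equal(across, np.hstack((a, b))))   # True
```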

For splitting, we have these functions:

- **np.hsplit:** Split array along horizontal axis.

- **np.vsplit:** Split array along vertical axis.

- **np.array_split:** Split array along specified axis. Unlike the two functions above, it allows sections of unequal size.

In [None]:
a = np.array([[1, 3, 5, 7, 9, 11],
              [2, 4, 6, 8, 10, 12]])

In [None]:
# horizontal splitting in 2 parts
np.hsplit(a, 2)

In [None]:
# vertical splitting in 2 parts
np.vsplit(a, 2)
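
`np.array_split` is listed above but not shown in the cells. Here is a short sketch of how it handles a split that does not divide evenly, assuming the same 2x6 array; `parts` is an illustrative name.

```python
import numpy as np

a = np.array([[1, 3, 5, 7, 9, 11],
              [2, 4, 6, 8, 10, 12]])

# 6 columns cannot be divided evenly into 4 parts, so np.hsplit raises an error
try:
    np.hsplit(a, 4)
except ValueError as err:
    print("hsplit failed:", err)

# np.array_split allows unequal sections: column counts 2, 2, 1, 1
parts = np.array_split(a, 4, axis=1)
print([p.shape for p in parts])   # [(2, 2), (2, 2), (2, 1), (2, 1)]
```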

# Broadcasting 

The term broadcasting describes how NumPy treats arrays with different shapes during arithmetic operations. Subject to certain constraints, the smaller array is "broadcast" across the larger array so that they have compatible shapes. Broadcasting provides a means of vectorizing array operations so that looping occurs in C instead of Python. It does this without making needless copies of data and usually leads to efficient algorithm implementations. There are also cases where broadcasting is a bad idea because it leads to inefficient use of memory that slows computation.


NumPy operations are usually done element by element, which requires two arrays to have exactly the same shape. NumPy's broadcasting rule relaxes this constraint when the arrays' shapes meet certain conditions. The simplest broadcasting example occurs when an array and a scalar value are combined in an operation.

Consider the example given below:

In [None]:
a = np.array([1.0, 2.0, 3.0])

In [None]:
b = [2.0, 2.0, 2.0]

In [None]:
a*b

In [None]:
b = np.array([2.0])

In [None]:
a*b

![](http://scipy.github.io/old-wiki/pages/image0013830.gif?action=AttachFile&do=get&target=image001.gif)

**In the above example, the scalar b is stretched to become an array with the same shape as a so the shapes are compatible for element-by-element multiplication.**

We can think of the scalar b being stretched during the arithmetic operation into an array with the same shape as a. The new elements in b, as shown in the above figure, are simply copies of the original scalar. The stretching analogy is only conceptual, though.
NumPy is smart enough to use the original scalar value without actually making copies so that broadcasting operations are as memory and computationally efficient as possible. Because the second example moves less memory around during the multiplication (b holds a single value, not a full array), it can be faster than the first: the broadcasting write-up listed in the references reports a speed-up of about 10% using standard NumPy on Windows 2000 with one-million-element arrays.
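
The point that no copies are made can be illustrated with `np.broadcast_to`, a function not used elsewhere in this notebook. This is only a sketch: it builds the "stretched" version of b as a read-only view and inspects its strides.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0])

stretched = np.broadcast_to(b, a.shape)   # looks like [2. 2. 2.]
print(stretched)

# a stride of 0 means every position reads the same stored value: no copies
print(stretched.strides)                  # (0,)
print(np.shares_memory(stretched, b))     # True

print(a * b)                              # [2. 4. 6.]
```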

## The Broadcasting Rule

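In order to broadcast, the shapes of the two arrays are compared axis by axis, starting from the trailing (rightmost) axes. Two axes are compatible when they have the same size or when one of them has size **one**. Missing leading axes are treated as having size one.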

Let us see some examples:
```
A(2-D array): 4 x 3
B(1-D array):     3
Result      : 4 x 3
```

```
A(4-D array): 7 x 1 x 6 x 1
B(3-D array):     3 x 1 x 5
Result      : 7 x 3 x 6 x 5
```

But this would be a mismatch:
```
A: 4 x 3
B:     4
```
Now, let us see an example where both arrays get stretched.

In [None]:
a = np.array([0.0, 10.0, 20.0, 30.0])
b = np.array([0.0, 1.0, 2.0])

In [None]:
a[:, np.newaxis]

In [None]:
a[:, np.newaxis] + b

![](http://scipy.github.io/old-wiki/pages/image004de9e.gif?action=AttachFile&do=get&target=image004.gif)
**In some cases, broadcasting stretches both arrays to form an output array larger than either of the initial arrays.**
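
The shape examples from the broadcasting rule can be verified by adding arrays of zeros with those shapes and looking at the result. This is a sketch for checking shapes only; the helper name `result_shape` is hypothetical.

```python
import numpy as np

def result_shape(shape_a, shape_b):
    """Return the shape produced by adding arrays of the given shapes."""
    return (np.zeros(shape_a) + np.zeros(shape_b)).shape

print(result_shape((4, 3), (3,)))                # (4, 3)
print(result_shape((7, 1, 6, 1), (3, 1, 5)))     # (7, 3, 6, 5)

# trailing axes 3 and 4 differ and neither is 1, so this is a mismatch
try:
    result_shape((4, 3), (4,))
except ValueError as err:
    print("broadcast failed:", err)
```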

# Working with datetime


NumPy has core array data types which natively support datetime functionality. The data type is called “datetime64”, so named because “datetime” is already taken by the datetime library included in Python.

Consider the cells below for some examples:

In [None]:
# creating a date
today = np.datetime64('2017-12-31')

In [None]:
today

In [None]:
# get year in numpy datetime object
np.datetime64(today, 'Y')

In [None]:
# creating array of dates in a month
dates = np.arange('2017-12', '2018-01', dtype='datetime64[D]')

In [None]:
dates

In [None]:
today in dates

In [None]:
# arithmetic operation on dates
dur = np.datetime64('2018-05-22') - np.datetime64('2017-05-22')

In [None]:
dur

In [None]:
np.timedelta64(dur, 'W')

In [None]:
# sorting dates
a = np.array(['2017-02-12', '2016-10-13', '2019-05-22'], dtype='datetime64')

In [None]:
np.sort(a)
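
Dates and durations can also be combined with ordinary arithmetic. The sketch below adds a `timedelta64` to a date and converts a duration to a plain number of days; the names `start` and `next_day` are illustrative.

```python
import numpy as np

start = np.datetime64('2017-12-31')
next_day = start + np.timedelta64(1, 'D')   # adding one day rolls over the year
print(next_day)                             # 2018-01-01

dates = np.arange('2017-12', '2018-01', dtype='datetime64[D]')
print(len(dates))                           # 31 days in December 2017

dur = np.datetime64('2018-05-22') - np.datetime64('2017-05-22')
print(dur / np.timedelta64(1, 'D'))         # 365.0
```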

# Linear algebra in NumPy


The **Linear Algebra** module of NumPy offers various methods to apply linear algebra on any numpy array.

You can find:
- rank, determinant, trace, etc. of an array
- eigenvalues of matrices
- matrix and vector products (dot, inner, outer, etc.) and integer matrix powers
- solutions of linear or tensor equations

and much more!

Now, let us assume that we want to solve this linear equation set:
```
x + 2*y = 8
3*x + 4*y = 18
```
This problem can be solved using the **linalg.solve** function as shown in the example below:

In [None]:
# coefficients
a = np.array([[1, 2], [3, 4]])
# constants
b = np.array([8, 18])

np.linalg.solve(a, b)
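
A solution returned by `np.linalg.solve` can be checked by substituting it back into the equations. A brief sketch, using `np.allclose` to allow for floating-point rounding:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # coefficients
b = np.array([8, 18])            # constants

x = np.linalg.solve(a, b)
print(x)                         # approximately [2. 3.], i.e. x = 2, y = 3

# substituting back: a @ x should reproduce b
print(np.allclose(a @ x, b))     # True
```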

Consider the example below which explains how we can use numpy to do some matrix operations.

In [None]:
A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

In [None]:
# rank of matrix
np.linalg.matrix_rank(A)

In [None]:
# trace of matrix
np.trace(A)

In [None]:
# determinant of matrix
np.linalg.det(A)

In [None]:
# inverse of matrix
np.linalg.inv(A)

In [None]:
# matrix power (A raised to the integer power 3, i.e. A @ A @ A)
np.linalg.matrix_power(A, 3)
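
Eigenvalues are mentioned in the list above but not shown in the cells. The following sketch computes them with `np.linalg.eig` and sanity-checks a few of the results on the same matrix; the names `values`, `vectors`, and `inv` are illustrative.

```python
import numpy as np

A = np.array([[6, 1, 1],
              [4, -2, 5],
              [2, 8, 7]])

values, vectors = np.linalg.eig(A)   # eigenvalues and eigenvectors (as columns)

# each column v of `vectors` satisfies A @ v == value * v
print(np.allclose(A @ vectors, vectors * values))   # True

# the inverse times the original matrix gives the identity
inv = np.linalg.inv(A)
print(np.allclose(A @ inv, np.eye(3)))              # True

# matrix_power(A, 3) is the same as multiplying A by itself three times
print(np.array_equal(np.linalg.matrix_power(A, 3), A @ A @ A))   # True
```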

## Saving and loading numpy arrays


The ``.npy`` format is the standard binary file format in NumPy for
persisting a **single** arbitrary NumPy array on disk. The format stores all
of the shape and dtype information necessary to reconstruct the array
correctly even on another machine with a different architecture.
The format is designed to be as simple as possible while achieving
its limited goals.

The ``.npz`` format is the standard format for persisting **multiple** NumPy
arrays on disk. A ``.npz`` file is a zip file containing multiple ``.npy``
files, one for each array.

- **np.save(filename, array)** : saves a single array in ``npy`` format.

- **np.savez(filename, array_1[, array_2])** : saves multiple numpy arrays in ``npz`` format.

- **np.load(filename)** : load a ``npy`` or ``npz`` format file.

In [None]:
a = np.array([[1,2,3],
             [4,5,6]])

b = np.array([[6,5,4],
              [3,2,1]])

In [None]:
np.save("a.npy", a)

In [None]:
arr = np.load("a.npy")

In [None]:
arr

In [None]:
np.savez("ab.npz", a=a, b=b)

In [None]:
X = np.load("ab.npz")

In [None]:
X['a']

In [None]:
X['b']
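
Here is a self-contained round trip for the `.npz` format. It writes to a temporary directory (an assumption made so the sketch leaves no files behind) and uses the `files` attribute of the loaded archive to list the stored array names.

```python
import os
import tempfile

import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([[6, 5, 4],
              [3, 2, 1]])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "ab.npz")
    np.savez(path, a=a, b=b)          # keyword names become the array names

    with np.load(path) as archive:    # closing the archive releases the file
        names = sorted(archive.files)
        a_back = archive['a']
        b_back = archive['b']

print(names)                          # ['a', 'b']
print(np.array_equal(a, a_back))      # True
```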

References:
- [broadcasting](http://scipy.github.io/old-wiki/pages/EricsBroadcastingDoc)
- [datetime in numpy](https://docs.scipy.org/doc/numpy/reference/arrays.datetime.html#arrays-dtypes-dateunits)
- [linear algebra in numpy](https://docs.scipy.org/doc/numpy/reference/routines.linalg.html)

![](https://i.pinimg.com/736x/c8/90/b2/c890b24d364d6ae6413c37b70e6640ae--math-jokes-math-humor.jpg)