# NUMPY

NumPy is a Python library for scientific computing. It provides an N-dimensional array type (`ndarray`) and is commonly used for linear algebra operations.

# 1.1 ARRAY CREATION

We can create a NumPy array by passing a Python list (or a nested list or tuple) to `np.array()`.

* 1-D Array
* 2-D Array
* 3-D Array
* Attributes
    * ndim
    * shape
    * size
    * dtype
    * itemsize
    * databuffer
* Axis
* zeros
* ones
* arange
* linspace
* full
* eye
* empty
* rand and randn


## 1D Array

In [162]:
import numpy as np   #first import the numpy library
a = np.array([1,2,3])  #creation of 1d array
a

array([1, 2, 3])

## 2D Array

In [163]:
b=np.array([(1,2,3),(4,5,6)])  #creation of 2d array
b

array([[1, 2, 3],
       [4, 5, 6]])

## 3D Array (N-Dimensional Array)

In [164]:
c = np.array( [[[0,  1,  2],               # a 3D array (two stacked 2D arrays)
                [3, 4, 5]],
                [[10, 11, 12],              #N-Dimensional array
                [13, 14, 15]]])
c

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[10, 11, 12],
        [13, 14, 15]]])

### Some of the important attributes to note are:

i) ndim --- number of dimensions (axes) of the array

ii) shape --- length of the array along each dimension

iii) size --- total number of elements in the array

iv) dtype --- type of the elements in the array

v) itemsize --- size in bytes of each element in the array

vi) data --- the buffer object pointing to the start of the array's data

`dtype` and `data` are discussed in later sections.

### for 1-D array

In [165]:
a

array([1, 2, 3])

In [166]:
a.shape 

(3,)

In [167]:
len(a)   #length of the array

3

In [168]:
a.ndim   

1

In [169]:
a.size 

3

In [170]:
a.itemsize

4

In [171]:
a.data 

<memory at 0x000001EED19A3A08>

### for 2-D array

In [172]:
b

array([[1, 2, 3],
       [4, 5, 6]])

In [8]:
b.shape  

(2, 3)

In [9]:
len(b)  

2

In [10]:
b.ndim 

2

In [11]:
b.size 

6

In [173]:
b.itemsize 

4

In [174]:
b.data

<memory at 0x000001EED1A5CA68>

### for N-D array

In [12]:
c

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[10, 11, 12],
        [13, 14, 15]]])

In [13]:
c.shape 

(2, 2, 3)

In [14]:
len(c)  #length of the array

2

In [15]:
c.ndim  

3

In [16]:
c.size 

12

In [175]:
c.itemsize

4

In [176]:
c.data

<memory at 0x000001EED034DE58>

## axis

axis=0 --> runs down the rows; reducing along it gives one result per column

axis=1 --> runs across the columns; reducing along it gives one result per row

In [17]:
import numpy as np
x=np.array([(1,7,3),(4,5,6)])  #here no.of rows=2,no.of cols=3
x

array([[1, 7, 3],
       [4, 5, 6]])

In [18]:
##axis=0
x.sum(axis=0)  # sum over the 2 rows: one result per column

array([ 5, 12,  9])

In [19]:
x.min(axis=0)  # minimum over the 2 rows: one result per column

array([1, 5, 3])

In [20]:
x.max(axis=0)  # maximum over the 2 rows: one result per column

array([4, 7, 6])

In [21]:
##axis=1
x.sum(axis=1)  # sum over the 3 columns: one result per row

array([11, 15])

In [22]:
x.min(axis=1)  # minimum over the 3 columns: one result per row

array([1, 4])

In [23]:
x.max(axis=1)  # maximum over the 3 columns: one result per row

array([7, 6])
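
A reduction along an axis removes that axis from the result's shape. The following is a minimal sketch of that rule; the array name `grid` is hypothetical, and `keepdims` is an optional argument of `sum` that is not used elsewhere in this section.

```python
import numpy as np

grid = np.array([(1, 7, 3), (4, 5, 6)])  # shape (2, 3): 2 rows, 3 columns

# Reducing over axis 0 removes the row axis: one value per column.
col_sums = grid.sum(axis=0)
print(col_sums.shape)  # (3,)

# Reducing over axis 1 removes the column axis: one value per row.
row_sums = grid.sum(axis=1)
print(row_sums.shape)  # (2,)

# keepdims=True keeps the reduced axis with length 1.
kept = grid.sum(axis=1, keepdims=True)
print(kept.shape)  # (2, 1)
```

Keeping the reduced axis is convenient when the result has to be combined with the original array afterwards.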

## zeros

The `zeros` function creates an array containing any number of zeros:

In [144]:
e = np.zeros(2)  # a 1D shape can be given as a plain integer; (2) is just 2
e

array([0., 0.])

In [145]:
e.ndim

1

It's just as easy to create a 2D array (i.e. a matrix) by providing a tuple with the desired number of rows and columns. For example, here's a 3x4 matrix:

In [147]:
f=np.zeros((3,4))
f

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

## Note

* In NumPy, each dimension is called an **axis**.

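* The number of axes is called the **rank** (available as the `ndim` attribute). This is not the same as the rank of a matrix in linear algebra.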
   * For example, the above 3x4 matrix is an array of rank 2 (it is 2-dimensional).
   * The first axis has length 3, the second has length 4.
* An array's list of axis lengths is called the **shape** of the array.
    * For example, the above matrix's shape is `(3, 4)`.
    * The rank is equal to the shape's length.
* The **size** of an array is the total number of elements, which is the product of all axis lengths (e.g. 3*4=12).
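
The relationships in this note can be checked directly. The sketch below uses a hypothetical array `demo` with an explicit `dtype`, so that `itemsize` does not depend on the platform's default integer type. The `nbytes` attribute is not mentioned above; it is included only to show how `size` and `itemsize` combine.

```python
import numpy as np

# Hypothetical example array; int32 is chosen explicitly so itemsize is 4 everywhere.
demo = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)

# The rank (ndim) is the length of the shape tuple.
print(demo.ndim == len(demo.shape))

# size is the product of the axis lengths.
print(demo.size == demo.shape[0] * demo.shape[1])

# nbytes is the total memory used by the elements: size * itemsize.
print(demo.nbytes, demo.size * demo.itemsize)
```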

In [148]:
f.ndim

2

In [149]:
f.size

12

You can also create an N-dimensional array of arbitrary rank. For example, here's a 3D array (rank=3), with shape `(2,3,4)`:

In [152]:
g=np.zeros((2,3,4))
g

array([[[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]],

       [[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]]])

In [153]:
g.ndim

3

## ones

The `ones` function creates an array containing any number of ones:

In [30]:
e = np.ones(2)  # same as np.ones((2)): (2) is just the integer 2
e

array([1., 1.])

In [31]:
e.ndim

1

In [32]:
f=np.ones((2,3))
f

array([[1., 1., 1.],
       [1., 1., 1.]])

In [33]:
f.ndim

2

In [150]:
g=np.ones((2,3,4))
g

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [151]:
g.ndim

3

## arange

We can create an `ndarray` using NumPy's `arange` function, which is similar to Python's built-in `range` function:

In [36]:
x=np.arange(1, 5)
x

array([1, 2, 3, 4])

In [37]:
y=np.arange(1.0, 5.0) #works with float type
y

array([1., 2., 3., 4.])

In [38]:
z=np.arange(1, 5, 0.5) #we can provide the step parameter
z

array([1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])

However, when dealing with floats, the exact number of elements in the array is not always predictable. For example, consider this:

In [16]:
import numpy as np
print(np.arange(0, 5/3, 1/3)) # depending on floating point errors, the max value is 4/3 or 5/3.
print(np.arange(0, 5/3, 0.333333333))
print(np.arange(0, 5/3, 0.333333334))
print(np.arange(0, 100, 11))

[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[0.         0.33333333 0.66666667 1.         1.33333333 1.66666667]
[0.         0.33333333 0.66666667 1.         1.33333334]
[ 0 11 22 33 44 55 66 77 88 99]


## linspace

For this reason, it is generally preferable to use the `linspace` function instead of `arange` when working with floats. The `linspace` function returns an array containing a specific number of points evenly distributed between two values (note that the maximum value is *included*, contrary to `arange`):

In [41]:
np.linspace(0, 5/3, 6)

array([0.        , 0.33333333, 0.66666667, 1.        , 1.33333333,
       1.66666667])

In [40]:
np.linspace(0, 2, 6) #no.of samples=6 taken from range 0 to 2

array([0. , 0.4, 0.8, 1.2, 1.6, 2. ])

In [17]:
print(np.linspace(0, 100, 11))

[  0.  10.  20.  30.  40.  50.  60.  70.  80.  90. 100.]
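
Two optional arguments of `linspace` are worth knowing about; neither is used in the cells above. The following sketch assumes nothing beyond NumPy itself, and the variable names are hypothetical.

```python
import numpy as np

# Six points from 0 to 5/3, endpoint included (the default).
pts = np.linspace(0, 5/3, 6)
print(len(pts))

# endpoint=False excludes the stop value, like arange does.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)

# retstep=True also returns the spacing between consecutive points.
values, step = np.linspace(0, 100, 11, retstep=True)
print(step)
```

Because the number of points is given explicitly, the length of the result never depends on floating point rounding.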


## full

Creates an array of the given shape initialized with the given value. Here's a 3x4 matrix full of `π`.

In [42]:
np.full((3,4), np.pi)

array([[3.14159265, 3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265, 3.14159265],
       [3.14159265, 3.14159265, 3.14159265, 3.14159265]])

## eye

In [43]:
np.eye(2,2) #identity matrix

array([[1., 0.],
       [0., 1.]])
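
As a short sketch of what `eye` can do beyond a square identity matrix: `np.eye(N, M)` builds an N x M array with ones on the diagonal and zeros elsewhere, and the optional `k` argument shifts that diagonal. The variable names below are hypothetical.

```python
import numpy as np

ident = np.eye(3)       # 3x3 identity matrix
rect = np.eye(2, 3)     # 2 rows, 3 columns, ones on the main diagonal
upper = np.eye(3, k=1)  # ones on the first diagonal above the main one

print(ident)
print(rect)
print(upper)
```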

## empty

An uninitialized 3x2 array (its content is not predictable, as it is whatever is in memory at that point):

In [44]:
np.empty((3,2))

array([[0.        , 0.33333333],
       [0.66666667, 1.        ],
       [1.33333333, 1.66666667]])

## rand and randn
A number of functions are available in NumPy's `random` module to create `ndarray`s initialized with random values.
For example, here is a 3x4 matrix initialized with random floats between 0 and 1 (uniform distribution):

In [45]:
np.random.rand(3,4)

array([[0.53613064, 0.10726284, 0.91155648, 0.37071323],
       [0.36096173, 0.28614762, 0.14360321, 0.38847909],
       [0.80726873, 0.73305052, 0.95781859, 0.61018227]])

In [46]:
np.random.randn(3,4)

array([[ 1.16627967, -1.12778849, -1.75879309, -0.37224191],
       [-0.77007944, -0.75711359,  0.13886288, -0.22898301],
       [-1.90196252,  1.07301726, -0.42948683, -0.86682944]])
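
`rand` samples from a uniform distribution over [0, 1), while `randn` samples from the standard normal distribution (mean 0, variance 1). The sketch below checks both properties on a larger sample. The seed value is arbitrary and only there to make the run repeatable; the variable names are hypothetical.

```python
import numpy as np

np.random.seed(42)  # arbitrary seed, chosen only to make the run repeatable

uniform = np.random.rand(1000)   # uniform samples in [0, 1)
normal = np.random.randn(1000)   # standard normal samples (mean 0, variance 1)

print(uniform.min() >= 0.0, uniform.max() < 1.0)
print(normal.mean(), normal.std())  # close to 0 and 1, but not exactly
```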

## dtype

NumPy's `ndarray`s are efficient in part because all their elements must have the same type (usually numbers). You can check the data type by looking at the `dtype` attribute.

Some of the data types are:

* int
* float
* bool
* complex

In [55]:
x=np.array([(1,2,3),(4,5,6),(7,8,9)])
x

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [56]:
x.dtype

dtype('int32')

In [57]:
y=np.array([(1.1,2.1,3.0),(4.5,5.7,6.8),(7,8.9,9)])
y

array([[1.1, 2.1, 3. ],
       [4.5, 5.7, 6.8],
       [7. , 8.9, 9. ]])

In [58]:
y.dtype

dtype('float64')

In [59]:
x = np.arange(1, 5)
print(x.dtype, x)

int32 [1 2 3 4]


In [60]:
y = np.arange(1.0, 5.0)
print(y.dtype, y)

float64 [1. 2. 3. 4.]


Instead of letting NumPy guess what data type to use, you can set it explicitly when creating an array by setting the dtype parameter:

In [61]:
z = np.arange(1, 5, dtype=np.complex64)
print(z.dtype, z)

complex64 [1.+0.j 2.+0.j 3.+0.j 4.+0.j]


Available data types include int8, int16, int32, int64, uint8|16|32|64, float16|32|64 and complex64|128. Check out the documentation for the full list.

In [62]:
h=np.array([(1,0),(0,0)],bool)  #1-->True, 0-->False
h

array([[ True, False],
       [False, False]])
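
An existing array can also be converted to another data type with the `astype` method, which is not shown above. This is a minimal sketch with hypothetical variable names; note that converting floats to integers truncates toward zero rather than rounding.

```python
import numpy as np

floats = np.array([1.7, -2.2, 3.9])

# astype returns a new array; float -> int conversion truncates toward zero.
ints = floats.astype(np.int32)
print(ints, ints.dtype)

# Mixing ints and floats in one array upcasts everything to float.
mixed = np.array([1, 2.5])
print(mixed.dtype)

# itemsize follows the dtype: 1 byte for int8, 8 bytes for float64.
print(np.dtype(np.int8).itemsize, np.dtype(np.float64).itemsize)
```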

## databuffer

An array's data is actually stored in memory as a flat (one dimensional) byte buffer. It is available *via* the `data` attribute (you will rarely need it, though).
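
As a rough sketch of what "flat byte buffer" means, the example below uses the `tobytes` method to copy the raw bytes out of a hypothetical array `buf_demo`. The dtype `'<i4'` (little-endian 32-bit integer) is an assumption made here so that the byte order is the same on every machine.

```python
import numpy as np

# '<i4' means little-endian 32-bit integers, so the byte layout below is fixed.
buf_demo = np.array([[1, 2], [1000, 2000]], dtype='<i4')

raw = buf_demo.tobytes()  # copy of the array's data as a bytes object
print(len(raw))           # 4 elements * 4 bytes each
print(raw)
```

The two rows are stored one after the other, which is why the 2D structure is not visible in the bytes.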

In [196]:
f = np.array([[1,2],[1000, 2000]], dtype=np.int32)
f.data

<memory at 0x000001EED1A5CB40>

# 1.2 OPERATIONS ON NUMPY ARRAY

* Arithmetic operations
* Conditional operations
* Set operations

## ARITHMETIC OPERATIONS

All the usual arithmetic operators (`+`, `-`, `*`, `/`, `//`, `**`, etc.) can be used with `ndarray`s. They apply *elementwise*:

In [63]:
a = np.array([14, 23, 32, 41])
b = np.array([5,  4,  3,  2])
print("a + b  =", a + b)
print("a - b  =", a - b)
print("a * b  =", a * b)
print("a / b  =", a / b)
print("a // b  =", a // b)
print("a % b  =", a % b)
print("a ** b =", a ** b)

a + b  = [19 27 35 43]
a - b  = [ 9 19 29 39]
a * b  = [70 92 96 82]
a / b  = [ 2.8         5.75       10.66666667 20.5       ]
a // b  = [ 2  5 10 20]
a % b  = [4 3 2 1]
a ** b = [537824 279841  32768   1681]


Note: `*` multiplies arrays elementwise; it is *not* matrix multiplication. For the matrix product, use the `@` operator or `np.matmul`.
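
The difference between the two kinds of multiplication is easiest to see on a small 2x2 case. The following sketch uses hypothetical arrays `p` and `q`; the `@` operator and `np.matmul` are standard NumPy but are not otherwise covered in this section.

```python
import numpy as np

p = np.array([[1, 2], [3, 4]])
q = np.array([[5, 6], [7, 8]])

elementwise = p * q   # multiplies matching entries
matrix_prod = p @ q   # matrix product, same as np.matmul(p, q)

print(elementwise)
print(matrix_prod)
```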

## CONDITIONAL OPERATIONS

The conditional operators also apply elementwise:

In [27]:
m = np.array([20, -5, 30, 40])
m < [15, 16, 35, 36]

array([False,  True,  True, False])

In [28]:
m[m < 25]

array([20, -5])

In [29]:
m[m > 25]

array([30, 40])

In [154]:
a=np.array([1,0,1,0])
b=np.array([1,1,1,1])
a==b

array([ True, False,  True, False])
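
Elementwise conditions can be combined, and they can drive a selection between two alternatives. This sketch goes slightly beyond the cells above: the `&` operator on boolean arrays and the `np.where` function are standard NumPy, while the variable names are hypothetical.

```python
import numpy as np

vals = np.array([20, -5, 30, 40])

# Combine elementwise conditions with & (and), | (or), ~ (not).
# The parentheses are required because & binds tighter than the comparisons.
in_range = (vals > 0) & (vals < 35)
print(in_range)
print(vals[in_range])

# np.where picks from two alternatives based on a condition.
clipped = np.where(vals < 0, 0, vals)
print(clipped)
```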

## SET OPERATIONS

NumPy's set functions treat their inputs as flattened 1D collections and return sorted arrays of unique values:

In [25]:
import numpy as np
 
array1 = np.array([[1, 2, 3], [4, 5, 6]])
array2 = np.array([[2, 1, 0], [4, 3, 6]])
 
# Find the union of two arrays.
print(np.union1d(array1, array2))
 
# Find the intersection of two arrays.
print(np.intersect1d(array1, array2))
 
# Find the set difference of two arrays.
print(np.setdiff1d(array1, array2))

[0 1 2 3 4 5 6]
[1 2 3 4 6]
[5]


# 1.3 MATHEMATICAL AND STATISTICAL FUNCTIONS

In [4]:
##Basic mathematical functions
import numpy as np
a = np.array([[-2.5, 3.1, 7], [10, 11, 12]])
print(a)
print("min =", a.min())
print("max =",a.max())
print("sum =",a.sum())
print("prod =",a.prod())
print("cumsum =",a.cumsum())
print("mean =",a.mean())

[[-2.5  3.1  7. ]
 [10.  11.  12. ]]
min = -2.5
max = 12.0
sum = 40.6
prod = -71610.0
cumsum = [-2.5  0.6  7.6 17.6 28.6 40.6]
mean = 6.766666666666667


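Note: `a.mean()` computes the mean of all elements in the `ndarray`, regardless of its shape. The other functions shown above likewise operate on all elements when called without arguments.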


These functions accept an optional argument `axis` which lets you ask for the operation to be performed on elements along the given axis. For example:

In [5]:
b=np.arange(24).reshape(2,3,4)
b

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [6]:
b.sum(axis=0)  # sum across matrices

array([[12, 14, 16, 18],
       [20, 22, 24, 26],
       [28, 30, 32, 34]])

In [9]:
b.sum(axis=1)  # sum across rows

array([[12, 15, 18, 21],
       [48, 51, 54, 57]])

You can also sum over multiple axes:

In [10]:
b

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [8]:
b.sum(axis=(0,2))  # sum across matrices and columns

array([ 60,  92, 124])

In [None]:
##basic statistical functions

In [182]:
for func in (a.mean, a.std, a.var):
    print(func.__name__, "=", func())

mean = 6.766666666666667
std = 5.084835843520964
var = 25.855555555555554
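
By default, `std` and `var` divide by the number of elements N (population statistics). As a sketch, the optional `ddof` argument switches to the sample versions, which divide by N - 1, and the `axis` argument works as it does for `sum`. The array name `data` is hypothetical and holds the same values as `a` above.

```python
import numpy as np

data = np.array([[-2.5, 3.1, 7], [10, 11, 12]])

# Default: population variance (divide by N).
pop_var = data.var()

# ddof=1 gives the sample variance (divide by N - 1).
sample_var = data.var(ddof=1)
print(pop_var, sample_var)

# Like sum, min and max, these accept an axis argument.
print(data.mean(axis=0))  # one mean per column
print(data.std(axis=1))   # one standard deviation per row
```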


# 1.4 ARRAY INDEXING, SLICING, ITERATING

* One Dimensional Arrays
* Differences with Regular Python Arrays
* Multi Dimensional Arrays
* Fancy Indexing
* Higher Dimensions
* Boolean Indexing
* ix_
* Iterating


## One Dimensional Arrays

One-dimensional NumPy arrays can be accessed more or less like regular Python lists:

In [184]:
a = np.array([1, 5, 3, 19, 13, 7, 3])
a[3]

19

In [73]:
a[2:5]

array([ 3, 19, 13])

In [74]:
a[2:-1]

array([ 3, 19, 13,  7])

In [75]:
a[:2]

array([1, 5])

In [76]:
a[2::2]

array([ 3, 13,  3])

In [77]:
a[::-1]

array([ 3,  7, 13, 19,  3,  5,  1])

In [78]:
a[3]=999 #we can modify elements
a

array([  1,   5,   3, 999,  13,   7,   3])

In [79]:
a[2:5] = [997, 998, 999] #we can modify an ndarray slice
a

array([  1,   5, 997, 998, 999,   7,   3])

In [185]:
a = np.array([1, 5, 3, 19, 13, 7, 3])  # restore the original values of a
for i in a:
    print(i**2)

1
25
9
361
169
49
9


## Differences with Regular Python Arrays

Unlike with a Python list, assigning a single value to an `ndarray` slice copies that value into every element of the slice:


In [80]:
a[2:5] = -1
a

array([ 1,  5, -1, -1, -1,  7,  3])

Also, you cannot grow or shrink `ndarray`s this way:

In [81]:
try:
    a[2:5] = [1,2,3,4,5,6]  # too long
except ValueError as e:
    print(e)

cannot copy sequence with size 6 to array axis with dimension 3


You cannot delete elements either:

In [82]:
try:
    del a[2:5]
except ValueError as e:
    print(e)

cannot delete array elements


Last but not least, `ndarray` **slices are actually *views*** on the same data buffer. This means that if you create a slice and modify it, you are actually going to modify the original `ndarray` as well!

In [83]:
a_slice = a[2:6]
a_slice[1] = 1000
a  # the original array was modified

array([   1,    5,   -1, 1000,   -1,    7,    3])

In [84]:
a[3] = 2000
a_slice  # similarly, modifying the original array modifies the slice

array([  -1, 2000,   -1,    7])

If you want a copy of the data, you need to use the `copy` method:

In [85]:
another_slice = a[2:6].copy()
another_slice[1] = 3000
a  # the original array is untouched

array([   1,    5,   -1, 2000,   -1,    7,    3])

In [86]:
a[3] = 4000
another_slice  # similarly, modifying the original array does not affect the slice copy

array([  -1, 3000,   -1,    7])
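
Whether two arrays share data can also be checked programmatically. The sketch below uses `np.shares_memory` and the `base` attribute, neither of which appears above; the variable names are hypothetical.

```python
import numpy as np

base_arr = np.array([1, 5, 3, 19, 13, 7, 3])

view = base_arr[2:6]            # basic slice: a view
copied = base_arr[2:6].copy()   # explicit copy

# np.shares_memory reports whether two arrays overlap in memory.
print(np.shares_memory(base_arr, view))    # True
print(np.shares_memory(base_arr, copied))  # False

# A view's .base refers to the array that owns the data; a copy owns its own.
print(view.base is base_arr)  # True
print(copied.base is None)    # True
```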

## Multi Dimensional Arrays

Multi-dimensional arrays can be accessed in a similar way by providing an index or slice for each axis, separated by commas:

In [19]:
b = np.arange(48).reshape(4, 12)
b

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
       [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])

In [20]:
b[1, 2]  # row 1, col 2

14

In [21]:
b[1, :]  # row 1, all columns

array([12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23])

In [22]:
b[:, 1]  # all rows, column 1

array([ 1, 13, 25, 37])

**Caution**: note the subtle difference between these two expressions: 

In [23]:
z=b[1, :]
print(z.shape)

(12,)


In [24]:
z=b[1:2, :]
print(z.shape)

(1, 12)


The first expression returns row 1 as a 1D array of shape `(12,)`, while the second returns that same row as a 2D array of shape `(1, 12)`.

In [187]:
for row in b:
    print(row)

[ 0  1  2  3  4  5  6  7  8  9 10 11]
[12 13 14 15 16 17 18 19 20 21 22 23]
[24 25 26 27 28 29 30 31 32 33 34 35]
[36 37 38 39 40 41 42 43 44 45 46 47]


## Fancy indexing
You may also specify a list of indices that you are interested in. This is referred to as *fancy indexing*.

In [95]:
b[(0,2), 2:5]  # rows 0 and 2, columns 2 to 4 (5-1)

array([[ 2,  3,  4],
       [26, 27, 28]])

In [96]:
b[:, (-1, 2, -1)]  # all rows, columns -1 (last), 2 and -1 (again, and in this order)

array([[11,  2, 11],
       [23, 14, 23],
       [35, 26, 35],
       [47, 38, 47]])

If you provide multiple index arrays, you get a 1D `ndarray` containing the values of the elements at the specified coordinates.

In [98]:
b[(-1, 2, -1, 2), (5, 9, 1, 9)]  # returns a 1D array with b[-1, 5], b[2, 9], b[-1, 1] and b[2, 9] (again)

array([41, 33, 37, 33])
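
One practical difference from slicing is that fancy indexing returns a copy, not a view. A minimal sketch, using a hypothetical array `table` with the same contents as `b`:

```python
import numpy as np

table = np.arange(48).reshape(4, 12)

picked = table[[0, 2], 2:5]   # fancy index on rows: result is a copy
picked[0, 0] = -1             # modifying it leaves the original alone
print(table[0, 2])            # still 2

sliced = table[0:3:2, 2:5]    # same rows via a basic slice: a view
sliced[0, 0] = -1             # this writes through to the original
print(table[0, 2])            # now -1
```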

## Higher dimensions
Everything works just as well with higher dimensional arrays, but it's useful to look at a few examples:

In [100]:
c = b.reshape(4,2,6)
c

array([[[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]],

       [[12, 13, 14, 15, 16, 17],
        [18, 19, 20, 21, 22, 23]],

       [[24, 25, 26, 27, 28, 29],
        [30, 31, 32, 33, 34, 35]],

       [[36, 37, 38, 39, 40, 41],
        [42, 43, 44, 45, 46, 47]]])

In [101]:
c[2, 1, 4]  # matrix 2, row 1, col 4

34

In [102]:
c[2, :, 3]  # matrix 2, all rows, col 3

array([27, 33])

If you omit coordinates for some axes, then all elements in these axes are returned:

In [103]:
c[2, 1]  # Return matrix 2, row 1, all columns.  This is equivalent to c[2, 1, :]

array([30, 31, 32, 33, 34, 35])
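
The omitted axes can also be written explicitly with an ellipsis (`...`), which is standard NumPy indexing syntax but is not used in the cells above. The following sketch uses a hypothetical array `cube` with the same shape as `c`.

```python
import numpy as np

cube = np.arange(48).reshape(4, 2, 6)

# Trailing axes can be omitted: these three forms select the same data.
first = cube[2]
second = cube[2, :, :]
third = cube[2, ...]       # the ellipsis stands for "all remaining axes"
print(np.array_equal(first, second), np.array_equal(first, third))

# The ellipsis can also sit in front: last column of every matrix and row.
last_col = cube[..., -1]
print(last_col.shape)      # (4, 2)
```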

## Boolean indexing
You can also provide an `ndarray` of boolean values on one axis to specify the indices that you want to access.

In [105]:
b = np.arange(48).reshape(4, 12)
b

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35],
       [36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47]])

In [106]:
rows_on = np.array([True, False, True, False])
b[rows_on, :]  # Rows 0 and 2, all columns. Equivalent to b[(0, 2), :]

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11],
       [24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35]])

In [107]:
cols_on = np.array([False, True, False] * 4)
b[:, cols_on]  # All rows, columns 1, 4, 7 and 10

array([[ 1,  4,  7, 10],
       [13, 16, 19, 22],
       [25, 28, 31, 34],
       [37, 40, 43, 46]])

## ix_
You cannot use boolean indexing this way on multiple axes, but you can work around this by using the `ix_` function:

In [108]:
b[np.ix_(rows_on, cols_on)]

array([[ 1,  4,  7, 10],
       [25, 28, 31, 34]])

In [109]:
np.ix_(rows_on, cols_on)

(array([[0],
        [2]], dtype=int64), array([[ 1,  4,  7, 10]], dtype=int64))

If you use a boolean array that has the same shape as the `ndarray`, then you get in return a 1D array containing all the values that have `True` at their coordinate. This is generally used along with conditional operators:

In [110]:
b[b % 3 == 1]

array([ 1,  4,  7, 10, 13, 16, 19, 22, 25, 28, 31, 34, 37, 40, 43, 46])
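
A boolean mask can be used on the left-hand side of an assignment as well, which changes only the selected elements. A small sketch with a hypothetical array `nums`:

```python
import numpy as np

nums = np.arange(12).reshape(3, 4)

mask = nums % 3 == 1       # boolean array with the same shape as nums
print(mask.sum())          # number of True entries

# Assigning through the mask changes only the selected elements, in place.
nums[mask] = -1
print(nums)
```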

## Iterating
Iterating over `ndarray`s is very similar to iterating over regular Python lists. Note that iterating over multidimensional arrays is done with respect to the first axis.

In [111]:
c = np.arange(24).reshape(2, 3, 4)  # A 3D array (composed of two 3x4 matrices)
c

array([[[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]],

       [[12, 13, 14, 15],
        [16, 17, 18, 19],
        [20, 21, 22, 23]]])

In [112]:
for m in c:
    print("Item:")
    print(m)

Item:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Item:
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


In [113]:
for i in range(len(c)):  # Note that len(c) == c.shape[0]
    print("Item:")
    print(c[i])

Item:
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
Item:
[[12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]


If you want to iterate on *all* elements in the `ndarray`, simply iterate over the `flat` attribute:

In [114]:
for i in c.flat:
    print("Item:", i)

Item: 0
Item: 1
Item: 2
Item: 3
Item: 4
Item: 5
Item: 6
Item: 7
Item: 8
Item: 9
Item: 10
Item: 11
Item: 12
Item: 13
Item: 14
Item: 15
Item: 16
Item: 17
Item: 18
Item: 19
Item: 20
Item: 21
Item: 22
Item: 23
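
When the position of each element matters as well as its value, `np.ndenumerate` (not covered above) yields both. A minimal sketch with a hypothetical array `small`:

```python
import numpy as np

small = np.arange(6).reshape(2, 3)

# np.ndenumerate yields (index tuple, value) pairs in row-major order.
pairs = list(np.ndenumerate(small))
for index, value in pairs:
    print(index, value)
```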


# 1.5 ARRAY MANIPULATION


* Stacking together different arrays
    * vstack
    * hstack
    * concatenate
    * stack
* Splitting Arrays
    * vsplit
    * hsplit
* Changing the shape of the Array
    * shape
    * reshape
    * ravel
    * Transpose
    * swapaxes


## Stacking Arrays

* vstack
* hstack
* concatenate
* stack

It is often useful to stack together different arrays. NumPy offers several functions to do just that. Let's start by creating a few arrays.

In [116]:
q1 = np.full((3,4), 1.0)
q1

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.]])

In [117]:
q2 = np.full((4,4), 2.0)
q2

array([[2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.]])

In [118]:
q3 = np.full((3,4), 3.0)
q3

array([[3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.]])

### vstack

Now let's stack them vertically using `vstack`:

In [119]:
q4 = np.vstack((q1, q2, q3))
q4

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.]])

In [120]:
q4.shape

(10, 4)

This was possible because q1, q2 and q3 all have the same shape (except for the vertical axis, but that's ok since we are stacking on that axis).

### hstack
We can also stack arrays horizontally using `hstack`:

In [122]:
q5 = np.hstack((q1, q3))
q5

array([[1., 1., 1., 1., 3., 3., 3., 3.],
       [1., 1., 1., 1., 3., 3., 3., 3.],
       [1., 1., 1., 1., 3., 3., 3., 3.]])

In [123]:
q5.shape

(3, 8)

This is possible because q1 and q3 both have 3 rows. But since q2 has 4 rows, it cannot be stacked horizontally with q1 and q3:

In [124]:
try:
    q5 = np.hstack((q1, q2, q3))
except ValueError as e:
    print(e)

all the input array dimensions except for the concatenation axis must match exactly


### concatenate

The `concatenate` function stacks arrays along any given existing axis.

In [198]:
q7 = np.concatenate((q1, q2, q3), axis=0)  # Equivalent to vstack
q7

array([[1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.],
       [3., 3., 3., 3.]])

In [199]:
q7.shape

(10, 4)
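
With `axis=1`, `concatenate` joins 2D arrays along their columns, which gives the same result as `hstack`. A short sketch with hypothetical arrays `left` and `right`, filled the same way as `q1` and `q3`:

```python
import numpy as np

left = np.full((3, 4), 1.0)
right = np.full((3, 4), 3.0)

# axis=1 joins along the columns, which matches hstack for 2D arrays.
joined = np.concatenate((left, right), axis=1)
print(joined.shape)  # (3, 8)
print(np.array_equal(joined, np.hstack((left, right))))  # True
```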

### stack
The `stack` function stacks arrays along a new axis. All arrays have to have the same shape.

In [125]:
q8 = np.stack((q1, q3))
q8

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]])

In [126]:
q8.shape

(2, 3, 4)
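
`stack` takes an optional `axis` argument that decides where the new axis is placed. The sketch below shows the resulting shapes for two hypothetical 3x4 arrays, `first` and `second`.

```python
import numpy as np

first = np.full((3, 4), 1.0)
second = np.full((3, 4), 3.0)

# The axis argument says where the new axis goes in the result's shape.
print(np.stack((first, second), axis=0).shape)   # (2, 3, 4), the default
print(np.stack((first, second), axis=1).shape)   # (3, 2, 4)
print(np.stack((first, second), axis=-1).shape)  # (3, 4, 2)
```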

## Splitting arrays

* vsplit
* hsplit

### vsplit

Splitting is the opposite of stacking. For example, let's use the `vsplit` function to split a matrix vertically.

First let's create a 6x4 matrix:

In [128]:
r = np.arange(24).reshape(6,4)
r

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23]])

Now let's split it in three equal parts, vertically:

In [129]:
r1, r2, r3 = np.vsplit(r, 3)
r1

array([[0, 1, 2, 3],
       [4, 5, 6, 7]])

In [130]:
r2

array([[ 8,  9, 10, 11],
       [12, 13, 14, 15]])

In [131]:
r3

array([[16, 17, 18, 19],
       [20, 21, 22, 23]])

There is also a `split` function which splits an array along any given axis. Calling `vsplit` is equivalent to calling `split` with `axis=0`. There is also an `hsplit` function, equivalent to calling `split` with `axis=1`:

### hsplit

In [132]:
r4, r5 = np.hsplit(r, 2)
r4

array([[ 0,  1],
       [ 4,  5],
       [ 8,  9],
       [12, 13],
       [16, 17],
       [20, 21]])

In [133]:
r5

array([[ 2,  3],
       [ 6,  7],
       [10, 11],
       [14, 15],
       [18, 19],
       [22, 23]])
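
The general `split` function can cut at chosen positions as well as into equal parts. The sketch below uses a hypothetical array `mat` with the same contents as `r`; `np.array_split` is a related NumPy function, not mentioned above, for cases where the parts cannot all be the same size.

```python
import numpy as np

mat = np.arange(24).reshape(6, 4)

# vsplit(mat, 3) is split(mat, 3, axis=0).
top, middle, bottom = np.split(mat, 3, axis=0)
print(np.array_equal(top, np.vsplit(mat, 3)[0]))  # True

# A list of indices gives the row positions at which to cut.
parts = np.split(mat, [1, 4], axis=0)
print([p.shape for p in parts])   # [(1, 4), (3, 4), (2, 4)]

# array_split allows a number of parts that does not divide the axis evenly.
uneven = np.array_split(mat, 4, axis=0)
print([len(p) for p in uneven])   # [2, 2, 1, 1]
```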

## Changing the shape of the array

* shape
* reshape
* ravel
* Transpose
* swapaxes

### shape

Changing the shape of an `ndarray` is as simple as setting its `shape` attribute. However, the array's size must remain the same.

In [178]:
g = np.arange(24)
print(g)
print("Rank:", g.ndim)

[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
Rank: 1


In [188]:
g.shape = (6, 4)
print(g)
print("Rank:", g.ndim)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]]
Rank: 2


In [189]:
g.shape = (2, 3, 4)
print(g)
print("Rank:", g.ndim)

[[[ 0  1  2  3]
  [ 4  5  6  7]
  [ 8  9 10 11]]

 [[12 13 14 15]
  [16 17 18 19]
  [20 21 22 23]]]
Rank: 3


### reshape

The `reshape` function returns a new `ndarray` object that, whenever possible, points at the *same* data. In that case, modifying one array also modifies the other.

In [191]:
g2 = g.reshape(4,6)
print(g2)
print("Rank:", g2.ndim)

[[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]]
Rank: 2


Set the item at row 1, col 2 to 999 (indexing is covered in section 1.4 above).

In [192]:
g2[1, 2] = 999
g2

array([[  0,   1,   2,   3,   4,   5],
       [  6,   7, 999,   9,  10,  11],
       [ 12,  13,  14,  15,  16,  17],
       [ 18,  19,  20,  21,  22,  23]])

The corresponding element in `g` has been modified.

In [193]:
g

array([[[  0,   1,   2,   3],
        [  4,   5,   6,   7],
        [999,   9,  10,  11]],

       [[ 12,  13,  14,  15],
        [ 16,  17,  18,  19],
        [ 20,  21,  22,  23]]])
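
Two further details of `reshape`, sketched under the assumption of default (row-major) memory order: one axis length can be given as `-1` and is then inferred from the total size, and `reshape` falls back to a copy when the memory layout does not allow a view. All variable names are hypothetical.

```python
import numpy as np

flat24 = np.arange(24)

# -1 lets NumPy infer one axis length from the total size.
auto = flat24.reshape(4, -1)
print(auto.shape)                          # (4, 6)
print(np.shares_memory(flat24, auto))      # True: a view

# When the memory layout does not allow a view, reshape returns a copy.
transposed = np.arange(6).reshape(2, 3).T  # non-contiguous, shape (3, 2)
flattened = transposed.reshape(6)
print(np.shares_memory(transposed, flattened))  # False: a copy
```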

### ravel
Finally, the `ravel` function returns a new one-dimensional `ndarray` that also points to the same data whenever possible:

In [205]:
g.ravel()

array([  0,   1,   2,   3,   4,   5,   6,   7, 999,   9,  10,  11,  12,
        13,  14,  15,  16,  17,  18,  19,  20,  21,  22,  23])
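
For comparison, the `flatten` method (not covered above) also returns a one-dimensional array but always copies the data. A small sketch with a hypothetical array `source`:

```python
import numpy as np

source = np.arange(6).reshape(2, 3)

raveled = source.ravel()      # a view here, because source is contiguous
flattened = source.flatten()  # always a copy

raveled[0] = 100    # writes through to source
flattened[1] = 200  # does not affect source
print(source)
```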

### Transpose
The `transpose` method creates a new view on an `ndarray`'s data, with axes permuted in the given order.

For example, let's create a 3D array:

In [35]:
t = np.arange(24).reshape(4,2,3)
t

array([[[ 0,  1,  2],
        [ 3,  4,  5]],

       [[ 6,  7,  8],
        [ 9, 10, 11]],

       [[12, 13, 14],
        [15, 16, 17]],

       [[18, 19, 20],
        [21, 22, 23]]])

Now let's create an `ndarray` such that the axes `0, 1, 2` (depth, height, width) are re-ordered to `1, 2, 0` (depth→width, height→depth, width→height):

In [36]:
t1 = t.transpose((1,2,0))
t1

array([[[ 0,  6, 12, 18],
        [ 1,  7, 13, 19],
        [ 2,  8, 14, 20]],

       [[ 3,  9, 15, 21],
        [ 4, 10, 16, 22],
        [ 5, 11, 17, 23]]])

In [37]:
t1.shape

(2, 3, 4)

In [38]:
t2 = t.transpose()  # equivalent to t.transpose((2, 1, 0))
t2

array([[[ 0,  6, 12, 18],
        [ 3,  9, 15, 21]],

       [[ 1,  7, 13, 19],
        [ 4, 10, 16, 22]],

       [[ 2,  8, 14, 20],
        [ 5, 11, 17, 23]]])

As the call above shows, `transpose` with no arguments reverses the order of the dimensions, so `t2` has shape `(3, 2, 4)`.

### Swapaxes
NumPy provides a convenience function `swapaxes` to swap two axes. For example, let's create a new view of `t` with depth and height swapped:

In [39]:
t3 = t.swapaxes(0,1)  # equivalent to t.transpose((1, 0, 2))
t3

array([[[ 0,  1,  2],
        [ 6,  7,  8],
        [12, 13, 14],
        [18, 19, 20]],

       [[ 3,  4,  5],
        [ 9, 10, 11],
        [15, 16, 17],
        [21, 22, 23]]])

In [40]:
t3.shape

(2, 4, 3)
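
Since `transpose` and `swapaxes` return views, writing to the result changes the original array. The sketch below checks this on a hypothetical array `vol` with the same shape as `t`; it also uses the `.T` attribute, a standard shorthand for `transpose()` that is not shown above.

```python
import numpy as np

vol = np.arange(24).reshape(4, 2, 3)

# .T is shorthand for transpose() with no arguments (axes reversed).
print(vol.T.shape)                              # (3, 2, 4)
print(np.array_equal(vol.T, vol.transpose()))   # True

# swapaxes and transpose return views: writes show up in the original.
swapped = vol.swapaxes(0, 1)
swapped[0, 3, 2] = -1    # same element as vol[3, 0, 2]
print(vol[3, 0, 2])      # -1
```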