<h2>Numpy</h2>

Numpy is a Python package designed for working efficiently with <b><em>homogeneous n-dimensional arrays</em></b>. Since array-level operations are highly mathematical in nature, most of numpy is written in C and wrapped with Python. This is the key to numpy's success.

<h3>Table of Contents</h3>

<ul>
<li> <b> <a href="#installation">Install numpy </a></b></li>
<li> <b> <a href="#n-dimensional-array">n-dimensional array </a></b></li>
<li> Array Creation
    <ul>
        <li> <b> <a href="#from-list">from list </a></b></li>
        <li> <b> <a href="#shape">shape</a></b></li>
        <li> <b> <a href="#arange">arange ( ) </a></b></li>
        <li> <b> <a href="#reshape">reshape ( ) </a></b></li>
    </ul>
</li>
<li> <a href="#array-operations"> <b>Array Operations</b> </a>
    <ul>
    <li> <b> <a href="#element-wise-operations"> <em>Element-wise Operations</em> </a></b></li>
    <li> <b> <a href="#aggregate-operations"><em>Aggregate Operations</em></a></b></li>
    <li> <b> <a href="#aggregate-along-axis"><em>Aggregate Operations along an axis</em></a></b></li>
    </ul>
</li>
<li> <a href="#array-indexing-slicing"><b>Array Indexing &amp; Slicing</b> </a>
    <ul>
    <li> <b> <a href="#array-indexing-slicing"> <em>Array Indexing</em> </a></b></li>
    <li> <b> <a href="#array-indexing-slicing"> <em>Array Slicing</em></a></b></li>
    </ul>
</li>
</ul>
    

<div id="installation"> <h4> Install numpy </h4></div>

Before you can do anything with numpy, you have to install it (unless you already have a data science distribution such as Anaconda or Canopy installed, since these ship with numpy). Installing numpy is as simple as running the following command in a terminal:

In [4]:
# pip install numpy

To see what numpy buys you, let's time adding up the first 10 million integers, first with a plain Python loop and then with numpy.

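In [20]:
# without numpy
import time 

total = 0   # named total so that the built-in sum() is not shadowed

start_time = time.time()

# add up the first 10 million integers one at a time
for num in range(10000000) :
    total = total + num
    
print ( "sum = ", total)

end_time = time.time()

python_time = end_time - start_time

print ( "time taken = ", python_time)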

sum =  49999995000000
time taken =  1.4988813400268555


In [2]:
# with numpy
import time
import numpy as np

start_time = time.time()

numbers = np.arange(10000000)

# use a 64-bit accumulator so the total cannot overflow where the default integer is 32-bit
total = np.sum(numbers, dtype = np.uint64)
print ( "sum = ", total)

end_time = time.time()

numpy_time = end_time - start_time
factor = python_time / numpy_time   # python_time comes from the previous cell

print ( "time taken = ", numpy_time)

print ( "numpy is ", factor , " times faster than standard python")

In the author's run, numpy was about 45 times faster than standard Python. Of course, the exact number will vary with the power of your computer.
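
If you want to repeat the comparison yourself, here is a minimal sketch of the same experiment using <code>time.perf_counter</code>, which is meant for timing short intervals. The helper names <code>python_sum</code> and <code>numpy_sum</code> and the smaller problem size are assumptions made for this example, not part of the notebook above, and the speed-up you measure will differ from machine to machine.

```python
import time
import numpy as np

N = 1_000_000  # smaller than the 10 million used above, to keep the run short

def python_sum(n):
    """Sum 0..n-1 with a plain Python loop."""
    total = 0
    for num in range(n):
        total += num
    return total

def numpy_sum(n):
    """Sum 0..n-1 with numpy, using a 64-bit accumulator."""
    return int(np.sum(np.arange(n), dtype=np.uint64))

start = time.perf_counter()
loop_result = python_sum(N)
loop_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_result = numpy_sum(N)
numpy_seconds = time.perf_counter() - start

print("same result:", loop_result == numpy_result)
print("python loop:", loop_seconds, "seconds")
print("numpy      :", numpy_seconds, "seconds")
```

Both versions should agree on the total; only the time taken should differ.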

<div id="n-dimensional-array"/><h4> n-dimensional array </h4>

The n-dimensional array is the core data structure in numpy. We will soon explore how useful it is and what you can do with it. Let's start by creating a simple 1-dimensional array with just 10 numbers:

<img src="./pics/1d-array.png"/>

In [10]:
import numpy as np

a = np.array([1,2,3,4,5,6,7,8,9,10])
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

Let's add a second dimension to it:

<img src="./pics/2d-array.png"/>

In [11]:
b = np.array( [[1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10],
               [11,12,13,14,15,16,17,18,19,20]])
b

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])

<div id="from-list"/>
<h4> Create an array from list </h4>

In [29]:
numbers = [1,2,3,4,5,6,7,8,9,10]
a = np.array(numbers)
a

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

You can also create a 2-d array from a list of lists.

In [30]:
a1 = [1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10]
a2 = [11,12,13,14,15,16,17,18,19,20]
b = np.array( [a1,a2])
b

array([[ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10],
       [11, 12, 13, 14, 15, 16, 17, 18, 19, 20]])
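
Because numpy arrays are homogeneous, every element of an array shares one type, which you can inspect with the <code>dtype</code> attribute. The sketch below (with the made-up names <code>ints</code> and <code>mixed</code>) shows what happens when the list you start from mixes integers and a float.

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2, 3.5])      # one float among the integers

print(ints.dtype)    # an integer type; the exact name depends on your platform
print(mixed.dtype)   # float64: every element was converted to a float
print(mixed)
```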

<div id="shape"/>
<h4> shape </h4>
How do you know it has a second dimension? Use the <b><em>shape</em></b> attribute to get the shape of the array. Note that shape is an attribute and not a function, so it is written without parentheses.

In [31]:
b.shape

(2, 10)

This means there are 2 rows and 10 columns.
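
Here is a small sketch of <code>shape</code> next to two related attributes, <code>ndim</code> (number of dimensions) and <code>size</code> (total number of elements). The array names <code>row</code> and <code>grid</code> are just for illustration.

```python
import numpy as np

row = np.arange(10)                 # 1-d array with 10 elements
grid = np.array([[1, 2, 3],
                 [4, 5, 6]])        # 2 rows, 3 columns

print(row.shape)    # (10,)
print(row.ndim)     # 1
print(grid.shape)   # (2, 3)
print(grid.ndim)    # 2
print(grid.size)    # 6
```

A 1-d array has a shape with a single entry, which is why it prints as <code>(10,)</code>.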

<div id="arange"/> 
<h4>arange ( ) </h4>
numpy has a function called <b><em> arange ( ) </em> </b> that works like the standard Python function <b> <em>range ( )</em> </b>, except that it returns an array.


In [21]:
numbers = np.arange(100)
numbers

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50,
       51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
       68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84,
       85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])
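
Just like range ( ), arange ( ) also accepts a start, a stop and a step. A short sketch (the variable names are made up for this example); unlike range ( ), the step may be a fraction.

```python
import numpy as np

first_five = np.arange(5)            # stop only
one_to_ten = np.arange(1, 11)        # start, stop (stop is excluded)
by_fives = np.arange(0, 20, 5)       # start, stop, step
quarters = np.arange(0, 1, 0.25)     # fractional step

print(first_five)    # [0 1 2 3 4]
print(one_to_ten)    # [ 1  2  3  4  5  6  7  8  9 10]
print(by_fives)      # [ 0  5 10 15]
print(quarters)      # [0.   0.25 0.5  0.75]
```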

<div id="reshape"/>
<h4> reshape ( )</h4>
You can now use the reshape ( ) method to <em>reshape</em> the data into any number of dimensions you like, as long as the total number of elements stays the same. For example, the 100 numbers above can be reshaped into any of the following 2-d combinations, among others:
<ul>
<li> 10 x 10 </li>
<li> 20 x 5 </li>
<li> 2 x 50 </li>
<li> 50 x 2 </li>
</ul>

In [25]:
numbers.reshape(10,10)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44, 45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64, 65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74, 75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84, 85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94, 95, 96, 97, 98, 99]])

In [24]:
numbers.reshape(20 , 5)

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19],
       [20, 21, 22, 23, 24],
       [25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34],
       [35, 36, 37, 38, 39],
       [40, 41, 42, 43, 44],
       [45, 46, 47, 48, 49],
       [50, 51, 52, 53, 54],
       [55, 56, 57, 58, 59],
       [60, 61, 62, 63, 64],
       [65, 66, 67, 68, 69],
       [70, 71, 72, 73, 74],
       [75, 76, 77, 78, 79],
       [80, 81, 82, 83, 84],
       [85, 86, 87, 88, 89],
       [90, 91, 92, 93, 94],
       [95, 96, 97, 98, 99]])

In [26]:
numbers.reshape(2,50)

array([[ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15,
        16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
        32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
        48, 49],
       [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65,
        66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81,
        82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97,
        98, 99]])

In [27]:
numbers.reshape(50,2)

array([[ 0,  1],
       [ 2,  3],
       [ 4,  5],
       [ 6,  7],
       [ 8,  9],
       [10, 11],
       [12, 13],
       [14, 15],
       [16, 17],
       [18, 19],
       [20, 21],
       [22, 23],
       [24, 25],
       [26, 27],
       [28, 29],
       [30, 31],
       [32, 33],
       [34, 35],
       [36, 37],
       [38, 39],
       [40, 41],
       [42, 43],
       [44, 45],
       [46, 47],
       [48, 49],
       [50, 51],
       [52, 53],
       [54, 55],
       [56, 57],
       [58, 59],
       [60, 61],
       [62, 63],
       [64, 65],
       [66, 67],
       [68, 69],
       [70, 71],
       [72, 73],
       [74, 75],
       [76, 77],
       [78, 79],
       [80, 81],
       [82, 83],
       [84, 85],
       [86, 87],
       [88, 89],
       [90, 91],
       [92, 93],
       [94, 95],
       [96, 97],
       [98, 99]])
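
Reshaping is not limited to 2 dimensions. The sketch below reshapes the same 100 numbers into 3 dimensions, lets numpy work out one dimension by passing -1, and shows the error you get when the sizes do not multiply out to 100. The names <code>cube</code> and <code>auto</code> are made up for this example.

```python
import numpy as np

numbers = np.arange(100)

cube = numbers.reshape(4, 5, 5)     # 3 dimensions: 4 * 5 * 5 = 100
print(cube.shape)                   # (4, 5, 5)

auto = numbers.reshape(-1, 20)      # -1 asks numpy to work out this dimension
print(auto.shape)                   # (5, 20)

try:
    numbers.reshape(3, 30)          # 3 * 30 = 90, not 100
except ValueError as err:
    print("cannot reshape:", err)
```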

<div id="array-operations"/>
<h4> Array Operations </h4>
This is where we get the sweet surprise: array operations are element-wise. Let's compare an array with a list and you will see the difference.

<div id="element-wise-operations"/>
<h4> Element-wise Operations </h4>

In [42]:
a = list(range(11))
b = list(range(11,21))
a + b

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]

For lists, + simply joins the two lists end to end. With numpy arrays, + adds the matching elements instead:

In [50]:
a1 = np.arange(1,11)
b1 = np.arange(11,21)
a1 + b1

array([12, 14, 16, 18, 20, 22, 24, 26, 28, 30])

<img src="./pics/array_addition.png"/>
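
To make "element-wise" concrete, here is a small sketch that writes the addition out by hand with a plain Python loop and compares it with the array version. The arrays <code>x</code> and <code>y</code> are made up for this example; the same rule applies to the other arithmetic operators.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([10, 20, 30])

# What x + y does, written out as a loop over matching pairs
by_hand = [p + q for p, q in zip(x.tolist(), y.tolist())]

print(by_hand)    # [11, 22, 33]
print(x + y)      # [11 22 33]
print(y - x)      # [ 9 18 27]
print(x * y)      # [10 40 90]
```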

Element-wise operations are not limited to pairs of arrays. You can also combine an array with a single number, for example to raise every element to a power or to multiply every element by a constant. Essentially, we are eliminating the for loop.

In [22]:
a = list(range(11))
a

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

In [57]:
a12 = pow(a1,2)
a12

array([  1,   4,   9,  16,  25,  36,  49,  64,  81, 100], dtype=int32)

Array Multiplication

<img src="./pics/array-multiplication.png"/>

In [59]:
a13 = a1 * 3
a13

array([ 3,  6,  9, 12, 15, 18, 21, 24, 27, 30])
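
Here is a short sketch of the for loop that is being eliminated, next to the array version. The name <code>values</code> is made up for this example; the <code>**</code> operator does the same job as pow ( ) above.

```python
import numpy as np

values = np.arange(1, 6)                            # [1 2 3 4 5]

squares_loop = [v ** 2 for v in values.tolist()]    # explicit Python loop
squares = values ** 2                               # same as pow(values, 2)
scaled = values * 3 + 1                             # operations can be chained

print(squares_loop)   # [1, 4, 9, 16, 25]
print(squares)        # [ 1  4  9 16 25]
print(scaled)         # [ 4  7 10 13 16]
```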

<div id="aggregate-operations"/>
<h4> Aggregate Operations </h4>

<h5> sum ( ) </h5>

In [23]:
a1 = np.arange(1,11)
print ( a1 )
a1.sum()

[ 1  2  3  4  5  6  7  8  9 10]


55

<img src="./pics/array-sum.png"/>

<h5> min ( ) & max ( )</h5>

<img src="./pics/array-min-max.png"/>

In [64]:
a1.min()

1

In [65]:
a1.max()

10

<h5> len ( ) </h5>

<img src="./pics/array-length.png"/>

In [67]:
len(a1)

10
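
One thing to watch out for with len ( ): on a 2-d array it returns the number of rows, not the total number of elements. A small sketch, using a made-up 3 x 4 array called <code>grid</code>:

```python
import numpy as np

grid = np.arange(1, 13).reshape(3, 4)   # 3 rows, 4 columns

print(len(grid))     # 3  -> length of the first axis (number of rows)
print(grid.size)     # 12 -> total number of elements
print(grid.sum())    # 78 -> aggregates cover every element by default
print(grid.min())    # 1
print(grid.max())    # 12
```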

<div id="aggregate-along-axis"/><h4> Aggregate Operations along an axis </h4>

<img src="./pics/xyz-axis.png" style="background-color:white;"/>

In [76]:
a = np.arange(1,101).reshape(10,10)
a

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

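Sum along each axis. With axis=1 the sum runs across the columns, giving one total per row; with axis=0 it runs down the rows, giving one total per column.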

<img src="./pics/sum-across-axis.png"/>

In [77]:
a.sum(axis=1)

array([ 55, 155, 255, 355, 455, 555, 655, 755, 855, 955])

In [78]:
a.sum(axis=0)

array([460, 470, 480, 490, 500, 510, 520, 530, 540, 550])

Similarly, you can take the min ( ) or max ( ) along any axis.

<img src="./pics/min-across-axis.png"/>

In [79]:
a.min( axis = 1 )

array([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])

In [80]:
a.min ( axis = 0 )

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])
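
The axis argument is easier to check by hand on a tiny array. Here is a sketch with a made-up 2 x 3 array called <code>small</code>:

```python
import numpy as np

small = np.array([[1, 2, 3],
                  [4, 5, 6]])

print(small.sum(axis=0))   # [5 7 9]  -> one total per column
print(small.sum(axis=1))   # [ 6 15]  -> one total per row
print(small.max(axis=0))   # [4 5 6]  -> largest value in each column
print(small.max(axis=1))   # [3 6]    -> largest value in each row
```

The axis you pass is the one that gets collapsed: a 2 x 3 array gives 3 results for axis=0 and 2 results for axis=1.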

<div id="array-indexing-slicing"/>
<h4> Array indexing & Slicing </h4>

Indexing a 1-d array works exactly like indexing a list.

<img src="./pics/index-1d-array.png"/>

In [3]:
b = np.arange(1,11)
b

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

To get the element at a particular index, just use the square bracket notation (like a list). Indices start at 0, so index 5 is the sixth element.

<img src="./pics/1d-array-index.png"/>

In [89]:
b[5]

6

Indexing a 2-d array is just as simple. Since the array is 2-dimensional now, you have to use 2 indices, one along each axis. Here, a is the 10 x 10 array created earlier.

<img src="./pics/2d-array-4x7.png"/>

In [90]:
a[4,7]

48
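
A few more ways to index the same kind of 10 x 10 array, as a small sketch. The array is rebuilt here under the made-up name <code>grid</code> so that the example runs on its own.

```python
import numpy as np

grid = np.arange(1, 101).reshape(10, 10)

print(grid[4, 7])      # 48  -> row 4, column 7
print(grid[4][7])      # 48  -> same element, reached in two steps
print(grid[0, 0])      # 1   -> first row, first column
print(grid[-1, -1])    # 100 -> negative indices count from the end
```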

Slicing a 1-d array is also similar to slicing a list. Use a slice in place of a number for indexing.

<img src="./pics/1d-array-3-6.png"/>

In [92]:
b

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

In [94]:
b[3:7]

array([4, 5, 6, 7])
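
The other list slicing shortcuts work too. A short sketch, using a made-up array called <code>seq</code> with the same values as b:

```python
import numpy as np

seq = np.arange(1, 11)

print(seq[:3])     # [1 2 3]      -> from the start up to (not including) index 3
print(seq[-3:])    # [ 8  9 10]   -> the last three elements
print(seq[::2])    # [1 3 5 7 9]  -> every second element
```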

Slicing a 2-d array extends the same functionality across all the axes.

<img src="./pics/2d-array-slicing.png"/>

In [95]:
a

array([[  1,   2,   3,   4,   5,   6,   7,   8,   9,  10],
       [ 11,  12,  13,  14,  15,  16,  17,  18,  19,  20],
       [ 21,  22,  23,  24,  25,  26,  27,  28,  29,  30],
       [ 31,  32,  33,  34,  35,  36,  37,  38,  39,  40],
       [ 41,  42,  43,  44,  45,  46,  47,  48,  49,  50],
       [ 51,  52,  53,  54,  55,  56,  57,  58,  59,  60],
       [ 61,  62,  63,  64,  65,  66,  67,  68,  69,  70],
       [ 71,  72,  73,  74,  75,  76,  77,  78,  79,  80],
       [ 81,  82,  83,  84,  85,  86,  87,  88,  89,  90],
       [ 91,  92,  93,  94,  95,  96,  97,  98,  99, 100]])

In [96]:
a[2:5, 3:8]

array([[24, 25, 26, 27, 28],
       [34, 35, 36, 37, 38],
       [44, 45, 46, 47, 48]])
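
One difference from lists is worth knowing: a numpy slice is a view onto the original array, not a copy, so writing to the slice also changes the original. A minimal sketch, with the made-up names <code>grid</code>, <code>window</code> and <code>safe</code>:

```python
import numpy as np

grid = np.arange(1, 101).reshape(10, 10)

window = grid[2:5, 3:8]          # a view onto rows 2-4, columns 3-7
window[0, 0] = -1                # this writes into grid as well
print(grid[2, 3])                # -1

safe = grid[2:5, 3:8].copy()     # an independent copy
safe[0, 0] = 999
print(grid[2, 3])                # still -1
```

Use copy ( ) whenever you want to change a slice without touching the array it came from.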

You can also use a combination of slicing and indexing.

<img src="./pics/2d-array-slicing-indexing.png"/>

In [97]:
a[4,3:8]

array([44, 45, 46, 47, 48])

If you want all the elements along a particular axis, just use a colon (:) with nothing before or after it.

<img src="./pics/2d-array-single-row.png"/>

So, both of these are equivalent.

In [98]:
# Expression 1
a[4,0:10]

array([41, 42, 43, 44, 45, 46, 47, 48, 49, 50])

In [99]:
# Expression 2
a[4, : ]

array([41, 42, 43, 44, 45, 46, 47, 48, 49, 50])
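
The same colon works on the other axis, which is how you pick out a whole column. A small sketch on a rebuilt 10 x 10 array (named <code>grid</code> here so that the example runs on its own):

```python
import numpy as np

grid = np.arange(1, 101).reshape(10, 10)

same = np.array_equal(grid[4, 0:10], grid[4, :])
print(same)          # True -> the two expressions select the same row

column = grid[:, 4]  # every row, column 4
print(column)        # [ 5 15 25 35 45 55 65 75 85 95]
```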

In [101]:
a[[1,4], :]

array([[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [41, 42, 43, 44, 45, 46, 47, 48, 49, 50]])

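What if you wanted several rows that are not next to each other, like so? As in the previous cell, pass a list of row indices in place of a single index.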

<img src="./pics/2d-array-multiple-slices.png"/>

In [102]:
a[ [1,4,8], : ]

array([[11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
       [41, 42, 43, 44, 45, 46, 47, 48, 49, 50],
       [81, 82, 83, 84, 85, 86, 87, 88, 89, 90]])
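
Lists of indices work on columns as well. Unlike a slice, selecting with a list gives you a copy of the data. A minimal sketch, with the made-up names <code>grid</code>, <code>rows</code> and <code>cols</code>:

```python
import numpy as np

grid = np.arange(1, 101).reshape(10, 10)

rows = grid[[1, 4, 8], :]        # rows 1, 4 and 8, every column
cols = grid[:, [0, 9]]           # every row, first and last column
print(rows.shape)                # (3, 10)
print(cols.shape)                # (10, 2)

rows[0, 0] = -1                  # rows is a copy ...
print(grid[1, 0])                # 11 -> ... so grid is unchanged
```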