**Introduction to NumPy**

NumPy (short for Numerical Python) is the fundamental package for scientific computing and data analysis in Python. Many other packages, such as pandas and statsmodels, are built on top of it, so it is an important package to know. At the heart of NumPy is a data structure called **ndarray**. An ndarray is a multi-dimensional array built specifically for numerical data analysis. Python also has array capabilities of its own, but they are more generic. <b>The advantage of using ndarray is that processing is efficient and fast</b>, because all elements share one data type and operations on them run in compiled code rather than in a Python loop.

You can perform standard mathematical operations on either individual elements or complete arrays. The functions cover <b>linear algebra, statistical operations, and other specialized mathematical operations</b>. For our purpose, we need to know about <b>ndarray and the range of mathematical functions that are relevant to our research purpose</b>. If you already know languages such as C or Fortran, you can integrate NumPy code with code written in these languages and pass NumPy arrays to it.
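
As a minimal sketch of the three kinds of functionality mentioned above, the following cell uses element-wise arithmetic, two statistical methods, and `np.linalg.solve`. The values and variable names are made up for illustration and do not come from any dataset used in this session.

```python
import numpy as np

# Illustrative values only (an assumption, not data from this session)
prices = np.array([10.0, 12.0, 11.0, 13.0])

# Element-wise arithmetic on the complete array, no loop needed
doubled = prices * 2

# Statistical operations
mean_price = prices.mean()
std_price = prices.std()

# Linear algebra: solve the system  2x + y = 5,  x + 3y = 10
coeffs = np.array([[2.0, 1.0], [1.0, 3.0]])
rhs = np.array([5.0, 10.0])
solution = np.linalg.solve(coeffs, rhs)

print(doubled)
print(mean_price, std_price)
print(solution)          # x = 1, y = 3 (up to floating-point rounding)
```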

From an overall perspective, an understanding of <b>NumPy will help us use pandas effectively, since pandas is built on top of NumPy, and we will also frequently use NumPy functions in research work</b>. In the current session, we will look only at some of the most important features of NumPy. For a full listing of NumPy features, please visit http://wiki.scipy.org/Numpy_Example_List .

Possible applications of the NumPy package in research work are:
<i>
+ Algorithmic operations such as sorting, grouping and set operations
+ Performing repetitive operations on whole arrays of data without using loops
+ Data merging and alignment operations
+ Data indexing, filtering, and transformation on individual elements or whole arrays
+ Data summarization and descriptive statistics
</i>
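
Several of the applications listed above can be sketched in a few lines. The `scores` array is hypothetical sample data, chosen only to show sorting, set operations, filtering, transformation, and summarization side by side.

```python
import numpy as np

scores = np.array([72, 85, 85, 60, 91, 72])      # hypothetical sample data

sorted_scores = np.sort(scores)                  # sorting
unique_scores = np.unique(scores)                # set operation: distinct values, sorted
common = np.intersect1d(scores, [60, 91, 100])   # set operation: intersection
high = scores[scores > 80]                       # filtering with a boolean condition
curved = scores + 5                              # transformation of the whole array, no loop
summary = (scores.min(), scores.max(), scores.mean())   # descriptive statistics

print(sorted_scores)     # [60 72 72 85 85 91]
print(unique_scores)     # [60 72 85 91]
print(common)            # [60 91]
print(high)              # [85 85 91]
print(summary)
```
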
**Installing NumPy**

To check whether NumPy is installed, go to Package Manager and type NumPy. You will get a list of packages with names closely matching NumPy. For our purpose, we need to focus on the package named numpy 1.xx. If the package is not installed, click Install.
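
Another way to confirm the installation, independent of any package manager, is to import the package and print its version. This is only a quick check; the version string you see depends on your environment.

```python
# Raises ImportError if NumPy is not installed
import numpy as np

# Prints the installed version string
print(np.__version__)
```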

**Importing NumPy**

To use NumPy, first import it with an import statement:

In [1]:
import numpy as np

The above statement imports NumPy and makes it available under the short name np. Importing individual names does not save memory, because Python still loads the whole package, but it lets you call those functions without the np. prefix. You can import specific functions of NumPy by using

In [2]:
from numpy import array
from numpy import arange
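
A quick check, offered as a sketch, that names imported directly are the very same objects as the ones reached through the `np.` prefix:

```python
import numpy as np
from numpy import array, arange

# Both spellings refer to the same function objects
same_array = array is np.array
same_arange = arange is np.arange
print(same_array, same_arange)      # True True
```

The choice between the two import styles is therefore about readability, not about what gets loaded.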

In [3]:
distance = [45,50,35]
speed = [5,10,7]
# Convert the lists to NumPy arrays so that division works element by element
dist = np.array(distance)
spd = np.array(speed)
print (dist)
print (spd)
time= dist/spd
print (time)
len(time)

[45 50 35]
[ 5 10  7]
[9. 5. 5.]


3
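
The "prior challenge" mentioned in the comment is presumably that plain lists cannot be divided element by element. Under that assumption, the sketch below contrasts the list-based workaround with the array version.

```python
import numpy as np

distance = [45, 50, 35]
speed = [5, 10, 7]

# Plain lists do not support element-wise division
try:
    distance / speed
    list_division_works = True
except TypeError:
    list_division_works = False

# Without NumPy, a loop or comprehension is needed
time_from_lists = [d / s for d, s in zip(distance, speed)]

# With NumPy, one expression handles the whole array
time_from_arrays = np.array(distance) / np.array(speed)

print(list_division_works)    # False
print(time_from_lists)        # [9.0, 5.0, 5.0]
print(time_from_arrays)       # [9. 5. 5.]
```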

<b>NDARRAY</b>

The most important data structure in NumPy is the n-dimensional array object, ndarray. Using ndarray, you can store large multidimensional datasets in Python. Because it is an array, you can <b>perform mathematical operations either one element at a time or on complete arrays without using loops</b>. The cells that follow first show how plain Python lists behave and then how to initialize an array object.

In [8]:
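a = [1, 2, 3, 4, 5]                  # initialize a list and assign values to it
b = [10, 20, 30, 40]
print(a + b)                         # for lists, + concatenates instead of adding
# print(a + 5)                       # would raise TypeError: a list cannot be added to an int
c = [val + 5 for val in a]           # adding 5 to every element needs a loop or comprehension
print(c)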

[1, 2, 3, 4, 5, 10, 20, 30, 40]
[6, 7, 8, 9, 10]


In [4]:
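f = array((1,2,3,4,5))                       # initializes an array f and assigns values to it
g = array((10,20,30,40))                     # initializes another array g
#print (f)
#print (g)
print(f+g)                                   # raises ValueError: f has 5 elements, g has 4
print (f+5)                                  # not reached; would add 5 to every element of f
f**2                                         # not reached; would square every element of f
g**3                                         # not reached; would cube every element of g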

ValueError: operands could not be broadcast together with shapes (5,) (4,) 
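
The error above arises because element-wise addition needs arrays of compatible shapes. As a sketch, the cell below catches the error and then repeats the operations with a five-element array; `g5` and its fifth value are hypothetical, added only so that the shapes match.

```python
import numpy as np

f = np.array((1, 2, 3, 4, 5))
g = np.array((10, 20, 30, 40))

# Shapes (5,) and (4,) cannot be combined element by element
try:
    f + g
except ValueError as err:
    message = str(err)

g5 = np.array((10, 20, 30, 40, 50))   # hypothetical fifth value so shapes match
total = f + g5                        # element-wise sum
shifted = f + 5                       # the scalar is applied to every element
squares = f ** 2
cubes = g ** 3

print(message)
print(total)      # [11 22 33 44 55]
print(shifted)    # [ 6  7  8  9 10]
print(squares)    # [ 1  4  9 16 25]
print(cubes)      # [ 1000  8000 27000 64000]
```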

In [5]:
c = array(arange(10))                           # arange(10) generates the integers 0 to 9 in increments of 1
print ('array:      ', c)                       # https://docs.scipy.org/doc/numpy/reference/generated/numpy.arange.html
#print()
#print()

array:       [0 1 2 3 4 5 6 7 8 9]


In [6]:
anarray = array(np.arange(5,56,5))                 # arange(5,56,5) counts from 5 in increments of 5; the stop value 56 is excluded
print ('anarray:        ', anarray)               
#print()
#print()

anarray:         [ 5 10 15 20 25 30 35 40 45 50 55]


In [7]:
onemorearray = array(np.linspace(3,4,13))           # linspace here creates 13 evenly spaced numbers from 3 to 4, both endpoints included
print ('onemorearray:         ', onemorearray)
print ('Rounded onemorearray: ', np.round(onemorearray,0))
onemorearray.sum()

onemorearray:          [3.         3.08333333 3.16666667 3.25       3.33333333 3.41666667
 3.5        3.58333333 3.66666667 3.75       3.83333333 3.91666667
 4.        ]
Rounded onemorearray:  [3. 3. 3. 3. 3. 3. 4. 4. 4. 4. 4. 4. 4.]


45.5
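
To make the spacing used by `linspace` explicit, here is a short sketch. With both endpoints included, the step is (stop - start) / (num - 1). The `endpoint` and `retstep` keyword arguments are part of `np.linspace`; the variable names are illustrative.

```python
import numpy as np

pts = np.linspace(3, 4, 13)            # 13 points, both endpoints included
step = pts[1] - pts[0]                 # spacing is (4 - 3) / (13 - 1)

# Excluding the endpoint: 12 points with the same spacing, 4 is left out
half_open = np.linspace(3, 4, 12, endpoint=False)

# retstep=True also returns the spacing that was used
pts2, step2 = np.linspace(0, 1, 5, retstep=True)

print(len(pts), step)
print(half_open[-1])
print(pts2, step2)                     # [0.   0.25 0.5  0.75 1.  ] 0.25
```

Unlike `arange`, where you choose the step, `linspace` takes the number of points and derives the step from it.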

In [14]:
data = np.array((32,45,123,756,23,2123,1,2,3,4,5,6,6,5,4,3,2,1,78,89,87,76,54,31))
#print(data)
#data1 = data.reshape(8,3)
#print(data1)
data2 = data.reshape(2,2,2,3)
print(data2)
print(data2.shape)
print(data2.dtype)
print(data2.size)
even_data = (data % 2 == 0)
print (even_data)
data[even_data]

[[[[  32   45  123]
   [ 756   23 2123]]

  [[   1    2    3]
   [   4    5    6]]]


 [[[   6    5    4]
   [   3    2    1]]

  [[  78   89   87]
   [  76   54   31]]]]
(2, 2, 2, 3)
int32
24
[ True False False  True False False False  True False  True False  True
  True False  True False  True False  True False False  True  True False]


array([ 32, 756,   2,   4,   6,   6,   4,   2,  78,  76,  54])
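
Two points from the cell above are worth isolating in a small sketch: `reshape` only works when the total number of elements is unchanged, and a boolean array can be used as an index to select elements. The array here is generated with `arange` purely for illustration.

```python
import numpy as np

data = np.arange(24)                  # illustrative values 0..23

# -1 asks NumPy to infer that dimension from the total size (24 / (2*3) = 4)
cube = data.reshape(2, 3, -1)

# A shape whose sizes do not multiply to 24 is rejected
try:
    data.reshape(5, 5)
    reshape_ok = True
except ValueError:
    reshape_ok = False

mask = data % 2 == 0                  # boolean array, True where the value is even
evens = data[mask]                    # boolean indexing keeps the True positions
count = mask.sum()                    # True counts as 1, so this counts the evens
positions = np.where(mask)[0]         # integer positions of the True entries

print(cube.shape)                     # (2, 3, 4)
print(reshape_ok)                     # False
print(evens)
print(count)                          # 12
```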

In [16]:
np.zeros(10)                         # Make a one-dimensional array of 10 zeros

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [17]:
np.zeros((2,5)) + 6                     # Make a 2 x 5 two-dimensional array of zeros, then add 6 to every element

array([[6., 6., 6., 6., 6.],
       [6., 6., 6., 6., 6.]])

In [18]:
np.ones(30) + 7                         # Make a one-dimensional array of 30 ones, then add 7 to every element

array([8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.,
       8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8., 8.])

In [19]:
np.ones((4,6))                       # Make a 4 x 6 two-dimensional array of ones (24 elements)

array([[1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1., 1.]])

In [21]:
np.eye(3)                            # creates a 3*3 identity matrix. 

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [22]:
np.diag(array([1,3,5,3]))        # Populate the diagonal elements only; all other entries are 0

array([[1, 0, 0, 0],
       [0, 3, 0, 0],
       [0, 0, 5, 0],
       [0, 0, 0, 3]])
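
A few related creation routines, shown as a sketch rather than a complete list: `np.full` fills an array with a chosen value directly, the `dtype` argument controls the element type, `np.eye` accepts separate row and column counts, and `np.diag` extracts the diagonal when given a two-dimensional array.

```python
import numpy as np

sixes = np.full((2, 5), 6.0)            # same values as np.zeros((2, 5)) + 6
int_zeros = np.zeros(10, dtype=int)     # integer zeros instead of the float default
rect_eye = np.eye(2, 3)                 # 2 x 3 array with ones on the main diagonal

square = np.diag([1, 3, 5, 3])          # 1-D input: builds a 4 x 4 diagonal matrix
diag_vals = np.diag(square)             # 2-D input: extracts the diagonal again

print(sixes)
print(int_zeros.dtype)
print(rect_eye)
print(diag_vals)                        # [1 3 5 3]
```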

**Broadcasting Rule - NumPy**

In [3]:
np.arange(3)+5

array([5, 6, 7])

In [10]:
np.ones((3,3))+np.arange(3)

array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 3.]])

In [13]:
np.arange(3).reshape(3,1)+np.arange(3)

array([[0, 1, 2],
       [1, 2, 3],
       [2, 3, 4]])
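
The rule behind these results can be stated briefly: shapes are compared from the last dimension backwards, a missing dimension counts as 1, and two sizes are compatible when they are equal or one of them is 1. The function `broadcast_shape` below is a hypothetical helper written for this explanation, not part of NumPy; the sketch compares it with the shape reported by `np.broadcast`.

```python
import numpy as np
from itertools import zip_longest

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rule to two shapes."""
    result = []
    # Walk both shapes from the last dimension; a missing dimension counts as 1
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not compatible")
    return tuple(reversed(result))

col = np.arange(3).reshape(3, 1)      # shape (3, 1)
row = np.arange(3)                    # shape (3,)

print(broadcast_shape(col.shape, row.shape))   # (3, 3)
print(np.broadcast(col, row).shape)            # (3, 3)
print(broadcast_shape((3, 3), (3,)))           # (3, 3)
print(broadcast_shape((2, 3), ()))             # (2, 3), a scalar has shape ()
```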

In [15]:
multi_array=np.array([(4,1),(2,5),(8,9)])
print(multi_array.sum(axis=0))

[14 15]


In [16]:
multi_array=np.array([(4,1),(2,5),(8,9)])
print(multi_array.sum(axis=1))

[ 5  7 17]
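
The `axis` argument names the dimension that is collapsed by the sum. As a sketch, the cell below repeats the two sums and adds `keepdims=True`, a standard argument of `sum` that keeps the collapsed dimension with size 1 so the result can be broadcast back against the original array.

```python
import numpy as np

multi_array = np.array([(4, 1), (2, 5), (8, 9)])          # shape (3, 2)

col_totals = multi_array.sum(axis=0)     # collapses the 3 rows, shape (2,)
row_totals = multi_array.sum(axis=1)     # collapses the 2 columns, shape (3,)
grand_total = multi_array.sum()          # no axis: sum of all elements

# keepdims=True gives shape (3, 1), which broadcasts against shape (3, 2)
row_totals_2d = multi_array.sum(axis=1, keepdims=True)
row_shares = multi_array / row_totals_2d # each row now sums to 1

print(col_totals)        # [14 15]
print(row_totals)        # [ 5  7 17]
print(grand_total)       # 29
print(row_totals_2d.shape)
print(row_shares)
```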


In [18]:
multi_array=np.array([(4,1),(2,5),(8,9)])
multi_array.shape
multi_array.reshape(2,3)

array([[4, 1, 2],
       [5, 8, 9]])
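
Note that `reshape` reads the elements in row order and refills them into the new shape, which is not the same as swapping rows and columns. The following sketch contrasts it with the transpose attribute `.T` and confirms that the original array keeps its shape.

```python
import numpy as np

multi_array = np.array([(4, 1), (2, 5), (8, 9)])   # shape (3, 2)

reshaped = multi_array.reshape(2, 3)     # same element order, new shape
transposed = multi_array.T               # rows and columns swapped

print(reshaped)          # [[4 1 2]
                         #  [5 8 9]]
print(transposed)        # [[4 2 8]
                         #  [1 5 9]]
print(multi_array.shape) # (3, 2), reshape returned a new array object
```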

In [19]:
# scalar and one-dimensional
a = array([1, 2, 3])
print(a)
b = 2
print(b)
c = a + b
print(c)

[1 2 3]
2
[3 4 5]


In [20]:
# scalar and two-dimensional
A = array([[1, 2, 3], [1, 2, 3]])
print(A)
b = 2
print(b)
C = A + b
print(C)

[[1 2 3]
 [1 2 3]]
2
[[3 4 5]
 [3 4 5]]


In [21]:
# one-dimensional and two-dimensional
A = array([[1, 2, 3], [1, 2, 3]])
print(A)
b = array([1, 2, 3])
print(b)
C = A + b
print(C)

[[1 2 3]
 [1 2 3]]
[1 2 3]
[[2 4 6]
 [2 4 6]]
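
The one-dimensional case above works because the length of `b` matches the last dimension of `A`. A sketch of the opposite situation, with one value per row instead of one per column, is given here; `row_offsets` is a hypothetical array introduced for the example, and `np.newaxis` is used to turn it into a column.

```python
import numpy as np

A = np.array([[1, 2, 3], [1, 2, 3]])     # shape (2, 3)
row_offsets = np.array([10, 20])         # shape (2,), one value per row

# (2, 3) and (2,) do not broadcast: the last dimensions are 3 and 2
try:
    A + row_offsets
    worked = True
except ValueError:
    worked = False

# Adding a new axis gives shape (2, 1), which is compatible with (2, 3)
C = A + row_offsets[:, np.newaxis]

print(worked)    # False
print(C)         # [[11 12 13]
                 #  [21 22 23]]
```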
