# What is NumPy?

"NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more"
-https://docs.scipy.org/doc/numpy-1.10.1/user/whatisnumpy.html.

How do we import the NumPy library in a Jupyter Notebook?

In [1]:
import numpy as np

Let's run through an example showing how powerful NumPy is. Suppose we have two lists a and b, each consisting of the first 100,000 non-negative integers, and we want to create a new list c whose ith element is a[i] * b[i].

Without NumPy:

In [2]:
%%time
a = [i for i in range(100000)]
b = [i for i in range(100000)]

Wall time: 36 ms


In [3]:
%%time
c = []
for i in range(len(a)):
    c.append(a[i]  *  b[i])

Wall time: 48 ms


Notice the wall time, which is the real elapsed time a process needs to complete its task:
1st cell (building the lists): Wall time: 36 ms
2nd cell (the multiplication loop): Wall time: 48 ms

Exact timings vary from run to run and from machine to machine. Now let's perform the same task with NumPy and compare.

In [4]:
%%time
a = np.arange(100000)
b = np.arange(100000)

Wall time: 15 ms


In [5]:
%%time
c = a  * b

Wall time: 4 ms


The result is 10 to 15 times faster, and we could do it in fewer lines of code (and the code itself is more intuitive)!
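
The %%time magic only works inside a notebook. As a rough sketch, the same comparison can be made in a plain script with the standard timeit module. The function names and the explicit int64 dtype are assumptions made for this illustration (int64 keeps the largest product, 99999 * 99999, from overflowing), and the measured times depend on the machine.

```python
import timeit

import numpy as np

n = 100_000
a_list = list(range(n))
b_list = list(range(n))
# int64 is requested explicitly so the products cannot overflow a 32-bit integer
a_arr = np.arange(n, dtype=np.int64)
b_arr = np.arange(n, dtype=np.int64)


def multiply_loop():
    # Pure-Python elementwise product, one pair at a time
    return [x * y for x, y in zip(a_list, b_list)]


def multiply_numpy():
    # Vectorized elementwise product
    return a_arr * b_arr


# Both approaches compute the same values
same_result = multiply_loop() == multiply_numpy().tolist()

# Average seconds per call over 10 calls
loop_time = timeit.timeit(multiply_loop, number=10) / 10
numpy_time = timeit.timeit(multiply_numpy, number=10) / 10
print(f"loop: {loop_time * 1e3:.2f} ms, numpy: {numpy_time * 1e3:.2f} ms")
```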

Regular Python is much slower due to type checking and other overhead of needing to interpret code and support Python's abstractions.

For example, if we are doing some addition in a loop, constantly type checking in a loop will lead to many more instructions than just performing a regular addition operation. NumPy, using optimized pre-compiled C code, is able to avoid a lot of the overhead introduced.

The process we used above is vectorization. Vectorization refers to applying operations to arrays instead of just individual elements (i.e. no loops).

Why vectorize?

1. Much faster
2. Easier to read and fewer lines of code
3. More closely resembles mathematical notation

Vectorization is one of the main reasons why NumPy is so powerful.
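
To illustrate the readability point, here is a small sketch that computes a mean squared error twice: once with an explicit loop and once vectorized. The data values and variable names are made up for this example.

```python
import math

import numpy as np

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 5.0]

# Loop version: accumulate the squared differences one element at a time
total = 0.0
for t, p in zip(y_true, y_pred):
    total += (t - p) ** 2
mse_loop = total / len(y_true)

# Vectorized version: reads like the formula mean((t - p)^2)
t_arr = np.array(y_true)
p_arr = np.array(y_pred)
mse_vec = np.mean((t_arr - p_arr) ** 2)

print(mse_loop, mse_vec)
```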

# What is an Array?
A NumPy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

How do we create an array?

In [6]:
import numpy as np
array = np.arange(10)  # an array of the integers 0 through 9
array

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [7]:
# Check the number of elements:
array.size     

10

In [8]:
# itemsize is the size of one element in bytes. It is 4 here because the integers are int32;
# on platforms where the default integer type is int64 it is 8.
array.itemsize

4

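It is better to store large numeric data in a NumPy array: here each element takes 4 bytes, whereas every Python int is a full object with extra overhead (14 bytes on a 32-bit CPython 3 build, typically 28 bytes on a 64-bit build).
NumPy arrays are also faster, because they take much less time to process when you have large amounts of data.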
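
A rough way to see the memory difference is sketched below. It uses sys.getsizeof, which reports per-object sizes that depend on the Python build, so only the comparison matters, not the exact numbers. The variable names are illustrative.

```python
import sys

import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int32)

# The array's data buffer is simply size * itemsize bytes
array_bytes = arr.nbytes

# The list stores pointers, and every int is a separate object on top of that
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(i) for i in py_list)

print(arr.itemsize, array_bytes, list_bytes)
```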

In [9]:
# Create an array of the integers 3 through 5 (the stop value 6 is excluded).
np.arange(3,6)

array([3, 4, 5])

How do we get the length of this array?

In [10]:
len(array)

10

To check its shape, we can use:

In [11]:
array.shape

(10,)

In [12]:
# Go from 2 up to (but not including) 11 in steps of 2.
np.arange(2,11,2)

array([ 2,  4,  6,  8, 10])

How can we create a 1-D array?

In [13]:
a = np.array([3,3,0,3,3]) #1D array
a

array([3, 3, 0, 3, 3])

In [14]:
# Find the number of dimensions of the array:
a.ndim

1

How can we create a 2-D array?

In [15]:
b=np.array([[2,3],[4,5],[6,7]]) # ...2D array
print(b.ndim)
print(b.shape)
b

2
(3, 2)


array([[2, 3],
       [4, 5],
       [6, 7]])

# ndarray
ndarrays, n-dimensional arrays of homogeneous data type, are the fundamental datatype used in NumPy. As these arrays are of the same type and are fixed size at creation, they offer less flexibility than Python lists, but can be substantially more efficient runtime and memory-wise. (Python lists are arrays of pointers to objects, adding a layer of indirection.)
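
A short sketch of what "same type" and "fixed size" mean in practice. The variable names are illustrative, and the behavior shown is standard NumPy casting.

```python
import numpy as np

# Mixing ints and a float upcasts every element to one common dtype
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)            # float64

# Assigning a float into an integer array casts the value to the array's dtype
ints = np.array([1, 2, 3])
ints[0] = 7.9
print(ints)                   # [7 2 3]

# Arrays do not grow in place; np.append returns a new, larger array
grown = np.append(ints, 4)
print(ints.size, grown.size)  # 3 4
```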

The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.

In [16]:
# Can initialize ndarrays with Python lists, for example:
a = np.array([1, 2, 3])   # Create a rank 1 array
print(type(a))            # Prints "<class 'numpy.ndarray'>"
print(a.shape)            # Prints "(3,)"
print(a.ndim)             # Prints the number of dimensions of the array: "1"
print(a[0], a[1], a[2])   # Prints "1 2 3"
a[0] = 5                  # Change an element of the array
print(a)                  # Prints "[5 2 3]"

b = np.array([[1, 2, 3],
              [4, 5, 6]])    # Create a rank 2 array
print(b.shape)                     # Prints "(2, 3)"
print(b[0, 0], b[0, 1], b[1, 0])   # Prints "1 2 4"

<class 'numpy.ndarray'>
(3,)
1
1 2 3
[5 2 3]
(2, 3)
1 2 4


How do we make an n-dimensional array?

# Creating N-dimensional arrays

In [17]:
b = np.array([[[1],[2],[3]],[[4],[5],[6]]])   # Create a rank 3 array
print (b)
print(b.ndim)
print(b.shape)

[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]]
3
(2, 3, 1)


In [18]:
# Gives datatype.
a.dtype

dtype('int32')

In [19]:
# With the dtype specified as float64 (or int64), each item takes 8 bytes.
c = np.array([[2,3],[4,5],[6,7]], dtype=np.float64)
zz = np.array([1, 2], dtype=np.int64) 
print(c.dtype)
print(zz.dtype)

float64
int64
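
The relationship between dtype and itemsize can be checked directly, as in the sketch below. astype is used to convert an existing array; the variable names are illustrative.

```python
import numpy as np

ints = np.array([1, 2, 3], dtype=np.int32)
print(ints.dtype, ints.itemsize)       # int32 4

# astype returns a converted copy and leaves the original untouched
floats = ints.astype(np.float64)
print(floats.dtype, floats.itemsize)   # float64 8

# Total data size is the number of elements times itemsize
print(ints.nbytes, floats.nbytes)      # 12 24
```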


There are many other initializations that NumPy provides:

In [20]:
a = np.zeros((2, 2))   # Create an array of all zeros
print(a)               # Prints "[[ 0.  0.]
                       #          [ 0.  0.]]"

b = np.full((2, 2), 7)  # Create a constant array
print(b)                # Prints "[[7 7]
                        #          [7 7]]"

c = np.eye(2)         # Create a 2 x 2 identity matrix
print(c)              # Prints "[[ 1.  0.]
                      #          [ 0.  1.]]"

d = np.random.random((2, 2))  # Create an array filled with random values
print(d)                      # Might print "[[ 0.91940167  0.08143941]
                              #               [ 0.68744134  0.87236687]]"

[[0. 0.]
 [0. 0.]]
[[7 7]
 [7 7]]
[[1. 0.]
 [0. 1.]]
[[0.83035636 0.14928099]
 [0.09778568 0.23697491]]


How do we create a 2 by 2 matrix of ones?

In [21]:
a = np.ones((2, 2))    # Create an array of all ones
print(a)               # Prints "[[ 1.  1.]
                       #          [ 1.  1.]]"

[[1. 1.]
 [1. 1.]]


It is useful to keep track of array shapes: it helps with debugging, and knowing the dimensions is very useful when computing gradients, among other things.

How do we change the shape of an array?

In [22]:
nums = np.arange(8)
print(nums)
print(nums.shape)

nums = nums.reshape((2, 4))
print('Reshaped:\n', nums)
print(nums.shape)

# The -1 in reshape corresponds to an unknown dimension that numpy will figure out,
# based on all other dimensions and the array size.
# Can only specify one unknown dimension.
# For example, sometimes we might have an unknown number of data points, and
# so we can use -1 instead without worrying about the true number.
nums = nums.reshape((4, -1))
print('Reshaped with -1:\n', nums)
print(nums.shape)

[0 1 2 3 4 5 6 7]
(8,)
Reshaped:
 [[0 1 2 3]
 [4 5 6 7]]
(2, 4)
Reshaped with -1:
 [[0 1]
 [2 3]
 [4 5]
 [6 7]]
(4, 2)


NumPy supports an object-oriented paradigm, such that ndarray has a number of methods and attributes, with functions similar to ones in the outermost NumPy namespace. For example, we can do both:

In [23]:
nums = np.arange(8)
print(nums.min())     # Prints 0
print(np.min(nums))   # Prints 0

0
0


# A Little Discussion about Flatten and Ravel

Both flatten() and ravel() return a 1-D version of an array. flatten() exists only as a method of ndarray, so it can only be called on true NumPy arrays, and it always returns a copy. ravel() is available both as an ndarray method and as the library-level function np.ravel(), which accepts any object that can be converted to an array (for example, a list of ndarrays or nested lists), and it returns a view of the original data whenever possible.
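
A further practical difference is that flatten() always returns a copy, while ravel() returns a view when it can. The sketch below checks this with np.shares_memory; the array m is an illustrative example.

```python
import numpy as np

m = np.arange(6).reshape(2, 3)

flat = m.flatten()   # always a copy
rav = m.ravel()      # a view here, because m is contiguous in memory

flat[0] = 100        # does not touch m
rav[1] = -1          # writes through to m
print(m)

print(np.shares_memory(m, flat))   # False
print(np.shares_memory(m, rav))    # True

# The library-level function also accepts plain nested lists
print(np.ravel([[1, 2], [3, 4]]))  # [1 2 3 4]
```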

In [24]:
print (b)

[[7 7]
 [7 7]]


In [25]:
b.flatten() 

array([7, 7, 7, 7])

In [26]:
# Flattening an array is used a lot in computer vision.
b.ravel()

array([7, 7, 7, 7])

# Array Operations/Math.

NumPy supports many elementwise operations:

In [27]:
x = np.array([[1, 2],
              [3, 4]], dtype=np.float64)
y = np.array([[5, 6],
              [7, 8]], dtype=np.float64)

# Elementwise sum; both produce the array
# [[ 6.0  8.0]
#  [10.0 12.0]]
print(x + y)
print(np.add(x, y))

# Elementwise difference; both produce the array
# [[-4.0 -4.0]
#  [-4.0 -4.0]]
print(x - y)
print(np.subtract(x, y))

# Elementwise product; both produce the array
# [[ 5.0 12.0]
#  [21.0 32.0]]
print(x * y)
print(np.multiply(x, y))

# Elementwise square root; produces the array
# [[ 1.          1.41421356]
#  [ 1.73205081  2.        ]]
print(np.sqrt(x))

[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]
[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]
[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]
[[1.         1.41421356]
 [1.73205081 2.        ]]


How do we elementwise divide between two arrays?

In [28]:
x = np.array([[1, 2], [3, 4]], dtype=np.float64)
y = np.array([[5, 6], [7, 8]], dtype=np.float64)

# Elementwise division; both produce the array
# [[ 0.2         0.33333333]
#  [ 0.42857143  0.5       ]]
print(x / y)
print(np.divide(x, y))

[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]


Note * is elementwise multiplication, not matrix multiplication. We instead use the dot function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. dot is available both as a function in the numpy module and as an instance method of array objects:
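
A minimal sketch of the three uses of dot mentioned above. The arrays x, y, v and w are illustrative values chosen for this example.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])
y = np.array([[5, 6], [7, 8]])
v = np.array([9, 10])
w = np.array([11, 12])

# Inner product of vectors; both produce 219
print(v.dot(w))
print(np.dot(v, w))

# Matrix / vector product; both produce the rank 1 array [29 67]
print(x.dot(v))
print(np.dot(x, v))

# Matrix / matrix product; all three produce the rank 2 array
# [[19 22]
#  [43 50]]
print(x.dot(y))
print(np.dot(x, y))
print(x @ y)   # the @ operator performs matrix multiplication on 2-D arrays
```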



There are many useful functions built into NumPy, and often we're able to express them across specific axes of the ndarray:

In [29]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.sum(x))          # Compute sum of all elements; prints "21"
print(np.sum(x, axis=0))  # Compute sum of each column; prints "[5 7 9]"
print(np.sum(x, axis=1))  # Compute sum of each row; prints "[ 6 15]"

print(np.max(x, axis=1))  # Compute max of each row; prints "[3 6]"

21
[5 7 9]
[ 6 15]
[3 6]


How can we compute the index of the max value of each row? This is useful, for example, to find the class that corresponds to the maximum score for an input image.

In [30]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.argmax(x, axis=1)) # Compute index of max of each row; prints "[2 2]"

[2 2]


# Axis 0 Indexes the Rows and Axis 1 Indexes the Columns
 

In [31]:
b=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(b)
b[0, 1]  # the element in the 1st row, 2nd column

[[1 2 3]
 [4 5 6]
 [7 8 9]]


2

In [32]:
b[0:2, 0]  # from rows 0 and 1, take the first column (index 0)

array([1, 4])

In [33]:
b[-1]   # gives the last row

array([7, 8, 9])

In [34]:
x = np.array([[1,2],[3,4]])
print("array: ",x)
print("-----")
print("sum of all elements : ", np.sum(x))         # prints "10"
print("sum of each column : ", np.sum(x, axis=0))  # axis=0 collapses the rows; prints "[4 6]"
print("sum of each row : ", np.sum(x, axis=1))     # axis=1 collapses the columns; prints "[3 7]"

array:  [[1 2]
 [3 4]]
-----
sum of all elements :  10
sum of each column :  [4 6]
sum of each row :  [3 7]
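
One way to remember the axis argument is that the axis you pass is the one that disappears from the result's shape. The sketch below uses an illustrative 3-D array to show this; keepdims=True keeps the reduced axis with length 1.

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

print(np.sum(x, axis=0).shape)                 # (3, 4)
print(np.sum(x, axis=1).shape)                 # (2, 4)
print(np.sum(x, axis=2).shape)                 # (2, 3)
print(np.sum(x, axis=1, keepdims=True).shape)  # (2, 1, 4)
```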


In [35]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.max(x, axis=1))  # Compute max of each row; prints "[3 6]" 

[3 6]


How can we compute the index of the max value of each row? This is useful, for example, to find the class that corresponds to the maximum score for an input image.

In [36]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

print(np.argmax(x, axis=1)) # Compute index of max of each row; prints "[2 2]"

[2 2]


In [37]:
# Now guess the output: argmax along axis 0 gives, for each column, the row index of its largest value
x = np.array([[1, 2, 16,0,3], 
              [4, 9, 6,2,1]])

print(np.argmax(x, axis=0)) 

[1 1 0 1 0]


In [38]:
d = np.random.randint(2, 10, size=(5, 5))  # a 5 x 5 matrix of random integers from 2 to 9
# (the upper bound 10 is not included)
d

array([[8, 5, 2, 4, 6],
       [7, 8, 6, 4, 8],
       [9, 8, 7, 8, 3],
       [9, 3, 9, 4, 5],
       [8, 7, 4, 2, 8]])

In [39]:
d[:, 1:3]  # all rows of columns 1 and 2

array([[5, 2],
       [8, 6],
       [8, 7],
       [3, 9],
       [7, 4]])

#  Indexing

NumPy also provides powerful indexing schemes.

In [40]:
# Create the following rank 2 array with shape (3, 4)
# [[ 1  2  3  4]
#  [ 5  6  7  8]
#  [ 9 10 11 12]]
a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8],
              [9, 10, 11, 12]])
print('Original:\n', a)

# Can select an element as you would in a 2 dimensional Python list
print('Element (0, 0) (a[0][0]):\n', a[0][0])   # Prints 1
# or as follows
print('Element (0, 0) (a[0, 0]) :\n', a[0, 0])  # Prints 1

# Use slicing to pull out the subarray consisting of the first 2 rows
# and columns 1 and 2; b is the following array of shape (2, 2):
# [[2 3]
#  [6 7]]
b = a[:2, 1:3]
print('Sliced (a[:2, 1:3]):\n', b)

# Steps are also supported in indexing. The following reverses the first row:
print('Reversing the first row (a[0, ::-1]) :\n', a[0, ::-1]) # Prints [4 3 2 1]

Original:
 [[ 1  2  3  4]
 [ 5  6  7  8]
 [ 9 10 11 12]]
Element (0, 0) (a[0][0]):
 1
Element (0, 0) (a[0, 0]) :
 1
Sliced (a[:2, 1:3]):
 [[2 3]
 [6 7]]
Reversing the first row (a[0, ::-1]) :
 [4 3 2 1]


We can also use boolean indexing/masks. Suppose we want to set all elements greater than MAX to MAX:

In [41]:
MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])
print(nums > MAX)            # Prints [False, False, True, False, True, False, False]

nums[nums > MAX] = MAX
print(nums)                  # Prints [1, 4, 5, -1, 5, 0, 5]

[False False  True False  True False False]
[ 1  4  5 -1  5  0  5]
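
The mask assignment above modifies nums in place. When the original array should stay untouched, the same capping can be sketched with np.minimum or np.where, which return new arrays. The variable names capped_min and capped_where are illustrative.

```python
import numpy as np

MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])

# Elementwise minimum of each value and MAX
capped_min = np.minimum(nums, MAX)

# Where the condition holds take MAX, otherwise keep the original value
capped_where = np.where(nums > MAX, MAX, nums)

print(capped_min)    # [ 1  4  5 -1  5  0  5]
print(capped_where)  # [ 1  4  5 -1  5  0  5]
print(nums)          # the original is unchanged: [ 1  4 10 -1 15  0  5]
```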


# What is Stacking?

numpy.stack(arrays, axis=0, out=None)
Join a sequence of arrays along a new axis.

The axis parameter specifies the index of the new axis in the dimensions of the result. For example, if axis=0 it will be the first dimension, and if axis=-1 it will be the last dimension.
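
A small sketch of np.stack itself. It also shows that np.vstack and np.hstack differ from it by joining along an existing axis rather than creating a new one. The arrays a and b are illustrative.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a + 10

first = np.stack((a, b), axis=0)    # new axis in front
last = np.stack((a, b), axis=-1)    # new axis at the end

print(first.shape)                  # (2, 2, 3)
print(last.shape)                   # (2, 3, 2)

# vstack and hstack keep the number of dimensions
print(np.vstack((a, b)).shape)      # (4, 3)
print(np.hstack((a, b)).shape)      # (2, 6)
```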

In [42]:
a =  np.arange(9).reshape(3 , 3)
print(a)
b = np.arange(10,19).reshape(3,3)
print(b)
c = np.arange(20,29).reshape(3,3)
print(c)


[[0 1 2]
 [3 4 5]
 [6 7 8]]
[[10 11 12]
 [13 14 15]
 [16 17 18]]
[[20 21 22]
 [23 24 25]
 [26 27 28]]


In [43]:
np.hstack((a, b,c))

array([[ 0,  1,  2, 10, 11, 12, 20, 21, 22],
       [ 3,  4,  5, 13, 14, 15, 23, 24, 25],
       [ 6,  7,  8, 16, 17, 18, 26, 27, 28]])

In [44]:
np.vstack((a, c , b))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [20, 21, 22],
       [23, 24, 25],
       [26, 27, 28],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18]])

#### Note: you can reshape an array into any shape with the same total number of elements, e.g. (3, 2) can become (1, 6)

In [45]:
p = a.reshape(-1, 1)  # now this is a column vector; -1 tells NumPy to infer that dimension from the number of
# elements, so here it is equivalent to p = a.reshape(9, 1)
p

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8]])

In [46]:
o=a.reshape(1,-1)  # now this is a row vector
o

array([[0, 1, 2, 3, 4, 5, 6, 7, 8]])
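
The reshape rule can be checked directly: a new shape is accepted only if it holds the same number of elements. A minimal sketch follows, with an illustrative array and a flag variable named failed.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)

print(a.reshape(1, 6).shape)    # (1, 6)
print(a.reshape(2, -1).shape)   # (2, 3), the -1 is inferred as 3

# 4 * 2 = 8 elements do not match the 6 available, so NumPy raises ValueError
failed = False
try:
    a.reshape(4, 2)
except ValueError as err:
    failed = True
    print("reshape failed:", err)
```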

In [47]:
hashe = np.arange(40).reshape(10,4)
hashe

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15],
       [16, 17, 18, 19],
       [20, 21, 22, 23],
       [24, 25, 26, 27],
       [28, 29, 30, 31],
       [32, 33, 34, 35],
       [36, 37, 38, 39]])

In [48]:
gap = np.arange(40).reshape(10,4)
print(gap)
print("Below is part 1")
part1,part2 = np.split(gap,2 , axis=1)
print(part1)
print("Below is part 2")
print(part2)

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]
 [16 17 18 19]
 [20 21 22 23]
 [24 25 26 27]
 [28 29 30 31]
 [32 33 34 35]
 [36 37 38 39]]
Below is part 1
[[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]
 [16 17]
 [20 21]
 [24 25]
 [28 29]
 [32 33]
 [36 37]]
Below is part 2
[[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]
 [18 19]
 [22 23]
 [26 27]
 [30 31]
 [34 35]
 [38 39]]


In [49]:
gap = np.arange(48).reshape(8,6)
print("org array : ",gap)
np.hsplit(gap,3)

org array :  [[ 0  1  2  3  4  5]
 [ 6  7  8  9 10 11]
 [12 13 14 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]
 [36 37 38 39 40 41]
 [42 43 44 45 46 47]]


[array([[ 0,  1],
        [ 6,  7],
        [12, 13],
        [18, 19],
        [24, 25],
        [30, 31],
        [36, 37],
        [42, 43]]), array([[ 2,  3],
        [ 8,  9],
        [14, 15],
        [20, 21],
        [26, 27],
        [32, 33],
        [38, 39],
        [44, 45]]), array([[ 4,  5],
        [10, 11],
        [16, 17],
        [22, 23],
        [28, 29],
        [34, 35],
        [40, 41],
        [46, 47]])]
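
np.split and np.hsplit require the array to divide evenly. When it does not, np.array_split can be used instead, as sketched below with an illustrative 10-element array.

```python
import numpy as np

gap = np.arange(10)

# 10 elements cannot be divided into 3 equal parts, so np.split raises ValueError
try:
    np.split(gap, 3)
    split_ok = True
except ValueError:
    split_ok = False

# array_split allows parts of unequal size
parts = np.array_split(gap, 3)
print([p.tolist() for p in parts])   # [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Concatenating the parts restores the original array
restored = np.concatenate(parts)
```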

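### Rank-1 (1-D) arrays can behave unexpectedly in matrix operations, so prefer explicit 2-D row or column vectors.

So instead of a rank-1 array such as v = np.random.rand(5), shown first below, use np.random.rand(1, 5) for a row vector or np.random.rand(5, 1) for a column vector.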

In [50]:
v=np.random.rand(5)
print(v)
print(v.shape)

[0.72459625 0.86529339 0.33470038 0.91341508 0.81199312]
(5,)


In [51]:
v=np.random.rand(1,5)
print(v)
print(v.shape)  #row vector

[[0.43471471 0.95627951 0.46830148 0.51742784 0.77101663]]
(1, 5)


In [52]:
v=np.random.rand(5,1)
print(v)
print(v.shape)   #column vector

[[0.29541893]
 [0.53130844]
 [0.72653446]
 [0.99555067]
 [0.69223858]]
(5, 1)
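
The sketch below shows why the distinction matters: transposing a rank-1 array does nothing, while explicit row and column vectors behave like matrices. The values are illustrative.

```python
import numpy as np

v = np.arange(3.0)          # rank-1 array, shape (3,)
col = v.reshape(-1, 1)      # column vector, shape (3, 1)
row = v.reshape(1, -1)      # row vector, shape (1, 3)

print(v.T.shape)            # (3,)   transposing a rank-1 array changes nothing
print(col.T.shape)          # (1, 3)
print(v @ v)                # 5.0    inner product, a scalar
print((col @ row).shape)    # (3, 3) outer product, a matrix
```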


In [53]:
for cell in v.flatten():
    print(cell)  # this prints out each and every cell
    

0.295418927553048
0.5313084448768858
0.7265344643809585
0.9955506677292384
0.6922385754866872


### How to convert a Python list to a NumPy array

In [54]:
pythonlist = [2,3,4,4]
print(type(pythonlist))
numpylist = np.array(pythonlist)
print(type(numpylist))

<class 'list'>
<class 'numpy.ndarray'>


In [55]:
# Deeper indexing
print(p[0][0])
# Remember: beyond rows and columns, NumPy arrays are just boxes nested inside boxes

0


In [56]:
np.random.seed(50)
co = np.random.randint(2, 100, size=(1, 3, 3, 3))  # this array has 4 dimensions
co

array([[[[50, 98, 13],
         [35, 96,  6],
         [72, 72, 24]],

        [[ 7,  4, 97],
         [73, 70, 80],
         [37, 94, 93]],

        [[28, 92,  8],
         [22, 45, 33],
         [51, 87, 43]]]])

In [57]:
co[0][1][1][2]  # selects 80 (not displayed, since a cell only shows its last expression)
co[0][0][1][2]  # selects 6

6

In [58]:
co[0, 1, 1, 2]  # 2nd method: a single multi-dimensional index

80

In [59]:
# This is how you get a new array containing only the elements that satisfy a condition
ab= np.array([[1,2], [3, 4], [5, 6]])  
bool_idx = (ab > 2) 
ab =ab[bool_idx]
print(ab)

[3 4 5 6]
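
Conditions can also be combined. The sketch below uses the elementwise operators & (and), | (or) and ~ (not), plus np.nonzero to recover the positions of the matching elements. The conditions chosen are illustrative.

```python
import numpy as np

ab = np.array([[1, 2], [3, 4], [5, 6]])

# Each condition needs its own parentheses because & binds tighter than > and ==
even_and_big = ab[(ab > 2) & (ab % 2 == 0)]
print(even_and_big)         # [4 6]

small_or_big = ab[(ab < 2) | (ab > 5)]
print(small_or_big)         # [1 6]

# Row and column indices of the elements greater than 4
rows, cols = np.nonzero(ab > 4)
print(rows, cols)           # [2 2] [0 1]
```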


# Slicing

In [60]:
# Slicing Python lists (optional read)
nums = list(range(5))# range is a built-in that produces a sequence of integers
                     # (in Python 2 'range' returns a list; in Python 3 it returns a lazy range object, hence list(...))
print(nums)          # Prints "[0, 1, 2, 3, 4]"
print(nums[2:4])     # Get a slice from index 2 to 4 (exclusive); prints "[2, 3]"
print(nums[2:])      # Get a slice from index 2 to the end; prints "[2, 3, 4]"
print(nums[:2])      # Get a slice from the start to index 2 (exclusive); prints "[0, 1]"
print(nums[:])       # Get a slice of the whole list; prints "[0, 1, 2, 3, 4]"
print(nums[:-1])     # Slice indices can be negative; prints "[0, 1, 2, 3]"
nums[2:4] = [8, 9]   # Assign a new sublist to a slice
print(nums)          # Prints "[0, 1, 8, 9, 4]"

[0, 1, 2, 3, 4]
[2, 3]
[2, 3, 4]
[0, 1]
[0, 1, 2, 3, 4]
[0, 1, 2, 3]
[0, 1, 8, 9, 4]


# Views and Copies

Unlike a copy, in a **view** of an array, the data is shared between the view and the array. Sometimes, our results are copies of arrays, but other times they can be views. Understanding when each is generated is important to avoid any unforeseen issues.

Views can be created from a slice of an array, changing the dtype of the same data area (using arr.view(dtype), not the result of arr.astype(dtype)), or even both.

In [61]:
x = np.arange(5)
print('Original:\n', x)  # Prints [0 1 2 3 4]

# Modifying the view will modify the array
view = x[0:3]
view[1] = -100
print('Array After Modified View:\n', x) 

Original:
 [0 1 2 3 4]
Array After Modified View:
 [   0 -100    2    3    4]


In [62]:
x = np.arange(5)
print('Original:\n', x)  # Prints [0 1 2 3 4]

# Modifying the result of the selection due to fancy indexing
# will not modify the original array.
copy = x[[1, 2]]  # or use copy = x[1:3].copy()
copy[1] = -1
print('Copy:\n', copy) # Prints [1 -1]
print('Array After Modified Copy:\n', x)  # Prints [0 1 2 3 4]

Original:
 [0 1 2 3 4]
Copy:
 [ 1 -1]
Array After Modified Copy:
 [0 1 2 3 4]
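
When it is unclear whether a result is a view or a copy, np.shares_memory and the .base attribute offer a quick check. A minimal sketch with illustrative variable names:

```python
import numpy as np

x = np.arange(5)

sliced = x[1:4]          # basic slicing gives a view
fancy = x[[1, 2, 3]]     # fancy (integer array) indexing gives a copy
masked = x[x > 1]        # boolean indexing gives a copy

print(np.shares_memory(x, sliced))    # True
print(np.shares_memory(x, fancy))     # False
print(np.shares_memory(x, masked))    # False
print(sliced.base is x)               # True: the view's base is the original array

# view(dtype) reinterprets the same bytes, astype(dtype) converts into a new array
raw = x.view(np.uint8)
as_float = x.astype(np.float64)
print(np.shares_memory(x, raw))       # True
print(np.shares_memory(x, as_float))  # False
```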


# Summary

1. NumPy is an incredibly powerful library for computation providing both massive efficiency gains and convenience.
2. Vectorize! Orders of magnitude faster.
3. Keeping track of the shape of your arrays is often useful.
4. Many useful math functions and operations built into NumPy.
5. Select and manipulate arbitrary pieces of data with powerful indexing schemes.
6. Watch out for views vs. copies.