
# <center><font style="color:rgb(100,109,254)">  NumPy </font></center>


"NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object, various derived objects (such as masked arrays and matrices), and an assortment of routines for fast operations on arrays, including mathematical, logical, shape manipulation, sorting, selecting, I/O, discrete Fourier transforms, basic linear algebra, basic statistical operations, random simulation and much more"
-https://docs.scipy.org/doc/numpy-1.10.1/user/whatisnumpy.html.

<font style="color:rgb(34,169,34)"> 
    Creating nd array,
    Shape/Reshape,
    Vectorization,
    where,
    Maths and Stats,
    Sort,
    Unique,
    Flat vs Ravel,
    Stacking,
    View vs Copy,

</font> 



##  <font style="color:rgb(34,169,34)">  Importing Numpy </font> 

**`np` is a very popular alias given to numpy.**

In [1]:
import numpy as np

Let's run through an example showing how powerful NumPy is. <br>
Suppose we have two lists `a` and `b`, each consisting of the first 100,000 non-negative integers, and we want to create a new list `c` whose i-th element is `a[i] * b[i]`.


###   <font style="color:rgb(34,169,34)"> Approach Without NumPy:  </font> 

**Using Python lists**

**Python List Comprehension**

In [2]:
%%time
a = [i for i in range(100000)]
b = [i for i in range(100000)]



Wall time: 117 ms


In [3]:
c = a * b

TypeError: can't multiply sequence by non-int of type 'list'
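
Python lists do not define elementwise multiplication, which is why `a * b` fails. A minimal sketch of the pure-Python route, pairing elements with `zip` inside a list comprehension (the smaller size `n` is an assumption made here only to keep the example quick):

```python
# Elementwise product of two Python lists with a list comprehension.
n = 1000  # illustrative size; the text above uses 100,000
a = [i for i in range(n)]
b = [i for i in range(n)]

# zip pairs a[i] with b[i]; each pair is multiplied individually.
c = [x * y for x, y in zip(a, b)]

print(c[:5])
```

Every product here is computed by the interpreter one element at a time, which is the overhead the NumPy version avoids.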

# c = a * b

What we want you to notice is the wall time: the real elapsed time a process needs to complete its task.
<br><br>
<i> **Note:** `%%time` is the IPython cell magic that reports the execution time of the cell. <br> </i>


## <font style="color:rgb(34,169,34)"> Using Numpy  </font> 

In [None]:
%%time
e = np.arange(100000)
f = np.arange(100000)

Wall time: 997 µs


# c = a * b

In [None]:

h = e * f
print(h)

[         0          1          4 ... 1409465417 1409665412 1409865409]


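In the run recorded above, NumPy built its arrays in 997 µs against 117 ms for the list comprehensions, roughly a hundredfold difference, and the elementwise product took a single expression (the code itself is also more intuitive). The exact speedup varies with the machine and the workload.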

Regular Python is much slower due to type checking and other overhead of needing to interpret code and support Python's abstractions.

For example, if we are doing some addition in a loop, constantly type checking in a loop will lead to many more instructions than just performing a regular addition operation. NumPy, using optimized pre-compiled C code, is able to avoid a lot of the overhead introduced.
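
One caveat about the output printed above: the last entries are smaller than 99999², which suggests the arrays held 32-bit integers on that machine and the products wrapped around. A hedged sketch that reproduces the effect with an explicit `dtype` and avoids it with 64-bit integers (the default integer width depends on the platform and the NumPy version, so this is an assumption about the original run):

```python
import numpy as np

# 32-bit integers: 99999 * 99999 does not fit, so the product wraps around silently.
e32 = np.arange(100000, dtype=np.int32)
h32 = e32 * e32

# 64-bit integers have enough range for these products.
e64 = np.arange(100000, dtype=np.int64)
h64 = e64 * e64

print(h32[-1])  # wrapped value
print(h64[-1])  # true value of 99999 ** 2
```

Passing `dtype=np.int64` explicitly is one way to keep results the same across platforms.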



##  <font style="color:rgb(34,169,34)"> What is an Array  </font> 
**A numpy array is a grid of values, all of the same type, and is indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.**

Creating a simple 1D array of 5 integers


![image.png](attachment:image.png)

![image.png](attachment:image.png)

In [None]:
a = np.array([3,3,0,3,3]) #1D array
a

array([3, 3, 0, 3, 3])

In [None]:
a = np.zeros((2, 2))   # Create an array of all zeros of specified shape
print(a) 

        

[[0. 0.]
 [0. 0.]]


In [None]:
a = np.ones((2, 2))    # Create an array of all ones
print(a)

[[1. 1.]
 [1. 1.]]


In [None]:
b = np.full((2, 2), 7)  # Create a constant array
print(b)   

[[7 7]
 [7 7]]


![image.png](attachment:image.png)

![image.png](attachment:image.png)

# Multi-dimensional Array

 ###  <font style="color:rgb(34,169,34)"> Creating a 2D Array  </font> 


In [None]:
b = np.array([   [2,3],
                 [4,5],
                 [6,7]  ]) # 2D array
print(b.ndim)
print(b.shape) # returns rows,columns
b


2
(3, 2)


array([[2, 3],
       [4, 5],
       [6, 7]])

 ###  <font style="color:rgb(34,169,34)"> Creating a 3D Array  </font>


In [None]:
b = np.array([ [  [1],[2],[3] ],[ [4],[5],[6]  ]  ])   # Create a rank 3 array
print (b)
print(b.ndim)
print(b.shape)


[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]]
3
(2, 3, 1)


![image.png](attachment:image.png)

# Slide 32
![Screenshot%20%28108%29.png](attachment:Screenshot%20%28108%29.png)

###  <font style="color:rgb(34,169,34)"> Reshaping an Array  </font> 



In [None]:
nums = np.arange(16)
print(nums)
nums.shape



[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15]


(16,)

**Now let's reshape those 16 elements into a 4x4 array**



In [None]:
nums = nums.reshape((4, 4))
print('Reshaped:\n', nums)
print(nums.shape)



Reshaped:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
(4, 4)


**Using -1**

The -1 in reshape corresponds to an unknown dimension that numpy will figure out based on all other dimensions and the array size. We can only specify one unknown dimension. For example, sometimes we might have an unknown number of data points, and so we can use -1 instead without worrying about the true number.





In [None]:
nums = nums.reshape((4,-1 ))
print('Reshaped with -1:\n', nums)
print(nums.shape)


Reshaped with -1:
 [[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]
 [12 13 14 15]]
(4, 4)
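
A short sketch of how the inferred dimension behaves for other shapes, including the case where the known dimensions do not divide the array size (the shapes chosen here are illustrative):

```python
import numpy as np

nums = np.arange(16)

# NumPy infers the missing dimension from the total size (16).
print(nums.reshape((2, -1)).shape)      # 16 / 2 = 8 columns
print(nums.reshape((-1, 2, 2)).shape)   # 16 / (2 * 2) = 4 blocks

# If the known dimensions do not divide the size, reshape raises ValueError.
try:
    nums.reshape((3, -1))
except ValueError as err:
    print("Cannot reshape:", err)
```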


# Method vs Function
**NumPy supports an object-oriented paradigm, such that ndarray has a number of methods and attributes, with functions similar to ones in the outermost NumPy namespace.** <br>
Background on methods vs functions in Python: https://techvidvan.com/tutorials/python-methods-vs-functions/

<i>For example, we can do both: </i>


In [None]:
nums2 = np.arange(8)

print(nums2.min())

0


In [None]:
np.min(nums2)

0
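
One practical difference between the two forms, sketched below with an illustrative list named `values`: the function accepts anything that can be converted to an array, while the method exists only on ndarray objects.

```python
import numpy as np

values = [4, 1, 7]             # a plain Python list, not an ndarray
arr = np.array(values)

print(np.min(values))          # the function converts the list internally
print(arr.min())               # the method needs an ndarray
print(hasattr(values, "min"))  # lists have no .min method
```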

# Vectorization 

![image.png](attachment:image.png)

## <font style="color:rgb(34,169,34)"> Array Operations/Math  </font> 



In [None]:
x = np.array([[1, 2],
              [3, 4]], dtype=np.float64)
y = np.array([[5, 6],
              [7, 8]], dtype=np.float64)



**Addition**

In [None]:
print(x + y)
print(np.add(x, y))



[[ 6.  8.]
 [10. 12.]]
[[ 6.  8.]
 [10. 12.]]


**Subtraction**



In [None]:
print(x - y)
print(np.subtract(x, y))



[[-4. -4.]
 [-4. -4.]]
[[-4. -4.]
 [-4. -4.]]


**Multiplication**



In [None]:
print(x * y)
print(np.multiply(x, y))



[[ 5. 12.]
 [21. 32.]]
[[ 5. 12.]
 [21. 32.]]


**Division**
To divide two arrays elementwise, use `/` or `np.divide`:




In [None]:
print(x / y)
print(np.divide(x, y))


[[0.2        0.33333333]
 [0.42857143 0.5       ]]
[[0.2        0.33333333]
 [0.42857143 0.5       ]]



<small>Note that `*` is elementwise multiplication, not matrix multiplication. We instead use the `dot` function to compute inner products of vectors, to multiply a vector by a matrix, and to multiply matrices. `dot` is available both as a function in the numpy module and as an instance method of array objects.</small>
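
To make the contrast concrete, here is a small sketch using the same `x` and `y` as above. The `@` operator is not mentioned in the note; it is included as an additional way to do matrix multiplication in current Python and NumPy.

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]], dtype=np.float64)
y = np.array([[5, 6],
              [7, 8]], dtype=np.float64)

elementwise = x * y      # multiplies matching entries
matrix = np.dot(x, y)    # rows of x times columns of y

print(elementwise)
print(matrix)
print(np.array_equal(matrix, x.dot(y)))  # method form gives the same result
print(np.array_equal(matrix, x @ y))     # @ also performs matrix multiplication
```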

The process we used above is vectorization. Vectorization refers to applying operations to arrays instead of just individual elements (i.e. no loops).

**Why vectorize?**

1. Much faster
2. Easier to read and fewer lines of code
3. More closely resembles mathematical notation

<i>Vectorization is one of the main reasons why NumPy is so powerful.</i>

 ## <font style="color:rgb(34,169,34)">  Numpy Functions </font> 
There are many useful functions built into NumPy, and often we're able to apply them along specific axes of the ndarray:



**Note:** `axis=0` runs down the rows, so reducing along it gives one result per `column`; `axis=1` runs across the columns, giving one result per `row`.


In [None]:
x = np.array([[1, 2, 3], 
              [4, 5, 6]])

#print(np.sum(x))          # Compute sum of all elements
#print(np.sum(x, axis=0))  # Compute sum of each column
print(np.sum(x, axis=1))  # Compute sum of each row




[ 6 15]
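
A small sketch that runs all three variants on the same array, so the effect of the `axis` argument can be compared side by side (the variable names are illustrative):

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])

total = np.sum(x)             # all elements
col_sums = np.sum(x, axis=0)  # collapse the rows: one sum per column
row_sums = np.sum(x, axis=1)  # collapse the columns: one sum per row

print(total)
print(col_sums)
print(row_sums)
```

A useful rule of thumb: the axis passed in is the one that disappears from the shape, so a `(2, 3)` array reduced along `axis=0` has shape `(3,)`.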


![image.png](attachment:image.png)

In [None]:
salary = np.array([10000, 20000, 15000, 18000, 30000, 50000, 35000, 10000, 30000])
# < 20k qualifies for allowance
# Mark them "YES"
allowance = np.where(salary < 20000, "YES", "NO")
print(allowance)


['YES' 'NO' 'YES' 'YES' 'NO' 'NO' 'NO' 'YES' 'NO']
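
`np.where` has two common forms. The sketch below, which reuses the salary data, shows the condition-only form that returns positions, and the three-argument form with arrays instead of constant labels (the 1000 adjustment is a made-up figure for illustration):

```python
import numpy as np

salary = np.array([10000, 20000, 15000, 18000, 30000, 50000, 35000, 10000, 30000])

# With only a condition, np.where returns a tuple of index arrays (one per dimension).
idx = np.where(salary < 20000)[0]
print(idx)
print(salary[idx])  # the salaries at those positions

# The second and third arguments may also be arrays; here 1000 is added where the condition holds.
adjusted = np.where(salary < 20000, salary + 1000, salary)
print(adjusted)
```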



![Screenshot%20%28104%29.png](attachment:Screenshot%20%28104%29.png)

In [None]:
salary.mean()

24222.222222222223

![Screenshot%20%28105%29.png](attachment:Screenshot%20%28105%29.png)

In [None]:
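# Perform an operation on the salaries below 20k, e.g. their total
# (allowance holds strings, not numbers, so summing it directly is not meaningful)
salary[allowance == "YES"].sum()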

![Screenshot%20%28106%29.png](attachment:Screenshot%20%28106%29.png)

In [None]:
salary.sort()
salary

array([10000, 10000, 15000, 18000, 20000, 30000, 30000, 35000, 50000])
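
Note that `salary.sort()` sorts in place and returns `None`. A brief sketch of the non-mutating alternatives, on a shortened salary array chosen for illustration:

```python
import numpy as np

salary = np.array([10000, 20000, 15000, 18000, 30000])

sorted_copy = np.sort(salary)   # returns a sorted copy; salary is unchanged
order = np.argsort(salary)      # indices that would sort the array
print(sorted_copy)
print(salary)
print(order)

result = salary.sort()          # in place; the method returns None
print(result)
print(salary)
```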

![Screenshot%20%28107%29.png](attachment:Screenshot%20%28107%29.png)

In [None]:
# ndarrays have no .unique() method; use the np.unique function instead
np.unique(salary)

array([10000, 15000, 18000, 20000, 30000, 35000, 50000])
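
`np.unique` can also report how often each value occurs. A minimal sketch with the `return_counts` option on the same salary data:

```python
import numpy as np

salary = np.array([10000, 20000, 15000, 18000, 30000, 50000, 35000, 10000, 30000])

# values come back sorted; counts[i] is the number of occurrences of values[i]
values, counts = np.unique(salary, return_counts=True)
for v, c in zip(values, counts):
    print(v, c)
```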

##  <font style="color:rgb(34,169,34)"> Flatten vs Ravel   </font> 

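 `flatten()` always returns a copy of the data, while `ravel()` returns a view of the original array whenever possible and copies only when it must. A second difference is that `flatten()` exists only as a method of an ndarray object and hence can only be called on true numpy arrays. In contrast, `ravel()` is also available as the library-level function `np.ravel()` and hence can be called on any object that can successfully be converted to an array. For example, `np.ravel()` will work on a list of ndarrays, while `flatten()` won't.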

In [None]:
print (b)

[[[1]
  [2]
  [3]]

 [[4]
  [5]
  [6]]]


In [None]:
b.flatten() 

array([1, 2, 3, 4, 5, 6])

In [None]:
# flattening the array ...used in computer vision a lot..
# Will cover in session for Computer Vision
b.ravel()

array([1, 2, 3, 4, 5, 6])
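
The two calls print the same values, so the difference only shows up when the result is modified. A small sketch using `np.shares_memory` to check whether the result shares data with `b`:

```python
import numpy as np

b = np.array([[[1], [2], [3]], [[4], [5], [6]]])

flat = b.flatten()   # always a copy
rav = b.ravel()      # a view here, because b is contiguous in memory

print(np.shares_memory(b, flat))
print(np.shares_memory(b, rav))

rav[0] = 100         # writes through to b
flat[1] = -1         # does not touch b
print(b[0, 0, 0], b[0, 1, 0])

# np.ravel as a function also accepts a nested list.
print(np.ravel([[1, 2], [3, 4]]))
```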


## <font style="color:rgb(34,169,34)">  Filtering   </font> 
We can also use boolean indexing/masks. Suppose we want to set all elements greater than MAX to MAX:

In [None]:
MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])
print(nums > MAX)           


nums[nums > MAX] = MAX
print(nums)                 


[False False  True False  True False False]
[ 1  4  5 -1  5  0  5]
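
The assignment above changes `nums` in place. As a hedged alternative when the original values should be kept, the sketch below selects with the mask and caps the values with `np.minimum`, which returns a new array:

```python
import numpy as np

MAX = 5
nums = np.array([1, 4, 10, -1, 15, 0, 5])

mask = nums > MAX
print(nums[mask])               # selecting with a mask returns a new array

capped = np.minimum(nums, MAX)  # elementwise minimum; nums itself is unchanged
print(capped)
print(nums)
```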


##  <font style="color:rgb(34,169,34)"> Stacking   </font> 

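`numpy.stack()`:
**Joins a sequence of arrays along a new axis.** The `hstack` and `vstack` helpers used below instead join arrays along an existing axis (horizontally and vertically).<br>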

In [None]:
a = np.arange(0, 9).reshape(3, 3)
print(a)

[[0 1 2]
 [3 4 5]
 [6 7 8]]


In [None]:
b = np.arange(10,19).reshape(3,3)
print(b)

[[10 11 12]
 [13 14 15]
 [16 17 18]]


In [None]:
c = np.arange(20,29).reshape(3,3)
print(c)

[[20 21 22]
 [23 24 25]
 [26 27 28]]


**Horizontal Stacking**

In [None]:
# Syntax : numpy.hstack(tup)
np.hstack((a, b, c))


array([[ 0,  1,  2, 10, 11, 12, 20, 21, 22],
       [ 3,  4,  5, 13, 14, 15, 23, 24, 25],
       [ 6,  7,  8, 16, 17, 18, 26, 27, 28]])

**Vertical Stacking**

In [None]:
# Syntax : numpy.vstack(tup)
np.vstack((a, b, c))

array([[ 0,  1,  2],
       [ 3,  4,  5],
       [ 6,  7,  8],
       [10, 11, 12],
       [13, 14, 15],
       [16, 17, 18],
       [20, 21, 22],
       [23, 24, 25],
       [26, 27, 28]])
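
To see how `np.stack` differs from `hstack` and `vstack`, the sketch below compares the resulting shapes for the same three 3x3 arrays:

```python
import numpy as np

a = np.arange(0, 9).reshape(3, 3)
b = np.arange(10, 19).reshape(3, 3)
c = np.arange(20, 29).reshape(3, 3)

stacked = np.stack((a, b, c))               # new leading axis
print(stacked.shape)
print(np.stack((a, b, c), axis=-1).shape)   # new trailing axis
print(np.hstack((a, b, c)).shape)           # joined along existing axis 1
print(np.vstack((a, b, c)).shape)           # joined along existing axis 0

print(np.array_equal(stacked[1], b))        # each input is one slice of the new axis
```

`np.stack` keeps the inputs separate along the new axis, whereas `hstack` and `vstack` merge them into one larger 2D array.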


### <font style="color:rgb(34,169,34)"> Converting Python Lists to Numpy arrays   </font> 

In [None]:
pythonlist = [2,3,4,4]
print(type(pythonlist))
numpylist = np.array(pythonlist)
print(type(numpylist))

<class 'list'>
<class 'numpy.ndarray'>
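
Because all elements of an array share one type, the conversion may change the type of some values. A short sketch with an illustrative mixed list, plus the reverse conversion with `tolist()`:

```python
import numpy as np

mixed = np.array([1, 2.5, 3])   # ints are upcast so all elements share one type
print(mixed.dtype)

arr = np.array([2, 3, 4, 4])
back = arr.tolist()             # back to a plain Python list
print(back)
print(type(back))
```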



## <font style="color:rgb(34,169,34)"> View and Copies.  </font> 

Unlike a copy, in a **view** of an array, the data is shared between the view and the array. Sometimes, our results are copies of arrays, but other times they can be views. Understanding when each is generated is important to avoid any unforeseen issues.


### <font style="color:rgb(34,169,34)">  Views  </font> 
**Views can be created from a slice of an array**

In [None]:
x 
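
A minimal sketch of a view created by slicing, on a small array defined here for illustration. It also shows that indexing with a list of positions produces a copy rather than a view:

```python
import numpy as np

x = np.arange(6)
v = x[1:4]                      # basic slicing returns a view

print(np.shares_memory(x, v))
v[0] = 99                       # modifies x as well
print(x)

# Indexing with a list of positions (fancy indexing) returns a copy instead.
f = x[[1, 2, 3]]
f[0] = -1
print(x)
```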

### <font style="color:rgb(34,169,34)"> Copy  </font> 
Call `.copy()` on the array (or on a slice of it) to get an independent copy, so that changes to the copy do not modify the original.
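
A matching sketch for copies, again on a small illustrative array:

```python
import numpy as np

x = np.arange(6)
c = x[1:4].copy()               # independent data

c[0] = 99
print(c)
print(x)                        # unchanged
print(np.shares_memory(x, c))
```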


## <font style="color:rgb(34,169,34)">Summary   </font> 

1. Numpy is an incredibly powerful library for computation providing both massive efficiency gains and convenience.
2. Vectorize! Orders of magnitude faster.
3. Keeping track of the shape of your arrays is often useful.
4. Many of the useful math functions and operations are built into Numpy.
5. Watch out for views vs. copies.