## What is NumPy?
NumPy is a Python library used for working with arrays.

It also provides functions for linear algebra, Fourier transforms, and matrix operations.

NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.

NumPy stands for Numerical Python.

>Why Use NumPy?

>>Python lists can serve the purpose of arrays, but they are slow to process for large numeric workloads.

>>NumPy aims to provide an array object that is up to 50x faster than traditional Python lists. The actual speedup depends on the operation and the array size.

>>The array object in NumPy is called ndarray. It comes with many supporting functions that make working with it easy.

>>Arrays are used very frequently in data science, where speed and memory use are very important.

Data science is a branch of computer science that studies how to store, use, and analyze data to derive information from it.

>Why is NumPy Faster Than Lists?
>>NumPy arrays are stored in one contiguous block of memory, unlike lists, so the processor can access and manipulate them very efficiently.

>>This property is called locality of reference in computer science.

>>This is the main reason why NumPy is faster than lists. It is also optimized to work well on modern CPU architectures.
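
The speed difference described above can be checked with a rough timing sketch. The names `py_list` and `np_arr` and the size `n` are illustrative choices, not part of the original notebook, and the measured times will vary from machine to machine.

```python
import timeit

import numpy as np

n = 100_000
py_list = list(range(n))
np_arr = np.arange(n)

# Double every element: a Python-level loop versus one vectorized operation.
list_result = [x * 2 for x in py_list]
array_result = np_arr * 2

# Both approaches produce the same values.
print(list_result == array_result.tolist())

# Time each approach; absolute numbers depend on the machine.
list_time = timeit.timeit(lambda: [x * 2 for x in py_list], number=20)
array_time = timeit.timeit(lambda: np_arr * 2, number=20)
print(f"list: {list_time:.4f}s  ndarray: {array_time:.4f}s")

# The array's data sits in a single contiguous buffer.
print(np_arr.flags["C_CONTIGUOUS"])
```

The vectorized version usually wins by a wide margin, but the exact ratio should be measured rather than assumed.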

>Which Language is NumPy written in?
>>NumPy is a Python library and is written partially in Python, but most of the parts that require fast computation are written in C or C++.



In [1]:
import numpy as np

In [5]:
l = [1,2,3,4]

In [6]:
ar = np.array(l)

In [7]:
ar

array([1, 2, 3, 4])

In [8]:
type(ar)

numpy.ndarray

In [9]:
np.array([[1,2],[3,4]]) #2-D Array

array([[1, 2],
       [3, 4]])

In [10]:
np.asarray(l)

array([1, 2, 3, 4])

In [12]:
a = [2,3,4]

In [13]:
np.asarray(a)

array([2, 3, 4])

In [14]:
np.array(a)

array([2, 3, 4])
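
The two calls above print the same result for a list, so the difference only shows up when the input is already an ndarray. A minimal sketch, using the illustrative names `src`, `copied`, and `aliased`:

```python
import numpy as np

src = np.array([2, 3, 4])

copied = np.array(src)     # np.array copies its input by default
aliased = np.asarray(src)  # np.asarray returns the input itself when no conversion is needed

src[0] = 99
print(copied[0])       # 2: the copy is independent of src
print(aliased[0])      # 99: aliased is the same object as src
print(aliased is src)  # True
```

In short, np.asarray avoids a copy when it can, which matters for large arrays.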

In [15]:
b = np.matrix(l)

In [16]:
b

matrix([[1, 2, 3, 4]])

In [18]:
#np.matrix is always 2-D and is a subclass of ndarray
#ndarray is the base class
np.asanyarray(b) #Returned unchanged because matrix is already an ndarray subclass

matrix([[1, 2, 3, 4]])
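
To see what np.asanyarray preserves, it helps to compare it with np.asarray on the same matrix. This is a small sketch; the warning filter is only there because np.matrix may emit a PendingDeprecationWarning in some NumPy versions.

```python
import warnings

import numpy as np

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # np.matrix may warn that it is discouraged
    m = np.matrix([1, 2, 3, 4])

as_base = np.asarray(m)    # converted to a plain ndarray
as_any = np.asanyarray(m)  # subclass passed through unchanged

print(type(as_base).__name__)  # ndarray
print(type(as_any).__name__)   # matrix
print(as_any is m)             # True
```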

### Shallow copy

In [19]:
a = np.array(l)

In [20]:
a

array([1, 2, 3, 4])

In [28]:
c = a #Plain assignment: c is a second name for the same array object, so no data is copied

In [22]:
c

array([1, 2, 3, 4])

In [23]:
a

array([1, 2, 3, 4])

In [25]:
c[0] = 100

In [26]:
c

array([100,   2,   3,   4])

In [27]:
a

array([100,   2,   3,   4])

### Difference between Shallow copy and Deep copy
Shallow Copy
1. A shallow copy creates a new object that refers to the same underlying data, so only references are copied, not the values.
2. Shallow copy is faster than Deep copy.
3. Changes made through the copied object are also reflected in the original object.
4. It stores references to the original object's data.

Note that the plain assignment c = a shown above copies nothing at all: both names refer to the same array object. In NumPy, a view (for example a.view() or a slice) is the closest equivalent of a shallow copy.

Deep Copy
1. A deep copy creates a new object and also copies the underlying data, so the copy is independent of the original.
2. Deep copy is slower than Shallow copy.
3. Changes made in the copied object are not reflected in the original object.
4. It stores its own copies of the object's values.


### Deep Copy

In [29]:
d = np.copy(a)

In [30]:
d

array([100,   2,   3,   4])

In [31]:
a

array([100,   2,   3,   4])

In [32]:
a[1] = 400

In [33]:
d

array([100,   2,   3,   4])

In [34]:
a

array([100, 400,   3,   4])
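
The three cases (plain assignment, a view, and a copy) can be told apart with `is` and `np.shares_memory`. A minimal sketch with the illustrative names `alias`, `view`, and `deep`:

```python
import numpy as np

a = np.array([1, 2, 3, 4])

alias = a        # same object, nothing copied
view = a.view()  # new array object that shares a's data buffer
deep = a.copy()  # new array object with its own data

a[0] = 100

print(alias is a)                  # True
print(np.shares_memory(view, a))   # True
print(np.shares_memory(deep, a))   # False
print(view[0], deep[0])            # 100 1
```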

In [None]:
#3x3 matrix

In [36]:
np.fromfunction(lambda i,j : i == j,(3,3))

array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]])

In [37]:
np.fromfunction(lambda i,j : i*j,(3,3))

array([[0., 0., 0.],
       [0., 1., 2.],
       [0., 2., 4.]])
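
np.fromfunction does not call the function once per element. It calls it a single time with whole index arrays, which are floats unless a dtype is given. The helper name `describe` below is hypothetical and only there to show what the function receives.

```python
import numpy as np

def describe(i, j):
    # i and j arrive as complete index arrays, not as single numbers.
    print(type(i).__name__, i.shape)
    return i + j

grid = np.fromfunction(describe, (2, 3), dtype=int)
print(grid)
```

Because the function must work on arrays, it should use vectorized operations rather than Python `if` statements on `i` or `j`.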

In [38]:
iterable = (i*i for i in range(5))

In [39]:
np.fromiter(iterable,float)

array([ 0.,  1.,  4.,  9., 16.])

In [40]:
np.fromstring('234 234', sep = ' ') #Split the string on spaces and parse each piece as a float

array([234., 234.])

In [41]:
np.fromstring('234 , 234', sep = ' , ')

array([234., 234.])

### NumPy - Data Types

In [42]:
l = [2,3,4,5,6]

In [45]:
ar = np.array(l)

In [46]:
ar

array([2, 3, 4, 5, 6])

In [49]:
ar.ndim

1

In [50]:
ar2 = np.array([[1,2,3,4],[2,3,4,5]])

In [51]:
ar2

array([[1, 2, 3, 4],
       [2, 3, 4, 5]])

In [53]:
ar2.ndim #Number of dimensions (axes) of the array

2

In [55]:
ar.size

5

In [56]:
ar2.size

8

In [57]:
ar.shape

(5,)

In [58]:
ar2.shape

(2, 4)

In [61]:
ar.dtype

dtype('int32')

In [62]:
ar2.dtype

dtype('int32')

In [64]:
ar22 = np.array([(1.4,45,45),(23,45,66)])

In [66]:
ar22

array([[ 1.4, 45. , 45. ],
       [23. , 45. , 66. ]])

In [67]:
ar22.dtype

dtype('float64')
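
The dtype is inferred from the input values, and a single float is enough to make the whole array floating point. It can also be set explicitly or converted afterwards. A short sketch with illustrative variable names:

```python
import numpy as np

ints = np.array([1, 2, 3])
mixed = np.array([1, 2.5, 3])                   # one float upcasts the whole array
forced = np.array([1, 2, 3], dtype=np.float32)  # explicit dtype at creation
converted = ints.astype(np.float64)             # astype returns a converted copy

print(ints.dtype)  # platform dependent, for example int32 or int64
print(mixed.dtype, forced.dtype, converted.dtype)
```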

In [70]:
list(range(5))

[0, 1, 2, 3, 4]

In [71]:
list(range(0.1,5))

TypeError: 'float' object cannot be interpreted as an integer

In [72]:
np.arange(2.3,5.6)

array([2.3, 3.3, 4.3, 5.3])

In [73]:
np.arange(2.3,5.6,.3)

array([2.3, 2.6, 2.9, 3.2, 3.5, 3.8, 4.1, 4.4, 4.7, 5. , 5.3])

In [75]:
list(np.arange(2.3,5.6,.3))#Array to list conversion

[2.3,
 2.5999999999999996,
 2.8999999999999995,
 3.1999999999999993,
 3.499999999999999,
 3.799999999999999,
 4.099999999999999,
 4.399999999999999,
 4.699999999999998,
 4.999999999999998,
 5.299999999999998]
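
The list above shows that a fractional step accumulates floating-point rounding error. One common way to handle this, sketched here under the assumption that the number of points is known in advance, is to use np.linspace and to compare floats with a tolerance.

```python
import numpy as np

stepped = np.arange(2.3, 5.6, 0.3)  # values carry small rounding errors
spaced = np.linspace(2.3, 5.3, 11)  # endpoints and count stated explicitly

# Exact equality is fragile with floats; compare with a tolerance instead.
print(stepped[1] == 2.6)
print(np.isclose(stepped[1], 2.6))
print(np.allclose(stepped, spaced))
```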

In [77]:
np.linspace(1,5,10) #10 evenly spaced numbers from 1 to 5, both endpoints included

array([1.        , 1.44444444, 1.88888889, 2.33333333, 2.77777778,
       3.22222222, 3.66666667, 4.11111111, 4.55555556, 5.        ])

In [78]:
np.zeros(5) #An array containing 5 zeros

array([0., 0., 0., 0., 0.])

In [80]:
np.zeros((3,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [82]:
np.zeros((3,4,2))

array([[[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]],

       [[0., 0.],
        [0., 0.],
        [0., 0.],
        [0., 0.]]])

In [86]:
np.zeros((3,4,2,3))

array([[[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]],


       [[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]],


       [[[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]],

        [[0., 0., 0.],
         [0., 0., 0.]]]])

In [87]:
ar4 = np.zeros((3,4,2,3))

In [88]:
ar4.ndim

4

In [89]:
np.ones(4)

array([1., 1., 1., 1.])

In [90]:
np.ones((2,3))

array([[1., 1., 1.],
       [1., 1., 1.]])

In [93]:
on = np.ones((2,3,2))

In [94]:
on + 5

array([[[6., 6.],
        [6., 6.],
        [6., 6.]],

       [[6., 6.],
        [6., 6.],
        [6., 6.]]])

In [95]:
on*4

array([[[4., 4.],
        [4., 4.],
        [4., 4.]],

       [[4., 4.],
        [4., 4.],
        [4., 4.]]])

In [97]:
np.empty((3,5)) #Uninitialized array: the contents are arbitrary and are not guaranteed to be zeros

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])
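
Since np.empty only allocates memory, every element should be assigned before it is read. A minimal sketch of that pattern, next to np.full, which allocates and initializes in one step:

```python
import numpy as np

buf = np.empty((3, 5))  # contents are whatever happened to be in memory
buf[:] = 7.0            # assign every element before reading any of them

filled = np.full((3, 5), 7.0)  # allocate and initialize together
print(np.array_equal(buf, filled))
```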

In [98]:
np.eye(4) #Identity Matrix

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.]])

In [101]:
np.linspace(2,4,20) #20 evenly spaced points from 2 to 4, which divides the interval into 19 equal parts


array([2.        , 2.10526316, 2.21052632, 2.31578947, 2.42105263,
       2.52631579, 2.63157895, 2.73684211, 2.84210526, 2.94736842,
       3.05263158, 3.15789474, 3.26315789, 3.36842105, 3.47368421,
       3.57894737, 3.68421053, 3.78947368, 3.89473684, 4.        ])
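
The spacing used by np.linspace can be returned with `retstep=True`, and `endpoint=False` leaves out the stop value. A short sketch:

```python
import numpy as np

points, step = np.linspace(2, 4, 20, retstep=True)
print(len(points), step)  # 20 points separated by (4 - 2) / 19

half_open = np.linspace(2, 4, 20, endpoint=False)
print(half_open[-1])      # the last point stops one step short of 4
```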

In [104]:
np.logspace(2,5,10) 

array([   100.        ,    215.443469  ,    464.15888336,   1000.        ,
         2154.43469003,   4641.58883361,  10000.        ,  21544.34690032,
        46415.88833613, 100000.        ])

In [105]:
#Change the base: the points are now 2**2 ... 2**5 instead of 10**2 ... 10**5
np.logspace(2,5,10, base = 2) 

array([ 4.        ,  5.0396842 ,  6.34960421,  8.        , 10.0793684 ,
       12.69920842, 16.        , 20.1587368 , 25.39841683, 32.        ])
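
np.logspace is equivalent to raising the base to evenly spaced exponents, which the following sketch checks for both bases used above:

```python
import numpy as np

exponents = np.linspace(2, 5, 10)

same_base_10 = np.allclose(np.logspace(2, 5, 10), 10.0 ** exponents)
same_base_2 = np.allclose(np.logspace(2, 5, 10, base=2), 2.0 ** exponents)
print(same_base_10, same_base_2)
```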

In [110]:
#Generating random numbers from the standard normal distribution
np.random.randn(3,4)

array([[ 0.11485479, -0.12540308,  0.46115643, -0.87317131],
       [-0.57071413, -1.33246726,  0.2634346 , -1.06130812],
       [-0.16557922, -0.32591732, -2.09056644,  0.88714307]])

In [112]:
arr = np.random.randn(3,4)

In [113]:
import pandas as pd

In [114]:
pd.DataFrame(arr)

          0         1         2         3
0  0.679196  1.411427 -0.419409  1.145989
1 -0.173886 -1.627217  0.470601 -1.232263
2  0.545802  1.138159  -1.73205  0.542135


In [115]:
np.random.rand(3,4)

array([[0.35827807, 0.37912373, 0.41173652, 0.17736321],
       [0.56212446, 0.76384614, 0.17659177, 0.3776079 ],
       [0.22096752, 0.05290928, 0.77841714, 0.05052155]])

### Difference between numpy.random.rand and numpy.random.randn in Python
numpy.random.randn draws samples from the standard normal distribution (mean 0, standard deviation 1), while numpy.random.rand draws samples from a uniform distribution over the half-open interval [0, 1).
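
The difference can be checked numerically on a large sample. This sketch uses NumPy's newer Generator interface (`np.random.default_rng`) instead of the legacy functions named above. That is a substitution made here for reproducibility, and the seed value is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the run is repeatable

uniform = rng.random(100_000)          # uniform on [0, 1), like np.random.rand
normal = rng.standard_normal(100_000)  # standard normal, like np.random.randn

print(uniform.min() >= 0.0, uniform.max() < 1.0)
print(round(uniform.mean(), 2), round(normal.mean(), 2), round(normal.std(), 2))
```

The uniform sample stays inside [0, 1) with a mean near 0.5, while the normal sample is centered near 0 with a standard deviation near 1.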

In [116]:
np.random.randint(1,110,(3,4)) #Random integers from 1 to 109; the upper bound 110 is excluded

array([[ 84,  56,  90,  94],
       [ 86,  98,  92,  98],
       [103,  29, 102,  62]])

In [117]:
np.random.randint(1,110,(300,400))

array([[ 16,  95,  87, ...,  82,  15,  74],
       [ 72,  54,  29, ...,   2,  60,  69],
       [  4,  60, 107, ...,  59,  57,  88],
       ...,
       [ 66,   5,   6, ...,  34,  99,  47],
       [105, 107,  23, ...,  65,  14,  30],
       [107,  73,  44, ...,  76,  26,  21]])

In [120]:
data = np.random.randint(1,110,(300,400))

In [121]:
pd.DataFrame(data)

       0    1    2    3    4    5    6    7    8    9  ...  390  391  392  393  394  395  396  397  398  399
0     47   20   50   68   90    6    4   26    2   98  ...   98   65   32   43  109   63  107   72   61   31
1     14   94   27   95   24   40   18   18   60   60  ...  108   17    3   90   82   98  104   87   90   92
2     28   83   39   43   48    4   99   92    3   32  ...   18   20   94   92   62   12   98   71   91   86
3     46   68   90   68   22  104   73   80  104  106  ...   53   51   99   73   53  103   30   16    9    5
4     93   62   66   88   64   42   94    6    4   12  ...   41   58   50    5   11   16  107   33    1   76
...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
295   12   41    5   95   83   40   62   83   86   17  ...   58    6   26   50  105   87   93    6   53   94
296   91    9   49   24   92   89   69   10   56   69  ...   92   61   88   28   25   38   23   47   92   12
297   87   13   17   95   12   77   74   67   10    7  ...   75   13  108   46   83   23   16   99   59   53
298   55   85   69   52   88   95   90   70   38   24  ...   32    2   70    9   44   33   42   23    1   82


In [122]:
pd.DataFrame(data).to_csv('test.csv')
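
By default to_csv also writes the row index as an unnamed first column, which matters when the file is read back. The sketch below uses an in-memory `io.StringIO` buffer as a stand-in for 'test.csv' and a small illustrative array, so no file is written.

```python
import io

import numpy as np
import pandas as pd

data = np.arange(12).reshape(3, 4)
frame = pd.DataFrame(data)

buffer = io.StringIO()  # in-memory stand-in for 'test.csv'
frame.to_csv(buffer)    # the row index is written as an unnamed first column
buffer.seek(0)

naive = pd.read_csv(buffer)                  # index comes back as an extra column
buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)  # treat the first column as the index

print(list(naive.columns))
print(restored.shape)
```

Passing `index=False` to to_csv is another option when the index carries no information.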

In [123]:
arr = np.random.rand(3,4)

In [124]:
arr

array([[0.34431471, 0.29454716, 0.19541249, 0.97714353],
       [0.74111346, 0.05941793, 0.083232  , 0.7097093 ],
       [0.72783773, 0.55061814, 0.47434831, 0.61262436]])

In [125]:
arr.reshape(6,2)

array([[0.34431471, 0.29454716],
       [0.19541249, 0.97714353],
       [0.74111346, 0.05941793],
       [0.083232  , 0.7097093 ],
       [0.72783773, 0.55061814],
       [0.47434831, 0.61262436]])

In [126]:
arr.reshape(6,-1)

array([[0.34431471, 0.29454716],
       [0.19541249, 0.97714353],
       [0.74111346, 0.05941793],
       [0.083232  , 0.7097093 ],
       [0.72783773, 0.55061814],
       [0.47434831, 0.61262436]])

In [127]:
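arr.reshape(6,-152155) #Only -1 is documented as the "infer this dimension" placeholder; do not rely on other negative values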

array([[0.34431471, 0.29454716],
       [0.19541249, 0.97714353],
       [0.74111346, 0.05941793],
       [0.083232  , 0.7097093 ],
       [0.72783773, 0.55061814],
       [0.47434831, 0.61262436]])

In [129]:
arr1 = arr.reshape(6,-1) #-1 lets NumPy infer the second dimension (2 here)

In [131]:
arr1

array([[0.34431471, 0.29454716],
       [0.19541249, 0.97714353],
       [0.74111346, 0.05941793],
       [0.083232  , 0.7097093 ],
       [0.72783773, 0.55061814],
       [0.47434831, 0.61262436]])
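
The -1 placeholder works because the total number of elements is fixed, so one dimension can always be inferred from the others. A small sketch on an illustrative 12-element array, including the error raised when the sizes do not divide evenly:

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)

print(arr.reshape(6, -1).shape)  # (6, 2): -1 is inferred as 12 / 6
print(arr.reshape(-1).shape)     # (12,): flatten to one dimension

try:
    arr.reshape(5, -1)           # 12 elements cannot be split into 5 rows
except ValueError as exc:
    print("ValueError:", exc)
```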

In [132]:
arr1[1][1]

0.9771435250686893

In [135]:
arr1[2:5,1]

array([0.05941793, 0.7097093 , 0.55061814])
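
Two details of the indexing above are worth spelling out: `arr1[1][1]` and `arr1[1, 1]` select the same element, and a basic slice is a view onto the original data rather than a copy. A sketch with the illustrative names `grid` and `column_slice`:

```python
import numpy as np

grid = np.arange(12).reshape(6, 2)

print(grid[1][1] == grid[1, 1])  # True: both select row 1, column 1

column_slice = grid[2:5, 1]      # rows 2, 3 and 4 of column 1
column_slice[0] = -1             # slices are views, so grid changes too
print(grid[2, 1])                # -1
```

The `grid[1, 1]` form is generally preferred because it indexes once instead of building an intermediate row.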

In [139]:
np.random.randint(1,100,(5,5))

array([[20,  4, 77, 49, 50],
       [24, 41, 11, 89, 17],
       [68, 58, 76, 66, 28],
       [56, 60,  1, 47, 15],
       [12, 88, 50,  2, 45]])

In [140]:
arr = np.random.randint(1,100,(5,5))

In [141]:
arr

array([[67, 98, 34, 24, 68],
       [57, 88, 57, 32, 90],
       [50, 27, 25, 54, 70],
       [98, 63, 87, 11, 65],
       [96, 43, 53, 46, 97]])

In [142]:
#Boolean mask: True where the element is greater than 20
arr>20

array([[ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True,  True,  True],
       [ True,  True,  True, False,  True],
       [ True,  True,  True,  True,  True]])

In [143]:
arr[arr>50]

array([67, 98, 68, 57, 88, 57, 90, 54, 70, 98, 63, 87, 65, 96, 53, 97])
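
Boolean masks can also be combined, used with np.where, or used as the target of an assignment. The following sketch uses a small illustrative array rather than the random one above:

```python
import numpy as np

arr = np.array([[67, 98, 34], [57, 11, 90]])

mask = (arr > 20) & (arr < 70)     # combine conditions with & and |, not and/or
print(arr[mask])                   # [67 34 57]
print(np.where(arr > 50, arr, 0))  # keeps the shape, replaces non-matches with 0

capped = arr.copy()
capped[capped > 90] = 90           # assign through a mask
print(capped)
```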

In [145]:
arr

array([[67, 98, 34, 24, 68],
       [57, 88, 57, 32, 90],
       [50, 27, 25, 54, 70],
       [98, 63, 87, 11, 65],
       [96, 43, 53, 46, 97]])

In [151]:
arr[2:4,1:3]

array([[27, 25],
       [63, 87]])

In [152]:
arr[0][0] = 5000

In [153]:
arr

array([[5000,   98,   34,   24,   68],
       [  57,   88,   57,   32,   90],
       [  50,   27,   25,   54,   70],
       [  98,   63,   87,   11,   65],
       [  96,   43,   53,   46,   97]])

In [156]:
arr1 = np.random.randint(1,3,(3,3)) 
arr2 = np.random.randint(1,3,(3,3))

In [157]:
arr1

array([[2, 1, 2],
       [2, 1, 1],
       [2, 2, 2]])

In [158]:
arr2

array([[2, 1, 1],
       [1, 1, 2],
       [1, 2, 2]])

In [159]:
arr1+arr2

array([[4, 2, 3],
       [3, 2, 3],
       [3, 4, 4]])

In [161]:
arr1-arr2

array([[ 0,  0,  1],
       [ 1,  0, -1],
       [ 1,  0,  0]])

In [162]:
arr1/arr2

array([[1. , 1. , 2. ],
       [2. , 1. , 0.5],
       [2. , 1. , 1. ]])

In [163]:
arr1*arr2 #Element-wise product, not matrix multiplication

array([[4, 1, 2],
       [2, 1, 2],
       [2, 4, 4]])

In [164]:
arr1

array([[2, 1, 2],
       [2, 1, 1],
       [2, 2, 2]])

In [165]:
arr2

array([[2, 1, 1],
       [1, 1, 2],
       [1, 2, 2]])

In [166]:
#For matrix multiplication
arr1@arr2

array([[ 7,  7,  8],
       [ 6,  5,  6],
       [ 8,  8, 10]])
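
With square matrices both `*` and `@` succeed, which can hide the difference between them. Non-square shapes make it visible: `@` needs the inner dimensions to match, while `*` needs shapes that broadcast. The matrices `A` and `B` below are illustrative.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])    # shape (2, 3)
B = np.array([[1, 0], [0, 1], [1, 1]])  # shape (3, 2)

product = A @ B                         # (2, 3) @ (3, 2) gives (2, 2)
print(product)
print(np.array_equal(product, np.matmul(A, B)))

try:
    A * B                               # shapes (2, 3) and (3, 2) do not broadcast
except ValueError as exc:
    print("ValueError:", exc)
```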

In [167]:
arr1/0 #Dividing by zero gives inf and emits a RuntimeWarning instead of raising an exception

array([[inf, inf, inf],
       [inf, inf, inf],
       [inf, inf, inf]])
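
How NumPy reacts to division by zero is configurable. This sketch shows three options: silencing the warning with `np.errstate`, dividing only where the denominator is non-zero, and turning the condition into an exception. The arrays `num` and `den` are illustrative.

```python
import numpy as np

num = np.array([2.0, 1.0, 0.0])
den = np.array([0.0, 2.0, 0.0])

with np.errstate(divide="ignore", invalid="ignore"):
    raw = num / den  # inf for 2/0, nan for 0/0, no warning inside this block

# Divide only where the denominator is non-zero; other slots keep the value from `out`.
safe = np.divide(num, den, out=np.zeros_like(num), where=den != 0)
print(raw, safe)

with np.errstate(divide="raise"):
    try:
        num[:1] / den[:1]
    except FloatingPointError as exc:
        print("FloatingPointError:", exc)
```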

In [168]:
arr1 + 100

array([[102, 101, 102],
       [102, 101, 101],
       [102, 102, 102]])

In [169]:
arr1 ** 2

array([[4, 1, 4],
       [4, 1, 1],
       [4, 4, 4]], dtype=int32)

### NumPy - Broadcasting


In [172]:
np.zeros((4,4))

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [173]:
arr = np.zeros((4,4))

In [174]:
arr

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [176]:
row = np.array([1,2,3,4])

In [177]:
row

array([1, 2, 3, 4])

In [179]:
arr + row #Row-wise broadcasting: row is added to every row of arr

array([[1., 2., 3., 4.],
       [1., 2., 3., 4.],
       [1., 2., 3., 4.],
       [1., 2., 3., 4.]])

In [180]:
row.ndim

1

In [181]:
row.shape

(4,)

In [182]:
row = np.array([1,2,3,4])

In [183]:
row.T #Transposing a 1-D array has no effect, so this is still shape (4,)

array([1, 2, 3, 4])

In [184]:
col = np.array([[1,2,3,4]]) #2-D array of shape (1, 4), so its transpose is a (4, 1) column

In [185]:
col.T

array([[1],
       [2],
       [3],
       [4]])

In [187]:
arr + col.T

array([[1., 1., 1., 1.],
       [2., 2., 2., 2.],
       [3., 3., 3., 3.],
       [4., 4., 4., 4.]])
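
Broadcasting compares shapes from the trailing dimension backwards: two dimensions are compatible when they are equal or when one of them is 1. The sketch below checks this for the row and column cases above. It assumes NumPy 1.20 or later for `np.broadcast_shapes`, and uses `np.newaxis` as an alternative way to build the column.

```python
import numpy as np

grid = np.zeros((4, 4))
row = np.array([1, 2, 3, 4])  # shape (4,)
col = row[:, np.newaxis]      # shape (4, 1)

print(np.broadcast_shapes(grid.shape, row.shape))  # (4, 4)
print((grid + col)[:, 0])                          # each row gets its own value
print((row + col).shape)                           # (4,) with (4, 1) gives (4, 4)

try:
    grid + np.array([1, 2, 3])  # trailing dimensions 4 and 3 do not match
except ValueError as exc:
    print("ValueError:", exc)
```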

In [188]:
arr1 = np.random.randint(1,4,(3,4))

In [189]:
arr1

array([[3, 3, 2, 2],
       [1, 1, 2, 3],
       [1, 1, 1, 2]])

In [190]:
np.sqrt(arr1)

array([[1.73205081, 1.73205081, 1.41421356, 1.41421356],
       [1.        , 1.        , 1.41421356, 1.73205081],
       [1.        , 1.        , 1.        , 1.41421356]])

In [191]:
np.exp(arr1)

array([[20.08553692, 20.08553692,  7.3890561 ,  7.3890561 ],
       [ 2.71828183,  2.71828183,  7.3890561 , 20.08553692],
       [ 2.71828183,  2.71828183,  2.71828183,  7.3890561 ]])

In [192]:
np.log10(arr1)

array([[0.47712125, 0.47712125, 0.30103   , 0.30103   ],
       [0.        , 0.        , 0.30103   , 0.47712125],
       [0.        , 0.        , 0.        , 0.30103   ]])