### What is NumPy?
1. NumPy is the fundamental package for numeric computing in Python.
2. It provides powerful ways to create, store, and manipulate data, which makes it easy to integrate with a wide variety of databases.
3. It is the foundation that Pandas is built on.
4. Pandas is another very important package, used for data manipulation and data analysis.
5. With NumPy one can create arrays of specific data types, manipulate arrays, select elements from arrays, and load datasets into arrays.
 

In [1]:
# To start, import the NumPy library with the 'import' keyword (conventionally abbreviated as 'np')
import numpy as np
import math
import random

## Array Creation

In [2]:
# In Python, arrays are often represented as a list or a list of lists, and a NumPy array can be created from either.
# To create a NumPy array, pass the list as an argument to np.array.

a = np.array([1,2,3,4])
print(a)

[1 2 3 4]


In [3]:
# We can print the number of dimensions of the array using the ndim attribute
print(a.ndim)
# Here, 1 means the array is one-dimensional

1


In [4]:
# We can also create a multidimensional array by passing a list of lists as the argument to np.array,
# for instance, a matrix
b = np.array([[1,2,3,4],[5,6,7,8]])
print(b)
print(b.ndim)


[[1 2 3 4]
 [5 6 7 8]]
2


In [5]:
# We can also get the length of each dimension with the 'shape' attribute, which returns a tuple
b.shape
# Here, the tuple is (rows, columns)

(2, 4)

In [6]:
# We can also check the data type of the elements in the array with the 'dtype' attribute
b.dtype
# Here, the elements are integers, and 32 is the number of bits used to store each element,
# which determines the range of values it can hold. The default integer type depends on the platform, so other systems may show int64.

dtype('int32')
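
Because the default integer type varies between platforms, it can help to request a type explicitly. The following is a minimal sketch of that idea; the variable names `x` and `y` are only illustrative and are not part of the notebook above.

```python
import numpy as np

# Request a specific element type instead of relying on the platform default.
x = np.array([1, 2, 3, 4], dtype=np.int64)
print(x.dtype)     # int64
print(x.itemsize)  # 8 (bytes per element)

# astype returns a converted copy; the original array is left unchanged.
y = x.astype(np.float32)
print(y.dtype)     # float32
print(y.itemsize)  # 4
```

Choosing a smaller type saves memory but narrows the range of values each element can hold.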

In [7]:
# Besides integers and floats, NumPy arrays also accept other data types such as strings.
# When types are mixed as below, NumPy converts every element to a string so that the array stays homogeneous.
c = np.array([1,3,3.4, 5, "Hello"])
print(c)
print(c.dtype.name)

['1' '3' '3.4' '5' 'Hello']
str1024


In [8]:
# Suppose we have an array with both float and integer elements. In that situation
# NumPy automatically converts the integers to floats, since there is no loss of precision.
# NumPy tries to pick the best data type that keeps your data homogeneous,
# which means all elements in the array have the same type.
c1 = np.array([1.0,3,4.5])
print(c1)    

[1.  3.  4.5]
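
The promotion rule described above can be checked directly. This is a small sketch, assuming a fresh array named `mixed` (a name chosen here for illustration); `np.result_type` reports the type NumPy would pick when combining two types.

```python
import numpy as np

# Integers mixed with floats are promoted to float64.
mixed = np.array([1.0, 3, 4.5])
print(mixed.dtype)  # float64

# np.result_type reports the common type for a combination of dtypes.
print(np.result_type(np.int32, np.float64))  # float64

# Going the other way is lossy: converting floats to integers truncates toward zero.
truncated = mixed.astype(np.int64)
print(truncated)  # [1 3 4]
```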


In [9]:
# Sometimes we know the shape of an array but not yet what values it should hold.
# NumPy provides two functions for this situation: zeros and ones.

d = np.zeros((2,3))
print(d)
# Here, zeros creates an array of the given shape where every element is zero
e = np.ones((2,3))
print(e)
# ones works just like zeros, except that every element is one instead of zero.

[[0. 0. 0.]
 [0. 0. 0.]]
[[1. 1. 1.]
 [1. 1. 1.]]


In [10]:
# We can also generate an array with random numbers

f = np.random.rand(2,3)
print(f)

[[0.61023071 0.27923901 0.05186356]
 [0.0458245  0.95298826 0.7477388 ]]


In [11]:
# We can also create a sequence of numbers in an array by calling the 'arange()' function.
# The first argument is the starting bound (inclusive), the second is the ending bound (exclusive), and the third is the step, i.e. the difference between consecutive numbers.


# Let's create an array of every odd number from 1 (inclusive) to 50 (exclusive)

g = np.arange(1,50,2)
g


array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
       35, 37, 39, 41, 43, 45, 47, 49])

In [12]:
# If we want to generate a sequence of evenly spaced floats, we can use the linspace function.
# Here, the third argument is not the difference between two consecutive numbers but
# the total number of items you want in the array.

h = np.linspace(0,2, 10)
h

# 10 numbers from 0 (inclusive) to 2 (inclusive)

array([0.        , 0.22222222, 0.44444444, 0.66666667, 0.88888889,
       1.11111111, 1.33333333, 1.55555556, 1.77777778, 2.        ])
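
The spacing that linspace chooses follows from the endpoints and the item count. As a hedged sketch (the names `values`, `step`, and `half_open` are illustrative), the `retstep` and `endpoint` parameters of `np.linspace` make this visible:

```python
import numpy as np

# retstep=True also returns the spacing between consecutive values.
values, step = np.linspace(0, 2, 10, retstep=True)
print(len(values))  # 10
print(step)         # (2 - 0) / (10 - 1), roughly 0.2222

# endpoint=False excludes the stop value, so the spacing becomes (2 - 0) / 10.
half_open = np.linspace(0, 2, 10, endpoint=False)
print(half_open[-1])  # roughly 1.8
```

With the endpoint included, 10 items leave 9 gaps, which is why the step is 2/9 rather than 0.2.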

## Array Operations


In [13]:
# We can perform various operations on arrays, such as
# 1. Arithmetic operations: addition, subtraction, multiplication, division
# 2. Matrix operations: product, transpose, inverse, and many more.

In [14]:
# Arithmetic operations on arrays apply elementwise

# Let's create a couple of arrays first

a = np.array([10,20,30,40])
b = np.array([5,6,7,8])

# Now, let's look at a minus b

c = a - b
print(c)

# a minus b returns an array where the element at each index is the result of subtracting the corresponding elements of the two arrays.

# Like a minus b, the other operations are as follows

d = a + b # addition
print(d)

e = a / b # division
print(e)

f = a * b # multiplication
print(f)


# With arithmetic operators, we can transform existing data into the form we need.

[ 5 14 23 32]
[15 26 37 48]
[2.         3.33333333 4.28571429 5.        ]
[ 50 120 210 320]


In [15]:
# Boolean Arrays:
# We can apply a comparison operator to an array, and a boolean array is returned with one value for every element of the original array.

# For instance, let's look at this

# Let's create an array of random elements and check which elements are greater than 0.5.

a = np.array([random.random() for i in range(10)])
a > 0.5

# Here, True means that the element at the corresponding index is greater than 0.5, and False means it is not.

array([False, False, False, False, False, False, False,  True,  True,
       False])
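
A boolean array can also be summarized, because True counts as 1 and False as 0. The sketch below uses a fixed array named `scores` (an illustrative stand-in for the random values above) so that the results are reproducible.

```python
import numpy as np

scores = np.array([0.1, 0.7, 0.4, 0.9, 0.5])
mask = scores > 0.5

print(mask.sum())   # 2, the number of elements greater than 0.5
print(mask.mean())  # 0.4, the fraction of elements greater than 0.5
print(mask.any())   # True, at least one element passes
print(mask.all())   # False, not every element passes
```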

In [139]:
# Now, Let's look at matrix operations

# Matrix Product
# If you want the elementwise product, use the "*" operator

a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])

print(a*b)

# If you want the matrix product, use the "@" operator or np.dot

print(a@b)

# But there is one thing to keep in mind when multiplying two matrices:
# it is only possible when the inner dimensions of the two matrices are the same.
# For example,
# suppose a has shape (x,y) and b has shape (g,h).
# The product a @ b is only possible when y == g, and the result has shape (x,h).


[[ 5 12]
 [21 32]]
[[19 22]
 [43 50]]
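
The inner-dimension rule above can be sketched with two non-square arrays. The names `left`, `right`, and `mismatch_detected` are illustrative; the point is that NumPy raises a ValueError when the inner dimensions differ.

```python
import numpy as np

left = np.ones((2, 3))
right = np.ones((3, 4))

# (2, 3) @ (3, 4): the inner dimensions are both 3, so the result is (2, 4).
product = left @ right
print(product.shape)  # (2, 4)

# (3, 4) @ (2, 3): the inner dimensions 4 and 2 differ, so NumPy raises ValueError.
mismatch_detected = False
try:
    right @ left
except ValueError as exc:
    mismatch_detected = True
    print("shape mismatch:", exc)
```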


#### Transposing Arrays and Dot product



In [142]:
print(a)
print(a.T)
print(np.dot(a,b))


[[1 2]
 [3 4]]
[[1 3]
 [2 4]]
[[19 22]
 [43 50]]


In [146]:
# NumPy arrays also provide various aggregate functions, such as sum(), min(), max(), and mean()

c = np.arange(1,20,2)
print(c)
print(c.min())
print(c.max())
print(c.sum())
print(c.mean())
print(c.cumsum()) # cumulative sum
print(c.cumprod()) # cumulative product

[ 1  3  5  7  9 11 13 15 17 19]
1
19
100
10.0
[  1   4   9  16  25  36  49  64  81 100]
[        1         3        15       105       945     10395    135135
   2027025  34459425 654729075]
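
On a multidimensional array the same aggregates accept an `axis` argument. A minimal sketch, assuming a small 2-D array named `m` introduced only for this example:

```python
import numpy as np

m = np.arange(1, 7).reshape(2, 3)  # rows: [1 2 3] and [4 5 6]

total = m.sum()             # sum over every element: 21
col_totals = m.sum(axis=0)  # collapse the rows, one total per column: 5, 7, 9
row_totals = m.sum(axis=1)  # collapse the columns, one total per row: 6, 15
row_means = m.mean(axis=1)  # mean of each row: 2.0, 5.0

print(total, col_totals, row_totals, row_means)
```

The axis you pass is the one that disappears from the result's shape.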


In [24]:
# Sometimes you need to reshape an array to other dimensions.
# NumPy lets us do that, as long as the total number of elements stays the same.

print(c)
c = c.reshape(2,5) # converts the 1-D array into a 2-D array of shape (2,5)
print(c)

[ 1  3  5  7  9 11 13 15 17 19]
[[ 1  3  5  7  9]
 [11 13 15 17 19]]
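
Two details of reshape are worth sketching: one dimension can be given as -1 so that NumPy infers it, and a shape whose element count does not match raises an error. The names `seq`, `inferred`, `flat`, and `reshape_failed` below are illustrative.

```python
import numpy as np

seq = np.arange(1, 20, 2)  # 10 elements

# -1 asks NumPy to infer that dimension from the total size: 10 / 5 = 2.
inferred = seq.reshape(5, -1)
print(inferred.shape)  # (5, 2)

# ravel flattens a reshaped array back to one dimension.
flat = inferred.ravel()
print(flat.shape)  # (10,)

# 10 elements cannot fill a 3 x 4 array, so this raises ValueError.
reshape_failed = False
try:
    seq.reshape(3, 4)
except ValueError:
    reshape_failed = True
print(reshape_failed)  # True
```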


## Indexing, Slicing, and Iterating

In [25]:
# Indexing, slicing, and iterating are very important for manipulating data
# because they allow us to select data based on conditions and to copy or update it.



#### Indexing

In [28]:
# First, let's make an array of arbitrary numbers. A one-dimensional array is indexed much like a list.
# To get an element from a 1-D array, we simply use its zero-based index
a = np.array([1,2,3,4,5])
print(a)
print(a[2])


[1 2 3 4 5]
3


In [29]:
# For a multidimensional array, we index with one integer per dimension.
# Let's create a multidimensional array

b = np.array([[1,2],[3,4],[5,6]])
b

array([[1, 2],
       [3, 4],
       [5, 6]])

In [32]:
# Now, let's look at some examples to understand how to access data in multidimensional arrays

# To select a single element, put the row and column index inside the square brackets, as shown below

b[1,1]

4

In [37]:
# If we want to get multiple elements, let's say 1, 4, and 6,
# we can do it in two ways

# Method 1: index every element individually
print(np.array([b[0,0],b[1,1],b[2,1]]))

# Method 2: use integer array indexing, which pairs up ("zips") the row indices in the first list with the column indices in the second list.

print(b[[0,1,2],[0,1,1]])


[1 4 6]
[1 4 6]
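
To make the "zip" behaviour of method 2 concrete, here is a small sketch comparing it with an explicit loop over index pairs. The names `grid`, `rows`, `cols`, `picked`, and `manual` are illustrative; the example also shows that integer array indexing returns a copy rather than a view.

```python
import numpy as np

grid = np.array([[1, 2], [3, 4], [5, 6]])
rows = [0, 1, 2]
cols = [0, 1, 1]

# Integer array indexing selects grid[0, 0], grid[1, 1], grid[2, 1].
picked = grid[rows, cols]

# The same selection written as an explicit loop over zipped index pairs.
manual = np.array([grid[r, c] for r, c in zip(rows, cols)])
print(picked, manual)  # [1 4 6] [1 4 6]

# The result is a copy: changing it does not affect the original array.
picked[0] = 99
print(grid[0, 0])  # 1
```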


In [65]:
# Now suppose you want all the elements of the first column

print(b[: , 0])

# Similarly, all the elements of the first row

print(b[0,:]) # equivalent to print(b[0])
print(b[0])

[1 3 5]
[1 2]
[1 2]


##### Boolean Indexing

In [122]:
# Let's understand it with an example

# Let's create an array of strings
names = np.array(['Bob','Aman','Joe','Will', 'Joe','Joe','Will'])
print(names)

names == 'Bob'
# Here, names == 'Bob' returns an array of Boolean values based on the condition,
# i.e. True where the name is 'Bob' and False otherwise

['Bob' 'Aman' 'Joe' 'Will' 'Joe' 'Joe' 'Will']


array([ True, False, False, False, False, False, False])

In [123]:
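# We can also use the boolean array produced by a comparison to index another array and select elements based on the condition.
# The boolean array must have the same length as the axis it indexes (here, 7 names for 7 rows). For example,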

data = np.random.randn(7,4) 
print(data)

data[names == 'Bob'] 
# Since names == 'Bob' is True only at index 0, data[names == 'Bob'] returns only row 0

[[ 0.40326161  0.08554377  0.53325554  1.07494883]
 [ 0.02327004  1.61367596  0.2310816  -0.94021847]
 [-0.73861345  1.1044402  -1.39993304  0.26831319]
 [-2.26839245  1.64334432 -0.9963198  -0.07069444]
 [ 1.52572689 -0.41272219 -0.19150874 -0.26717574]
 [ 0.95626051  0.7553614  -1.91722103  0.01592422]
 [-2.47289949  0.95357298  0.72783453 -0.22310445]]


array([[0.40326161, 0.08554377, 0.53325554, 1.07494883]])

In [124]:
# Let's do some more experiments

data[names == 'Will', 1:3]

array([[ 1.64334432, -0.9963198 ],
       [ 0.95357298,  0.72783453]])

In [125]:

data[names == 'Will', 3]

array([-0.07069444, -0.22310445])

In [126]:
names != 'Bob'

array([False,  True,  True,  True,  True,  True,  True])

In [127]:
data[names != 'Bob'] # or data[~(names == 'Bob')]

array([[ 0.02327004,  1.61367596,  0.2310816 , -0.94021847],
       [-0.73861345,  1.1044402 , -1.39993304,  0.26831319],
       [-2.26839245,  1.64334432, -0.9963198 , -0.07069444],
       [ 1.52572689, -0.41272219, -0.19150874, -0.26717574],
       [ 0.95626051,  0.7553614 , -1.91722103,  0.01592422],
       [-2.47289949,  0.95357298,  0.72783453, -0.22310445]])

In [128]:
condition = (names == 'Bob') | (names == 'Will')
condition

array([ True, False, False,  True, False, False,  True])

In [129]:
data[~(condition)]

array([[ 0.02327004,  1.61367596,  0.2310816 , -0.94021847],
       [-0.73861345,  1.1044402 , -1.39993304,  0.26831319],
       [ 1.52572689, -0.41272219, -0.19150874, -0.26717574],
       [ 0.95626051,  0.7553614 , -1.91722103,  0.01592422]])

In [130]:
data[data < 0]

array([-0.94021847, -0.73861345, -1.39993304, -2.26839245, -0.9963198 ,
       -0.07069444, -0.41272219, -0.19150874, -0.26717574, -1.91722103,
       -2.47289949, -0.22310445])
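
Conditions can be combined with `|` (or), `&` (and), and `~` (not); each comparison needs its own parentheses because these operators bind more tightly than `==` or `>`. The sketch below reuses the `names` array from above with a deterministic stand-in for `data` (named `table` here, an assumption for reproducibility) and also shows `np.isin` as an alternative to chained `|` comparisons.

```python
import numpy as np

names = np.array(['Bob', 'Aman', 'Joe', 'Will', 'Joe', 'Joe', 'Will'])
table = np.arange(28).reshape(7, 4)  # stand-in for the random data above

# Rows whose name is 'Joe' or 'Will'.
either = (names == 'Joe') | (names == 'Will')

# np.isin builds the same mask from a list of accepted values.
same_mask = np.isin(names, ['Joe', 'Will'])

# Rows whose name is 'Joe' and whose first column is greater than 10.
both = (names == 'Joe') & (table[:, 0] > 10)
print(table[both])  # the rows at index 4 and 5
```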

##### Assignment

In [131]:
# We can also assign values to array elements by index
# For example
print(data)
data[0,1] = 1
print(data)

[[ 0.40326161  0.08554377  0.53325554  1.07494883]
 [ 0.02327004  1.61367596  0.2310816  -0.94021847]
 [-0.73861345  1.1044402  -1.39993304  0.26831319]
 [-2.26839245  1.64334432 -0.9963198  -0.07069444]
 [ 1.52572689 -0.41272219 -0.19150874 -0.26717574]
 [ 0.95626051  0.7553614  -1.91722103  0.01592422]
 [-2.47289949  0.95357298  0.72783453 -0.22310445]]
[[ 0.40326161  1.          0.53325554  1.07494883]
 [ 0.02327004  1.61367596  0.2310816  -0.94021847]
 [-0.73861345  1.1044402  -1.39993304  0.26831319]
 [-2.26839245  1.64334432 -0.9963198  -0.07069444]
 [ 1.52572689 -0.41272219 -0.19150874 -0.26717574]
 [ 0.95626051  0.7553614  -1.91722103  0.01592422]
 [-2.47289949  0.95357298  0.72783453 -0.22310445]]


In [132]:
data[names != 'Bob'] = 0
data

array([[0.40326161, 1.        , 0.53325554, 1.07494883],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ],
       [0.        , 0.        , 0.        , 0.        ]])

In [133]:
data[data == 0] = 1

In [135]:
data

array([[0.40326161, 1.        , 0.53325554, 1.07494883],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ]])

In [137]:
data[names == 'Joe'] = 7
data

array([[0.40326161, 1.        , 0.53325554, 1.07494883],
       [1.        , 1.        , 1.        , 1.        ],
       [7.        , 7.        , 7.        , 7.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [7.        , 7.        , 7.        , 7.        ],
       [7.        , 7.        , 7.        , 7.        ],
       [1.        , 1.        , 1.        , 1.        ]])

##### Slicing

In [48]:
# Slicing is a way to create subarrays from the original array

# For a one-dimensional array, slicing works the same way as it does with a list.
# To slice, we use the ':' sign and specify the starting and ending index of the data we want
# For instance, with :3 as the index, we get elements from position 0 (inclusive) to position 3 (exclusive)

# For example

a = np.arange(1,10,1)
print(a)
print(a[:3])
print(a[::2]) # We can also pass a third parameter in the slice that specifies the step.



[1 2 3 4 5 6 7 8 9]
[1 2 3]
[1 3 5 7 9]


In [52]:
# For a two-dimensional array it works similarly. Let's see an example

a  = np.arange(1,13,1).reshape(3,4)
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [54]:
# If we pass one argument, for example a[:2], we get all the elements from the first and second rows,
# i.e.

a[:2]

array([[1, 2, 3, 4],
       [5, 6, 7, 8]])

In [56]:
# If we add another argument, for example a[:2, 1:3], we get the first two rows but only the second and third columns

a[:2 ,1:3]
# So, in a multidimensional array the first argument slices the rows and the second
# argument slices the columns

array([[2, 3],
       [6, 7]])
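
One point the slicing examples do not show is that a slice is a view onto the original array, not a copy. The following sketch (with illustrative names `base`, `view`, and `independent`) demonstrates the difference and how `.copy()` avoids it.

```python
import numpy as np

base = np.arange(1, 13).reshape(3, 4)

# A slice shares memory with the array it was taken from.
view = base[:2, 1:3]
print(np.shares_memory(base, view))  # True

# Writing through the view changes the original array as well.
view[0, 0] = 100
print(base[0, 1])  # 100

# .copy() produces independent data, so later edits do not propagate.
independent = base[:2, 1:3].copy()
independent[0, 0] = -1
print(base[0, 1])  # still 100
```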

In [3]:
# Some More things to learn

arr = np.arange(1,6)
arr

array([1, 2, 3, 4, 5])

In [11]:
print(np.sum(arr))

print(np.add.reduce(arr))

15
15


In [8]:
np.multiply.reduce(arr)

120

In [18]:
np.subtract.reduce(arr)

-13

In [19]:
np.cumsum(arr)

array([ 1,  3,  6, 10, 15], dtype=int32)

In [21]:
np.add.accumulate(arr)

array([ 1,  3,  6, 10, 15], dtype=int32)

In [23]:
np.cumprod(arr)

array([  1,   2,   6,  24, 120], dtype=int32)

In [24]:
np.multiply.accumulate(arr)

array([  1,   2,   6,  24, 120], dtype=int32)

In [27]:
np.subtract.accumulate(arr)

array([  1,  -1,  -4,  -8, -13], dtype=int32)

In [29]:
np.eye(3) # identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [32]:
np.zeros([3,3]) # Matrix of zeros

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [33]:
np.zeros(3)

array([0., 0., 0.])

In [34]:
np.ones(3)

array([1., 1., 1.])

In [37]:
np.full([3,5], 12)

array([[12, 12, 12, 12, 12],
       [12, 12, 12, 12, 12],
       [12, 12, 12, 12, 12]])

In [38]:
np.arange(1,10,2)

array([1, 3, 5, 7, 9])

In [40]:
np.arange(1,20,2).reshape(2,5)

array([[ 1,  3,  5,  7,  9],
       [11, 13, 15, 17, 19]])

In [43]:
np.linspace(1,20,5)

array([ 1.  ,  5.75, 10.5 , 15.25, 20.  ])

In [50]:
np.random.random([5,5]) # uniform distribution over [0, 1)

array([[0.07217202, 0.64444566, 0.14370718, 0.4989118 , 0.14641136],
       [0.61049091, 0.47913764, 0.17715167, 0.53292838, 0.32003047],
       [0.34256195, 0.01547988, 0.13525848, 0.73231935, 0.84552954],
       [0.12238725, 0.10110349, 0.34753947, 0.64676514, 0.2297539 ],
       [0.58830262, 0.54225594, 0.52218595, 0.9420112 , 0.1039418 ]])

In [54]:
np.random.rand(3,4) # random values in given shape


array([[0.14257723, 0.02381663, 0.86032322, 0.5197083 ],
       [0.70464809, 0.21586882, 0.67622889, 0.44637074],
       [0.0728328 , 0.72191348, 0.88675554, 0.95684062]])

In [55]:
np.prod(arr)

120

In [None]:
np.sum
np.multiply
np.subtract
np.divide
np.floor_divide
np.power
np.mod
np.negative
np.log
np.log2
np.log10
np.exp
np.exp2
np.pi
np.abs
np.absolute
np.sin
np.cos
np.tan
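# NumPy has no cot, sec, or cosec functions; use 1/np.tan(x), 1/np.cos(x), and 1/np.sin(x) instead
np.arcsin
np.arccos
np.arctan
# NumPy likewise has no arccot, arcsec, or arccosec functions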
np.prod
np.min
np.max
np.std
np.var
np.median
np.mean
np.any
np.all
np.percentile
np.argmin
np.argmax

In [59]:
np.argmax(arr)

4

What is broadcasting?

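Broadcasting is NumPy's technique for performing arithmetic operations on arrays of different shapes: the smaller operand is treated as if it were repeated to match the shape of the larger one.

For example, arr + 5 adds 5 to each element of an array.
Similarly, each of the following applies the operation with the scalar to every element:
arr - 5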
arr * 5
arr / 5
arr // 5
arr ** 2



In [84]:
a = np.array([0,1,2])
arr = np.ones([3,3])
a

array([0, 1, 2])

In [85]:
arr

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

In [86]:
arr + a

array([[1., 2., 3.],
       [1., 2., 3.],
       [1., 2., 3.]])
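
In the example above the 1-D array is applied across every row. As a further sketch (the names `row`, `col`, `table`, and `incompatible` are illustrative), reshaping the same values into a column makes both operands stretch, and shapes that cannot be matched raise a ValueError.

```python
import numpy as np

row = np.array([0, 1, 2])  # shape (3,)
col = row.reshape(3, 1)    # shape (3, 1)

# (3, 1) + (3,) broadcasts to (3, 3): entry [i, j] is col[i] + row[j].
table = col + row
print(table.shape)  # (3, 3)
print(table)

# Shapes (3, 3) and (2,) cannot be matched, so NumPy raises ValueError.
incompatible = False
try:
    np.ones((3, 3)) + np.ones(2)
except ValueError:
    incompatible = True
print(incompatible)  # True
```

Dimensions are compared from the right, and two dimensions are compatible when they are equal or one of them is 1.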