![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

<img src="https://user-images.githubusercontent.com/7065401/39118381-910eb0c2-46e9-11e8-81f1-a5b897401c23.jpeg"
    style="width:300px; float: right; margin: 0 40px 40px 40px;"></img>

# Numpy: Numeric computing library

NumPy (Numerical Python) is one of the core packages for numerical computing in Python. Pandas, Matplotlib, Statsmodels and many other scientific libraries rely on NumPy.

NumPy's major contributions are:

* Efficient numeric computation with C primitives
* Efficient collections with vectorized operations
* An integrated and natural Linear Algebra API
* A C API for connecting NumPy with libraries written in C, C++, or FORTRAN.

Let's expand on efficiency. In Python, **everything is an object**, which means that even simple ints are objects, with all the machinery required to make an object work. We call them "Boxed Ints". In contrast, NumPy uses primitive numeric types (floats, ints), which makes storage and computation efficient.

<img src="https://docs.google.com/drawings/d/e/2PACX-1vTkDtKYMUVdpfVb3TTpr_8rrVtpal2dOknUUEOu85wJ1RitzHHf5nsJqz1O0SnTt8BwgJjxXMYXyIqs/pub?w=726&h=396" />
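
To make the difference concrete, the sketch below compares the memory taken by a list of boxed ints with a NumPy array holding the same values. The variable names are illustrative, and the exact byte counts for Python objects depend on the interpreter version and platform, so treat the list figure as approximate.

```python
import sys
import numpy as np

n = 1000
boxed = list(range(n))                 # each element is a full Python int object
packed = np.arange(n, dtype=np.int64)  # one contiguous buffer of 8-byte integers

# Size of the list's pointer array plus the size of every int object it refers to
boxed_bytes = sys.getsizeof(boxed) + sum(sys.getsizeof(x) for x in boxed)
packed_bytes = packed.nbytes           # raw data only: n * 8 bytes

print(boxed_bytes, packed_bytes)
```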


![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Importing Libraries

In [4]:
import sys  # standard-library module, used later for sys.getsizeof
import numpy as np


## Basic Numpy Arrays


In [None]:
my_list = [1, 2, 3, 4]
type(my_list)
np.array(my_list)
a = np.array(my_list)
type(a)
b = np.array([0, .5, 1, 1.5, 2])

### Indexing and Slicing

In [None]:
a[0], a[1]

(np.int64(1), np.int64(2))

In [None]:
a[0:]

array([1, 2, 3, 4])

In [None]:
a[1:3]

array([2, 3])

In [None]:
#a[1:-1]
#a[0::4]
a[0:4]

array([1, 2, 3, 4])

In [None]:
# Omitting the start: everything before index 2

a[:2]

array([1, 2])

In [None]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [None]:
# Single Index

b[0], b[2], b[-1]

(np.float64(0.0), np.float64(1.0), np.float64(2.0))

In [None]:
# Multi-Indexing

b[[0, 2, 4]]  # result is another numpy array

array([0., 1., 2.])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Array Types

In NumPy, dtypes (data types) define the type of elements that an array can hold. Every element in a NumPy array must have the same type, and the dtype object in NumPy specifies the kind of data (e.g., integer, float, etc.) stored in the array.

**Type Specification:** NumPy arrays can hold data of a specific type, such as integers, floating-point numbers, booleans, etc.

**Precision:** dtype allows control over the precision (number of bytes) of the data, making it possible to optimize memory usage or perform high-precision calculations.

Common types:

* np.int8, np.int16, np.int32, np.int64: Signed integers of 1, 2, 4, or 8 bytes.

* np.uint8, np.uint16, np.uint32, np.uint64: Unsigned integers.

* np.float16, np.float32, np.float64: Floating-point numbers of 2, 4, or 8 bytes.

* np.bool_: Boolean values (True or False).

* np.complex64, np.complex128: Complex numbers (two floats representing real and imaginary parts).
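
As a rough illustration of the precision point above, the following sketch prints the byte size of a few of the listed dtypes and compares the memory used by the same values stored as `int8` and `int64`. The array names are made up for this example.

```python
import numpy as np

# itemsize is the number of bytes used by one element of that dtype
for dt in (np.int8, np.int16, np.int32, np.int64, np.float32, np.float64):
    print(np.dtype(dt).name, np.dtype(dt).itemsize)

# The same four values stored with different precision
small = np.array([1, 2, 3, 4], dtype=np.int8)
large = np.array([1, 2, 3, 4], dtype=np.int64)
print(small.nbytes, large.nbytes)  # 4 bytes versus 32 bytes

# astype returns a converted copy; the original array keeps its dtype
as_float = small.astype(np.float64)
```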

In [None]:
a

array([1, 2, 3, 4])

In [None]:
a.dtype  # Get the data type of the array

dtype('int64')

In [None]:
b

array([0. , 0.5, 1. , 1.5, 2. ])

In [None]:
b.dtype

dtype('float64')

In [None]:
np.array([1, 2, 3, 4], dtype=np.float64)  # Change the data type to float

array([1., 2., 3., 4.])

In [None]:
a.dtype

dtype('int64')

In [None]:
np.array([1, 2, 3, 4], dtype=np.int8)  # Change the data type to int8

array([1, 2, 3, 4], dtype=int8)

In [None]:
c = np.array(['ab', 'bc', 'cde'])  # Create an array of strings

In [None]:
c.dtype

dtype('<U3')

In [None]:
# Get the memory size of the integer 5

print(sys.getsizeof(5))  # Output: 28 bytes

28


![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Dimensions and shapes

In [None]:
my_matrix = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]  # 2x2x3 nested list

In [None]:
my_matrix

[[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]]

In [None]:
A = np.array(my_matrix)  # 3-D array

In [None]:
A

array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])

In [None]:
A.shape  # 2 elements, 2 rows, 3 columns

(2, 2, 3)

In [None]:
A.ndim  # 3 dimensions

3

In [None]:
A.size  # 12 elements

12

In [None]:
B = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4],
        [3, 2, 1]
    ]
])  # 3-D array

In [None]:
B  # 2 elements, 2 rows, 3 columns

array([[[12, 11, 10],
        [ 9,  8,  7]],

       [[ 6,  5,  4],
        [ 3,  2,  1]]])

In [None]:
B.shape  # 2 elements, 2 rows, 3 columns

(2, 2, 3)

In [None]:
B.ndim  # 3 dimensions

3

In [None]:
B.size  # 12 elements

12

If the shape isn't consistent, NumPy cannot build a regular numeric array. Older releases fell back to an array of regular Python objects; since NumPy 1.24 this raises a `ValueError` ("setting an array element with a sequence. The requested array has an inhomogeneous shape ...") unless you request the fallback explicitly with `dtype=object`:

In [None]:
C = np.array([
    [
        [12, 11, 10],
        [9, 8, 7],
    ],
    [
        [6, 5, 4]  # only one row here, so the two elements differ in shape
    ]
], dtype=object)  # ragged input, stored as Python objects

In [None]:
C.dtype  # object

dtype('O')

In [None]:
C.shape  # 2 elements, each one a nested Python list

(2,)

In [None]:
C.size  # 2 elements

2

In [None]:
type(C[0])  # list

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Indexing and Slicing of Matrices

In [None]:
# Square matrix
A = np.array([
#    0  1  2
    [1, 2, 3], # 0
    [4, 5, 6], # 1
    [7, 8, 9]  # 2
])  # 3x3

In [None]:
A[1]  # 2nd row

array([4, 5, 6])

In [None]:
A[1][0]  # 2nd row, 1st column

np.int64(4)

In [None]:
A[1, 0]  # 2nd row, 1st column

np.int64(4)

In [None]:
A[0:2]  # 1st and 2nd row

array([[1, 2, 3],
       [4, 5, 6]])

In [None]:
A[:, :2]  # 1st and 2nd column

array([[1, 2],
       [4, 5],
       [7, 8]])

In [None]:
A[:2, :2]  # 1st and 2nd row, 1st and 2nd column

array([[1, 2],
       [4, 5]])

In [None]:
A[:2, 2:]  # 1st and 2nd row, 3rd column

array([[3],
       [6]])

In [None]:
A  # 3x3

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [None]:
A[1] = np.array([10, 10, 10])  # 2nd row

In [None]:
A  # 3x3

array([[ 1,  2,  3],
       [10, 10, 10],
       [ 7,  8,  9]])

In [None]:
A[2] = 99  # 3rd row

In [None]:
A

array([[ 1,  2,  3],
       [10, 10, 10],
       [99, 99, 99]])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Summary statistics

In [None]:
a = np.array([1, 2, 3, 4])  # 1-D array

In [None]:
a.sum()  # sum of all elements

np.int64(10)

In [None]:
a.mean()  # average of all elements

np.float64(2.5)

In [None]:
a.std()  # standard deviation of all elements

1.118033988749895

In [None]:
a.var()  # variance of all elements

1.25

In [5]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])  # 3x3

In [6]:
A.sum()  # sum of all elements

np.int64(45)

In [7]:
A.mean()  # average of all elements

np.float64(5.0)

In [8]:
A.std()  # standard deviation of all elements

np.float64(2.581988897471611)

In [9]:
A.sum(axis=0)  # sum of all elements in each column

array([12, 15, 18])

In [10]:
A.sum(axis=1)  # sum of all elements in each row

array([ 6, 15, 24])

In [11]:
A.mean(axis=0)  # average of all elements in each column

array([4., 5., 6.])

In [None]:
A.mean(axis=1)  # average of all elements in each row

array([2., 5., 8.])

In [None]:
A.std(axis=0)  # standard deviation of all elements in each column

array([2.44948974, 2.44948974, 2.44948974])

In [None]:
A.std(axis=1)  # standard deviation of all elements in each row

array([0.81649658, 0.81649658, 0.81649658])

And [many more](https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.ndarray.html#array-methods)...
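
A few of those other methods, sketched on the same 3x3 matrix. The result names are illustrative only.

```python
import numpy as np

A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])

smallest = A.min()             # minimum over all elements
col_max = A.max(axis=0)        # maximum of each column
flat_argmax = A.argmax()       # position of the maximum in the flattened array
row_cumsum = A.cumsum(axis=1)  # running total along each row
```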

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Broadcasting and Vectorized operations

The term broadcasting refers to how NumPy handles arrays with different shapes during arithmetic operations. Shapes are compared dimension by dimension, starting from the last one; two dimensions are compatible when they are equal or when one of them is 1. When the shapes are compatible, the smaller array is ‘broadcast’ across the larger one so that the operation can be applied element-wise.

Broadcasting is not limited to two arrays; it can be applied over multiple arrays as well.
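
The sketch below shows these ideas with a hypothetical column vector and row vector: combining a `(3, 1)` array with a `(3,)` array gives a `(3, 3)` result, a third operand is handled the same way, and incompatible shapes are rejected.

```python
import numpy as np

col = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3])           # shape (3,)

# Shapes are compared from the right: (3, 1) and (3,) give (3, 3).
# A dimension of size 1 (or a missing one) is stretched to match the other.
grid = col + row

# Three operands are broadcast together in the same way
scaled = col + row + 100

# Incompatible shapes raise an error instead of guessing
try:
    np.arange(3) + np.arange(4)
    compatible = True
except ValueError:
    compatible = False
```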





In [45]:
a = np.arange(5)  # 1-D array

In [46]:
a

array([0, 1, 2, 3, 4])

In [47]:
b = np.arange(5)
b

array([0, 1, 2, 3, 4])

In [58]:
c=np.array([a,b])

In [59]:
c

array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

In [66]:
np.log10(c)  # log10(0) is -inf, so NumPy emits a divide-by-zero RuntimeWarning

array([[      -inf, 0.        , 0.30103   , 0.47712125, 0.60205999],
       [      -inf, 0.        , 0.30103   , 0.47712125, 0.60205999]])

In [68]:
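# c + 10        # add a number to every element
# c ** 2        # raise every element to a power
# np.log10(c)   # logarithm of every element
# Dividing row by row in a Python loop gives the same result as the vectorized c / 10
e = []
for row in c:
    e.append(row / 10)
np.array(e)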


array([[0. , 0.1, 0.2, 0.3, 0.4],
       [0. , 0.1, 0.2, 0.3, 0.4]])

In [69]:
a + 10  # add 10 to all elements

array([10, 11, 12, 13, 14])

In [70]:
a * 10  # multiply all elements by 10

array([ 0, 10, 20, 30, 40])

In [71]:
a # original unchanged array

array([0, 1, 2, 3, 4])

In [72]:
a += 100  # add 100 to all elements

In [73]:
a # modified array

array([100, 101, 102, 103, 104])

In [74]:
l = [0, 1, 2, 3]

In [75]:
[i * 10 for i in l]  # the same multiplication on a plain list needs an explicit loop

[0, 10, 20, 30]

In [76]:
a = np.arange(4)  # 1-D array

In [77]:
a

array([0, 1, 2, 3])

In [78]:
b = np.array([10, 10, 10, 10])  # 1-D array

In [79]:
b

array([10, 10, 10, 10])

In [80]:
a + b  # add element wise

array([10, 11, 12, 13])

In [81]:
a * b  # multiply element wise

array([ 0, 10, 20, 30])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Boolean arrays
_(Also called masks)_

In [82]:
a = np.arange(4)  # 1-D array

In [83]:
a

array([0, 1, 2, 3])

In [84]:
a[0], a[-1]  # first and last element

(np.int64(0), np.int64(3))

In [90]:
a[[0, -1]]  # first and last element with multi indexing

array([0, 3])

In [91]:
a[[True, False, False, True]]  # first and last element with boolean indexing

array([0, 3])

In [96]:
a

array([0, 1, 2, 3])

In [97]:
a >= 2  # element wise comparison

array([False, False,  True,  True])

In [98]:
a[a >= 2]  # boolean indexing

array([2, 3])

In [99]:
a.mean()  # average of all elements

np.float64(1.5)

In [100]:
a[a > a.mean()]   # elements greater than average

array([2, 3])

In [None]:
a[~(a > a.mean())]  # elements less than or equal to the average

array([0, 1])

In [101]:
a[(a == 0) | (a == 1)]  # elements equal to 0 or 1

array([0, 1])

In [102]:
a[(a <= 2) & (a % 2 == 0)]  # elements less than or equal to 2 and even

array([0, 2])

In [126]:
A = np.random.randint(100, size=(3, 3))  # 3x3 matrix with random integers in [0, 100)

In [123]:
# For negative numbers, pass both bounds. Left commented out so that A stays 3x3 for the cells below.
# A = np.random.randint(-50, 0, size=(3, 4))  # 3x4 matrix with random integers in [-50, 0)

In [125]:
A

array([[28, 97, 50],
       [92, 33, 38],
       [ 7, 92, 31]])

In [133]:
A[np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])]  # boolean indexing

array([28, 50, 33,  7, 31])

In [134]:
# Select elements from A where the boolean mask is False
print(A[~np.array([
    [True, False, True],
    [False, True, False],
    [True, False, True]
])])

[97 92 38 92]


In [135]:
A > 30  # element wise comparison

array([[False,  True,  True],
       [ True,  True,  True],
       [False,  True,  True]])

In [131]:
A[A > 30]  # boolean indexing

array([97, 50, 92, 33, 38, 92, 31])

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Linear Algebra

In [147]:
A = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9]
])  # 3x3

In [144]:
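# Left commented out: redefining A as a 3x2 matrix makes A.dot(B) throw an error,
# because the inner dimensions (2 columns in A, 3 rows in B) don't match.
# A = np.array([
#     [1, 2],
#     [4, 5],
#     [7, 8]
# ])  # 3x2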

In [148]:
B = np.array([
    [6, 5],
    [4, 3],
    [2, 1]
])  # 3x2

In [142]:
A.dot(B)  # dot product

array([[20, 14],
       [56, 41],
       [92, 68]])

In [149]:
A @ B  # another way to take dot product

array([[20, 14],
       [56, 41],
       [92, 68]])

In [150]:
B.T  # transpose

array([[6, 4, 2],
       [5, 3, 1]])

In [151]:
A

array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [152]:
B.T @ A  # matrix product of B transpose and A

array([[36, 48, 60],
       [24, 33, 42]])

In [158]:
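# np.invert is the element-wise bitwise NOT (~x equals -x - 1), not the matrix inverse
np.invert(A)
# The matrix inverse is np.linalg.inv; this A is singular (its rows are linearly dependent), so it has no inverse
# Determinant of A
#np.linalg.det(A)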

array([[ -2,  -3,  -4],
       [ -5,  -6,  -7],
       [ -8,  -9, -10]])
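
For an actual matrix inverse, `np.linalg.inv` is the function to reach for. The 3x3 matrix `A` used above is singular (each row differs from the previous one by the same amount, so the rows are linearly dependent) and has no inverse, so this sketch uses a small invertible matrix `M` that is made up for the example.

```python
import numpy as np

M = np.array([
    [2.0, 1.0],
    [1.0, 1.0]
])  # hypothetical invertible matrix (its determinant is 1)

M_inv = np.linalg.inv(M)   # matrix inverse
det = np.linalg.det(M)     # determinant

# A matrix times its inverse gives the identity, up to floating-point rounding
check = np.allclose(M @ M_inv, np.eye(2))
```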

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Size of objects in Memory

### Int, floats

In [170]:
# An integer in Python takes more than 24 bytes
sys.getsizeof(10)

28

In [171]:
# Longs are even larger
sys.getsizeof(10**100)

72

In [179]:
# Numpy size is much smaller: a 64-bit integer takes 8 bytes
np.dtype(np.int64).itemsize

8

In [186]:
# An 8-bit integer takes a single byte
np.dtype(np.int8).itemsize

1

In [187]:
np.dtype(float).itemsize  # float is 8 bytes

8

### Lists are even larger

In [None]:
# A one-element list
sys.getsizeof([1])

64

In [None]:
# An array of one element in numpy
np.array([1]).nbytes

8

### And performance is also important

In [193]:
l = list(range(100_000)) # list with values from 0 - 99999

In [194]:
a = np.arange(100000) # numpy array with values from 0 - 99999

In [195]:
%time np.sum(a ** 2)  # timing the function

CPU times: user 202 µs, sys: 1.04 ms, total: 1.25 ms
Wall time: 854 µs


np.int64(333328333350000)

In [196]:
%time sum([x ** 2 for x in l]) # timing the function

CPU times: user 7.66 ms, sys: 3.05 ms, total: 10.7 ms
Wall time: 10.7 ms


333328333350000
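
`%time` measures a single run, so the numbers above vary from one execution to the next. As a rough alternative that also works outside a notebook, the sketch below uses the standard `timeit` module and keeps the best of several repeats. The repeat counts are arbitrary choices, and the absolute timings depend on the machine.

```python
import timeit
import numpy as np

l = list(range(100_000))
a = np.arange(100_000, dtype=np.int64)

# Best of 3 repeats, 3 calls each, to smooth out one-off noise
numpy_time = min(timeit.repeat(lambda: np.sum(a ** 2), number=3, repeat=3))
list_time = min(timeit.repeat(lambda: sum([x ** 2 for x in l]), number=3, repeat=3))

print(f"NumPy: {numpy_time:.4f}s  list: {list_time:.4f}s")

# Both approaches compute the same value
same_result = int(np.sum(a ** 2)) == sum(x ** 2 for x in l)
```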

![green-divider](https://user-images.githubusercontent.com/7065401/52071924-c003ad80-2562-11e9-8297-1c6595f8a7ff.png)

## Useful Numpy functions

### `random`

In [None]:
np.random.random(size=2)  # random number between 0 - 1

array([0.06261306, 0.86354836])

In [None]:
np.random.normal(size=2)  # random number with normal distribution

array([-0.95999373,  0.72987844])

In [None]:
np.random.rand(2, 4)  # random number with uniform distribution

array([[0.24621586, 0.54552086, 0.41878152, 0.47819737],
       [0.02654858, 0.19748214, 0.89927504, 0.48441888]])

---
### `arange`

In [None]:
np.arange(10)  # array with values from 0 - 9

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [None]:
np.arange(5, 10)  # array with values from 5 - 9

array([5, 6, 7, 8, 9])

In [None]:
np.arange(0, 1, .1)  # array with values from 0 - 1 with step 0.1

array([0. , 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])

---
### `reshape`

In [None]:
np.arange(10).reshape(2, 5)  # reshape array to 2 rows, 5 cols

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

In [None]:
np.arange(10).reshape(5, 2)  # reshape array to 5 rows, 2 cols

array([[0, 1],
       [2, 3],
       [4, 5],
       [6, 7],
       [8, 9]])

---
### `linspace`

In [None]:
np.linspace(0, 1, 5) # 5 equally spaced numbers between 0 and 1

array([0.  , 0.25, 0.5 , 0.75, 1.  ])

In [None]:
np.linspace(0, 1, 20) # 20 equally spaced numbers between 0 and 1

array([0.        , 0.05263158, 0.10526316, 0.15789474, 0.21052632,
       0.26315789, 0.31578947, 0.36842105, 0.42105263, 0.47368421,
       0.52631579, 0.57894737, 0.63157895, 0.68421053, 0.73684211,
       0.78947368, 0.84210526, 0.89473684, 0.94736842, 1.        ])

In [None]:
np.linspace(0, 1, 20, endpoint=False) # 20 equally spaced numbers between 0 and 1, excluding 1

array([0.  , 0.05, 0.1 , 0.15, 0.2 , 0.25, 0.3 , 0.35, 0.4 , 0.45, 0.5 ,
       0.55, 0.6 , 0.65, 0.7 , 0.75, 0.8 , 0.85, 0.9 , 0.95])

---
### `zeros`, `ones`, `empty`

In [None]:
np.zeros(5)  # array with 5 zeros

array([0., 0., 0., 0., 0.])

In [None]:
np.zeros((3, 3))  # array with 3 rows, 3 cols

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

In [None]:
np.zeros((3, 3), dtype=np.int8)  # array with 3 rows, 3 cols, int type

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]], dtype=int8)

In [None]:
np.ones(5)  # array with 5 ones

array([1., 1., 1., 1., 1.])

In [None]:
np.ones((3, 3))  # array with 3 rows, 3 cols

array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])
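
The heading also mentions `empty`, which has no cell above. A minimal sketch: `np.empty` allocates an array without initializing it, so its contents are arbitrary until you write to it. `np.full`, shown for comparison, is not part of the original notebook.

```python
import numpy as np

# np.empty allocates memory without initializing it, so the contents are arbitrary
buf = np.empty((2, 3))

# Fill the array before reading from it
buf[:] = 7.0

# np.full creates and fills an array in a single step
sevens = np.full((2, 3), 7.0)
```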

---
### `identity` and `eye`

In [198]:
np.identity(3)  # identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [None]:
np.eye(3, 3)  # identity matrix

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

In [None]:
np.eye(8, 4)  # 8 rows, 4 columns, ones on the main diagonal

array([[1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [203]:
np.eye(8, 4, k=1)  # ones on the first diagonal above the main one

array([[0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.]])

In [None]:
np.eye(8, 4, k=-3)  # ones on the third diagonal below the main one

array([[0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [0., 0., 0., 0.],
       [1., 0., 0., 0.],
       [0., 1., 0., 0.],
       [0., 0., 1., 0.],
       [0., 0., 0., 1.],
       [0., 0., 0., 0.]])

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)