# Introduction to Numpy

A NumPy array is a grid of values, all of the same type, indexed by a tuple of nonnegative integers. The number of dimensions is the rank of the array; the shape of an array is a tuple of integers giving the size of the array along each dimension.
<img src="http://pytolearn.csd.auth.gr/b2-numat/20/arrays.png" width=600>

### Numpy Array from List

In [3]:
import numpy as np
l = [1,2,3,4,5,6,7,8,9]
a = np.array(l)
a

array([1, 2, 3, 4, 5, 6, 7, 8, 9])

### Shape of a Numpy Array

In [4]:
a.shape

(9,)

### Reshaping a Numpy Array

In [6]:
# reshape the 1-D array of 9 elements, shape (9,), into a 3x3 array
b = a.reshape(3, 3)
b , b.shape

(array([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]), (3, 3))

### Data Types

In [9]:
#find data types
a.dtype

dtype('int64')

In [11]:
#declare data types explicitly
a = np.array([1, 2, 3], dtype='float64')
a

array([1., 2., 3.])

### Methods for Array Creation

#### numpy.arange - Range of values
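A minimal sketch of `np.arange`, which builds an array from a start value, an exclusive stop value, and a step. The variable names `evens` and `first_five` are illustrative only.

```python
import numpy as np

# Integers from 0 up to (but not including) 10, in steps of 2.
evens = np.arange(0, 10, 2)
print(evens)        # [0 2 4 6 8]

# With a single argument, arange counts from 0 up to that value (exclusive).
first_five = np.arange(5)
print(first_five)   # [0 1 2 3 4]
```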

#### numpy.linspace - By specifying the number of elements
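A short sketch of `np.linspace`, where the number of elements is given instead of the step size. The name `points` and the chosen interval are illustrative assumptions.

```python
import numpy as np

# Five evenly spaced values from 0 to 1; both endpoints are included by default.
points = np.linspace(0, 1, 5)
print(points)
print(len(points))  # 5
```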

#### Zero-Initialised
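One way to create a zero-initialised array is `np.zeros`, which takes the desired shape as a tuple. The shape `(2, 3)` here is an arbitrary choice for illustration.

```python
import numpy as np

# A 2x3 array filled with zeros; the default dtype is float64.
zeros = np.zeros((2, 3))
print(zeros)
print(zeros.shape, zeros.dtype)  # (2, 3) float64
```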

#### Quiz: One-Initialised

#### Constant Diagonal Values
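A possible sketch for a constant diagonal, assuming `np.eye` is the intended function: it puts ones on the main diagonal and zeros elsewhere, and scaling the result gives any other constant. The value 7 is an arbitrary example.

```python
import numpy as np

# 3x3 identity matrix: ones on the main diagonal, zeros elsewhere.
identity = np.eye(3)
print(identity)

# Scaling an integer identity matrix puts the constant 7 on the diagonal.
sevens = 7 * np.eye(3, dtype=int)
print(sevens)
```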

#### Quiz: Multiple Diagonal Values

### Axis

<img src="https://i.ibb.co/W6sFBJL/numpy-axis-none.png" alt="numpy-axis-none" border="0" width=350>

In [33]:
a = np.arange(15).reshape(3,5)
a

array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14]])

In [None]:
#Axis is None
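A minimal sketch of a reduction with no axis given, assuming `sum` as the example operation (the source does not name one). With `axis=None`, the default, the whole array is reduced to a single value.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# No axis given: every element is added into one scalar.
total = a.sum()
print(total)  # 105
```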

<img src="https://i.ibb.co/jVzJymp/numpy-axis-1.png" alt="numpy-axis-1" border="0" width=350>

In [30]:
#Row-wise
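A sketch of a row-wise reduction, again assuming `sum` as the example operation. With `axis=1` the columns are collapsed, leaving one value per row.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# axis=1 adds along each row, so the result has one entry per row.
row_sums = a.sum(axis=1)
print(row_sums)  # [10 35 60]
```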

<img src="https://i.ibb.co/7tfJC1r/numpy-axis-0.png" alt="numpy-axis-0" border="0" width=350>

In [31]:
#Column-wise
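A sketch of a column-wise reduction, assuming `sum` as the example operation. With `axis=0` the rows are collapsed, leaving one value per column.

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# axis=0 adds down each column, so the result has one entry per column.
column_sums = a.sum(axis=0)
print(column_sums)  # [15 18 21 24 27]
```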

### Indexing and Slicing
Indexing starts at 0 and runs up to n-1, and is used to retrieve specific elements of an array. <br>
A slice has the form start:stop, where the element at stop is not included.

<img src="https://i.ibb.co/bjtw7NW/Indexing.png" alt="Indexing" border="0" width=350>

In [23]:
a = np.arange(15).reshape(3, 5)
a[0:2]

array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])

<img src="https://i.ibb.co/YQHd0wT/Slicing.png" alt="Slicing" border="0" width=350>

In [22]:
a = np.arange(15).reshape(3, 5)
a[:2, 2:4]

array([[2, 3],
       [7, 8]])
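The slices above return sub-arrays. To retrieve a single element or a single row, plain integer indices can be used, as in this small sketch (the variable names are illustrative):

```python
import numpy as np

a = np.arange(15).reshape(3, 5)

# One index per dimension picks a single element: row 1, column 3.
element = a[1, 3]
print(element)   # 8

# A single index on a 2-D array returns the whole row.
last_row = a[2]
print(last_row)  # [10 11 12 13 14]
```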

### Broadcasting
Arrays with different shapes cannot, in general, be added, subtracted, or otherwise combined element by element in arithmetic.

NumPy works around this by treating the smaller array as if it were duplicated until it has the same dimensionality and size as the larger array. This is called array broadcasting; it applies when the shapes are compatible, and it can greatly reduce and simplify your code.

<img src="https://i.ibb.co/ZYTg8Kd/numpy-broad.png" alt="numpy-broad" border="0" width=350>

In [26]:
a = np.arange(5)
b = np.array([10,10,10,10,10])
print(a + b)
b = 10
print(a + b)

[10 11 12 13 14]
[10 11 12 13 14]


<img src="https://i.ibb.co/3RykkVB/numpy-broad-2.png" alt="numpy-broad-2" border="0" width=350>

In [29]:
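a = np.arange(6).reshape(3,2)
# b has shape (3, 1); its single column is broadcast across both columns of a
b = np.array([[10],[20], [30]])
print(a + b)
# slicing all three rows and the one column leaves b unchanged, so the result is the same
b = b[0:3, 0:1]
print(a + b)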

[[10 11]
 [22 23]
 [34 35]]
[[10 11]
 [22 23]
 [34 35]]
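Broadcasting only works when the shapes are compatible. The sketch below contrasts a compatible case with an incompatible one; the names `row`, `bad`, and `failed` are illustrative and not from the notebook.

```python
import numpy as np

a = np.arange(6).reshape(3, 2)   # shape (3, 2)

# Shape (2,) matches the last dimension of a, so it is reused for every row.
row = np.array([10, 20])
compatible = a + row
print(compatible)

# Shape (3,) does not match the last dimension (2), so NumPy raises ValueError.
bad = np.array([10, 20, 30])
failed = False
try:
    a + bad
except ValueError as err:
    failed = True
    print("cannot broadcast:", err)
```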


<hr style="border-color:black">
<h1>Introduction to Matplotlib</h1>

### 1D Series

<img src="1DSeries.png" width=450 />

### Data Distributions

<img src="DataDistributions.png" width=550/>

<hr style="border-color:black">

### 1D Series

In [1]:
import matplotlib.pyplot as plt
import numpy as np

### Line Plot
A line plot connects data points with straight line segments. It is useful for understanding a trend over time: an upward trend suggests a positive correlation between the variables, and a downward trend suggests a negative one. It is mostly used in forecasting and in monitoring models.

#### When to use: Line plots should be used when one or more variables are to be plotted over time.

E.g.: stock market analysis of companies, weather forecasting.

In [34]:
x = np.arange(5)
y = np.random.randn(5)
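A minimal sketch of a line plot for data like the `x` and `y` defined above. The axis labels and title are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

fig, ax = plt.subplots()
ax.plot(x, y)              # points are joined by a solid line by default
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Line plot")
plt.show()
```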

#### Line Plot with Dashed Lines
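One way to draw a dashed line is the `linestyle` argument of `plot`. The colour is an arbitrary choice for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

fig, ax = plt.subplots()
# linestyle="--" draws dashes instead of a solid line.
(line,) = ax.plot(x, y, linestyle="--", color="green")
ax.set_title("Dashed line plot")
plt.show()
```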

### Subplots of a Line Graph
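A possible sketch of line-graph subplots using `plt.subplots`, which returns a figure and an array of axes. The 1x2 layout and figure size are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

# One row, two columns of axes sharing a single figure.
fig, axes = plt.subplots(1, 2, figsize=(8, 3))
axes[0].plot(x, y)
axes[0].set_title("Solid")
axes[1].plot(x, y, linestyle="--")
axes[1].set_title("Dashed")
fig.tight_layout()
plt.show()
```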

### Scatter Plot
A scatter plot helps in visualizing two numeric variables. It helps in identifying the relationship between them, i.e. correlation or trend patterns, and in detecting outliers.

#### When to use: It is used in machine learning tasks such as regression, where x and y are continuous variables. It is also used to visualize clusters and to detect outliers.
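A minimal scatter plot sketch. The notebook gives no data for this section, so the data here is synthetic and purely an assumption: `y` is roughly twice `x` plus noise.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumed for illustration): a noisy linear relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2 * x + rng.normal(scale=0.5, size=50)

fig, ax = plt.subplots()
points = ax.scatter(x, y)
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("Scatter plot")
plt.show()
```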

#### Marker is Triangle
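The marker shape can be changed with the `marker` argument of `scatter`; `"^"` is an upward-pointing triangle. The data below is synthetic and assumed only for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data (assumed for illustration).
rng = np.random.default_rng(1)
x = rng.normal(size=30)
y = rng.normal(size=30)

fig, ax = plt.subplots()
triangles = ax.scatter(x, y, marker="^", color="red")
ax.set_title("Scatter plot with triangle markers")
plt.show()
```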

### Bar Graphs
A bar plot shows the distribution of data over several groups. It is commonly confused with a histogram, which only takes numerical data for plotting. It helps in comparing numeric values across groups.

#### When to use: It is used to compare values between several groups.

E.g.: student marks in an exam.

In [31]:
x = np.arange(5)
y = np.random.randn(5)

a = np.arange(10)
b = np.random.randn(10)

### Vertical Bar Graphs
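A minimal sketch of a vertical bar graph with `bar`, using data shaped like the `x` and `y` above. The labels are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

fig, ax = plt.subplots()
bars = ax.bar(x, y)          # one vertical bar per (x, y) pair
ax.set_xlabel("group")
ax.set_ylabel("value")
ax.set_title("Vertical bar graph")
plt.show()
```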

### Horizontal Bar Graphs
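A horizontal bar graph can be sketched with `barh`, which takes the bar positions first and the bar lengths second. The labels are illustrative assumptions.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

fig, ax = plt.subplots()
bars = ax.barh(x, y)         # positions on the vertical axis, lengths along the horizontal axis
ax.set_xlabel("value")
ax.set_ylabel("group")
ax.set_title("Horizontal bar graph")
plt.show()
```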

#### Subplots
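A possible sketch that places a vertical and a horizontal bar graph side by side. The 1x2 layout and the names `ax_v` and `ax_h` are assumptions for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(5)
y = np.random.randn(5)

# Two axes in one figure: vertical bars on the left, horizontal on the right.
fig, (ax_v, ax_h) = plt.subplots(1, 2, figsize=(8, 3))
ax_v.bar(x, y)
ax_v.set_title("Vertical")
ax_h.barh(x, y)
ax_h.set_title("Horizontal")
fig.tight_layout()
plt.show()
```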

### Pie Chart
A pie chart is a circular plot divided into slices to illustrate numerical proportion. Each slice shows the proportion of one part out of the whole.

#### When to use: Pie charts should seldom be used, as it is difficult to compare sections of the chart. A bar plot is often used instead, since comparing its bars is easy.

E.g.: market share in films.

In [35]:
# Data to plot
labels = 'Python', 'C++', 'Ruby', 'Java'
sizes = [215, 130, 245, 210]
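A minimal sketch of a pie chart for the `labels` and `sizes` above. The percentage format passed to `autopct` and the equal aspect ratio are illustrative choices.

```python
import matplotlib.pyplot as plt

# Data to plot
labels = 'Python', 'C++', 'Ruby', 'Java'
sizes = [215, 130, 245, 210]

fig, ax = plt.subplots()
# autopct prints each slice's share of the total as a percentage.
wedges, texts, autotexts = ax.pie(sizes, labels=labels, autopct="%1.1f%%")
ax.axis("equal")   # equal aspect ratio keeps the pie circular
plt.show()
```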

### Histogram
A histogram takes a series of data and divides it into a number of bins. It then plots the frequency of data points in each bin (i.e. the interval of points). It is useful for understanding how many values fall in each range.

#### When to use: Use a histogram when the count of a variable's values per range is needed in a plot.

E.g.: number of particular games sold in a store.

In [36]:
data = [21,22,23,4,5,6,77,8,9,10,31,32,33,34,35,36,37,18,49,50,100]
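A minimal sketch of a histogram of `data` with `hist`. The choice of 10 bins is an assumption; `hist` returns the per-bin counts and the bin edges along with the drawn patches.

```python
import matplotlib.pyplot as plt

data = [21,22,23,4,5,6,77,8,9,10,31,32,33,34,35,36,37,18,49,50,100]

fig, ax = plt.subplots()
# counts[i] is the number of values that fall between bin_edges[i] and bin_edges[i + 1].
counts, bin_edges, patches = ax.hist(data, bins=10)
ax.set_xlabel("value")
ax.set_ylabel("frequency")
plt.show()
```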

#### Cumulative Histogram
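A cumulative histogram can be sketched by passing `cumulative=True` to `hist`, so each bar counts all values up to and including its bin. The 10 bins are an assumed choice.

```python
import matplotlib.pyplot as plt

data = [21,22,23,4,5,6,77,8,9,10,31,32,33,34,35,36,37,18,49,50,100]

fig, ax = plt.subplots()
# Each bin holds the running total, so the last bin equals the number of values.
counts, bin_edges, patches = ax.hist(data, bins=10, cumulative=True)
ax.set_title("Cumulative histogram")
plt.show()
```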

#### Histogram with Different Edge and Face Colors and a Specified Range
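A possible sketch using the `facecolor`, `edgecolor`, and `range` arguments of `hist`. The colours, the 10 bins, and the range `(0, 50)` are assumptions for illustration; values outside the range are ignored.

```python
import matplotlib.pyplot as plt

data = [21,22,23,4,5,6,77,8,9,10,31,32,33,34,35,36,37,18,49,50,100]

fig, ax = plt.subplots()
# Only values between 0 and 50 are binned, so 77 and 100 are left out.
counts, bin_edges, patches = ax.hist(
    data, bins=10, range=(0, 50), facecolor="skyblue", edgecolor="black"
)
ax.set_title("Histogram restricted to 0-50")
plt.show()
```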

In [37]:
### Assignment Question
### x and y - Bar Graph (Vertical and Horizontal) Plotting 
### a and b - Bar Graph (Vertical and Horizontal) Plotting