## Why Python?

It's easier to learn and read than most other languages, given the form of its syntax.

There's a huge, rapidly-growing community of coders that use it.

There are a large number of third-party libraries that can make coding a breeze.

Let's start our lesson with something every coder will be familiar with - the obligatory "Hello World!" program.

Open up Python and type this - 



In [9]:
print("Hello World!")
# That's it, believe it or not!
# By the way, this is a comment - you can freely type what you want to,
# and the interpreter will ignore it. Use comments to help readers understand your code, or to put random easter eggs :p

Hello World!


There are two ways to use Python - the Shell and the Editor. The Shell executes each statement as soon as you enter it, while the Editor lets you write the whole program first and executes it only when you 'run' it.


## Contents

Today, we'll be going over the following topics:


*   Data Types
*   Conditional Statements
*   Loops
*   Operators
*   Scope of Variables
*   Lists, Tuples and Dictionaries
*   Python Libraries



## Datatypes

Variables are grouped into different data types, with each data type giving a variable different properties. Knowing the data type lets us know what is expected from each variable; furthermore, functions (which you will learn about shortly) also react differently to variables based on their data type.

The most commonly used data types are int (integers), float (decimal values), str (strings: characters, words and sentences) and bool (boolean values, i.e. True or False). There are several other data types, like 'complex' and 'bytes', but we'll restrict our discussion to these ones for now. Note that Python has no separate 'char' type; a single character is simply a str of length 1.



In [10]:
x = 5
name = "Wallstreet Bets"

print(x)
print(name)

print(type(x))
print(type(name))

5
Wallstreet Bets
<class 'int'>
<class 'str'>


The example code you just saw showed us something new - Python doesn't require you to declare a variable's data type as most other languages do. Rather, it infers it from the value you assign to the variable.
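As a small illustration of that inference, the sketch below also covers the `float` and `bool` types mentioned earlier. The variable names are made up for this example; it also shows that the same name can later be bound to a value of a different type.

```python
# Python infers the type from the value on the right-hand side.
price = 19.99          # float
is_open = True         # bool
count = 3              # int

print(type(price))     # <class 'float'>
print(type(is_open))   # <class 'bool'>
print(type(count))     # <class 'int'>

# The same name can later hold a value of a different type.
count = "three"
print(type(count))     # <class 'str'>
```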

## Conditional Statements

Conditional Statements are used for decision-making.

In [11]:
a = 420
if a == 420:
  print("nice")
else:
  print("meh")

nice


Here we had to make a decision based on the value of a - whether to print 'nice' or not. That is what we use the following keywords for:


*   if
*   else
*   elif

These make use of test expressions, logical operations that evaluate to True or False - *if* the expression is True, the part indented below the 'if' expression is executed. Otherwise, it is ignored.

The use of 'else' and 'elif' can be somewhat intuitively understood by the following code.



In [12]:
b = 69
if b == 420:
  print('nice')
elif b == 69:
  print('niiiiiice')
else:
  print('lame')

niiiiiice
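One detail worth seeing in action: the branches are tested from top to bottom and only the first one whose test expression is True gets executed, even if a later test would also be True. A minimal sketch, with a made-up `score` variable and made-up thresholds:

```python
score = 95

# 95 satisfies both "score >= 90" and "score >= 50",
# but only the first matching branch runs.
if score >= 90:
    result = "excellent"
elif score >= 50:
    result = "pass"
else:
    result = "fail"

print(result)  # excellent
```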


## Loops
Let's look at two programs with the same output:




In [13]:
print("Hello!")
print("Hello!")
print("Hello!")
print("Hello!")
print("Hello!")

Hello!
Hello!
Hello!
Hello!
Hello!


In [14]:
t = 0
while(t < 5):
    print("Hello!")
    t = t + 1

Hello!
Hello!
Hello!
Hello!
Hello!


Which of those two programs seemed more efficient? The second one stays the same length no matter how many repetitions we need, while the first method would be impractical for tasks that need to be repeated thousands of times, which is common in real programs.

There are two kinds of loops - bounded and unbounded loops.

Bounded loops - We specify the exact number of iterations

Unbounded loops - We don't

An *iteration* is what you call each repeating step in a loop.

### While loop

The while loop is an unbounded loop with the following form:

In [15]:
i = 0
while(i<10):
 print(i)
 i = i + 1

0
1
2
3
4
5
6
7
8
9


As you can see, we did not specify the number of iterations.

Loops make use of a test expression as well - they check the expression before each iteration, and execute said iteration only if the test expression evaluates to True.

### For Loop

We also have the *for* loop, where we do have to specify the number of iterations. Let's look at some example code:

In [16]:
for i in range(0,5,1):
	print(i)


0
1
2
3
4
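The three numbers passed to `range` above are the start, the stop (which is excluded) and the step. A small sketch of the shorter forms follows; `list(...)` is used here only to display the values, and lists are covered later in this lesson. The variable names are chosen just for this example.

```python
# range(stop): starts at 0 and steps by 1
up_to_five = list(range(5))
print(up_to_five)        # [0, 1, 2, 3, 4]

# range(start, stop): steps by 1
two_to_five = list(range(2, 5))
print(two_to_five)       # [2, 3, 4]

# range(start, stop, step)
every_third = list(range(0, 10, 3))
print(every_third)       # [0, 3, 6, 9]

# A negative step counts downwards
countdown = list(range(5, 0, -1))
print(countdown)         # [5, 4, 3, 2, 1]
```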


## Nested Loops

Loops within loops

In [17]:
x = 4
for i in range(1, x + 1):
  for j in range(1, i + 1):
    print(j, end = '')
  print('')

1
12
123
1234


## Fibonacci Sequence Example

A Fibonacci series is a sequence whose first two elements are always 0 and 1, and every element after that is equal to the sum of the two elements before it, i.e. for n > 2, the n<sup>th</sup> element = (n-1)<sup>th</sup> element + (n-2)<sup>th</sup> element

0, 1, 1, 2, 3, 5, 8 ... , (n-1)<sup>th</sup> + (n-2)<sup>th</sup>

In [18]:
#code along with us!
x = int(input("Enter Sequence Length: "))

#Defining the first 2 numbers of the sequence
a = 0
b = 1

if (x <= 0):
    print("Please re-enter the sequence length")
elif x == 1:
    print("Last element is: ", a)
else:
    # After x-2 updates, b holds the x-th element of the sequence
    for i in range(x-2):
        a, b = b, a+b
    print("Last element is: ", b)

Enter Sequence Length: 10
Last element is:  34


## Operators
Operators perform operations on variables and constants. The different kinds of operators can be understood through examples.

### Arithmetic Operators


1. Addition (+)
2. Subtraction (-)
3. Multiplication (*)
4. Division (/)
5. Modulus(%)
6. Exponent(**)



In [19]:
a = 5
b = 6
#operations in order
print(a+b)
print(b-2)
print(a*4)
print(b/3)
print(a%b)
print(a**2 )


11
4
20
2.0
5
25
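Notice that `b/3` printed `2.0` rather than `2`: in Python 3 the `/` operator always returns a float. The sketch below contrasts it with floor division (`//`), an operator that is not in the list above but is closely related to `/` and `%`. The values are chosen only for illustration.

```python
a = 7
b = 2

print(a / b)    # 3.5  (true division, always a float)
print(a // b)   # 3    (floor division, rounds down to a whole number)
print(a % b)    # 1    (remainder of the division)
print(6 / 3)    # 2.0  (still a float, even when it divides exactly)
```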


### Comparison Operators


1.    Check equality (== )
2.    Check inequality (!=)
3.    Greater than (>)
4.    Less than(<)
5.    Greater than or equal to (>=)
6.    Less than or equal to (<=)




In [20]:
a = 5
b = 6
#operations in order
print(a==b)
print(b!=2)
print(a>4)
print(b<3)
print(a>=b)
print(a<=2)

False
True
True
False
False
False


### Logical Operators



1.  and
2.  or
3.  not



In [21]:
a = 5
b = 6
if(a==b and b>5):
 print("something")

if(a==5 or b==2):
 print("some other thing")

if(not b==9):
 print("b is not equal to 9")

some other thing
b is not equal to 9


## Example Problem: Adding Reversed Numbers

Based on the SPOJ.com problem: https://www.spoj.com/problems/ADDREV/

Given two integers, reverse their digits and add them.

In [22]:
#enter your code here
txt = input("Enter the numbers separated by a space: ")   #Takes in any input as a string
x, y = txt.split(' ')          # The split function splits any string according to the delimiter given

#Now since x and y are still strings, lets use string operations to reverse the number!!

def strrev(string):
    reversedString = ''
    index = len(string) # calculate length of string and save in index
    while index > 0: 
        reversedString += string[ index - 1 ] # append the character at string[index - 1] to reversedString
        index = index - 1 # decrement index
    return reversedString
y_reversed = int(strrev(y))
x_reversed = int(strrev(x))
print(x_reversed + y_reversed)

Enter the numbers separated by a space: 24 1
43
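For comparison, here is a shorter sketch of the same idea that relies on slicing with a step of -1 to reverse a string. The function name `reverse_and_add` is made up for this example, and it follows the problem as stated above (reverse both numbers, then add them).

```python
def reverse_and_add(x, y):
    # x and y are strings of digits, e.g. "24" and "1".
    # s[::-1] returns a reversed copy of the string s.
    return int(x[::-1]) + int(y[::-1])

print(reverse_and_add("24", "1"))   # 43
```

Because `int("021")` is 21, leading zeros produced by the reversal are dropped automatically.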


## Lists

A list is a collection of entries that is ordered and can be changed. Lists also allow duplicate entries, and the entries of a list don't need to be of the same datatype. Look at some of the examples given below.

In [23]:
a = ['Hello', 2, True, 'The End']
print(a)

['Hello', 2, True, 'The End']


You can access individual members of the list too!!

Remember that the indexing starts from 0


In [24]:
print(a[0], a[2])

Hello True


You can find out about the length of a list using the len function

In [25]:
print(len(a))

4


You learnt about for loops earlier right? Let's use that to make a slightly bigger list using the append function

In [26]:
b = []                    #Declare an empty list to which we will add elements

for num in range(10):
  b.append(num)           # It's that simple!!!!

print(b)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [27]:
# Code along with us
# Let us make a list called big_numbers and append all 4 digit numbers to the list
big_numbers = []

for num in range(1000, 10000):
    big_numbers.append(num)

Now let's take a scenario, where you made the list b and you've been asked to extract the second till the seventh element. Will you go through typing b[1], b[2], b[3] and so on? What if we take list big_numbers and you were required to extract 500 of those entries? Too much of a pain? Well worry not! Python allows for a simpler way to extract these elements

In [28]:
print( b[1:7] )   # Yes that's right, you can slice these elements in this manner. [start_index : end_index + 1]

[1, 2, 3, 4, 5, 6]


In [29]:
# You can also extract all the elements till the end in the following way
print( b[1:] )

[1, 2, 3, 4, 5, 6, 7, 8, 9]


Did you know that Python allows negative indices too? You might be wondering how that works. Whenever we give Python a negative index, we are asking it to count backwards from the end of the list: -1 is the last element, -2 is the one before it, and so on.

In [30]:
print( b[-1] ) # SPOILER ALERT: It will print the last element!

9


In [31]:
print( b[-4: -2])

[6, 7]
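Slices can also leave out the start, leave out the end, or take a third number as a step. A short sketch, rebuilding a list with the same contents as `b` so that it runs on its own:

```python
b = list(range(10))   # [0, 1, 2, ..., 9], same contents as the list b above

print(b[:3])      # the first three elements: [0, 1, 2]
print(b[-3:])     # the last three elements: [7, 8, 9]
print(b[::2])     # every second element: [0, 2, 4, 6, 8]
print(b[::-1])    # a reversed copy of the list
```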


Now what do we do if we want to increment all the elements of b by 2 and store it in a new list c? You'll most probably think of something like this

In [32]:
c = []

for i in range(len(b)):
  c.append(2 + b[i])

print(c)

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


While in theory this is essentially what is happening, Python allows for another way that is much shorter. It's called a List Comprehension. Look at the following example

In [33]:
c = [x+2 for x in b]

print(c)

[2, 3, 4, 5, 6, 7, 8, 9, 10, 11]


As you can see we've got the same results!!

Let's try out an example

In [39]:
# Code along with us!!
# Let us make a new list d whose entries are those entries in c which are greater than 6

d = [x for x in c if x>6]
d

[7, 8, 9, 10, 11]

Well what do we do if we want to change the 3rd element of the list c to the number 42?

In [35]:
# We simply do
c[2] = 42

print(c)

[2, 3, 42, 5, 6, 7, 8, 9, 10, 11]


As you can see the third element of the list has changed

Let us try removing one element from somewhere in between. Let's remove 7 from the list c

In [36]:
c.remove(7)

print(c)

[2, 3, 42, 5, 6, 8, 9, 10, 11]


## Tuples
A tuple is a collection of entries which is ordered in the same way as a list but cannot be changed.

All the regular list operations apply to tuples except for those which attempt to change its contents

In [37]:
# Let us define a tuple

tup_a = (1,2,3,4,5,6)

print(tup_a)

(1, 2, 3, 4, 5, 6)


Let us try changing the entry of a tuple and see what happens

In [38]:
tup_a[2] = 42

TypeError: 'tuple' object does not support item assignment

Oh no, it errored out :(
  
But this was expected right?

In [40]:
# Code along with us
# Make a new list e which has double the values of tuple tup_a and select the 3rd till 5th element (both inclusive) of both list e and tuple tup_a to compare

e = [2*x for x in tup_a]

print(e[2:5])
print(tup_a[2:5])

[6, 8, 10]
(3, 4, 5)
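To make the "same operations, but read-only" point concrete, here is a short sketch. The name `tup_b` and the list round trip are just one possible way to get a modified copy of a tuple.

```python
tup_a = (1, 2, 3, 4, 5, 6)

# Read-only operations work exactly as they do on lists.
print(len(tup_a))        # 6
print(tup_a[-1])         # 6
print(3 in tup_a)        # True
print(tup_a + (7, 8))    # builds a new tuple; tup_a itself is unchanged

# To get a "changed" tuple, convert to a list, modify it, and convert back.
as_list = list(tup_a)
as_list[2] = 42
tup_b = tuple(as_list)
print(tup_b)             # (1, 2, 42, 4, 5, 6)
```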


## Sets
A collection of items that is both unordered and unindexed and doesn't allow duplicate values is called a set. Sets are handy when you only care about which items are present, for example to remove duplicates or to check whether an item belongs to a collection.


In [None]:
thisset = {"apple", "banana", "cherry"}
print(thisset) 

#Another way to define it
thisset = set(["apple", "banana", "cherry"])
print(thisset)

{'apple', 'cherry', 'banana'}
{'apple', 'cherry', 'banana'}


Well now that we know how to define a set let's go ahead and see some set operations

In [None]:
one=set([4,5,6,7,1,2,3]) 
two=set([4,5,3,2,1,2]) 
 
print("The sets: ", one, two)
print ("Intersection: ",one & two) 
print ("Union: ",one | two) 
print("Difference: ",one - two)

The sets:  {1, 2, 3, 4, 5, 6, 7} {1, 2, 3, 4, 5}
Intersection:  {1, 2, 3, 4, 5}
Union:  {1, 2, 3, 4, 5, 6, 7}
Difference:  {6, 7}


As you can see with set two, the value 2, which was repeated, was dropped while creating the set
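That automatic de-duplication is often used on purpose. A minimal sketch, using a made-up list of fruit names, that removes duplicates and checks membership:

```python
names = ["apple", "banana", "apple", "cherry", "banana"]

unique = set(names)               # duplicates are dropped
print(len(names), len(unique))    # 5 3

print("apple" in unique)          # True
print("mango" in unique)          # False

unique.add("mango")               # adds a new item
unique.add("apple")               # no effect, "apple" is already present
print(len(unique))                # 4
```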

## Dictionaries

A dictionary is a collection of key-value pairs which is ordered, changeable and does not allow duplicate keys.

Too much terminology which doesn't make sense?

Don't worry about it! Let's consider an analogy. 

What do you do when you save your close friend Manish's phone number on your phone? You type in their name followed by their phone number so that when they call you, you don't just see 10 digits staring at your face but instead see your friend's name. Well while saving that friend's number you have essentially created a dictionary!!!!

You have used the friend's name as the key and his/her phone number as the value in the key value pair of the dictionary

In [None]:
phone_num = {'Manish': 9937593721}  # This is just a random number I typed which starts with 9 please don't call this number, I don't even know if it's valid.

print(phone_num)

{'Manish': 9937593721}


You go on and meet another friend of yours whose name is Harish. He gives you his number so once again you save this number

In [None]:
phone_num['Harish'] = 9203848363  # You can make a new entry by mentioning the key and the value as shown

print(phone_num)

{'Manish': 9937593721, 'Harish': 9203848363}


While you were talking to Harish, somewhere in the conversation Harish tells you that very recently he had switched his phone and he lost all his contacts. He asks you to give him Manish's number as they too are good friends. But wait none of us actually remember all our friends numbers do we? :P

Well the dictionary you created can be used for this purpose!

In [None]:
print( phone_num['Manish'] )

9937593721


While you're talking, Harish, who is a little absent-minded, suddenly remembers that he gave you his old phone number by mistake and now wants you to save his new number. Well, how do we do that with dictionaries?

In [None]:
phone_num['Harish'] = 9694038272   # It's that simple!

print(phone_num)

{'Manish': 9937593721, 'Harish': 9694038272}


And while he's at it, he asks you to also save his email ID just in case something goes wrong. Well you might be thinking how do we do that? 

Did you know that python also allows us to make a dictionary inside a dictionary?

In [None]:
phone_num['Harish'] = {'number': 9694038272, 'email': 'harishsinha2002@abc.com'}

print(phone_num)

{'Manish': 9937593721, 'Harish': {'number': 9694038272, 'email': 'harishsinha2002@abc.com'}}


Well how convenient is that?
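Reading from a nested dictionary is just a matter of chaining the keys. The sketch below rebuilds the same contacts so it can run on its own; the name 'Ramesh' is a made-up contact used only to show what happens when a key is missing.

```python
phone_num = {
    'Manish': 9937593721,
    'Harish': {'number': 9694038272, 'email': 'harishsinha2002@abc.com'},
}

# Chain the keys to reach a value inside the inner dictionary.
print(phone_num['Harish']['email'])

# phone_num['Ramesh'] would raise a KeyError because the key does not exist.
# .get() returns None (or a default you choose) instead.
print(phone_num.get('Ramesh'))               # None
print(phone_num.get('Ramesh', 'unknown'))    # unknown

# Loop over all key-value pairs.
for name, details in phone_num.items():
    print(name, '->', details)
```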

## Functions

Functions are "self contained" modules of code that accomplish a specific task. 
Functions usually "take in" data, process it, and "return" a result. 
Once a function is written, it can be used over and over and over again. 

In [None]:
def add_nums(a, b):
  return a+b

print(add_nums(2,3))

5


Well, at this point you might think, why do we even need functions? It would be easier to type a+b than add_nums(a,b)

But what happens when you have bigger pieces of code that you want to run?

Let's say you want to check if a number is even or odd

In [None]:
def odd_even(num):
  if num % 2 == 0:
    return 'Even'
  else:
    return 'Odd'

In [None]:
print(odd_even(4), odd_even(6), odd_even(5))

Even Even Odd


Didn't that just save us a lot of time and unnecessary typing of code?

Let's solve a question to actually get an idea of what's happening

Write a function in Python that takes in a list a and a number d. Rotate the list a to the left by one step, d times, and return the output

The detailed problem statement can be found [here](https://www.hackerrank.com/challenges/array-left-rotation/problem)

In [None]:
def rotLeft(a, d):
    shift = d % len(a)  
    arr = [0]*len(a)    
    for i in range(len(a)):
        arr[i-shift] = a[i]
    
    return arr
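The function above is defined but never called, so here is a quick usage sketch. It repeats `rotLeft` so that it runs on its own, and compares it with a slicing-based alternative; `rot_left_sliced` is a hypothetical name introduced here, not part of the original solution.

```python
def rotLeft(a, d):
    shift = d % len(a)          # rotating len(a) times changes nothing
    arr = [0] * len(a)
    for i in range(len(a)):
        # A negative index wraps around to the end of the list
        arr[i - shift] = a[i]
    return arr

def rot_left_sliced(a, d):
    # Alternative sketch: glue the tail of the list in front of its head
    shift = d % len(a)
    return a[shift:] + a[:shift]

print(rotLeft([1, 2, 3, 4, 5], 2))          # [3, 4, 5, 1, 2]
print(rot_left_sliced([1, 2, 3, 4, 5], 2))  # [3, 4, 5, 1, 2]
print(rotLeft([1, 2, 3, 4, 5], 7))          # same as rotating by 2
```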

# **Python Libraries**


What is a Library? <br> A collection of modules used for specific applications.<br>
Popular Libraries in Python

*   NumPy
*   Pandas
*   OpenCV
*   Matplotlib
*   Scipy
*   Tensorflow
*   PyTorch 
<br> and many more...



## Importing a Library
We use 'import' to get access to the functions of the library

In [None]:
# importing the numpy library under the alias np
# Syntax: import <library_name> as <shorthand_notation> 
# Note: as part is optional, but recommended
import numpy as np

Here, we use a short hand notation for NumPy as 'np' while importing. Trust me, it saves time :)
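The same `import` syntax works for any module. The sketch below uses the standard-library `math` module (chosen only because it ships with Python) to show the three common forms side by side; the alias `m` is arbitrary.

```python
import math                  # use names as math.<name>
import math as m             # same module under a shorter alias
from math import sqrt, pi    # import specific names directly

print(math.sqrt(16))   # 4.0
print(m.sqrt(16))      # 4.0
print(sqrt(16))        # 4.0
print(pi)
```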

# Introduction to NumPy
What is NumPy? <br>
A python library used to deal with large multidimensional arrays. <br>


In [None]:
#Code along with us
#x=np.array([9,23,32,47,53])
#print(x, type(x))

Bro, we already have Lists and Tuples right? Why NumPy? <br>
Let's compare

In [None]:
N = 100000

Let's create a list with N entries containing integers from 0 to N-1 and multiply it element-wise with itself.<br> %%time will time the operation.

In [None]:
%%time
list_ = list(range(N))
for i in range(N):
    list_[i] = list_[i] * list_[i]

CPU times: user 21.9 s, sys: 2.03 s, total: 23.9 s
Wall time: 24 s


Now, let's replicate this process using NumPy. Observe the simplicity in syntax and the latency

In [None]:
%%time
arr = np.arange(N)
arr = arr * arr

CPU times: user 209 ms, sys: 461 ms, total: 670 ms
Wall time: 680 ms
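`%%time` only works inside IPython/Jupyter. If you are running a plain Python script, a rough equivalent can be sketched with `time.perf_counter` from the standard library. The timings you get will depend on your machine, so none are shown here; `dtype=np.int64` is set explicitly as a precaution so the squares do not overflow on platforms where the default integer is 32-bit.

```python
import time
import numpy as np

N = 100000

# Time the pure-Python loop
start = time.perf_counter()
list_ = list(range(N))
for i in range(N):
    list_[i] = list_[i] * list_[i]
list_seconds = time.perf_counter() - start

# Time the NumPy version
start = time.perf_counter()
arr = np.arange(N, dtype=np.int64)
arr = arr * arr
numpy_seconds = time.perf_counter() - start

print(f"list loop: {list_seconds:.4f} s")
print(f"NumPy:     {numpy_seconds:.4f} s")

# Both approaches compute the same values
print(list_[:5], arr[:5])
```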


## Creating NumPy arrays

In [None]:
# Creating an array
arr = np.arange(12)
# Datatype of entries in np array
print(arr.dtype)
# Dimension of np array
print(arr.ndim)
# Shape of array
print(arr.shape)
# Size of array
print(arr.size)

int64
1
(12,)
12


Let's create a 2 Dimensional array

In [42]:
# Code along with us
arr = np.random.randn(2,3)
arr

array([[-2.04167351, -1.57521836,  0.30600353],
       [-0.86817682, -0.85993279, -0.47049219]])

## Understanding dimension and shape of an array
Try making a 3 Dimensional array. How about a 4 dimensional array? It's getting more complicated? <br>
No worries, we have NumPy functions to create some basic arrays of required dimensions

In [None]:
# Create a 2x3x4 array with ones
np.ones((2,3,4))

array([[[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]],

       [[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]]])

In [45]:
# Create a 10x3 array with zeros
#Fill the code
arr2 = np.zeros((10,3))
arr2

array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

Let's first define a **function** that gives all info about nature of array

In [46]:
# Function code
def array_info(arr):
    print("dtype is",arr.dtype)
    print("shape is",arr.shape)
    print("Dimension is",arr.ndim)
    print("Size is",arr.size,"\n")  #\n to separate the text in next function call with a new line char


In [47]:
# Calling the function
array_info(arr2)

dtype is float64
shape is (10, 3)
Dimension is 2
Size is 30 



Observe the difference between the following arrays

In [48]:
a = np.ones((6))
b = np.ones((6,1))
c = np.ones((2,3))
array_info(a)
array_info(b)
array_info(c)

dtype is float64
shape is (6,)
Dimension is 1
Size is 6 

dtype is float64
shape is (6, 1)
Dimension is 2
Size is 6 

dtype is float64
shape is (2, 3)
Dimension is 2
Size is 6 
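All three arrays hold six ones, yet they behave differently because of their shapes. A small sketch of what that means in practice for the first two, using indexing and transposition:

```python
import numpy as np

a = np.ones((6))       # shape (6,): a single axis
b = np.ones((6, 1))    # shape (6, 1): six rows, one column

print(a[0])            # a single number
print(b[0])            # a 1-D array holding one number
print(b[0].shape)      # (1,)

# Transposing only has a visible effect when there are two axes.
print(a.T.shape)       # (6,)
print(b.T.shape)       # (1, 6)
```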



Try creating different sizes of arrays, play around, get the feel

## Reshaping and Flattening

In [50]:
a.reshape(3,2)

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

Try converting c to a one-dimensional array without changing the elements using np.reshape()

In [52]:
# Code with us
c.reshape((6))

array([1., 1., 1., 1., 1., 1.])

Use np.ravel to do the same

In [53]:
c.ravel()

array([1., 1., 1., 1., 1., 1.])
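The same can be done for `arr2` from earlier. The sketch below recreates it and also shows two related details: passing `-1` asks NumPy to work out that dimension for you, and a reshape fails if the total number of elements would change.

```python
import numpy as np

arr2 = np.zeros((10, 3))           # 30 elements, as defined earlier

flat = arr2.reshape(-1)            # -1: let NumPy infer the length
print(flat.shape)                  # (30,)
print(arr2.ravel().shape)          # (30,)
print(arr2.reshape(5, -1).shape)   # (5, 6)

# 4 * 8 = 32 is not 30, so this reshape is rejected.
try:
    arr2.reshape(4, 8)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print("reshape failed:", err)
```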

## Random normal and random uniform in Numpy
Ever heard of normal and uniform distribution? <br>
*   Normal distribution is a probability distribution of a random variable, which has bell shaped curve<br> 
*   In uniform distribution, probability of choosing any number at random is equally likely. <br>
*   These probability distributions are extremely useful in a broad spectrum of engineering disciplines. We will look at this in more detail in a later part of the session

In [54]:
# Creating 2x3 array whose elements are randomly sampled from a normal distribution with mean =0 and std = 1
np.random.randn(2,3)

array([[ 0.54522357,  0.96554642,  2.09053776],
       [ 1.73371611, -0.36783196, -0.55712886]])

In [55]:
# Creating 2x3 array whose elements are randomly sampled from a uniform distribution of [0,1)
np.random.rand(2,3)

array([[0.3110705 , 0.61934981, 0.9789628 ],
       [0.37606582, 0.13196872, 0.35608532]])

np.random.randint samples only integers, try creating one

In [56]:
np.random.randint?
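If you want something to compare against, here is one possible sketch. It simulates dice rolls; the name `dice` and the seed value are arbitrary choices. Note that the lower bound is included and the upper bound is excluded.

```python
import numpy as np

np.random.seed(0)   # optional: makes the random numbers repeatable

# Integers from 1 (inclusive) to 7 (exclusive), arranged as a 2x3 array
dice = np.random.randint(1, 7, size=(2, 3))
print(dice)
print(dice.dtype)
```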

## Other Data types in NumPy array

In [57]:
# Create an array with Boolean entries
arr_bool = np.array([True, False, True, True])
print(arr_bool, type(arr_bool))


[ True False  True  True] <class 'numpy.ndarray'>


In [58]:
# Create an array with str type elements
arr_str = np.array(['1.4','3.14','6.314'])


Can I use alphabetical characters as elements of NumPy array?
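A quick experiment can answer that. In the sketch below, `letters` is a made-up array; NumPy stores such values using a fixed-width Unicode string dtype (its `kind` is `'U'`). The sketch also shows that numeric strings like the ones in `arr_str` can be converted to numbers with `astype`.

```python
import numpy as np

letters = np.array(['a', 'b', 'c'])
print(letters)
print(letters.dtype.kind)    # 'U' means Unicode strings

arr_str = np.array(['1.4', '3.14', '6.314'])
print(arr_str.dtype.kind)    # also 'U': these are strings, not numbers

# Convert the numeric strings to floats before doing arithmetic
nums = arr_str.astype(float)
print(nums + 1)
```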

## NumPy Operations

In [62]:
arr1 = np.random.randint(0,10,(2,3))
arr2 = np.random.randint(0,10,(2,3))
print(arr1)
print(arr2)

[[4 1 0]
 [4 2 2]]
[[7 5 9]
 [5 5 4]]


In [63]:
# Addition
print(arr1+arr2)
# Subtraction
print(arr1-arr2)
# Multiplication
print(arr1*arr2)    #Notice that it multiplies element-wise
# Division (Element-wise)
print(arr1/arr2)

[[11  6  9]
 [ 9  7  6]]
[[-3 -4 -9]
 [-1 -3 -2]]
[[28  5  0]
 [20 10  8]]
[[0.57142857 0.2        0.        ]
 [0.8        0.4        0.5       ]]


Let's look at element-wise division in more detail

In [64]:
arr = np.zeros((2,3))
arr_inv = 1/arr

  arr_inv = 1/arr


In [65]:
print(arr_inv)

[[inf inf inf]
 [inf inf inf]]


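We did get a **run-time warning** (division by zero), but not an error: NumPy still returned a result, with `inf` in place of each 1/0.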

In [66]:
np.isinf(arr_inv)

array([[ True,  True,  True],
       [ True,  True,  True]])

With lists, we saw that we had to iterate through all the elements to increment each one by some value. NumPy makes our life easy

In [67]:
print(2*arr1 + 2)

[[10  4  2]
 [10  6  6]]


To learn more about how this works, look up **NumPy broadcasting**
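As a first taste of broadcasting, the sketch below combines a 2x3 array with a scalar, a row and a column. `arr1` is filled with the same values that were printed earlier; `row` and `col` are made-up names for this example.

```python
import numpy as np

arr1 = np.array([[4, 1, 0],
                 [4, 2, 2]])        # shape (2, 3)
row = np.array([10, 20, 30])        # shape (3,)
col = np.array([[100], [200]])      # shape (2, 1)

# The scalar is applied to every element
print(arr1 + 2)

# The row is added to each row of arr1
print(arr1 + row)    # [[14 21 30], [14 22 32]]

# The column is added to each column of arr1
print(arr1 + col)    # [[104 101 100], [204 202 202]]
```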

Some important mathematical operations. Try running the code below

In [68]:
np.sin(arr1)

array([[-0.7568025 ,  0.84147098,  0.        ],
       [-0.7568025 ,  0.90929743,  0.90929743]])

In [69]:
np.exp(arr1)

array([[54.59815003,  2.71828183,  1.        ],
       [54.59815003,  7.3890561 ,  7.3890561 ]])

## Statistical operations

In [70]:
# Initialise a NumPy array with some values
arr = np.random.randint(0,50,(3,4))

In [71]:
np.amin(arr)

1

In [72]:
np.amax(arr)

45

In [73]:
np.mean(arr)

24.666666666666668

In [74]:
#axis 0 and axis 1

Browse for NumPy functions to find median, variance, standard deviation and percentile
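Here is a minimal sketch of what `axis` does, together with the functions mentioned above. It uses a small fixed array instead of random values so the results are easy to check by hand.

```python
import numpy as np

arr = np.array([[1,  2,  3,  4],
                [5,  6,  7,  8],
                [9, 10, 11, 12]])    # shape (3, 4)

# axis=0 collapses the rows: one result per column
print(np.sum(arr, axis=0))     # [15 18 21 24]
# axis=1 collapses the columns: one result per row
print(np.sum(arr, axis=1))     # [10 26 42]
print(np.mean(arr, axis=0))    # [5. 6. 7. 8.]

# Without an axis, the whole array is used
print(np.median(arr))          # 6.5
print(np.var(arr))             # variance
print(np.std(arr))             # standard deviation
print(np.percentile(arr, 50))  # 6.5, the 50th percentile is the median
```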

## Exercise Problem 1
Write a program to multiply two matrices of size $(100, 100)$ in two ways: (a) by using `np.dot(mat_1, mat_2)` and (b) by using for-loops. Compare the time of execution in both cases. Check out the documentation of `np.dot` in case it is not familiar to you.

In [None]:
#Initialise the two matrices
mat_1 = np.random.rand(100,100)
mat_2 = np.random.rand(100,100)
#Initialise the output matrix with zeros


In [None]:
## Using the definition of matrix multiplication

In [None]:
## Using np.dot function

## Exercise Problem 2
Create two vectors $y$ and $\hat{y}$ having **same** dimensions, where $\hat{y}$ should consist of random numbers between $[0, 1)$ and $y$ should contain $0s$ and $1s$, for example $y = [0, 1, 1, 0, 1, 0, 0, 1, ..., 1]$. Compute the given expression: $$O = -\frac{1}{n}\sum_{i=1}^{n}[y_i\log_2(\hat{y_i}) + (1-y_i)\log_2(1-\hat{y_i})]$$
where $n$ = 100, is the total number of elements in $y$ and $\hat{y}$.

In [None]:
#Given n = 100
#Create a 1D array y, of size 100 with randomly selected 0s and 1s
#Create a 1D array y_hat, of size 100 with numbers randomly (uniformly) selected from [0,1)
#Find logarithmic loss (also known as cross-entropy loss)


The expression $O = -\frac{1}{n}\sum_{i=1}^{n}[y_i\log_2(\hat{y_i}) + (1-y_i)\log_2(1-\hat{y_i})]$, which you have computed, is actually the **Cross-Entropy** loss function used in machine learning for classification tasks. It tells us how well or badly a model is performing: the larger $O$ is, the worse the model is performing, and vice versa.
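If you would like to check your answer, here is one possible sketch of the computation. The function name `log_loss_base2` and the small constant `eps` are assumptions added here: `eps` clips $\hat{y}$ away from exactly 0 and 1 so that $\log_2(0)$ is never evaluated.

```python
import numpy as np

def log_loss_base2(y, y_hat, eps=1e-12):
    # Assumption: clip predictions so log2 never receives exactly 0
    y_hat = np.clip(y_hat, eps, 1 - eps)
    # Vectorised form of the sum in the expression above
    return -np.mean(y * np.log2(y_hat) + (1 - y) * np.log2(1 - y_hat))

np.random.seed(0)                  # optional: makes the run repeatable
n = 100
y = np.random.randint(0, 2, n)     # random 0s and 1s
y_hat = np.random.rand(n)          # uniform numbers in [0, 1)

O = log_loss_base2(y, y_hat)
print(O)

# Sanity check: a prediction of 0.5 for a true label of 1 costs exactly 1 bit
print(log_loss_base2(np.array([1]), np.array([0.5])))   # 1.0
```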

# Matplotlib


Matplotlib is a popular plotting library. Its name comes from "MATLAB like Plotting library" <br> In this section we will be using a module called pyplot from matplotlib. A wide range of statistical plots like bar diagrams, pie charts, histograms etc. can be drawn using Matplotlib<br>

In [None]:
from matplotlib import pyplot as plt

## Plotting graphs
Let's make a simple $y$ vs $x$ graph for the function $y(x)$ = $3x$ + 4

In [None]:
x = np.linspace(-10,10,100)
y = 3*x + 4

In [None]:
plt.plot(x,y)

Let's try plotting any quadratic function, say $y(x)$ = $x^2$

In [None]:
y = x**2
plt.plot(x,y)

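Observe that the plot function draws a smooth-looking curve even though we only gave it discrete points as inputs. It does this by joining consecutive points with straight line segments; with 100 closely spaced points, the segments are too short to notice.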

Let's consider noisy data, i.e. data which doesn't lie perfectly on a line or curve

In [None]:
# Samples random values from standard normal distribution (Size = 100)
noise = np.random.randn(100)
# Create a lin-spaced array x with 100 elements
x = np.linspace(-10,10,100)
y = 3*x + noise
plt.plot(x,y);

The plot function still simply joins consecutive points with line segments, so the noise shows up as a jagged line around the underlying trend

## Scatter Plots

In the above example, let's increase the data from 100 to 1000

In [None]:
noise = np.random.randn(1000)
x = np.linspace(-10,10,1000)
y = 3*x + 6*noise
plt.plot(x,y);

The plot does look messy, right? We use scatter plots when we want to plot the data points on their own, without lines joining them

In [None]:
plt.scatter(x,y,s=2);

## Histograms 

To get a better understanding of normal distribution, we can use histograms

In [None]:
# A 1D array of size 10k
x = np.random.randn(10000)
plt.hist(x);

plt.hist() function takes an argument called 'bins'. When it takes positive integral value it defines the number of equal-width bins (Class Intervals) in the range.

In [None]:
plt.hist(x, bins = 11);

Play around with the value of bins, see what happens when you increase it
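To see the numbers behind the picture, the sketch below uses `np.histogram`, which does the same kind of binning without drawing anything. With `bins=11` it returns 11 counts and 12 bin edges; the seed is an arbitrary choice.

```python
import numpy as np

np.random.seed(0)                  # optional: makes the run repeatable
x = np.random.randn(10000)

counts, edges = np.histogram(x, bins=11)

print(len(counts))     # 11 bins
print(len(edges))      # 12 edges (one more than the number of bins)
print(counts.sum())    # 10000, every sample falls into some bin

# All bins have the same width
widths = np.diff(edges)
print(np.allclose(widths, widths[0]))   # True
```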