# Inferential Statistics


Often, an analysis calls for a very large amount of data that would take too much time and too many resources to acquire. In such situations, you are forced to work with a smaller sample of the data instead of the entire data set.

Situations like these arise all the time at big companies like Amazon. For example, say the Amazon QC department wants to know what proportion of the products in its warehouses are defective. Instead of going through all of its products (which would be a lot!), the Amazon QC team can just check a small sample of 1,000 products and then find, for this sample, the defect rate (i.e. the proportion of defective products). Then, based on this sample's defect rate, the team can "infer" what the defect rate is for all the products in the warehouses.

This process of “inferring” insights from sample data is called “Inferential Statistics”.

Note that even after using inferential statistics, you would only be able to estimate the population data from the sample data, but not find the exact values. This is because when you don't have the exact data, you can only make reasonable estimates about it with a limited level of certainty. Therefore, when certainty is limited, we talk in terms of probability.


**Prerequisites**

You’ll need to brush up on your probability concepts before you begin this module, specifically the following:

* Basic definition of probability
* Multiplication rule of probability
* Addition rule of probability
* Combinatorics (nCr)

# Problem 1
What is the probability that you would get this combination of balls after 4 trials? (One trial = taking out a ball, noting its colour, and putting it back in the bag.)

Note that the bag contains 2 blue and 3 red balls.

![2.png](attachment:12654494-7ed3-49b6-bc73-e41bfe425092.png)

i.e., the combination Blue-Red-Red-Red.



Options: 
1. 0.4 × 0.4 × 0.4 × 0.4
2. 0.6 × 0.6 × 0.6 × 0.6
3. 0.4 × 0.6 × 0.6 × 0.6
4. 0.4 × 0.4 × 0.6 × 0.6

![3.png](attachment:e49ba431-ebaa-4343-afee-021e2380dd31.png)
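As a rough check of Problem 1, the multiplication rule for independent draws can be sketched in Python. The helper name `sequence_probability` and the `"B"`/`"R"` encoding are illustrative choices, not part of the original exercise; the probabilities come from the bag of 2 blue and 3 red balls.

```python
# Probability of one specific sequence of independent draws (with replacement).
P_BLUE = 2 / 5  # 2 blue balls out of 5
P_RED = 3 / 5   # 3 red balls out of 5


def sequence_probability(sequence):
    """Multiply the per-draw probabilities for a sequence such as 'BRRR'."""
    prob = 1.0
    for colour in sequence:
        prob *= P_BLUE if colour == "B" else P_RED
    return prob


p_brrr = sequence_probability("BRRR")
print(round(p_brrr, 4))  # 0.0864
```

Because each ball is put back, every draw has the same probabilities, so the order of the factors does not change the product.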

# Problem 2

What is the probability that you will get 3 red balls after 4 trials? (One trial = taking out a ball, noting its colour, and putting it back in the bag.)


![4.png](attachment:135ca105-de40-45ff-b075-f5d873e3fdbe.png)
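One way to approach Problem 2 is to list every possible sequence of 4 draws and add up the probabilities of the sequences that contain exactly 3 red balls. The sketch below assumes the same bag as Problem 1 (P(red) = 0.6, P(blue) = 0.4); the function name `prob_of_k_reds` is hypothetical.

```python
from itertools import product

# Per-draw probabilities for the bag with 3 red and 2 blue balls.
P = {"R": 0.6, "B": 0.4}


def prob_of_k_reds(k, trials=4):
    """Sum the probabilities of all sequences with exactly k red balls."""
    total = 0.0
    for seq in product("RB", repeat=trials):
        if seq.count("R") == k:
            p = 1.0
            for colour in seq:
                p *= P[colour]
            total += p
    return total


p_three_reds = prob_of_k_reds(3)
print(round(p_three_reds, 4))  # 0.3456
```

Four sequences qualify (the single blue ball can sit in any of the 4 positions), and each has probability 0.4 × 0.6 × 0.6 × 0.6.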

# Basics of Probability

**Topics:**

* Random variables
* Probability distributions
* Expected value


**Three-step process for a probability question:**

1. Find all the possible combinations
2. Find the probability of each combination
3. Use the probabilities to estimate the average amount of money made per player

**Find all the possible combinations:**

![5.png](attachment:24a18616-71e5-4441-a11d-081504abe7f4.png)

**Find the probability of each combination**

![6.png](attachment:25ebd58f-23ce-4694-8a85-18ad9a3a5dd3.png)

So, the random variable X basically converts outcomes of experiments to something measurable.

![13.png](attachment:43901c1a-5576-4f8d-a515-ded451ba8f72.png)

![14.png](attachment:228c9056-7a73-4aa5-88d1-1d9f59a64e31.png)

![12.png](attachment:6ae38f43-1593-4743-a75a-557f5ddfa35c.png)


**Use the probabilities to estimate the average amount of money made per player**

1. **Experimental probability of each number of red balls:**

![9.png](attachment:52829ae1-5894-4d35-965d-c6a8b3d643e8.png)

2. **If the number of players is 1,000, find how many players get 0, 1, 2, 3 and 4 red balls:**
    
    * X (random variable) -> number of red balls
    * P(X) -> probability of X
    * Number of players with X red balls = number of players * P(X)
    

        Total number of players = 1000
        
        Number of players with 0 red ball(s) = 1000 * 0.027 = 27
        Number of players with 1 red ball(s) = 1000 * 0.16 = 160
        Number of players with 2 red ball(s) = 1000 * 0.347 = 347
        Number of players with 3 red ball(s) = 1000 * 0.333 = 333
        Number of players with 4 red ball(s) = 1000 * 0.133 = 133


3. **EXPECTED VALUE**

        Number of players with 0 red ball(s) = 27
        Number of players with 1 red ball(s) = 160
        Number of players with 2 red ball(s) = 347
        Number of players with 3 red ball(s) = 333
        Number of players with 4 red ball(s) = 133
        Total number of red balls = 0 * 27 + 1 * 160 + 2 * 347 + 3 * 333 + 4 * 133 = 2385
  
        Average number of red balls for 1 game = 2385 / 1000 = 2.385
        

4. **EXPECTED VALUE CALCULATION**

        X can take the values x1, x2, x3, ..., xn
        EV = x1*P(X = x1) + x2*P(X = x2) + x3*P(X = x3) + ... + xn*P(X = xn)

        Here, P(X = x) denotes the probability that X is equal to x

        So, EV = 0*(0.027) + 1*(0.160) + 2*(0.347) + 3*(0.333) + 4*(0.133) = 2.385
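The expected value calculation above can be reproduced with a short Python sketch. The dictionary holds the experimental probabilities from this section; the function name `expected_value` is an illustrative choice.

```python
# Experimental distribution of X (number of red balls) from the section above.
distribution = {0: 0.027, 1: 0.160, 2: 0.347, 3: 0.333, 4: 0.133}


def expected_value(dist):
    """Sum of x * P(X = x) over every value x that X can take."""
    return sum(x * p for x, p in dist.items())


ev = expected_value(distribution)
print(round(ev, 3))  # 2.385
```

This gives the same result as dividing the total number of red balls (2385) by the 1,000 players, because each player count is just 1000 × P(X).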


# Probability Distribution
![8.png](attachment:bb903dc4-a763-493a-af4f-4d30471301c9.png)

![9.png](attachment:27b041f4-7210-405c-bbd4-a31968228033.png)

#### Probability Distribution Chart

![10.png](attachment:6951a51f-a656-4b46-8468-b5db550c5d4c.png)


For example, let’s say as a data analyst at a bank, you are trying to find out which of the customers will default on their loan, i.e. stop paying their loans. Based on some data, you have been able to make the following predictions:


| Customer No. | Yearly Income (in rupees) | Amount of Loan Due (in rupees) | Number of Dependents | Default Prediction |
|--------------|---------------------------|--------------------------------|-----------------------|---------------------|
| 1            | ₹10 lakh                  | ₹75 lakh                       | 3                     | Yes                 |
| 2            | ₹15 lakh                  | ₹50 lakh                       | 2                     | No                  |
| 3            | ₹20 lakh                  | ₹40 lakh                       | 1                     | No                  |



Now, **instead of processing the yes/no response, it is much easier to define a random variable X** that indicates whether the customer is predicted to default or not. The values are assigned according to this rule:

X = 1, if the customer defaults

X = 0, if the customer does not default


Now, the data changes to the following:

| Customer No. | Yearly Income (in rupees) | Amount of Loan Due (in rupees) | Number of Dependents | X (random variable) |
|--------------|---------------------------|----------------------------------|-----------------------|----------------------|
| 1            | ₹10 lakh                  | ₹75 lakh                         | 3                     | 1                    |
| 2            | ₹15 lakh                  | ₹50 lakh                         | 2                     | 0                    |
| 3            | ₹20 lakh                  | ₹40 lakh                         | 1                     | 0                    |


In this form, the table is entirely **quantified, i.e. converted to numbers**. Now that the data is entirely in quantitative terms, it becomes **possible to perform a number of different kinds of statistical analyses** on it.


### Rules
**In a valid, complete probability distribution:**
   * no probability value is negative
   * all the probability values add up to 1

A probability distribution and the corresponding frequency distribution are **similar in shape, just with different scales**.
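The two validity rules can be checked mechanically. The following is a minimal sketch; the function name `is_valid_distribution` and the tolerance value are assumptions made for illustration, with the tolerance there only to absorb floating-point rounding.

```python
import math


def is_valid_distribution(probabilities, tol=1e-9):
    """Check the two rules: no negative values, and a total of 1."""
    no_negatives = all(p >= 0 for p in probabilities)
    sums_to_one = math.isclose(sum(probabilities), 1.0, abs_tol=tol)
    return no_negatives and sums_to_one


red_ball_probs = [0.027, 0.160, 0.347, 0.333, 0.133]
print(is_valid_distribution(red_ball_probs))    # True
print(is_valid_distribution([0.5, 0.6, -0.1]))  # False
```

The second list adds up to 1 but still fails, because it contains a negative value.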

#### Expected Value
So, the **expected value** of a variable X is the value of X we would “expect” to get after performing the experiment once. It is also called the expectation, average, or mean value. Mathematically speaking, for a random variable X that can take the values $x_1, x_2, x_3, \ldots, x_n$, the expected value (EV) is given by:

 > $EV(X) = x_1 P(X = x_1) + x_2 P(X = x_2) + x_3 P(X = x_3) + \ldots + x_n P(X = x_n)$

As you may recall, for our red ball game, the expected value came out to be 2.385. What does that mean? How does that even help us with our original question, which was how much money, on average, are the players expected to make?

The **expected value should be interpreted as the average value** you get after the **experiment has been conducted an infinite number of times**. For example, the expected value for the number of red balls is 2.385. This means that if we conduct the experiment (play the game) an infinite number of times, the average number of red balls per game would end up being 2.385.



So, you can clearly see that, after a large number of simulations, the average value does, in fact, converge towards the expected value, which is 2.385.
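The convergence can be illustrated with a small simulation. This sketch does not model the physical bag; it assumes each game's result is drawn from the experimental distribution tabulated earlier, and the function name `simulated_average` and the fixed seed are illustrative choices.

```python
import random

# Experimental distribution of the number of red balls per game.
values = [0, 1, 2, 3, 4]
weights = [0.027, 0.160, 0.347, 0.333, 0.133]


def simulated_average(n_games, seed=0):
    """Average number of red balls over n_games simulated games."""
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    draws = rng.choices(values, weights=weights, k=n_games)
    return sum(draws) / n_games


# The average should drift towards 2.385 as the number of games grows.
for n in (10, 1_000, 100_000):
    print(n, round(simulated_average(n), 3))
```

With only 10 games the average can land far from 2.385; with 100,000 games it should typically agree to about two decimal places.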


# Discrete Probability Distributions

* Binomial probability distribution
* Uniform probability distribution
* Cumulative probability

**Prerequisites:**
1. [Addition rule of probability for mutually exclusive events](https://www.mathgoodies.com//lessons/vol6/addition_rules)
2. [Multiplication rule of probability for independent events](https://www.mathgoodies.com//lessons/vol6/independent_events)
3. [nCr (Combinatorics)](https://www.calculatorsoup.com/calculators/discretemathematics/combinations.php)

## Addition Rule

To find the probability of event A or B, we must first determine whether the events are mutually exclusive or non-mutually exclusive. Then we can apply the appropriate **addition rule**.

### Addition Rule 1
When two events, A and B, are mutually exclusive, the probability that A or B will occur is the sum of the probability of each event. 

P(A or B) = P(A) + P(B)

### Addition Rule 2
When two events, A and B, are non-mutually exclusive, there is some overlap between these events. The probability that A or B will occur is the sum of the probability of each event, minus the probability of the overlap.

P(A or B) = P(A) + P(B) – P(A and B)


**Addition Rule 1 (Mutually Exclusive):** 

**Experiment 1:** A single 6-sided die is rolled. What is the probability of rolling a 2 or a 5?

1. The number rolled can be a 2.

2. The number rolled can be a 5.

**Events:** These events are mutually exclusive since they cannot occur at the same time.

**Probabilities:** How do we find the probabilities of these mutually exclusive events? We need a rule to guide us.

**Addition Rule 1:** When two events, A and B, are mutually exclusive, the probability that A or B will occur is the sum of the probability of each event.


Probabilities:

$P(2) = \frac{1}{6}$

$P(5) = \frac{1}{6}$

$P(2 \text{ or } 5) = P(2) + P(5) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3}$

**Experiment 2:** A glass jar contains 1 red, 3 green, 2 blue, and 4 yellow marbles. If a single marble is chosen at random from the jar, what is the probability that it is yellow or green?

Probabilities:

$P(\text{yellow}) = \frac{4}{10}$

$P(\text{green}) = \frac{3}{10}$

$P(\text{yellow or green}) = P(\text{yellow}) + P(\text{green}) = \frac{4}{10} + \frac{3}{10} = \frac{7}{10}$


**Addition Rule 2 (Non-Mutually Exclusive)**


**Experiment 4:** In a math class of 30 students, 17 are boys and 13 are girls. On a unit test, 4 boys and 5 girls made an A grade. If a student is chosen at random from the class, what is the probability of choosing a girl or an A student?

**Probabilities:**

$P(\text{girl or A}) = P(\text{girl}) + P(\text{A}) - P(\text{girl and A})$

$= \frac{13}{30} + \frac{9}{30} - \frac{5}{30}$

$= \frac{17}{30}$

**Experiment 6:** On New Year’s Eve, the probability of a person having a car accident is 0.09. The probability of a person driving while intoxicated is 0.32 and probability of a person having a car accident while intoxicated is 0.15. What is the probability of a person driving while intoxicated or having a car accident?

**Probabilities:**

$P(\text{intoxicated or accident}) = P(\text{intoxicated}) + P(\text{accident}) - P(\text{intoxicated and accident})$

$= 0.32 + 0.09 - 0.15$

$= 0.26$
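Both addition rules can be captured by one small function, since Rule 1 is just Rule 2 with an overlap of zero. The sketch below uses Python's `fractions.Fraction` to keep the results exact; the function name `p_a_or_b` is a hypothetical helper, not taken from the source lessons.

```python
from fractions import Fraction


def p_a_or_b(p_a, p_b, p_a_and_b=0):
    """P(A or B) = P(A) + P(B) - P(A and B); the overlap is 0 if A and B are mutually exclusive."""
    return p_a + p_b - p_a_and_b


# Rule 1: rolling a 2 or a 5 (mutually exclusive events).
p_two_or_five = p_a_or_b(Fraction(1, 6), Fraction(1, 6))

# Rule 2: choosing a girl or an A student (the events overlap).
p_girl_or_a = p_a_or_b(Fraction(13, 30), Fraction(9, 30), Fraction(5, 30))

print(p_two_or_five)  # 1/3
print(p_girl_or_a)    # 17/30
```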

## Multiplication Rule: 
The probability of two or more independent events occurring in sequence can be found by computing the probability of each event separately, and then multiplying the results together.

**Definition:** Two events, A and B, are independent if the fact that A occurs does not affect the probability of B occurring.

**Some other examples of independent events are:**

* Landing on heads after tossing a coin **AND** rolling a 5 on a single 6-sided die.
* Choosing a marble from a jar **AND** landing on heads after tossing a coin.
* Choosing a 3 from a deck of cards, replacing it, **AND** then choosing an ace as the second card.
* Rolling a 4 on a single 6-sided die, **AND** then rolling a 1 on a second roll of the die.

To find the probability of two independent events that occur in sequence, find the probability of each event occurring separately, and then multiply the probabilities. This multiplication rule is defined symbolically below. Note that multiplication is represented by AND.

### Multiplication Rule 1

When two events, A and B, are independent, the probability of both occurring is:

P(A and B) = P(A) · P(B)

**Experiment 1:** A dresser drawer contains one pair of socks with each of the following colors: blue, brown, red, white and black. Each pair is folded together in a matching set. You reach into the sock drawer and choose a pair of socks without looking. You replace this pair and then choose another pair of socks. What is the probability that you will choose the red pair of socks both times?

Probabilities:

$P(\text{red}) = \frac{1}{5}$

$P(\text{red and red}) = P(\text{red}) \cdot P(\text{red})$

$= \frac{1}{5} \cdot \frac{1}{5}$

$= \frac{1}{25}$

**Experiment 2:** A coin is tossed and a single 6-sided die is rolled. Find the probability of landing on the head side of the coin and rolling a 3 on the die.

$P(\text{head}) = \frac{1}{2}$

$P(3) = \frac{1}{6}$

$P(\text{head and } 3) = \frac{1}{2} \cdot \frac{1}{6} = \frac{1}{12}$



### Multiplication Rule 2
When two events, A and B, are dependent, the probability of both occurring is:

P(A and B)  =  P(A) · P(B|A)

**Definition:** Two events are dependent if the outcome or occurrence of the first affects the outcome or occurrence of the second so that the probability is changed.

**Experiment 1:** A card is chosen at random from a standard deck of 52 playing cards. Without replacing it, a second card is chosen. What is the probability that the first card chosen is a queen and the second card chosen is a jack?

Probabilities:

$P(\text{queen on first pick}) = \frac{4}{52}$

$P(\text{jack on 2nd pick given queen on 1st pick}) = \frac{4}{51}$

$P(\text{queen and jack}) = \frac{4}{52} \cdot \frac{4}{51} = \frac{16}{2652} = \frac{4}{663}$

Experiment 1 involved two compound, dependent events. The probability of choosing a jack on the second pick given that a queen was chosen on the first pick is called a conditional probability.

The **conditional probability** of an event B in relationship to an event A is the probability that event B occurs given that event A has already occurred. The notation for conditional probability is **P(B|A)**, *pronounced as “the probability of event B given A”*. The notation does not mean that B is divided by A. It means the probability of event B given that event A has already occurred. To find the probability of two dependent events, we use a modified version of Multiplication Rule 1, which was presented above.

**Experiment 2:** Mr. Parietti needs two students to help him with a science demonstration for his class of 18 girls and 12 boys. He randomly chooses one student who comes to the front of the room. He then chooses a second student from those still seated. What is the probability that both students chosen are girls?

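$P(\text{Girl}_1 \text{ and } \text{Girl}_2) = P(\text{Girl}_1) \cdot P(\text{Girl}_2 \mid \text{Girl}_1)$

$= \frac{18}{30} \cdot \frac{17}{29} = \frac{306}{870} = \frac{51}{145}$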

**Experiment 3:** In a shipment of 20 computers, 3 are defective. Three computers are randomly selected and tested. What is the probability that all three are defective if the first and second ones are not replaced after being tested?

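$P(D_1 \text{ and } D_2 \text{ and } D_3) = P(D_1) \cdot P(D_2 \mid D_1) \cdot P(D_3 \mid D_1 \text{ and } D_2)$, where $D_i$ is the event that the i-th computer tested is defective.

$= \frac{3}{20} \cdot \frac{2}{19} \cdot \frac{1}{18}$

$= \frac{6}{6840} = \frac{1}{1140}$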
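The dependent-event experiments above all follow the same pattern: after each pick, both the number of favourable items and the total shrink by one. A minimal sketch of that pattern follows; the helper name `p_all_of_kind` is hypothetical, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction


def p_all_of_kind(kind_count, total, draws):
    """Probability that every one of `draws` picks (without replacement) is of the given kind."""
    prob = Fraction(1)
    for i in range(draws):
        # After i successful picks, kind_count - i favourable items remain out of total - i.
        prob *= Fraction(kind_count - i, total - i)
    return prob


p_two_girls = p_all_of_kind(18, 30, 2)       # Experiment 2
p_three_defective = p_all_of_kind(3, 20, 3)  # Experiment 3

# Experiment 1 mixes two kinds (queen, then jack), so it is written out directly.
p_queen_then_jack = Fraction(4, 52) * Fraction(4, 51)

print(p_two_girls)        # 51/145
print(p_three_defective)  # 1/1140
print(p_queen_then_jack)  # 4/663
```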

### Combinations Formula:

$C(n,r) = \frac{n!}{r!\,(n-r)!}$

where

* n = the number of items in the set or population
* r = the number of items in the subset of n, or sample set

**Example 1: Choose 2 Prizes from a Set of 6 Prizes**

You have won first place in a contest and are allowed to choose 2 prizes from a table that has 6 prizes numbered 1 through 6. **How many different combinations of 2 prizes could you possibly choose**?

In this example, we are taking a subset of 2 prizes (r) from a larger set of 6 prizes (n). Looking at the formula, we must calculate “6 choose 2.”

$C(6,2) = \frac{6!}{2!\,(6-2)!} = \frac{6!}{2!\,4!} = 15$ possible prize combinations

The 15 potential combinations are {1,2}, {1,3}, {1,4}, {1,5}, {1,6}, {2,3}, {2,4}, {2,5}, {2,6}, {3,4}, {3,5}, {3,6}, {4,5}, {4,6}, {5,6}



**Example 2: Choose 3 Students from a Class of 25**

A teacher is going to choose 3 students from her class to compete in the spelling bee. She wants to figure out how many unique teams of 3 can be created from her class of 25.

In this example, we are taking a subset of 3 students (r) from a larger set of 25 students (n). Looking at the formula, we must calculate “25 choose 3.”

$C(25,3) = \frac{25!}{3!\,(25-3)!} = 2{,}300$ possible teams

**Example 3: Sandwich Combinations Problem**

This is a classic math problem that asks something like “How many sandwich combinations are possible?” This is how it generally goes.

**Calculate the possible sandwich combinations if you can choose one item from each of the four categories:**

* 1 bread from 8 options
* 1 meat from 5 options
* 1 cheese from 5 options
* 1 topping from 3 options

Often you will see the answer given, **without any reference to the combinations equation C(n,r)**, as the product of the number of possible options in each of the categories.

In this case, using the equation

$C(n,r) = \frac{n!}{r!\,(n-r)!}$

we calculate $C(8,1) = 8$, $C(5,1) = 5$ and $C(3,1) = 3$, which gives:

$8 \times 5 \times 5 \times 3 = 600$ possible sandwich combinations


We can use this combinations equation to calculate a more complex sandwich problem.

**Sandwich Combinations Problem with Multiple Choices**

**Calculate the possible combinations if you can choose several items from each of the four categories:**

* 1 bread from 8 options
* 3 meats from 5 options
* 2 cheeses from 5 options
* 0 to 3 toppings from 3 options

Applying the combinations equation, where order does not matter and replacements are not allowed, we calculate the number of possible combinations in each of the categories. You can use the combinations calculator linked in the prerequisites to verify each of these.

* 1 bread from 8 options: $C(8,1) = 8$
* 3 meats from 5 options: $C(5,3) = 10$
* 2 cheeses from 5 options: $C(5,2) = 10$
* 0 to 3 toppings from 3 options: we must add up each possible number of choices from 0 to 3, which gives $C(3,0) + C(3,1) + C(3,2) + C(3,3) = 8$

Multiplying the possible combinations for each category we calculate:

$8 × 10 × 10 × 8 = 6,400$ possible sandwich combinations
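The combination counts in these examples can be verified with `math.comb` from Python's standard library (available from Python 3.8). The variable names below are illustrative.

```python
import math

# Example 1 and Example 2: plain "n choose r".
prize_combinations = math.comb(6, 2)
team_combinations = math.comb(25, 3)

# Simple sandwich: one item from each category.
simple_sandwiches = math.comb(8, 1) * math.comb(5, 1) * math.comb(5, 1) * math.comb(3, 1)

# Sandwich with multiple choices: 0 to 3 toppings means adding up C(3, k) for k = 0..3.
topping_choices = sum(math.comb(3, k) for k in range(4))
complex_sandwiches = math.comb(8, 1) * math.comb(5, 3) * math.comb(5, 2) * topping_choices

print(prize_combinations)  # 15
print(team_combinations)   # 2300
print(simple_sandwiches)   # 600
print(topping_choices)     # 8
print(complex_sandwiches)  # 6400
```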

## Binomial probability distribution

We have gone through the exercise of finding the probability without conducting any experiment. You saw that these theoretical (calculated) values of probability are actually quite close to the experimental values that we got. The small differences that you can notice exist because of the low number of experiments done.

![15.png](attachment:36d572ab-89c7-4b7b-8783-762736abf3af.png)

![16.png](attachment:e06bbd79-06a3-418b-8ca0-9bfc5fc288e5.png)

![17.png](attachment:54f4ccb5-ab92-40e8-a442-e02c03d13e43.png)

![18.png](attachment:df92aa18-865e-4022-8a2f-2262aac14276.png)

![19.png](attachment:39a3718c-4226-4468-b8e7-7de4d3323d23.png)


Recall that we claimed that if we had conducted the upGrad experiment several times, then the resulting experimental probability distribution would have been even closer to the theoretical one.

You can try it out in this [interactive app!](https://da-upgrad.shinyapps.io/5-discrete-binomial)

Now that you know how to find the probability without an experiment, you can calculate the probability for various combinations without much effort. For example, what if the bag for our game had, say, 4 red balls and only 1 blue ball? You don’t need to perform an experiment 100 or 500 times to find the answer. 

Previously, we found the theoretical probability for our game and compared it with the experimental one. Finding the probability without conducting an experiment means that we can find the probability using just pen and paper and with minimal effort.

Now, let’s try to generalise it — 

Q. Let’s say that the probability of getting one red ball in one trial is equal to p. In that case, what would be the probability of all 4 balls being red?

![20.png](attachment:56d6dfb4-0e2a-4529-bd13-c308d27e75d9.png)

So, the probability distribution for X (i.e., the number of red balls drawn after 4 trials) if the probability of getting a red ball in 1 trial is 'p' is as follows: 

![21.png](attachment:560fea96-f8e4-4f8e-93c6-7d55e041a32f.png)

![22.png](attachment:e70ff368-dc7f-4cc8-889c-48a8518a8ff6.png)

![23.png](attachment:6e301f5e-4e0d-45f8-a7e9-1c37035c9c85.png)
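The general result can be sketched in Python. This assumes the standard binomial formula, P(X = r) = C(n, r) · p^r · (1 − p)^(n − r), which the figures above are presumed to present; the function name `binomial_pmf` is an illustrative choice, and p = 0.6 corresponds to the bag with 3 red and 2 blue balls.

```python
import math


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n independent trials with success probability p."""
    return math.comb(n, r) * p**r * (1 - p) ** (n - r)


# Number of red balls in 4 draws with replacement, P(red) = 0.6.
pmf = [binomial_pmf(r, 4, 0.6) for r in range(5)]
print([round(v, 4) for v in pmf])  # [0.0256, 0.1536, 0.3456, 0.3456, 0.1296]
```

Setting r = n reduces the formula to p^n, which answers the question about all 4 balls being red.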


In the previous section, we listed down some conditions that are to be met for the binomial distribution to be applicable. Let’s take a few examples to understand these conditions in detail.

 
| Binomial Distribution Applicable                                        | Binomial Distribution Not Applicable                                        |
|-------------------------------------------------------------------------|-----------------------------------------------------------------------------|
| Tossing a coin 20 times to see how many tails occur                     | Tossing a coin until a head occurs                                          |
| Asking 200 randomly selected people if they are older than 21 or not    | Asking 200 randomly selected people how old they are                        |
| Drawing 4 red balls from a bag, putting each ball back after drawing it | Drawing 4 red balls from a bag, not putting each ball back after drawing it |

 


If you toss a coin 20 times to see how many times you get tails, you are following all the conditions required for a binomial distribution. The total number of trials is fixed (20), and you can only have two outcomes, i.e., tails or heads. The probability of getting a tail is 0.5 each time you toss a coin.

 

In a way, this is similar to drawing 20 balls out of a bag, replacing each ball after drawing it, and seeing how many of the balls are red. Here, the probability of getting a red ball in one trial is 0.5.

 

When you toss a coin until you get heads, the total number of trials is not fixed. This is similar to taking out balls from the bag repeatedly until you draw a red ball. You can still find the probability of getting heads in 1 trial, 2 trials, 3 trials and so on, but you cannot use the binomial distribution to find that probability.

 

In the second example, where binomial distribution is not applicable, the experiment does not have only two outcomes, but several. It is similar to taking out balls from a bag that contains red, blue, black, orange, and other-coloured balls. The probability distribution for this experiment cannot be made using binomial distribution.

 

In the final example, the probability of success is not the same in every trial. For example, the probability of drawing a red ball in the first trial is $\frac{3}{5}$. In the second trial, the probability of drawing a red ball would be $\frac{2}{4}$, not $\frac{3}{5}$, as the red ball taken out in the first trial was not put back. Hence, the probability of getting the combination red-red-red-blue, for example, would be $\frac{3}{5} \cdot \frac{2}{4} \cdot \frac{1}{3} \cdot \frac{2}{2}$, which is not the value we got while deriving the binomial distribution ($\frac{3}{5} \cdot \frac{3}{5} \cdot \frac{3}{5} \cdot \frac{2}{5}$). Again, you cannot use the binomial distribution to find the probability in this case.

 

In other words, binomial distribution is applicable in situations where there are a fixed number of yes or no questions, with the probability of a yes or a no remaining the same for all questions.
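To see why replacement matters, the sketch below compares the probability of the sequence red-red-red-blue with and without putting the balls back, for the bag of 3 red and 2 blue balls. The function names are hypothetical, and `Fraction` keeps the values exact.

```python
from fractions import Fraction

RED, BLUE = 3, 2  # contents of the bag


def p_rrrb_with_replacement():
    """Every draw sees the full bag, so the per-draw probabilities never change."""
    total = RED + BLUE
    return Fraction(RED, total) ** 3 * Fraction(BLUE, total)


def p_rrrb_without_replacement():
    """Each draw removes a ball, so the probabilities change from trial to trial."""
    red, blue = RED, BLUE
    prob = Fraction(1)
    for colour in "RRRB":
        total = red + blue
        if colour == "R":
            prob *= Fraction(red, total)
            red -= 1
        else:
            prob *= Fraction(blue, total)
            blue -= 1
    return prob


print(p_rrrb_with_replacement())     # 54/625
print(p_rrrb_without_replacement())  # 1/10
```

Only the first case satisfies the binomial condition that the probability of success stays the same in every trial.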

![24.png](attachment:da5efd23-6bf1-431b-b44d-7f768deb5810.png)

# Cumulative probability

**Cumulative probability** of X, denoted by **F(x)**, is defined as the probability of the variable being less than or equal to x.

In mathematical terms, you would write cumulative probability as **F(x) = P(X ≤ x)**. For example, F(4) = P(X ≤ 4) and F(3) = P(X ≤ 3).

In the previous example, we only discussed the probability of getting an exact value. For example, we know the probability of X = 4 (4 red balls). But what if the house wants to know the **probability of getting ≤ 3 red balls, as the house knows that for ≤ 3 red balls, the players will lose** and the house will make money?

Sometimes, talking in terms of less than is more useful. For **example — how many employees can get to work in less than 40 minutes**? Let’s explore how you can find the probability for such cases.

![25.png](attachment:99ab3e15-0b9a-4f43-9c02-31145f0f47c7.png)

| x | F(x) = P(X ≤ x) |
|---|------------------|
| 0 | 0.0256           |
| 1 | 0.1792           |
| 2 | 0.5248           |
| 3 | 0.8704           |
| 4 | 1.0000           |

![26.png](attachment:ff0eaffd-b115-484f-befe-8c114b6ed059.png)
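The cumulative values in the table are running totals of the individual probabilities. The sketch below assumes those probabilities come from the binomial distribution with n = 4 and p = 0.6 (an assumption that is consistent with the tabulated values) and uses `itertools.accumulate` for the running sum.

```python
import math
from itertools import accumulate

n, p = 4, 0.6  # 4 draws with replacement, P(red) = 0.6

# P(X = r) for r = 0..4, then the running total F(x) = P(X <= x).
pmf = [math.comb(n, r) * p**r * (1 - p) ** (n - r) for r in range(n + 1)]
cdf = list(accumulate(pmf))

for x, f in enumerate(cdf):
    print(x, round(f, 4))
```

Each F(x) is the previous cumulative value plus P(X = x), so the last entry always reaches 1.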


Let’s define X as the number of wickets Ishant Sharma would take in the next T20 match he plays. Also, the following is an incomplete table for cumulative probability based on previous experience:

| x (Number of wickets taken by Ishant Sharma in a T20 Match) | F(x)  |
|-------------------------------------------------------------|-------|
| 0                                                           | 0.35  |
| 1                                                           | 0.55  |
| 2                                                           | 0.75  |

 
What is the probability that he would take more than 2 wickets in the next  T20 match he plays?


You know that F(2), i.e. P(X ≤ 2), is 0.75. Now, you have to find the probability of X being higher than 2, i.e. P(X > 2). Notice that the sum of P(X ≤ 2) and P(X > 2) is equal to 1, because together they cover all possible outcomes. Hence, P(X ≤ 2) + P(X > 2) = 1, which gives 0.75 + P(X > 2) = 1. Hence, P(X > 2) = 0.25.

# Continuous Probability

![32.png](attachment:e5cf1572-ded0-4293-899c-655074f0912f.png)


**CDF** and **PDF** are  two functions that talk about probabilities in terms of intervals rather than the exact values. It is advisable to use them when talking about continuous random variables, not the bar chart distribution that we used for discrete variables.

## Cumulative distribution function

Recall that a **CDF**, or a **cumulative distribution function**, is a distribution that plots the cumulative probability of X against X.

![29.png](attachment:508cbfb9-2d0f-43cc-a0a9-d2d8b55e6cb5.png)


## Probability density function

A **PDF**, or a **Probability Density Function**, however, is a function in which the area under the curve gives you the cumulative probability.

![30.png](attachment:beb6664a-215c-4562-af25-f66c7201dfad.png)

For example, the area under the curve between 20, the smallest possible value of X, and 28 gives the cumulative probability for X = 28.



The main difference between the cumulative probability distribution of a continuous random variable and that of a discrete one lies in the way you plot them. While a continuous variable’s cumulative distribution is a curve, the distribution for a discrete variable looks more like a bar chart.


![31.png](attachment:00ac5068-0de9-4a9b-af42-8f466268ab86.png)

The reason for the difference is that, for discrete variables, the cumulative probability changes only at the values the variable can take. In the discrete variable example, we only care about what the probability is for 0, 1, 2, 3 and 4. This is because the cumulative probability will not change between, say, 3 and 3.999999. For all values between these two, the cumulative probability is equal to 0.8704.



However, for the continuous variable, i.e., the daily commute time, you have a different cumulative probability value for every value of X. For example, the value of cumulative probability at 21 will be different from its value at 21.1, which will again be different from the one at 21.2, and so on. Hence, you would show its cumulative probability as a continuous curve, not a bar chart.


## Difference between Cumulative & Continuous Probability

![34.png](attachment:19b1b86b-da17-4859-b469-2fa5f3ca46f8.png)


### Uniform distribution

![33.png](attachment:f0f6c402-b0ab-4e08-8e09-9c04307fb259.png)

![35.png](attachment:5a6cddb9-b9ef-403e-8efa-90eccb643e16.png)

Clearly, this area is the area of a rectangle with length 10 and unknown height h. Hence, you can say that 10 * h = 1, which gives us h = 0.1. So, the value of the PDF for all values between 0 and 10 is 0.1. 


Q. For the uniform PDF from the previous question, find the cumulative probability for X = 0.5.

![36.png](attachment:f3a41ce5-fec0-454c-b4e4-f9948757d10f.png)

![37.png](attachment:91ac4a03-e446-4f44-948c-a9c59699f70c.png)

This area = 0.1 * 0.5 = 0.05.
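The rectangle argument translates directly into code. This is a minimal sketch for a uniform distribution on the interval from a = 0 to b = 10, as in the example; the function names `uniform_pdf` and `uniform_cdf` are illustrative.

```python
def uniform_pdf(x, a=0.0, b=10.0):
    """Height of the rectangle: 1 / (b - a) inside [a, b], 0 outside."""
    return 1.0 / (b - a) if a <= x <= b else 0.0


def uniform_cdf(x, a=0.0, b=10.0):
    """Area of the rectangle from a to x, clipped to the range [0, 1]."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    return (x - a) / (b - a)


print(uniform_pdf(5))    # 0.1
print(uniform_cdf(0.5))  # 0.05
```

The cumulative probability is the width (x − a) multiplied by the constant height 1 / (b − a), which is the same area calculation as above.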



Now you must be wondering when to use PDFs and when to use CDFs. They are both good for continuous variables, but which one is used more in real-life analyses?



Well, PDFs are more commonly used in real life. The reason is that it is much easier to see patterns in PDFs as compared to CDFs. For example, here are the PDF and the CDF of a uniformly distributed continuous random variable:

![38.png](attachment:e1b4cdf1-9782-4064-9321-08a5469e2b2a.png)

The **PDF clearly shows uniformity**, as the probability density’s value remains constant for all possible values. However, the CDF does not show any trends that help you identify quickly that the variable is uniformly distributed.



Now, let’s look at the PDF and the CDF of a symmetrically distributed continuous random variable:

![39.png](attachment:fb95c1be-48da-4fe8-9c94-b9330ca32e76.png)

Again, it is clear that the symmetrical nature of the variable is much more apparent in the PDF than in the CDF.


Hence, generally, PDFs are used more commonly than CDFs.




#### Problem 8

Suppose you work at a sports analysis company and you want to analyse the effect a bowler’s height has on his/her performance. So, you create a list of all 5 wicket hauls in the last decade. Based on this data, you create a cumulative probability distribution for X, where X = height of the bowler who took the 5 wicket haul.

Now, based on the data, you conclude that the cumulative probability, F(175.3 cm) = 0.3. In this case, which of the following statements is correct?

1. P(X < 175.3 cm) = 0.3

2. P(X ≤ 175.3 cm) = 0.3

(Remember that height is a continuous variable.)

* Only statement 1 is correct
* Only statement 2 is correct
* Both statements 1 and 2 are correct

 ![40'.png](attachment:0b96febb-058e-4584-935f-810400b36194.png)


## Normal Distribution

![41.png](attachment:9ad8f3df-48e5-4d86-bed5-62e204cd8bda.png)

All data that is normally distributed follows the 1-2-3 rule. This rule states that there is approximately a:

* 68% probability of the variable lying within 1 standard deviation of the mean
* 95% probability of the variable lying within 2 standard deviations of the mean
* 99.7% probability of the variable lying within 3 standard deviations of the mean

![42.png](attachment:8e873504-a296-47ef-b9f7-cc9c8d16b6a0.png)

![43.png](attachment:fc5cc837-d893-4e0a-a193-00607f3a8d46.png)


This is like saying that, if you buy a loaf of bread every day and weigh it (mean weight = 100 g, standard deviation = 1 g), then:

* For 5 days every week, the weight of the loaf you bought that day will be within 99 g (100-1) and 101 g (100+1).
* For 20 days every 3 weeks, the weight of the loaf you bought that day will be within 98 g (100-2) and 102 g (100+2).
* For 364 days every year, the weight of the loaf you bought that day will be within 97 g (100-3) and 103 g (100+3).
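The percentages in the 1-2-3 rule can be checked with `statistics.NormalDist` from Python's standard library (Python 3.8 or later). The helper name `prob_within` is an illustrative choice.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def prob_within(k):
    """Probability of lying within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(prob_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973

# The bread example: weight within 99 g and 101 g for mean 100 g, standard deviation 1 g.
bread = NormalDist(mu=100, sigma=1)
print(round(bread.cdf(101) - bread.cdf(99), 4))  # 0.6827
```

The bread result matches the 1-standard-deviation figure because only the distance from the mean, measured in standard deviations, matters.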

A lot of naturally occurring variables are normally distributed. For example, the heights of a group of adult men would be normally distributed. To try this out, we asked 50 male employees at the UpGrad office for their height and then plotted the probability density function using that data.

![44.png](attachment:55bf300b-502d-4071-b4f9-fbbe7d5049dd.png)

[Distributions](https://charts.upgrad.com/dist-crv/index.html)


In fact, you can select “Uniform” in the drop-down menu and visualise the CDF and PDF for the uniform distribution too. Be sure to play around with a and b, which give the lowest and highest possible values, respectively, for the random variable.


As you learnt in the previous question, it doesn’t matter what the values of µ and σ are. If you want to find the probability, **all you need to know is how far the value of X is from µ, and specifically, what multiple of σ is the difference between X and µ**.

![45.png](attachment:25290616-b0db-4468-8988-09c4c5256dfe.png)

Not only that, you can also use Excel or Python to find the cumulative probability for Z. For example, let’s say you want to find the cumulative probability for Z = 1.5. In Excel, you would type:

 

= NORM.S.DIST(1.5, TRUE)

Basically, the syntax is:

= NORM.S.DIST(z, TRUE)

 

Here, z is the value of the Z score for which you want to find the cumulative probability. TRUE = find cumulative probability, FALSE = find probability density.

 

Also, you can find the probability without standardising. Let’s say that X is normally distributed, with mean (μ) = 35 and standard deviation (σ) = 5. Now, if you want to find the cumulative probability for X = 30, you would type:

 

= NORM.DIST(30, 35, 5, TRUE)

Basically, the syntax is:

= NORM.DIST(x, mean, standard_dev, TRUE)
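The text mentions Python but only shows the Excel formulas. One possible Python equivalent, sketched here with the standard library's `statistics.NormalDist` (Python 3.8 or later), is the following; other libraries such as SciPy offer similar functions, but they are not used here.

```python
from statistics import NormalDist

# Counterpart of =NORM.S.DIST(1.5, TRUE): cumulative probability for Z = 1.5.
p_z = NormalDist().cdf(1.5)

# Counterpart of =NORM.DIST(30, 35, 5, TRUE): cumulative probability for X = 30.
p_x = NormalDist(mu=35, sigma=5).cdf(30)

# The same value via standardising: Z = (X - mean) / standard deviation.
z = (30 - 35) / 5
p_via_z = NormalDist().cdf(z)

print(round(p_z, 4))      # 0.9332
print(round(p_x, 4))      # 0.1587
print(round(p_via_z, 4))  # 0.1587
```

`NormalDist` also has a `pdf` method, which plays the role of passing FALSE as the last argument in the Excel functions.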







### Distributions

As you can see, the value of σ is an indicator of how wide the graph is. This will be true for any graph, not just the normal distribution. A low value of σ means that the graph is narrow, while a high value implies that the graph is wider. This will happen because the wider graph will clearly have more values away from the mean, resulting in a high standard deviation.



Again, there are some more probability distributions that are commonly seen among continuous random variables. They are not covered in this course, but if you want to go through some of them, you can use the links below -

* [Exponential Distribution](https://online.stat.psu.edu/stat414/lesson/15/15.1)
* [Gamma Distribution](https://online.stat.psu.edu/stat414/lesson/15/15.4)
* [Chi-Squared Distribution](https://online.stat.psu.edu/stat414/lesson/15/15.8)
