# Lecture - 04

## Probability
Probability measures how likely an event is to occur, expressed as a number between 0 (impossible) and 1 (certain).

In [1]:
# Probability of getting a 3 when rolling a fair six-sided die
1/6

0.16666666666666666
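
The same value can be reached by counting outcomes: the number of favorable outcomes divided by the total number of equally likely outcomes. The following is a minimal sketch of that counting approach; the variable names are illustrative, and `fractions.Fraction` is used only to keep the result exact.

```python
from fractions import Fraction

# Sample space of a fair six-sided die (all outcomes equally likely)
sample_space = [1, 2, 3, 4, 5, 6]

# Event of interest: the roll shows a 3
favorable = [outcome for outcome in sample_space if outcome == 3]

# Classical probability = favorable outcomes / total outcomes
p_three = Fraction(len(favorable), len(sample_space))

print(f"P(3) = {p_three} = {float(p_three)}")
```

Changing the condition in the list comprehension (for example to `outcome % 2 == 0`) gives the probability of other events on the same die.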

## Addition and Multiplication Rules in Probability

### Real-World Example

**Scenario:**
- You are at a store that sells two brands of drinks: Brand A and Brand B.
- The probability that a customer buys a drink from Brand A is 0.3.
- The probability that a customer buys a drink from Brand B is 0.4.
- The probability that a customer buys both Brand A and Brand B drinks is 0.1.

We will use this scenario to apply the addition and multiplication rules.

### Addition Rule Example

**Question:** What is the probability that a customer buys either a Brand A drink or a Brand B drink?

Since the events are not mutually exclusive (a customer can buy both), we use:

\[ P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

```python
# Probabilities
P_A = 0.3  # Probability of buying Brand A
P_B = 0.4  # Probability of buying Brand B
P_A_and_B = 0.1  # Probability of buying both Brand A and Brand B

# Addition rule
P_A_or_B = P_A + P_B - P_A_and_B

print(f"Probability of buying either Brand A or Brand B: {P_A_or_B}")
```
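
One way to see why the overlap is subtracted is to split the customers into disjoint groups (only A, only B, both) and add those up. The sketch below does this with the scenario's numbers; it uses `fractions.Fraction` as an assumption-free way to avoid floating-point rounding, and the names `only_A`, `only_B`, and `P_neither` are introduced here for illustration.

```python
from fractions import Fraction

# Scenario probabilities as exact fractions
P_A = Fraction(3, 10)        # buys Brand A
P_B = Fraction(4, 10)        # buys Brand B
P_A_and_B = Fraction(1, 10)  # buys both brands

# Three disjoint groups that together make up "A or B"
only_A = P_A - P_A_and_B     # buys A but not B
only_B = P_B - P_A_and_B     # buys B but not A
both = P_A_and_B             # buys both

P_A_or_B = only_A + only_B + both
P_neither = 1 - P_A_or_B     # complement: buys neither brand

print(f"Only A: {only_A}, only B: {only_B}, both: {both}")
print(f"A or B: {P_A_or_B}, neither: {P_neither}")
```

Adding the disjoint groups gives the same result as the addition rule, because \(P(A) + P(B)\) counts the "both" group twice and the subtraction removes the duplicate.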

### Multiplication Rule Example

**Scenario:**
- The probability that a customer buys a Brand A drink is 0.3.
- If a customer buys a Brand A drink, the probability that they also buy a Brand B drink is 0.5 (conditional probability).

**Question:** What is the probability that a customer buys both Brand A and Brand B drinks?

Since the events are dependent (the purchase of Brand B depends on the purchase of Brand A), we use:

\[ P(A \cap B) = P(A) \times P(B|A) \]

```python
# Probabilities
P_A = 0.3  # Probability of buying Brand A
P_B_given_A = 0.5  # Probability of buying Brand B given that Brand A was bought

# Multiplication rule for dependent events
P_A_and_B_dependent = P_A * P_B_given_A

print(f"Probability of buying both Brand A and Brand B (dependent): {P_A_and_B_dependent}")
```
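
The multiplication rule can also be rearranged to recover a conditional probability: \(P(B|A) = P(A \cap B) / P(A)\). As a small sketch, the code below applies this to the figures from the first store scenario (\(P(A) = 0.3\), \(P(A \cap B) = 0.1\)), which is a separate set of numbers from the 0.5 assumed in the example above.

```python
from fractions import Fraction

# Values from the store scenario at the start of this section
P_A = Fraction(3, 10)        # buys Brand A
P_A_and_B = Fraction(1, 10)  # buys both brands

# Rearranged multiplication rule: P(B|A) = P(A and B) / P(A)
P_B_given_A = P_A_and_B / P_A

print(f"P(B|A) = {P_B_given_A} (about {float(P_B_given_A):.3f})")
```

With those numbers, one in three Brand A buyers also buys Brand B.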

### Independent Events Example

**Scenario:**
- The probability that a customer buys a Brand A drink is 0.3.
- The probability that a customer buys a Brand B drink, independent of Brand A, is 0.4.

**Question:** What is the probability that a customer buys both a Brand A and a Brand B drink?

Since the events are independent, we use:

\[ P(A \cap B) = P(A) \times P(B) \]

```python
# Probabilities
P_A = 0.3  # Probability of buying Brand A
P_B = 0.4  # Probability of buying Brand B (independent)

# Multiplication rule for independent events
P_A_and_B_independent = P_A * P_B

print(f"Probability of buying both Brand A and Brand B (independent): {P_A_and_B_independent}")
```
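
Independence can be checked rather than assumed: two events are independent exactly when \(P(A \cap B) = P(A) \times P(B)\). The sketch below wraps that check in a hypothetical helper, `are_independent`, and applies it to both sets of numbers used in this section.

```python
from fractions import Fraction

def are_independent(p_a, p_b, p_a_and_b):
    """Return True when P(A and B) equals P(A) * P(B)."""
    return p_a_and_b == p_a * p_b

# First store scenario: P(A) = 0.3, P(B) = 0.4, P(A and B) = 0.1
first_scenario = are_independent(Fraction(3, 10), Fraction(4, 10), Fraction(1, 10))

# Independent-events scenario: P(A and B) = 0.3 * 0.4 = 0.12
independent_scenario = are_independent(Fraction(3, 10), Fraction(4, 10), Fraction(12, 100))

print(f"First scenario independent? {first_scenario}")
print(f"Independent-events scenario independent? {independent_scenario}")
```

In the first scenario the joint probability (0.1) differs from the product (0.12), so those purchases are not independent. Exact fractions are used because comparing floats with `==` can fail on rounding.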

In summary:
- The addition rule gives the probability that at least one of two events happens: \(P(A) + P(B) - P(A \cap B)\).
- The multiplication rule gives the probability that both events happen: \(P(A) \times P(B|A)\) in general, which simplifies to \(P(A) \times P(B)\) when the events are independent.

## Permutations and Combinations

Consider a real-world example involving permutations and combinations.

### Real-World Example

**Scenario:**
1. **Permutations:** You have 3 different books (A, B, C) and you want to arrange 2 of them on a shelf. The order matters.
2. **Combinations:** You have 3 different books (A, B, C) and you want to choose 2 of them to take on a trip. The order does not matter.

### Permutations

When the order matters, we use permutations. The formula for permutations of \(n\) objects taken \(r\) at a time is given by:

\[ P(n, r) = \frac{n!}{(n-r)!} \]

In our example, we have 3 books and we want to arrange 2 of them. So we calculate \(P(3, 2)\).

```python
import itertools

# List of books
books = ['A', 'B', 'C']

# Generate permutations of 3 books taken 2 at a time
perm = list(itertools.permutations(books, 2))

print(f"Permutations of 3 books taken 2 at a time: {perm}")
```
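
The count itself can be computed without listing every arrangement. The sketch below evaluates \(P(3, 2)\) three ways: from the factorial formula, with `math.perm` (available in Python 3.8 and later), and by counting the enumerated tuples.

```python
import itertools
import math

n, r = 3, 2

# Direct use of the formula n! / (n - r)!
p_formula = math.factorial(n) // math.factorial(n - r)

# math.perm computes the same count
p_builtin = math.perm(n, r)

# Counting the enumerated arrangements gives the same number
p_enumerated = len(list(itertools.permutations(['A', 'B', 'C'], r)))

print(f"P({n}, {r}) = {p_formula} (formula), {p_builtin} (math.perm), {p_enumerated} (enumerated)")
```

All three agree on 6, since \(3! / 1! = 6\).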

### Combinations

When the order does not matter, we use combinations. The formula for combinations of \(n\) objects taken \(r\) at a time is given by:

\[ C(n, r) = \frac{n!}{r!(n-r)!} \]

In our example, we have 3 books and we want to choose 2 of them. So we calculate \(C(3, 2)\).

```python
import itertools

# List of books
books = ['A', 'B', 'C']

# Generate combinations of 3 books taken 2 at a time
comb = list(itertools.combinations(books, 2))

print(f"Combinations of 3 books taken 2 at a time: {comb}")
```
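
As with permutations, the count can be computed directly. The sketch below evaluates \(C(3, 2)\) from the formula and with `math.comb` (Python 3.8 and later), and then checks the relationship between the two counts: each combination of \(r\) items can be ordered in \(r!\) ways.

```python
import math

n, r = 3, 2

# Direct use of the formula n! / (r! * (n - r)!)
c_formula = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# math.comb computes the same count
c_builtin = math.comb(n, r)

# Each combination corresponds to r! permutations
perms_from_combs = c_builtin * math.factorial(r)

print(f"C({n}, {r}) = {c_formula}")
print(f"C({n}, {r}) * {r}! = {perms_from_combs} = P({n}, {r})")
```

This gives 3 combinations, and \(3 \times 2! = 6\) matches the number of permutations of 3 books taken 2 at a time.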

### Output
When you run the code, you will get:

```text
Permutations of 3 books taken 2 at a time: [('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), ('C', 'B')]
Combinations of 3 books taken 2 at a time: [('A', 'B'), ('A', 'C'), ('B', 'C')]
```

### Explanation
1. **Permutations:**
   - The order matters, so 'A' followed by 'B' is different from 'B' followed by 'A'.
   - The permutations are ('A', 'B'), ('A', 'C'), ('B', 'A'), ('B', 'C'), ('C', 'A'), and ('C', 'B').

2. **Combinations:**
   - The order does not matter, so 'A' and 'B' is the same as 'B' and 'A'.
   - The combinations are ('A', 'B'), ('A', 'C'), and ('B', 'C').

These examples show how to use permutations and combinations to solve real-world problems where order either matters or does not matter.
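
Counting selections also links back to probability. As an illustration, and assuming every pair of books is equally likely to be chosen (an assumption added here, not part of the scenario), the sketch below finds the probability that book A is among the two books taken on the trip.

```python
import itertools
from fractions import Fraction

books = ['A', 'B', 'C']

# Every possible pair of books (order does not matter)
pairs = list(itertools.combinations(books, 2))

# Pairs that include book 'A'
pairs_with_A = [pair for pair in pairs if 'A' in pair]

# Probability = favorable selections / all selections
p_includes_A = Fraction(len(pairs_with_A), len(pairs))

print(f"P(book A is chosen) = {p_includes_A}")
```

Two of the three pairs contain A, so the probability is 2/3.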

## Confidence Interval
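A confidence interval is a range of values, computed from a sample, that is expected to contain the true population parameter at a stated confidence level. A 95% level means that about 95% of intervals built this way from repeated samples would contain the parameter.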

### Confidence interval (95%) using t-distribution

In [2]:
import numpy as np
import scipy.stats as stats

# Sample data
data = [2, 3, 5, 6, 9]

# Calculate mean and standard error
mean = np.mean(data)
std_err = stats.sem(data)

# Confidence interval (95%) using t-distribution
confidence = 0.95
h_t = std_err * stats.t.ppf((1 + confidence) / 2, len(data) - 1)
confidence_interval_t = (mean - h_t, mean + h_t)  # (lower bound, upper bound)

print(f"Mean: {mean}")
print(f"95% Confidence interval (t-distribution): {confidence_interval_t}")

Mean: 5.0
95% Confidence interval (t-distribution): (1.5995630967087155, 8.400436903291284)
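
To make the steps behind `stats.sem` and `stats.t.ppf` visible, the interval can be rebuilt by hand as mean ± critical value × standard error. The sketch below uses only the standard library. The critical value 2.776 is an assumption taken from a standard t-table (two-sided 95%, 4 degrees of freedom) and is not computed here.

```python
import math
import statistics

data = [2, 3, 5, 6, 9]
n = len(data)

mean = statistics.mean(data)
sample_std = statistics.stdev(data)   # sample standard deviation (divides by n - 1)
std_err = sample_std / math.sqrt(n)   # standard error of the mean

# Assumption: t-table value for a two-sided 95% interval with n - 1 = 4 degrees of freedom
t_crit = 2.776

margin = t_crit * std_err
ci = (mean - margin, mean + margin)   # (lower bound, upper bound)

print(f"Standard error: {std_err:.4f}")
print(f"95% CI by hand: ({ci[0]:.3f}, {ci[1]:.3f})")
```

Because the table value is rounded, the bounds agree with the SciPy result only to about three decimal places.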


### Confidence interval (95%) using z-distribution

In [3]:
import numpy as np
import scipy.stats as stats

# Sample data
data = [2, 3, 5, 6, 9]

# Calculate mean and standard error
mean = np.mean(data)
std_err = stats.sem(data)

# Confidence interval (95%) using z-distribution
confidence = 0.95
h_z = std_err * stats.norm.ppf((1 + confidence) / 2)
confidence_interval_z = (mean - h_z, mean + h_z)  # (lower bound, upper bound)

print(f"Mean: {mean}")
print(f"95% Confidence interval (z-distribution): {confidence_interval_z}")


Mean: 5.0
95% Confidence interval (z-distribution): (2.599544161822345, 7.400455838177654)
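
The z-based interval is narrower than the t-based one because the normal critical value ignores the extra uncertainty from estimating the standard deviation with only five observations. The sketch below compares the two margins using the standard library. `NormalDist().inv_cdf` supplies the z critical value, while the t critical value 2.776 is an assumption taken from a t-table for 4 degrees of freedom.

```python
import math
from statistics import NormalDist, mean, stdev

data = [2, 3, 5, 6, 9]
std_err = stdev(data) / math.sqrt(len(data))  # standard error of the mean

# z critical value for a two-sided 95% interval (about 1.96)
z_crit = NormalDist().inv_cdf(0.975)

# Assumption: t-table value for 4 degrees of freedom
t_crit = 2.776

z_margin = z_crit * std_err
t_margin = t_crit * std_err

print(f"z critical value: {z_crit:.4f}, margin: {z_margin:.3f}")
print(f"t critical value: {t_crit:.4f}, margin: {t_margin:.3f}")
```

For small samples with an unknown population standard deviation, the t-based interval is generally the safer choice; the two converge as the sample size grows.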


## p-value
The p-value is the probability of obtaining test results at least as extreme as those observed, assuming the null hypothesis is true. A small p-value means the observed data would be unlikely if the null hypothesis held.

In [4]:
import scipy.stats as stats

# Example data
data1 = [2, 3, 5, 6, 9]
data2 = [1, 4, 5, 8, 10]

# Perform t-test
t_stat, p_value = stats.ttest_ind(data1, data2)

print(f"t-statistic: {t_stat}")
print(f"p-value: {p_value}")

t-statistic: -0.30151134457776346
p-value: 0.7707132785693247
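
The t-statistic reported above can be reproduced by hand. By default `stats.ttest_ind` assumes equal variances, so the sketch below uses the pooled-variance formula \(t = (\bar{x}_1 - \bar{x}_2) / \sqrt{s_p^2 (1/n_1 + 1/n_2)}\). It relies only on the standard library, and the intermediate names are illustrative.

```python
import math
import statistics

data1 = [2, 3, 5, 6, 9]
data2 = [1, 4, 5, 8, 10]
n1, n2 = len(data1), len(data2)

mean1, mean2 = statistics.mean(data1), statistics.mean(data2)
var1, var2 = statistics.variance(data1), statistics.variance(data2)  # sample variances

# Pooled variance: sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)

# Standard error of the difference between the two means
std_err_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

t_stat = (mean1 - mean2) / std_err_diff
df = n1 + n2 - 2  # degrees of freedom for the test

print(f"t-statistic by hand: {t_stat:.6f} (df = {df})")
```

The sign is negative because the first sample's mean (5.0) is below the second's (5.6). Turning the statistic into a p-value still requires the t-distribution's CDF, which is what SciPy provides.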


## Simple Hypothesis Testing

In [5]:
import scipy.stats as stats

# Example data
data1 = [2, 3, 5, 6, 9]
data2 = [1, 4, 5, 8, 10]


# Null hypothesis: The means of the two samples are equal
# Alternative hypothesis: The means of the two samples are not equal

# Perform t-test
t_stat, p_value = stats.ttest_ind(data1, data2)

print(f"t-statistic: {t_stat}")
print(f"p-value: {p_value}")

# Significance level (conventional 5% threshold)
alpha = 0.05

if p_value < alpha:
    print("Reject the null hypothesis")
else:
    print("Fail to reject the null hypothesis")

t-statistic: -0.30151134457776346
p-value: 0.7707132785693247
Fail to reject the null hypothesis
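
The same decision procedure can be illustrated without the t-distribution by using an exact permutation test, which is a different technique from the t-test above. If the null hypothesis holds, the group labels are interchangeable, so the p-value is the share of all possible regroupings whose difference is at least as extreme as the observed one. The sketch below is a minimal version; the 0.05 threshold is the conventional choice and is an assumption here.

```python
import itertools

data1 = [2, 3, 5, 6, 9]
data2 = [1, 4, 5, 8, 10]
pooled = data1 + data2
total = sum(pooled)
n1 = len(data1)

# With equal group sizes, |sum(group 1) - sum(group 2)| is proportional
# to the absolute difference in means; integer sums avoid rounding issues
observed = abs(2 * sum(data1) - total)

# Every way of choosing which n1 of the pooled values form group 1
splits = list(itertools.combinations(range(len(pooled)), n1))

# Count regroupings at least as extreme as the observed split
extreme = sum(
    1 for idx in splits
    if abs(2 * sum(pooled[i] for i in idx) - total) >= observed
)
p_perm = extreme / len(splits)

alpha = 0.05  # assumption: conventional significance level
decision = "Reject" if p_perm < alpha else "Fail to reject"

print(f"Regroupings examined: {len(splits)}")
print(f"Permutation p-value: {p_perm:.3f}")
print(f"{decision} the null hypothesis")
```

The permutation p-value will not equal the t-test's p-value exactly, but both lead to the same conclusion for this data: the difference between the samples is well within what chance alone would produce.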


#### Prepared By,
Ahamed Basith