## Naive Bayes Assignment - 1
**By Shahequa Modabbera**

### Q1. What is Bayes' theorem?

Ans) Bayes' theorem is a fundamental concept in probability theory and statistics that describes how to update or revise our beliefs about an event based on new evidence or information. It establishes a relationship between conditional probabilities, allowing us to calculate the probability of an event given certain conditions or prior knowledge.

The theorem is named after Thomas Bayes, an English mathematician and Presbyterian minister, who formulated it in the 18th century. Bayes' theorem is expressed as follows:

P(A|B) = (P(B|A) * P(A)) / P(B)

Where:
- P(A|B) is the posterior: the conditional probability of event A given that event B has occurred.
- P(B|A) is the likelihood: the conditional probability of event B given that event A has occurred.
- P(A) is the prior probability of A, and P(B) is the marginal probability of B, which must be nonzero.

In simpler terms, Bayes' theorem states that the probability of event A given that event B has occurred equals the probability of event B given that event A has occurred, multiplied by the prior probability of event A and divided by the marginal probability of event B.

Bayes' theorem is particularly useful in situations where we have prior knowledge or assumptions about the probabilities involved and want to update those probabilities based on new evidence. It is commonly applied in fields such as statistics, machine learning, and Bayesian inference, where it plays a crucial role in decision-making, hypothesis testing, and predictive modeling.

### Q2. What is the formula for Bayes' theorem?

Ans) The formula for Bayes' theorem is as follows:

P(A|B) = (P(B|A) * P(A)) / P(B)

Where:
- P(A|B) is the conditional probability of event A given that event B has occurred (the posterior).
- P(B|A) is the conditional probability of event B given that event A has occurred (the likelihood).
- P(A) is the prior probability of A, and P(B) is the marginal probability of B, with P(B) > 0.

In words, the probability of A given B equals the probability of B given A, multiplied by the prior probability of A and divided by the marginal probability of B.

Bayes' theorem is a fundamental concept in probability theory and statistics, and it provides a mathematical framework for updating probabilities based on new evidence or information. It is widely used in various fields, including Bayesian inference, machine learning, and decision-making, to make informed predictions and decisions based on available data.
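The formula can be sketched as a small function. This is a minimal illustration, not part of the assignment: the function name `bayes_posterior` and the input values are assumptions chosen only to show the arithmetic.

```python
def bayes_posterior(likelihood, prior, evidence):
    """Return P(A|B) from P(B|A), P(A), and P(B)."""
    if evidence <= 0:
        raise ValueError("P(B) must be positive")
    return likelihood * prior / evidence


# Illustrative values (assumed): P(B|A) = 0.8, P(A) = 0.3, P(B) = 0.5
p_a_given_b = bayes_posterior(likelihood=0.8, prior=0.3, evidence=0.5)
print(f"P(A|B) = {p_a_given_b:.2f}")  # (0.8 * 0.3) / 0.5 = 0.48
```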

### Q3. How is Bayes' theorem used in practice?

Ans) Bayes' theorem is used in practice in various fields and applications. Here are a few examples:

1. Bayesian Inference: Bayes' theorem is the cornerstone of Bayesian inference, a statistical framework for updating probabilities based on prior knowledge and new evidence. It allows for the incorporation of prior beliefs or information into the estimation of parameters or the prediction of outcomes.

2. Spam Filtering: Bayes' theorem is commonly employed in spam filtering algorithms. By analyzing the content and characteristics of incoming emails, the algorithm calculates the probability that an email is spam given certain features (such as specific keywords or patterns). Bayes' theorem helps update these probabilities based on the presence or absence of those features in the email.

3. Medical Diagnosis: Bayes' theorem plays a crucial role in medical diagnosis, particularly in situations where multiple symptoms and test results are involved. Given the symptoms exhibited by a patient, Bayes' theorem allows physicians to calculate the probability of a specific disease or condition being present, taking into account the prevalence of the disease and the accuracy of the diagnostic tests.

4. Machine Learning: Bayes' theorem is utilized in various machine learning algorithms, such as Naive Bayes classifiers. These classifiers estimate the probability of a particular class label given the observed features of a data point. By leveraging Bayes' theorem, these algorithms can make predictions based on the calculated probabilities.

5. Natural Language Processing: Bayes' theorem finds application in language modeling and natural language processing tasks. It helps determine the probability of a word or phrase occurring in a specific context or given a sequence of previous words. This information is then utilized in tasks such as language generation, machine translation, and speech recognition.

Overall, Bayes' theorem provides a powerful framework for reasoning under uncertainty and updating probabilities based on available evidence. Its applications extend across various fields, enabling informed decision-making, prediction, and estimation.
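As a rough illustration of the medical diagnosis case, the sketch below combines disease prevalence with test accuracy. The numbers (1% prevalence, 95% sensitivity, 5% false positive rate) and the function name `posterior_disease` are hypothetical, chosen only to show how a positive result from an accurate test can still leave a modest posterior when the disease is rare.

```python
def posterior_disease(prevalence, sensitivity, false_positive_rate):
    """Return P(disease | positive test) using Bayes' theorem."""
    # P(positive) via the law of total probability over disease / no disease
    p_positive = (sensitivity * prevalence
                  + false_positive_rate * (1 - prevalence))
    return sensitivity * prevalence / p_positive


# Hypothetical values, not taken from any real test
p = posterior_disease(prevalence=0.01, sensitivity=0.95,
                      false_positive_rate=0.05)
print(f"P(disease | positive) = {p:.3f}")  # 0.0095 / 0.059, about 0.161
```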

### Q4. What is the relationship between Bayes' theorem and conditional probability?

Ans) Bayes' theorem is derived from the definition of conditional probability. It lets us compute a conditional probability in one direction from the conditional probability in the opposite direction.

Conditional probability is the probability of an event A occurring given that another event B has already occurred, denoted as P(A|B). It quantifies the likelihood of A happening, taking into account the information provided by B.

Bayes' theorem builds upon conditional probability by relating P(B|A) to P(A|B). Because P(A and B) = P(A|B) * P(B) = P(B|A) * P(A), dividing by P(A) gives the theorem. Here the roles of A and B are swapped relative to the form given in Q1 and Q2, which is the same statement with the labels exchanged:

P(B|A) = (P(A|B) * P(B)) / P(A)

In simpler terms, Bayes' theorem allows us to update our belief or estimate of the probability of an event occurring (B) based on new evidence or information (A). It provides a way to calculate the posterior probability (P(B|A)) by multiplying the likelihood of A given B (P(A|B)) with the prior probability of B (P(B)) and normalizing it by dividing by the probability of A (P(A)).

Essentially, Bayes' theorem provides a framework to update probabilities based on new information, allowing us to revise our beliefs or make more informed decisions. It is widely used in statistics, machine learning, and various other fields for inference, prediction, and decision-making.
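The link between the two directions of conditioning can be checked numerically. The sketch below uses a hypothetical table of joint counts (the counts and helper name `prob` are assumptions for illustration), computes both conditional probabilities from the definition P(A|B) = P(A and B) / P(B), and confirms that they satisfy Bayes' theorem.

```python
from fractions import Fraction

# Hypothetical joint counts over 100 trials; keys are (A occurred, B occurred)
joint_counts = {
    (True, True): 20,
    (True, False): 10,
    (False, True): 20,
    (False, False): 50,
}
total = sum(joint_counts.values())


def prob(predicate):
    """Probability of the outcomes (a, b) for which predicate(a, b) holds."""
    matching = sum(c for (a, b), c in joint_counts.items() if predicate(a, b))
    return Fraction(matching, total)


p_a = prob(lambda a, b: a)              # 30/100
p_b = prob(lambda a, b: b)              # 40/100
p_a_and_b = prob(lambda a, b: a and b)  # 20/100

# Conditional probabilities straight from the definition
p_a_given_b = p_a_and_b / p_b  # 1/2
p_b_given_a = p_a_and_b / p_a  # 2/3

# Bayes' theorem recovers one direction from the other
assert p_b_given_a == p_a_given_b * p_b / p_a
print(p_a_given_b, p_b_given_a)
```

Exact fractions are used so the equality check is not affected by floating-point rounding.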

### Q5. How do you choose which type of Naive Bayes classifier to use for any given problem?

Ans) When choosing which type of Naive Bayes classifier to use for a given problem, it is important to consider the characteristics of the problem and the assumptions made by each type of classifier. Here are some guidelines to help you make a decision:

1. Gaussian Naive Bayes: This classifier assumes that each continuous feature follows a Gaussian (normal) distribution within each class. It is suitable for problems where the continuous variables can reasonably be assumed to be normally distributed, such as measurements like height or weight.

2. Multinomial Naive Bayes: This classifier is designed for discrete features that represent frequencies or counts of different outcomes. It is commonly used in text classification, where the features are typically word counts or term frequencies.

3. Bernoulli Naive Bayes: This classifier is similar to the multinomial Naive Bayes, but it is specifically designed for binary features, where the features are either present (1) or absent (0). It is suitable for problems where the features are binary variables, such as document classification tasks where the presence or absence of certain words is used as features.

The choice of the Naive Bayes classifier depends on the nature of the features in our dataset and the assumptions made by each type. If our features are continuous and normally distributed, Gaussian Naive Bayes is a good choice. For discrete features, multinomial or Bernoulli Naive Bayes can be used based on whether the features represent counts or binary values. However, it is important to note that these assumptions may not always hold in real-world datasets, so it is recommended to evaluate the performance of different Naive Bayes classifiers using cross-validation or other validation techniques to choose the most appropriate one for our specific problem.
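The guidelines above can be turned into a rough first-pass check on the feature values. The helper below is a hypothetical heuristic written for illustration, not an established rule or library function: it only inspects value types, and its suggestion should still be confirmed with cross-validation.

```python
def suggest_naive_bayes_variant(rows):
    """Suggest a Naive Bayes variant from the feature values in `rows`.

    Hypothetical heuristic: binary values -> Bernoulli, non-negative
    integer counts -> Multinomial, anything else -> Gaussian.
    """
    values = [v for row in rows for v in row]
    if all(v in (0, 1) for v in values):
        return "Bernoulli"
    if all(isinstance(v, int) and v >= 0 for v in values):
        return "Multinomial"
    return "Gaussian"


# Made-up feature matrices for illustration
print(suggest_naive_bayes_variant([[1, 0, 1], [0, 0, 1]]))          # Bernoulli
print(suggest_naive_bayes_variant([[3, 0, 2], [1, 4, 0]]))          # Multinomial
print(suggest_naive_bayes_variant([[170.2, 65.0], [158.9, 52.3]]))  # Gaussian
```

In scikit-learn, the three variants correspond to the `GaussianNB`, `MultinomialNB`, and `BernoulliNB` classes in `sklearn.naive_bayes`.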

### Q6. Assignment:
    
### You have a dataset with two features, X1 and X2, and two possible classes, A and B. You want to use Naive Bayes to classify a new instance with features X1 = 3 and X2 = 4. The following table shows the frequency of each feature value for each class:
    
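| Class | X1=1 | X1=2 | X1=3 | X2=1 | X2=2 | X2=3 | X2=4 |
|-------|------|------|------|------|------|------|------|
| A     | 3    | 3    | 4    | 4    | 3    | 3    | 3    |
| B     | 2    | 2    | 1    | 2    | 2    | 2    | 3    |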

### Assuming equal prior probabilities for each class, which class would Naive Bayes predict the new instance to belong to?

Ans) To predict the class of a new instance using Naive Bayes, we calculate the posterior probability of each class given the feature values and select the class with the highest posterior.

The posteriors for the new instance with features X1 = 3 and X2 = 4 are computed below. Each likelihood is estimated by dividing a feature-value count by the total count for that same feature within the class. The X1 and X2 counts in the table do not sum to the same total per class (10 vs 13 for A, 5 vs 9 for B), so each feature is normalized by its own total rather than by a single class size.

For class A:

P(A|X1=3, X2=4) = {P(X1=3|A) * P(X2=4|A) * P(A)} / P(X1=3, X2=4)

P(X1=3|A) = Frequency of X1=3 for class A / Total X1 count for class A
            
            = 4 / (3 + 3 + 4) = 4 / 10

P(X2=4|A) = Frequency of X2=4 for class A / Total X2 count for class A
            
            = 3 / (4 + 3 + 3 + 3) = 3 / 13

P(A) = Prior probability of class A
       
       = 1/2 (assuming equal prior probabilities for each class)

The denominator P(X1=3, X2=4) is the same for both classes, so it can be dropped when comparing them. This leaves the unnormalized score:

P(A|X1=3, X2=4) ∝ (4 / 10) * (3 / 13) * (1/2)
                
                = 0.0462
                
Similarly, for class B:

P(B|X1=3, X2=4) = {P(X1=3|B) * P(X2=4|B) * P(B)} / P(X1=3, X2=4)

P(X1=3|B) = Frequency of X1=3 for class B / Total X1 count for class B
            
            = 1 / (2 + 2 + 1) = 1 / 5

P(X2=4|B) = Frequency of X2=4 for class B / Total X2 count for class B
            
            = 3 / (2 + 2 + 2 + 3) = 3 / 9

P(B) = Prior probability of class B
       
       = 1/2 (assuming equal prior probabilities for each class)

P(B|X1=3, X2=4) ∝ (1 / 5) * (3 / 9) * (1/2)
                
                = 0.0333
                
Finally, normalizing the two scores so that they sum to 1:

P(A|X1=3, X2=4) = 0.0462 / (0.0462 + 0.0333)
                
                = 0.58
                
                = 58%

P(B|X1=3, X2=4) = 0.0333 / (0.0462 + 0.0333)
                
                = 0.42
               
               = 42%
                
Since P(A|X1=3, X2=4) is higher than P(B|X1=3, X2=4), Naive Bayes would predict the new instance to belong to class A.
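
The calculation can be checked with a short script. It is a sketch under one stated assumption: each likelihood is a feature-value count divided by that feature's own total within the class, since the X1 and X2 counts in the table sum to different totals per class. The variable and function names are illustrative.

```python
from fractions import Fraction

# Frequency table from the question: counts[class][feature][value]
counts = {
    "A": {"X1": {1: 3, 2: 3, 3: 4}, "X2": {1: 4, 2: 3, 3: 3, 4: 3}},
    "B": {"X1": {1: 2, 2: 2, 3: 1}, "X2": {1: 2, 2: 2, 3: 2, 4: 3}},
}
priors = {"A": Fraction(1, 2), "B": Fraction(1, 2)}  # equal priors
instance = {"X1": 3, "X2": 4}


def unnormalized_score(cls):
    """Prior times the product of per-feature likelihoods for `instance`."""
    score = priors[cls]
    for feature, value in instance.items():
        feature_counts = counts[cls][feature]
        # Assumption: normalize by this feature's total count within the class
        score *= Fraction(feature_counts[value], sum(feature_counts.values()))
    return score


scores = {cls: unnormalized_score(cls) for cls in counts}
total = sum(scores.values())
posteriors = {cls: score / total for cls, score in scores.items()}
prediction = max(posteriors, key=posteriors.get)

for cls, p in posteriors.items():
    print(f"P({cls} | X1=3, X2=4) = {p} (about {float(p):.2f})")
print("Predicted class:", prediction)
```

Under this assumption the posteriors are 18/31 for class A and 13/31 for class B, so class A is predicted.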