
In [3]:
%pip install pgmpy



In [4]:
from pgmpy.models import BayesianNetwork, MarkovNetwork
from pgmpy.factors.discrete import TabularCPD, DiscreteFactor
from pgmpy.inference import VariableElimination, BeliefPropagation

In [5]:
# ---------------- Bayesian Network ---------------- #

# Step 1: Create the Bayesian Network
bayes_net = BayesianNetwork([
    ('ContainsKeywords', 'IsSpam'),
    ('KnownSender', 'IsSpam')
])

In [6]:
# Step 2: Define CPDs for Bayesian Network

# CPD for Contains Keywords (No = 0, Yes = 1)
cpd_keywords = TabularCPD(variable='ContainsKeywords', variable_card=2, values=[[0.7], [0.3]])

# CPD for Known Sender (No = 0, Yes = 1)
cpd_sender = TabularCPD(variable='KnownSender', variable_card=2, values=[[0.6], [0.4]])

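# CPD for Is Spam (No = 0, Yes = 1) given Contains Keywords and Known Sender.
# Row 0 holds P(IsSpam=0 | parents), row 1 holds P(IsSpam=1 | parents).
# Columns follow the evidence order, with KnownSender varying fastest:
# (Keywords=0, Sender=0), (Keywords=0, Sender=1), (Keywords=1, Sender=0), (Keywords=1, Sender=1)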
cpd_spam = TabularCPD(variable='IsSpam', variable_card=2,
                      values=[[0.9, 0.7, 0.6, 0.2], [0.1, 0.3, 0.4, 0.8]],
                      evidence=['ContainsKeywords', 'KnownSender'], evidence_card=[2, 2])

# Add CPDs to the Bayesian Network
bayes_net.add_cpds(cpd_keywords, cpd_sender, cpd_spam)
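
The column layout of `cpd_spam` is easy to misread, so here is a minimal pure-Python sketch of how the four columns map to parent states. It assumes pgmpy's convention that the last variable in `evidence` changes fastest; the names `spam_values`, `parent_states`, and `spam_given_parents` are illustrative and not part of pgmpy.

```python
from itertools import product

# Values copied from cpd_spam above: row 0 is IsSpam=0, row 1 is IsSpam=1.
spam_values = [[0.9, 0.7, 0.6, 0.2],
               [0.1, 0.3, 0.4, 0.8]]
evidence = ['ContainsKeywords', 'KnownSender']
evidence_card = [2, 2]

# Enumerate parent states so that the last evidence variable changes fastest
# (assumed to match the column order pgmpy uses).
parent_states = list(product(*(range(card) for card in evidence_card)))

spam_given_parents = {}
for col, states in enumerate(parent_states):
    column = [row[col] for row in spam_values]
    # Each column is a distribution over IsSpam, so it must sum to 1.
    assert abs(sum(column) - 1.0) < 1e-9
    spam_given_parents[states] = column[1]
    label = ", ".join(f"{name}={state}" for name, state in zip(evidence, states))
    print(f"P(IsSpam=1 | {label}) = {column[1]}")
```

Calling `bayes_net.check_model()` after `add_cpds` asks pgmpy to run its own consistency checks on the CPDs and the graph structure.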

In [7]:
# Step 3: Perform inference using Variable Elimination on the Bayesian Network
inference_bayes = VariableElimination(bayes_net)

# Example Query: What is the probability that an email is spam given it contains suspicious keywords but is from a known sender?
result_bayes = inference_bayes.query(variables=['IsSpam'], evidence={'ContainsKeywords': 1, 'KnownSender': 1})
print("\nBayesian Network - Probability of Spam given Contains Keywords and Known Sender:")
print(result_bayes)


Bayesian Network - Probability of Spam given Contains Keywords and Known Sender:
+-----------+---------------+
| IsSpam    |   phi(IsSpam) |
+===========+===============+
| IsSpam(0) |        0.2000 |
+-----------+---------------+
| IsSpam(1) |        0.8000 |
+-----------+---------------+
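
The result can be cross-checked without pgmpy by summing the joint distribution directly. This is a minimal sketch of inference by enumeration, not what `VariableElimination` does internally; the helper names `bn_joint` and `posterior_spam` are hypothetical, and the probabilities are copied from the CPDs defined earlier.

```python
from itertools import product

# Priors and conditional table copied from the CPDs above (index 0 = No, 1 = Yes).
p_keywords = [0.7, 0.3]
p_sender = [0.6, 0.4]
p_spam_given = {          # (ContainsKeywords, KnownSender) -> [P(IsSpam=0), P(IsSpam=1)]
    (0, 0): [0.9, 0.1],
    (0, 1): [0.7, 0.3],
    (1, 0): [0.6, 0.4],
    (1, 1): [0.2, 0.8],
}

def bn_joint(k, s, spam):
    """Joint probability from the chain rule of the Bayesian network."""
    return p_keywords[k] * p_sender[s] * p_spam_given[(k, s)][spam]

def posterior_spam(evidence):
    """Return [P(IsSpam=0 | evidence), P(IsSpam=1 | evidence)] by enumeration."""
    totals = [0.0, 0.0]
    for k, s, spam in product(range(2), repeat=3):
        # Skip assignments that contradict the observed evidence.
        if evidence.get('ContainsKeywords', k) != k:
            continue
        if evidence.get('KnownSender', s) != s:
            continue
        totals[spam] += bn_joint(k, s, spam)
    normalizer = sum(totals)
    return [t / normalizer for t in totals]

with_evidence = posterior_spam({'ContainsKeywords': 1, 'KnownSender': 1})
no_evidence = posterior_spam({})
print("P(IsSpam | Keywords=1, Sender=1):", [round(p, 4) for p in with_evidence])
print("P(IsSpam) with no evidence:      ", [round(p, 4) for p in no_evidence])
```

Because both parents of `IsSpam` are observed, the posterior is simply the last column of `cpd_spam`, which is why the table shows 0.2000 and 0.8000. With no evidence, the same enumeration gives a marginal spam probability of about 0.294.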


In [8]:
# ---------------- Markov Network ---------------- #

# Step 4: Create the Markov Network (Markov Random Field)
markov_net = MarkovNetwork()

# Define undirected edges
markov_net.add_edges_from([
    ('ContainsKeywords', 'KnownSender'),
    ('ContainsKeywords', 'IsSpam'),
    ('KnownSender', 'IsSpam')
])

In [9]:
# Step 5: Define potential functions (factors) for the Markov Network

# Factor between Contains Keywords and Known Sender
factor_keywords_sender = DiscreteFactor(variables=['ContainsKeywords', 'KnownSender'], cardinality=[2, 2],
                                        values=[0.8, 0.2, 0.5, 0.5])

# Factor between Contains Keywords and Is Spam
factor_keywords_spam = DiscreteFactor(variables=['ContainsKeywords', 'IsSpam'], cardinality=[2, 2],
                                      values=[0.9, 0.1, 0.6, 0.4])

# Factor between Known Sender and Is Spam
factor_sender_spam = DiscreteFactor(variables=['KnownSender', 'IsSpam'], cardinality=[2, 2],
                                    values=[0.7, 0.3, 0.4, 0.6])

# Add the factors to the Markov Network
markov_net.add_factors(factor_keywords_sender, factor_keywords_spam, factor_sender_spam)
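
Unlike CPDs, these factors are unnormalized potentials: the joint distribution is their product divided by a partition function Z. The sketch below computes Z and the marginal of `IsSpam` by brute-force enumeration in pure Python. It assumes the flat `values` lists are read in row-major order (first variable slowest, last variable fastest); `lookup`, `unnormalized`, and `brute_force_marginal` are illustrative names, not pgmpy functions.

```python
from itertools import product

# Flat values copied from the three DiscreteFactor definitions above.
phi_keywords_sender = [0.8, 0.2, 0.5, 0.5]   # (ContainsKeywords, KnownSender)
phi_keywords_spam = [0.9, 0.1, 0.6, 0.4]     # (ContainsKeywords, IsSpam)
phi_sender_spam = [0.7, 0.3, 0.4, 0.6]       # (KnownSender, IsSpam)

def lookup(values, first, second):
    """Index a flat 2x2 table in row-major order (second variable fastest)."""
    return values[2 * first + second]

def unnormalized(k, s, spam):
    """Product of all three pairwise potentials for one full assignment."""
    return (lookup(phi_keywords_sender, k, s)
            * lookup(phi_keywords_spam, k, spam)
            * lookup(phi_sender_spam, s, spam))

assignments = list(product(range(2), repeat=3))

# Partition function: the sum of the unnormalized measure over all assignments.
Z = sum(unnormalized(k, s, spam) for k, s, spam in assignments)

# Marginal of IsSpam: sum out the other two variables, then divide by Z.
brute_force_marginal = [
    sum(unnormalized(k, s, spam) for k, s, spam in assignments if spam == value) / Z
    for value in range(2)
]
print("Z =", round(Z, 4))
print("P(IsSpam) =", [round(p, 4) for p in brute_force_marginal])
```

With only three binary variables there are eight assignments, so enumeration is cheap here; belief propagation becomes worthwhile on larger networks where summing over every assignment is infeasible.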

In [10]:
# Step 6: Perform inference using Belief Propagation on the Markov Network
belief_propagation = BeliefPropagation(markov_net)

# Example Query: Marginal distribution for Is Spam
marginal_spam = belief_propagation.query(variables=['IsSpam'])
print("\nMarkov Network - Marginal Probability of Spam:")
print(marginal_spam)

# Example Query: Joint distribution between Contains Keywords and Known Sender
joint_keywords_sender = belief_propagation.query(variables=['ContainsKeywords', 'KnownSender'])
print("\nMarkov Network - Joint Probability of Contains Keywords and Known Sender:")
print(joint_keywords_sender)


Markov Network - Marginal Probability of Spam:
+-----------+---------------+
| IsSpam    |   phi(IsSpam) |
+===========+===============+
| IsSpam(0) |        0.8075 |
+-----------+---------------+
| IsSpam(1) |        0.1925 |
+-----------+---------------+

Markov Network - Joint Probability of Contains Keywords and Known Sender:
+---------------------+----------------+-------------------------------------+
| ContainsKeywords    | KnownSender    |   phi(ContainsKeywords,KnownSender) |
+=====================+================+=====================================+
| ContainsKeywords(0) | KnownSender(0) |                              0.4706 |
+---------------------+----------------+-------------------------------------+
| ContainsKeywords(0) | KnownSender(1) |                              0.0749 |
+---------------------+----------------+-------------------------------------+
| ContainsKeywords(1) | KnownSender(0) |                              0.2406 |
+---------------------+----------------+-------------------------------------+
| ContainsKeywords(1) | KnownSender(1) |                         