# Assignment: Probability, Bayesian Probability, and Gradient Descent
_Group 6_

### Group Members
- David Akintayo: Probability Distributions  
- Cynthia Mutie: Bayesian Probability  
- Sougnabe Payang: Manual Gradient Descent  
- Elvis Kayonga: Linear Regression with SciPy 

## Part 0 — Data Setup
Load and preview the Iris dataset. Part 2 loads a separate IMDb dataset.


In [2]:
import pandas as pd

# Name the columns so we understand the data
col_names = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species"]

# Download the data from the internet and load it
df = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data",
    header=None, names=col_names
)

# Show the first 5 rows
df.head()


|   | sepal_length | sepal_width | petal_length | petal_width | species |
|---|---|---|---|---|---|
| 0 | 5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa |
| 1 | 4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa |
| 2 | 4.7 | 3.2 | 1.3 | 0.2 | Iris-setosa |
| 3 | 4.6 | 3.1 | 1.5 | 0.2 | Iris-setosa |
| 4 | 5.0 | 3.6 | 1.4 | 0.2 | Iris-setosa |


## Part 1 — Probability Distributions (David Akintayo)

Implement and compare key **probability distributions** using the Iris dataset.  
Tasks:
- Compute and visualize probability distributions (e.g., Normal, Uniform, Exponential).  
- Plot histograms and fit curves for each feature.  
- Analyze which distribution best fits each variable.
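
One way to approach the last task is to fit each candidate distribution by maximum likelihood and compare the resulting log-likelihoods. The sketch below is only an illustration under stated assumptions: `log_likelihoods` is a hypothetical helper, the `sample` values are made up and are not taken from the Iris file, and log-likelihood is just one possible comparison criterion.

```python
import math
import statistics


def log_likelihoods(values):
    """Return the maximized log-likelihood of three candidate distributions."""
    n = len(values)
    mean = statistics.fmean(values)
    # The maximum-likelihood estimate of the normal std divides by n, not n - 1
    std = statistics.pstdev(values)
    lo, hi = min(values), max(values)

    # Normal(mean, std): sum of the log density over all points
    normal = sum(
        -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)
        for x in values
    )
    # Uniform(lo, hi): every point has the same density 1 / (hi - lo)
    uniform = -n * math.log(hi - lo)
    # Exponential(rate = 1 / mean): only meaningful for non-negative data
    rate = 1.0 / mean
    exponential = sum(math.log(rate) - rate * x for x in values)

    return {"normal": normal, "uniform": uniform, "exponential": exponential}


# Hypothetical sample standing in for one feature column (not real Iris values)
sample = [4.5, 5.4, 5.7, 5.8, 5.9, 6.0, 6.0, 6.1, 6.2, 6.3, 6.6, 7.5]

scores = log_likelihoods(sample)
best = max(scores, key=scores.get)  # highest log-likelihood
print(scores)
print("Best fit:", best)
```

Because the uniform endpoints are fitted to the sample minimum and maximum, this comparison can favor the uniform distribution on small or flat samples, so it is better treated as a rough screen alongside the histograms than as a final verdict.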


## Part 2 — Bayesian Probability (Cynthia Mutie)
Dataset: IMDb Movie Reviews (50k) — file in repo: `data/IMDB Dataset.csv`

Objective: For chosen keywords compute:
- Prior: P(Positive)
- Likelihood: P(keyword | Positive)
- Marginal: P(keyword)
- Posterior: P(Positive | keyword)

We compute P(Positive | keyword) with plain Python and pandas; NumPy is used only for its `nan` value.
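
To make the four quantities concrete before touching the full dataset, here is a minimal sketch on six made-up reviews. The `reviews` list and the helpers `has_word` and `bayes_terms` are hypothetical and exist only for this illustration; they are not part of the IMDb file or the assignment code.

```python
import re

# Hypothetical toy reviews (not taken from the IMDb dataset)
reviews = [
    ("a great and moving film", "positive"),
    ("great cast, great script", "positive"),
    ("quietly wonderful", "positive"),
    ("great idea, bad execution", "negative"),
    ("bad and boring", "negative"),
    ("simply awful", "negative"),
]


def has_word(text, keyword):
    """True if keyword appears as a whole word in text (case-insensitive)."""
    return keyword in re.findall(r"[a-z']+", text.lower())


def bayes_terms(keyword, data):
    """Return prior, likelihood, marginal, and posterior for one keyword."""
    positives = [text for text, label in data if label == "positive"]
    prior = len(positives) / len(data)  # P(Positive)
    likelihood = sum(has_word(t, keyword) for t in positives) / len(positives)
    marginal = sum(has_word(t, keyword) for t, _ in data) / len(data)
    # Bayes' theorem; undefined when the keyword never occurs
    posterior = likelihood * prior / marginal if marginal > 0 else float("nan")
    return {
        "prior": prior,
        "likelihood": likelihood,
        "marginal": marginal,
        "posterior": posterior,
    }


terms = bayes_terms("great", reviews)
print(terms)
```

In this toy data "great" appears in three reviews, two of them positive, so the posterior equals the direct count 2/3. That agreement is a useful sanity check for the pandas version as well.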


In [6]:
import pandas as pd

DATA_PATH = r"../data/IMDB Dataset.csv"  # go up one folder
df = pd.read_csv(DATA_PATH)

print("✅ Dataset loaded successfully!")
print("Number of rows:", len(df))
df.head()


✅ Dataset loaded successfully!
Number of rows: 50000


|   | review | sentiment |
|---|---|---|
| 0 | One of the other reviewers has mentioned that ... | positive |
| 1 | A wonderful little production. `<br /><br />`The... | positive |
| 2 | I thought this was a wonderful way to spend ti... | positive |
| 3 | Basically there's a family where a little boy ... | negative |
| 4 | Petter Mattei's "Love in the Time of Money" is... | positive |


In [2]:
# Compute Bayesian probabilities
# Run the loading cell above first: this cell needs `df` with the IMDb reviews.

import re

import numpy as np
import pandas as pd

# Lowercase the reviews so keyword matching is case-insensitive
df['review_clean'] = df['review'].str.lower()

# Chosen keywords
positive_keywords = ['great', 'excellent', 'amazing', 'wonderful']
negative_keywords = ['bad', 'boring', 'awful', 'worst']

# Prior P(Positive): share of reviews labelled positive
is_positive = df['sentiment'] == 'positive'
p_positive = is_positive.mean()
print("Prior P(Positive) =", round(p_positive, 3))

results = []

# For each keyword, compute P(Positive | keyword)
for word in positive_keywords + negative_keywords:
    # Match whole words only, so 'bad' does not also count 'badly'
    has_word = df['review_clean'].str.contains(rf"\b{re.escape(word)}\b", regex=True)
    p_word = has_word.mean()  # marginal P(keyword)
    p_word_given_pos = has_word[is_positive].mean()  # likelihood P(keyword | Positive)
    posterior = (p_word_given_pos * p_positive) / p_word if p_word > 0 else np.nan  # Bayes' theorem
    results.append({
        'Keyword': word,
        'P(Positive)': round(p_positive, 3),
        'P(keyword|Positive)': round(p_word_given_pos, 3),
        'P(keyword)': round(p_word, 3),
        'P(Positive|keyword)': round(posterior, 3)
    })

pd.DataFrame(results)


In [1]:
# Implement Bayes' theorem as a function
# Run the two cells above first: this cell needs `df`, `p_positive`, and the keyword lists.
import re

def bayes_posterior(p_positive, p_word_given_pos, p_word):
    """Compute P(Positive | keyword) using Bayes' theorem."""
    if p_word == 0:
        return float('nan')  # keyword never occurs, so the posterior is undefined
    return (p_word_given_pos * p_positive) / p_word

# Compute and print one posterior per keyword
for word in positive_keywords + negative_keywords:
    # Match whole words only, so 'bad' does not also count 'badly'
    has_word = df['review_clean'].str.contains(rf"\b{re.escape(word)}\b", regex=True)
    p_word = has_word.mean()  # marginal P(keyword)
    p_word_given_pos = has_word[df['sentiment'] == 'positive'].mean()  # likelihood
    posterior = bayes_posterior(p_positive, p_word_given_pos, p_word)
    print(f"{word:10s} → P(Positive|{word}) = {posterior:.3f}")


## Part 3 — Manual Gradient Descent (Sougnabe Payang)

Manually calculate the **gradient descent steps** for a simple cost function:  
\[
J(m, b) = \frac{1}{n}\sum_{i=1}^n (y_i - (mx_i + b))^2
\]
Tasks:
- Derive partial derivatives of \(J\) with respect to \(m\) and \(b\).  
- Implement manual gradient descent using loops.
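
Applying the chain rule to \(J\) gives the two partial derivatives:
\[
\frac{\partial J}{\partial m} = -\frac{2}{n}\sum_{i=1}^n x_i\,(y_i - (mx_i + b)), \qquad
\frac{\partial J}{\partial b} = -\frac{2}{n}\sum_{i=1}^n (y_i - (mx_i + b))
\]
The loop below is a minimal sketch of how these could be used. The function name `gradient_descent`, the learning rate, the step count, and the four data points (generated from \(y = 2x + 1\)) are all assumptions chosen for illustration, not values prescribed by the assignment.

```python
def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Fit y = m*x + b by minimizing the mean squared error with plain loops."""
    m, b = 0.0, 0.0  # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        grad_m = 0.0
        grad_b = 0.0
        # Accumulate both partial derivatives over the data points
        for x, y in zip(xs, ys):
            error = y - (m * x + b)
            grad_m += -2.0 * x * error / n
            grad_b += -2.0 * error / n
        # Step against the gradient
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b


# Hypothetical data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

m, b = gradient_descent(xs, ys)
print(f"m = {m:.4f}, b = {b:.4f}")
```

The learning rate matters: for this data a value much above 0.1 makes the updates diverge, while a very small value needs many more steps to reach the same accuracy.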


## Part 4 — Gradient Descent with SciPy (Elvis Kayonga)

Use the **SciPy library** to fit the same linear regression with its optimization tools.  
Compare the fitted slope and intercept with the manual gradient descent results from Part 3.
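
A minimal sketch of the SciPy side, assuming the same mean squared error cost as Part 3 and the same hypothetical four-point dataset (generated from y = 2x + 1). It uses `scipy.optimize.minimize`, which defaults to the BFGS quasi-Newton method for unconstrained problems, so it is not literally gradient descent. `scipy.stats.linregress` is added as a closed-form cross-check.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import linregress

# Hypothetical data lying exactly on y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])


def cost(params):
    """Mean squared error J(m, b) for the line y = m*x + b."""
    m, b = params
    return np.mean((y - (m * x + b)) ** 2)


# Numerical minimization starting from m = 0, b = 0
result = minimize(cost, x0=[0.0, 0.0])
m_fit, b_fit = result.x

# Closed-form least-squares fit for comparison
reference = linregress(x, y)

print(f"minimize:   m = {m_fit:.4f}, b = {b_fit:.4f}")
print(f"linregress: m = {reference.slope:.4f}, b = {reference.intercept:.4f}")
```

Both approaches should agree with the manual loop up to numerical tolerance; any larger gap usually points to a learning rate or step count problem in the manual version.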
