### **(I)** **Deployment**

 - To make the Mpox Instagram NLP models accessible and interactive, we deployed them with Gradio, a Python library for building web-based interfaces to machine learning models.

 - This deployment serves two main purposes:

   - Enable real-time predictions for:

     - **Sentiment Analysis (via XGBoost)**

     - **Hate Speech Detection (via LightGBM)**

   - Provide model transparency through LIME explainability, allowing users to see which words most influenced each prediction.

**Key Features of the App**

 - Users input an Instagram post related to Mpox.

 - They choose between:

   - Sentiment Classification: Predicts emotional tone (e.g., fear, joy, sadness).

   - Hate Speech Detection: Classifies whether the post is hateful or not.

 - The system:

   - Cleans the text using NLP preprocessing.

   - Passes it through a trained model pipeline (TF-IDF + classifier).

   - Shows the **predicted class**, **class probabilities**, and a **LIME visualization** of the most influential words.
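
To give a feel for what "most influential words" means, here is a minimal sketch of word-level influence scoring. It is not LIME itself: LIME samples many random word removals and fits a local linear surrogate model, whereas this sketch uses the simpler leave-one-word-out idea. The scoring function `toy_predict_proba` and its word list are hypothetical stand-ins for a trained model's `predict_proba`.

```python
def toy_predict_proba(text):
    """Hypothetical stand-in for a model: probability that a post is 'negative'."""
    negative_words = {"scared", "afraid", "deadly"}
    hits = sum(word in negative_words for word in text.split())
    return min(1.0, 0.1 + 0.4 * hits)


def word_influence(text, predict):
    """Score each word by how much the prediction drops when it is removed.

    Assumes each word occurs once; repeated words would share one entry.
    """
    words = text.split()
    base = predict(text)
    scores = {}
    for i, word in enumerate(words):
        reduced = " ".join(words[:i] + words[i + 1:])
        scores[word] = base - predict(reduced)
    return scores


scores = word_influence("people scared of deadly mpox outbreak", toy_predict_proba)
# Words with the largest positive score pushed the prediction up the most
top_words = sorted(scores, key=scores.get, reverse=True)[:2]
print(top_words)
```

Words with a score of zero had no effect on this toy prediction, while the highest-scoring words are the ones an explanation view would highlight.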

In [None]:
#%pip install gradio

In [None]:
#%pip install markupsafe==2.0.1

In [None]:
#%pip install --upgrade click

In [None]:
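import re  # Regular expressions for text cleaning
import csv
import joblib
import numpy as np
import gradio as gr
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from lime.lime_text import LimeTextExplainer

# Flagging a result failed with "field larger than field limit (1000000)" when
# the limit was 10**6, most likely because the LIME HTML output is larger than
# that. Raise the csv field limit to the largest value that is also safe on
# Windows, where the limit must fit in a signed 32-bit C long.
csv.field_size_limit(2**31 - 1)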

# Download NLTK resources
nltk.download('stopwords')
nltk.download('wordnet')

# Load saved models and label encoder
sentiment_model = joblib.load("xgb_sentiment_pipeline.pkl")  # XGBoost pipeline
hate_model = joblib.load("lightgbm_hate_speech_model.pkl")   # LightGBM pipeline
label_encoder = joblib.load("label_encoder.pkl")              # LabelEncoder for sentiment labels

# Text cleaning function
stop_words = set(stopwords.words('english'))
lemmatizer = WordNetLemmatizer()

def clean_text(text):
    text = text.lower()
    text = re.sub(r"http\S+|www\S+", "", text)
    text = re.sub(r"[^a-z\s]", "", text)
    words = text.split()
    cleaned = [lemmatizer.lemmatize(w) for w in words if w not in stop_words]
    return " ".join(cleaned)

# Prediction + LIME explainability function
def classify_text(text, task):
    """Return (markdown summary, class probabilities, LIME HTML) for a post."""
    cleaned = clean_text(text)

    # Nothing left to classify (empty input, or only stop words and symbols)
    if not cleaned:
        return "Please enter a post containing some text.", {}, ""

    if task == "Sentiment Analysis":
        pred_encoded = sentiment_model.predict([cleaned])[0]
        pred_label = label_encoder.inverse_transform([pred_encoded])[0]
        probs = sentiment_model.predict_proba([cleaned])[0]
        class_labels = label_encoder.inverse_transform(np.arange(len(probs)))
        # gr.Label expects string keys mapped to float confidences
        class_probs = {str(label): float(prob) for label, prob in zip(class_labels, probs)}

        # LIME explanation for the most probable class (top_labels=1);
        # by default LIME explains class index 1 only, whatever was predicted
        explainer = LimeTextExplainer(class_names=[str(c) for c in class_labels])
        exp = explainer.explain_instance(
            cleaned, sentiment_model.predict_proba, num_features=6, top_labels=1
        )
        html_explanation = exp.as_html()

        return f"Predicted Sentiment: **{pred_label}**", class_probs, html_explanation

    elif task == "Hate Speech Detection":
        pred = hate_model.predict([cleaned])[0]
        probs = hate_model.predict_proba([cleaned])[0]
        class_probs = {str(label): float(prob) for label, prob in zip(hate_model.classes_, probs)}

        # LIME explanation (binary task, so the default label index 1 is used)
        explainer = LimeTextExplainer(class_names=[str(c) for c in hate_model.classes_])
        exp = explainer.explain_instance(cleaned, hate_model.predict_proba, num_features=6)
        html_explanation = exp.as_html()

        return f"Predicted Class: **{pred}**", class_probs, html_explanation

    else:
        return "Invalid task selected", {}, ""

# Gradio Interface
interface = gr.Interface(
    fn=classify_text,
    inputs=[
        gr.Textbox(lines=4, placeholder="Enter a post...", label="Post"),
        gr.Radio(choices=["Sentiment Analysis", "Hate Speech Detection"], label="Select Task")
    ],
    outputs=[
        gr.Markdown(label="Prediction"),
        gr.Label(label="Class Probabilities"),
        gr.HTML(label="LIME Explanation")
    ],
    title="Mpox Instagram NLP Analyzer",
    description="Classify Mpox-related posts for sentiment or hate speech. View predictions and LIME-based word importance explanations.",
)

interface.launch(share=True)


[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\JUDAH\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\JUDAH\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


Running on local URL:  http://127.0.0.1:7861
Running on public URL: https://de934a331bba2f1fda.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)




[WinError 2] The system cannot find the file specified
  File "c:\Users\JUDAH\anaconda3\envs\learn-env\lib\site-packages\joblib\externals\loky\backend\context.py", line 257, in _count_physical_cores
    cpu_info = subprocess.run(
  File "c:\Users\JUDAH\anaconda3\envs\learn-env\lib\subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "c:\Users\JUDAH\anaconda3\envs\learn-env\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "c:\Users\JUDAH\anaconda3\envs\learn-env\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,


Error while flagging: field larger than field limit (1000000)
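
The flagging error above carries the message that Python's `csv` module raises when a single field is longer than the configured limit. The following is a minimal reproduction that does not involve Gradio, assuming the oversized field is something like a large LIME HTML explanation. The payload and the helper `can_read` are hypothetical and serve only to illustrate the limit.

```python
import csv
import io

# One CSV field slightly longer than the 10**6 limit reported in the error,
# standing in for a large HTML explanation (hypothetical payload)
big_field = "x" * (10**6 + 1)
buffer = io.StringIO()
csv.writer(buffer).writerow(["some post", big_field])


def can_read(limit):
    """Return True if the row can be read back under the given field limit."""
    previous = csv.field_size_limit(limit)
    try:
        buffer.seek(0)
        list(csv.reader(buffer))
        return True
    except csv.Error:
        return False
    finally:
        csv.field_size_limit(previous)  # restore the earlier limit


small_ok = can_read(10**6)       # limit from the error message
large_ok = can_read(2**31 - 1)   # largest limit that also works on Windows
print(small_ok, large_ok)
```

Writing the row succeeds in both cases, since the limit only applies when reading. This would explain why the error appears at flagging time and not during prediction.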
