
---

# 🛡️ AI-Based SecureSense – SMS Spam & APK Malware Detection System

This project provides:

* 📩 **SMS Spam Detection using NLP**
* 📱 **APK Malware Detection using Static Features**
* 🤖 **Machine Learning Models (Logistic Regression + Random Forest)**
* 📊 **Probability-Based Risk Meter**
* 🌍 **Public Web Deployment using Flask + ngrok**

### ✅ Supported Security Modules:

* SMS Spam vs Ham Classification
* APK Safe vs Malware Detection (Demo Static Features)

---

## 📘 1 — Project Introduction

This notebook performs:

1. Dependency Installation
2. Dataset Loading from Google Drive
3. SMS Spam Model Training
4. APK Malware Model Training
5. Model Saving
6. Flask Web App Creation
7. Public Deployment via ngrok

---

## 📘 2 — Install All Dependencies

# ===============================

# ✅ CELL 1: Install All Dependencies

# ===============================





In [None]:
!pip install -q scikit-learn pandas numpy flask pyngrok joblib


---

## 📘 3 — Mount Google Drive & Load SMS Dataset

This step:

* Mounts Google Drive
* Loads SMS Spam Dataset
* Cleans unwanted columns

# ===============================

# ✅  Mount Drive & Load Dataset

# ===============================

---

## 📘 4 — Feature Engineering (TF-IDF)

This step:

* Encodes labels
* Extracts TF-IDF features
* Splits train/test data
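
To make the TF-IDF step concrete, here is a minimal pure-Python sketch of the weighting idea on a toy corpus. It uses the plain `tf × log(N / df)` form with whitespace tokenization; scikit-learn's `TfidfVectorizer` applies IDF smoothing and L2 normalization by default, so its numbers will differ. The function name `toy_tfidf` and the sample messages are illustrative assumptions, not part of the notebook.

```python
import math
from collections import Counter


def toy_tfidf(docs):
    """Return one {token: weight} dict per document using tf * log(N / df)."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    weights = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        weights.append({
            tok: (count / total) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return weights


# Hypothetical sample messages
messages = ["win a free prize now", "free lunch today", "see you at lunch"]
for message, vector in zip(messages, toy_tfidf(messages)):
    print(message, "->", vector)
```

Tokens that appear in fewer messages (such as `win`) receive a higher weight than tokens shared across messages (such as `free`), which is what lets the classifier focus on distinctive words.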

# ===============================

# ✅  Text Feature Engineering

# ===============================

---

## 📘 5 — Train SMS Spam Detection Model

# ===============================

# ✅  Train SMS Model

# ===============================


---

✅ **The Kaggle “Spam Classifier” dataset can be used instead of a file on Google Drive**, which makes the notebook cleaner and easier to reproduce.

### 📦 Dataset Link

👉 **[https://www.kaggle.com/datasets/huebitsvizg/spam-classifier](https://www.kaggle.com/datasets/huebitsvizg/spam-classifier)**

This means you will **remove the Google Drive mounting / local file dependency** and **load the dataset directly from Kaggle** inside Colab (or your environment).

---

## ✅ WHAT TO REPLACE (Your Old Code ❌)

Remove this part:

```python
from google.colab import drive
drive.mount('/content/drive')

data_csv = '/content/drive/MyDrive/Sasi Projects/spam.csv'
df = pd.read_csv(data_csv, encoding='latin1')
```

---

## ✅ Kaggle Dataset Setup

### 📘 New Notebook Cell — Install Kaggle API

```python
# ===============================
# ✅ CELL: Install Kaggle API
# ===============================
!pip install -q kaggle
```

---

### 📘 Upload Kaggle API Key (ONE-TIME STEP)

1. Go to 👉 **[https://www.kaggle.com/settings](https://www.kaggle.com/settings)**
2. Scroll to **API** → Click **Create New Token** (downloads `kaggle.json`)
3. Upload it in Colab:

```python
from google.colab import files
files.upload()
```

---

### 📘 Configure Kaggle & Download Dataset

```python
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json
```

```python
# ✅ Download Spam Classifier Dataset
!kaggle datasets download -d huebitsvizg/spam-classifier
```

---

### 📘 Extract CSV & Load into DataFrame

```python
!unzip -o -q spam-classifier.zip  # -o overwrites without prompting when the cell is re-run
import pandas as pd

df = pd.read_csv("spam.csv", encoding='latin1')
print("✅ Dataset loaded from Kaggle. Shape:", df.shape)
df.head()
```
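
The loading cell above assumes the archive contains a file named `spam.csv`. If that assumption does not hold, a small helper like the following sketch can list the CSV files inside the downloaded zip before calling `pd.read_csv`. The helper name `list_csv_members` is hypothetical; it uses only the standard-library `zipfile` module.

```python
import zipfile


def list_csv_members(zip_path):
    """Return the names of all .csv files inside a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith(".csv")
        ]


# Example usage (assumes the Kaggle download step has already run):
# print(list_csv_members("spam-classifier.zip"))
```

Whatever name this returns can then be passed to `pd.read_csv` in place of the hard-coded `"spam.csv"`.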

---

✅ After this, you can proceed with:

* Cleaning columns
* Feature extraction (TF-IDF)
* Train / test split
* Model training
* Model saving
* Flask deployment

All **without using Google Drive or manual uploads**.

---

## ✅ BENEFITS OF USING THIS METHOD

| Old Method                        | New Method                             |
| --------------------------------- | -------------------------------------- |
| Manual file upload via Drive      | ✅ Automatic Kaggle download            |
| Risk of missing or outdated files | ✅ Clean, versioned dataset from Kaggle |
| Drive dependency                  | ✅ Fully reproducible anywhere          |
| Manual path handling              | ✅ Simple path — `"spam.csv"`           |

---


In [None]:
# === SecureSense SMS Spam Classifier ===
import pandas as pd, joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Load dataset
from google.colab import drive
drive.mount('/content/drive')

data_csv = '/content/drive/MyDrive/Sasi Projects/spam.csv'
df = pd.read_csv(data_csv, encoding='latin1')

# Keep only the label and text columns (drops any empty 'Unnamed: N' columns)
# and rename them unconditionally so 'label' and 'text' always exist below
df = df[['v1', 'v2']]
df.columns = ['label', 'text']

# Prepare features
X = df['text']; y = (df['label'].str.lower() == 'spam').astype(int)

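# Split first, then fit TF-IDF on the training text only (avoids test-set leakage)
Xtr_txt, Xte_txt, ytr, yte = train_test_split(X, y, test_size=0.2, random_state=42, stratify=y)
vec = TfidfVectorizer(max_features=5000, ngram_range=(1,2))
Xtr = vec.fit_transform(Xtr_txt)
Xte = vec.transform(Xte_txt)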

# Train
clf = LogisticRegression(max_iter=1000)
clf.fit(Xtr, ytr)

print("📊 Performance:\n", classification_report(yte, clf.predict(Xte)))

# Save models
joblib.dump(vec, 'sms_vec.joblib')
joblib.dump(clf, 'sms_clf.joblib')
print("✅ Models saved successfully.")


---

## 📘 6 — Train APK Malware Detection Model (Demo Data)

This step:

* Generates synthetic permission features
* Trains Random Forest model
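
As a rough illustration of what a 15-value static feature vector could look like, the sketch below maps an app's requested permissions to 0/1 flags. The notebook only says "15 permissions", so the specific permissions, their order, and the helper names `encode_permissions` and `to_form_string` are all assumptions made for this example.

```python
# Hypothetical feature order: the notebook does not name its 15 permissions.
PERMISSIONS = [
    "android.permission.SEND_SMS",
    "android.permission.READ_SMS",
    "android.permission.RECEIVE_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
    "android.permission.ACCESS_FINE_LOCATION",
    "android.permission.RECORD_AUDIO",
    "android.permission.CAMERA",
    "android.permission.READ_PHONE_STATE",
    "android.permission.WRITE_EXTERNAL_STORAGE",
    "android.permission.CALL_PHONE",
    "android.permission.RECEIVE_BOOT_COMPLETED",
    "android.permission.SYSTEM_ALERT_WINDOW",
    "android.permission.INSTALL_PACKAGES",
    "android.permission.READ_CALL_LOG",
]


def encode_permissions(requested):
    """Return a 15-element 0/1 list: 1 where the app requests that permission."""
    requested = set(requested)
    return [1 if perm in requested else 0 for perm in PERMISSIONS]


def to_form_string(vector):
    """Format a feature vector as comma-separated text, e.g. '1,0,0,...'."""
    return ",".join(str(v) for v in vector)


vector = encode_permissions([
    "android.permission.SEND_SMS",
    "android.permission.INTERNET",
])
print(to_form_string(vector))
```

A vector built this way has the same shape as one row of the synthetic training matrix, so it could be passed to the model once real labelled data replaces the demo data.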

# ===============================

# ✅ CELL 5: Train APK Model

# ===============================



In [None]:
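import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic permission features (seeded for reproducibility).
# Labels are random, so this model is a placeholder with no real predictive value.
rng = np.random.default_rng(42)
X = rng.integers(0, 2, (200, 15))  # 200 samples × 15 permissions
y = rng.integers(0, 2, 200)        # 0=safe, 1=malware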

apk_clf = RandomForestClassifier(n_estimators=100, random_state=42)
apk_clf.fit(X, y)
joblib.dump(apk_clf, 'apk_rf.joblib')
print("✅ APK Static Model saved (demo data).")



---

## 📘 7 — Authenticate ngrok

This step:

* Authenticates ngrok with your account
* Enables secure public HTTPS access
* Prepares the system for live deployment

# ===============================

# ✅ CELL 6: Authenticate ngrok

# ===============================

---

## 🌐 Ngrok Setup (Public Deployment)

Ngrok provides a **secure public HTTPS link** to your locally running Flask application.

🔐 **For security reasons, your ngrok token should NOT be shared publicly.**
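
One way to keep the token out of the notebook is to read it from an environment variable at runtime. The sketch below assumes the token is stored in a variable named `NGROK_AUTHTOKEN`; that name and the helper `get_ngrok_token` are assumptions for this example, so adjust them to your setup.

```python
import os


def get_ngrok_token(env=os.environ, var_name="NGROK_AUTHTOKEN"):
    """Return the ngrok token from the environment, or raise if it is missing."""
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {var_name} environment variable before starting ngrok."
        )
    return token


# Example usage in the notebook (requires pyngrok to be installed):
# from pyngrok import ngrok
# ngrok.set_auth_token(get_ngrok_token())
```

With this approach the notebook can be shared or pushed to GitHub without the token appearing in any cell.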

### ✅ To Use Ngrok, Follow These Steps:

### 📌 Step 1 — Get Your Auth Token

Go to this link and copy your personal token:
👉 **[https://dashboard.ngrok.com/get-started/your-authtoken](https://dashboard.ngrok.com/get-started/your-authtoken)**

---

### 📌 Step 2 — Add Token Inside Notebook

Paste your token in the following line:

```python
#from pyngrok import ngrok, conf

#conf.get_default().auth_token = "YOUR_NGROK_TOKEN_HERE"
```

---

### 📌 Step 3 — Start Ngrok Tunnel

```python
#public_url = ngrok.connect(5000)  # port must match the one passed to app.run()
#print("🌍 Public URL:", public_url)
```

✅ After running this, a **shareable public link** will appear here.
You can open it in your browser and access your Flask app from **anywhere in the world** 🌎

---

### ✅ Summary

✔ Secure HTTPS URL

✔ No port forwarding required

✔ Works on Google Colab

✔ Perfect for project demos, reviews, and viva

---


In [None]:
from pyngrok import ngrok
ngrok.set_auth_token("PASTE_YOUR_NGROK_TOKEN_HERE")
print("🔐 Ngrok token authenticated!")



---

## 📘 8 — Load Models for Deployment

# ===============================

#  Load Saved Models

# ===============================

---

## 📘 9 — Create Flask Web Application

This step builds:

* SMS Spam Prediction API
* APK Malware Prediction API
* Risk Level Meter
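
The risk meter boils down to mapping a predicted probability to a colour class and a bar width. A minimal standalone sketch of that mapping follows, using the same 0.5 and 0.8 thresholds as the app; the function name `risk_meter` is hypothetical, and it rounds the width where the app truncates with `int()`.

```python
def risk_meter(prob, medium=0.5, high=0.8):
    """Map a probability in [0, 1] to a CSS class and a bar width in percent."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must be between 0 and 1")
    if prob > high:
        level = "high"
    elif prob > medium:
        level = "medium"
    else:
        level = "low"
    return {"color": level, "width": round(prob * 100)}


for p in (0.1, 0.6, 0.9):
    print(p, risk_meter(p))
```

Keeping this logic in one function would let both the SMS and APK branches share it instead of repeating the threshold expression.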

# ===============================

# ✅ Flask App

# ===============================

---


In [None]:
from flask import Flask, request, render_template_string
import joblib
import numpy as np  # used below to build the APK feature array

# Load models
sms_vec = joblib.load('sms_vec.joblib')
sms_clf = joblib.load('sms_clf.joblib')
apk_clf = joblib.load('apk_rf.joblib')

# === HTML Template ===
HTML = """
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>SecureSense: ML Security Suite</title>
<style>
body {
    font-family: 'Segoe UI', sans-serif;
    background: linear-gradient(135deg,#1a1a2e,#16213e);
    color: #fff;
    display: flex; justify-content: center; align-items: center;
    height: 100vh; margin: 0;
}
.container {
    background: #0f3460;
    padding: 40px; border-radius: 20px;
    width: 420px; box-shadow: 0 0 25px rgba(0,0,0,0.4);
}
h1 { text-align: center; color: #00adb5; margin-bottom: 20px; }
textarea, input[type=text] {
    width: 100%; padding: 10px; border-radius: 8px; border: none;
    margin-top: 8px; margin-bottom: 12px;
}
button {
    background: #00adb5; color: white; border: none; border-radius: 8px;
    padding: 10px 20px; cursor: pointer; width: 100%;
}
button:hover { background: #02c39a; }
.result {
    background: #1a1a2e; padding: 15px; border-radius: 10px; margin-top: 15px;
}
.meter { height: 12px; border-radius: 6px; background: #333; margin-top: 8px; }
.bar { height: 12px; border-radius: 6px; transition: width 0.5s ease; }
.low { background: #4caf50; } .medium { background: #ffc107; } .high { background: #f44336; }
</style>
</head>
<body>
<div class="container">
  <h1>🛡️ SecureSense</h1>
  <form method="post">
    <h3>📩 SMS Spam Check</h3>
    <textarea name="sms_text" rows="3" placeholder="Enter SMS message..."></textarea>
    <button name="action" value="sms">Analyze SMS</button>
    <h3>📱 APK Static Analysis</h3>
    <input type="text" name="apk_features" placeholder="e.g. 1,0,1,0,0,1,... (15 values)">
    <button name="action" value="apk">Analyze APK</button>
  </form>
  {% if result %}
  <div class="result">
    <h3>🔍 Result</h3>
    <pre>{{ result }}</pre>
    {% if meter %}
      <div class="meter"><div class="bar {{ meter.color }}" style="width:{{ meter.width }}%"></div></div>
    {% endif %}
  </div>
  {% endif %}
</div>
</body>
</html>
"""

app = Flask(__name__)

@app.route('/', methods=['GET','POST'])
def index():
    result, meter = None, None
    if request.method == 'POST':
        if request.form['action'] == 'sms':
            txt = request.form['sms_text']
            x = sms_vec.transform([txt])
            prob = sms_clf.predict_proba(x)[0,1]
            pred = 'SPAM 🚨' if prob > 0.5 else 'HAM ✅'
            level = 'high' if prob > 0.8 else 'medium' if prob > 0.5 else 'low'
            width = int(prob * 100)
            meter = {'color': level, 'width': width}
            result = f"Prediction: {pred}\nSpam Probability: {prob:.2f}"
        elif request.form['action'] == 'apk':
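            try:
                feats = np.array([[int(v) for v in request.form['apk_features'].split(',')]])
                # Reject anything other than exactly 15 binary values
                if feats.shape[1] != 15 or not np.isin(feats, (0, 1)).all():
                    raise ValueError("expected 15 values, each 0 or 1")
                prob = apk_clf.predict_proba(feats)[0,1]
                pred = 'MALWARE ⚠️' if prob > 0.5 else 'SAFE ✅'
                level = 'high' if prob > 0.8 else 'medium' if prob > 0.5 else 'low'
                width = int(prob * 100)
                meter = {'color': level, 'width': width}
                result = f"Prediction: {pred}\nMalware Probability: {prob:.2f}"
            except ValueError:
                # Raised by int() on non-numeric input and by the check above
                result = "❌ Invalid feature format! Please enter 15 comma-separated 0/1 values."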
    return render_template_string(HTML, result=result, meter=meter)


---

## 📘 10 — Run Flask Server & ngrok Deployment

# ===============================

# ✅ CELL 9: Run Server & ngrok

# ===============================



In [None]:
from pyngrok import ngrok
public_url = ngrok.connect(5000).public_url
print("🌍 Public URL →", public_url)
app.run(port=5000, use_reloader=False)

---

# 🎉 SecureSense System Ready!

You can now:

✅ Detect SMS Spam
✅ Detect APK Malware
✅ See Probability-Based Risk
✅ Access Public Web App via ngrok
✅ Use for Resume / GitHub / College Submission

---
