# AI-Based Social Media Post & Reel Performance Predictor
## Using Multimodal Machine Learning (Text + Image + Video + Audio)

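This project predicts Instagram post performance (High/Medium/Low) from multimodal features. This version is adapted to your dataset (columns: image_path, caption, hashtags, likes), so only the text and image branches are used; the dataset contains no video or audio.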

**Platform**: Instagram
**Dashboard**: Streamlit (code provided at the end)
**Dataset**: `Projects/Untitled Folder/instagram_dataset.csv` (columns: image_path, caption, hashtags, likes)

Install dependencies: `pip install torch torchvision transformers scikit-learn pandas numpy opencv-python librosa streamlit pillow requests`

In [1]:
import sys
print(sys.executable)

C:\Users\Sid\anaconda3\python.exe


In [2]:
!pip install pillow
!pip install opencv-python
!pip install librosa
!pip install torch
!pip install transformers
!pip install scikit-learn



In [3]:
import pandas as pd
import numpy as np
import requests
from PIL import Image
import cv2
import librosa
import torch
from transformers import BertTokenizer, BertModel, CLIPProcessor, CLIPModel
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
import os
import warnings
warnings.filterwarnings('ignore')

# Set device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

## Step 1: Load and Explore Dataset
Load your dataset with columns: image_path, caption, hashtags, likes. Supplement with open datasets if needed.

In [5]:
# Load your dataset
df = pd.read_csv('instagram_dataset.csv')
print(df.head())
print(df.info())

# Compute engagement score (using only likes, as per your data)
df['engagement'] = df['likes']

# Categorize performance based on likes
thresholds = df['likes'].quantile([0.25, 0.75])
df['performance'] = pd.cut(df['likes'], bins=[-np.inf, thresholds[0.25], thresholds[0.75], np.inf], labels=['Low', 'Medium', 'High'])

# Note: No reels in this dataset; assuming all are posts (images)

   id                               image_path  \
0   1  https://picsum.photos/seed/pic1/600/600   
1   2  https://picsum.photos/seed/pic2/600/600   
2   3  https://picsum.photos/seed/pic3/600/600   
3   4  https://picsum.photos/seed/pic4/600/600   
4   5  https://picsum.photos/seed/pic5/600/600   

                             caption                                hashtags  \
0   Exploring AI-powered creativity!        #AI #MachineLearning #Innovation   
1         Design meets intelligence.                  #Design #AIArt #Future   
2   Unlocking insights through data.           #DataScience #Analytics #Tech   
3   Building smarter social content.  #ContentStrategy #SocialMedia #AItools   
4  Creativity boosted by algorithms.       #CreativeAI #DeepLearning #Trends   

   likes  
0    120  
1     95  
2    150  
3    180  
4    210  
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column      Non-Null Count  Dtype 
---  ------ 
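
To make the labeling rule concrete, the sketch below applies the same quantile binning to the five like counts visible in the preview above. It only illustrates how `pd.cut` assigns classes; the variable names `q25` and `q75` are introduced here for readability and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# The five like counts shown in the dataset preview above
likes = pd.Series([120, 95, 150, 180, 210])

# The 25th and 75th percentiles become the Low/Medium and Medium/High cut points
q25, q75 = likes.quantile([0.25, 0.75])

labels = pd.cut(
    likes,
    bins=[-np.inf, q25, q75, np.inf],  # intervals are right-inclusive by default
    labels=['Low', 'Medium', 'High'],
)

print(q25, q75)         # 120.0 180.0
print(labels.tolist())  # ['Low', 'Low', 'Medium', 'Medium', 'High']
```

A value exactly equal to a threshold falls into the lower class, because the bins are right-inclusive. If many posts share the same like count, the two thresholds can coincide, and `pd.cut` then raises a `ValueError` about non-unique bin edges.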

## Step 2: Feature Extraction
Extract features from text and images (no videos/audio in your dataset).

In [6]:
# Initialize models
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert_model = BertModel.from_pretrained('bert-base-uncased').to(device)
clip_processor = CLIPProcessor.from_pretrained('openai/clip-vit-base-patch32')
clip_model = CLIPModel.from_pretrained('openai/clip-vit-base-patch32').to(device)

def extract_text_features(text):
    inputs = bert_tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512).to(device)
    with torch.no_grad():
        outputs = bert_model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze().cpu().numpy()

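def extract_image_features(image_path):
    try:
        if image_path.startswith('http'):  # Remote image: download it first
            response = requests.get(image_path, stream=True, timeout=10)
            response.raise_for_status()  # Treat HTTP errors as failures
            image = Image.open(response.raw).convert('RGB')
        else:  # Local path
            image = Image.open(image_path).convert('RGB')
        inputs = clip_processor(images=image, return_tensors='pt').to(device)
        with torch.no_grad():
            features = clip_model.get_image_features(**inputs)
        return features.squeeze().cpu().numpy()
    except Exception as exc:
        # Fall back to a zero vector so one bad image does not abort the run
        print(f'Could not load {image_path}: {exc}')
        return np.zeros(512)  # CLIP image feature dim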

# Apply to dataset
df['text'] = df['caption'] + ' ' + df['hashtags']
df['text_features'] = df['text'].apply(extract_text_features)
df['image_features'] = df['image_path'].apply(extract_image_features)

# Combine features (text + image only)
def combine_features(row):
    text_feat = row['text_features']
    img_feat = row['image_features']
    return np.concatenate([text_feat, img_feat])

df['combined_features'] = df.apply(combine_features, axis=1)
X = np.array(df['combined_features'].tolist())
y = df['performance'].astype('category').cat.codes  # Encode labels

Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
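
`extract_text_features` encodes one string at a time, so there is no padding and a plain mean over `last_hidden_state` is fine. If the extraction is later batched for speed, shorter captions get padded and a plain mean would average the pad positions too. The NumPy sketch below shows a mask-weighted mean on toy data; `masked_mean_pool` is a hypothetical helper, not part of the notebook, and the same arithmetic would apply to the BERT outputs and the tokenizer's attention mask.

```python
import numpy as np

def masked_mean_pool(token_embeddings, attention_mask):
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: array of shape (batch, seq_len, hidden)
    attention_mask:   array of shape (batch, seq_len), 1 = real token, 0 = padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)  # (batch, seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=1)                   # (batch, hidden)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)                   # avoid division by zero
    return summed / counts

# Toy batch: 2 sequences, 3 token slots, hidden size 2.
# The last slot of the second sequence is padding.
emb = np.array([[[1.0, 1.0], [3.0, 3.0], [5.0, 5.0]],
                [[2.0, 4.0], [4.0, 8.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 1],
                 [1, 1, 0]])

pooled = masked_mean_pool(emb, mask)
print(pooled)  # rows: [3. 3.] and [3. 6.]
```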


## Step 3: Train Model
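Train a classifier on the combined features. The sample dataset has only five rows, so the 20% test split holds a single post and the report below says nothing about real-world accuracy.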

In [7]:
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Random Forest
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Evaluate
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

# Save model
import joblib
joblib.dump(model, 'performance_predictor.pkl')

              precision    recall  f1-score   support

           0       0.00      0.00      0.00       1.0
           1       0.00      0.00      0.00       0.0

    accuracy                           0.00       1.0
   macro avg       0.00      0.00      0.00       1.0
weighted avg       0.00      0.00      0.00       1.0



['performance_predictor.pkl']
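
With five rows, `train_test_split` holds out a single sample, which is why the report above has a support of 1. Once more data is available, stratified k-fold cross-validation usually gives a steadier estimate than one split. The sketch below shows the idea on synthetic data: `X_demo` and `y_demo` are made-up stand-ins for the real `X` and `y`, and the fold count and forest size are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the real feature matrix: 30 samples, 3 balanced classes.
# These numbers are made up purely so the example runs on its own.
rng = np.random.default_rng(42)
y_demo = np.repeat([0, 1, 2], 10)                    # Low / Medium / High codes
X_demo = rng.normal(size=(30, 8)) + y_demo[:, None]  # class-dependent shift

# Each fold keeps roughly the class proportions of the full dataset
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
clf = RandomForestClassifier(n_estimators=50, random_state=42)

scores = cross_val_score(clf, X_demo, y_demo, cv=cv)
print(scores.shape)   # (5,)
print(scores.mean())  # mean accuracy across the five folds
```

Stratification needs at least as many samples per class as there are folds, so this approach only becomes usable after the dataset has grown well beyond five posts.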

## Step 4: Prediction Function
Define a function that predicts the performance class for new content from its caption, hashtags, and an optional image path.

In [8]:
def predict_performance(caption, hashtags, image_path=None):
    text = caption + ' ' + hashtags
    text_feat = extract_text_features(text)
    img_feat = extract_image_features(image_path) if image_path else np.zeros(512)
    features = np.concatenate([text_feat, img_feat]).reshape(1, -1)
    pred = model.predict(features)[0]
    labels = ['Low', 'Medium', 'High']
    return labels[pred]

# Example
print(predict_performance('Amazing sunset!', '#sunset #photography', 'path/to/image.jpg'))  # Replace with actual path

Medium
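
`predict_performance` concatenates the caption and hashtags as given, so new input should match the training format seen in the dataset preview: space-separated tags, each prefixed with `#`. The sketch below is one way to normalize looser input first. `normalize_hashtags` is a hypothetical helper introduced here, not part of the notebook.

```python
import re

def normalize_hashtags(raw):
    """Convert loosely formatted hashtag input to the '#tag #tag' training format."""
    # Split on commas and/or whitespace, then drop empty pieces
    parts = [p for p in re.split(r'[,\s]+', raw.strip()) if p]
    # Strip any existing leading '#' characters and add exactly one back
    tags = ['#' + p.lstrip('#') for p in parts if p.lstrip('#')]
    return ' '.join(tags)

print(normalize_hashtags('sunset, photography'))         # #sunset #photography
print(normalize_hashtags('#AI,#MachineLearning  tech'))  # #AI #MachineLearning #tech
```

The result can be passed straight to `predict_performance` as the `hashtags` argument.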


## Step 5: Streamlit Dashboard
Save the code below as `app.py` and run `streamlit run app.py`.

```python
import os
import streamlit as st
import joblib
from PIL import Image
import numpy as np
from transformers import BertTokenizer, BertModel, CLIPProcessor, CLIPModel
import torch

# Load model and processors
model = joblib.load('performance_predictor.pkl')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
bert_tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
bert_model = BertModel.from_pretrained('bert-base-uncased').to(device)
clip_processor = CLIPProcessor.from_pretrained('openai/clip-vit-base-patch32')
clip_model = CLIPModel.from_pretrained('openai/clip-vit-base-patch32').to(device)

# Feature extraction functions (copy from above)
def extract_text_features(text):
    inputs = bert_tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512).to(device)
    with torch.no_grad():
        outputs = bert_model(**inputs)
    return outputs.last_hidden_state.mean(dim=1).squeeze().cpu().numpy()

def extract_image_features(image_path):
    try:
        image = Image.open(image_path).convert('RGB')
        inputs = clip_processor(images=image, return_tensors='pt').to(device)
        with torch.no_grad():
            features = clip_model.get_image_features(**inputs)
        return features.squeeze().cpu().numpy()
    except Exception:  # Unreadable image: fall back to a zero vector
        return np.zeros(512)

def predict_performance(caption, hashtags, image_path=None):
    text = caption + ' ' + hashtags
    text_feat = extract_text_features(text)
    img_feat = extract_image_features(image_path) if image_path else np.zeros(512)
    features = np.concatenate([text_feat, img_feat]).reshape(1, -1)
    pred = model.predict(features)[0]
    labels = ['Low', 'Medium', 'High']
    return labels[pred]

st.title('Instagram Post Performance Predictor')
st.write('Upload an image, enter caption and hashtags to predict performance.')

caption = st.text_input('Caption')
hashtags = st.text_input('Hashtags (space-separated, e.g. #sunset #photography)')

uploaded_file = st.file_uploader('Upload Image', type=['jpg', 'png'])
if uploaded_file:
    image = Image.open(uploaded_file).convert('RGB')  # Drop alpha/palette so the JPEG save below cannot fail
    st.image(image, caption='Uploaded Image')
    if st.button('Predict'):
        # Save temp file for processing
        temp_path = 'temp_image.jpg'
        image.save(temp_path)
        pred = predict_performance(caption, hashtags, temp_path)
        st.write(f'Predicted Performance: {pred}')
        os.remove(temp_path)  # Clean up
```


---
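
The dashboard code above writes every upload to the same `temp_image.jpg`, so two users predicting at the same time could overwrite each other's file. One possible alternative, sketched below with the standard library only, is a uniquely named temporary file that is always removed afterwards. `temporary_image_file` is a hypothetical helper, and the bytes used here are a placeholder; in the app they would presumably come from `uploaded_file.getvalue()`.

```python
import os
import tempfile
from contextlib import contextmanager

@contextmanager
def temporary_image_file(data, suffix='.jpg'):
    """Write bytes to a uniquely named temp file and delete it afterwards."""
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        tmp.write(data)
        tmp.close()  # Close first so other code can reopen the path (needed on Windows)
        yield tmp.name
    finally:
        tmp.close()  # Harmless if the file is already closed
        if os.path.exists(tmp.name):
            os.remove(tmp.name)

# Usage with placeholder bytes standing in for an uploaded image
with temporary_image_file(b'not a real image') as path:
    existed_inside = os.path.exists(path)

print(existed_inside, os.path.exists(path))  # True False
```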

### Final Notes:
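- **Running**: Paste the cells into Jupyter and run them in order. URLs in `image_path` are downloaded by `extract_image_features`; local paths must resolve from the notebook's working directory, so absolute paths are the safest choice.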
- **Enhancements**: If your dataset grows (e.g., add `comments`, `shares`, `video_path`), I can update further. For better accuracy, collect more data.
- **Issues?**: If the dataset load fails or features don't extract, share error messages for debugging.
