# 🚀 **Mistral Deployment Notebook**

A complete pipeline covering:

- 🤖 **Model Setup:** Load and quantize Ministral-8B  
- 📝 **Prompting:** Define and apply your product listing template  
- ⚡ **Generation:** Use LangChain LLMChain to create JSON listings  
- 🌐 **API & Deployment:** FastAPI with ngrok for instant public access  

## 🛠️ **Install Dependencies**

Make sure to install all necessary packages for smooth operation:

- ⚙️ **Model Quantization**: Support for efficient Mistral model loading.  
- 🔗 **LangChain**: For prompt templating and LLM chaining.  
- 🚀 **API Deployment**: FastAPI and uvicorn for serving the application.


In [None]:
# Core LLM and quantization libraries
!pip install -q -U langchain transformers bitsandbytes accelerate optimum langchain_community

# Web framework and async tooling for API server
!pip install -q fastapi nest-asyncio python-multipart uvicorn

# Ngrok for tunneling your local server to a public URL
!pip install -q pyngrok

## 📥 **Import Required Libraries**

Load essential Python packages for model handling, API, and utilities:

- 🤗 **Transformers**: For loading and managing language models.  
- 🚀 **FastAPI**: To create the web API endpoint.  
- 🌐 **ngrok**: To expose the local server publicly.  
- 📚 **LangChain & Pydantic**: For prompt management and data validation.


In [None]:
# Core libraries
import torch                      # PyTorch for model operations
import json                       # Handling JSON data
import re                         # Regular expressions for text parsing

# Hugging Face Transformers
from transformers import BitsAndBytesConfig, AutoModelForCausalLM, AutoTokenizer, pipeline

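# LangChain components for LLM pipelines
# Explicit module paths avoid the deprecation warnings raised by top-level
# `from langchain import ...`; later LangChain releases may relocate LLMChain.
from langchain_community.llms import HuggingFacePipeline
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain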

# FastAPI for building API
from fastapi import FastAPI, HTTPException

# Ngrok for exposing local server to the internet
from pyngrok import ngrok

# Needed to allow async FastAPI to run in notebooks like Colab
import nest_asyncio

# For request validation in FastAPI
from pydantic import BaseModel

## 🔐 **Hugging Face Authentication**

Set up authentication to access gated or private models and repos:

- 🔑 **Save Token**: Store your Hugging Face token in the local Hugging Face cache using `HfFolder.save_token()`.  
- 🌐 **Environment Variable**: Set `HF_TOKEN` so libraries like `transformers` and `langchain` can use it automatically.


In [None]:
from huggingface_hub import HfFolder
HfFolder.save_token("hf_YOUR_TOKEN_HERE")  # Save token locally

import os
os.environ["HF_TOKEN"] = "hf_YOUR_TOKEN_HERE"  # Set token as env variable
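
The cell above writes the token directly into the notebook. As a minimal sketch of an alternative, the helper below reads `HF_TOKEN` from an environment mapping and treats the placeholder string as "not configured". The function name `resolve_hf_token` is hypothetical and not part of `huggingface_hub`.

```python
import os
from typing import Mapping, Optional

PLACEHOLDER = "hf_YOUR_TOKEN_HERE"  # the placeholder value used in this notebook


def resolve_hf_token(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return a usable HF_TOKEN from the given mapping, or None if it is unset."""
    token = env.get("HF_TOKEN", "").strip()
    # An empty value or the untouched placeholder both count as missing.
    if not token or token == PLACEHOLDER:
        return None
    return token


# Passing a plain dict keeps the example independent of the real environment.
print(resolve_hf_token({"HF_TOKEN": "hf_example"}))
print(resolve_hf_token({"HF_TOKEN": PLACEHOLDER}))
```

A caller could stop early with a clear message when the helper returns `None`, rather than failing later during the model download.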


## 🧾 **Amazon Listing Prompt Template**

Set up a powerful prompt to generate high-converting Amazon listings using Mistral:

- 🛍️ **Structured Output**: The prompt enforces a strict JSON format with fields like `title`, `bullet_points`, `product_description`, etc.
- 🎯 **SEO-Optimized Language**: Encourages persuasive, keyword-rich, benefit-driven content tailored for Amazon.
- 🧠 **Smart Formatting Rules**: Follows Amazon listing standards: character limits, formatting styles, and calls to action.
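
A prompt can only ask for these limits; the model may still exceed them. The sketch below checks a parsed listing against three rules stated in the prompt in the next cell: a title under 200 characters, exactly 5 bullet points, and backend search terms of at most 250 characters with no commas or repeated words. The function name `check_listing` is hypothetical, and the check is illustrative rather than a full Amazon compliance test.

```python
def check_listing(listing: dict) -> list:
    """Return a list of rule violations; an empty list means every check passed."""
    problems = []

    title = listing.get("title", "")
    if not title or len(title) >= 200:
        problems.append("title must be non-empty and under 200 characters")

    bullets = listing.get("bullet_points", [])
    if len(bullets) != 5:
        problems.append(f"expected 5 bullet points, got {len(bullets)}")

    terms = listing.get("backend_search_terms", "")
    if len(terms) > 250:
        problems.append("backend_search_terms longer than 250 characters")
    if "," in terms:
        problems.append("backend_search_terms must not contain commas")
    words = terms.lower().split()
    if len(words) != len(set(words)):
        problems.append("backend_search_terms repeats a word")

    return problems


good = {
    "title": "ACME Premium Yoga Mat",
    "bullet_points": ["a", "b", "c", "d", "e"],
    "backend_search_terms": "exercise fitness pilates",
}
bad = {
    "title": "x" * 200,
    "bullet_points": ["only one"],
    "backend_search_terms": "yoga mat, yoga",
}
print(check_listing(good))
print(check_listing(bad))
```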

In [None]:
# 🧾 Define the base prompt to guide Mistral in generating a full Amazon listing
# The prompt includes detailed structure, formatting, and style instructions


listing_prompt = """You are an expert ecommerce content strategist and Amazon listing specialist with world-class SEO experience. You always write persuasive, structured, keyword-optimized listings that follow Amazon's best practices and maximize conversions and search rankings.

Your task is to write a full Amazon product listing in structured JSON format using the details provided by the user. Use advanced keyword placement, compelling emotional language, and Amazon-specific formatting techniques.

EXAMPLE OF EXCELLENT OUTPUT:

{{
  "title": "ACME Premium Yoga Mat with Non-Slip Surface for All Levels - Extra Thick 1/2 Inch Cushioned Exercise Mat, Eco-Friendly TPE, Teal",
  "bullet_points": [
    "EXTRA THICK CUSHIONING provides ultimate comfort for your joints with 1/2 inch (12.7mm) of high-density padding that protects knees, hips and elbows during floor exercises",
    "NON-SLIP TEXTURED SURFACE ensures stability and prevents injuries with our specially designed dual-layer grip technology that works on any floor surface",
    "ECO-FRIENDLY MATERIALS made from 100% recyclable TPE that contains no PVC, latex, or harmful chemicals, making it safe for you and the environment",
    "PERFECT FOR ALL YOGA STYLES including hot yoga, pilates, and general fitness with moisture-resistant technology that prevents bacteria growth and odors",
    "LIGHTWEIGHT AND PORTABLE design includes a free carrying strap, making this 72\" x 24\" mat easy to transport to studio classes or outdoor sessions"
  ],
  "product_description": "Transform your yoga practice with the ACME Premium Yoga Mat, engineered for practitioners who demand both comfort and performance. Our revolutionary non-slip surface technology provides exceptional grip even during the most intense hot yoga sessions, eliminating the frustration of slipping mats that disrupt your flow.\n\nMeticulously crafted from eco-friendly TPE material, this mat offers superior cushioning at 1/2 inch thickness while remaining surprisingly lightweight. The closed-cell construction prevents moisture absorption, making cleanup effortless and extending the life of your mat far beyond traditional options.\n\nWhether you're a beginner or advanced yogi, the ACME Premium Yoga Mat supports your journey with alignment markers subtly incorporated into the elegant design. Experience the perfect balance of stability, comfort and earth-friendly materials that thousands of 5-star reviews can't stop raving about.",
  "backend_search_terms": "yoga mat exercise mat fitness mat workout mat thick cushioned non slip thick extra large tpe eco friendly meditation pilates floor mat home gym equipment exercise equipment beginners advanced hot yoga bikram alignment markers",
  "attributes": {{
    "material": "TPE (Thermoplastic Elastomer)",
    "color": "Teal",
    "size": "72 x 24 x 0.5 inches",
    "target_audience": "Adults, All Yoga Levels",
    "recommended_uses": "Yoga, Pilates, Fitness, Meditation, Floor Exercises",
    "special_features": "Non-slip surface, Extra thick, Alignment markers, Eco-friendly",
    "warranty": "1-Year Manufacturer Warranty",
    "country_of_origin": "USA",
    "weight": "2.5 pounds",
    "dimensions": "72 x 24 x 0.5 inches"
  }}
}}

===== STRUCTURE FOR YOUR OUTPUT =====

You must create a JSON object with these exact fields:
- title
- bullet_points (array of 5 items)
- product_description
- backend_search_terms
- attributes (object with material, color, size, target_audience, recommended_uses, special_features, warranty, country_of_origin, weight, dimensions)

===== DETAILED INSTRUCTIONS =====

TITLE REQUIREMENTS:
* Begin with the brand name. If not specified, just mention *BRAND_NAME*
* Include primary keyword early
* Stay under 200 characters
* Use strong modifiers (e.g., "premium", "professional", "2024 model")
* Format: [BRAND] [PRIMARY KEYWORD] with [KEY FEATURE] for [TARGET AUDIENCE] - [USP], [SIZE/QUANTITY]

BULLET POINTS REQUIREMENTS:
* Create exactly 5 bullet points
* Start each with 2-3 WORDS IN ALL CAPS highlighting a core benefit
* Follow with detailed explanation using secondary keywords
* Each bullet should be 150-200 characters
* Focus on benefits, not just features

PRODUCT DESCRIPTION REQUIREMENTS:
* Write 2-3 persuasive paragraphs (300-450 words total)
* Include emotional triggers and pain points
* Incorporate primary and secondary keywords naturally
* Use line breaks between paragraphs
* End with a call to action

BACKEND SEARCH TERMS REQUIREMENTS:
* Create a single string of keywords (no commas)
* No repeated words or brand names
* Maximum 250 characters
* Include synonyms, alternate spellings, misspellings, and related use cases

ATTRIBUTES REQUIREMENTS:
* Fill all attribute fields based on product information
* If information is missing, make logical assumptions
* Be specific and detailed in each attribute

===== EXAMPLE STRUCTURE (DO NOT COPY THIS JSON, CREATE YOUR OWN) =====

{{
  "title": "*BRAND_NAME*(your title here)",
  "bullet_points": [
    "(your first bullet point here)",
    "(your second bullet point here)",
    "(your third bullet point here)",
    "(your fourth bullet point here)",
    "(your fifth bullet point here)"
  ],
  "product_description": "(your product description here)",
  "backend_search_terms": "(your backend search terms here)",
  "attributes": {{
    "material": "(material here)",
    "color": "(color here)",
    "size": "(size here)",
    "target_audience": "(target audience here)",
    "recommended_uses": "(recommended uses here)",
    "special_features": "(special features here)",
    "warranty": "(warranty here)",
    "country_of_origin": "(country of origin here)",
    "weight": "(weight here)",
    "dimensions": "(dimensions here)"
  }}
}}



Output ONLY the JSON object. No commentary, no explanation.

User product details:
{product_data}

"""
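
`PromptTemplate`, assuming its default f-string template format, follows Python `str.format` rules: `{product_data}` is a placeholder, while a literal brace in the prompt text has to be written doubled (`{{` or `}}`). The sketch below shows that behaviour with plain `str.format` on a tiny stand-in template; `mini_template` is a hypothetical example and not the prompt above.

```python
# Doubled braces come out as literal braces; {product_data} is replaced.
mini_template = (
    "Return JSON shaped like this:\n"
    '{{"title": "(your title here)"}}\n'
    "User product details:\n"
    "{product_data}\n"
)

# Braces inside the substituted value are inserted as-is and need no escaping.
filled = mini_template.format(product_data='{"name": "yoga mat"}')
print(filled)
```

A single unescaped `{` or `}` in the template text would instead be read as the start or end of a placeholder and raise an error at format time.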


## 🚀 **Model Setup: Ministral 8B with 4-bit Quantization**

- ⚙️ **Model & Tokenizer**: Load Ministral-8B-Instruct-2410 with 4-bit NF4 quantization.
- 🔄 **Pipeline**: Configured for creative text generation.
- 🧩 **Prompt**: Template for product listing generation.
- 🧹 **Output Cleaning**: Extract JSON from raw text.
- 📝 **Listing Generation**: Run prompt and return output.
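
To see why 4-bit loading matters, here is a rough back-of-the-envelope estimate of weight memory. It assumes a nominal 8 billion parameters ("8B" taken at face value) and counts weights only; quantization constants, layers kept in higher precision, activations, and the KV cache all add to the real footprint.

```python
NOMINAL_PARAMS = 8e9  # assumption: "8B" taken at face value


def weight_gib(num_params: float, bits_per_param: float) -> float:
    """Memory for the weights alone, in GiB (2**30 bytes)."""
    return num_params * bits_per_param / 8 / 2**30


fp16_gib = weight_gib(NOMINAL_PARAMS, 16)  # half-precision baseline
nf4_gib = weight_gib(NOMINAL_PARAMS, 4)    # 4-bit quantized weights

print(f"fp16 weights:  ~{fp16_gib:.1f} GiB")
print(f"4-bit weights: ~{nf4_gib:.1f} GiB")
```

Under these assumptions the quantized weights need about a quarter of the half-precision figure, which is what lets the model fit on a single smaller GPU.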


In [None]:
class LLMService:
    def __init__(self):
        # Initialize model and prompt on creation of the service object
        self.setup_models()
        self.setup_prompt()

    def setup_models(self):
        """
        Initialize the Mistral 8B model with 4-bit quantization for lower VRAM usage.
        Set up the tokenizer and a text-generation pipeline with sampling parameters.
        """
        bnb_config = BitsAndBytesConfig(
            load_in_4bit=True,
            bnb_4bit_quant_type="nf4",
            bnb_4bit_use_double_quant=True,
            bnb_4bit_compute_dtype=torch.float16
        )

        tokenizer = AutoTokenizer.from_pretrained("mistralai/Ministral-8B-Instruct-2410")
        model = AutoModelForCausalLM.from_pretrained(
            "mistralai/Ministral-8B-Instruct-2410",
            quantization_config=bnb_config,
            device_map="auto",
            trust_remote_code=False
        )

        self.pipeline = pipeline(
            "text-generation",
            model=model,
            tokenizer=tokenizer,
            device_map="auto",
            do_sample=True,
            temperature=0.6,
            top_k=40,
            top_p=0.9,
            max_new_tokens=5000,
            repetition_penalty=1.2,
            num_return_sequences=1,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.eos_token_id,
        )
        # Wrap pipeline in Langchain wrapper for integration
        self.llm = HuggingFacePipeline(pipeline=self.pipeline)

    def setup_prompt(self):
        """
        Load and prepare the prompt template for product listing generation.
        The prompt expects 'product_data' as input variable.
        """
        self.prompt = PromptTemplate(
            template=listing_prompt,  # Your external prompt string
            input_variables=["product_data"]
        )

    def clean_llava_output(self, raw_text: str) -> dict | None:
        """
        Extract JSON object from raw model output text.

        Args:
            raw_text (str): Raw string output from the model.

        Returns:
            dict or None: Parsed JSON if found, else None.
        """
        try:
            # Match a ```json { ... } ``` block and extract the JSON inside
            m = re.search(r"```json\s*(\{.*?\})\s*```", raw_text, re.DOTALL)
            if not m:
                return None
            json_str = m.group(1)
            return json.loads(json_str)
        except Exception as e:
            print(f"[clean_llava_output] JSON cleaning failed: {e}")
            return None

    def generate_listing(self, raw_llava_output: str) -> dict:
        """
        Generate an Amazon product listing by cleaning LLaVA output and
        running the cleaned data through the Mistral model.

        Args:
            raw_llava_output (str): Raw text output from LLaVA model.

        Returns:
            dict: Contains either the raw Mistral output or an error message.
        """
        # Clean raw text to extract valid JSON data
        clean_data = self.clean_llava_output(raw_llava_output)
        if clean_data is None:
            return {"error": "Invalid or missing JSON in LLaVA output"}

        # Create a Langchain LLMChain and run prompt with the cleaned data
        chain = LLMChain(prompt=self.prompt, llm=self.llm)
        raw_result = chain.run({"product_data": json.dumps(clean_data, indent=2)})

        # Debug print the raw Mistral output
        print("\n=== MISTRAL RAW OUTPUT ===\n", raw_result, "\n=============================\n")

        return {"raw_mistral_output": raw_result}
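
`generate_listing` returns the model text unparsed, so a caller that needs a dictionary still has to pull the JSON object out of it. The sketch below is one possible approach using only the standard library: it tries each `{` in the text as a starting point and lets `json.JSONDecoder.raw_decode` ignore whatever follows the object. The function name `extract_json_object` is hypothetical and not part of the service class.

```python
import json
import re
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    # Try every "{" as a candidate start; raw_decode stops at the end of the
    # object, so trailing prose or a closing code fence does not matter.
    for match in re.finditer(r"\{", text):
        try:
            obj, _end = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict):
            return obj
    return None


fenced = 'Here you go:\n```json\n{"title": "Mat", "attributes": {"color": "Teal"}}\n```'
unfenced = 'Sure! {"title": "Mat"} Hope this helps.'
print(extract_json_object(fenced))
print(extract_json_object(unfenced))
print(extract_json_object("no JSON here"))
```

Unlike the fenced-block regex in `clean_llava_output`, this also accepts output where the model omits the code fence.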


## 🌐 **FastAPI Server & Ngrok Tunnel**

- 📥 **API Endpoint**: Receives raw LLaVA output, returns Mistral listing.
- ⚠️ **Error Handling**: Proper HTTP status on failures.
- 🔗 **Ngrok**: Exposes local server to the internet with a public URL.
- 🚀 **Run Server**: Starts FastAPI app with async support.

In [None]:
# Define request schema for incoming data
class ListingRequest(BaseModel):
    llava_output: str  # Raw output from LLaVA to be processed

# Initialize FastAPI app and LLM service instance
app = FastAPI()
service = LLMService()

# API route to generate product listing from LLaVA output
@app.post("/generate-listing")
async def create_listing(request: ListingRequest):
    try:
        result = service.generate_listing(request.llava_output)
    except Exception as e:
        # Return 500 for unexpected server errors
        raise HTTPException(status_code=500, detail=str(e))

    if "error" in result:
        # Return 400 for invalid JSON or input errors. This is raised outside the
        # try block so the generic handler above cannot turn it into a 500.
        raise HTTPException(status_code=400, detail=result["error"])
    return result

# Ngrok setup to expose local FastAPI server publicly
NGROK_TOKEN = "YOUR-NGROK-TOKEN-HERE"  # Replace with your actual token
ngrok.set_auth_token(NGROK_TOKEN)
ngrok_tunnel = ngrok.connect(8000)
print(f'🔗 Mistral API URL: {ngrok_tunnel.public_url}')

# Allow nested async loops and start Uvicorn server
nest_asyncio.apply()
import uvicorn
uvicorn.run(app, host="0.0.0.0", port=8000)
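
Because `uvicorn.run` blocks the cell, the endpoint has to be called from a separate process or machine. The sketch below builds such a request with the standard library. The base URL is a made-up placeholder for whatever public URL ngrok prints, and the helper names `build_listing_request` and `request_listing` are hypothetical.

```python
import json
import urllib.request


def build_listing_request(base_url: str, llava_output: str) -> urllib.request.Request:
    """Build the POST request for /generate-listing without sending it."""
    payload = json.dumps({"llava_output": llava_output}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/generate-listing",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def request_listing(base_url: str, llava_output: str, timeout: float = 600.0) -> dict:
    """Send the request and decode the JSON response (requires a running server)."""
    req = build_listing_request(base_url, llava_output)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# The service only accepts product data wrapped in a ```json fenced block.
sample_llava_output = '```json\n{"product": "yoga mat", "color": "teal"}\n```'

# Placeholder URL: substitute the address printed by the ngrok cell.
req = build_listing_request("https://example.ngrok-free.app", sample_llava_output)
print(req.get_method(), req.full_url)
```

Calling `request_listing` with the real URL returns the dictionary containing `raw_mistral_output`. The long timeout is an assumption to allow for slow generation.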
