# LLM Integration Description
LLM integration refers to incorporating large language models (LLMs), in this case Gemini, into business processes to improve efficiency. It lets applications use advanced natural language processing (NLP) capabilities for a wide range of tasks.

Some key concepts include:
* API Calls
* Prompt Engineering
* Data Handling
* Retrieval-Augmented Generation (RAG)
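
To make the last concept concrete, here is a toy sketch of RAG: pick the documents most relevant to a question, then place them in the prompt. The word-overlap scoring, the sample documents, and the function names `retrieve` and `build_prompt` are simplifying assumptions for illustration; a real system would typically use embeddings and a vector store.

```python
from typing import List


def _words(text: str) -> set:
    """Lowercase a text and split it into a set of punctuation-stripped words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(question: str, docs: List[str], k: int = 2) -> List[str]:
    """Rank docs by the number of words shared with the question (toy scoring)."""
    q = _words(question)
    ranked = sorted(docs, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, context: List[str]) -> str:
    """Assemble a prompt that asks the model to answer from the context only."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical CRM notes, made up for this example
docs = [
    "Acme Corp renewed its contract in March.",
    "Globex opened a support ticket about billing.",
    "Acme Corp asked about an enterprise upgrade.",
]
question = "What did Acme Corp ask about?"
prompt = build_prompt(question, retrieve(question, docs))
print(prompt)
```

The resulting `prompt` string is what would be sent to the LLM in place of the bare question.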

In this portion of the project, we aim to integrate the LLM into three tasks:
* Lead Scoring - identify and prioritize high-value leads
* Account Health - detect churn risks and upsell opportunities
* Semantic Search - let the chatbot retrieve business or sales data and answer questions about it

### Gemini: What is it?
Gemini is an LLM developed by Google that supports reasoning, code generation, and instruction following.

### API - Application Programming Interface
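An API lets our code send requests to the LLM and receive its responses. The API key works like a password: it identifies the project whenever you call the API, so it is kept out of the notebook itself (here it is loaded from a `.env` file).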

## Step 1: Install Packages and Gemini

In [1]:
# Install the Gemini SDK and python-dotenv (uncomment the next line to run)
# pip install google-generativeai python-dotenv

In [2]:
# Connect to the Gemini API
import os
import sys
from dotenv import load_dotenv

# Load environment variables from .env
load_dotenv()

# Get the Gemini API key from .env
api_key = os.getenv('GEMINI_API_KEY')

In [3]:
# Initialize Gemini and import the necessary libraries
from __future__ import annotations

from typing import Dict, Any, List, Optional

import google.generativeai as genai
import pandas as pd
import numpy as np
from dotenv import load_dotenv
import joblib

# Configure Gemini with your API key
genai.configure(api_key=api_key)

# Debug: print the models available to this API key
print("Available models:", [m.name for m in genai.list_models()])

# Create the model; the name must match one of the names printed above
model = genai.GenerativeModel('models/gemini-pro-latest')




DefaultCredentialsError: 
  No API_KEY or ADC found. Please either:
    - Set the `GOOGLE_API_KEY` environment variable.
    - Manually pass the key with `genai.configure(api_key=my_api_key)`.
    - Or set up Application Default Credentials, see https://ai.google.dev/gemini-api/docs/oauth for more information.
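
The error above is raised when `api_key` is `None`, for example when `.env` is missing or stores the key under a different variable name. As a minimal sketch of failing early with a clearer message, the helper below checks both `GEMINI_API_KEY` (the name used in this notebook) and `GOOGLE_API_KEY` (the name mentioned in the error message). The function name `find_api_key` is a hypothetical helper, not part of `google-generativeai`.

```python
import os
from typing import Mapping, Optional, Sequence


def find_api_key(
    names: Sequence[str] = ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Return the first non-empty key found under `names`, else raise.

    Hypothetical helper for illustration only.
    """
    env = os.environ if env is None else env
    for name in names:
        value = (env.get(name) or "").strip()
        if value:
            return value
    raise RuntimeError(
        "No API key found. Set one of: " + ", ".join(names) + " in your .env file."
    )
```

Calling `genai.configure(api_key=find_api_key())` after `load_dotenv()` would then surface a missing key before any request is made.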

## Step 2: Create Helper Functions for Integration
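
Both scorer classes in this step align the incoming feature dictionary to the column order saved at training time before calling `predict_proba`. As a rough pure-Python illustration of what `DataFrame.reindex(columns=...)` does in that step: columns are put in the saved order, unknown keys are dropped, and missing ones become NaN. The helper name `align_features` and the sample feature names are hypothetical.

```python
import math
from typing import Any, Dict, List


def align_features(record: Dict[str, Any], feature_order: List[str]) -> List[Any]:
    """Return values in `feature_order`; missing features become NaN, extras are dropped."""
    return [record.get(name, math.nan) for name in feature_order]


# Assumed training-time column order (hypothetical feature names)
order = ["employees", "industry", "num_touches"]
# Hypothetical lead: "industry" is missing and "region" was never a training feature
lead = {"num_touches": 7, "employees": 120, "region": "EMEA"}

row = align_features(lead, order)
print(row)
```

A NaN left by a missing feature is passed on unchanged, so the fitted pipeline has to tolerate missing values, for example through an imputation step.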

In [None]:
class GBDTLeadScorer:
    def __init__(
        self,
        pipeline=None,
        features: Optional[List[str]] = None,
        model_path: Optional[str] = None,
        feature_path: Optional[str] = None,
    ):
        if pipeline is not None:
            self.pipeline = pipeline
        elif model_path and os.path.exists(model_path):
            self.pipeline = joblib.load(model_path)
        else:
            raise ValueError("Provide a fitted pipeline or a valid model_path.")

        # Prefer explicit features arg; else try load from file; else None
        if features is not None:
            self.features = features
        elif feature_path and os.path.exists(feature_path):
            self.features = joblib.load(feature_path)
        else:
            self.features = None  # pipeline must handle missing order

    def predict(self, features: Dict[str, Any]) -> float:
        X = pd.DataFrame([features])
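        # Align columns to the training order when a non-empty feature list is known
        # (explicit checks also work if the saved features are a NumPy array or pandas Index)
        if self.features is not None and len(self.features) > 0:
            X = X.reindex(columns=self.features)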
        proba = self.pipeline.predict_proba(X)[:, 1]
        return float(proba[0])


class AccountHealthScorer:
    def __init__(
        self,
        pipeline=None,
        features: Optional[List[str]] = None,
        model_path: Optional[str] = None,
        feature_path: Optional[str] = None,
    ):
        if pipeline is not None:
            self.pipeline = pipeline
        elif model_path and os.path.exists(model_path):
            self.pipeline = joblib.load(model_path)
        else:
            raise ValueError("Provide a fitted pipeline or a valid model_path.")

        if features is not None:
            self.features = features
        elif feature_path and os.path.exists(feature_path):
            self.features = joblib.load(feature_path)
        else:
            self.features = None

    def predict(self, features: Dict[str, Any]) -> float:
        X = pd.DataFrame([features])
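        if self.features is not None and len(self.features) > 0:
            X = X.reindex(columns=self.features)
        # Column 1 is the probability of the positive class; change the index if your labels differ
        proba = self.pipeline.predict_proba(X)[:, 1]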
        return float(proba[0])


class SemanticSearcher:
    """Use an existing Chroma collection/client if you already created one."""
    def __init__(
        self,
        collection=None,
        client=None,
        persist_dir: Optional[str] = None,
        collection_name: str = "crm_docs",
    ):
        if collection is not None:
            self.collection = collection
        else:
            if client is None:
                if not persist_dir:
                    raise ValueError("Provide collection/client OR persist_dir.")
                import chromadb
                client = chromadb.PersistentClient(path=persist_dir)
            self.collection = client.get_or_create_collection(collection_name)

    def search(self, query: str, n_results: int = 5, where: Optional[Dict[str, Any]] = None):
        res = self.collection.query(query_texts=[query], n_results=n_results, where=where)
        docs = res.get("documents", [[]])[0]
        metas = res.get("metadatas", [[]])[0]
        dists = res.get("distances", [[]])[0] if "distances" in res else [None]*len(docs)
        out = []
        for i, txt in enumerate(docs):
            meta = metas[i] if i < len(metas) else {}
            out.append({
                "text": txt,
                "source": (meta or {}).get("source", f"doc_{i}"),
                "score": dists[i]
            })
        return out
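
`collection.query` returns one inner list per query text, which is why `search` indexes `[0]` before looping. The sketch below runs the same flattening on a hand-written dictionary shaped like such a result, so no Chroma client is needed. The function name `flatten_result` and the sample documents, sources, and distances are made up for illustration.

```python
from typing import Any, Dict, List, Optional


def flatten_result(res: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn a Chroma-style query result (nested lists) into one dict per hit."""
    docs = (res.get("documents") or [[]])[0]
    metas = (res.get("metadatas") or [[]])[0]
    dists: List[Optional[float]] = (res.get("distances") or [[None] * len(docs)])[0]
    hits = []
    for i, text in enumerate(docs):
        meta = metas[i] if i < len(metas) else None
        hits.append({
            "text": text,
            # Fall back to a positional label when a hit has no metadata
            "source": (meta or {}).get("source", f"doc_{i}"),
            "score": dists[i] if i < len(dists) else None,
        })
    return hits


# Hand-written stand-in for a query result with a single query text
fake = {
    "documents": [["Renewal due in Q3.", "Ticket backlog is growing."]],
    "metadatas": [[{"source": "notes.txt"}, None]],
    "distances": [[0.12, 0.48]],
}
for hit in flatten_result(fake):
    print(hit)
```

Note that the `score` field holds a distance, so a smaller value means a closer match.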


## Step 3: Create Main Integration

In [None]:
load_dotenv()
api_key = os.getenv("GEMINI_API_KEY")
if not api_key:
    raise RuntimeError("GEMINI_API_KEY not set in your .env")
genai.configure(api_key=api_key)

class CRMAssistant:
    def __init__(self, lead_scorer, health_scorer, semantic_searcher, model_name="models/gemini-pro-latest"):
        self.lead_scorer = lead_scorer
        self.health_scorer = health_scorer
        self.semantic_searcher = semantic_searcher
        self.model = genai.GenerativeModel(model_name)

    @staticmethod
    def _fmt(docs: List[Dict[str, Any]]) -> str:
        return "\n".join(f"[{d.get('source','doc')}] {d.get('text','')}" for d in docs)

    def process_lead_score(self, lead_score: float, lead_data: Dict[str, Any]) -> str:
        prompt = f"""
You are a CRM assistant. Given a lead score of {lead_score:.2f} and this lead data:
{lead_data}

Provide:
- 2–4 next actions,
- a short rationale referencing top drivers,
- a one-line priority (High/Med/Low).
"""
        return self.model.generate_content(prompt).text

    def analyze_account_health(self, health_score: float, account_data: Dict[str, Any]) -> str:
        prompt = f"""
You are a CRM assistant. Account health score: {health_score:.2f}.
Account data:
{account_data}

Return:
- 3 targeted recommendations,
- key risks,
- owner + due date for the first action.
"""
        return self.model.generate_content(prompt).text

    def semantic_search_response(self, query: str, context_docs: List[Dict[str, Any]]) -> str:
        ctx = self._fmt(context_docs)
        prompt = f"""
Use the CRM context to answer. If info is missing, say what is needed.

Context:
{ctx}

Question: {query}

Reply with a brief summary and bullet points. Cite sources in [brackets].
"""
        return self.model.generate_content(prompt).text

    def process_query(self, query: str, context: Optional[Dict[str, Any]] = None) -> str:
        q = query.lower()
        if "lead" in q and any(k in q for k in ["score", "convert", "probability"]):
            if context is None:
                return "I need lead features (or a lead_id I can fetch) to score this lead."
            score = self.lead_scorer.predict(context)
            return self.process_lead_score(score, context)

        if any(k in q for k in ["health", "churn", "risk", "renewal"]):
            if context is None:
                return "I need account features (or an account_id I can fetch) to assess health."
            score = self.health_scorer.predict(context)
            return self.analyze_account_health(score, context)

        docs = self.semantic_searcher.search(query)
        return self.semantic_search_response(query, docs)
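
The routing in `process_query` is plain keyword matching on the lowercased query. The standalone sketch below mirrors those rules so they can be tried without models or an API key; the function name `route` and the returned labels are illustrative assumptions.

```python
LEAD_HINTS = ("score", "convert", "probability")
HEALTH_HINTS = ("health", "churn", "risk", "renewal")


def route(query: str) -> str:
    """Return 'lead', 'health', or 'search' using the same keyword rules."""
    q = query.lower()
    # Lead scoring needs the word "lead" plus one scoring-related hint
    if "lead" in q and any(k in q for k in LEAD_HINTS):
        return "lead"
    # Any single health-related hint is enough for the account health path
    if any(k in q for k in HEALTH_HINTS):
        return "health"
    # Everything else falls through to semantic search
    return "search"


for text in ("Score this lead", "Is Acme a churn risk?", "Summarize last call"):
    print(text, "->", route(text))
```

Because the match is by substring, a query such as "leadership score" is also routed to lead scoring, which is worth keeping in mind when choosing keywords.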


## Step 4: Usage Example

In [None]:
# Basic LLM integration test: send one prompt to the `model` created in Step 1
def test_llm_integration():
    prompt = "What are the key benefits of using LLMs in CRM systems?"
    
    try:
        response = model.generate_content(
            prompt,
            generation_config={
                "temperature": 0.7,
                "top_p": 0.8,
                "top_k": 40
            }
        )
        
        print("\n=== Testing Gemini LLM Integration ===")
        print("\nPrompt:", prompt)
        print("\nResponse:", response.text)
        print("\n=====================================")
        
    except Exception as e:
        print(f"Error testing LLM integration: {str(e)}")
        print("API Key configured:", bool(api_key))
        print("Available models:", [m.name for m in genai.list_models()])

# Run the test
test_llm_integration()


=== Testing Gemini LLM Integration ===

Prompt: What are the key benefits of using LLMs in CRM systems?

Response: Of course. Integrating Large Language Models (LLMs) into Customer Relationship Management (CRM) systems is a transformative shift, turning CRMs from passive systems of record into proactive, intelligent partners.

The key benefits can be grouped into four main areas: **Supercharging Sales Teams**, **Revolutionizing Customer Service**, **Enhancing Marketing Personalization**, and **Boosting Overall Operational Efficiency**.

Here’s a detailed breakdown of the key benefits in each area:

---

### 1. Supercharging Sales Teams

LLMs act as a "co-pilot" for sales representatives, automating tedious tasks and providing intelligent guidance to help them close deals faster.

*   **Automated Communication & Content Generation:**
    *   **Benefit:** Dramatically reduces the time spent on writing. Sales reps can instantly generate personalized outreach emails, follow-up messages, a