# 🏗️ JESA Tender Evaluation System - Google Colab Version

**AI-Powered Supplier Proposal Analysis using Local LLM (Llama 3.2)**

---

## 📋 Overview

This notebook uses a local Large Language Model (Llama 3.2 3B) running on Google Colab to analyze supplier proposals and rank them based on multiple criteria.

**Features:**
- ✅ Zero cost (uses Google Colab free tier)
- ✅ No external LLM APIs (documents are processed inside your Colab runtime)
- ✅ PDF document processing
- ✅ AI-powered analysis with local LLM
- ✅ Weighted scoring and ranking
- ✅ Excel export functionality

**Evaluation Criteria:**
1. Technical Compliance (30%)
2. Price Competitiveness (25%)
3. Company Experience (20%)
4. Timeline Feasibility (15%)
5. Risk Assessment (10%)
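
As a worked illustration of how these weights combine, the sketch below computes a weighted total for one supplier. The weights are the ones listed above; the five criterion scores and the helper name `weighted_total` are made up for illustration and are not real evaluation results.

```python
# Weights from the criteria list above (percentages that total 100).
WEIGHTS = {
    "technical_compliance": 30,
    "price_competitiveness": 25,
    "company_experience": 20,
    "timeline_feasibility": 15,
    "risk_assessment": 10,
}

# Hypothetical 0-100 scores for one supplier, for illustration only.
example_scores = {
    "technical_compliance": 85,
    "price_competitiveness": 75,
    "company_experience": 90,
    "timeline_feasibility": 80,
    "risk_assessment": 85,
}


def weighted_total(scores, weights):
    """Return the weighted average of the scores, rounded to 2 decimals."""
    total = sum(scores[name] * weight for name, weight in weights.items())
    return round(total / 100, 2)


print(weighted_total(example_scores, WEIGHTS))  # 82.75
```

Each score contributes in proportion to its weight: 25.5 + 18.75 + 18 + 12 + 8.5 = 82.75.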

---

## ⚙️ Setup Instructions

**IMPORTANT: Enable GPU First!**
1. Click **Runtime** → **Change runtime type**
2. Select **Hardware accelerator**: **T4 GPU**
3. Click **Save**

**Then run all cells in order (or click Runtime → Run all)**


## 📦 Cell 1: Install Required Packages

This cell installs all necessary Python packages. **Run this first!**

⏱️ Expected time: ~2-3 minutes


In [None]:
# Install required packages
print("📦 Installing required packages...")
print("This may take 2-3 minutes...\n")

!pip install -q transformers accelerate bitsandbytes
!pip install -q pdfplumber PyPDF2
!pip install -q pandas openpyxl
!pip install -q torch torchvision torchaudio

print("\n✅ All packages installed successfully!")
print("You can now proceed to the next cell.")


## 📚 Cell 2: Import Libraries

Import all necessary Python libraries.


In [None]:
# Import libraries
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
import pdfplumber
import PyPDF2
import pandas as pd
import json
import re
from pathlib import Path
from datetime import datetime
from google.colab import files
import io
import os
from typing import Dict, List, Any

print("✅ All libraries imported successfully!")
print(f"🔥 PyTorch version: {torch.__version__}")
print(f"🎯 CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"🚀 GPU: {torch.cuda.get_device_name(0)}")


## 🤖 Cell 3: Load Llama 3.2 Model

Load the Llama 3.2 3B model from Hugging Face.

⏱️ First time: ~2-4 minutes (downloads ~6GB)

⏱️ Subsequent runs: ~30 seconds (cached)

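**Note**: This uses the Llama 3.2 3B Instruct model, which is free to use and small enough for Colab's free-tier T4 GPU. The `meta-llama` repository on Hugging Face is gated, so you may need to accept the license and authenticate with a Hugging Face token first. If loading fails, the cell falls back to Phi-3.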


In [None]:
print("🤖 Loading Llama 3.2 3B model...")
print("This may take 2-4 minutes on first run (model will be cached for future use)\n")

# Model selection
MODEL_NAME = "meta-llama/Llama-3.2-3B-Instruct"

# Alternative: Use Phi-3 if Llama is not available
# MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"

try:
    # Load tokenizer
    print("📝 Loading tokenizer...")
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    
    # Load model with optimizations for Colab
    print("🧠 Loading model...")
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        torch_dtype=torch.float16,  # Use half precision to save memory
        device_map="auto",           # Automatically use GPU
        trust_remote_code=True,
        attn_implementation="eager"  # Use eager attention to avoid flash-attention warnings
    )
    
    print("\n✅ Model loaded successfully!")
    print(f"📊 Model: {MODEL_NAME}")
    print(f"💾 Model size: ~6GB")
    print(f"🎯 Device: {next(model.parameters()).device}")
    
except Exception as e:
    print(f"❌ Error loading model: {e}")
    print("\nTrying alternative model (Phi-3)...")
    
    MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME,
        torch_dtype=torch.float16,
        device_map="auto",
        trust_remote_code=True,
        attn_implementation="eager"  # Use eager attention to avoid flash-attention warnings
    )
    
    # Set pad token if not set
    if tokenizer.pad_token is None:
        tokenizer.pad_token = tokenizer.eos_token
    
    print("\n✅ Alternative model (Phi-3) loaded successfully!")


## 🔧 Cell 4: Define PDF Processing Functions

Functions to extract text from PDF documents.
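
The cell below writes the uploaded bytes to a file under `/tmp` before opening it. One alternative, sketched here under the assumption that a uniquely named file with guaranteed cleanup is wanted, uses the standard `tempfile` module. The helper names `write_temp_pdf` and `with_temp_pdf` are hypothetical and not part of the notebook; in practice the `reader` argument would be a function that opens the path with `pdfplumber`.

```python
import os
import tempfile


def write_temp_pdf(pdf_file_bytes: bytes) -> str:
    """Write bytes to a uniquely named temporary .pdf file and return its path."""
    # delete=False keeps the file on disk after the handle closes so another
    # library can reopen it by path; the caller is responsible for removing it.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(pdf_file_bytes)
        return tmp.name


def with_temp_pdf(pdf_file_bytes: bytes, reader):
    """Call reader(path) on a temporary copy of the bytes, then always clean up."""
    path = write_temp_pdf(pdf_file_bytes)
    try:
        return reader(path)
    finally:
        os.remove(path)  # runs even if reader raises


# Usage with a stand-in reader that just reports the file size in bytes.
size = with_temp_pdf(b"%PDF-1.4 example bytes", os.path.getsize)
print(size)
```

Because the file name is generated rather than taken from the upload, two uploads with the same name cannot overwrite each other, and the `finally` clause removes the file even when extraction fails.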


In [None]:
def extract_text_from_pdf(pdf_file_bytes, filename="document.pdf"):
    """Extract text from PDF file bytes."""
    try:
        # Save bytes to temporary file
        temp_path = f"/tmp/{filename}"
        with open(temp_path, 'wb') as f:
            f.write(pdf_file_bytes)
        
        # Extract text using pdfplumber
        text_parts = []
        with pdfplumber.open(temp_path) as pdf:
            for page_num, page in enumerate(pdf.pages, 1):
                page_text = page.extract_text()
                if page_text:
                    text_parts.append(page_text)
        
        # Clean up temp file
        os.remove(temp_path)
        
        return '\n\n'.join(text_parts)
        
    except Exception as e:
        print(f"❌ Error extracting PDF: {e}")
        return ""

def extract_text_from_file(file_bytes, filename):
    """Extract text from PDF or TXT file."""
    if filename.lower().endswith('.pdf'):
        return extract_text_from_pdf(file_bytes, filename)
    elif filename.lower().endswith('.txt'):
        # Replace undecodable bytes instead of raising on non-UTF-8 text files
        return file_bytes.decode('utf-8', errors='replace')
    else:
        print(f"⚠️ Unsupported file type: {filename}")
        return ""

print("✅ PDF processing functions defined!")


## 🧠 Cell 5: Define AI Analysis Functions

Functions to analyze proposals using the local LLM.
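
Model replies often wrap the JSON in extra text, so the extraction step matters. The sketch below shows one possible approach, which is not the one used in the cell that follows: it scans for the first parseable JSON object with the standard library's `json.JSONDecoder.raw_decode`. The helper name `extract_first_json_object` and the sample reply are hypothetical.

```python
import json


def extract_first_json_object(text: str):
    """Return the first parseable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)  # try the next opening brace
    return None


reply = 'Result: {"final_score": 83.0} (see section {3})'
print(extract_first_json_object(reply))  # {'final_score': 83.0}
```

Slicing from the first `{` to the last `}` would fail on this reply, because the trailing `{3}` ends up inside the slice.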


In [None]:
def analyze_proposal_with_llm(proposal_text, tender_requirements, supplier_name):
    """Analyze a supplier proposal using the local LLM."""
    
    # Create a shorter, more focused prompt for faster processing
    prompt = f"""Analyze this proposal and return JSON scores (0-100):

Tender: {tender_requirements[:1000]}

Proposal: {proposal_text[:1500]}

Return JSON:
{{
  "supplier_name": "{supplier_name}",
  "criteria_scores": {{
    "technical_compliance": {{"score": 85, "justification": "Brief explanation", "evidence": []}},
    "price_competitiveness": {{"score": 75, "justification": "Brief explanation", "evidence": []}},
    "company_experience": {{"score": 90, "justification": "Brief explanation", "evidence": []}},
    "timeline_feasibility": {{"score": 80, "justification": "Brief explanation", "evidence": []}},
    "risk_assessment": {{"score": 85, "justification": "Brief explanation", "evidence": []}}
  }},
  "final_score": 83.0,
  "overall_summary": "Summary",
  "red_flags": [],
  "recommendations": "Recommendation"
}}"""
    
    try:
        # Generate response using the model with optimized settings
        inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=2048).to(model.device)
        
        print(f"   🧠 Starting generation...")
        print(f"   📊 Input length: {inputs['input_ids'].shape[1]} tokens")
        
        import time
        start_time = time.time()
        
        with torch.no_grad():
            outputs = model.generate(
                **inputs,
                max_new_tokens=500,  # Reduced for faster generation
                do_sample=False,     # Greedy decoding; temperature has no effect when sampling is off
                pad_token_id=tokenizer.pad_token_id if tokenizer.pad_token_id is not None else tokenizer.eos_token_id,
                eos_token_id=tokenizer.eos_token_id,
                use_cache=False      # Disable cache to avoid DynamicCache error
            )
        
        generation_time = time.time() - start_time
        print(f"   ⏱️ Generation completed in {generation_time:.1f} seconds")
        
        # Decode only the newly generated tokens. outputs[0] also contains the prompt,
        # whose JSON template would otherwise be picked up by the extraction below.
        prompt_length = inputs['input_ids'].shape[1]
        response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
        print(f"   📝 Response length: {len(response)} characters")
        print(f"   🔍 Response preview: {response[:100]}...")
        
        # Extract JSON from response
        json_start = response.find('{')
        json_end = response.rfind('}') + 1
        
        if json_start != -1 and json_end > json_start:
            json_content = response[json_start:json_end]
            
            try:
                result = json.loads(json_content)
                result['status'] = 'success'
                result['model_used'] = MODEL_NAME
                result['analysis_timestamp'] = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
                return result
            except json.JSONDecodeError:
                print(f"⚠️ JSON parsing failed for {supplier_name}")
                return create_fallback_result(supplier_name)
        else:
            return create_fallback_result(supplier_name)
            
    except Exception as e:
        print(f"❌ Error analyzing {supplier_name}: {e}")
        return create_fallback_result(supplier_name)

def create_fallback_result(supplier_name):
    """Create a fallback result when analysis fails."""
    return {
        'supplier_name': supplier_name,
        'criteria_scores': {
            'technical_compliance': {'score': 50, 'justification': 'Analysis incomplete', 'evidence': []},
            'price_competitiveness': {'score': 50, 'justification': 'Analysis incomplete', 'evidence': []},
            'company_experience': {'score': 50, 'justification': 'Analysis incomplete', 'evidence': []},
            'timeline_feasibility': {'score': 50, 'justification': 'Analysis incomplete', 'evidence': []},
            'risk_assessment': {'score': 50, 'justification': 'Analysis incomplete', 'evidence': []}
        },
        'final_score': 50.0,
        'overall_summary': 'Analysis could not be completed',
        'red_flags': ['Manual review required'],
        'recommendations': 'Please review manually',
        'status': 'error',
        'model_used': MODEL_NAME,
        'analysis_timestamp': datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    }

print("✅ Analysis functions defined!")


## 📊 Cell 6: Define Scoring and Ranking Functions

Functions to calculate weighted scores and rank suppliers.
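
The scoring code below assumes every `score` is already a number. A language model may instead return a string such as `"85/100"`, so a defensive conversion step can help. This is a minimal sketch under that assumption; `coerce_score` is a hypothetical helper, not part of the notebook.

```python
import re


def coerce_score(raw, default=0.0):
    """Convert a model-reported score to a float clamped to the 0-100 range."""
    if isinstance(raw, bool):  # bool is a subclass of int; treat it as invalid
        return default
    if isinstance(raw, (int, float)):
        value = float(raw)
    elif isinstance(raw, str):
        # Take the first number that appears in the string, e.g. "85/100" -> 85.
        match = re.search(r"-?\d+(?:\.\d+)?", raw)
        if match is None:
            return default
        value = float(match.group())
    else:
        return default
    return max(0.0, min(100.0, value))


for raw in [85, "85/100", "about 72.5 points", 140, None, "n/a"]:
    print(repr(raw), "->", coerce_score(raw))
```

One way to use it would be to wrap the `criterion_data.get('score', 0)` lookup inside `calculate_weighted_score`, so malformed values fall back to the default instead of raising.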


In [None]:
def calculate_weighted_score(analysis, weights):
    """Calculate weighted final score for a supplier."""
    criteria_scores = analysis.get('criteria_scores', {})
    weighted_sum = 0.0
    
    for criterion, weight in weights.items():
        criterion_data = criteria_scores.get(criterion, {})
        if isinstance(criterion_data, dict):
            score = criterion_data.get('score', 0)
            weighted_sum += score * (weight / 100.0)
    
    return round(weighted_sum, 2)

def rank_suppliers(analyses, weights):
    """Rank suppliers based on weighted scores."""
    ranked_suppliers = []
    
    for analysis in analyses:
        weighted_score = calculate_weighted_score(analysis, weights)
        supplier_data = analysis.copy()
        supplier_data['weighted_score'] = weighted_score
        ranked_suppliers.append(supplier_data)
    
    # Sort by weighted score (descending)
    ranked_suppliers.sort(key=lambda x: x['weighted_score'], reverse=True)
    
    # Assign ranks
    for i, supplier in enumerate(ranked_suppliers):
        supplier['rank'] = i + 1
    
    return ranked_suppliers

def export_to_excel(ranked_suppliers, weights, filename='tender_evaluation_results.xlsx'):
    """Export results to Excel file."""
    # Create rankings sheet
    rankings_data = []
    for supplier in ranked_suppliers:
        criteria_scores = supplier.get('criteria_scores', {})
        rankings_data.append({
            'Rank': supplier.get('rank', 0),
            'Supplier Name': supplier.get('supplier_name', 'Unknown'),
            'Final Score': supplier.get('weighted_score', 0),
            'Technical': criteria_scores.get('technical_compliance', {}).get('score', 0),
            'Price': criteria_scores.get('price_competitiveness', {}).get('score', 0),
            'Experience': criteria_scores.get('company_experience', {}).get('score', 0),
            'Timeline': criteria_scores.get('timeline_feasibility', {}).get('score', 0),
            'Risk': criteria_scores.get('risk_assessment', {}).get('score', 0)
        })
    
    df_rankings = pd.DataFrame(rankings_data)
    
    # Save to Excel
    with pd.ExcelWriter(filename, engine='openpyxl') as writer:
        df_rankings.to_excel(writer, sheet_name='Rankings', index=False)
    
    return filename

print("✅ Scoring and ranking functions defined!")


## 📁 Cell 7: Upload Tender Requirements Document

Click the button below to upload your tender requirements document (PDF or TXT).


In [None]:
print("📋 Upload Tender Requirements Document")
print("Please select your tender requirements file (PDF or TXT)\n")

uploaded_tender = files.upload()

# Process the uploaded file
tender_requirements = ""
tender_filename = ""

for filename, file_bytes in uploaded_tender.items():
    tender_filename = filename
    print(f"\n📄 Processing: {filename}")
    tender_requirements = extract_text_from_file(file_bytes, filename)
    print(f"✅ Extracted {len(tender_requirements)} characters")
    print("\nPreview (first 300 characters):")
    print("-" * 50)
    print(tender_requirements[:300] + "..." if len(tender_requirements) > 300 else tender_requirements)
    print("-" * 50)

if tender_requirements:
    print("\n🎉 Tender requirements loaded successfully!")
else:
    print("\n❌ Failed to load tender requirements. Please try again.")


## 📄 Cell 8: Upload Supplier Proposals

Click the button below to upload multiple supplier proposal documents (PDF or TXT).


In [None]:
print("📄 Upload Supplier Proposals")
print("You can select multiple files at once\n")

uploaded_proposals = files.upload()

# Process all uploaded files
supplier_proposals = []

for filename, file_bytes in uploaded_proposals.items():
    print(f"\n📄 Processing: {filename}")
    proposal_text = extract_text_from_file(file_bytes, filename)
    
    if proposal_text:
        supplier_proposals.append({
            'name': filename,
            'text': proposal_text
        })
        print(f"✅ Extracted {len(proposal_text)} characters")
    else:
        print(f"❌ Failed to extract text from {filename}")

print(f"\n🎉 Loaded {len(supplier_proposals)} supplier proposal(s)!")
for i, proposal in enumerate(supplier_proposals, 1):
    print(f"  {i}. {proposal['name']}: {len(proposal['text'])} characters")


## ⚖️ Cell 9: Configure Evaluation Weights

Set the importance of each evaluation criterion. **Weights must total 100%!**
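
The cell below prints a warning when the weights do not total 100% but does not stop execution. If a hard stop is preferred, a check along these lines could be used. This is a sketch; `validate_weights` is a hypothetical helper, and the tolerance mirrors the `0.01` used in the cell.

```python
def validate_weights(weights, expected_total=100.0, tolerance=0.01):
    """Return the total weight, or raise ValueError if the weights are invalid."""
    negative = [name for name, weight in weights.items() if weight < 0]
    if negative:
        raise ValueError(f"Negative weights: {negative}")
    total = sum(weights.values())
    if abs(total - expected_total) >= tolerance:
        raise ValueError(
            f"Weights must total {expected_total:g}%, currently {total:g}%"
        )
    return total


weights = {
    "technical_compliance": 30,
    "price_competitiveness": 25,
    "company_experience": 20,
    "timeline_feasibility": 15,
    "risk_assessment": 10,
}
print(validate_weights(weights))  # 100
```

Raising an exception stops "Run all" at this cell, so later cells cannot rank suppliers with inconsistent weights.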


In [None]:
# Configure evaluation weights (must total 100%)
evaluation_weights = {
    'technical_compliance': 30,
    'price_competitiveness': 25,
    'company_experience': 20,
    'timeline_feasibility': 15,
    'risk_assessment': 10
}

# Validate weights
total_weight = sum(evaluation_weights.values())

print("⚖️ Evaluation Weights Configuration")
print("=" * 40)
for criterion, weight in evaluation_weights.items():
    print(f"  {criterion.replace('_', ' ').title()}: {weight}%")
print("=" * 40)
print(f"Total: {total_weight}%")

if abs(total_weight - 100.0) < 0.01:
    print("✅ Weights are valid (total 100%)")
else:
    print(f"❌ Weights must total 100%, currently {total_weight}%")
    print("Please adjust the weights above and run this cell again.")


## 🚀 Cell 10: Run AI Analysis

Analyze all supplier proposals using the local LLM.

⏱️ Expected time: ~2-5 minutes per proposal


In [None]:
print("🤖 Starting AI Analysis...")
print(f"Analyzing {len(supplier_proposals)} proposal(s)...\n")

analysis_results = []

for i, proposal in enumerate(supplier_proposals, 1):
    print(f"\n{'='*60}")
    print(f"📊 Analyzing Proposal {i}/{len(supplier_proposals)}: {proposal['name']}")
    print(f"{'='*60}")
    
    result = analyze_proposal_with_llm(
        proposal_text=proposal['text'],
        tender_requirements=tender_requirements,
        supplier_name=proposal['name']
    )
    
    analysis_results.append(result)
    
    if result.get('status') == 'success':
        print(f"✅ Analysis completed for {proposal['name']}")
        print(f"   Final Score: {result.get('final_score', 'N/A')}")
    else:
        print(f"⚠️ Analysis incomplete for {proposal['name']}")
    
    # Clear GPU cache to prevent memory issues
    torch.cuda.empty_cache()

print(f"\n\n🎉 Analysis completed for all {len(analysis_results)} proposals!")


## 🏆 Cell 11: Rank Suppliers and Calculate Scores

Calculate weighted scores and rank all suppliers.


In [None]:
print("🏆 Ranking Suppliers...\n")

# Rank suppliers
ranked_suppliers = rank_suppliers(analysis_results, evaluation_weights)

# Calculate summary statistics
scores = [s.get('weighted_score', 0) for s in ranked_suppliers]

print("📊 EVALUATION SUMMARY")
print("=" * 60)
print(f"Total Suppliers Evaluated: {len(ranked_suppliers)}")
if scores:  # Guard against division by zero when no proposals were analyzed
    print(f"Average Score: {sum(scores) / len(scores):.1f}")
    print(f"Highest Score: {max(scores):.1f}")
    print(f"Lowest Score: {min(scores):.1f}")
    print(f"Score Range: {max(scores) - min(scores):.1f}")
else:
    print("No suppliers to summarize. Upload proposals in Cell 8 and rerun Cell 10.")
print("=" * 60)

print("\n✅ Ranking completed!")


## 📊 Cell 12: Display Results - Rankings Table

View the ranked suppliers with their scores.


In [None]:
print("🏆 SUPPLIER RANKINGS")
print("=" * 100)

# Create rankings dataframe
rankings_data = []
for supplier in ranked_suppliers:
    criteria_scores = supplier.get('criteria_scores', {})
    rankings_data.append({
        'Rank': supplier.get('rank', 0),
        'Supplier': supplier.get('supplier_name', 'Unknown'),
        'Final Score': f"{supplier.get('weighted_score', 0):.1f}",
        'Technical': criteria_scores.get('technical_compliance', {}).get('score', 0),
        'Price': criteria_scores.get('price_competitiveness', {}).get('score', 0),
        'Experience': criteria_scores.get('company_experience', {}).get('score', 0),
        'Timeline': criteria_scores.get('timeline_feasibility', {}).get('score', 0),
        'Risk': criteria_scores.get('risk_assessment', {}).get('score', 0)
    })

df_rankings = pd.DataFrame(rankings_data)
display(df_rankings)

print("\n" + "=" * 100)


## 📋 Cell 13: Display Detailed Analysis

View detailed analysis for each supplier.


In [None]:
print("📋 DETAILED ANALYSIS")
print("=" * 100)

for supplier in ranked_suppliers:
    print(f"\n{'#'*100}")
    print(f"RANK #{supplier.get('rank', 0)}: {supplier.get('supplier_name', 'Unknown')}")
    print(f"Final Score: {supplier.get('weighted_score', 0):.1f}/100")
    print(f"{'#'*100}\n")
    
    # Overall summary
    print(f"📝 Overall Summary:")
    print(f"   {supplier.get('overall_summary', 'N/A')}\n")
    
    # Recommendations
    print(f"💡 Recommendations:")
    print(f"   {supplier.get('recommendations', 'N/A')}\n")
    
    # Red flags
    red_flags = supplier.get('red_flags', [])
    if red_flags:
        print(f"⚠️ Red Flags:")
        for flag in red_flags:
            print(f"   • {flag}")
        print()
    
    # Criteria scores
    print(f"📊 Criteria Scores:")
    criteria_scores = supplier.get('criteria_scores', {})
    for criterion, data in criteria_scores.items():
        if isinstance(data, dict):
            score = data.get('score', 0)
            justification = data.get('justification', 'N/A')
            criterion_name = criterion.replace('_', ' ').title()
            print(f"\n   {criterion_name}: {score}/100")
            print(f"   → {justification}")
            
            evidence = data.get('evidence', [])
            if evidence:
                print(f"   Evidence:")
                for item in evidence[:3]:
                    print(f"      • {item}")
    
    print(f"\n{'-'*100}")

print("\n✅ Detailed analysis displayed!")


## 📤 Cell 14: Export Results to Excel

Export the evaluation results to an Excel file and download it.


In [None]:
print("📤 Exporting results to Excel...\n")

# Generate timestamp for filename
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
excel_filename = f'tender_evaluation_results_{timestamp}.xlsx'

# Export to Excel
export_to_excel(ranked_suppliers, evaluation_weights, excel_filename)

print(f"✅ Excel file created: {excel_filename}")
print("\nClick the button below to download:\n")

# Download the file
files.download(excel_filename)

print("\n🎉 Results exported successfully!")


## 💾 Cell 15: Save Results as JSON (Optional)

Export results as JSON for programmatic access.
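
As an illustration of programmatic access, the sketch below reads an export shaped like the one this cell writes and lists the ranking. The supplier names and scores are made up and `summarize_export` is a hypothetical helper; the keys (`ranked_suppliers`, `rank`, `supplier_name`, `weighted_score`, `status`) follow the structures built earlier in the notebook.

```python
import json

# Stand-in for the exported file contents; names and scores are made up.
exported = json.dumps({
    "evaluation_metadata": {
        "total_suppliers": 2,
        "model_used": "meta-llama/Llama-3.2-3B-Instruct",
    },
    "ranked_suppliers": [
        {"rank": 1, "supplier_name": "supplier_a.pdf",
         "weighted_score": 82.75, "status": "success"},
        {"rank": 2, "supplier_name": "supplier_b.pdf",
         "weighted_score": 50.0, "status": "error"},
    ],
})


def summarize_export(json_text):
    """Return (rank, name, score) rows and the names that need manual review."""
    data = json.loads(json_text)
    suppliers = data["ranked_suppliers"]
    rows = [(s["rank"], s["supplier_name"], s["weighted_score"]) for s in suppliers]
    # Fallback results carry status "error" and should be checked by a person.
    needs_review = [s["supplier_name"] for s in suppliers if s.get("status") != "success"]
    return rows, needs_review


rows, needs_review = summarize_export(exported)
for rank, name, score in rows:
    print(f"{rank}. {name}: {score:.1f}")
print("Manual review:", needs_review)
```

To read a real export, replace `exported` with the contents of the downloaded file, for example via `Path(json_filename).read_text(encoding="utf-8")`.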


In [None]:
print("💾 Saving results as JSON...\n")

# Create export data
export_data = {
    'evaluation_metadata': {
        'timestamp': datetime.now().isoformat(),
        'weights_used': evaluation_weights,
        'total_suppliers': len(ranked_suppliers),
        'model_used': MODEL_NAME
    },
    'ranked_suppliers': ranked_suppliers
}

# Save to JSON file
timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
json_filename = f'tender_evaluation_results_{timestamp}.json'

with open(json_filename, 'w', encoding='utf-8') as f:
    json.dump(export_data, f, indent=2, ensure_ascii=False)

print(f"✅ JSON file created: {json_filename}")
print("\nClick the button below to download:\n")

# Download the file
files.download(json_filename)

print("\n🎉 JSON export completed!")


---

## 🎓 Usage Instructions

### Quick Start Guide:

1. **Enable GPU**: Runtime → Change runtime type → T4 GPU
2. **Run All Cells**: Runtime → Run all (or run cells sequentially)
3. **Upload Files**: When prompted, select your tender and proposal documents
4. **Review Results**: Scroll down to see rankings and detailed analysis
5. **Download**: Click download buttons to get Excel and JSON files

### Customization:

- **Adjust Weights**: Modify Cell 9 to change evaluation criteria importance
- **Change Model**: Edit Cell 3 to use a different model (e.g., Phi-3)
- **Modify Criteria**: Update the analysis prompt in Cell 5

### Troubleshooting:

- **Out of Memory**: Use a smaller model or enable 4-bit quantization
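- **Slow Processing**: Expect 2-5 minutes per proposal. Local generation is slow on a free-tier GPU, and Cell 5 sets `use_cache=False`, which disables the KV cache and slows generation further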
- **Model Download Fails**: Try restarting the runtime
- **GPU Not Available**: Check if you've enabled GPU in runtime settings
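
For the out-of-memory case, 4-bit quantization could look roughly like the sketch below. It assumes the `transformers` and `bitsandbytes` packages installed in Cell 1 and a CUDA GPU. `load_model_4bit` is a hypothetical helper, not part of the notebook, and output quality at 4-bit has not been checked for this task.

```python
def load_model_4bit(model_name):
    """Load a causal LM with 4-bit weights via bitsandbytes (needs a CUDA GPU)."""
    # Imported lazily so the function can be defined before the packages load.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,                     # store weights in 4-bit precision
        bnb_4bit_quant_type="nf4",             # NormalFloat4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # compute in half precision
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        quantization_config=quant_config,
        device_map="auto",
    )
    return tokenizer, model
```

In Cell 3 this would replace the `from_pretrained` calls, for example `tokenizer, model = load_model_4bit(MODEL_NAME)`.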

---

## 📞 Support

**JESA Tender Evaluation System**
- GitHub: https://github.com/Kazaz-Mohammed/JESA_SUPPLIERS_MODEL.git
- Version: Colab with Local LLM (Llama 3.2)

---

*Made with ❤️ for JESA*
