## 📁 File Structure Created
```
project/
├── screening_history/       # JSON files with full screening data
│   └── screening_YYYYMMDD_HHMMSS.json
├── exports/                 # CSV exports for easy sharing
│   └── screening_YYYYMMDD_HHMMSS.csv
└── AI_Project.ipynb        # This notebook
```

## 🔧 Technical Stack
- **NLP**: sentence-transformers (all-mpnet-base-v2)
- **Visualization**: Plotly (interactive charts)
- **Text Processing**: NLTK, scikit-learn
- **UI**: Gradio 4.x with custom CSS
- **PDF Parsing**: PyMuPDF (fitz)

In [None]:
!pip install gradio PyMuPDF nltk scikit-learn

import gradio as gr
import fitz
import io
import nltk
import re
import string
import numpy as np
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# Download NLTK data
nltk.download('stopwords')
nltk.download('punkt')
nltk.download('wordnet')


Collecting PyMuPDF
  Downloading pymupdf-1.26.7-cp310-abi3-manylinux_2_28_x86_64.whl.metadata (3.4 kB)
Downloading pymupdf-1.26.7-cp310-abi3-manylinux_2_28_x86_64.whl (24.1 MB)
   24.1/24.1 MB 41.4 MB/s eta 0:00:00
Installing collected packages: PyMuPDF
Successfully installed PyMuPDF-1.26.7


[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package wordnet to /root/nltk_data...


True

In [None]:
!pip install sentence-transformers tf-keras



In [None]:
!pip install plotly

# Fix for Keras 3 compatibility issue
import os
os.environ['TRANSFORMERS_NO_TF'] = '1'



In [None]:
# Fix for Keras 3 compatibility - must be set before imports
import os
os.environ['TRANSFORMERS_NO_TF'] = '1'

from sentence_transformers import SentenceTransformer, util
model = SentenceTransformer('all-mpnet-base-v2')

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



In [None]:
import json
import os
from datetime import datetime
import plotly.graph_objects as go
import plotly.express as px
from pathlib import Path
import pandas as pd  # used by get_screening_by_file below

# Create directories for storage
Path("screening_history").mkdir(exist_ok=True)
Path("exports").mkdir(exist_ok=True)

In [None]:
import nltk
nltk.download('punkt')
nltk.download('punkt_tab')
nltk.download('stopwords')
nltk.download('wordnet')


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt_tab.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

## Text Cleaning

In [None]:
def clean_text(text):
    text = text.lower()
    text = re.sub(r'[^a-z\s]', ' ', text)
    tokens = nltk.word_tokenize(text)
    stop_words = set(stopwords.words('english'))
    tokens = [w for w in tokens if w not in stop_words and len(w) > 2]
    lemmatizer = WordNetLemmatizer()
    tokens = [lemmatizer.lemmatize(w) for w in tokens]
    cleaned = " ".join(tokens)
    return cleaned
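
The cleaning step keeps letters only, so digits and symbols disappear before scoring. The sketch below mirrors the shape of `clean_text` without NLTK: `DEMO_STOP_WORDS` and `clean_text_sketch` are hypothetical names, the stop-word set is a tiny stand-in for NLTK's English list, and lemmatization is left out.

```python
import re

# Hypothetical mini stop-word list; the notebook uses NLTK's English list.
DEMO_STOP_WORDS = {"the", "and", "with", "for", "has"}

def clean_text_sketch(text):
    """Lowercase, replace non-letters with spaces, drop stop words and short tokens."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    tokens = [w for w in text.split() if w not in DEMO_STOP_WORDS and len(w) > 2]
    return " ".join(tokens)

print(clean_text_sketch("Built 3 REST APIs with Python 3.11 and C++"))
# built rest apis python
```

Note that "C++" and the version number are lost entirely, so skills written with digits or symbols may need special handling before this step.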

## Buzzword Remover

In [None]:
def remove_buzzwords(text):
    buzzwords = {
        "team", "player", "motivated", "driven", "responsible", "innovative", "creative",
        "hardworking", "dedicated", "dynamic", "excellent", "fast", "passionate",
        "communication", "problem", "solving", "self", "starter", "leadership", "organized",
        "goal", "oriented", "results", "focused", "collaborative", "multi", "tasking"
    }

    words = text.split()
    filtered = [w for w in words if w not in buzzwords]
    return " ".join(filtered)


In [None]:
def extract_experience(text):
    """Extract years of experience from resume text"""
    patterns = [
        r'(\d+)\+?\s*(?:years?|yrs?)\s*(?:of)?\s*(?:experience|exp)',
        r'experience[:\s]+(\d+)\+?\s*(?:years?|yrs?)',
        r'(\d+)\+?\s*(?:years?|yrs?)\s*in',
    ]

    years = []
    for pattern in patterns:
        matches = re.findall(pattern, text.lower())
        years.extend([int(m) for m in matches])

    return max(years) if years else 0


def extract_education(text):
    """Extract the highest education level mentioned in the resume"""
    # Ordered from highest to lowest; the first level that matches wins.
    education_levels = {
        'phd': [r'ph\.?d', 'doctorate', 'doctoral'],
        'masters': ['masters?', 'msc', r'm\.sc', 'ma', r'm\.a', 'mba', r'm\.b\.a', 'mtech', r'm\.tech'],
        'bachelors': ['bachelors?', 'bsc', r'b\.sc', 'ba', r'b\.a', 'btech', r'b\.tech', 'be', r'b\.e'],
        'associate': ['associates?', 'diploma'],
        'high_school': ['high school', 'secondary']
    }

    text_lower = text.lower()
    for level, patterns in education_levels.items():
        for pattern in patterns:
            # Word boundaries stop short codes such as 'ma' or 'ba' from matching
            # inside longer words; standalone words like "be" can still match.
            if re.search(rf'\b{pattern}\b', text_lower):
                return level
    return 'not_specified'


def extract_certifications(text):
    """Extract certifications from resume"""
    cert_patterns = [
        r'(?:certified|certification)[:\s]+([A-Z][A-Za-z\s&+]+(?:Certified|Certificate|Certification))',
        r'([A-Z]{2,}(?:\s+[A-Z]{2,})*)\s*(?:Certified|Certification)',
        r'(AWS|Azure|GCP|PMP|CISSP|CompTIA|Cisco|Oracle|SAP)\s*[A-Za-z\s]*(?:Certified|Certification)?'
    ]

    certs = set()
    for pattern in cert_patterns:
        matches = re.findall(pattern, text)
        certs.update(matches)

    return list(certs)[:10]  # Limit to 10 certifications


def extract_contact_info(text):
    """Extract email and phone number"""
    email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'
    phone_pattern = r'(?:\+\d{1,3}[-.\s]?)?\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}'

    emails = re.findall(email_pattern, text)
    phones = re.findall(phone_pattern, text)

    return {
        'email': emails[0] if emails else None,
        'phone': phones[0] if phones else None
    }
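
To see what the experience patterns actually capture, the sketch below reuses the three regular expressions from `extract_experience` on made-up sentences. `PATTERNS` and `years_found` are hypothetical names introduced only for this illustration.

```python
import re

# Same three patterns as extract_experience above.
PATTERNS = [
    r'(\d+)\+?\s*(?:years?|yrs?)\s*(?:of)?\s*(?:experience|exp)',
    r'experience[:\s]+(\d+)\+?\s*(?:years?|yrs?)',
    r'(\d+)\+?\s*(?:years?|yrs?)\s*in',
]

def years_found(text):
    """Return every year count matched by any pattern, in pattern order."""
    found = []
    for pattern in PATTERNS:
        found.extend(int(m) for m in re.findall(pattern, text.lower()))
    return found

sample = "Senior engineer with 5+ years of experience. Worked 3 yrs in fintech."
print(years_found(sample))       # [5, 3]
print(max(years_found(sample)))  # 5

# A phrase that is not about the candidate's career matches as well.
print(years_found("Joined a firm with 40 years in business."))  # [40]
```

Because the function returns the maximum, a single stray match like the last one can inflate the reported experience, so the value is best treated as a rough estimate.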



In [None]:
def calculate_ats_score(text, original_text):
    """Calculate ATS-friendliness score with detailed feedback"""
    score = 0
    feedback = []
    max_score = 100

    # 1. Length check (15 points)
    word_count = len(text.split())
    if 300 <= word_count <= 800:
        score += 15
    elif word_count < 300:
        feedback.append("❌ Resume too short (add more details)")
        score += 5
    else:
        feedback.append("⚠️ Resume too long (consider condensing)")
        score += 10

    # 2. Action verbs (20 points)
    action_verbs = [
        'achieved', 'improved', 'developed', 'designed', 'implemented',
        'managed', 'led', 'created', 'built', 'launched', 'increased',
        'decreased', 'optimized', 'streamlined', 'established', 'coordinated'
    ]
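    # Match whole tokens so that 'led' does not count inside words like 'scheduled'
    resume_tokens = set(text.split())
    found_verbs = [v for v in action_verbs if v in resume_tokens]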
    verb_score = min(20, len(found_verbs) * 2)
    score += verb_score
    if verb_score < 10:
        feedback.append(f"❌ Add more action verbs (found {len(found_verbs)}, need 5+)")

    # 3. Quantifiable achievements (25 points)
    numbers_pattern = r'\d+%|\$\d+|[\d,]+\+?(?:\s*(?:users|customers|clients|million|thousand|projects))'
    quantifiables = re.findall(numbers_pattern, original_text.lower())
    quant_score = min(25, len(quantifiables) * 5)
    score += quant_score
    if quant_score < 15:
        feedback.append(f"❌ Add quantifiable achievements (found {len(quantifiables)}, need 3+)")

    # 4. Section headers (15 points)
    headers = ['experience', 'education', 'skills', 'projects', 'summary', 'objective']
    found_headers = [h for h in headers if h in original_text.lower()]
    header_score = min(15, len(found_headers) * 3)
    score += header_score
    if header_score < 9:
        feedback.append(f"⚠️ Add standard sections (found {len(found_headers)}/6)")

    # 5. Contact information (10 points)
    contact = extract_contact_info(original_text)
    if contact['email']:
        score += 5
    else:
        feedback.append("❌ Missing email address")
    if contact['phone']:
        score += 5
    else:
        feedback.append("⚠️ Missing phone number")

    # 6. Keywords density (15 points)
    unique_words = len(set(text.split()))
    density_score = min(15, unique_words // 10)
    score += density_score
    if density_score < 10:
        feedback.append("⚠️ Limited vocabulary (add more diverse keywords)")

    if not feedback:
        feedback.append("✅ Resume is ATS-friendly!")

    return round(score, 2), feedback
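
The quantifiable-achievements check is the largest single component (25 points), so it helps to see what its pattern counts. This sketch applies the same regular expression to an invented resume line; `NUMBERS_PATTERN` and the sample sentence are illustrative only.

```python
import re

# Same pattern as in calculate_ats_score above.
NUMBERS_PATTERN = r'\d+%|\$\d+|[\d,]+\+?(?:\s*(?:users|customers|clients|million|thousand|projects))'

line = "Increased revenue by 25%, saved $40k, and served 1,200 users."
matches = re.findall(NUMBERS_PATTERN, line.lower())
quant_score = min(25, len(matches) * 5)  # 5 points per match, capped at 25

print(matches)      # ['25%', '$40', '1,200 users']
print(quant_score)  # 15
```

Three matches reach 15 points, which is exactly the threshold below which the function adds the "Add quantifiable achievements" feedback; five or more matches earn the full 25.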

In [None]:
def save_screening_results(df, job_description, metadata=None):
    """Save screening results with timestamp"""
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")

    # Save to CSV
    csv_path = f"exports/screening_{timestamp}.csv"
    df.to_csv(csv_path, index=False)

    # Save to JSON with metadata
    json_data = {
        'timestamp': timestamp,
        'datetime': datetime.now().isoformat(),
        'job_description': job_description[:500],  # First 500 chars
        'results': df.to_dict(orient='records'),
        'metadata': metadata or {}
    }

    json_path = f"screening_history/screening_{timestamp}.json"
    with open(json_path, 'w') as f:
        json.dump(json_data, f, indent=2)

    return csv_path, json_path


def load_screening_history():
    """Load all screening history"""
    history = []
    history_dir = Path("screening_history")

    if history_dir.exists():
        for json_file in sorted(history_dir.glob("*.json"), reverse=True):
            try:
                with open(json_file, 'r') as f:
                    data = json.load(f)
                    history.append({
                        'timestamp': data.get('datetime', 'Unknown'),
                        'job_desc_preview': data.get('job_description', '')[:100] + '...',
                        'num_candidates': len(data.get('results', [])),
                        'top_score': max([r.get('Hybrid Score (%)', 0) for r in data.get('results', [])], default=0),
                        'file': str(json_file)
                    })
            except (OSError, ValueError, TypeError):
                # Skip unreadable or malformed history files
                continue

    return history


def get_screening_by_file(filepath):
    """Load specific screening result"""
    with open(filepath, 'r') as f:
        data = json.load(f)
    return pd.DataFrame(data['results'])
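
The save and load helpers rely on a timestamped file name and a JSON payload that can be read back later. A minimal round-trip sketch using only the standard library is shown below; `save_record_sketch` is a hypothetical helper and the record is invented. It reads the clock once so that the `timestamp` and `datetime` fields always agree.

```python
import json
import tempfile
from datetime import datetime
from pathlib import Path

def save_record_sketch(records, out_dir):
    """Write records to a timestamped JSON file and return its path."""
    now = datetime.now()  # single clock read keeps both fields consistent
    payload = {
        "timestamp": now.strftime("%Y%m%d_%H%M%S"),
        "datetime": now.isoformat(),
        "results": records,
    }
    path = Path(out_dir) / f"screening_{payload['timestamp']}.json"
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
    return path

with tempfile.TemporaryDirectory() as tmp:
    saved = save_record_sketch([{"Resume": "a.pdf", "Hybrid Score (%)": 71.5}], tmp)
    loaded = json.loads(saved.read_text(encoding="utf-8"))
    # Same top-score lookup that load_screening_history performs
    top = max((r.get("Hybrid Score (%)", 0) for r in loaded["results"]), default=0)
    print(saved.name, top)
```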

In [None]:
def create_score_distribution_chart(df):
    """Create bar chart for score distribution"""
    fig = go.Figure()

    fig.add_trace(go.Bar(
        x=df['Resume'],
        y=df['Hybrid Score (%)'],
        name='Hybrid Score',
        marker_color='#60a5fa',
        text=df['Hybrid Score (%)'],
        textposition='auto',
    ))

    fig.update_layout(
        title='Resume Score Distribution',
        xaxis_title='Candidate',
        yaxis_title='Score (%)',
        template='plotly_dark',
        height=400,
        showlegend=False
    )

    return fig


def create_skill_gap_analysis(df, jd_keywords):
    """Create radar chart showing skill gaps"""
    top_candidate = df.iloc[0]

    categories = ['Semantic Match', 'Skill Overlap', 'Keyword Match', 'ATS Score']
    values = [
        top_candidate.get('Semantic Match (%)', 0),
        top_candidate.get('Skill Overlap (%)', 0),
        top_candidate.get('Keyword Bonus (%)', 0),  # Already on a 0-100 scale
        top_candidate.get('ATS Score (%)', 0)
    ]

    fig = go.Figure()

    fig.add_trace(go.Scatterpolar(
        r=values,
        theta=categories,
        fill='toself',
        name=top_candidate['Resume'],
        marker_color='#60a5fa'
    ))

    fig.update_layout(
        polar=dict(
            radialaxis=dict(visible=True, range=[0, 100])
        ),
        showlegend=True,
        title=f"Skill Profile: {top_candidate['Resume']}",
        template='plotly_dark',
        height=400
    )

    return fig


def create_comparison_chart(df):
    """Create multi-metric comparison"""
    fig = go.Figure()

    metrics = ['Semantic Match (%)', 'Skill Overlap (%)', 'Hybrid Score (%)']
    colors = ['#60a5fa', '#34d399', '#f59e0b']

    for i, metric in enumerate(metrics):
        if metric in df.columns:
            fig.add_trace(go.Bar(
                name=metric.replace(' (%)', ''),
                x=df['Resume'],
                y=df[metric],
                marker_color=colors[i]
            ))

    fig.update_layout(
        barmode='group',
        title='Multi-Metric Comparison',
        xaxis_title='Candidate',
        yaxis_title='Score (%)',
        template='plotly_dark',
        height=450,
        legend=dict(orientation="h", yanchor="bottom", y=1.02, xanchor="right", x=1)
    )

    return fig


def generate_gap_analysis_text(matched_keywords, missing_keywords, jd_skills, matched_skills):
    """Generate detailed gap analysis"""
    analysis = []

    if missing_keywords:
        analysis.append(f"**Missing Keywords ({len(missing_keywords)}):** {', '.join(missing_keywords[:10])}")

    missing_skills = [s for s in jd_skills if s not in matched_skills]
    if missing_skills:
        analysis.append(f"\n**Skill Gaps ({len(missing_skills)}):** {', '.join(missing_skills[:8])}")

    if matched_skills:
        analysis.append(f"\n**Strengths ({len(matched_skills)}):** {', '.join(matched_skills[:8])}")

    return "\n".join(analysis) if analysis else "No significant gaps identified"

In [None]:
def extract_skills_from_jd(job_desc, top_n=15):
    """
    Extracts top skills or keywords from the job description using TF-IDF.
    """
    from sklearn.feature_extraction.text import TfidfVectorizer
    vectorizer = TfidfVectorizer(stop_words='english', max_features=top_n)
    tfidf_matrix = vectorizer.fit_transform([job_desc])
    skills = vectorizer.get_feature_names_out()
    return [s.lower() for s in skills]
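
The docstring above describes the result as TF-IDF keywords, but the vectorizer is fitted on a single document, so the weighting collapses to plain term frequency. Assuming scikit-learn's default `smooth_idf=True`, with $n = 1$ document and every term having document frequency $\mathrm{df}(t) = 1$:

$$
\mathrm{idf}(t) = \ln\frac{1 + n}{1 + \mathrm{df}(t)} + 1 = \ln\frac{2}{2} + 1 = 1,
\qquad
\text{tf-idf}(t) = \mathrm{tf}(t) \cdot \mathrm{idf}(t) = \mathrm{tf}(t).
$$

In practice `max_features=top_n` keeps the `top_n` most frequent non-stop-word terms, and `get_feature_names_out()` returns them in alphabetical order rather than by weight. Fitting on a larger corpus of job descriptions would be needed for the IDF part to carry any information.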


## File Text Extraction

In [None]:
def extract_text_from_file(file_obj):
    filename = file_obj.name
    if filename.endswith(".pdf"):
        text = ""
        with fitz.open(stream=file_obj.read(), filetype="pdf") as doc:
            for page in doc:
                text += page.get_text()
        return filename, text
    elif filename.endswith(".txt"):
        text = file_obj.read().decode("utf-8", errors="ignore")
        return filename, text
    else:
        return filename, ""

## Keyword Extractor

In [None]:
from collections import Counter

def extract_top_keywords(text, top_n=15):
    words = text.split()
    freq = Counter(words)
    keywords = [w for w, _ in freq.most_common(top_n)]
    return keywords
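
As a quick illustration of the frequency ranking, the sketch below runs the same `Counter.most_common` logic on an invented, already-cleaned string. `top_keywords_sketch` is a hypothetical name that mirrors `extract_top_keywords`.

```python
from collections import Counter

def top_keywords_sketch(text, top_n=15):
    """Return the top_n most frequent whitespace-separated tokens."""
    return [w for w, _ in Counter(text.split()).most_common(top_n)]

print(top_keywords_sketch("python sql python docker sql python", top_n=2))
# ['python', 'sql']
```

Tokens with equal counts keep the order in which they first appear, so earlier words in the job description win ties.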


## Main Function

In [None]:
def process_input(job_description, resume_paths, min_experience=0):
    try:
        if not job_description.strip():
            return None, None, None, None, "⚠️ Please enter a job description."

        resumes = {}
        resumes_original = {}
        for path in resume_paths or []:  # Gradio passes None when no file is uploaded
            filename = path.split("/")[-1].split("\\")[-1]
            with open(path, "rb") as f:
                if filename.endswith(".pdf"):
                    text = ""
                    with fitz.open(stream=f.read(), filetype="pdf") as doc:
                        for page in doc:
                            text += page.get_text()
                elif filename.endswith(".txt"):
                    f.seek(0)
                    text = f.read().decode("utf-8", errors="ignore")
                else:
                    continue
            if len(text.strip()) > 0:
                resumes[filename] = text
                resumes_original[filename] = text

        if not resumes:
            return None, None, None, None, "⚠️ No valid resumes uploaded."

        # --- CLEANING ---
        jd_clean = clean_text(job_description)
        jd_clean = remove_buzzwords(jd_clean)
        resumes_cleaned = {name: remove_buzzwords(clean_text(text)) for name, text in resumes.items()}

        # --- SCORING (Semantic + Keyword + ATS Hybrid) ---
        df_scores = score_resumes_semantic_hybrid_v2(jd_clean, resumes_cleaned, resumes_original)

        # Filter by minimum experience
        if min_experience > 0:
            df_scores = df_scores[df_scores['Experience (Years)'] >= min_experience]

        if df_scores.empty:
            return None, None, None, None, f"⚠️ No candidates meet the minimum experience requirement ({min_experience} years)."

        # --- SAVE RESULTS ---
        csv_path, json_path = save_screening_results(
            df_scores,
            job_description,
            metadata={'num_resumes': len(resumes), 'min_experience': min_experience}
        )

        # --- CREATE VISUALIZATIONS ---
        score_chart = create_score_distribution_chart(df_scores)
        comparison_chart = create_comparison_chart(df_scores)
        skill_profile = create_skill_gap_analysis(df_scores, extract_top_keywords(jd_clean))

        # --- GENERATE SUMMARY ---
        top_candidate = df_scores.iloc[0]
        summary = f"""
### 📊 Screening Summary

**Total Candidates Analyzed:** {len(df_scores)}
**Best Match:** {top_candidate['Resume']} ({top_candidate['Hybrid Score (%)']}%)

**Top Candidate Details:**
- 🎯 Hybrid Score: {top_candidate['Hybrid Score (%)']}%
- 🤖 ATS Score: {top_candidate['ATS Score (%)']}%
- 💼 Experience: {top_candidate['Experience (Years)']} years
- 🎓 Education: {top_candidate['Education']}
- 📧 Email: {top_candidate['Email']}

**Saved to:**
- CSV: `{csv_path}`
- JSON: `{json_path}`
"""

        return score_chart, comparison_chart, skill_profile, df_scores, summary

    except Exception as e:
        import traceback
        return None, None, None, None, f"❌ Error: {e}\n\n{traceback.format_exc()}"


## Scoring

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
import pandas as pd
import numpy as np

def score_resumes_semantic_hybrid_v2(job_description, resumes_cleaned, resumes_original, top_n_keywords=15):
    import numpy as np
    from sentence_transformers import util

    names = list(resumes_cleaned.keys())
    texts = [job_description] + list(resumes_cleaned.values())

    # --- Semantic Embeddings ---
    embeddings = model.encode(texts, convert_to_tensor=True)
    jd_emb = embeddings[0]
    resume_embs = embeddings[1:]
    similarities = util.cos_sim(jd_emb, resume_embs)[0].cpu().numpy()
    semantic_scores = np.round(similarities * 100, 2)

    # --- Keyword Insights ---
    jd_keywords = extract_top_keywords(job_description, top_n=top_n_keywords)
    jd_skills = extract_skills_from_jd(job_description)

    details = []
    for i, (name, text) in enumerate(resumes_cleaned.items()):
        words = set(text.split())
        matched = [kw for kw in jd_keywords if kw in words]
        skill_overlap = [s for s in jd_skills if s in words]
        missing = [kw for kw in jd_keywords if kw not in words]

        # --- Weighted Hybrid Scoring ---
        skill_weight = len(skill_overlap) / max(len(jd_skills), 1)
        keyword_bonus = len(matched) / max(len(jd_keywords), 1)
        # Cast the NumPy float32 to a Python float so the result serializes cleanly
        hybrid_score = 0.7 * float(semantic_scores[i]) + 0.2 * (skill_weight * 100) + 0.1 * (keyword_bonus * 100)

        # --- Extract Additional Info ---
        original_text = resumes_original[name]
        experience_years = extract_experience(original_text)
        education = extract_education(original_text)
        certifications = extract_certifications(original_text)
        contact = extract_contact_info(original_text)

        # --- ATS Score ---
        ats_score, ats_feedback = calculate_ats_score(text, original_text)

        details.append({
            "Resume": name,
            "Hybrid Score (%)": round(hybrid_score, 2),
            "Semantic Match (%)": round(float(semantic_scores[i]), 2),  # plain float for JSON/CSV
            "Skill Overlap (%)": round(skill_weight * 100, 2),
            "Keyword Bonus (%)": round(keyword_bonus * 100, 2),
            "ATS Score (%)": ats_score,
            "Experience (Years)": experience_years,
            "Education": education.replace('_', ' ').title(),
            "Certifications": ', '.join(certifications[:3]) if certifications else 'None',
            "Email": contact['email'] or 'Not found',
            "Phone": contact['phone'] or 'Not found',
            "Matched Keywords": ", ".join(matched[:10]),
            "Missing Keywords": ", ".join(missing[:10]),
            "Matched Skills": ", ".join(skill_overlap[:8]),
            "ATS Feedback": " | ".join(ats_feedback[:3])
        })

    df = (
        pd.DataFrame(details)
        .sort_values(by="Hybrid Score (%)", ascending=False)
        .reset_index(drop=True)
    )
    return df
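
The hybrid score is a fixed 70/20/10 blend of semantic similarity, skill overlap and keyword overlap. The arithmetic can be checked by hand with the sketch below; `hybrid_score_sketch` is a hypothetical helper and the input numbers are invented.

```python
def hybrid_score_sketch(semantic_pct, skills_hit, skills_total, kw_hit, kw_total):
    """Blend the three signals with the 0.7 / 0.2 / 0.1 weights used above."""
    skill_weight = skills_hit / max(skills_total, 1)   # guard against empty lists
    keyword_bonus = kw_hit / max(kw_total, 1)
    score = 0.7 * semantic_pct + 0.2 * skill_weight * 100 + 0.1 * keyword_bonus * 100
    return round(score, 2)

# 62.5% semantic match, 8 of 15 skills, 6 of 15 keywords:
# 0.7 * 62.5 = 43.75, 0.2 * 53.33 = 10.67, 0.1 * 40 = 4.0
print(hybrid_score_sketch(62.5, 8, 15, 6, 15))  # 58.42
```

Since semantic similarity carries 70% of the weight, a resume with no exact keyword hits can still outrank one that matches every keyword but reads differently overall.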

## Gradio Interface





In [None]:
import gradio as gr

# Enhanced Custom CSS for better UI
custom_css = """
@import url('https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700;800&display=swap');

.gradio-container {
    background: linear-gradient(135deg, #0a0e27 0%, #1a1f3a 25%, #2d1b69 50%, #1a1f3a 75%, #0a0e27 100%) !important;
    font-family: 'Inter', -apple-system, BlinkMacSystemFont, 'Segoe UI', system-ui, sans-serif !important;
    min-height: 100vh !important;
}

.contain {
    max-width: 1600px !important;
    margin: auto !important;
    padding: 2rem !important;
}

/* Animated gradient header */
.markdown h2 {
    color: transparent !important;
    font-size: 3rem !important;
    font-weight: 800 !important;
    text-align: center !important;
    margin-bottom: 0.5rem !important;
    background: linear-gradient(90deg, #60a5fa, #a78bfa, #ec4899, #f59e0b, #60a5fa) !important;
    background-size: 200% 200% !important;
    -webkit-background-clip: text !important;
    -webkit-text-fill-color: transparent !important;
    animation: gradient 8s ease infinite !important;
    letter-spacing: -0.02em !important;
    text-shadow: 0 0 30px rgba(96, 165, 250, 0.3) !important;
}

@keyframes gradient {
    0% { background-position: 0% 50%; }
    50% { background-position: 100% 50%; }
    100% { background-position: 0% 50%; }
}

.markdown h3 {
    color: #a78bfa !important;
    font-size: 1.25rem !important;
    font-weight: 600 !important;
    text-align: center !important;
    margin-bottom: 2.5rem !important;
    opacity: 0.9 !important;
}

/* Enhanced label styling */
label {
    color: #cbd5e1 !important;
    font-weight: 600 !important;
    margin-bottom: 0.75rem !important;
    letter-spacing: 0.02em !important;
    text-transform: uppercase !important;
    font-size: 0.85rem !important;
}

/* Improved textbox styling */
.scroll-hide {
    background: rgba(15, 23, 42, 0.8) !important;
    border: 1.5px solid rgba(139, 92, 246, 0.3) !important;
    border-radius: 1rem !important;
    color: #e2e8f0 !important;
    backdrop-filter: blur(16px) !important;
    transition: all 0.3s ease !important;
}

textarea {
    background: rgba(15, 23, 42, 0.95) !important;
    color: #e2e8f0 !important;
    border: 1.5px solid rgba(139, 92, 246, 0.3) !important;
    border-radius: 1rem !important;
    padding: 1.25rem !important;
    font-size: 0.95rem !important;
    line-height: 1.6 !important;
    transition: all 0.3s ease !important;
}

textarea:focus {
    border-color: #a78bfa !important;
    outline: none !important;
    box-shadow: 0 0 0 4px rgba(167, 139, 250, 0.15), 0 8px 24px rgba(139, 92, 246, 0.2) !important;
    transform: translateY(-2px) !important;
}

textarea::placeholder {
    color: #64748b !important;
    font-style: italic !important;
}

/* Enhanced file upload */
.file-upload, .file-preview {
    background: rgba(15, 23, 42, 0.8) !important;
    border: 2px dashed rgba(139, 92, 246, 0.5) !important;
    border-radius: 1.25rem !important;
    backdrop-filter: blur(16px) !important;
    padding: 2rem !important;
    transition: all 0.3s ease !important;
}

.file-upload:hover {
    border-color: #a78bfa !important;
    background: rgba(139, 92, 246, 0.1) !important;
    transform: translateY(-2px) !important;
    box-shadow: 0 8px 24px rgba(139, 92, 246, 0.2) !important;
}

/* Premium button styling */
.primary {
    background: linear-gradient(135deg, #8b5cf6 0%, #6366f1 50%, #3b82f6 100%) !important;
    border: none !important;
    border-radius: 9999px !important;
    color: white !important;
    font-weight: 700 !important;
    font-size: 1.1rem !important;
    padding: 1rem 3rem !important;
    transition: all 0.4s cubic-bezier(0.4, 0, 0.2, 1) !important;
    box-shadow: 0 8px 32px rgba(139, 92, 246, 0.4), 0 0 0 1px rgba(255, 255, 255, 0.1) inset !important;
    text-transform: uppercase !important;
    letter-spacing: 0.05em !important;
}

.primary:hover {
    transform: translateY(-3px) scale(1.02) !important;
    box-shadow: 0 16px 48px rgba(139, 92, 246, 0.6), 0 0 80px rgba(139, 92, 246, 0.3) !important;
    background: linear-gradient(135deg, #9f7aea 0%, #7c3aed 50%, #6366f1 100%) !important;
}

.secondary {
    background: rgba(71, 85, 105, 0.3) !important;
    border: 1.5px solid rgba(148, 163, 184, 0.3) !important;
    border-radius: 9999px !important;
    color: #cbd5e1 !important;
    font-weight: 600 !important;
    padding: 0.75rem 2rem !important;
    transition: all 0.3s ease !important;
}

.secondary:hover {
    background: rgba(100, 116, 139, 0.4) !important;
    border-color: rgba(203, 213, 225, 0.5) !important;
    transform: translateY(-2px) !important;
}

/* Enhanced dataframe */
.dataframe {
    background: rgba(15, 23, 42, 0.9) !important;
    border: 1.5px solid rgba(139, 92, 246, 0.3) !important;
    border-radius: 1.25rem !important;
    backdrop-filter: blur(16px) !important;
    overflow: hidden !important;
    box-shadow: 0 8px 32px rgba(0, 0, 0, 0.3) !important;
}

table {
    color: #e2e8f0 !important;
    border-collapse: separate !important;
    border-spacing: 0 !important;
    font-size: 0.9rem !important;
}

thead tr {
    background: linear-gradient(135deg, rgba(139, 92, 246, 0.3), rgba(99, 102, 241, 0.3)) !important;
}

th {
    color: #c4b5fd !important;
    font-weight: 700 !important;
    padding: 1.25rem 1rem !important;
    text-align: left !important;
    border-bottom: 2px solid rgba(139, 92, 246, 0.5) !important;
    text-transform: uppercase !important;
    font-size: 0.8rem !important;
    letter-spacing: 0.05em !important;
}

td {
    padding: 1rem !important;
    border-bottom: 1px solid rgba(71, 85, 105, 0.2) !important;
}

tbody tr {
    transition: all 0.2s ease !important;
}

tbody tr:hover {
    background: rgba(139, 92, 246, 0.15) !important;
    transform: scale(1.01) !important;
}

/* Slider styling */
input[type="range"] {
    accent-color: #8b5cf6 !important;
}

/* Tab styling */
.tab-nav button {
    color: #94a3b8 !important;
    font-weight: 600 !important;
    font-size: 1rem !important;
    padding: 1rem 2rem !important;
    border-radius: 1rem 1rem 0 0 !important;
    transition: all 0.3s ease !important;
    border: none !important;
    background: transparent !important;
}

.tab-nav button.selected {
    color: #c4b5fd !important;
    background: rgba(139, 92, 246, 0.15) !important;
    border-bottom: 3px solid #8b5cf6 !important;
}

.tab-nav button:hover {
    background: rgba(139, 92, 246, 0.1) !important;
    color: #c4b5fd !important;
}

/* Block/Card styling */
.block {
    background: rgba(15, 23, 42, 0.6) !important;
    border: 1.5px solid rgba(139, 92, 246, 0.25) !important;
    border-radius: 1.5rem !important;
    padding: 2rem !important;
    backdrop-filter: blur(16px) !important;
    margin-bottom: 2rem !important;
    box-shadow: 0 4px 24px rgba(0, 0, 0, 0.2) !important;
    transition: all 0.3s ease !important;
}

.block:hover {
    border-color: rgba(139, 92, 246, 0.4) !important;
    box-shadow: 0 8px 32px rgba(139, 92, 246, 0.15) !important;
}

/* Plot/Chart containers */
.plot-container {
    background: rgba(15, 23, 42, 0.8) !important;
    border-radius: 1.25rem !important;
    padding: 1rem !important;
    border: 1.5px solid rgba(139, 92, 246, 0.25) !important;
}

/* Footer hide */
footer {
    display: none !important;
}

/* Info boxes */
.markdown p {
    color: #cbd5e1 !important;
    line-height: 1.8 !important;
}

.markdown strong {
    color: #e0e7ff !important;
    font-weight: 700 !important;
}

.markdown ul {
    color: #cbd5e1 !important;
}

/* Scrollbar styling */
::-webkit-scrollbar {
    width: 10px !important;
    height: 10px !important;
}

::-webkit-scrollbar-track {
    background: rgba(15, 23, 42, 0.5) !important;
    border-radius: 10px !important;
}

::-webkit-scrollbar-thumb {
    background: rgba(139, 92, 246, 0.5) !important;
    border-radius: 10px !important;
}

::-webkit-scrollbar-thumb:hover {
    background: rgba(139, 92, 246, 0.7) !important;
}
"""

with gr.Blocks(css=custom_css, theme=gr.themes.Soft(primary_hue="purple", secondary_hue="blue")) as demo:
    gr.Markdown("## 🧠 Advanced AI Resume Screening System")
    gr.Markdown("### Powered by Semantic Analysis • ATS Scoring • Experience Parsing • History Tracking")

    with gr.Tabs():
        # Tab 1: Main Screening
        with gr.Tab("📋 Resume Screening"):
            gr.Markdown("""
            <div style="text-align: center; padding: 1rem; background: rgba(139, 92, 246, 0.1); border-radius: 1rem; margin-bottom: 2rem; border: 1px solid rgba(139, 92, 246, 0.3);">
                <p style="color: #c4b5fd; font-size: 1rem; margin: 0;">
                    🚀 Upload resumes, set your criteria, and let AI find the perfect candidates
                </p>
            </div>
            """)

            with gr.Row():
                with gr.Column(scale=1):
                    job_desc = gr.Textbox(
                        label="📄 Job Description",
                        placeholder="Paste the complete job description here...\n\nExample:\n• Required skills: Python, Machine Learning\n• 3+ years of experience\n• Bachelor's degree in Computer Science",
                        lines=12,
                        max_lines=20
                    )
                    resume_upload = gr.File(
                        label="📎 Upload Candidate Resumes",
                        file_count="multiple",
                        type="filepath",
                        file_types=[".pdf", ".txt"]
                    )

                    with gr.Row():
                        min_exp = gr.Slider(
                            minimum=0,
                            maximum=20,
                            value=0,
                            step=1,
                            label="⏱️ Minimum Experience Required (years)",
                            info="Set to 0 to include all candidates"
                        )

                    process_btn = gr.Button(
                        "✨ Analyze Resumes",
                        variant="primary",
                        size="lg",
                        scale=2
                    )

                with gr.Column(scale=1):
                    gr.Markdown("### 📊 Quick Summary")
                    summary_box = gr.Markdown(
                        value="*Results will appear here after processing...*",
                        elem_classes="summary-box"
                    )

            gr.Markdown("### 📈 Visual Analytics")
            with gr.Row():
                score_plot = gr.Plot(label="📊 Score Distribution", show_label=True)
                comparison_plot = gr.Plot(label="📈 Multi-Metric Comparison", show_label=True)

            with gr.Row():
                skill_radar = gr.Plot(label="🎯 Top Candidate Skill Profile", show_label=True)

            gr.Markdown("### 📋 Detailed Candidate Analysis")
            output_box = gr.Dataframe(
                label="Complete Results Table",
                wrap=True,
                interactive=False,
                show_label=True
            )

            process_btn.click(
                process_input,
                inputs=[job_desc, resume_upload, min_exp],
                outputs=[score_plot, comparison_plot, skill_radar, output_box, summary_box]
            )

        # Tab 2: Screening History
        with gr.Tab("📚 Screening History"):
            gr.Markdown("""
            <div style="text-align: center; padding: 1rem; background: rgba(139, 92, 246, 0.1); border-radius: 1rem; margin-bottom: 2rem; border: 1px solid rgba(139, 92, 246, 0.3);">
                <p style="color: #c4b5fd; font-size: 1rem; margin: 0;">
                    📜 Access all your previous screening sessions with complete data and insights
                </p>
            </div>
            """)

            with gr.Row():
                with gr.Column(scale=2):
                    gr.Markdown("### 🕐 Past Screening Sessions")
                with gr.Column(scale=1):
                    refresh_btn = gr.Button("🔄 Refresh History", variant="secondary", size="sm")

            history_df = gr.Dataframe(
                label="Historical Data",
                headers=["Timestamp", "Job Preview", "Candidates", "Top Score"],
                interactive=False,
                show_label=False,
                wrap=True
            )

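            def load_history():
                """Return past screening sessions as a DataFrame for the history table."""
                columns = ["Timestamp", "Job Preview", "Candidates", "Top Score"]
                history = load_screening_history()
                if not history:
                    # Keep the column headers visible when there is no history yet
                    return pd.DataFrame(columns=columns)
                return pd.DataFrame(
                    [
                        {
                            'Timestamp': h['timestamp'],
                            'Job Preview': h['job_desc_preview'],
                            'Candidates': h['num_candidates'],
                            'Top Score': f"{h['top_score']}%"
                        } for h in history
                    ],
                    columns=columns
                )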

            refresh_btn.click(load_history, outputs=history_df)

            # Load history when the app first loads (demo.load fires on page load, not on tab selection)
            demo.load(load_history, outputs=history_df)

        # Tab 3: Analytics Insights
        with gr.Tab("📊 Analytics & Insights"):
            gr.Markdown("""
            <div style="text-align: center; padding: 1rem; background: rgba(139, 92, 246, 0.1); border-radius: 1rem; margin-bottom: 2rem; border: 1px solid rgba(139, 92, 246, 0.3);">
                <p style="color: #c4b5fd; font-size: 1rem; margin: 0;">
                    💡 Learn about the advanced AI metrics powering your resume screening
                </p>
            </div>
            """)

            with gr.Row():
                with gr.Column():
                    gr.Markdown("""
                    ### 🎯 Scoring Metrics Explained

                    **Hybrid Score** (Primary Metric)
                    - 🧠 70% Semantic Match: AI understands context and meaning
                    - 🔧 20% Skill Overlap: Technical skill alignment
                    - 🔑 10% Keyword Match: Exact terminology matching

                    **ATS Score** (Resume Quality)
                    - 📏 Resume length optimization (300-800 words)
                    - ⚡ Action verbs and impact language
                    - 📊 Quantifiable achievements with metrics
                    - 📑 Standard section structure
                    - 📧 Contact information completeness
                    - 📚 Vocabulary diversity and richness
                    """)

                with gr.Column():
                    gr.Markdown("""
                    ### 🔍 Extracted Information

                    **Experience Parsing**
                    - 📅 Years of professional experience
                    - 🏢 Company names and job titles
                    - ⏱️ Employment duration patterns

                    **Education Detection**
                    - 🎓 Degree levels (PhD, Masters, Bachelors, etc.)
                    - 🏫 Field of study identification
                    - 📜 Academic achievements

                    **Certifications**
                    - ☁️ Cloud: AWS, Azure, GCP
                    - 🔒 Security: CISSP, CompTIA
                    - 📋 Project Management: PMP, Agile
                    - 🛠️ Technical: Cisco, Oracle, SAP
                    """)

            gr.Markdown("""
            ---
            ### 🚀 How to Get the Best Results

            1. **Detailed Job Descriptions**: Include specific skills, requirements, and qualifications
            2. **Multiple Resumes**: Upload 3-10 resumes for meaningful comparisons
            3. **Set Experience Filters**: Use the slider to focus on qualified candidates
            4. **Review All Metrics**: Don't rely on just one score - check ATS feedback and skill gaps
            5. **Export Data**: Save results to CSV for deeper analysis or team collaboration

            ### 📁 Export Locations
            - **CSV Files**: `exports/screening_YYYYMMDD_HHMMSS.csv`
            - **JSON Data**: `screening_history/screening_YYYYMMDD_HHMMSS.json`
            """)

demo.launch(share=False)



Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
Note: opening Chrome Inspector may crash demo inside Colab notebooks.
* To create a public link, set `share=True` in `launch()`.
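
The Hybrid Score weighting listed in the Analytics tab (70% semantic match, 20% skill overlap, 10% keyword match) amounts to a weighted sum. The following is a minimal sketch, assuming each component is already expressed on a 0-100 scale; the function name `hybrid_score` and its parameter names are hypothetical and are not taken from the notebook's own scoring code.

```python
def hybrid_score(semantic: float, skill_overlap: float, keyword_match: float) -> float:
    """Combine three component scores (each assumed to be 0-100) into one value."""
    # Weights as described in the Analytics tab; they sum to 1.0
    weights = {"semantic": 0.70, "skill": 0.20, "keyword": 0.10}
    score = (
        weights["semantic"] * semantic
        + weights["skill"] * skill_overlap
        + weights["keyword"] * keyword_match
    )
    # Round to two decimals for display in the results table
    return round(score, 2)


# Example: strong semantic match, moderate skill overlap, weak keyword match
print(hybrid_score(80.0, 50.0, 40.0))  # 70.0
```

Because the semantic component carries most of the weight, a resume that paraphrases the job description can still rank well even when exact keywords are missing.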



