# AI-Readiness Assessment Tool

A web-based tool that analyzes company websites to assess their AI readiness potential and identify transformation opportunities. Built for Caprae Capital Partners' AI-RaaS (AI Readiness as a Service) offering.

## Overview

This tool helps private equity firms and investment professionals quickly evaluate a company's AI readiness by analyzing their public web presence. It extracts technology indicators, leadership information, and growth signals to provide an overall AI readiness score.

## Features

- **Website Analysis**: Automatically crawls and analyzes company websites
- **AI Readiness Scoring**: Calculates a 1-10 score based on technology indicators, leadership, and growth potential
- **Technology Detection**: Identifies AI, data, cloud, and automation technologies
- **Leadership Assessment**: Extracts information about the company's leadership team
- **Opportunity Identification**: Suggests potential AI transformation opportunities based on the analysis

## Datasets

1. **Technology Adoption and Indicators**
   - Dataset: AI in the Workplace Report
     - Source: **McKinsey & Company**. This report provides insights into how companies are investing in and implementing AI technologies, highlighting adoption rates, maturity levels, and organizational readiness.
     - Access: AI in the Workplace Report
   - Dataset: AI Use Cases Across Industries
     - Source: **Google Cloud**. This resource showcases real-world applications of AI across various industries, offering examples of technology adoption and innovation.
     - Access: AI Use Cases
2. **Leadership Roles and Digital Leadership**
   - Dataset: Digital Leadership Analysis
     - Source: **Springer**. This research examines the evolution of digital leadership as portrayed in The New York Times from 2020 to 2022, analyzing content and sentiment related to leadership roles in the digital age.
     - Access: Digital Leadership Analysis
3. **Growth Indicators**
   - Dataset: AI-Driven ESG Performance Data
     - Source: **Nature**. This study explores the impact of AI on Environmental, Social, and Governance (ESG) performance, providing data on how AI technologies influence corporate growth and sustainability practices.
     - Access: AI and ESG Performance
4. **Sentiment Analysis**
   - Dataset: Sentiment Analysis Datasets Overview
     - Source: **Analytics Vidhya**. This article provides an overview of top sentiment analysis datasets, which can be used to train models for analyzing sentiment in textual data.
     - Access: Top Sentiment Analysis Datasets
   - Dataset: Employee Sentiment Analysis Dataset
     - Source: **Aura**. This dataset offers insights into employee morale, enabling organizations to assess workplace sentiment and identify areas for improvement.
     - Access: Employee Sentiment Dataset

Together, these datasets can enrich the project's analysis of technological maturity, leadership dynamics, growth trajectories, and sentiment within organizations.

Note: This is a short demo and does not contain all of the files needed to run.

## Data Collection: Web Scraping

We'll use `requests` and `BeautifulSoup` to scrape the content of a company website. The goal is to extract textual data and analyze it with pretrained NLP models.

**1. Fetch Website Data**

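```python
import requests
from bs4 import BeautifulSoup

def fetch_website_data(url):
    """
    Fetch a page and parse its HTML.
    Returns the text of all <p> elements as one string, or "" on failure.
    """
    try:
        # A timeout keeps an unresponsive site from hanging the analysis
        response = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        print(f"Failed to fetch website data: {exc}")
        return ""

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        # Join the text of every paragraph tag into a single string
        text = ' '.join(p.text for p in soup.find_all('p'))
        return text
    else:
        print(f"Failed to fetch website data (HTTP {response.status_code}).")
        return ""
```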
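
The text returned by `fetch_website_data` still has to be turned into technology indicators before it can be scored. One simple way to do that is keyword matching. The sketch below assumes that approach; the function name `detect_tech_indicators`, the category names, and the keyword lists are illustrative and not taken from the project. The nested `{'indicators': {...}}` shape mirrors what `advanced_tech_investment_analysis` in `Lead_scorer.py` reads.

```python
import re

# Hypothetical keyword lists; the real tool's categories and terms may differ.
TECH_KEYWORDS = {
    "ai": ["machine learning", "artificial intelligence", "deep learning"],
    "data": ["data warehouse", "analytics", "big data"],
    "cloud": ["aws", "azure", "google cloud"],
    "automation": ["automation", "rpa", "workflow"],
}

def detect_tech_indicators(text):
    """Count whole-word keyword hits per category in the lowercased text."""
    lowered = text.lower()
    results = {}
    for category, keywords in TECH_KEYWORDS.items():
        counts = {}
        for keyword in keywords:
            # \b restricts matches to whole words, so "rpa" does not
            # match inside a longer word
            hits = len(re.findall(rf"\b{re.escape(keyword)}\b", lowered))
            if hits:
                counts[keyword] = hits
        if counts:  # skip categories with no matches at all
            results[category] = {"indicators": counts}
    return results

sample = "We use machine learning on AWS. Our analytics team loves machine learning."
indicators = detect_tech_indicators(sample)
print(indicators)
```

Keyword counts are a crude signal: they miss paraphrases and can be inflated by marketing copy, so they are best treated as input to the scorer rather than as a score in themselves.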


**2. modules/analyzer.py:** Contains the `AIReadinessScorer` class.

This module holds the `AIReadinessScorer` logic, extended where needed to use pretrained models.

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch
import pandas as pd

class AIReadinessScorer:
    def __init__(self):
        self.tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
        # NOTE: bert-base-uncased has no trained classification head, so the
        # logits used below are not meaningful until the model is fine-tuned.
        self.model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

        # Empty placeholder for structured company data
        self.company_data = pd.DataFrame(columns=['company_size', 'tech_score', 'leadership_score', 'growth_score'])

    def get_text_embedding(self, text):
        """Return the classification logits for the text (not a true embedding)."""
        inputs = self.tokenizer(text, return_tensors='pt', truncation=True, padding=True, max_length=512)
        with torch.no_grad():
            outputs = self.model(**inputs)
        return outputs.logits.squeeze().tolist()  # raw output logits as a list

    def calculate_tech_score(self, tech_indicators):
        # Expects an iterable of numeric indicator values
        return sum(tech_indicators)  # Placeholder for more advanced logic

    def calculate_leadership_score(self, leadership_team):
        # Use the NLP model to assess the leadership team's tech maturity
        leadership_text = " ".join([person['title'] for person in leadership_team])
        leadership_embedding = self.get_text_embedding(leadership_text)
        return leadership_embedding[0]  # Use the first logit for simplicity

    def calculate_growth_score(self, growth_indicators, company_size):
        # Count growth indicators; small companies get a 1.5x boost
        growth_score = len(growth_indicators)  # Placeholder for more advanced logic
        if company_size == 'Small Company/Startup':
            growth_score *= 1.5
        return growth_score

    def calculate_score(self, analysis_results):
        """Calculate the overall AI readiness score as a weighted sum."""
        tech_score = self.calculate_tech_score(analysis_results['tech_indicators'])
        leadership_score = self.calculate_leadership_score(analysis_results['leadership_team'])
        growth_score = self.calculate_growth_score(
            analysis_results['growth_indicators'],
            analysis_results['company_size_indicator']
        )

        # Combine component scores: 60% tech, 25% leadership, 15% growth
        ai_readiness_score = tech_score * 0.6 + leadership_score * 0.25 + growth_score * 0.15
        return ai_readiness_score

    def identify_opportunities(self, ai_readiness_score):
        """Map the readiness score to a tier of suggested AI opportunities."""
        if ai_readiness_score < 3:
            return ['Basic Data Infrastructure', 'AI Readiness Assessment']
        elif ai_readiness_score < 7:
            return ['Process Automation', 'Data Analytics']
        else:
            return ['Advanced AI Solution Deployment', 'Predictive Analytics']

    def generate_results(self, analysis_results):
        """Generate AI readiness results with opportunities"""
        ai_readiness_score = self.calculate_score(analysis_results)
        opportunities = self.identify_opportunities(ai_readiness_score)

        return {
            'ai_readiness_score': ai_readiness_score,
            'opportunities': opportunities,
            'detailed_analysis': analysis_results
        }
```
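
The weighting in `calculate_score` and the thresholds in `identify_opportunities` can be checked by hand without loading BERT. The sketch below copies those weights and thresholds into two standalone helpers; the helper names `combine_scores` and `opportunity_tier` and the component values are made up purely to illustrate the arithmetic.

```python
# Weights copied from AIReadinessScorer.calculate_score
WEIGHTS = {"tech": 0.6, "leadership": 0.25, "growth": 0.15}

def combine_scores(tech, leadership, growth):
    """Weighted sum using the same weights as calculate_score."""
    return (tech * WEIGHTS["tech"]
            + leadership * WEIGHTS["leadership"]
            + growth * WEIGHTS["growth"])

def opportunity_tier(score):
    """Same thresholds as identify_opportunities: below 3, below 7, otherwise."""
    if score < 3:
        return ['Basic Data Infrastructure', 'AI Readiness Assessment']
    elif score < 7:
        return ['Process Automation', 'Data Analytics']
    return ['Advanced AI Solution Deployment', 'Predictive Analytics']

# Hypothetical component scores, chosen only to show the calculation:
# 4.0 * 0.6 + 2.0 * 0.25 + 3.0 * 0.15 = 2.4 + 0.5 + 0.45 = 3.35
score = combine_scores(tech=4.0, leadership=2.0, growth=3.0)
print(round(score, 2))
print(opportunity_tier(score))
```

Because none of the component scores is clamped, this weighted sum is not guaranteed to land in the 1-10 range described under Features; a final clamping step would be needed for that.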


**3. scorer.py**

This file defines the core logic for calculating the AI readiness score. It is the module where the AI readiness model is implemented.

**Purpose:**

`scorer.py` scores three categories (technology infrastructure, leadership, and growth potential) and combines these component scores into an overall AI readiness score.

**Key Functions:**

- `calculate_tech_score()`: Evaluates the technology indicators (e.g., AI/ML capabilities, data, cloud infrastructure) and calculates a score for each of them based on weighted categories.
- `calculate_leadership_score()`: Checks the leadership team's composition and assigns a score based on the presence of technical leadership (e.g., CTO, CIO).
- `calculate_growth_score()`: Assigns a score based on growth indicators such as market expansion or product launches, while also factoring in the company's size (small, mid-size, large).
- `calculate_score()`: The main function. It combines the individual tech, leadership, and growth scores into an overall AI readiness score, normalized to a scale of 1–10.
- `identify_opportunities()`: Based on the readiness score and missing components (e.g., AI infrastructure, leadership), suggests AI transformation opportunities for the company.
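
The description of `calculate_leadership_score()` mentions checking for technical leadership such as a CTO or CIO. A minimal rule-based sketch of that check is shown here as one possible reading; the helper name `has_technical_leader` and the title patterns are assumptions, and this is not the BERT-based scoring used in the project code.

```python
import re

# Hypothetical patterns for technical leadership titles; extend as needed.
TECH_TITLE_PATTERNS = [
    r"\bcto\b",
    r"\bcio\b",
    r"\bchief technology officer\b",
    r"\bchief information officer\b",
    r"\bchief data officer\b",
]

def has_technical_leader(leadership_team):
    """Return True if any person's title matches a technical leadership pattern."""
    for person in leadership_team:
        title = person.get("title", "").lower()
        # Word boundaries stop "cto" from matching inside "director"
        if any(re.search(pattern, title) for pattern in TECH_TITLE_PATTERNS):
            return True
    return False

team = [
    {"name": "A. Example", "title": "Chief Executive Officer"},
    {"name": "B. Example", "title": "CTO & Co-founder"},
]
print(has_technical_leader(team))      # True
print(has_technical_leader(team[:1]))  # False
```

A rule like this is transparent and cheap, but it only recognizes titles it has been told about, which is the trade-off against a model-based classifier.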

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

class AIReadinessScorer:
    def __init__(self):
        self.tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
        # num_labels=1 gives a single-output (regression-style) head.
        # NOTE: this head is untrained in bert-base-uncased, so scores are
        # not meaningful until the model is fine-tuned.
        self.model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

        # Per-category weights used by calculate_tech_score. The actual values
        # are not shown in this demo; with an empty dict every category falls
        # back to the default weight of 1.0.
        self.category_weights = {}

    def calculate_tech_score(self, tech_descriptions):
        """Use BERT to score technology readiness based on textual descriptions."""
        tech_score = 0

        for category, description in tech_descriptions.items():
            weight = self.category_weights.get(category, 1.0)
            inputs = self.tokenizer(description, return_tensors="pt", truncation=True, padding=True)
            with torch.no_grad():
                outputs = self.model(**inputs).logits
            # Sigmoid maps the single logit into (0, 1) before weighting
            category_score = weight * torch.sigmoid(outputs).item()
            tech_score += category_score

        return tech_score

    def calculate_leadership_score(self, leadership_team):
        """Score each leadership title with the BERT head; capped at 5."""
        if not leadership_team:
            return 0

        leadership_score = 0
        for person in leadership_team:
            title = person['title'].lower()
            inputs = self.tokenizer(title, return_tensors="pt", truncation=True, padding=True)
            with torch.no_grad():
                outputs = self.model(**inputs).logits
            leadership_score += torch.sigmoid(outputs).item() * 2

        return min(leadership_score, 5)

    def calculate_growth_score(self, growth_descriptions, company_size):
        """Use BERT to assess growth potential from descriptions; capped at 2."""
        growth_score = 0
        for desc in growth_descriptions:
            inputs = self.tokenizer(desc, return_tensors="pt", truncation=True, padding=True)
            with torch.no_grad():
                outputs = self.model(**inputs).logits
            growth_score += torch.sigmoid(outputs).item() * 0.5

        # Smaller companies get a boost; unknown sizes are left unchanged
        size_factor = {'Small': 1.2, 'Mid': 1.0, 'Large': 0.8}.get(company_size, 1.0)
        growth_score *= size_factor

        return min(growth_score, 2)

    def calculate_score(self, analysis_results):
        """Compute AI readiness score using BERT-based evaluations."""
        tech_score = self.calculate_tech_score(analysis_results.get('tech_indicators', {}))
        leadership_score = self.calculate_leadership_score(analysis_results.get('leadership_team', []))
        growth_score = self.calculate_growth_score(
            analysis_results.get('growth_descriptions', []),
            analysis_results.get('company_size', 'Unknown')
        )

        raw_score = tech_score + leadership_score + growth_score
        # Halve the raw total, round, and clamp to the 1-10 range
        ai_readiness_score = max(1, min(10, round(raw_score / 2)))

        return {
            'ai_readiness_score': ai_readiness_score,
            'components': {
                'technology_score': round(tech_score, 1),
                'leadership_score': round(leadership_score, 1),
                'growth_score': round(growth_score, 1)
            }
        }
```
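
The Key Functions list describes `identify_opportunities()` as working from missing components, but this `scorer.py` listing does not include it. One way it could look, assuming it receives the `components` dict returned by `calculate_score`, is sketched below. The threshold of 1.0 and the suggestion texts are illustrative assumptions, not the project's values.

```python
def identify_opportunities(components, threshold=1.0):
    """Suggest an opportunity for each component scoring below the threshold.

    The threshold and the suggestion texts are illustrative assumptions.
    """
    suggestions = {
        'technology_score': 'Build core data and cloud infrastructure',
        'leadership_score': 'Add or empower technical leadership',
        'growth_score': 'Identify AI use cases tied to growth initiatives',
    }
    # A missing key is treated as a score of 0, i.e. a gap
    return [text for key, text in suggestions.items()
            if components.get(key, 0) < threshold]

# Shaped like the 'components' dict returned by calculate_score
components = {'technology_score': 2.4, 'leadership_score': 0.0, 'growth_score': 0.6}
print(identify_opportunities(components))
```

With these example values the technology score clears the threshold, so only the leadership and growth suggestions are returned.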


**4. Lead_scorer.py**

The **AIEnhancedLeadScorer** class uses pretrained transformer pipelines to evaluate leadership potential, assess technology investments, identify pain points, and analyze sentiment, helping companies assess their readiness for AI adoption and digital transformation.

**Key Methods:**

1. `advanced_decision_maker_score`
   - Classifies leadership roles (e.g., C-Level, Technology Leader) and assigns a strategic score.
   - Returns the average strategic score and the primary contact (the person with the highest score).
2. `advanced_tech_investment_analysis`
   - Assesses the impact of technology investments based on company indicators (e.g., AI/ML adoption).
   - Returns a score based on the technology impact.
3. `advanced_pain_point_extraction`
   - Extracts and classifies pain points from company descriptions.
   - Returns up to 5 pain points, ranked by confidence score.
4. `calculate_lead_score`
   - Combines the decision maker score, tech investment score, and AI readiness score, then adjusts the result by a sentiment factor to calculate the final lead score.
   - Categorizes the lead as Hot, Warm, or Nurture based on the score.
   - Provides insights such as primary contact, pain points, and sentiment.

```python
from transformers import pipeline
import numpy as np

class AIEnhancedLeadScorer:
    def __init__(self):
        # Pre-trained models for advanced analysis
        self.leadership_classifier = pipeline(
            "zero-shot-classification",
            model="facebook/bart-large-mnli"
        )

        self.tech_impact_classifier = pipeline(
            "zero-shot-classification",
            model="facebook/bart-large-mnli"
        )

        # NOTE: as its name indicates, this checkpoint is trained for
        # hate-speech detection, so its confidence scores do not identify
        # business pain points. Treat it as a placeholder model.
        self.pain_point_extractor = pipeline(
            "text-classification",
            model="facebook/roberta-hate-speech-dynabench-r4-target"
        )

        self.sentiment_analyzer = pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english"
        )

        # Leadership role categories for classification
        self.leadership_categories = [
            'C-Level Executive',
            'Technology Leader',
            'Strategic Decision Maker',
            'Operational Manager'
        ]

        # Technology impact categories
        self.tech_impact_categories = [
            'Digital Transformation',
            'Innovation Driver',
            'Technological Modernization',
            'Efficiency Improvement'
        ]

    def advanced_decision_maker_score(self, leadership_team):
        """Return (mean strategic score, highest-scoring person) for the team."""
        if not leadership_team:
            return 0, None

        strategic_scores = []
        primary_contact = None

        for person in leadership_team:
            classification = self.leadership_classifier(
                person['title'],
                self.leadership_categories
            )

            # Labels come back sorted by score, so index 0 is the best match
            top_category = classification['labels'][0]
            top_score = classification['scores'][0]

            # Scale the classifier confidence by the seniority of the category
            if top_category == 'C-Level Executive':
                strategic_score = top_score * 10
            elif top_category == 'Technology Leader':
                strategic_score = top_score * 8
            elif top_category == 'Strategic Decision Maker':
                strategic_score = top_score * 6
            else:
                strategic_score = top_score * 4

            strategic_scores.append(strategic_score)

            # Track the person with the highest strategic score
            if not primary_contact or strategic_score > primary_contact.get('score', 0):
                primary_contact = {
                    'name': person['name'],
                    'title': person['title'],
                    'score': strategic_score
                }

        final_score = np.mean(strategic_scores) if strategic_scores else 0
        return final_score, primary_contact

    def advanced_tech_investment_analysis(self, tech_indicators):
        """Classify the company's technology profile and return an impact score."""
        if not tech_indicators:
            return 0

        # Flatten {category: {'indicators': {...}}} into one descriptive string
        tech_description = " ".join([
            f"{category}: {', '.join(indicators.get('indicators', {}).keys())}"
            for category, indicators in tech_indicators.items()
        ])

        impact_classification = self.tech_impact_classifier(
            tech_description,
            self.tech_impact_categories
        )

        top_impact = impact_classification['labels'][0]
        impact_score = impact_classification['scores'][0]

        if top_impact == 'Digital Transformation':
            tech_score = impact_score * 10
        elif top_impact == 'Innovation Driver':
            tech_score = impact_score * 8
        elif top_impact == 'Technological Modernization':
            tech_score = impact_score * 6
        else:
            tech_score = impact_score * 4

        return tech_score

    def advanced_pain_point_extraction(self, text):
        """Return up to 5 text chunks the classifier flags with confidence > 0.7."""
        # Split into 512-character chunks (characters, not tokens)
        chunks = [text[i:i+512] for i in range(0, len(text), 512)]

        pain_points = []
        for chunk in chunks:
            results = self.pain_point_extractor(chunk)

            for result in results:
                if result['score'] > 0.7:
                    pain_points.append({
                        'text': chunk,
                        'confidence': result['score']
                    })

        return sorted(pain_points, key=lambda x: x['confidence'], reverse=True)[:5]

    def calculate_lead_score(self, analysis_results, ai_readiness_score):
        """Combine component scores into a lead score, tier, and sales insights."""
        leadership_team = analysis_results.get('leadership_team', [])
        tech_indicators = analysis_results.get('tech_indicators', {})
        growth_indicators = analysis_results.get('growth_indicators', [])
        company_size = analysis_results.get('company_size_indicator', 'Unknown')

        # NOTE: the last three sentences are fixed strings, so they influence
        # the sentiment and pain-point results for every company
        text_content = ' '.join([
            f"Company size: {company_size}",
            f"Growth indicators: {', '.join(growth_indicators)}",
            "Focused on improving efficiency and reducing costs.",
            "Challenged by legacy systems and manual processes.",
            "Committed to innovation and transformation."
        ])

        decision_maker_score, primary_contact = self.advanced_decision_maker_score(leadership_team)
        tech_investment_score = self.advanced_tech_investment_analysis(tech_indicators)
        pain_points = self.advanced_pain_point_extraction(text_content)

        sentiment = self.sentiment_analyzer(text_content[:512])[0]
        sentiment_factor = 1.1 if sentiment['label'] == 'POSITIVE' else 0.9

        # Weighted sum of the three scores, scaled by the sentiment factor
        lead_score = (
            decision_maker_score * 0.3 +
            tech_investment_score * 0.25 +
            ai_readiness_score * 0.2
        ) * sentiment_factor

        if lead_score >= 8:
            lead_tier = "Hot"
        elif lead_score >= 6:
            lead_tier = "Warm"
        else:
            lead_tier = "Nurture"

        sales_insights = {
            'lead_score': round(lead_score, 1),
            'lead_tier': lead_tier,
            'score_components': {
                'decision_maker_score': round(decision_maker_score, 1),
                'tech_investment_score': round(tech_investment_score, 1),
                'ai_readiness_factor': round(ai_readiness_score * 0.2, 1),
                'sentiment_factor': sentiment_factor
            },
            'primary_contact': primary_contact,
            'pain_points': [point['text'] for point in pain_points],
            'sentiment': sentiment
        }

        return sales_insights
```
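
To see how the weights and tier thresholds in `calculate_lead_score` interact, the arithmetic can be reproduced without loading any models. The sketch below copies the weights, sentiment factors, and thresholds into a standalone helper; the helper name `lead_tier` is hypothetical, and the input of 10 per component assumes each score is at its upper bound.

```python
def lead_tier(decision_maker_score, tech_investment_score,
              ai_readiness_score, sentiment_label):
    """Reproduce the weighting and tier thresholds of calculate_lead_score."""
    sentiment_factor = 1.1 if sentiment_label == 'POSITIVE' else 0.9
    lead_score = (
        decision_maker_score * 0.3
        + tech_investment_score * 0.25
        + ai_readiness_score * 0.2
    ) * sentiment_factor

    if lead_score >= 8:
        tier = "Hot"
    elif lead_score >= 6:
        tier = "Warm"
    else:
        tier = "Nurture"
    return round(lead_score, 2), tier

# Every component at 10 with positive sentiment: (3 + 2.5 + 2) * 1.1 = 8.25
best_case = lead_tier(10, 10, 10, 'POSITIVE')
# Same components with negative sentiment: 7.5 * 0.9 = 6.75
negative_case = lead_tier(10, 10, 10, 'NEGATIVE')
print(best_case)
print(negative_case)
```

Since the three weights sum to 0.75, the highest reachable lead score is 8.25, so the "Hot" tier is only attainable with near-maximal components and positive sentiment. Rescaling the weights to sum to 1.0 would be one way to widen that band, if that is the intent.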


## Installation

1. Clone this repository
```bash
git clone https://github.com/yourusername/ai-readiness-assessment.git
cd ai-readiness-assessment
```

2. Create a virtual environment and install dependencies
```bash
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
pip install -r requirements.txt
```

3. Download NLTK data (required for text analysis)
```bash
python -c "import nltk; nltk.download('punkt')"
```

## Usage

1. Start the Flask application
```bash
python app.py
```

2. Open your browser and navigate to `http://127.0.0.1:5000/`

3. Enter a company URL (e.g., company.com) and click "Analyze"

4. View the AI readiness assessment results
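
Because step 3 accepts a bare domain such as `company.com`, the input needs a scheme before `requests` can fetch it. A minimal sketch of that normalization follows; the helper name `normalize_company_url` and the choice of `https` as the default scheme are assumptions, not part of the project code.

```python
from urllib.parse import urlparse

def normalize_company_url(raw):
    """Return a URL with a scheme (default https), or None if no host is found."""
    candidate = raw.strip()
    if not candidate:
        return None
    # Bare domains like "company.com" have no scheme; assume https
    if "://" not in candidate:
        candidate = "https://" + candidate
    parsed = urlparse(candidate)
    # Accept only web URLs that actually contain a host
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        return None
    return candidate

print(normalize_company_url("company.com"))           # https://company.com
print(normalize_company_url(" http://company.com "))  # http://company.com
print(normalize_company_url(""))                      # None
```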

## Project Structure

- `app.py`: Main Flask application file
- `modules/`: Core functionality modules
  - `scraper.py`: Website scraping functionality
  - `analyzer.py`: Content analysis logic
  - `scorer.py`: AI readiness scoring algorithm
- `static/`: Static files (CSS, JavaScript)
- `templates/`: HTML templates
- `utils/`: Helper functions

## Development Notes

This project was developed as part of the Caprae Capital Partners AI-Readiness Pre-Screening Challenge. It focuses on delivering a high-impact tool that aligns with the business needs of a private equity firm specializing in AI transformation.

## Future Enhancements

- Integration with company databases for additional information
- Industry-specific assessment criteria
- Email validation and enhanced contact discovery
- CRM integration for lead management

## License

This project is proprietary and confidential.