# AI-Readiness Assessment Tool

A web-based tool that analyzes company websites to assess their AI readiness and identify transformation opportunities. Built for Caprae Capital Partners' AI-RaaS (AI Readiness as a Service) offering and modeled on the functionality of https://getcohesiveai.com.

## Overview

This tool helps private equity firms and investment professionals quickly evaluate a company's AI readiness by analyzing its public web presence. It extracts technology indicators, leadership information, and growth signals, and combines them into an overall AI readiness score.

## Features

- **Website Analysis**: Automatically crawls and analyzes company websites
- **AI Readiness Scoring**: Calculates a 1-10 score based on technology indicators, leadership, and growth potential
- **Technology Detection**: Identifies AI, data, cloud, and automation technologies
- **Leadership Assessment**: Extracts information about the company's leadership team
- **Opportunity Identification**: Suggests potential AI transformation opportunities based on the analysis

## Target Market

1. **Private Equity Firms:** The AI-Readiness Assessment Tool helps investors identify companies with high potential for AI adoption and transformation, enabling better-informed investment decisions.
2. **Venture Capitalists:** The tool supports venture capitalists in finding early-stage companies that are ready to integrate AI technologies, allowing them to target startups with high growth potential.
3. **Corporate Investors:** Companies looking to invest in or acquire businesses for digital transformation can leverage this tool to assess the AI readiness of potential targets.
4. **Consulting Firms:** Consulting firms focused on digital transformation can use the tool to help clients determine their AI readiness and uncover opportunities for improvement.

## Datasets

1. **Technology Adoption and Indicators**
   - Dataset: AI in the Workplace Report
     - Source: **McKinsey & Company**. This report provides insights into how companies are investing in and implementing AI technologies, highlighting adoption rates, maturity levels, and organizational readiness.
     - Access: AI in the Workplace Report
   - Dataset: AI Use Cases Across Industries
     - Source: **Google Cloud**. This resource showcases real-world applications of AI across various industries, offering examples of technology adoption and innovation.
     - Access: AI Use Cases
2. **Leadership Roles and Digital Leadership**
   - Dataset: Digital Leadership Analysis
     - Source: **Springer**. This research examines the evolution of digital leadership as portrayed in The New York Times from 2020 to 2022, analyzing content and sentiment related to leadership roles in the digital age.
     - Access: Digital Leadership Analysis
3. **Growth Indicators**
   - Dataset: AI-Driven ESG Performance Data
     - Source: **Nature**. This study explores the impact of AI on Environmental, Social, and Governance (ESG) performance, providing data on how AI technologies influence corporate growth and sustainability practices.
     - Access: AI and ESG Performance
4. **Sentiment Analysis**
   - Dataset: Sentiment Analysis Datasets Overview
     - Source: **Analytics Vidhya**. This article provides an overview of top sentiment analysis datasets, which can be utilized to train models for analyzing sentiments in textual data.
     - Access: Top Sentiment Analysis Datasets
   - Dataset: Employee Sentiment Analysis Dataset
     - Source: **Aura**. This dataset offers insights into employee morale, enabling organizations to assess workplace sentiment and identify areas for improvement.
     - Access: Employee Sentiment Dataset

By integrating these datasets, you can enrich your project's analysis of technological maturity, leadership dynamics, growth trajectories, and sentiment within organizations.

## Data Collection: Web Scraping
We use `requests` and `BeautifulSoup` to scrape the content of a company website. The goal is to extract textual data for the keyword-based analysis and the pretrained models described below.

**1. Fetch Website Data**

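```python
import requests
from bs4 import BeautifulSoup

def fetch_website_data(url):
    """
    Fetch a web page and parse its HTML.
    Returns the text of all <p> elements joined by spaces,
    or an empty string if the request fails.
    """
    try:
        # A timeout keeps the call from hanging on unresponsive hosts
        response = requests.get(url, timeout=10)
    except requests.RequestException as exc:
        print(f"Failed to fetch website data: {exc}")
        return ""

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')
        # Extract text from paragraph tags
        text = ' '.join(p.get_text() for p in soup.find_all('p'))
        return text
    else:
        print(f"Failed to fetch website data (HTTP {response.status_code}).")
        return ""
```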


**2. modules/analyzer.py:** Contains the `ContentAnalyzer` class.

This module holds the keyword-based content analysis. It detects technology indicators, extracts the leadership team and contact details, and collects growth and company-size signals that the scorer consumes.

```python
import re

from bs4 import BeautifulSoup
from utils.helpers import clean_text


class ContentAnalyzer:
    def __init__(self):
        # AI and technology readiness indicators (keywords and phrases)
        self.tech_indicators = {
            'ai_ml': ['machine learning', 'artificial intelligence', 'ai', 'ml', 'deep learning',
                      'neural network', 'computer vision', 'nlp', 'natural language processing'],
            'data': ['data analytics', 'big data', 'data science', 'data lake', 'data warehouse',
                    'business intelligence', 'predictive analytics', 'data-driven'],
            'cloud': ['cloud', 'aws', 'azure', 'google cloud', 'saas', 'iaas', 'paas',
                     'serverless', 'microservices', 'containerization', 'docker', 'kubernetes'],
            'integration': ['api', 'integration', 'webhook', 'rest api', 'graphql', 'middleware',
                           'interoperability', 'connected systems'],
            'automation': ['automation', 'workflow', 'robotic process automation', 'rpa',
                          'business process automation', 'intelligent automation']
        }

        # Leadership indicators
        self.leadership_titles = ['CEO', 'CTO', 'Chief Technology', 'Chief Digital', 'Chief Information',
                                 'VP of Engineering', 'VP of Technology', 'Chief Data', 'Head of IT',
                                 'Director of Technology', 'Chief Innovation', 'Chief AI', 'CIO']

        # Growth indicators
        self.growth_phrases = ['growing', 'expansion', 'hiring', 'new office', 'funding',
                              'venture capital', 'investment', 'series', 'launch', 'scaling',
                              'accelerating', 'growth']

    def extract_leadership_team(self, soups):
        """Extract leadership team information from page soups"""
        leadership_team = []

        for page_type, soup in soups.items():
            # Priority for team and about pages
            priority = 1 if page_type in ['team', 'leadership', 'about'] else 0

            # Look for common team member containers
            team_sections = soup.find_all(['div', 'section'], class_=lambda c: c and any(term in str(c).lower()
                                                                for term in ['team', 'leadership', 'people', 'staff']))

            # If no specific containers found, check the whole page
            if not team_sections and priority:
                team_sections = [soup]

            for section in team_sections:
                # Look for name elements
                name_elements = section.find_all(['h2', 'h3', 'h4', 'h5', 'strong'])

                for name_elem in name_elements:
                    name_text = name_elem.get_text().strip()
                    # Check if it looks like a name (First Last format)
                    if re.match(r'^[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+$', name_text):
                        # Try to find title near the name
                        title = "Unknown"

                        # Check next siblings for title
                        next_elems = list(name_elem.next_siblings)[:3]  # Check next 3 elements
                        for elem in next_elems:
                            if hasattr(elem, 'get_text'):
                                text = elem.get_text().strip()
                                if any(title_word in text.lower() for title_word in ['ceo', 'cto', 'chief', 'vp', 'head', 'director']):
                                    title = text
                                    break

                        # If we haven't found a title, look at parent container
                        if title == "Unknown":
                            parent = name_elem.parent
                            if parent:
                                parent_text = parent.get_text().strip()
                                title_match = re.search(r'(?:CEO|CTO|Chief|VP|Head|Director|Manager)[^\n\.]*', parent_text)
                                if title_match:
                                    title = title_match.group(0)

                        # Add to leadership team if not already present
                        if not any(leader['name'] == name_text for leader in leadership_team):
                            leadership_team.append({
                                'name': name_text,
                                'title': title,
                                'priority': priority
                            })

        # Sort priority pages first, then drop the internal priority field
        leadership_team.sort(key=lambda member: member['priority'], reverse=True)
        for member in leadership_team:
            del member['priority']
        return leadership_team

    def extract_contact_info(self, texts):
        """Extract contact information from texts"""
        contact_info = {
            'emails': [],
            'phones': []
        }

        # Email pattern ([A-Za-z] rather than [A-Z|a-z], which also allowed a literal '|')
        email_pattern = r'\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b'

        # Phone pattern - various formats
        phone_pattern = r'\b(?:\+\d{1,2}\s?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}\b'

        for text in texts.values():
            # Find emails
            emails = re.findall(email_pattern, text)
            contact_info['emails'].extend(emails)

            # Find phones
            phones = re.findall(phone_pattern, text)
            contact_info['phones'].extend(phones)

        # Remove duplicates
        contact_info['emails'] = list(set(contact_info['emails']))
        contact_info['phones'] = list(set(contact_info['phones']))

        return contact_info

    def analyze_content(self, pages_content, base_url):
        """Analyze website content for AI readiness indicators"""
        # Convert HTML to soup objects
        soups = {page_type: BeautifulSoup(content, 'html.parser') for page_type, content in pages_content.items()}

        # Extract text from each page
        texts = {page_type: clean_text(soup.get_text()) for page_type, soup in soups.items()}

        # Combined text for overall analysis
        combined_text = ' '.join(texts.values()).lower()

        # Initialize results
        results = {
            'tech_indicators': {},
            'leadership_team': [],
            'contact_info': {},
            'growth_indicators': [],
            'company_size_indicator': 'Unknown',
            'base_url': base_url
        }

        # Check for tech indicators
        for category, indicators in self.tech_indicators.items():
            category_count = 0
            category_indicators = {}

            for indicator in indicators:
                # Match whole words/phrases so that short keywords such as
                # 'ai' or 'ml' are not counted inside words like 'email' or 'html'
                pattern = r'\b' + re.escape(indicator.lower()) + r'\b'
                count = len(re.findall(pattern, combined_text))
                if count > 0:
                    category_indicators[indicator] = count
                    category_count += count

            if category_indicators:
                results['tech_indicators'][category] = {
                    'total': category_count,
                    'indicators': category_indicators
                }

        # Extract leadership team
        results['leadership_team'] = self.extract_leadership_team(soups)

        # Extract contact information
        results['contact_info'] = self.extract_contact_info(texts)

        # Check for growth indicators
        for phrase in self.growth_phrases:
            if phrase in combined_text:
                results['growth_indicators'].append(phrase)

        # Estimate company size
        if any(term in combined_text for term in ['fortune 500', 'enterprise', 'global company']):
            results['company_size_indicator'] = 'Large Enterprise'
        elif any(term in combined_text for term in ['mid-size', 'medium business', 'growing company']):
            results['company_size_indicator'] = 'Mid-size Company'
        elif any(term in combined_text for term in ['startup', 'small business', 'small team']):
            results['company_size_indicator'] = 'Small Company/Startup'

        return results
```
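
Short indicators such as `ai`, `ml`, or `api` are easy to over-count when matched as raw substrings, because they also occur inside ordinary words like "email" or "maintain". The sketch below compares plain substring counting with a word-boundary regular expression. `count_substring` and `count_whole_word` are hypothetical helper names used only for illustration; they are not part of the module.

```python
import re

def count_substring(text: str, keyword: str) -> int:
    """Count raw substring occurrences (also matches inside longer words)."""
    return text.lower().count(keyword.lower())

def count_whole_word(text: str, keyword: str) -> int:
    """Count occurrences of keyword delimited by word boundaries."""
    pattern = r'\b' + re.escape(keyword.lower()) + r'\b'
    return len(re.findall(pattern, text.lower()))

sample = "Email us to maintain your HTML site. We build AI tools."

# 'ai' appears inside "Email" (once) and "maintain" (twice), plus the real mention
print(count_substring(sample, "ai"))   # 4
print(count_whole_word(sample, "ai"))  # 1
```

Word boundaries still allow hyphenated forms such as "cloud-based" to match `cloud`, so this is a reduction in false positives rather than exact term recognition.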

**3. scorer.py**

This file defines the core logic for calculating the AI readiness score.

**Purpose:**

`scorer.py` calculates the overall AI readiness score. It scores three components separately (technology infrastructure, leadership, and growth potential) and then combines them into a single overall score.

**Key Functions:**

- `calculate_tech_score()`: Evaluates the technology indicators (e.g., AI/ML capabilities, data, cloud infrastructure) and sums a weighted, log-scaled score over the detected categories.
- `calculate_leadership_score()`: Checks the leadership team's composition and assigns a score based on the presence of technical leadership (e.g., CTO, CIO), capped at 5.
- `calculate_growth_score()`: Assigns a score based on growth indicators such as market expansion or product launches, scaled by the company's size (small, mid-size, large) and capped at 2.
- `calculate_score()`: The main function. It combines the tech, leadership, and growth scores into an overall AI readiness score and returns it normalized to a scale of 1–10.
- `identify_opportunities()`: Based on the readiness score and on which technology categories are present or missing, suggests up to three AI transformation opportunities for the company.
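
To make the log scaling in `calculate_tech_score()` concrete, here is a minimal sketch of the formula `weight * log(1 + total_mentions)`. The weights are copied from `AIReadinessScorer`; the standalone `tech_score` function and its flat `{category: total_mentions}` input are simplifications introduced for illustration.

```python
import math

# Category weights as defined in AIReadinessScorer.__init__
CATEGORY_WEIGHTS = {
    'ai_ml': 3.0, 'data': 2.5, 'cloud': 2.0, 'integration': 1.5, 'automation': 2.0,
}

def tech_score(mention_totals):
    """Sum weight * ln(1 + mentions) over the detected categories."""
    return sum(
        CATEGORY_WEIGHTS.get(category, 1.0) * math.log(1 + total)
        for category, total in mention_totals.items()
    )

few = tech_score({'ai_ml': 1, 'cloud': 1})
many = tech_score({'ai_ml': 100, 'cloud': 100})
print(f"1 mention each:    {few:.2f}")
print(f"100 mentions each: {many:.2f}")
```

A hundredfold increase in mentions raises the score by less than a factor of seven, which is the diminishing-returns behavior the scorer relies on to keep keyword-stuffed pages from dominating.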

```python
import math
import re


class AIReadinessScorer:
    def __init__(self):
        # Category weights for scoring
        self.category_weights = {
            'ai_ml': 3.0,       # AI/ML technologies are most important
            'data': 2.5,        # Data infrastructure is critical
            'cloud': 2.0,       # Cloud adoption indicates technical maturity
            'integration': 1.5, # Integration capabilities are important
            'automation': 2.0   # Automation shows process maturity
        }

        # Leadership score factors
        self.tech_leadership_titles = [
            'cto', 'chief technology', 'vp of engineering', 'chief information',
            'chief digital', 'chief data', 'head of it', 'director of technology',
            'chief innovation', 'chief ai', 'technology director', 'cio'
        ]

    def calculate_tech_score(self, tech_indicators):
        """Calculate technology score based on indicators"""
        tech_score = 0

        for category, data in tech_indicators.items():
            # Get the weight for this category
            weight = self.category_weights.get(category, 1.0)

            # Calculate score based on total mentions with diminishing returns
            # We use log scaling to prevent overly high scores from repeated mentions
            # Formula: weight * log(1 + total_mentions)
            category_score = weight * math.log(1 + data['total'])
            tech_score += category_score

        return tech_score

    def calculate_leadership_score(self, leadership_team):
        """Calculate leadership score based on technical leadership presence"""
        if not leadership_team:
            return 0

        leadership_score = 0

        # Check for technical leadership roles
        for person in leadership_team:
            title = person['title'].lower()
            if any(tech_title in title for tech_title in self.tech_leadership_titles):
                leadership_score += 2  # Strong indicator
            elif (any(term in title for term in ('tech', 'digital', 'data'))
                  or re.search(r'\bit\b', title)):
                # 'it' is matched as a whole word so that titles such as
                # "Head of Recruiting" or "Audit Manager" do not count
                leadership_score += 1  # Moderate indicator

        # Cap leadership score
        return min(leadership_score, 5)

    def calculate_growth_score(self, growth_indicators, company_size):
        """Calculate growth score based on indicators and company size"""
        growth_score = len(growth_indicators) * 0.5  # 0.5 points per growth indicator

        # Company size factor
        if company_size == 'Small Company/Startup':
            growth_score *= 1.2  # Startups with growth indicators are good candidates
        elif company_size == 'Mid-size Company':
            growth_score *= 1.0  # Neutral
        else:  # Large Enterprise, or 'Unknown' when no size signal was found
            growth_score *= 0.8  # Large companies may be harder to transform

        # Cap growth score
        return min(growth_score, 2)

    def identify_opportunities(self, analysis_results, ai_readiness_score):
        """Identify potential AI transformation opportunities"""
        opportunities = []
        tech_indicators = analysis_results['tech_indicators']

        # Basic opportunities based on readiness score
        if ai_readiness_score <= 3:
            opportunities.append({
                "title": "Basic Data Infrastructure Implementation",
                "description": "Establish foundational data collection and storage systems to prepare for AI initiatives."
            })
            opportunities.append({
                "title": "AI Readiness Assessment",
                "description": "Conduct a detailed analysis of current systems and processes to identify initial AI opportunities."
            })
        elif ai_readiness_score <= 6:
            opportunities.append({
                "title": "Process Automation Integration",
                "description": "Implement automated workflows for routine business processes to improve efficiency."
            })
            opportunities.append({
                "title": "Data Analytics Implementation",
                "description": "Deploy analytics solutions to extract business insights from existing data assets."
            })
        else:
            opportunities.append({
                "title": "Advanced AI Solution Deployment",
                "description": "Implement sophisticated AI models to enhance decision-making and create competitive advantages."
            })
            opportunities.append({
                "title": "Predictive Analytics Enhancement",
                "description": "Leverage existing data infrastructure for forecasting and predictive business intelligence."
            })

        # Check for specific opportunities based on missing or present indicators
        categories = tech_indicators.keys()

        if 'data' in categories and 'ai_ml' not in categories:
            opportunities.append({
                "title": "AI Model Implementation",
                "description": "Leverage existing data assets by implementing machine learning models for predictive capabilities."
            })

        if 'cloud' in categories and 'integration' not in categories:
            opportunities.append({
                "title": "API Development for System Integration",
                "description": "Create APIs to connect cloud systems with other business applications for improved data flow."
            })

        if 'automation' in categories and 'ai_ml' not in categories:
            opportunities.append({
                "title": "Intelligent Automation Upgrade",
                "description": "Enhance existing automation with AI capabilities for more adaptive and intelligent processes."
            })

        if 'ai_ml' in categories and 'data' not in categories:
            opportunities.append({
                "title": "Robust Data Infrastructure Development",
                "description": "Build comprehensive data pipeline to fully leverage existing AI capabilities."
            })

        # Return top 3 opportunities
        return opportunities[:3]

    def calculate_score(self, analysis_results):
        """Calculate overall AI readiness score and identify opportunities"""
        # Get individual component scores
        tech_score = self.calculate_tech_score(analysis_results.get('tech_indicators', {}))
        leadership_score = self.calculate_leadership_score(analysis_results.get('leadership_team', []))
        growth_score = self.calculate_growth_score(
            analysis_results.get('growth_indicators', []),
            analysis_results.get('company_size_indicator', 'Unknown')
        )

        # Calculate raw score. Leadership is capped at 5 and growth at 2;
        # the tech score has no hard cap but grows only logarithmically.
        raw_score = tech_score + leadership_score + growth_score

        # Normalize to 1-10 scale (values above 20 are clamped to 10)
        ai_readiness_score = max(1, min(10, round(raw_score / 2)))

        # Create final results
        final_results = {
            'ai_readiness_score': ai_readiness_score,
            'score_components': {
                'technology_score': round(tech_score, 1),
                'leadership_score': round(leadership_score, 1),
                'growth_score': round(growth_score, 1)
            },
            'tech_indicators': analysis_results.get('tech_indicators', {}),
            'leadership_team': analysis_results.get('leadership_team', []),
            'contact_info': analysis_results.get('contact_info', {}),
            'growth_indicators': analysis_results.get('growth_indicators', []),
            'company_size_indicator': analysis_results.get('company_size_indicator', 'Unknown'),
        }

        # Identify transformation opportunities
        final_results['transformation_opportunities'] = self.identify_opportunities(analysis_results, ai_readiness_score)

        return final_results
```
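
The normalization step in `calculate_score` halves the raw score, rounds it, and clamps the result to the range 1 to 10. The sketch below isolates that step; `normalize` is a hypothetical standalone helper, not a method of the class. One detail worth knowing is that Python's built-in `round` rounds exact halves to the nearest even integer, so a raw score of 5.0 (2.5 after halving) maps to 2 rather than 3.

```python
def normalize(raw_score: float) -> int:
    """Map a raw component sum onto the 1-10 readiness scale."""
    return max(1, min(10, round(raw_score / 2)))

for raw in (0.0, 5.0, 7.0, 13.2, 26.4):
    print(f"raw {raw:>5} -> score {normalize(raw)}")
```

A site with no detected signals still receives the floor score of 1, and any raw value above 20 is absorbed by the upper clamp.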

**4. Lead_scorer.py**

The **AIEnhancedLeadScorer** class uses pretrained Hugging Face `transformers` pipelines (zero-shot classification, text classification, and sentiment analysis) to classify leadership roles, assess technology investments, flag candidate pain points, and analyze sentiment. It combines these signals with the AI readiness score into a lead score for prioritizing outreach.

**Key Methods:**
1. `advanced_decision_maker_score`:
   - Classifies leadership roles (e.g., C-Level, Technology Leader) and assigns a strategic score.
   - Returns the average strategic score and the primary contact (person with the highest score).
2. `advanced_tech_investment_analysis`:
   - Assesses the impact of technology investments based on company indicators (e.g., AI/ML adoption).
   - Returns a score based on the technology impact.
3. `advanced_pain_point_extraction`:
   - Extracts and classifies pain points from company descriptions.
   - Returns up to 5 pain points with the highest confidence scores.
4. `calculate_lead_score`:
   - Combines the decision maker score, tech investment score, and AI readiness score, then scales the result by a sentiment factor to calculate the final lead score.
   - Categorizes the lead as Hot, Warm, or Nurture based on the score.
   - Provides insights like primary contact, pain points, and sentiment.
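
The weighting and tier thresholds can be sketched in isolation. The weights (0.3, 0.25, 0.2), the sentiment factors (1.1 and 0.9), and the thresholds (8 and 6) are taken from `calculate_lead_score`; `combine_lead_score` is a hypothetical standalone function that accepts precomputed component scores, so no models need to be loaded.

```python
def combine_lead_score(decision_maker, tech_investment, ai_readiness, positive_sentiment):
    """Weighted sum of the three components, scaled by the sentiment factor."""
    sentiment_factor = 1.1 if positive_sentiment else 0.9
    lead_score = (
        decision_maker * 0.3 +
        tech_investment * 0.25 +
        ai_readiness * 0.2
    ) * sentiment_factor

    if lead_score >= 8:
        tier = "Hot"
    elif lead_score >= 6:
        tier = "Warm"
    else:
        tier = "Nurture"
    return lead_score, tier

# Each component is at most 10, so these are the best cases per sentiment
print(combine_lead_score(10, 10, 10, positive_sentiment=True))
print(combine_lead_score(10, 10, 10, positive_sentiment=False))
# A solid but not perfect lead
print(combine_lead_score(8, 7, 6, positive_sentiment=True))
```

Because the weights sum to 0.75, the largest reachable score is 7.5 × 1.1 = 8.25. The "Hot" tier therefore requires near-perfect components with positive sentiment, and a lead with negative sentiment tops out at 6.75 ("Warm").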

```python
import numpy as np
from transformers import pipeline


class AIEnhancedLeadScorer:
    def __init__(self):
        # Pre-trained models for advanced analysis
        self.leadership_classifier = pipeline(
            "zero-shot-classification",
            model="facebook/bart-large-mnli"
        )

        # Same model and task as above, so reuse the pipeline instead of
        # loading a second copy of the weights
        self.tech_impact_classifier = self.leadership_classifier

        # NOTE: this checkpoint is trained to detect hate speech, not business
        # pain points. Treat its output as a placeholder until a classifier
        # suited to pain-point detection is substituted.
        self.pain_point_extractor = pipeline(
            "text-classification",
            model="facebook/roberta-hate-speech-dynabench-r4-target"
        )

        self.sentiment_analyzer = pipeline(
            "sentiment-analysis",
            model="distilbert-base-uncased-finetuned-sst-2-english"
        )

        # Leadership role categories for classification
        self.leadership_categories = [
            'C-Level Executive',
            'Technology Leader',
            'Strategic Decision Maker',
            'Operational Manager'
        ]

        # Technology impact categories
        self.tech_impact_categories = [
            'Digital Transformation',
            'Innovation Driver',
            'Technological Modernization',
            'Efficiency Improvement'
        ]

    def advanced_decision_maker_score(self, leadership_team):
        """
        Use AI to classify leadership roles and assess their strategic potential

        Args:
            leadership_team (list): List of leadership team members

        Returns:
            tuple: (strategic_score, primary_contact)
        """
        if not leadership_team:
            return 0, None

        strategic_scores = []
        primary_contact = None

        for person in leadership_team:
            # Use zero-shot classification to assess leadership role
            classification = self.leadership_classifier(
                person['title'],
                self.leadership_categories
            )

            # Extract highest-rated category
            top_category = classification['labels'][0]
            top_score = classification['scores'][0]

            # Assign strategic score based on category
            if top_category == 'C-Level Executive':
                strategic_score = top_score * 10
            elif top_category == 'Technology Leader':
                strategic_score = top_score * 8
            elif top_category == 'Strategic Decision Maker':
                strategic_score = top_score * 6
            else:
                strategic_score = top_score * 4

            strategic_scores.append(strategic_score)

            # Select primary contact based on highest strategic score
            if not primary_contact or strategic_score > primary_contact.get('score', 0):
                primary_contact = {
                    'name': person['name'],
                    'title': person['title'],
                    'score': strategic_score
                }

        # Calculate overall strategic score
        final_score = float(np.mean(strategic_scores)) if strategic_scores else 0
        return final_score, primary_contact

    def advanced_tech_investment_analysis(self, tech_indicators):
        """
        Use AI to assess technology investment impact

        Args:
            tech_indicators (dict): Technology indicators

        Returns:
            float: Enhanced technology investment score
        """
        if not tech_indicators:
            return 0

        # Combine technology categories into a comprehensive text
        tech_description = " ".join([
            f"{category}: {', '.join(indicators.get('indicators', {}).keys())}"
            for category, indicators in tech_indicators.items()
        ])

        # Classify technology impact
        impact_classification = self.tech_impact_classifier(
            tech_description,
            self.tech_impact_categories
        )

        # Calculate score based on classification
        top_impact = impact_classification['labels'][0]
        impact_score = impact_classification['scores'][0]

        # Assign score based on impact category
        if top_impact == 'Digital Transformation':
            tech_score = impact_score * 10
        elif top_impact == 'Innovation Driver':
            tech_score = impact_score * 8
        elif top_impact == 'Technological Modernization':
            tech_score = impact_score * 6
        else:
            tech_score = impact_score * 4

        return tech_score

    def advanced_pain_point_extraction(self, text):
        """
        Use AI to extract and classify pain points

        Args:
            text (str): Company description text

        Returns:
            list: Classified and scored pain points
        """
        # Break text into 512-character chunks to avoid input length limitations
        chunks = [text[i:i+512] for i in range(0, len(text), 512)]

        pain_points = []
        for chunk in chunks:
            # Use text classification to identify potential pain points
            results = self.pain_point_extractor(chunk)

            # Filter and process results
            for result in results:
                # Only consider high-confidence results
                if result['score'] > 0.7:
                    pain_points.append({
                        'text': chunk,
                        'confidence': result['score']
                    })

        # Sort by confidence and return top 5
        return sorted(pain_points, key=lambda x: x['confidence'], reverse=True)[:5]

    def calculate_lead_score(self, analysis_results, ai_readiness_score):
        """
        Advanced lead scoring using AI-powered analysis
        """
        # Extract key data
        leadership_team = analysis_results.get('leadership_team', [])
        tech_indicators = analysis_results.get('tech_indicators', {})
        growth_indicators = analysis_results.get('growth_indicators', [])
        company_size = analysis_results.get('company_size_indicator', 'Unknown')

        # Combine text for comprehensive analysis.
        # The last three sentences are fixed text, not scraped content, so they
        # influence the pain-point and sentiment results for every company.
        text_content = ' '.join([
            f"Company size: {company_size}",
            f"Growth indicators: {', '.join(growth_indicators)}",
            "Focused on improving efficiency and reducing costs.",
            "Challenged by legacy systems and manual processes.",
            "Committed to innovation and transformation."
        ])

        # Perform advanced scoring
        decision_maker_score, primary_contact = self.advanced_decision_maker_score(leadership_team)
        tech_investment_score = self.advanced_tech_investment_analysis(tech_indicators)
        pain_points = self.advanced_pain_point_extraction(text_content)

        # Sentiment analysis of overall text
        sentiment = self.sentiment_analyzer(text_content[:512])[0]
        sentiment_factor = 1.1 if sentiment['label'] == 'POSITIVE' else 0.9

        # Calculate lead score with AI-enhanced components
        lead_score = (
            decision_maker_score * 0.3 +
            tech_investment_score * 0.25 +
            ai_readiness_score * 0.2
        ) * sentiment_factor

        # Determine lead tier
        if lead_score >= 8:
            lead_tier = "Hot"
        elif lead_score >= 6:
            lead_tier = "Warm"
        else:
            lead_tier = "Nurture"

        # Generate sales insights
        sales_insights = {
            'lead_score': round(lead_score, 1),
            'lead_tier': lead_tier,
            'score_components': {
                'decision_maker_score': round(decision_maker_score, 1),
                'tech_investment_score': round(tech_investment_score, 1),
                'ai_readiness_factor': round(ai_readiness_score * 0.2, 1),
                'sentiment_factor': sentiment_factor
            },
            'primary_contact': primary_contact,
            'pain_points': [point['text'] for point in pain_points],
            'sentiment': sentiment
        }

        return sales_insights
```
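
`advanced_pain_point_extraction` slices the text every 512 characters, which can cut a word or sentence in half at a chunk boundary. A possible alternative, sketched here under the assumption that whitespace-delimited words are an acceptable unit, packs whole words into chunks of at most `max_chars` characters. `chunk_by_words` is a hypothetical helper, and note that the models' real input limits are measured in tokens, not characters, so this remains an approximation.

```python
def chunk_by_words(text: str, max_chars: int = 512) -> list:
    """Split text into chunks of whole words, each at most max_chars long.

    A single word longer than max_chars is kept intact as its own chunk.
    """
    chunks, current, current_len = [], [], 0
    for word in text.split():
        # +1 accounts for the joining space when the chunk is non-empty
        extra = len(word) + (1 if current else 0)
        if current and current_len + extra > max_chars:
            chunks.append(' '.join(current))
            current, current_len = [word], len(word)
        else:
            current.append(word)
            current_len += extra
    if current:
        chunks.append(' '.join(current))
    return chunks

print(chunk_by_words("alpha beta gamma delta", max_chars=11))
# ['alpha beta', 'gamma delta']
```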

## Installation

1. Clone this repository
```bash
git clone https://github.com/yourusername/ai-readiness-assessment.git
cd ai-readiness-assessment
```

2. Create a virtual environment and install dependencies
```bash
python -m venv venv
source venv/bin/activate  # On Windows, use: venv\Scripts\activate
pip install -r requirements.txt
```

3. Download NLTK data (required for text analysis)
```bash
python -c "import nltk; nltk.download('punkt')"
```

## Usage

1. Start the Flask application
```bash
python app.py
```

2. Open your browser and navigate to `http://127.0.0.1:5000/`

3. Enter a company URL (e.g., company.com) and click "Analyze"

4. View the AI readiness assessment results
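
The example input `company.com` has no scheme, and `requests.get` raises an error for URLs without one. If the application does not already handle this, user input can be normalized before fetching. The following is a minimal sketch using only the standard library; `normalize_url` is a hypothetical helper, and defaulting to `https` is an assumption.

```python
from urllib.parse import urlparse

def normalize_url(raw: str, default_scheme: str = "https") -> str:
    """Return a fetchable http(s) URL, adding a scheme when it is missing."""
    candidate = raw.strip()
    if not candidate:
        raise ValueError("empty URL")
    if "://" not in candidate:
        candidate = f"{default_scheme}://{candidate}"
    parsed = urlparse(candidate)
    # Only web URLs with a host part are accepted
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"unsupported or malformed URL: {raw!r}")
    return candidate

print(normalize_url("company.com"))  # https://company.com
```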

## Project Structure

- `app.py`: Main Flask application file
- `modules/`: Core functionality modules
  - `scraper.py`: Website scraping functionality
  - `analyzer.py`: Content analysis logic
  - `scorer.py`: AI readiness scoring algorithm
- `static/`: Static files (CSS, JavaScript)
- `templates/`: HTML templates
- `utils/`: Helper functions

## Development Notes

This project was developed as part of the Caprae Capital Partners AI-Readiness Pre-Screening Challenge. It focuses on delivering a high-impact tool that aligns with the business needs of a private equity firm specializing in AI transformation.

## Future Enhancements

- Integration with company databases for additional information
- Industry-specific assessment criteria
- Email validation and enhanced contact discovery
- CRM integration for lead management

## License

This project is proprietary and confidential.