# Web Execution Layer Workshop
## Competitive Intelligence with Bright Data APIs

This workshop demonstrates how to build a competitive intelligence pipeline using Bright Data's Web Execution Layer:

1. **SERP Ranking** - Find your position vs competitors in search results
2. **Deep Dive** - Scrape competitor pages as markdown and analyze with LLM
3. **AI Perception** - See what ChatGPT, Perplexity, and other AI engines say about your brand
4. **Deep Competitor Analysis** - Extract and analyze deeper pages (pricing, features)
5. **Executive Summary** - GPT-powered analysis of all findings

---

### Prerequisites
- Bright Data account with API access
- SERP API zone configured
- Web Unlocker zone configured
- OpenAI API key (pre-configured in environment)

## Section 0: Setup & Configuration

In [1]:
# Install required dependencies
# - requests: For making HTTP calls to Bright Data APIs
# - openai: For GPT-powered keyword generation and analysis
!pip install -q requests openai

In [3]:
#@title 👉 Enter Your Company Details Here
#@markdown ---
#@markdown Enter your company information below, then run the rest of the notebook.
#@markdown ---

MY_BRAND = "" #@param {type:"string"}
MY_DOMAIN = "" #@param {type:"string"}
COUNTRY = "us" #@param {type:"string"}

# ============================================
# API credentials (from Colab secrets)
# ============================================
from google.colab import userdata
import os

os.environ['BRIGHTDATA_API_TOKEN'] = userdata.get('BRIGHTDATA_API_TOKEN')
os.environ['BRIGHTDATA_ZONE_SERP'] = userdata.get('BRIGHTDATA_ZONE_SERP')
os.environ['BRIGHTDATA_ZONE_UNLOCKER'] = userdata.get('BRIGHTDATA_ZONE_UNLOCKER')
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

BRIGHTDATA_API_TOKEN = os.environ.get("BRIGHTDATA_API_TOKEN")
BRIGHTDATA_ZONE_SERP = os.environ.get("BRIGHTDATA_ZONE_SERP", "serp_api1")
BRIGHTDATA_ZONE_UNLOCKER = os.environ.get("BRIGHTDATA_ZONE_UNLOCKER", "unlocker")
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")

# Import required libraries
import requests
import json
import time
from concurrent.futures import ThreadPoolExecutor, as_completed
from openai import OpenAI

# Initialize OpenAI client
openai_client = OpenAI(api_key=OPENAI_API_KEY) if OPENAI_API_KEY else None

print("Configuration loaded:")
print(f"  Brand: {MY_BRAND}")
print(f"  Domain: {MY_DOMAIN}")
print(f"  Country: {COUNTRY}")
print(f"  Bright Data API: {'✓ configured' if BRIGHTDATA_API_TOKEN else '✗ missing'}")
print(f"  OpenAI API: {'✓ configured' if OPENAI_API_KEY else '✗ missing'}")

Configuration loaded:
  Brand: Lusha
  Domain: Lusha.com
  Country: us
  Bright Data API: ✓ configured
  OpenAI API: ✓ configured


In [4]:
# Execute: Use GPT to generate relevant keywords based on the brand/domain

print("Generating research keywords with GPT...\n")

# Retry initializing OpenAI client if it's None
if openai_client is None and OPENAI_API_KEY:
    openai_client = OpenAI(api_key=OPENAI_API_KEY)
    print("Retried OpenAI client initialization...")

if openai_client:
    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": """You are a competitive intelligence expert. Given a brand and domain, generate 3 search keywords that potential customers would use to find this type of product/service.

Return ONLY valid JSON in this exact format, no other text:
{"keywords": ["keyword1", "keyword2", "keyword3"]}"""
            },
            {
                "role": "user",
                "content": f"Brand: {MY_BRAND}\nDomain: {MY_DOMAIN}"
            }
        ],
        max_tokens=150
    )

    try:
        generated = json.loads(response.choices[0].message.content)
        keywords_list = generated['keywords']
        print("GPT generated keywords:")
        for kw in keywords_list:
            print(f"  - {kw}")
    except (json.JSONDecodeError, KeyError, TypeError):
        keywords_list = [f"{MY_BRAND} alternatives", f"best {MY_DOMAIN.split('.')[0]} tools"]
        print(f"Using fallback keywords: {keywords_list}")
else:
    keywords_list = [MY_BRAND, f"{MY_BRAND} review", f"{MY_BRAND} alternatives"]
    print(f"No OpenAI API key - using fallback keywords: {keywords_list}")

Generating research keywords with GPT...

GPT generated keywords:
  - B2B contact database
  - sales prospecting tool
  - lead generation software


---
## Section 1: SERP Ranking

In this section, we'll use Bright Data's SERP API to:
1. Search Google for your target keywords
2. Find where YOUR domain ranks in the results
3. Identify your main competitors

### How the SERP API Works
The SERP API sends requests through Bright Data's proxy network and returns structured JSON data from Google search results. The `brd_json=1` parameter tells the API to parse the HTML and return clean JSON.
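As a rough illustration of the request described above, the sketch below only builds the JSON payload and does not send it. `build_serp_payload` is a hypothetical name; the `q`, `gl`, and `brd_json` parameters and the `zone`/`url`/`format` fields mirror the ones used in this notebook, and `serp_api1` is the notebook's default zone name.

```python
from urllib.parse import urlencode


def build_serp_payload(keyword, country, zone="serp_api1"):
    """Build the request body for a Google search via the SERP API (sketch)."""
    # brd_json=1 asks Bright Data to return parsed JSON instead of raw HTML.
    query = urlencode({"q": keyword, "gl": country, "brd_json": 1})
    return {
        "zone": zone,
        "url": f"https://www.google.com/search?{query}",
        "format": "raw",
    }


payload = build_serp_payload("B2B contact database", "us")
print(payload["url"])
```

`urlencode` escapes spaces and special characters in the keyword, so the search URL stays valid for multi-word queries.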

In [50]:
# Setup: Define function to search Google via SERP API

def search_keyword(keyword, country):
    """
    Search Google for a keyword and return structured results.
    """
    search_url = f"https://www.google.com/search?q={requests.utils.quote(keyword)}&gl={country}&brd_json=1"

    print(f"  Searching: {keyword}...")

    try:
        response = requests.post(
            "https://api.brightdata.com/request",
            headers={
                "Authorization": f"Bearer {BRIGHTDATA_API_TOKEN}",
                "Content-Type": "application/json"
            },
            json={
                "zone": BRIGHTDATA_ZONE_SERP,
                "url": search_url,
                "format": "raw"
            },
            timeout=60
        )

        if response.status_code == 200:
            return response.json()
        else:
            print(f"    ✗ Error: {response.status_code} - {response.text[:100]}")
            return None

    except Exception as e:
        print(f"    ✗ Exception: {str(e)}")
        return None

print("✓ search_keyword() function defined")

✓ search_keyword() function defined


In [52]:
# Execute: Run SERP searches for all keywords in parallel

print(f"Searching {len(keywords_list)} keywords in {COUNTRY}...\n")

serp_results = {}

with ThreadPoolExecutor() as executor:
    futures = {}
    for keyword in keywords_list:
        future = executor.submit(search_keyword, keyword, COUNTRY)
        futures[future] = keyword
        time.sleep(0.05)  # 50ms delay between API calls

    for future in as_completed(futures):
        keyword = futures[future]
        result = future.result()
        if result:
            serp_results[keyword] = result
            organic_count = len(result.get('organic', []))
            print(f"    ✓ '{keyword}': Found {organic_count} organic results")
        else:
            print(f"    ✗ '{keyword}': No results")

print(f"\n✓ Completed {len(serp_results)}/{len(keywords_list)} searches")

Searching 3 keywords in us...

  Searching: B2B contact database...
  Searching: sales prospecting tool...
  Searching: lead generation software...
    ✓ 'lead generation software': Found 9 organic results
    ✓ 'B2B contact database': Found 10 organic results
    ✓ 'sales prospecting tool': Found 10 organic results

✓ Completed 3/3 searches


In [8]:
# Setup: Define helper function to extract domain from URL

def extract_domain(url):
    """Return the URL's host, lowercased and without a leading 'www.'."""
    try:
        from urllib.parse import urlparse
        parsed = urlparse(url)
        domain = parsed.netloc.lower()
        if domain.startswith('www.'):
            domain = domain[4:]
        return domain
    except (ValueError, TypeError, AttributeError):
        return None

print("✓ extract_domain() function defined")

✓ extract_domain() function defined


In [62]:
# Execute: Analyze SERP results to find YOUR ranking
# This cell processes the raw SERP data to extract:
# - Your brand's position for each keyword
# - List of competitors and their positions
# - Frequency count of competitors across all keywords

# Initialize rankings dictionary to store all analysis results
rankings = {
    'brand': MY_BRAND,
    'domain': MY_DOMAIN,
    'country': COUNTRY,
    'keywords': {},           # Per-keyword ranking data
    'all_competitors': {},    # Competitor frequency across all keywords
    'main_competitors': []    # Top competitors (set in next cell)
}

print(f"Analyzing rankings for {MY_DOMAIN}...\n")

# Loop through each keyword's SERP results
for keyword, data in serp_results.items():
    # Get organic (non-ad) search results
    organic = data.get('organic', [])

    my_position = None
    competitors = []

    # Check top 10 results for each keyword
    for result in organic[:10]:
        # Use the rank field directly from Bright Data's API response
        rank = result.get('rank')
        url = result.get('link', '')
        domain = extract_domain(url)
        title = result.get('title', '')

        # Check if this result is OUR domain
        if domain and MY_DOMAIN.lower() in domain:
            my_position = rank
        else:
            # It's a competitor - add to list
            competitors.append({
                'position': rank,
                'domain': domain,
                'title': title
            })
            # Track how often each competitor appears across keywords
            if domain:
                rankings['all_competitors'][domain] = rankings['all_competitors'].get(domain, 0) + 1

    # Store results for this keyword
    rankings['keywords'][keyword] = {
        'my_position': my_position,
        'competitors': competitors[:5]  # Keep top 5 competitors per keyword
    }

    # Print summary for this keyword
    position_str = f"#{my_position}" if my_position else "Not in top 10"
    print(f"'{keyword}': {position_str}")

print(f"\n✓ Analyzed {len(rankings['keywords'])} keywords")
print(f"✓ Found {len(rankings['all_competitors'])} unique competitor domains")

Analyzing rankings for Lusha.com...

'lead generation software': Not in top 10
'B2B contact database': Not in top 10
'sales prospecting tool': Not in top 10

✓ Analyzed 3 keywords
✓ Found 21 unique competitor domains
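The ranking cell decides whether a result belongs to you with a substring test (`MY_DOMAIN.lower() in domain`). That also matches unrelated hosts that merely contain your domain as text. A stricter check, sketched here under the assumption that subdomains such as `blog.` should count as yours, compares the host exactly or by dotted suffix. `is_same_site` is a hypothetical helper name.

```python
def is_same_site(candidate_host, my_domain):
    """True if candidate_host is my_domain or one of its subdomains (sketch)."""
    host = candidate_host.lower()
    target = my_domain.lower()
    if target.startswith("www."):
        target = target[4:]
    # Exact match, or a suffix match on a label boundary (".lusha.com").
    return host == target or host.endswith("." + target)


for host in ["lusha.com", "blog.lusha.com", "notlusha.com"]:
    print(host, is_same_site(host, "Lusha.com"))
```

The suffix test requires a leading dot, so `notlusha.com` is not treated as your domain while `blog.lusha.com` is.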


In [10]:
# Execute: Identify main competitors by counting how many times each domain
# appears across all keyword searches. Use GPT to filter out non-competitors
# (forums, review sites, social media, etc.)

# Get top 10 domains by frequency (we'll filter with GPT)
candidate_competitors = sorted(
    [(domain, count) for domain, count in rankings['all_competitors'].items()],
    key=lambda x: x[1],
    reverse=True
)[:10]

# Use GPT to filter out non-competitors
if openai_client and candidate_competitors:
    domains_list = [d[0] for d in candidate_competitors]

    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": f"""You are analyzing search results for {MY_BRAND} ({MY_DOMAIN}).

Given a list of domains, identify which are ACTUAL BUSINESS COMPETITORS (companies offering similar products/services).

EXCLUDE: forums (reddit, quora), social media (linkedin, twitter, youtube), review sites (g2, capterra, trustpilot), news sites, Wikipedia, GitHub, Amazon, etc.

Return ONLY a JSON array of competitor domains, no other text:
["competitor1.com", "competitor2.com"]"""
            },
            {
                "role": "user",
                "content": f"Domains found in search results: {json.dumps(domains_list)}"
            }
        ],
        max_tokens=200
    )

    try:
        filtered_domains = json.loads(response.choices[0].message.content)
        # Rebuild list with counts, preserving order
        main_competitors = [(d, c) for d, c in candidate_competitors if d in filtered_domains][:5]
    except (json.JSONDecodeError, TypeError):
        # Fallback to original list if GPT parsing fails
        main_competitors = candidate_competitors[:5]
else:
    main_competitors = candidate_competitors[:5]

rankings['main_competitors'] = [{'domain': d, 'frequency': c} for d, c in main_competitors]

print("="*50)
print("MAIN COMPETITORS (by frequency across keywords):")
for comp in main_competitors:
    print(f"  {comp[0]}: appears in {comp[1]} keyword(s)")

MAIN COMPETITORS (by frequency across keywords):
  dealfront.com: appears in 2 keyword(s)
  kaspr.io: appears in 2 keyword(s)
  cognism.com: appears in 2 keyword(s)
  seamless.ai: appears in 2 keyword(s)
  apollo.io: appears in 1 keyword(s)


In [11]:
# Display rankings summary table

print("\n" + "="*60)
print(f"SERP RANKING SUMMARY: {MY_BRAND}")
print("="*60)
print(f"{'Keyword':<30} {'Position':<15} {'Top Competitor'}")
print("-"*60)

for keyword, data in rankings['keywords'].items():
    pos = data['my_position']
    position_str = f"#{pos}" if pos else "Not ranked"
    top_comp = data['competitors'][0]['domain'] if data['competitors'] else "N/A"
    print(f"{keyword:<30} {position_str:<15} {top_comp}")

print("\n" + "-"*60)
print("Top competitors to watch:")
for i, comp in enumerate(rankings['main_competitors'][:3], 1):
    print(f"  {i}. {comp['domain']}")


SERP RANKING SUMMARY: Lusha
Keyword                        Position        Top Competitor
------------------------------------------------------------
B2B contact database           Not ranked      apollo.io
sales prospecting tool         Not ranked      dealfront.com
lead generation software       Not ranked      zendesk.com

------------------------------------------------------------
Top competitors to watch:
  1. dealfront.com
  2. kaspr.io
  3. cognism.com


In [12]:
# Execute: GPT summarizes SERP findings

if openai_client:
    print("Generating GPT analysis of SERP results...\n")

    serp_summary = f"""
    Brand: {MY_BRAND}
    Domain: {MY_DOMAIN}
    Market: {COUNTRY}

    Keyword Rankings:
    """
    for kw, data in rankings['keywords'].items():
        pos = data['my_position'] or "Not in top 10"
        serp_summary += f"\n    - '{kw}': Position {pos}"
        if data['competitors']:
            serp_summary += f" (Top competitor: {data['competitors'][0]['domain']})"

    serp_summary += f"\n\n    Main Competitors: {', '.join([c['domain'] for c in rankings['main_competitors'][:3]])}"

    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": "You are a competitive intelligence analyst. Provide brief, actionable insights based on SERP ranking data. Be concise - 3-4 bullet points max."
            },
            {
                "role": "user",
                "content": f"Analyze these SERP rankings and provide key insights:\n{serp_summary}"
            }
        ],
        max_tokens=300
    )

    serp_analysis = response.choices[0].message.content
    print("GPT ANALYSIS:")
    print("-"*40)
    print(serp_analysis)
else:
    serp_analysis = None
    print("Skipping GPT analysis (no OpenAI API key configured)")

Generating GPT analysis of SERP results...

GPT ANALYSIS:
----------------------------------------
- Lusha is currently not ranking in the top 10 for critical keywords tied to its core offering, indicating weak SEO presence compared to competitors like apollo.io and dealfront.com.
- Competitors such as dealfront.com and kaspr.io dominate key terms, suggesting they have stronger content strategies or backlink profiles targeting B2B contact databases and prospecting tools.
- To improve rankings, Lusha should optimize for these high-value keywords with targeted content, enhanced on-page SEO, and actively build authoritative backlinks.
- Consider auditing and benchmarking against top competitors' SEO strategies to identify content gaps and technical improvements for Lusha.com.


---
## Section 2: Deep Dive into Competitor Pages

Now that we've identified the top competitors in our search results, let's scrape those pages and analyze their strategies.

We'll use Bright Data's **Web Unlocker** to fetch **your domain and competitor pages** as clean markdown, then use GPT to extract competitive insights like:
- Their main value proposition
- Key features they highlight
- Trust signals (testimonials, client logos, stats)
- Pricing transparency
- Relevant URLs for deeper analysis (pricing pages, feature pages, etc.)

This allows us to compare your messaging directly against your competitors.

In [20]:
# Setup: Define function to scrape pages as markdown using Web Unlocker

def scrape_as_markdown(url):
    """
    Scrape a URL and return the content as clean markdown using Web Unlocker.
    """
    print(f"  Scraping: {url[:50]}...")

    try:
        response = requests.post(
            "https://api.brightdata.com/request",
            headers={
                "Authorization": f"Bearer {BRIGHTDATA_API_TOKEN}",
                "Content-Type": "application/json"
            },
            json={
                "zone": BRIGHTDATA_ZONE_UNLOCKER,
                "url": url,
                "format": "raw",
                "data_format": "markdown"
            },
            timeout=60
        )

        if response.status_code == 200:
            markdown_content = response.text
            print(f"    ✓ Got {len(markdown_content)} chars of markdown")
            return {'markdown': markdown_content[:10000], 'url': url}
        else:
            print(f"    ✗ Error: {response.status_code}")
            return None

    except Exception as e:
        print(f"    ✗ Error: {str(e)[:50]}")
        return None

print("✓ scrape_as_markdown() function defined")

✓ scrape_as_markdown() function defined


In [21]:
# Execute: Select top competitor HOMEPAGES and our own domain to analyze

print("Selecting pages to analyze...\n")

competitor_urls = []

# Add our own domain first for comparison
my_url = f"https://{MY_DOMAIN}"
competitor_urls.append(my_url)
print(f"  ✓ Added our domain: {my_url}")

# Add top competitor homepages (not blog posts from SERP)
for comp in rankings['main_competitors'][:3]:
    domain = comp['domain']
    url = f"https://{domain}"
    if url not in competitor_urls:
        competitor_urls.append(url)
        print(f"  ✓ Added competitor: {url}")

print(f"\nSelected {len(competitor_urls)} pages to deep-dive:")
for i, url in enumerate(competitor_urls, 1):
    label = "(YOUR DOMAIN)" if MY_DOMAIN.lower() in url.lower() else ""
    print(f"  {i}. {url} {label}")

Selecting pages to analyze...

  ✓ Added our domain: https://Lusha.com
  ✓ Added competitor: https://dealfront.com
  ✓ Added competitor: https://kaspr.io
  ✓ Added competitor: https://cognism.com

Selected 4 pages to deep-dive:
  1. https://Lusha.com (YOUR DOMAIN)
  2. https://dealfront.com 
  3. https://kaspr.io 
  4. https://cognism.com 


In [22]:
# Execute: Scrape the pages as markdown in parallel

print(f"Scraping {len(competitor_urls)} pages...\n")

competitor_content = []

with ThreadPoolExecutor() as executor:
    futures = {}
    for url in competitor_urls:
        future = executor.submit(scrape_as_markdown, url)
        futures[future] = url
        time.sleep(0.05)  # 50ms delay between API calls

    for future in as_completed(futures):
        url = futures[future]
        result = future.result()
        if result:
            label = "(YOUR DOMAIN)" if MY_DOMAIN.lower() in url.lower() else ""
            print(f"    ✓ {url[:50]}... {label}")
            competitor_content.append(result)

print(f"\n✓ Successfully scraped {len(competitor_content)}/{len(competitor_urls)} pages")

Scraping 4 pages...

  Scraping: https://Lusha.com...
  Scraping: https://dealfront.com...
  Scraping: https://kaspr.io...
  Scraping: https://cognism.com...
    ✓ Got 19533 chars of markdown
    ✓ https://dealfront.com... 
    ✓ Got 9589 chars of markdown
    ✓ https://kaspr.io... 
    ✓ Got 23826 chars of markdown
    ✓ https://cognism.com... 
    ✓ Got 20645 chars of markdown
    ✓ https://Lusha.com... (YOUR DOMAIN)

✓ Successfully scraped 4/4 pages
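All four pages came back in this run, but `scrape_as_markdown()` returns `None` on any HTTP error or timeout and the page is then dropped from the analysis. One possible mitigation, sketched here and not part of the workshop code, is a small retry wrapper with exponential backoff. `retry_on_none` is a hypothetical helper; the `sleep` parameter exists only so the delay can be stubbed out.

```python
import time


def retry_on_none(fn, *args, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(*args) until it returns something other than None (sketch).

    Waits base_delay, 2*base_delay, 4*base_delay, ... between attempts.
    """
    for attempt in range(attempts):
        result = fn(*args)
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None


# Demo with a stand-in function that fails twice, then succeeds.
calls = []

def flaky_scrape(url):
    calls.append(url)
    return None if len(calls) < 3 else {"url": url, "markdown": "# ok"}

print(retry_on_none(flaky_scrape, "https://example.com", sleep=lambda s: None))
```

In the notebook, the executor could submit `retry_on_none` with `scrape_as_markdown` and the URL as arguments instead of submitting `scrape_as_markdown` directly.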


In [23]:
# Execute: GPT analyzes pages and extracts relevant URLs
# This cell sends each scraped page to GPT for analysis.
# GPT extracts structured insights and identifies deeper pages to scrape later.

def analyze_page_with_gpt(page):
    """Analyze a single page with GPT and return insights."""
    is_my_domain = MY_DOMAIN.lower() in page['url'].lower()

    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": """Analyze this page and extract:
1. Main value proposition (1 sentence)
2. Key features/benefits highlighted (2-3 bullets)
3. Target audience signals
4. Pricing info if visible
5. Trust signals (testimonials, client logos, stats)
6. Relevant URLs for deeper analysis, ordered by importance (pricing page first, then features, etc.)

Return JSON format:
{
  "value_proposition": "...",
  "features": ["...", "..."],
  "target_audience": "...",
  "pricing_info": "...",
  "trust_signals": "...",
  "relevant_urls": ["https://...", "https://..."]
}"""
            },
            {
                "role": "user",
                "content": f"URL: {page['url']}\n\nPage Content:\n{page['markdown'][:10000]}"
            }
        ],
        max_tokens=1000
    )

    try:
        analysis = json.loads(response.choices[0].message.content)
    except (json.JSONDecodeError, TypeError):
        analysis = {'raw': response.choices[0].message.content, 'relevant_urls': []}

    return {
        'url': page['url'],
        'is_my_domain': is_my_domain,
        'analysis': analysis
    }

if openai_client and competitor_content:
    print(f"Analyzing {len(competitor_content)} pages with GPT in parallel...\n")

    competitor_insights = []

    with ThreadPoolExecutor() as executor:
        futures = {executor.submit(analyze_page_with_gpt, page): page for page in competitor_content}

        for future in as_completed(futures):
            page = futures[future]
            result = future.result()
            competitor_insights.append(result)

            # Display results
            label = "YOUR DOMAIN" if result['is_my_domain'] else "COMPETITOR"
            print(f"\n{'='*50}")
            print(f"[{label}] {result['url'][:50]}...")
            print(f"{'='*50}")
            print(json.dumps(result['analysis'], indent=2))

    print(f"\n✓ Analyzed {len(competitor_insights)} pages")
else:
    competitor_insights = []
    print("No pages to analyze")

Analyzing 4 pages with GPT in parallel...


[COMPETITOR] https://cognism.com...
{
  "value_proposition": "Cognism provides up-to-date, high-quality B2B sales intelligence data and actionable insights to help businesses drive intelligent prospecting, boost connect rates, and build pipeline across EMEA markets.",
  "features": [
    "Comprehensive data coverage with millions of verified company and decision-maker contacts, including mobile numbers that boost connect rates by up to 3x.",
    "Purpose-built platform for sales, marketing, and revenue operations teams to target accounts with precise insights and buyer signals.",
    "Seamless integrations and Data-as-a-Service to deliver enriched data directly into your tech stack or warehouse."
  ],
  "target_audience": "B2B sales, marketing, and revenue operations professionals and teams, especially those selling into EMEA markets looking for reliable, GDPR-compliant sales intelligence to enhance demand generation and pipeline growth.",
  

In [24]:
for insight in competitor_insights:
    if insight.get('is_my_domain'):
        print(f"URL: {insight['url']}")
        print(f"Analysis: {insight['analysis']}")


URL: https://Lusha.com
Analysis: {'value_proposition': 'Lusha provides accurate B2B data enriched with real-time buying signals and AI-powered workflows to keep revenue teams in sync and accelerate deal wins.', 'features': ['Extensive database of 300M+ verified contacts with complete company context for precise prospecting.', 'Automated lead enrichment that keeps CRM records complete and sales-ready with validated contact details and firmographic data.', 'Real-time buying signals including job changes, hiring momentum, funding news, and technology adoption to engage prospects at the right time.', 'Automated GTM workflows like list building, enrichment, scoring, routing, and CRM updates to streamline sales processes and reduce manual work.', 'API and integrations to ensure all GTM tools share one clean and updated source of truth.'], 'target_audience': 'Revenue teams including sales, marketing, revops, and recruiting professionals aiming to optimize prospecting, lead enrichment, and eng

---
## Section 3: AI Perception

In this section, we'll query AI engines (ChatGPT, Perplexity, Grok, Gemini) to see:
1. What they say when users ask about your industry
2. Whether YOUR brand is mentioned in their responses
3. How you compare to competitors in AI recommendations
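Point 2 above comes down to checking each engine's answer text for the brand name. A minimal sketch of such a check follows; `mentions_brand` is a hypothetical helper that takes plain text, since the exact field holding the answer in each engine's response is not shown in this section.

```python
import re


def mentions_brand(answer_text, brand):
    """True if brand appears in answer_text as a whole word, ignoring case."""
    pattern = rf"\b{re.escape(brand)}\b"
    return re.search(pattern, answer_text, flags=re.IGNORECASE) is not None


answer = "Popular options include Apollo, Lusha, and Cognism."
print(mentions_brand(answer, "lusha"))
```

Word boundaries avoid counting longer words that merely contain the brand string; brands whose names are common words would need a stricter rule.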

### How AI Engine Scraping Works
Bright Data's Web Scraper API has pre-built endpoints for each AI engine. We send a prompt, and the API handles:
- Opening the AI engine in a real browser
- Submitting the prompt
- Waiting for and extracting the response

In [25]:
# AI Engine configuration

AI_ENGINES = {
    'chatgpt': {
        'dataset_id': 'gd_m7aof0k82r803d5bjm',
        'name': 'ChatGPT',
        'url': 'https://chatgpt.com/'
    },
    'perplexity': {
        'dataset_id': 'gd_m7dhdot1vw9a7gc1n',
        'name': 'Perplexity',
        'url': 'https://www.perplexity.ai'
    },
    'grok': {
        'dataset_id': 'gd_m8ve0u141icu75ae74',
        'name': 'Grok',
        'url': 'https://grok.com/'
    },
    'gemini': {
        'dataset_id': 'gd_mbz66arm2mf9cu856y',
        'name': 'Gemini',
        'url': 'https://gemini.google.com/'
    }
}

print("AI Engines configured:")
for key, config in AI_ENGINES.items():
    print(f"  - {config['name']}")

AI Engines configured:
  - ChatGPT
  - Perplexity
  - Grok
  - Gemini


In [36]:
# Execute: GPT generates GENERIC industry queries for AI engines
# These queries should NOT mention our brand - we want to see if AI engines
# naturally recommend us when users ask about the industry/problem space

print("Generating AI perception queries with GPT...\n")

if openai_client:
    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": """Generate 3 questions that potential customers would ask AI assistants when looking for this type of product/service.

IMPORTANT: Do NOT mention the brand name in any query. These should be generic industry questions like:
- "What's the best B2B contact database?"
- "Which sales prospecting tools are most accurate?"
- "Best tools for finding business email addresses"

We want to see if AI engines naturally recommend the brand without being asked about it directly.

Return ONLY valid JSON array, no other text:
["query 1", "query 2", "query 3"]"""
            },
            {
                "role": "user",
                "content": f"Brand: {MY_BRAND}\nDomain: {MY_DOMAIN}\nIndustry keywords: {', '.join(keywords_list)}"
            }
        ],
        max_tokens=150
    )

    try:
        ai_queries = json.loads(response.choices[0].message.content)
        print("Generated queries (brand NOT mentioned):")
        for q in ai_queries:
            print(f"  - {q}")
    except (json.JSONDecodeError, TypeError):
        ai_queries = [f"best {keywords_list[0]} tools", "best sales prospecting software", "most accurate B2B contact database"]
        print(f"Using fallback queries: {ai_queries}")
else:
    ai_queries = [f"best {keywords_list[0]} tools", "best sales prospecting software", "most accurate B2B contact database"]
    print(f"Using default queries: {ai_queries}")

Generating AI perception queries with GPT...

Generated queries (brand NOT mentioned):
  - What are the top B2B contact databases for accurate lead information?
  - Which sales prospecting tools provide verified business emails and phone numbers?
  - Best lead generation software for improving B2B sales outreach?


In [27]:
# Setup: Define function to trigger AI engine query
# This function sends a prompt to an AI engine and returns a snapshot_id.
# The snapshot_id is used to poll for results in the next step.

def trigger_ai_query(engine_key, prompt, country):
    """Step 1: Send prompt to AI engine, get snapshot_id back."""
    engine = AI_ENGINES[engine_key]

    # Handle Gemini not supporting Israel
    query_country = country.upper() if not (engine_key == 'gemini' and country.upper() == 'IL') else ''

    try:
        # Gemini uses different payload format (input wrapper)
        if engine_key == 'gemini':
            payload = {
                "input": [{
                    "url": engine['url'],
                    "prompt": prompt,
                    "country": query_country
                }]
            }
        else:
            # ChatGPT, Perplexity, Grok use array directly
            payload = [{
                "url": engine['url'],
                "prompt": prompt,
                "country": query_country
            }]

        response = requests.post(
            f"https://api.brightdata.com/datasets/v3/trigger?dataset_id={engine['dataset_id']}",
            headers={
                "Authorization": f"Bearer {BRIGHTDATA_API_TOKEN}",
                "Content-Type": "application/json"
            },
            json=payload,
            timeout=30
        )

        # API returns 200 with snapshot_id on success
        if response.status_code == 200:
            data = response.json()
            snapshot_id = data.get('snapshot_id')
            if snapshot_id:
                return snapshot_id
            else:
                print(f"    ✗ {engine['name']}: No snapshot_id - {str(data)[:100]}")
                return None
        else:
            print(f"    ✗ {engine['name']}: HTTP {response.status_code} - {response.text[:100]}")
            return None

    except Exception as e:
        print(f"    ✗ {engine['name']}: {str(e)[:100]}")
        return None

print("✓ trigger_ai_query() function defined")

✓ trigger_ai_query() function defined


In [65]:
# Execute: Phase 1 - Trigger all AI engine requests in parallel

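engines_to_query = ['chatgpt', 'perplexity', 'grok', 'gemini']

# Number of identical requests sent per (engine, query) task; the first
# snapshot to become ready is used. The run shown below used 3.
REDUNDANT_REQUESTS = 3

print(f"Querying {len(engines_to_query)} engines × {len(ai_queries)} queries...")
print(f"(Using {REDUNDANT_REQUESTS}x redundancy for speed)\n")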

# Build list of all engine + query combinations
all_tasks = [(engine, query) for query in ai_queries for engine in engines_to_query]

print("Triggering AI engine requests...")

pending_snapshots = []  # List of (engine, query, snapshot_id)

with ThreadPoolExecutor(max_workers=20) as executor:
    futures = {}
    for engine, query in all_tasks:
        # Send REDUNDANT_REQUESTS identical requests per task
        for _ in range(REDUNDANT_REQUESTS):
            future = executor.submit(trigger_ai_query, engine, query, COUNTRY)
            futures[future] = (engine, query)
            time.sleep(0.05)

    for future in as_completed(futures):
        engine, query = futures[future]
        snapshot_id = future.result()
        if snapshot_id:
            pending_snapshots.append((engine, query, snapshot_id))

print(f"\n✓ Triggered {len(pending_snapshots)} snapshot requests")
print(f"  ({len(all_tasks)} unique tasks × {REDUNDANT_REQUESTS} redundant requests each)")

Querying 4 engines × 3 queries...
(Using 3x redundancy for speed)

Triggering AI engine requests...

✓ Triggered 36 snapshot requests
  (12 unique tasks × 3 redundant requests each)
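All 36 triggers succeeded here. If every redundant trigger for some task had failed, the polling phase would have no snapshot to check for it and would wait out its full time limit. A sketch of a pre-check that groups snapshot IDs by task and lists tasks with none is shown below; `group_snapshots` is a hypothetical helper operating on the same tuple shapes as `pending_snapshots` and `all_tasks`.

```python
from collections import defaultdict


def group_snapshots(pending, all_tasks):
    """Group snapshot IDs by (engine, query) and list tasks that have none."""
    by_task = defaultdict(list)
    for engine, query, snapshot_id in pending:
        by_task[(engine, query)].append(snapshot_id)
    missing = [task for task in all_tasks if task not in by_task]
    return dict(by_task), missing


# Demo with made-up snapshot IDs.
tasks = [("chatgpt", "q1"), ("grok", "q1")]
pending = [("chatgpt", "q1", "s_a"), ("chatgpt", "q1", "s_b")]
grouped, missing = group_snapshots(pending, tasks)
print(grouped)
print(missing)
```

Tasks in `missing` could be re-triggered or excluded before polling starts.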


In [66]:
# Execute: Phase 2 - Poll snapshots until ready

print("Waiting for snapshots to complete...")
print("(This may take 1-3 minutes)\n")

ready_snapshots = []  # List of (engine, query, snapshot_id)
max_wait = 180  # 3 minutes max
start_time = time.time()

# Track which (engine, query) pairs we've already got results for
completed_tasks = set()

while (time.time() - start_time) < max_wait:
    # Check if we have all unique tasks completed
    if len(completed_tasks) >= len(all_tasks):
        print(f"\n✓ All {len(all_tasks)} tasks completed!")
        break

    for engine, query, snapshot_id in pending_snapshots:
        # Skip if we already have a result for this engine+query
        if (engine, query) in completed_tasks:
            continue

        try:
            response = requests.get(
                f"https://api.brightdata.com/datasets/v3/progress/{snapshot_id}",
                headers={"Authorization": f"Bearer {BRIGHTDATA_API_TOKEN}"},
                timeout=10
            )

            if response.status_code == 200:
                status = response.json().get('status')
                if status == 'ready':
                    ready_snapshots.append((engine, query, snapshot_id))
                    completed_tasks.add((engine, query))
                    print(f"  ✓ {AI_ENGINES[engine]['name']}: {query[:35]}...")
                elif status == 'failed':
                    # Mark as completed so we don't keep checking
                    completed_tasks.add((engine, query))
                    print(f"  ✗ {AI_ENGINES[engine]['name']}: {query[:35]}... (failed)")
        except Exception:
            pass  # Network or JSON error; will retry on next loop

    # Progress update
    elapsed = int(time.time() - start_time)
    print(f"  ... {len(completed_tasks)}/{len(all_tasks)} complete ({elapsed}s elapsed)")
    time.sleep(5)

print(f"\n✓ {len(ready_snapshots)} snapshots ready for download")

Waiting for snapshots to complete...
(This may take 1-3 minutes)

  ... 0/12 complete (2s elapsed)
  ... 0/12 complete (9s elapsed)
  ✓ Perplexity: Which sales prospecting tools provi...
  ... 1/12 complete (16s elapsed)
  ✓ Perplexity: Best lead generation software for i...
  ... 2/12 complete (23s elapsed)
  ... 2/12 complete (30s elapsed)
  ... 2/12 complete (38s elapsed)
  ✓ Gemini: Best lead generation software for i...
  ✓ Grok: Which sales prospecting tools provi...
  ✓ Gemini: Which sales prospecting tools provi...
  ... 5/12 complete (45s elapsed)
  ✓ Gemini: What are the top B2B contact databa...
  ✓ ChatGPT: What are the top B2B contact databa...
  ✓ Grok: Best lead generation software for i...
  ✓ Perplexity: What are the top B2B contact databa...
  ... 9/12 complete (51s elapsed)
  ... 9/12 complete (57s elapsed)
  ... 9/12 complete (62s elapsed)
  ✓ ChatGPT: Which sales prospecting tools provi...
  ... 10/12 complete (68s elapsed)
  ... 10/12 complete 
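
The polling loop above can be factored into a helper that is easier to test in isolation. The following is a minimal sketch, not the notebook's implementation: `poll_until_ready` and the injected `get_status` callable are hypothetical names, and `get_status` is assumed to wrap the `/progress/{snapshot_id}` request and return the status string. Unlike the loop above, this version gives up on a task only once every redundant snapshot for it has failed.

```python
import time

def poll_until_ready(pending, get_status, max_wait=180, interval=5, sleep=time.sleep):
    """Poll snapshots until each (engine, query) task has one ready snapshot.

    pending:    list of (engine, query, snapshot_id) tuples
    get_status: hypothetical callable, snapshot_id -> 'ready' | 'failed' | other
    Returns a dict mapping (engine, query) -> first ready snapshot_id.
    """
    ready = {}
    failed_ids = set()
    tasks = {(e, q) for e, q, _ in pending}
    deadline = time.monotonic() + max_wait

    while time.monotonic() < deadline:
        for engine, query, snapshot_id in pending:
            task = (engine, query)
            if task in ready or snapshot_id in failed_ids:
                continue
            try:
                status = get_status(snapshot_id)
            except Exception:
                continue  # transient error: retry on the next pass
            if status == "ready":
                ready[task] = snapshot_id
            elif status == "failed":
                failed_ids.add(snapshot_id)

        # Stop once every task is ready or has no live snapshots left
        live = {(e, q) for e, q, s in pending if s not in failed_ids}
        if all(t in ready or t not in live for t in tasks):
            break
        sleep(interval)

    return ready
```

In the notebook, `get_status` could be a small function that performs the same `requests.get(...)` call as the cell above and returns `response.json().get('status')`.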

In [67]:
# Execute: Phase 3 - Download results from ready snapshots (in parallel)

print("Downloading results...\n")

ai_results = []
downloaded_tasks = set()

def download_result(engine, query, snapshot_id):
    """Download a single snapshot result."""
    try:
        response = requests.get(
            f"https://api.brightdata.com/datasets/v3/snapshot/{snapshot_id}?format=json",
            headers={"Authorization": f"Bearer {BRIGHTDATA_API_TOKEN}"},
            timeout=30
        )

        if response.status_code == 200:
            data = response.json()
            if isinstance(data, list) and len(data) > 0:
                item = data[0]
                # Extract the response in markdown
                content = item.get('answer_text_markdown', '')
                return {
                    'engine': engine,
                    'engine_name': AI_ENGINES[engine]['name'],
                    'query': query,
                    'content': content
                }
    except Exception:
        pass  # Treat network or JSON errors as a missing result
    return None

with ThreadPoolExecutor() as executor:
    futures = {}
    for engine, query, snapshot_id in ready_snapshots:
        # Skip duplicates (from redundancy)
        if (engine, query) in downloaded_tasks:
            continue
        downloaded_tasks.add((engine, query))

        future = executor.submit(download_result, engine, query, snapshot_id)
        futures[future] = (engine, query)

    for future in as_completed(futures):
        engine, query = futures[future]
        result = future.result()
        if result:
            ai_results.append(result)
            print(f"  ✓ {AI_ENGINES[engine]['name']}: {query[:35]}...")

print(f"\n✓ Got {len(ai_results)}/{len(all_tasks)} responses")

Downloading results...

  ✓ Grok: Which sales prospecting tools provi...
  ✓ Perplexity: Which sales prospecting tools provi...
  ✓ Gemini: Which sales prospecting tools provi...
  ✓ Gemini: What are the top B2B contact databa...
  ✓ Gemini: Best lead generation software for i...
  ✓ Perplexity: Best lead generation software for i...
  ✓ Grok: Best lead generation software for i...
  ✓ ChatGPT: Best lead generation software for i...
  ✓ Grok: What are the top B2B contact databa...
  ✓ ChatGPT: Which sales prospecting tools provi...
  ✓ Perplexity: What are the top B2B contact databa...
  ✓ ChatGPT: What are the top B2B contact databa...

✓ Got 12/12 responses


In [40]:
# Setup: Define function to check brand mentions using LLM

def check_brand_mention(text, brand):
    """Use LLM to check if brand is mentioned and extract position if listed."""
    if not text or not openai_client:
        return {'mentioned': False, 'position': None}

    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": f"""Check if \"{brand}\" is mentioned in the text. Account for spelling variations.

Return ONLY valid JSON:
{{"mentioned": true/false, "position": null or number if in a ranked list}}"""
            },
            {
                "role": "user",
                "content": text[:5000]
            }
        ],
        max_tokens=100
    )

    try:
        return json.loads(response.choices[0].message.content)
    except Exception:
        return {'mentioned': False, 'position': None}

print("✓ check_brand_mention() function defined")

✓ check_brand_mention() function defined
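
`json.loads` raises when the model wraps its JSON in a Markdown code fence or adds a sentence around it, and the function above then reports "not mentioned" without any warning. Below is a minimal sketch of a more tolerant parser, assuming the reply contains at most one JSON object. `parse_mention_json` and `DEFAULT_MENTION` are hypothetical names, not part of the notebook.

```python
import json
import re

DEFAULT_MENTION = {"mentioned": False, "position": None}

def parse_mention_json(raw):
    """Extract {"mentioned": bool, "position": int | None} from an LLM reply."""
    if not raw:
        return dict(DEFAULT_MENTION)
    # Grab the outermost {...} span, ignoring code fences or surrounding prose
    match = re.search(r"\{.*\}", raw, re.DOTALL)
    if not match:
        return dict(DEFAULT_MENTION)
    try:
        data = json.loads(match.group(0))
    except json.JSONDecodeError:
        return dict(DEFAULT_MENTION)
    position = data.get("position")
    # Keep only real integer ranks; bool is a subclass of int, so exclude it
    is_rank = isinstance(position, int) and not isinstance(position, bool)
    return {
        "mentioned": bool(data.get("mentioned")),
        "position": position if is_rank else None,
    }
```

One way to use it would be to replace the `json.loads(...)` call in `check_brand_mention()` with `parse_mention_json(response.choices[0].message.content)`.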


In [41]:
# Execute: Check each AI response for brand mentions

print(f"Checking {len(ai_results)} responses for '{MY_BRAND}' mentions...\n")

for result in ai_results:
    mention = check_brand_mention(result['content'], MY_BRAND)
    # Use .get() because the LLM's JSON may omit a key
    result['mentioned'] = bool(mention.get('mentioned'))
    result['position'] = mention.get('position')

    status = "✓ MENTIONED" if result['mentioned'] else "✗ Not mentioned"
    pos = f" (#{result['position']})" if result['position'] else ""
    print(f"[{result['engine_name']}] {result['query'][:35]}...")
    print(f"  {status}{pos}\n")

Checking 12 responses for 'Lusha' mentions...

[Grok] What are the top B2B contact databa...
  ✓ MENTIONED (#5)

[Gemini] Which sales prospecting tools provi...
  ✓ MENTIONED (#3)

[Perplexity] Which sales prospecting tools provi...
  ✗ Not mentioned

[ChatGPT] Which sales prospecting tools provi...
  ✓ MENTIONED (#2)

[Perplexity] Best lead generation software for i...
  ✗ Not mentioned

[Gemini] What are the top B2B contact databa...
  ✓ MENTIONED (#4)

[Grok] Which sales prospecting tools provi...
  ✓ MENTIONED (#4)

[ChatGPT] Best lead generation software for i...
  ✓ MENTIONED (#9)

[ChatGPT] What are the top B2B contact databa...
  ✓ MENTIONED (#3)

[Grok] Best lead generation software for i...
  ✓ MENTIONED (#3)

[Perplexity] What are the top B2B contact databa...
  ✓ MENTIONED (#3)

[Gemini] Best lead generation software for i...
  ✗ Not mentioned
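
The per-response flags are easier to compare once they are rolled up per engine. Below is a minimal sketch, assuming each entry carries the `engine_name` and `mentioned` keys set in the cell above. `mention_rate_by_engine` is a hypothetical helper, and the sample data is invented for illustration rather than taken from the run.

```python
from collections import defaultdict

def mention_rate_by_engine(results):
    """Return {engine_name: (mentions, total, rate)} from per-response flags."""
    counts = defaultdict(lambda: [0, 0])  # engine -> [mentions, total]
    for r in results:
        counts[r["engine_name"]][1] += 1
        if r.get("mentioned"):
            counts[r["engine_name"]][0] += 1
    return {
        engine: (hits, total, hits / total)
        for engine, (hits, total) in counts.items()
    }

# Illustrative data only, not the notebook's real results
sample = [
    {"engine_name": "ChatGPT", "mentioned": True},
    {"engine_name": "ChatGPT", "mentioned": True},
    {"engine_name": "Perplexity", "mentioned": False},
    {"engine_name": "Perplexity", "mentioned": True},
]
for engine, (hits, total, rate) in mention_rate_by_engine(sample).items():
    print(f"{engine}: {hits}/{total} ({rate:.0%})")
```

Passing `ai_results` instead of `sample` would give the same breakdown for the actual run.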



In [42]:
# Execute: GPT summarizes AI perception findings

if openai_client and ai_results:
    print("Generating GPT analysis of AI perception...\n")

    # Build summary of all AI engine responses for GPT to analyze
    ai_summary = f"Brand: {MY_BRAND}\n\nAI Engine Responses:\n"
    for r in ai_results:
        mention_str = "Mentioned" if r['mentioned'] else "Not mentioned"
        ai_summary += f"\n- [{r['engine_name']}] Query: '{r['query'][:40]}...'\n"
        ai_summary += f"  Result: {mention_str}\n"
        if r.get('position'):
            ai_summary += f"  Position: #{r['position']}\n"

    # Ask the model (gpt-4.1-mini) to analyze brand visibility across AI engines
    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": "You are a brand visibility analyst. Analyze how AI engines perceive and recommend brands. Be concise - 3-4 bullet points max."
            },
            {
                "role": "user",
                "content": f"Analyze this brand's visibility across AI engines:\n{ai_summary}"
            }
        ],
        max_tokens=500
    )

    # Store and display the analysis
    ai_analysis = response.choices[0].message.content
    print("GPT ANALYSIS:")
    print("-"*40)
    print(ai_analysis)
else:
    ai_analysis = None
    print("Skipping GPT analysis (no OpenAI API key or no results)")

Generating GPT analysis of AI perception...

GPT ANALYSIS:
----------------------------------------
- Lusha has consistent visibility across multiple AI engines (Grok, Gemini, ChatGPT, Perplexity) for queries related to B2B contact databases and sales prospecting tools, often ranking within the top 5 results.
- ChatGPT and Gemini position Lusha highly (positions #2 to #4), indicating strong recommendation strength on these platforms.
- Perplexity shows mixed visibility, mentioning Lusha for B2B contact databases but not for lead generation or sales prospecting in certain queries, signaling some inconsistency.
- Overall, Lusha is recognized and recommended by most AI engines but could improve presence in specific lead generation software queries, particularly on platforms like Perplexity and Gemini.


---
## Section 4: Deep Competitor Analysis

In this section, we'll go deeper into competitor websites by:
1. Using the URLs extracted in Section 2 (pricing pages, feature pages, etc.)
2. Scraping those deeper pages
3. Analyzing pricing, features, and messaging with GPT

### Approach
We'll use Bright Data's Web Unlocker API to scrape the relevant pages identified earlier.
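
Before scraping, the URLs extracted in Section 2 have to be restricted to each company's own site. Below is a minimal sketch of that filter using only the standard library. `host_of` and `same_site_urls` are hypothetical helpers (the notebook relies on its own `extract_domain`), and matching on a hostname suffix rather than a substring is a choice made here so that a look-alike host such as `notexample.com` does not pass for `example.com`.

```python
from urllib.parse import urlparse

def host_of(url):
    """Lowercased hostname without a leading 'www.', or '' if the URL has no host."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host

def same_site_urls(source_url, candidates, limit=5):
    """Keep up to `limit` unique candidate URLs on the source's domain or its subdomains."""
    source = host_of(source_url)
    kept = []
    for url in candidates:
        host = host_of(url)
        on_site = host == source or host.endswith("." + source)
        if source and on_site and url not in kept:
            kept.append(url)
            if len(kept) == limit:
                break
    return kept
```

The `limit=5` default mirrors the cap of five deeper URLs per domain used in the next cell.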

In [54]:
# Execute: Scrape deeper pages (pricing, features, etc.) from URLs extracted in Section 2
# Get up to 5 deeper URLs per domain (your domain + top 3 competitors)
# Only scrape URLs that match the source domain

deep_urls = []

# Collect relevant URLs per domain (up to 5 each, must match source domain)
for insight in competitor_insights:
    urls = insight.get('analysis', {}).get('relevant_urls', [])
    source_domain = extract_domain(insight['url'])
    label = "(YOUR DOMAIN)" if insight.get('is_my_domain') else ""

    # Add up to 5 URLs that match the source domain
    count = 0
    for url in urls:
        url_domain = extract_domain(url)
        # Only include URLs from the same domain
        if url_domain and source_domain and source_domain in url_domain:
            if url not in deep_urls and count < 5:
                deep_urls.append(url)
                count += 1

    if count > 0:
        print(f"  ✓ {source_domain} {label}")

print(f"\nScraping deeper pages in parallel...")

deep_content = []

# Scrape all URLs in parallel
with ThreadPoolExecutor() as executor:
    futures = {}
    for url in deep_urls:
        future = executor.submit(scrape_as_markdown, url, silent=True)
        futures[future] = url
        time.sleep(0.05)  # 50ms delay between API calls

    for future in as_completed(futures):
        result = future.result()
        if result:
            deep_content.append(result)
            print(f"\r  Collected: {len(deep_content)} pages", end="", flush=True)

print(f"\n✓ Done")

  ✓ cognism.com
  ✓ dealfront.com
  ✓ kaspr.io
  ✓ lusha.com (YOUR DOMAIN)

Scraping deeper pages in parallel...
  Collected: 20 pages
✓ Done


In [55]:
# Setup: Group all scraped pages by company domain
# Combines homepage content with deeper pages (pricing, features, etc.)

company_content = {}

# Add homepage content from competitor_content
for page in competitor_content:
    domain = extract_domain(page['url'])
    if domain not in company_content:
        company_content[domain] = {
            'is_my_domain': MY_DOMAIN.lower() in domain,
            'pages': []
        }
    company_content[domain]['pages'].append(page)

# Add deep page content
for page in deep_content:
    domain = extract_domain(page['url'])
    if domain in company_content:
        company_content[domain]['pages'].append(page)

print(f"Grouped pages into {len(company_content)} companies:")
for domain, data in company_content.items():
    label = "(YOUR COMPANY)" if data['is_my_domain'] else ""
    print(f"  - {domain}: {len(data['pages'])} pages {label}")

Grouped pages into 4 companies:
  - dealfront.com: 6 pages 
  - kaspr.io: 6 pages 
  - cognism.com: 6 pages 
  - lusha.com: 6 pages (YOUR COMPANY)


In [56]:
# Execute: Generate consolidated profile for each company

def analyze_company(domain, data):
    """Analyze all pages from a company and return a consolidated profile."""
    combined_content = ""
    for page in data['pages']:
        combined_content += f"\n\n--- {page['url']} ---\n{page['markdown'][:5000]}"

    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": """Analyze all pages from this company and create a consolidated profile.

Return JSON format:
{
  "company": "...",
  "value_proposition": "1-2 sentences",
  "features": ["feature 1", "feature 2", "feature 3"],
  "pricing": "pricing details or 'Not available'",
  "target_audience": "...",
  "differentiators": ["diff 1", "diff 2"]
}"""
            },
            {
                "role": "user",
                "content": f"Domain: {domain}\n\nPages:\n{combined_content[:15000]}"
            }
        ],
        max_tokens=1000
    )

    try:
        profile = json.loads(response.choices[0].message.content)
    except Exception:
        profile = {'raw': response.choices[0].message.content}

    profile['domain'] = domain
    profile['is_my_domain'] = data['is_my_domain']
    return profile

# Analyze all companies in parallel
print("Generating company profiles...\n")
company_profiles = []

with ThreadPoolExecutor() as executor:
    futures = {executor.submit(analyze_company, domain, data): domain for domain, data in company_content.items()}

    for future in as_completed(futures):
        domain = futures[future]
        profile = future.result()
        company_profiles.append(profile)

        label = "YOUR COMPANY" if profile['is_my_domain'] else "COMPETITOR"
        print(f"✓ [{label}] {domain}")

print(f"\n✓ Generated {len(company_profiles)} company profiles")

Generating company profiles...

✓ [COMPETITOR] kaspr.io
✓ [COMPETITOR] dealfront.com
✓ [YOUR COMPANY] lusha.com
✓ [COMPETITOR] cognism.com

✓ Generated 4 company profiles
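
With six pages per company at up to 5,000 characters each, the combined text exceeds the 15,000-character cap in `analyze_company()`, so pages late in the list can be cut from the prompt entirely. Below is a minimal sketch of one alternative, splitting the budget evenly so every page contributes an excerpt. `build_context` is a hypothetical helper, and the even split is an assumption made here, not the notebook's behavior.

```python
def build_context(pages, total_budget=15000):
    """Join page excerpts so that every page gets an equal share of the budget."""
    if not pages:
        return ""
    per_page = total_budget // len(pages)
    parts = []
    for page in pages:
        header = f"\n\n--- {page['url']} ---\n"
        # Truncate header + text together so the total never exceeds the budget
        parts.append((header + page["markdown"])[:per_page])
    return "".join(parts)
```

Inside `analyze_company()`, `build_context(data['pages'])` could stand in for the `combined_content` loop and the later `[:15000]` slice.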


In [57]:
# Display: Show each company profile

for profile in company_profiles:
    label = "YOUR COMPANY" if profile.get('is_my_domain') else "COMPETITOR"
    print(f"\n{'='*50}")
    print(f"[{label}] {profile.get('domain', 'Unknown')}")
    print(f"{'='*50}")
    print(json.dumps(profile, indent=2))


[COMPETITOR] kaspr.io
{
  "company": "Kaspr",
  "value_proposition": "Kaspr provides sales, recruitment, and founding professionals with instant access to accurate, compliant, and verified European B2B contact data via an easy-to-use LinkedIn Chrome Extension and integrated platform, enabling faster lead generation, seamless workflow integration, and pipeline growth.",
  "features": [
    "Accurate contact data access for over 200 million profiles including 500M+ verified phone numbers and emails",
    "LinkedIn Chrome Extension to extract contact information directly from prospect profiles",
    "Bulk data enrichment for lead lists with real-time verified data from 150+ sources",
    "Native integrations with leading CRMs (HubSpot, Salesforce, Pipedrive, Zoho CRM) and sales tools (Lemlist, Ringover, Aircall, Brevo, Zapier)",
    "All-in-one prospect management dashboard with automation capabilities",
    "GDPR and CCPA compliant data sourcing and handling",
    "No onboarding require

In [58]:
# Execute: Generate competitive comparison of all companies

print("="*60)
print("COMPETITIVE COMPARISON")
print("="*60 + "\n")

profiles_summary = json.dumps(company_profiles, indent=2)

response = openai_client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=[
        {
            "role": "system",
            "content": f"""Compare these companies against {MY_BRAND}. Create a competitive analysis:
1. Comparison table (features, pricing, target market)
2. Each competitor's strengths vs {MY_BRAND}
3. {MY_BRAND}'s competitive advantages
4. Areas where {MY_BRAND} could improve

Be direct and actionable."""
        },
        {
            "role": "user",
            "content": f"Company profiles:\n{profiles_summary}"
        }
    ],
    max_tokens=1500
)

comparison_analysis = response.choices[0].message.content
print(comparison_analysis)

# Store for executive summary
deep_insights = {
    'company_profiles': company_profiles,
    'comparison': comparison_analysis
}

COMPETITIVE COMPARISON

### 1. Comparison Table

| Company    | Key Features                                                                                                     | Pricing                                     | Target Market                                   | Unique Differentiators                                      |
|------------|-----------------------------------------------------------------------------------------------------------------|---------------------------------------------|------------------------------------------------|-------------------------------------------------------------|
| **Lusha**  | - B2B Contact & Company Search<br>- Chrome Extension<br>- Buyer Intent Data & Signals<br>- AI Lead Streaming<br>- Automation for email & meeting analysis<br>- API & CRM integrations<br>- Multi-channel prospecting | Flexible plans, free option, custom enterprise pricing; details on website                          | Revenue teams (Sales, RevOps, Marketing, Recr

---
## Section 5: Executive Summary

This section compiles all findings into a comprehensive competitive intelligence report.

In [63]:
# Execute: Compile all data into a summary structure

executive_summary = {
    'brand': MY_BRAND,
    'domain': MY_DOMAIN,
    'market': COUNTRY,
    'generated_at': time.strftime('%Y-%m-%d %H:%M:%S'),

    # SERP Rankings
    'serp': {
        'keywords_tracked': len(rankings['keywords']),
        'rankings': {kw: data['my_position'] for kw, data in rankings['keywords'].items()},
        'main_competitors': [c['domain'] for c in rankings['main_competitors'][:3]]
    },

    # AI Perception
    'ai_perception': {
        'engines_queried': sorted({r['engine_name'] for r in ai_results}),  # unique engine names
        'mentions': sum(1 for r in ai_results if r.get('mentioned')),
        'total_queries': len(ai_results)
    },

    # Competitor Insights
    'competitor_insights': competitor_insights,

    # Deep Analysis (now contains company_profiles and comparison)
    'deep_insights': deep_insights
}

print("Executive summary data compiled.")
print(f"\nSections included:")
print(f"  - SERP Rankings: {executive_summary['serp']['keywords_tracked']} keywords")
print(f"  - AI Perception: {executive_summary['ai_perception']['mentions']}/{executive_summary['ai_perception']['total_queries']} mentions")
print(f"  - Competitor Insights: {len(competitor_insights)} pages")
print(f"  - Company Profiles: {len(deep_insights.get('company_profiles', []))} companies")

Executive summary data compiled.

Sections included:
  - SERP Rankings: 3 keywords
  - AI Perception: 9/12 mentions
  - Competitor Insights: 4 pages
  - Company Profiles: 4 companies


In [64]:
# Execute: GPT generates comprehensive executive report

if openai_client:
    print("Generating executive report with GPT...\n")

    # Build context from all sections
    full_context = f"""
COMPETITIVE INTELLIGENCE REPORT
Brand: {MY_BRAND}
Domain: {MY_DOMAIN}
Market: {COUNTRY}

=== SERP RANKINGS ===
"""
    for kw, data in rankings['keywords'].items():
        pos = data['my_position'] or "Not in top 10"
        full_context += f"\n'{kw}': Position {pos}"

    full_context += f"\n\nMain Competitors: {', '.join(executive_summary['serp']['main_competitors'])}"

    full_context += "\n\n=== AI PERCEPTION ==="
    full_context += f"\nMentioned in {executive_summary['ai_perception']['mentions']}/{executive_summary['ai_perception']['total_queries']} AI engine queries"

    full_context += "\n\n=== COMPANY PROFILES ==="
    for profile in deep_insights.get('company_profiles', []):
        label = "(YOUR COMPANY)" if profile.get('is_my_domain') else ""
        full_context += f"\n{profile.get('domain', 'Unknown')} {label}"
        full_context += f"\n  Value prop: {profile.get('value_proposition', 'N/A')}"
        full_context += f"\n  Pricing: {profile.get('pricing', 'N/A')}"

    full_context += "\n\n=== COMPETITIVE COMPARISON ==="
    full_context += f"\n{deep_insights.get('comparison', 'N/A')[:500]}..."

    # Generate executive report
    response = openai_client.chat.completions.create(
        model="gpt-4.1-mini",
        messages=[
            {
                "role": "system",
                "content": """You are a senior competitive intelligence analyst. Generate a concise executive report with:
1. Overall competitive position (1-2 sentences)
2. Key strengths (2-3 bullets)
3. Areas for improvement (2-3 bullets)
4. Top 3 recommended actions

Be direct and actionable."""
            },
            {
                "role": "user",
                "content": f"Generate an executive report based on this data:\n{full_context}"
            }
        ],
        max_tokens=2000
    )

    executive_report = response.choices[0].message.content

    print("="*60)
    print(f"EXECUTIVE REPORT: {MY_BRAND}")
    print(f"Generated: {executive_summary['generated_at']}")
    print("="*60)
    print()
    print(executive_report)
    print()
    print("="*60)
else:
    print("Skipping executive report (no OpenAI API key configured)")

Generating executive report with GPT...

EXECUTIVE REPORT: Lusha
Generated: 2026-01-14 14:59:05

Executive Competitive Intelligence Report: Lusha

1. Overall Competitive Position
Lusha holds a credible position as an AI-powered sales intelligence platform but currently lacks prominent visibility in key SERP rankings such as "lead generation software" and "B2B contact database," limiting its market discoverability relative to competitors.

2. Key Strengths
- Offers AI-enhanced B2B data enriched with real-time buying signals and AI workflows, supporting accelerated pipeline growth.
- Flexible pricing models including free starter options and customizable enterprise plans cater to a broad customer base.
- Mentioned in the majority of AI-related search queries, enhancing technological thought leadership perception.

3. Areas for Improvement
- Suboptimal search engine ranking in critical keywords diminishes organic lead generation and brand exposure.
- Pricing transparency is moderate; ente