# SEO visibility and content gap

Given an AI visibility gap in a certain topic, this notebook:

- checks if this is due to missing content, or content not being indexed/positioned well by Google
    - extracts (from datocat) or derives (with LLM) N Google keywords related to the prompt cluster
    - compares visibility in Google with competitors
    - TODO: checks for additional (non-ranked) content using _branded_ keywords (e.g. ayudas coches eléctricos Peugeot)
- checks own/competitor's web content that _is_ positioned
    - deeply analyses competitor web content:
        - presence of structured data, listicles, FAQs etc.
        - entities mentioned
        - internal and external linking
    - TODO: compares with own brand content
    - TODO: recommends content to create/modify

ChatGPT context: https://chatgpt.com/share/6925a9d4-f81c-8005-8362-9ca5e8d57c4d

In [94]:
import { load } from "@std/dotenv";

import * as pl from "npm:nodejs-polars";
import * as Plot from "npm:@observablehq/plot";
import { document } from "jsr:@manzt/jupyter-helper";
import { z } from '@zod/zod';

import { askOpenAISafe } from "shared/openai.ts";

import * as utils from "../../src/utils.ts?v=8";
import * as brands from "../../src/brands.ts?v=4";
import * as scrape from "../../src/apis/hasdata/scrape.ts?v=1";
import * as serp from "../../src/apis/hasdata/serp.ts?v=1";
import * as surfaceGap from "../../src/analysis/surfaceGap.ts?v=19";

void await load({
  envPath: "../../.env",
  export: true,
});

const CACHE = "./contentGap2.json"

In [2]:
const { md, html, display } = Deno.jupyter;

In [3]:
// await utils.clearCache(CACHE);

In [9]:
let CONFIG = {
    brandDomain: "peugeot.es",
    sector: "coches eléctricos y SUVs",
    country: "es",
    language: "es",
}

# Brand and competitors

We compare the organic visibility gap between the brand and its competitors.

In [10]:
let regenBrands = false;
let briefing = "Include at least these brands: Toyota, Tesla, Nissan, Renault, Hyundai, Kia, MG, BYD, Omoda, Volkswagen, BMW, Audi, Mercedes, Fiat, Opel, Citroën, Seat, Honda";

let brs = await utils.fromCache(CACHE, 'brands') as brands.FlaggedBrand[] | null;
if (!brs || regenBrands) {
    console.log('Generating brand data...');
    const brand = await brands.generateBrandInfo({
        brandDomain: CONFIG.brandDomain,
        language: CONFIG.language,
        sector: CONFIG.sector,
        market: CONFIG.country,
    });
    const competitors = await brands.generateCompetitorsInfo({
        brandDomain: CONFIG.brandDomain,
        language: CONFIG.language,
        sector: CONFIG.sector,
        market: CONFIG.country,
        briefing: briefing
    });
    brs = brands.concatBrands([brand], competitors);
    await utils.toCache(CACHE, brs, 'overwrite', 'brands');
} else {
    console.log('Loaded brand data from cache');
}

Loaded brand data from cache


In [11]:
brs.map(b => b.shortName)

[
  "Peugeot",  "Toyota",
  "Tesla",    "BYD",
  "Hyundai",  "Kia",
  "Renault",  "Volkswagen",
  "BMW",      "Audi",
  "Mercedes", "Honda",
  "MG"
]

# Define topic queries

Define or generate (with LLM) a number of search queries for the topic (or prompt cluster) we're investigating.

## Generate

In [7]:
let regenQueries = false;

const nQueries = 50;
const queriesPrompt = `
Estoy investigando la visibilidad de una marca de coches en respuestas de LLMs y IAs (Google AI Overview and Mode, ChatGPT).
Sospecho que se posicionan mal (posiblemente por falta de contenido) en temas de subvenciones y ayudas estatales a las compras
de coches (especialmente eléctricos). También les interesa la visibilidad en este tema de los SUVs. Dame ${nQueries} prompts
relacionados con el tema que puedo usar para medir la visibilidad en IAs y para medir un posible content gap
`.trim();

let queries = await utils.fromCache(CACHE, 'queries');
if (!queries || regenQueries) {
    console.log('Generating topic queries...');
    queries = await surfaceGap.generateTopicQueries(
        queriesPrompt,
        nQueries,
        CONFIG.language
    );
    await utils.toCache(CACHE, queries, 'overwrite', 'queries');
}
else {
    console.log('Loaded queries from cache');
}

Loaded queries from cache


In [8]:
queries

[
  "ayudas compra coche eléctrico",
  "subvenciones coche nuevo 2025",
  "plan moves coches eléctricos",
  "incentivos fiscales vehículo eléctrico",
  "descuentos estatales coches",
  "ayudas coche híbrido enchufable",
  "subvenciones renovación coche viejo",
  "programas gobierno movilidad eléctrica",
  "bonificaciones compra suv eléctrico",
  "ayudas suv etiqueta cero",
  "subvenciones suv etiqueta eco",
  "incentivos compra suv híbrido",
  "plan renove coches suvs",
  "ayudas achatarramiento coche antiguo",
  "subvenciones vehículo baja emisión",
  "ayudas coche familiar eléctrico",
  "ayudas estatales coche urbano",
  "subvenciones coche segunda mano",
  "financiación pública coche eléctrico",
  "ayudas punto recarga doméstico",
  "subvenciones instalación cargador coche",
 

## From keywords

In [14]:
let kwdPrompts = await utils.fromCache("./kwdPrompts.json");
let kwds = Object.keys(kwdPrompts) as string[];
let queries = Object.values(kwdPrompts) as string[];
queries

[
  "¿Qué ayudas existen para comprar un coche eléctrico en España?",
  "¿En qué consiste el Plan MOVES III y qué ayudas ofrece?",
  "¿Qué ayudas hay actualmente para la compra de un coche eléctrico?",
  "¿Qué subvenciones están disponibles para coches eléctricos?",
  "¿Qué subvención puedo obtener para instalar un cargador de coche eléctrico?",
  "¿Qué ayudas existen para comprar un coche híbrido?",
  "¿Qué opciones de ayudas hay para comprar un coche eléctrico?",
  "¿Qué ayudas existen para coches híbridos no enchufables?",
  "¿Qué ayudas o subvenciones hay para instalar un cargador de coche eléctrico en casa?",
  "¿Incluye el Plan MOVES III ayudas para híbridos no enchufables?",
  "¿Qué ayudas están disponibles actualmente para coches híbridos?",
  "¿Qué ayuda puedo solicitar para instalar un cargador de coche eléctrico?",
  "¿Qué incluye el programa de ayudas

# SERPs

Given the search queries, get the SERPs and optionally expand with the PeopleAlsoAsk and RelatedQuestion components of the results.

In [15]:
let regenSerps = false;

let serps = await utils.fromCache(CACHE, 'serps') as Array<serp.SerpResponse> | null;
if (!serps || regenSerps) {
    console.log('Fetching Serps...');
    serps = await surfaceGap.serps(
        kwds,
        CONFIG.country,
        CONFIG.country,
        CONFIG.language
    );
    await utils.toCache(CACHE, serps, 'overwrite', 'serps');
}
else {
    console.log('Loaded Serps from cache');
}

Fetching Serps...


In [10]:
// TODO: Expand SERPs with relatedQuestions and PAA
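
The expansion step is still a TODO. A minimal sketch of how it could work is shown below, assuming each SERP exposes optional `relatedQuestions` and `peopleAlsoAsk` arrays. The `SerpLike` shape and the `expandQueries` helper are hypothetical and are not taken from `serp.SerpResponse`, whose real field names may differ.

```typescript
// Hypothetical, simplified SERP shape; the real serp.SerpResponse may differ.
interface SerpLike {
  query: string;
  relatedQuestions?: Array<{ question: string }>;
  peopleAlsoAsk?: Array<{ question: string }>;
}

// Collect follow-up questions from a batch of SERPs, skipping anything that is
// already a known query (case-insensitive) and de-duplicating the new ones.
function expandQueries(serps: SerpLike[], known: string[]): string[] {
  const seen = new Set<string>(known.map((q) => q.trim().toLowerCase()));
  const extra: string[] = [];
  for (const s of serps) {
    const candidates = (s.relatedQuestions ?? []).concat(s.peopleAlsoAsk ?? []);
    for (const c of candidates) {
      const text = c.question.trim();
      const key = text.toLowerCase();
      if (key.length > 0 && !seen.has(key)) {
        seen.add(key);
        extra.push(text);
      }
    }
  }
  return extra;
}

// Illustrative input only (not real SERP data).
const demoSerps: SerpLike[] = [
  {
    query: "plan moves coches eléctricos",
    relatedQuestions: [{ question: "¿Qué es el Plan MOVES III?" }],
    peopleAlsoAsk: [
      { question: "¿qué es el plan moves iii?" },
      { question: "plan moves coches eléctricos" },
    ],
  },
  { query: "ayudas compra coche eléctrico" },
];
const expanded = expandQueries(demoSerps, demoSerps.map((s) => s.query));
console.log(expanded);
```

The extra questions could then be fed back into the SERP fetch as a second batch, at the cost of additional API calls.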

# Scrape ChatGPT

Repeat N times for better confidence

In [16]:
import * as gpt from "../../src/apis/brightdata.ts?v=2";
import type { ModelResult } from '../../src/schemas/models.schema.ts';

let regenGPT = false;
let gptResults = await utils.fromCache(CACHE, 'gptResults') as Array<ModelResult> | null;
if (!gptResults || regenGPT) {
    console.log('Fetching ChatGPT responses...');
    gptResults = await gpt.scrapeGPTBatch({
        prompts: queries,
        countryISOCode: CONFIG.country.toUpperCase(),
        useSearch: true,
    });
    await utils.toCache(CACHE, gptResults, 'overwrite', 'gptResults');
} else {
    console.log('Loaded ChatGPT responses from cache');
}

Fetching ChatGPT responses...
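
The cell above runs the batch once. As a rough sketch of the "repeat N times" idea, the helper below pools the response texts of several repetitions and reports the share of responses that mention a brand. The `mentionRate` function and the plain-string response shape are assumptions for illustration and are not part of `ModelResult`.

```typescript
// runs[i] holds the response texts collected in the i-th repetition of the prompt batch.
// Returns the fraction of all responses that mention the brand (case-insensitive substring match).
function mentionRate(runs: string[][], brand: string): number {
  const needle = brand.toLowerCase();
  let total = 0;
  let hits = 0;
  for (const run of runs) {
    for (const text of run) {
      total += 1;
      if (text.toLowerCase().indexOf(needle) !== -1) hits += 1;
    }
  }
  return total === 0 ? 0 : hits / total;
}

// Made-up response texts, for illustration only.
const demoRuns: string[][] = [
  ["Toyota y Kia publican guías sobre el Plan MOVES III.", "Renault explica las ayudas."],
  ["Kia resume las subvenciones.", "Toyota ofrece una calculadora de ayudas."],
];
console.log(mentionRate(demoRuns, "Toyota"), mentionRate(demoRuns, "Peugeot"));
```

Substring matching is deliberately naive: short brand names such as "MG" can match inside unrelated words, so a word-boundary check or entity extraction would be safer in practice.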


# Visibility (non-branded)

Analyze visibility of brand and competitors in SERP organic results and AI Overview if available

In [95]:
let vis = await surfaceGap.visibility(serps, brs);
let visDf: pl.DataFrame = pl.readRecords(vis);

In [96]:
let gptVis = await surfaceGap.gptPresence(gptResults, brs);
let gptVisDf: pl.DataFrame = pl.readRecords(gptVis);

In [98]:
visDf

name,organicCount,aioResponseCount,aioCitedCount
Kia,18,0,30
Toyota,9,0,9
Renault,7,0,16
Tesla,3,0,0
Volkswagen,1,0,3
Peugeot,0,0,0
BYD,0,0,0
Hyundai,0,1,0
BMW,0,0,0
Audi,0,0,3
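
The implementation of `surfaceGap.visibility` is not shown in this notebook. As a hedged illustration of what an `organicCount` column could represent, the sketch below counts organic result links whose hostname belongs to each brand's domain. The `OrganicResult` and `BrandLike` shapes and both helper names are hypothetical.

```typescript
// Hypothetical minimal shapes; the real SERP and brand types may differ.
interface OrganicResult { link: string }
interface BrandLike { name: string; domain: string }

// True when the link's hostname is the brand domain or one of its subdomains.
function hostMatches(link: string, domain: string): boolean {
  let host: string;
  try {
    host = new URL(link).hostname.toLowerCase();
  } catch (_e) {
    return false; // not an absolute URL
  }
  const d = domain.toLowerCase();
  return host === d || host.endsWith("." + d);
}

// Count, per brand, how many organic results point to that brand's site.
function organicCounts(results: OrganicResult[], brands: BrandLike[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const b of brands) {
    counts[b.name] = results.filter((r) => hostMatches(r.link, b.domain)).length;
  }
  return counts;
}

const demoResults: OrganicResult[] = [
  { link: "https://www.toyota.es/calculadora-ayudas-coches-electricos" },
  { link: "https://www.toyota.es/coches/vehiculos-etiqueta-cero" },
  { link: "https://www.tesla.com/es_es/support/incentives" },
  { link: "https://www.idae.es/ayudas-y-financiacion/para-movilidad-y-vehiculos/moves-iii-2025" },
];
const demoBrands: BrandLike[] = [
  { name: "Peugeot", domain: "peugeot.es" },
  { name: "Toyota", domain: "toyota.es" },
  { name: "Tesla", domain: "tesla.com" },
];
const demoCounts = organicCounts(demoResults, demoBrands);
console.log(demoCounts);
```

Matching on the hostname suffix rather than on a plain substring avoids counting unrelated domains that merely contain the brand name.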


In [52]:
let allVis = vis.map(v => {
    let gptV = gptVis.find(gv => gv.name === v.name);
    return {
        ...v,
        gptResponseCount: gptV ? gptV.gptResponseCount : 0,
        gptCitedCount: gptV ? gptV.gptCitedCount : 0,
        gptReferenceCount: gptV ? gptV.gptReferenceCount : 0,
    }
});

let allVisDf: pl.DataFrame = pl.readRecords(allVis);
await display(allVisDf);
allVisDf.select(pl.spearmanRankCorr("organicCount", "aioCitedCount"));

name,organicCount,aioResponseCount,aioCitedCount,gptResponseCount,gptCitedCount,gptReferenceCount
Kia,18,0,30,0,2,9
Toyota,9,0,9,0,13,18
Renault,7,0,16,1,3,8
Tesla,3,0,0,0,0,0
Volkswagen,1,0,3,0,0,6
Peugeot,0,0,0,0,0,0
BYD,0,0,0,0,0,0
Hyundai,0,0,0,0,0,0
BMW,0,0,0,0,0,0
Audi,0,0,3,0,1,2


organicCount
0.7942767628710199
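
For reference, Spearman's rank correlation is the Pearson correlation of the two columns after replacing each value with its rank, where tied values share the average of their positions. The standalone sketch below illustrates that definition. It is not the polars implementation, and its handling of ties and constant columns is an assumption that may differ from `pl.spearmanRankCorr` in edge cases.

```typescript
// Assign 1-based ranks; tied values receive the average of the positions they occupy.
function averageRanks(values: number[]): number[] {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks: number[] = new Array<number>(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const avg = (start + end) / 2 + 1; // mean of positions start+1 .. end+1
    for (let k = start; k <= end; k++) ranks[order[k].i] = avg;
    start = end + 1;
  }
  return ranks;
}

// Pearson correlation; evaluates to NaN when either input has zero variance.
function pearson(x: number[], y: number[]): number {
  const n = x.length;
  const mean = (v: number[]): number => v.reduce((a, b) => a + b, 0) / n;
  const mx = mean(x);
  const my = mean(y);
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - mx;
    const dy = y[i] - my;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Spearman's rho: Pearson correlation computed on the ranks.
function spearman(x: number[], y: number[]): number {
  if (x.length !== y.length) throw new Error("inputs must have equal length");
  return pearson(averageRanks(x), averageRanks(y));
}

console.log(spearman([1, 2, 3, 4], [10, 20, 30, 40])); // perfectly monotonic increasing
console.log(spearman([1, 2, 3, 4], [40, 30, 20, 10])); // perfectly monotonic decreasing
```

With only a handful of brands and many tied zeros, as in the table above, the coefficient is sensitive to single rows, so it is best read as a rough indication rather than a precise estimate.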


In [92]:
// Calculate the pairwise Spearman correlation for every pair of numeric columns
// (full matrix, diagonal included). NaN results are kept.
let numericCols = ["organicCount", "aioResponseCount", "aioCitedCount", "gptResponseCount", "gptCitedCount", "gptReferenceCount"];

let corrMatrix: Array<{ col1: string; col2: string; correlation: number }> = [];
for (let i = 0; i < numericCols.length; i++) {
    for (let j = 0; j < numericCols.length; j++) {
        let corr = allVisDf
            .select(pl.spearmanRankCorr(numericCols[i], numericCols[j]))
            .row(0)[0] as number;
        corrMatrix.push({
            col1: numericCols[j],
            col2: numericCols[i],
            correlation: corr
        });
    }
}

let corrDf = pl.readRecords(corrMatrix).pivot({ on: "col2", index: "col1", values: "correlation" });
// await display(corrDf);

// Visualize as heatmap (full matrix; NaN cells are left unlabeled)
Plot.plot({
    document,
    marks: [
        Plot.cell(corrMatrix, {
            x: "col1",
            y: "col2",
            fill: "correlation",
            tip: true,
        }),
        Plot.text(corrMatrix, {
            x: "col1",
            y: "col2",
            text: d => Number.isFinite(d.correlation) ? d.correlation.toFixed(2) : "",
            fill: d => Math.abs(d.correlation) > 0.5 ? "white" : "black",
            fontSize: 10,
        })
    ],
    color: {
        scheme: "RdBu",
        domain: [-1, 1], // Spearman's rho ranges from -1 to 1
    },
    x: { tickRotate: -45, label: null, domain: numericCols },
    y: { label: null, domain: numericCols },
    title: "Spearman Rank Correlation Matrix",
    style: { backgroundColor: "white" },
    marginBottom: 80,
    marginLeft: 100,
});

In [58]:
let xvar = "organicCount";
let yvar = "gptReferenceCount";

Plot.plot({
    document,
    marks: [
        Plot.dot(allVis, {
            x: xvar,
            y: yvar,
            tip: true,
            fill: "currentColor",
        }),
        Plot.text(allVis, {
            x: xvar,
            y: yvar,
            text: "name",
            dy: -10,
            fontSize: 10,
        })
    ],
    x: { type: "band" },
    y: { grid: true },
    title: `Content Gap Analysis: ${xvar} vs ${yvar}`,
    style: { backgroundColor: "white" },
});

# Branded visibility

Do we rank at all, or can we find any relevant content, when the brand name is added to the topic queries?

In [13]:
// TODO
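
A minimal sketch of how the branded queries could be derived, assuming the brand's short name is simply appended to each topic query (as in "ayudas coches eléctricos Peugeot"). The `brandedQueries` helper is hypothetical.

```typescript
// Append the brand name to each query unless the query already mentions it.
function brandedQueries(queries: string[], brand: string): string[] {
  const needle = brand.toLowerCase();
  return queries.map((q) =>
    q.toLowerCase().indexOf(needle) !== -1 ? q : `${q} ${brand}`
  );
}

const demoBranded = brandedQueries(
  ["ayudas coches eléctricos", "plan moves peugeot"],
  "Peugeot",
);
console.log(demoBranded);
// → ["ayudas coches eléctricos Peugeot", "plan moves peugeot"]
```

The resulting list could then go through the same SERP and visibility steps as the non-branded queries.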

# Scrape content

In [51]:
let regenContent = false;

type ContentType = {
    urls: Record<string, Array<string>>;
    content: Record<string, scrape.ScrapeResponse>;
};

let content = await utils.fromCache(CACHE, 'competitorContent') as ContentType | null;
if (!content || regenContent) {
    console.log('Scraping SERP URL content...');
    let brandUrls = surfaceGap.extractBrandUrls(serps, brs);
    let flatUrls = Object.values(brandUrls).flat();
    // Use a separate name for the scrape results: re-declaring `content` here
    // would shadow the outer variable and leave it null after this block.
    let scrapedPages = await scrape.scrapeWebBatch(
        flatUrls,
        {
            formats: ['text', 'markdown', 'html'],
            jsRendering: true,
        }
    );
    const urlToContent = Object.fromEntries(
        flatUrls.map((url, i) => [url, scrapedPages[i]])
    );

    content = { urls: brandUrls, content: urlToContent };
    await utils.toCache(CACHE, content, 'overwrite', 'competitorContent');
} else {
    console.log('Loaded competitor content from cache');
}

Loaded competitor content from cache


In [52]:
let brandUrls = content.urls;
let urlContent = content.content;

In [54]:
brandUrls

{
  Peugeot: [],
  Toyota: [
    "https://www.toyota.es/world-of-toyota/articles-news-events/como-desgravarse-compra-coche-electrico",
    "https://www.toyota.es/calculadora-ayudas-coches-electricos",
    "https://www.toyota.es/world-of-toyota/articles-news-events/plan-moves-iii-ayudas-comprar-electrico-hibrido-enchufable-toyota",
    "https://www.toyota.es/world-of-toyota/articles-news-events/ayudas-plan-360-madrid",
    "https://www.toyota.es/world-of-toyota/articles-news-events/beneficios-fiscales-flota-electrica-empresas",
    "https://www.toyota.es/coches/vehiculos-etiqueta-cero",
    "https://www.toyota.es/world-of-toyota/articles-news-events/ayudas-comprar-coche-electrico",
    "https://www.toyota.es/promociones/electrico-bateria",
    "https://www.toyota.es/promociones/toyota-bz4x-electric-4x2-advance-easy-plus"
  ],
  Tesla: [ "https://www.tesla.com/es_es/support/incentives" ],


# Analyze content

In [53]:
let currBrand = 'Peugeot';
let currBrandUrls = brandUrls[currBrand]
currBrandUrls

[]

# Content format analysis

In [17]:
import * as parse from "../../src/analysis/parseHtml.ts?v=203";

In [18]:
let currUrl = currBrandUrls[0];
console.log(`Analyzing content from URL: ${currUrl}`);
let currContent = urlContent[currUrl]?.content || "";
let $ = parse.html(currContent, currUrl);

Analyzing content from URL: undefined


In [19]:
// Manual URL for testing
let currUrl = "https://www.peugeot.es/electricos-e-hibridos/certificados-ahorro-energetico.html"
let scraped = await scrape.scrapeWeb(
    currUrl,
    {
        formats: ['text', 'markdown', 'html'],
        jsRendering: true,
    }
);
let currContent = scraped.content;
let $ = parse.html(currContent, currUrl);

## Structured Data: JSON-LD for Google rich results

In [20]:
let structData = parse.structuredData($);

$ = parse.main($);
void 0;

In [21]:
for (const item of structData) {
    console.log(JSON.stringify(item, null, 2));
}

In [22]:
let includedSchemas = parse.checkStructuredDataTypes(structData);
includedSchemas

{
  Article: false,
  Author: false,
  BlogPosting: false,
  BreadcrumbList: false,
  Event: false,
  FAQPage: false,
  HowTo: false,
  JobPosting: false,
  LocalBusiness: false,
  Organization: false,
  Person: false,
  Product: false,
  Recipe: false,
  Review: false,
  Service: false,
  SoftwareApplication: false,
  VideoObject: false,
  WebSite: false
}
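
How `parse.structuredData` and `parse.checkStructuredDataTypes` work internally is not shown in this notebook. As a rough, dependency-free sketch of the idea, the code below pulls `application/ld+json` script blocks out of raw HTML and collects the `@type` values they declare. The helper names `extractJsonLd` and `schemaTypes` are hypothetical, and a real HTML parser would be more robust than the regular expression used here.

```typescript
// Extract and parse every JSON-LD <script> block; malformed blocks are skipped.
function extractJsonLd(html: string): unknown[] {
  const re = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  const items: unknown[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(html)) !== null) {
    try {
      const parsed: unknown = JSON.parse(m[1]);
      if (Array.isArray(parsed)) {
        for (const p of parsed) items.push(p);
      } else {
        items.push(parsed);
      }
    } catch (_e) {
      // Ignore blocks that are not valid JSON.
    }
  }
  return items;
}

// Collect the distinct @type values at the top level and inside @graph arrays.
function schemaTypes(items: unknown[]): string[] {
  const found = new Set<string>();
  const visit = (node: unknown): void => {
    if (typeof node !== "object" || node === null) return;
    const rec = node as Record<string, unknown>;
    const t = rec["@type"];
    if (typeof t === "string") found.add(t);
    if (Array.isArray(t)) {
      for (const x of t) if (typeof x === "string") found.add(x);
    }
    const graph = rec["@graph"];
    if (Array.isArray(graph)) {
      for (const g of graph) visit(g);
    }
  };
  for (const item of items) visit(item);
  return Array.from(found).sort();
}

// Illustrative HTML only.
const demoHtml = `<html><head>
<script type="application/ld+json">{"@context":"https://schema.org","@type":"FAQPage","mainEntity":[]}</script>
<script type="application/ld+json">{"@graph":[{"@type":"Organization"},{"@type":["Article","NewsArticle"]}]}</script>
<script type="application/ld+json">{not json}</script>
</head></html>`;
const demoTypes = schemaTypes(extractJsonLd(demoHtml));
console.log(demoTypes);
```

A per-type presence map like the one printed above could then be built by checking a fixed list of schema names against the returned array.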

## Headings

In [23]:
let headings = parse.headings($);
for (let h of headings) {
    console.log(`${h.tag}: ${h.text}`);
}

h1: CAE: 1.000€ POR PASARTE AL ELÉCTRICO
h2: AHORRA 1.000€ POR CAMBIAR TU COCHE DE COMBUSTIÓN
POR UN PEUGEOT 100% ELÉCTRICO
h3: ¿Qué son los CAEs?
h3: ¿Qué modelos de Peugeot están incluidos?
h6: E-TRAVELLER
h6: E-PARTNER
h6: E-EXPERT
h6: E-BOXER
h6: E-208
h6: E-2008
h6: NUEVO E-308
h6: NUEVO E-308 SW
h6: E-3008
h6: E-408
h6: E-5008
h6: E-RIFTER
h6: E-TRAVELLER
h6: E-PARTNER
h6: E-EXPERT
h6: E-BOXER
h6: E-208
h6: E-2008
h6: NUEVO E-308
h6: NUEVO E-308 SW
h6: E-3008
h6: E-408
h6: E-5008
h6: E-RIFTER
h6: E-TRAVELLER
h6: E-PARTNER
h6: E-EXPERT
h6: E-BOXER
h3: ¿Cuánto puedes recibir?
h3: ¿Cómo beneficiarte del CAE con PEUGEOT?
 
Los CAE, paso a paso
h3: COMPATIBILIDAD CON OTRAS AYUDAS
h3: Beneficios de la bonificación CAE
h3: FAQ: PREGUNTAS Y RESPUESTAS
h3: ¿Cuánto puedes ahorrar?
h3: ¿Cómo beneficiarte del CAE con Peugeot?
h3: ¿Cuál es la documentación necesaria para poder aplicar la bonificación CAE con Peugeot?
h3: ¿Que son los beneficios de las bonificaciones CAE?
h3: ¿Dónde puedo enco

## Paragraphs

In [24]:
let paragraphs = parse.paragraphs($);
console.log(`Total paragraphs: ${paragraphs.length}`);
console.log('Sample paragraphs:');
for (let p of paragraphs.slice(0, 5)) {
    console.log(`- ${p.slice(0, 100)}...`);
}

Total paragraphs: 171
Sample paragraphs:
- Peugeot se suma al nuevo sistema de Certificados de Ahorro Energético (CAE), una iniciativa respalda...
- Recibe una bonificación de 1.000 € (IVA incluido) por cambiar tu coche de combustión por un vehículo...
- Los Certificados de Ahorro Energético (CAE) son un mecanismo nacional que reconoce y recompensa los ...
- Cuando eliges un Peugeot eléctrico, estás contribuyendo a una movilidad más limpia y eso tiene recom...
- Con el programa CAE, recibirás una bonificación económica directa de 1000€, gestionada íntegramente ...


## Lists

In [25]:
let lists = parse.lists($);
console.log(`Total lists: ${lists.length}`);
console.log('Sample lists:');
for (let l of lists.slice(0, 5)) {
    console.log(`\n- List with ${l.items.length} items: ${l.contextHeading}, URL: ${l.contextLink}`);
    for (let item of l.items.slice(0, 5)) {
        console.log(`  - ${item.slice(0, 100)}...`);
    }
}

Total lists: 5
Sample lists:

- List with 12 items: E-BOXER, URL: undefined
  - Go to slide 1...
  - Go to slide 2...
  - Go to slide 3...
  - Go to slide 4...
  - Go to slide 5...

- List with 4 items: ¿Cómo beneficiarte del CAE con Peugeot?, URL: undefined
  - Acude a tu concesionario Peugeot más cercano...
  - Elige tu vehículo eléctrico...
  - El concesionario gestionará el expediente CAE...
  - Esta bonificación se entrega mediante una factura independiente, gestionada por el concesionario....

- List with 1 items: ¿Que son los beneficios de las bonificaciones CAE?, URL: undefined
  - Reducen el coste de adquisición:...

- List with 1 items: ¿Que son los beneficios de las bonificaciones CAE?, URL: undefined
  - Incentivan la transición a la movilidad eléctrica:...

- List with 1 items: ¿Que son los beneficios de las bonificaciones CAE?, URL: undefined
  - Contribuyen a la reducción del consumo de energía:...


## Tables

In [26]:
let tables = parse.tables($);
console.log(`Total tables: ${tables.length}`);
tables[0]

Total tables: 0


## Links

In [27]:
let links = parse.links($, true);
let linksDf = pl.DataFrame(links).sort("isExternal", true);
linksDf

text,href,isExternal,isPdf
página web,https://www.miteco.gob.es/es/energia/eficiencia/cae.html,True,False
Información MOVES III 2025,https://www.idae.es/ayudas-y-financiacion/para-movilidad-y-vehiculos/moves-iii-2025,True,False
Convocatorias de las Comunidades Autónomas,https://www.idae.es/ayudas-y-financiacion/para-movilidad-y-vehiculos/moves-iii-2025/convocatorias-de-las-comunidades-autonomas,True,False
Guía para acceder a las ayudas,https://www.idae.es/sites/default/files/documentos/ayudas_y_financiacion/MOVES_III/MOVES_III_GUIA_CIUDADANOS_VEHICULOS_28022025.pdf,True,True
Agencia Tributaria: Deducción IRPF por la adquisición de vehículos eléctricos,https://sede.agenciatributaria.gob.es/Sede/vehiculos-embarcaciones/deduccion-irpf-adquisicion-vehiculos-electricos/deduccion-adquisicion-vehiculos-electricos-enchufables.html,True,False
CONFIGURA Y COMPRA,https://store.peugeot.es/trim/configurable/traveller-standard?fuel=El%C3%A9ctrico,False,False
CONFIGURA Y COMPRA,https://store.peugeot.es/trim/configurable/208-5-puertas?fuel=El%C3%A9ctrico,False,False
CONFIGURA Y COMPRA,https://store.peugeot.es/trim/configurable/2008-suv?fuel=El%C3%A9ctrico,False,False
CONFIGURA Y COMPRA,https://store.peugeot.es/trim/configurable/nuevo-308-5-puertas?fuel=El%C3%A9ctrico,False,False
CONFIGURA Y COMPRA,https://store.peugeot.es/trim/configurable/nuevo-308-sw?fuel=El%C3%A9ctrico,False,False
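
In the table above, `store.peugeot.es` links are treated as internal even though the page lives on `www.peugeot.es`. One way to get that behaviour, sketched here as an assumption rather than as the actual `parse.links` logic, is to compare the last two hostname labels of the link and the page. The `siteKey` and `classifyLink` helpers are hypothetical.

```typescript
interface LinkInfo { href: string; isExternal: boolean; isPdf: boolean }

// Naive "site" key: the last two hostname labels (peugeot.es for store.peugeot.es).
// This is wrong for multi-part public suffixes such as co.uk.
function siteKey(hostname: string): string {
  return hostname.toLowerCase().split(".").slice(-2).join(".");
}

// Resolve the href against the page URL, then flag external links and PDFs.
function classifyLink(href: string, pageUrl: string): LinkInfo {
  const url = new URL(href, pageUrl);
  return {
    href: url.href,
    isExternal: siteKey(url.hostname) !== siteKey(new URL(pageUrl).hostname),
    isPdf: url.pathname.toLowerCase().endsWith(".pdf"),
  };
}

const demoPage = "https://www.peugeot.es/electricos-e-hibridos/certificados-ahorro-energetico.html";
const demoLinks = [
  "https://www.miteco.gob.es/es/energia/eficiencia/cae.html",
  "https://www.idae.es/sites/default/files/documentos/ayudas_y_financiacion/MOVES_III/MOVES_III_GUIA_CIUDADANOS_VEHICULOS_28022025.pdf",
  "https://store.peugeot.es/trim/configurable/208-5-puertas?fuel=El%C3%A9ctrico",
  "/",
].map((href) => classifyLink(href, demoPage));
console.log(demoLinks);
```

For sites under multi-part suffixes, a public-suffix list lookup would be needed instead of the two-label shortcut.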


## Questions

In [28]:
let body = parse.bodyText($);
console.log(body.slice(0, 500));

CAE: 1.000€ POR PASARTE AL ELÉCTRICO CAE: 1.000€ POR PASARTE AL ELÉCTRICO AHORRA 1.000€ POR CAMBIAR TU COCHE DE COMBUSTIÓN POR UN PEUGEOT 100% ELÉCTRICO Peugeot se suma al nuevo sistema de Certificados de Ahorro Energético (CAE), una iniciativa respaldada por el Ministerio para la Transición Ecológica. Recibe una bonificación de 1.000 € (IVA incluido) por cambiar tu coche de combustión por un vehículo 100% eléctrico PEUGEOT. AHORRA 1.000€ POR CAMBIAR TU COCHE DE COMBUSTIÓN POR UN PEUGEOT 100% EL


In [29]:
let questions = body.match(/[^.!?]*\?/g) || [];
console.log(`Found ${questions.length} questions`);
for (let q of questions.slice(0, 10)) {
    console.log(`- ${q.trim()}`);
}

Found 11 questions
- ¿Qué son los CAEs?
- ¿Qué modelos de Peugeot están incluidos?
- 000 km en RENTING TODO INCLUIDO 900 € de ayuda en contratos ≥24 meses (CAES) Connect Fleet incluido Punto de carga easyWallbox incluido** Emisiones, Consumos y Condiciones Legales CONFIGURADOR PIDE UNA OFERTA Go to slide 1 Go to slide 2 Go to slide 3 Go to slide 4 Go to slide 5 Go to slide 6 Go to slide 7 Go to slide 8 Go to slide 9 Go to slide 10 Go to slide 11 Go to slide 12 ¿Cuánto puedes recibir?
- ¿Cómo beneficiarte del CAE con PEUGEOT?
- FAQ: PREGUNTAS Y RESPUESTAS ¿Cuánto puedes ahorrar?
- ¿Cómo beneficiarte del CAE con Peugeot?
- ¿Cuál es la documentación necesaria para poder aplicar la bonificación CAE con Peugeot?
- Declaración de responsabilidad ¿Que son los beneficios de las bonificaciones CAE?
- ¿Dónde puedo encontrar más detalles sobre los Certificados de Ahorro Energetico (CAEs)?
- ¿Se puede deducir también la instalación de un punto de carga?
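
Several matches above carry leading noise, because the pattern splits on every `.` (including the one in "1.000") and otherwise runs back to the previous sentence end. Since Spanish questions open with `¿`, a tighter pattern can anchor on that character. The sketch below assumes every question of interest uses the opening mark, so it will miss questions written without it. The helper name is hypothetical.

```typescript
// Match text from an opening "¿" up to the next "?", without crossing another "¿".
function extractSpanishQuestions(text: string): string[] {
  const matches: string[] = text.match(/¿[^¿?]+\?/g) || [];
  return matches.map((q) => q.replace(/\s+/g, " ").trim());
}

const demoBody =
  "CAE: 1.000€ POR PASARTE AL ELÉCTRICO ¿Qué son los CAEs? 900 € de ayuda en contratos ≥24 meses " +
  "Go to slide 12 ¿Cuánto puedes recibir? FAQ: PREGUNTAS Y RESPUESTAS ¿Cuánto puedes ahorrar?";
const demoQuestions = extractSpanishQuestions(demoBody);
console.log(demoQuestions);
// → ["¿Qué son los CAEs?", "¿Cuánto puedes recibir?", "¿Cuánto puedes ahorrar?"]
```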


## Forms

In [30]:
import * as parse from "../../src/analysis/parseHtml.ts?v=206";

let forms = parse.forms($);
console.log(`Total forms: ${forms.length}`);

Total forms: 0


In [31]:
forms[0];

## Structured content result

In [32]:
type StructuredContent = {
    schemas: Array<Record<string, unknown>>;
    headings: Array<{ tag: string; text: string }>;
    paragraphs: Array<string>;
    lists: Array<{ contextHeading: string; items: Array<string> }>;
    tables: Array<{ contextHeading: string; headers: Array<string>; rows: Array<Array<string>> }>;
    links: Array<{ href: string; text: string; isExternal: boolean }>;
    forms: Array<parse.Form>;
}

let structuredContent: StructuredContent = {
    schemas: structData,
    headings: headings,
    paragraphs: paragraphs,
    lists: lists,
    tables: tables,
    links: links,
    forms: forms
};

structuredContent

{
  schemas: [],
  headings: [
    { tag: "h1", text: "CAE: 1.000€ POR PASARTE AL ELÉCTRICO" },
    {
      tag: "h2",
      text: "AHORRA 1.000€ POR CAMBIAR TU COCHE DE COMBUSTIÓN\n" +
        "POR UN PEUGEOT 100% ELÉCTRICO"
    },
    { tag: "h3", text: "¿Qué son los CAEs?" },
    { tag: "h3", text: "¿Qué modelos de Peugeot están incluidos?" },
    { tag: "h6", text: "E-TRAVELLER" },
    { tag: "h6", text: "E-PARTNER" },
    { tag: "h6", text: "E-EXPERT" },
    { tag: "h6", text: "E-BOXER" },
    { tag: "h6", text: "E-208" },
    { tag: "h6", text: "E-2008" },
    { tag: "h6", text: "NUEVO E-308" },
    { tag: "h6", text: "NUEVO E-308 SW" },
    { tag: "h6", text: "E-3008" },
    { tag: "h6", text: "E-40

## Content categories

- Article, Blog Post, Listicle, Comparison (table), Calculator, Product page, Hub, How-to, News

In [435]:
let ContentTypeSchema = z.object({
    type: z.enum([
        "Article",
        "Blog Post",
        "Listicle",
        "Comparison",
        "Calculator",
        "Product Page",
        "Hub",
        "How-to",
        "News"
    ]).describe("Category best describing the web page. Select the most specific if applicable (e.g. Listicle, Product Page), otherwise more general (e.g. Article)."),
    reason: z.string().describe("Brief explanation (single phrase) of why this content type was assigned."),
}).describe("Categorization of the web page.");

let ContentElementSchema = z.object({
    elementType: z.enum([
        "List",
        "Listicle",
        "Comparison",
        "Calculator"
    ]).describe("Type of content element found on the page. Although a page may not be categorized specifically as a Listicle or Comparison, it may still contain such elements."),
    contextHeading: z.string().describe("The heading or section title under which this content element is found."),
}).describe("Specific content elements identified within the web page that contribute to its overall categorization.");

let StructuredContentSchema = z.object({
    types: z.array(ContentTypeSchema).describe("One or more content types. A web page can be both a general type (e.g. Article) and a more specific type (e.g. Listicle). If multiple types are assigned, they should be listed from most specific to most general. Do NOT include types that are not applicable."),
    elements: z.array(ContentElementSchema).describe("List of specific content elements identified on the page that support its categorization."),
}).describe("Structured representation of the web page's content type and its constituent elements based on HTML analysis.");

In [436]:
// ------ HTML based ------
let catPromptHTML = `
Analyze the following web page HTML content and check whether it belongs to one or more of the following categories:
Article, Blog Post, Listicle, Comparison, Calculator, Product page, Hub, How-to, News. Also check whether any of the following
content elements are present: List, Listicle, Comparison, Calculator. In the output provide the list of applicable categories
(with reason) and the list of detected content blocks (with heading and direct url if available).
To classify the overall page category (e.g. Article, Blog Post, Listicle etc.) focus on the content of the headings (h1, h2).
For Listicles and Comparisons, focus on the presence and content of lists or tables.
For product pages, focus on the presence of product information in the structured data, e.g. lists of products or offers,
or headings that suggest product listings. To detect a hub, focus on the quantity of internal links to related articles or sections.

## Base URI
${currUrl}

## HTML Content
{content}
`
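    // Function replacer: keeps `$&`, `$'` etc. inside the HTML from being read as replacement patterns
    .replace("{content}", () => $.html())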
    .trim();

let result = await askOpenAISafe(
    catPromptHTML,
    'gpt-5.1',
    StructuredContentSchema,
    { reasoning: { effort: 'low' } }
);

if (result.parsed) {
    console.log(result.parsed);
}


{
  types: [
    {
      type: "Product Page",
      reason: "Landing page promoting Peugeot’s 100% electric range with many model tiles, finance offers, configurator CTAs and “Pide una oferta” links, centered around the CAE bonus (h1 and h2 are promotional, not editorial)."
    },
    {
      type: "How-to",
      reason: "Contains clear procedural sections like “¿Cómo beneficiarte del CAE con PEUGEOT? Los CAE, paso a paso” with numbered steps explaining what to do."
    }
  ],
  elements: [
    {
      elementType: "List",
      contextHeading: "¿Cómo beneficiarte del CAE con PEUGEOT? / Los CAE, paso a paso"
    },
    {
      elementType: "List",
      contextHeading: "Beneficios de la bonificación CAE"
    },
    {
      elementType: "List",
      contextHeading: "¿Cómo beneficiarte del CAE con Peugeot?"
    },
    {
      elementType: "List",
      contextHeading: "¿Cuál es la documentación necesaria para poder aplicar la bonificación CAE con Peugeot?"
    },
    {
      elementTy

In [438]:
// ------ Structure based ------
let catPromptStruct = `
Analyze the following structured web page content and check whether it belongs to one or more of the following categories:
Article, Blog Post, Listicle, Comparison, Calculator, Product page, Hub, How-to, News. Also check whether any of the following
content elements are present: List, Listicle, Comparison, Calculator. In the output provide the list of applicable categories
(with reason) and the list of detected content blocks (with heading and direct url if available).
To classify the overall page category (e.g. Article, Blog Post, Listicle etc.) focus on the schemas and the content of the headings (h1, h2).
For Listicles and Comparisons, focus on the presence and content of lists or tables.
For product pages, focus on the presence of product information in the structured data, e.g. lists of products or offers,
or headings that suggest product listings.
To detect a hub, focus on the quantity of internal links to related articles or sections.

## Base URI
${currUrl}

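## Structured Content (JSON)
{content}
`
    // Function replacer: keeps `$`-sequences in the JSON from being read as replacement patterns
    .replace("{content}", () => JSON.stringify(structuredContent, null, 2))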
    .trim();

let result = await askOpenAISafe(
    catPromptStruct,
    'gpt-5.1',
    StructuredContentSchema,
    { reasoning: { effort: 'low' } }
);

if (result.parsed) {
    console.log(result.parsed);
}


{
  types: [
    {
      type: "How-to",
      reason: "The page explains how to obtain and benefit from the CAE (Certificados de Ahorro Energético) with step-by-step instructions under headings like “¿Cómo beneficiarte del CAE con Peugeot?” and related procedural FAQs."
    },
    {
      type: "Article",
      reason: "Provides general informational content about energy-saving certificates, compatible models, fiscal deductions, and related aids, structured with explanatory H2/H3 sections and FAQs rather than being primarily commercial product detail."
    }
  ],
  elements: [
    { elementType: "List", contextHeading: "E-BOXER" },
    {
      elementType: "List",
      contextHeading: "¿Cómo beneficiarte del CAE con Peugeot?"
    },
    {
      elementType: "List",
      contextHeading: "¿Que son los beneficios de las bonificaciones CAE?"
    },
    {
      elementType: "List",
      contextHeading: "¿Que son los beneficios de las bonificaciones CAE?"
    },
    {
      elementType: 

## Content element categories

In [439]:
import * as parse from "../../src/analysis/parseHtml.ts?v=207";

let classifiedLists = await parse.classifyElements(lists, 'list');
let classifiedTables = await parse.classifyElements(tables, 'table');
let classifiedForms = await parse.classifyElements(forms, 'form');

console.log(`Classified ${classifiedLists.length} lists, ${classifiedTables.length} tables, ${classifiedForms.length} forms`);

Classified 5 lists, 0 tables, 0 forms


In [441]:
classifiedLists

[
  {
    ordered: false,
    items: [
      "Go to slide 1",
      "Go to slide 2",
      "Go to slide 3",
      "Go to slide 4",
      "Go to slide 5",
      "Go to slide 6",
      "Go to slide 7",
      "Go to slide 8",
      "Go to slide 9",
      "Go to slide 10",
      "Go to slide 11",
      "Go to slide 12"
    ],
    contextHeading: "E-BOXER",
    classification: {
      type: "Navigation",
      confidence: "high",
      reason: "The list items serve as navigation links directing to specific slides."
    }
  },
  {
    ordered: true,
    items: [
      "Acude a tu concesionario Peugeot más cercano",
      "Elige tu vehículo eléctrico",
      "El concesionario gestionará el expediente CAE",
      "Esta bonificación se entrega mediante una factura independiente, gest

## Entities
Which entities are mentioned in well-ranking pages?

In [446]:
import * as entities from "../../src/entities.ts?v=6";

let instructions = `
Extract any relevant entities or keywords from the text related to electric vehicles and government subsidies.
These will be used to brief content creation, so focus on terms that would help in writing informative articles
similar to the input text but for a different brand.
`.trim();

let bodyEnts = await entities.extractAnyEntities(body, instructions, 'gpt-5.1', { reasoning: { effort: 'none' } });

In [447]:
// Group entity names by type; `names` avoids shadowing the imported `entities` module.
for (const type of [...new Set(bodyEnts.map(e => e.type))]) {
    const names = bodyEnts.filter(e => e.type === type).map(e => e.name);
    console.log(`\n**${type}**\n`);
    for (const name of names) {
        console.log(`- ${name}`);
    }
}


**brand**

- peugeot

**product**

- vehículo eléctrico
- coche eléctrico
- e-traveller
- e-partner
- e-expert
- e-boxer
- e-208
- e-2008
- e-3008
- e-308
- e-308 sw
- e-408
- e-5008
- e-rifter

**product line**

- gama 100 % eléctrica de peugeot

**product subtype**

- turismo eléctrico
- suv eléctrico
- vehículo comercial eléctrico

**product (reference/old technology)**

- coche de combustión
- vehículo de combustión

**concept**

- movilidad eléctrica
- electrificación de la movilidad
- ahorro energético

**policy instrument**

- certificado de ahorro energético
- cae
- certificados de ahorro energético

**subsidy program**

- programa cae
- plan moves iii
- programa moves iii 2025
- programa de incentivos ligados a la movilidad eléctrica

**financial incentive**

- bonificación cae
- bonificación económica directa
- bonificación de 1.000 €
- ayuda de 900 €
- ayuda gubernamental
- subvención

**tax incentive**

- deducción irpf
- deducción del 15% en el irpf
- deducción irpf hasta

In [400]:
pl.DataFrame(bodyEnts).filter(pl.col('type').neq(pl.lit('ev_model')))

name,type
coche eléctrico,electric_vehicle
vehículo eléctrico,electric_vehicle
coches eléctricos,electric_vehicle
vehículos electrificados,electric_vehicle
vehículo eléctrico de batería,electric_vehicle_type
bev,electric_vehicle_type
vehículo híbrido enchufable,electric_vehicle_type
phev,electric_vehicle_type
vehículo de pila de combustible de hidrógeno,electric_vehicle_type
fcev,electric_vehicle_type


# Own content

In [193]:
// TODO: compare with own-brand content

# Create Content

Create content automatically, or generate a brief for what content should be created

## Auto-generate FAQ
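
This section is still empty. One possible building block, given that the analysed page has no `FAQPage` structured data, is to turn question/answer pairs into schema.org `FAQPage` JSON-LD. The sketch below assumes the pairs already exist (for example, questions extracted from the page and answers written or generated separately). The `faqJsonLd` helper is hypothetical and the answer text is a placeholder.

```typescript
interface QA { question: string; answer: string }

// Build a schema.org FAQPage document as a JSON-LD string.
function faqJsonLd(pairs: QA[]): string {
  const doc = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: pairs.map((p) => ({
      "@type": "Question",
      name: p.question,
      acceptedAnswer: { "@type": "Answer", text: p.answer },
    })),
  };
  return JSON.stringify(doc, null, 2);
}

// Placeholder answer text, for illustration only.
const demoFaq = faqJsonLd([
  { question: "¿Qué son los CAEs?", answer: "Texto de respuesta de ejemplo." },
]);
console.log(demoFaq);
```

The resulting string would be embedded in a `<script type="application/ld+json">` tag on the page that visibly contains the same questions and answers.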