### 1.0 Create an Azure AI Search Index 


#### Text + Vector Index Creation with Azure AI Search

This script creates an index in Azure AI Search using the generally available (GA) vector search schema (the later cells pin API version `2024-07-01`). It sets up an index with three fields:

* `id`: a string-based document ID (used as the primary key)
* `raw`: a searchable text field that stores each chunk's plain text
* `contentVector`: a 1 536-dimensional vector field that holds embedding data (e.g., from Azure OpenAI)

We configure the vector search behavior to use the HNSW algorithm with cosine similarity, a common choice for semantic search over text embeddings. Retrieval in this notebook relies purely on vector search (similarity between embeddings) rather than keyword matching; the `raw` field is kept so that query results can include readable snippets.
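
To make the metric concrete, here is a minimal sketch of cosine similarity in plain Python. The function name and the toy 3-d vectors are illustrative assumptions; Azure AI Search computes its own scores internally, and HNSW only approximates the nearest neighbours under this metric.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy 3-d vectors standing in for 1 536-d embeddings
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])      # unrelated vectors
```

Because the measure ignores vector length and looks only at direction, two chunks with similar meaning can score close to 1 even when their embeddings differ in magnitude.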



In [20]:
#!/usr/bin/env python3
"""
create_index_text_and_vector.py
───────────────────────────────
Creates/updates an Azure AI Search index that stores BOTH:

• a searchable **raw** text field (full chunk text)  
• a 1 536-d **contentVector** field for HNSW-cosine vector search

Everything else is unchanged from the original script—only the extra
`raw` field is added so query results can include readable snippets.
"""

from dotenv import load_dotenv
import os
from azure.core.credentials import AzureKeyCredential
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import (
    SearchIndex,
    SimpleField,
    SearchField,
    SearchFieldDataType,
    VectorSearch,
    HnswAlgorithmConfiguration,
    HnswParameters,
    VectorSearchProfile,
)

# ── 1. env ──────────────────────────────────────────────────────────
load_dotenv()
ENDPOINT   = os.getenv("AZURE_SEARCH_ENDPOINT")
ADMIN_KEY  = os.getenv("AZURE_SEARCH_ADMIN_KEY")
INDEX_NAME = "index02"

# ── 2. algorithm + profile (HNSW + cosine) ─────────────────────────
algo_cfg = HnswAlgorithmConfiguration(
    name="hnsw-cosine",
    parameters=HnswParameters(metric="cosine")
)

profile_cfg = VectorSearchProfile(
    name="hnsw-cosine-profile",
    algorithm_configuration_name="hnsw-cosine",
)

vector_search = VectorSearch(
    algorithms=[algo_cfg],
    profiles=[profile_cfg],
)

# ── 3. schema: id + raw text + vector ──────────────────────────────
fields = [
    SimpleField(name="id", type=SearchFieldDataType.String, key=True),

    # NEW: store the chunk’s plain text so we can preview it in results
    SearchField(
        name="raw",
        type=SearchFieldDataType.String,
        searchable=True,         # full-text search enabled
        filterable=False,
        facetable=False,
        sortable=False
    ),

    SearchField(
        name="contentVector",
        type=SearchFieldDataType.Collection(SearchFieldDataType.Single),
        searchable=True,
        vector_search_dimensions=1536,
        vector_search_profile_name="hnsw-cosine-profile",
    ),
]

index = SearchIndex(
    name=INDEX_NAME,
    fields=fields,
    vector_search=vector_search,
)

# ── 4. push index ──────────────────────────────────────────────────
client = SearchIndexClient(ENDPOINT, AzureKeyCredential(ADMIN_KEY))
print(f"Creating or updating index '{INDEX_NAME}' …")
client.create_or_update_index(index)
print("✅  Index ready – text + vector fields provisioned")


16:29:56 [INFO] Request URL: 'https://chatops-ozguler.search.windows.net/indexes('index02')?api-version=REDACTED'
Request method: 'PUT'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '688'
    'api-key': 'REDACTED'
    'Prefer': 'REDACTED'
    'Accept': 'application/json;odata.metadata=minimal'
    'x-ms-client-request-id': '2d42c99c-3d5a-11f0-9ff2-4eb2cec3a125'
    'User-Agent': 'azsdk-python-search-documents/11.5.2 Python/3.12.10 (macOS-15.5-arm64-arm-64bit)'
A body is sent with the request


Creating or updating index 'index02' …


16:29:57 [INFO] Response status: 201
Response headers:
    'Transfer-Encoding': 'chunked'
    'Content-Type': 'application/json; odata.metadata=minimal; odata.streaming=true; charset=utf-8'
    'ETag': '"0x8DD9F7E122DA5A7"'
    'Location': 'REDACTED'
    'Server': 'Microsoft-IIS/10.0'
    'Strict-Transport-Security': 'REDACTED'
    'Preference-Applied': 'REDACTED'
    'OData-Version': 'REDACTED'
    'request-id': '2d42c99c-3d5a-11f0-9ff2-4eb2cec3a125'
    'elapsed-time': 'REDACTED'
    'Date': 'Fri, 30 May 2025 13:29:57 GMT'


✅  Index ready – text + vector fields provisioned


✅ Result
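Once this script runs, you'll have a minimal index with a key field, a searchable `raw` text field, and a `contentVector` field that supports vector similarity search via HNSW with the cosine metric.

You can now upload vectorized documents and run vector queries against it. Note that this script creates `index02`, while the later cells default to `index01` unless `AZURE_SEARCH_INDEX_NAME` is set, so make sure both refer to the same index.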
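
Before uploading, it can help to check that each document matches the three-field schema. The sketch below is a hypothetical local helper (`validate_doc` is not part of the Azure SDK) and only checks the shape implied by the index definition above.

```python
EXPECTED_DIMS = 1536  # matches vector_search_dimensions in the index definition

def validate_doc(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the document fits the schema."""
    problems = []
    if not isinstance(doc.get("id"), str) or not doc.get("id"):
        problems.append("id must be a non-empty string")
    if not isinstance(doc.get("raw"), str):
        problems.append("raw must be a string")
    vec = doc.get("contentVector")
    if not isinstance(vec, list) or len(vec) != EXPECTED_DIMS:
        problems.append(f"contentVector must be a list of {EXPECTED_DIMS} floats")
    return problems

# Hypothetical documents: one well-formed, one with a vector of the wrong length
good = {"id": "doc_000001", "raw": "sample text", "contentVector": [0.0] * EXPECTED_DIMS}
bad = {"id": "doc_000002", "raw": "sample text", "contentVector": [0.0] * 3}
```

Catching a dimension mismatch locally is cheaper than discovering it as a failed upload batch.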

### 2.0 OCR the PDF 

This code performs OCR on a single PDF file, 2504_IMF_WOO.pdf, using Azure AI Document Intelligence and saves the extracted text as 2504_IMF_WOO.txt in the same directory. It loads API credentials from a .env file, sets up the client, and handles both script and notebook environments by resolving the working directory accordingly. The script submits the PDF to Azure’s prebuilt-read model, waits for the result, extracts text line-by-line from each page, and writes the output as a plain-text file. It includes basic error handling and status messages, making it a clean and reusable OCR workflow.
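
The line-by-line extraction step can be tried without calling the service. The sketch below mirrors the list comprehension used in the cell that follows, but runs it on a hypothetical stand-in object built with `SimpleNamespace`; the real result type comes from the Document Intelligence SDK.

```python
from types import SimpleNamespace

def pages_to_text(result) -> str:
    """Join each page's lines with newlines and separate pages with a blank line."""
    pages_txt = [
        "\n".join(line.content for line in (page.lines or []))
        for page in (result.pages or [])
    ]
    return "\n\n".join(pages_txt)

# Hypothetical stand-in for an analysis result: three pages, the second without lines
fake_result = SimpleNamespace(pages=[
    SimpleNamespace(lines=[SimpleNamespace(content="First line"),
                           SimpleNamespace(content="Second line")]),
    SimpleNamespace(lines=None),
    SimpleNamespace(lines=[SimpleNamespace(content="Page three")]),
])
text = pages_to_text(fake_result)
```

The `or []` guards matter here: a page with no recognized lines contributes an empty string instead of raising a `TypeError`.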



In [3]:
"""
OCR one PDF (2504_IMF_WOO.pdf) with Azure Document Intelligence
Saves 2504_IMF_WOO.txt in the same folder.
"""
from pathlib import Path
import os, sys
from dotenv import load_dotenv
from azure.core.credentials import AzureKeyCredential
from azure.ai.documentintelligence import DocumentIntelligenceClient

# ─────────────── SETUP ───────────────
try:
    SCRIPT_DIR = Path(__file__).resolve().parent   # works in a .py file
except NameError:
    SCRIPT_DIR = Path.cwd()                        # Jupyter fallback

load_dotenv(dotenv_path=SCRIPT_DIR / ".env")       # credentials in .env

ENDPOINT = os.getenv("DOCUMENTINTELLIGENCE_ENDPOINT")
KEY      = os.getenv("DOCUMENTINTELLIGENCE_API_KEY")
if not ENDPOINT or not KEY:
    sys.exit("❌  Missing DOCUMENTINTELLIGENCE_… values in .env")

client = DocumentIntelligenceClient(
    endpoint=ENDPOINT, credential=AzureKeyCredential(KEY)
)

PDF_FILE = SCRIPT_DIR / "2504_IMF_WOO.pdf"
if not PDF_FILE.exists():
    sys.exit(f"❗  {PDF_FILE.name} not found in {SCRIPT_DIR.resolve()}")

print(f"🔍  Processing {PDF_FILE.name} …")

# ─────────────── OCR ───────────────
try:
    with PDF_FILE.open("rb") as fh:
        poller = client.begin_analyze_document(
            "prebuilt-read", fh, content_type="application/pdf"
        )
    result = poller.result()

    pages_txt = [
        "\n".join(ln.content for ln in (p.lines or []))
        for p in (result.pages or [])
    ]
    (PDF_FILE.with_suffix(".txt")).write_text("\n\n".join(pages_txt), "utf-8")
    print(f"✅  Text saved to {PDF_FILE.with_suffix('.txt').name}")

except Exception as e:
    print(f"⚠️  Failed to process {PDF_FILE.name}: {e}")


🔍  Processing 2504_IMF_WOO.pdf …
✅  Text saved to 2504_IMF_WOO.txt


### 3.0 Preprocess the Text

To prepare the OCR dump 2504_IMF_WOO.txt for RAG, the script first joins words split by a hyphen at a line break (e.g., “eco-\nnomic” → “economic”). It then strips generic noise (tabs, carriage returns, non-ASCII characters, a few HTML tags, Markdown markers, and bold “IMPORTANT/NOTE” labels) using regex replacements. Note that the non-ASCII filter also drops accented letters and typographic quotes. Next, it removes IMF-specific clutter such as page headers/footers, Roman- or Arabic-numbered page numbers, table-of-contents lines, chapter titles, and figure/table captions. Finally, it replaces all remaining newlines with spaces and collapses runs of whitespace to a single space, producing a compact string for tokenization and chunking. The cleaned output is saved as 2504_IMF_WOO.cleaned.txt.
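
The first and last steps are easy to try in isolation. The sketch below reuses the two regular expressions from the cell that follows on a made-up sample string; the helper names are assumptions for illustration.

```python
import re

def join_hyphen_breaks(text: str) -> str:
    """Merge words that were split by a hyphen at the end of a line."""
    return re.sub(r"(\w+)-\s*\n\s*(\w+)", r"\1\2", text)

def flatten_whitespace(text: str) -> str:
    """Turn newlines into spaces and collapse runs of whitespace."""
    return re.sub(r"\s{2,}", " ", text.replace("\n", " ")).strip()

# Made-up OCR-style sample with two hyphenated line breaks
sample = "Global eco-\n  nomic growth\nis projected   to slow.\nThe long-\nterm outlook"
joined = join_hyphen_breaks(sample)
flat = flatten_whitespace(joined)
```

One limitation is visible in the sample: a genuine compound such as "long-term" that happens to break at the hyphen is merged into "longterm", because the pattern cannot tell it apart from a word split by typesetting.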

In [4]:
#!/usr/bin/env python3
"""
clean_2504_imf_woo.py  –  Create a RAG-ready version of 2504_IMF_WOO.txt

Reads the raw OCR dump, removes headers/footers, TOC noise, figure captions,
hyphen-breaks, HTML/Markdown tags, non-UTF8 chars, etc., and writes
2504_IMF_WOO.cleaned.txt in the same directory.

Run with:  python clean_2504_imf_woo.py
"""

from pathlib import Path
import re
import sys


# ────────────────────────────────────────────────────────────────
# 1.  Text-cleaning utility
# ────────────────────────────────────────────────────────────────
def clean_text(text: str) -> str:
    """Return a compact, boilerplate-free string suitable for chunking."""
    # fix hyphenated line breaks  (eco-\n  nomic → economic)
    text = re.sub(r"(\w+)-\s*\n\s*(\w+)", r"\1\2", text)

    # generic noise
    generic = [
        r"\t", r"\r\n", r"\r",                # tabs / CRs
        r"[^\x00-\x7F]+",                     # non-ASCII characters
        r"<\/?(table|tr|td|ul|li|p|br)>",     # HTML tags
        r"\*\*IMPORTANT:\*\*|\*\*NOTE:\*\*", # doc notes
        r"<!|no-loc |text=|<--|-->",          # markup
        r"```|:::|---|--|###|##|#",           # md code / hr / headers
    ]
    for pat in generic:
        text = re.sub(pat, " ", text, flags=re.I)

    # IMF-specific headers / footers / TOC lines / captions
    imf_noise = [
        r"INTERNATIONAL MONETARY FUND",
        r"WORLD\s+ECONOMIC\s+OUTLOOK",
        r"\|\s*April\s+\d{4}",
        r"^CONTENTS$|^DATA$|^PREFACE$|^FOREWORD$|^EXECUTIVE SUMMARY$",
        r"^ASSUMPTIONS AND CONVENTIONS$|^FURTHER INFORMATION$|^ERRATA$",
        r"^Chapter\s+\d+.*$",
        r"^(Table|Figure|Box|Annex)\s+[A-Z0-9].*$",
        r"^\s*[ivxlcdm]+\s*$",   # Roman numerals
        r"^\s*\d+\s*$",          # arabic page nos
    ]
    for pat in imf_noise:
        text = re.sub(pat, " ", text, flags=re.I | re.M)

    # remove remaining newlines → single spaces
    text = text.replace("\n", " ")
    text = re.sub(r"\s{2,}", " ", text).strip()
    return text


# ────────────────────────────────────────────────────────────────
# 2.  Entrypoint
# ────────────────────────────────────────────────────────────────
def main() -> None:
    raw_path = Path.cwd() / "2504_IMF_WOO.txt"
    if not raw_path.exists():
        sys.exit(f"❌  {raw_path.name} not found in {Path.cwd()}")

    raw_text = raw_path.read_text(encoding="utf-8", errors="ignore")
    cleaned   = clean_text(raw_text)

    out_path = raw_path.with_suffix(".cleaned.txt")
    out_path.write_text(cleaned, encoding="utf-8")

    print(f"✅  Saved cleaned text → {out_path.name}  "
          f"({len(cleaned):,} characters)")


if __name__ == "__main__":
    main()


✅  Saved cleaned text → 2504_IMF_WOO.cleaned.txt  (624,877 characters)


### 4.0 Chunking Documents for RAG

#### 🔹 What Is “Chunking” in RAG?

*Chunking* means slicing long documents into smaller, self-contained pieces (“chunks”) so they fit the model’s token window and can be embedded, indexed, and retrieved accurately. The aim is to keep **enough context** for a useful answer while **avoiding overly large inputs** that waste tokens or hurt precision.

---

#### Common Chunking Methods

| Method | How it works | Best for |
|--------|--------------|----------|
| **Fixed-length windows** | Split every *N* tokens/characters, often with 10–20 % overlap. | Logs, code, data dumps where structure ≈ length. |
| **Sentence/paragraph split** | Use an NLP splitter; keep full sentences or paragraphs. | Narrative or news text; avoids mid-sentence cuts. |
| **Recursive / semantic split** | Split on headings → paragraphs → sentences until each piece < limit (e.g., LangChain `RecursiveCharacterTextSplitter`). | Long structured docs (white papers, legal contracts). |
| **Sliding window at retrieval** | No pre-processing; generate overlapping windows on demand around query anchors. | Recall-critical QA (wikis, forums) when storage is cheap. |
| **Adaptive / LLM-assisted** | An LLM places boundaries where topics shift. | Highly variable content; experimental but coherent. |

---
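
As an illustration of the recursive method in the table, here is a minimal character-based sketch. It is not LangChain's implementation: the separator list is an assumption, the separators themselves are dropped from the output, and small pieces are not merged back together.

```python
def recursive_split(text: str, limit: int, seps=("\n\n", "\n", ". ", " ")) -> list[str]:
    """Split on the coarsest separator first; recurse with finer ones for oversize pieces."""
    text = text.strip()
    if len(text) <= limit:
        return [text] if text else []
    if not seps:
        # No separators left: fall back to a hard cut every `limit` characters
        return [text[i:i + limit] for i in range(0, len(text), limit)]
    head, rest = seps[0], seps[1:]
    pieces = []
    for part in text.split(head):
        pieces.extend(recursive_split(part, limit, rest))
    return pieces

# Made-up document: one short paragraph and one that exceeds the limit
doc = "Intro paragraph.\n\nA much longer second paragraph. It has two sentences."
pieces = recursive_split(doc, limit=40)
```

The short first paragraph survives intact, while the second one is split at the sentence boundary because it exceeds the 40-character limit.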

#### Choosing a Strategy

* **Code & logs:** fixed 400-token windows + 10 % overlap.  
* **Technical reports / legal PDFs:** recursive splitting on headings.  
* **Emails & web articles:** paragraph/sentence chunks of ~300-500 tokens.  
* **Large wiki corpora:** sliding windows to maximise recall.  
* **Mixed formats needing topic coherence:** try LLM-assisted splitting.

> **Rule of thumb:** keep chunks **200–800 tokens** and add a small overlap when continuity matters.
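
A fixed-length window with overlap can be sketched in a few lines. Here whitespace-separated words stand in for model tokens (an assumption made to keep the example dependency-free); the function name and parameters are illustrative.

```python
def window_chunks(tokens: list[str], size: int, overlap: int) -> list[list[str]]:
    """Fixed-size windows; each window repeats the last `overlap` tokens of the previous one."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # this window already reaches the end; skip a redundant tail
    return chunks

# Ten placeholder tokens, windows of 4 with 1 token of overlap
words = [f"w{i}" for i in range(10)]
chunks = window_chunks(words, size=4, overlap=1)
```

The early `break` avoids emitting a final fragment that is fully contained in the previous window, which a plain `range(0, len(tokens), step)` loop can otherwise produce.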


In [21]:
#!/usr/bin/env python3
"""
embed_and_upload_chunks.py
──────────────────────────
• Reads IMF WEO cleaned text (2504_IMF_WOO.cleaned.txt)
• Splits it into ≈ 500-token chunks (50-token overlap, i.e. 10 %)
• Embeds each chunk with Azure OpenAI
• **Uploads id + raw + contentVector** to Azure AI Search index01
• Shows INFO logging and tqdm progress bars

(raw text is now retained so query results can include readable snippets)
"""

import os, sys, re, logging
from pathlib import Path
from dotenv import load_dotenv
from tqdm.auto import tqdm

import tiktoken
import openai
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient

# ── 1. logging ────────────────────────────────────────────────────
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
    datefmt="%H:%M:%S",
)
log = logging.getLogger("IMF-Embed")

# ── 2. env vars ───────────────────────────────────────────────────
load_dotenv()

SEARCH_ENDPOINT   = os.getenv("AZURE_SEARCH_ENDPOINT")
SEARCH_ADMIN_KEY  = os.getenv("AZURE_SEARCH_ADMIN_KEY")
SEARCH_INDEX_NAME = os.getenv("AZURE_SEARCH_INDEX_NAME", "index01")
if not SEARCH_ENDPOINT or not SEARCH_ADMIN_KEY:
    sys.exit("❌  Missing AZURE_SEARCH_* vars in .env")

AOAI_ENDPOINT     = os.getenv("AZURE_OPENAI_ENDPOINT", "").rstrip("/")
AOAI_KEY          = os.getenv("AZURE_OPENAI_API_KEY")
AOAI_API_VERSION  = os.getenv("AZURE_OPENAI_API_VERSION", "2024-12-01-preview")
EMBED_DEPLOYMENT  = os.getenv("AZURE_TEXT_EMBEDDING_DEPLOYMENT_NAME")
if not AOAI_ENDPOINT or not AOAI_KEY or not EMBED_DEPLOYMENT:
    sys.exit("❌  Missing Azure OpenAI vars in .env")

# ── 3. files & params ────────────────────────────────────────────
CLEAN_FILE   = Path("2504_IMF_WOO.cleaned.txt")
CHUNK_TOKENS = 500
OVERLAP      = 50
EMB_BATCH    = 16
UPL_BATCH    = 100

enc = tiktoken.get_encoding("cl100k_base")

# ── 4. helpers ────────────────────────────────────────────────────
def slide(tokens, size, step):
    for i in range(0, len(tokens), step):
        yield tokens[i : i + size]

def chunkify(text: str, parent: str = "WEO25"):
    step = CHUNK_TOKENS - OVERLAP
    tokens = enc.encode(text)
    for idx, win in enumerate(slide(tokens, CHUNK_TOKENS, step)):
        yield {
            "id": f"{parent}_c{idx:06}",
            "raw": enc.decode(win),          # kept for query-time snippets
            "@search.action": "upload",
        }

# ── 5. load cleaned text ─────────────────────────────────────────
if not CLEAN_FILE.exists():
    sys.exit("❌  Cleaned text file not found")

full_text = CLEAN_FILE.read_text("utf-8")

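log.info("Splitting document on headings …")
# NOTE: clean_text() replaces every newline with a space, so this pattern finds
# no headings in the cleaned file and `blocks` holds the whole text as one element.
blocks = re.split(r"\n([A-Z][^\n]{3,100})\n", full_text)  # even=text, odd=heading

chunks = []
for i in range(0, len(blocks), 2):
    body = blocks[i]
    for rec in chunkify(body):
        # number chunks globally so ids stay unique if more than one block exists
        rec["id"] = f"WEO25_c{len(chunks):06}"
        chunks.append(rec)

log.info("Generated %s chunks (≈%s tokens each).", len(chunks), CHUNK_TOKENS)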

# ── 6. embed ─────────────────────────────────────────────────────
openai_client = openai.AzureOpenAI(
    api_key     = AOAI_KEY,
    api_version = AOAI_API_VERSION,
    base_url    = f"{AOAI_ENDPOINT}/openai/deployments/{EMBED_DEPLOYMENT}",
)

log.info("Embedding with %s …", EMBED_DEPLOYMENT)
for i in tqdm(range(0, len(chunks), EMB_BATCH), desc="Embedding", unit="chunk"):
    batch  = chunks[i:i+EMB_BATCH]
    inputs = [c["raw"] for c in batch]
    resp   = openai_client.embeddings.create(model=EMBED_DEPLOYMENT, input=inputs)
    for rec, emb in zip(batch, resp.data):
        rec["contentVector"] = emb.embedding  # name matches index field

# ── 7. upload ────────────────────────────────────────────────────
search = SearchClient(
    endpoint    = SEARCH_ENDPOINT,
    index_name  = SEARCH_INDEX_NAME,
    api_version = "2024-07-01",
    credential  = AzureKeyCredential(SEARCH_ADMIN_KEY),
)

log.info("Uploading to Search index %s …", SEARCH_INDEX_NAME)
for i in tqdm(range(0, len(chunks), UPL_BATCH), desc="Uploading", unit="chunk"):
    batch   = chunks[i:i+UPL_BATCH]
    results = search.upload_documents(batch)
    fails   = [r for r in results if not r.succeeded]
    if fails:
        log.warning("%d failures (starting at %s)", len(fails), batch[0]["id"])

log.info("✅  Done — %s chunks embedded & indexed.", len(chunks))


16:34:29 [INFO] Splitting document on headings …
16:34:29 [INFO] Generated 414 chunks (≈500 tokens each).
16:34:29 [INFO] Embedding with text-embedding-3-small …
Embedding:   0%|          | 0/26 [00:00<?, ?chunk/s]16:34:30 [INFO] HTTP Request: POST https://aoai-ep-swedencentral02.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"
Embedding:   4%|▍         | 1/26 [00:01<00:25,  1.03s/chunk]16:34:31 [INFO] HTTP Request: POST https://aoai-ep-swedencentral02.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"
Embedding:   8%|▊         | 2/26 [00:01<00:17,  1.37chunk/s]16:34:31 [INFO] HTTP Request: POST https://aoai-ep-swedencentral02.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"
Embedding:  12%|█▏        | 3/26 [00:02<00:14,  1.54chunk/s]16:34:32 [INFO] HTTP Request: POST https://aoai-

This script automates the pipeline for turning the **cleaned IMF WEO report** into a searchable index. It loads *2504_IMF_WOO.cleaned.txt*, splits the text into ~500-token chunks with a 50-token (10 %) overlap so that context spanning a chunk boundary is not lost, and sends the chunks in batches of 16 to the Azure OpenAI **text-embedding-3-small** deployment to obtain a 1 536-dimensional vector for each. After embedding, the script uploads three fields (`id`, `raw`, and `contentVector`) to the Azure AI Search index named by `AZURE_SEARCH_INDEX_NAME` (default **index01**). Because the cleaned text contains no newlines, the heading-based `re.split` leaves the document as a single block, so the chunks are plain sliding windows. All credentials come from the `.env` file, and `tqdm` progress bars plus INFO-level logging make it easy to track embedding and upload progress inside a Jupyter notebook.
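
The progress bar in the output counts batches, not individual chunks. A short worked calculation, using the 414 chunks reported in the log and the batch sizes set in the script, shows where the total of 26 comes from (`batch_count` is a hypothetical helper):

```python
import math

def batch_count(n_items: int, batch_size: int) -> int:
    """Number of batches needed to cover n_items."""
    return math.ceil(n_items / batch_size)

n_chunks = 414                               # from the log output above
embed_batches = batch_count(n_chunks, 16)    # EMB_BATCH
upload_batches = batch_count(n_chunks, 100)  # UPL_BATCH
```

With these settings the script makes 26 embedding requests and 5 upload requests.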


### 5.0 Query the Index
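
Conceptually, the retrieval step ranks stored chunk vectors by similarity to the question vector and keeps the best `TOP_K`. The sketch below does this exhaustively on toy 2-d vectors with hypothetical ids; Azure AI Search instead uses the HNSW graph to find approximate nearest neighbours without scoring every document.

```python
import heapq
import math

def top_k(query: list[float], docs: dict[str, list[float]], k: int) -> list[tuple[str, float]]:
    """Score every document vector against the query and keep the k best by cosine similarity."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    scored = ((doc_id, cosine(query, vec)) for doc_id, vec in docs.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Toy 2-d vectors with hypothetical chunk ids
docs = {"c1": [1.0, 0.0], "c2": [0.0, 1.0], "c3": [1.0, 1.0]}
hits = top_k([1.0, 0.2], docs, k=2)
```

Exhaustive scoring is exact but scales linearly with the number of chunks, which is the cost that an approximate index such as HNSW is designed to avoid.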

In [23]:
#!/usr/bin/env python3
"""
rag_query.py
────────────
1. Embed a natural-language question
2. Retrieve the TOP_K most similar chunks from Azure AI Search
3. Feed those chunks to a chat-capable Azure OpenAI model
4. Print the LLM’s grounded answer + show which chunks were used
"""

import os, textwrap, openai
from dotenv import load_dotenv
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery

# ── 0. config ──────────────────────────────────────────────────────
load_dotenv()

QUESTION     = "How will AI affect future energy demand according to the report?"
TOP_K        = 3
VECTOR_FIELD = "contentVector"

EMBED_DEPLOY = os.getenv("AZURE_TEXT_EMBEDDING_DEPLOYMENT_NAME")
CHAT_DEPLOY  = os.getenv("AZURE_OPENAI_DEPLOYMENT_NAME")   # chat model deployment
AOAI_BASE    = os.getenv("AZURE_OPENAI_ENDPOINT").rstrip("/")
AOAI_KEY     = os.getenv("AZURE_OPENAI_API_KEY")
AOAI_VER     = os.getenv("AZURE_OPENAI_API_VERSION", "2024-12-01-preview")

# ── 1. embed the question ─────────────────────────────────────────
aoai = openai.AzureOpenAI(
    api_key=AOAI_KEY,
    api_version=AOAI_VER,
    base_url=f"{AOAI_BASE}/openai/deployments/{EMBED_DEPLOY}"
)
embedding = aoai.embeddings.create(model=EMBED_DEPLOY, input=[QUESTION]).data[0].embedding

# ── 2. vector search ──────────────────────────────────────────────
search = SearchClient(
    endpoint   = os.getenv("AZURE_SEARCH_ENDPOINT"),
    index_name = os.getenv("AZURE_SEARCH_INDEX_NAME", "index01"),
    credential = AzureKeyCredential(os.getenv("AZURE_SEARCH_ADMIN_KEY")),
    api_version="2024-07-01"
)

vector_query = VectorizedQuery(vector=embedding, fields=VECTOR_FIELD, k=TOP_K)
hits = list(search.search(search_text="", vector_queries=[vector_query], top=TOP_K))

contexts = []
for h in hits:
    snippet = (h.get("raw") or "")[:300].replace("\n", " ")
    contexts.append(h.get("raw", ""))
    print(f"\n{h['id']}  (score {h['@search.score']:.3f})")
    print(textwrap.shorten(snippet, 300) if snippet else "(raw text not stored)")

if not contexts:
    print("\nNo chunks retrieved – cannot answer.")
    raise SystemExit

# ── 3. ask the LLM with retrieved context ─────────────────────────
chat = openai.AzureOpenAI(
    api_key=AOAI_KEY,
    api_version=AOAI_VER,
    base_url=f"{AOAI_BASE}/openai/deployments/{CHAT_DEPLOY}"
)

# Build the excerpt block first: backslashes inside f-string expressions
# are a SyntaxError before Python 3.12.
excerpts = "\n\n".join(f"[{h['id']}]\n{ctx}" for h, ctx in zip(hits, contexts))

prompt = f"""You are an analyst answering questions using the provided report excerpts.
Answer concisely and cite excerpts by chunk id when relevant.

Question:
{QUESTION}

Excerpts:
{excerpts}

Answer:"""

response = chat.chat.completions.create(
    model=CHAT_DEPLOY,
    messages=[{"role": "user", "content": prompt}],
    max_tokens=400,
    temperature=0.2
)

print("\n── LLM Answer ──\n")
print(response.choices[0].message.content)


16:36:07 [INFO] HTTP Request: POST https://aoai-ep-swedencentral02.openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2024-12-01-preview "HTTP/1.1 200 OK"
16:36:07 [INFO] Request URL: 'https://chatops-ozguler.search.windows.net/indexes('index01')/docs/search.post.search?api-version=REDACTED'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/json'
    'Content-Length': '34274'
    'api-key': 'REDACTED'
    'Accept': 'application/json;odata.metadata=none'
    'x-ms-client-request-id': '0a883274-3d5b-11f0-9ff2-4eb2cec3a125'
    'User-Agent': 'azsdk-python-search-documents/11.5.2 Python/3.12.10 (macOS-15.5-arm64-arm-64bit)'
A body is sent with the request
16:36:08 [INFO] Response status: 200
Response headers:
    'Transfer-Encoding': 'chunked'
    'Content-Type': 'application/json; odata.metadata=none; odata.streaming=true; charset=utf-8'
    'Content-Encoding': 'REDACTED'
    'Vary': 'REDACTED'
    'Server': 'Microsoft-IIS/10.0'
    'S


WEO25_c000087  (score 0.765)
(2030e) Sources: International Energy Agency (IEA); Organization of the Petroleum Exporting Countries (OPEC); and IMF staff calculations. Note: Estimates for data centers (DCs) and electric vehicles (EVs) are for the world and come from OPEC and the IEA, respectively. Data labels in the figure use

WEO25_c000088  (score 0.759)
greater use of compute by companies pursuing better-performing models (Hoffmann and others 2022). Adding to this complexity is the recent emergence of reasoning models-which require more compute in their deployment-and possibly greater AI use driven by lower costs and availability of open-source mo

WEO25_c000089  (score 0.741)
4). 8 percent in the United States (525 TWh), 3 percent in Europe (145 TWh), and 2 percent in China (237 TWh) relative to the baseline scenario. In the AI scenario under alternative energy policies, the increase in total electricity supply is kept the same, but its composition shifts in favor of ren


16:36:11 [INFO] HTTP Request: POST https://aoai-ep-swedencentral02.openai.azure.com/openai/deployments/gpt-4.1/chat/completions?api-version=2024-12-01-preview "HTTP/1.1 200 OK"



── LLM Answer ──

AI is expected to significantly increase future energy demand, primarily through higher electricity consumption by data centers and AI services. By 2030, global electricity consumption from AI could reach 1,500 TWh—comparable to India's current total electricity use and about 1.5 times higher than projected demand from electric vehicles. In the United States, electricity demand from data centers is projected to more than triple from 178 TWh in 2024 to 606 TWh in 2030. Under an AI-driven scenario, total electricity supply is expected to rise by 8% in the US, 3% in Europe, and 2% in China relative to baseline projections. This surge in demand may drive up electricity prices and could require significant investments in renewables and grid infrastructure to avoid supply bottlenecks and price spikes. If renewables scale-up or grid investments lag, price increases could be substantial, and electricity might need to be redirected from other sectors, impacting energy-intensi