[Authenticating to sovereign cloud](https://learn.microsoft.com/en-us/python/api/overview/azure/search-documents-readme?view=azure-python#authenticate-in-a-national-cloud)

In [24]:
import os
from pathlib import Path

from azure.identity import AzureAuthorityHosts, DefaultAzureCredential
from azure.search.documents import SearchClient
from dotenv import load_dotenv

# Locate the .env file of the first matching avcoe-* environment under ./.azure
env_directory = Path.cwd() / ".azure"
env_file = next(
    (candidate / ".env" for candidate in env_directory.glob("avcoe-*")),
    None,
)

if env_file is None:
    raise FileNotFoundError(
        "Could not locate an avcoe-* environment directory under .azure"
    )

# Load settings without overriding variables already set in the process
load_dotenv(dotenv_path=env_file, override=False)

search_endpoint = os.environ["SEARCH_SERVICE_ENDPOINT"]
search_index = os.environ["SEARCH_INDEX_NAME"]

print(f"Using search endpoint {search_endpoint}, and index: {search_index}")

# Choose the authority host that matches the target cloud
authority_host = (
    AzureAuthorityHosts.AZURE_GOVERNMENT
    if os.getenv("CLOUD_NAME") == "AzureUSGovernment"
    else AzureAuthorityHosts.AZURE_PUBLIC_CLOUD
)

print(f"Using authority host: {authority_host}")

Using search endpoint https://avcoe-demo-ai-search-mcp-search.search.azure.us, and index: avcoe-demo-ai-search-mcp-index-and-vectorize
Using authority host: login.microsoftonline.us


In [25]:
# Pick the token audience that matches the cloud hosting the search service
audience = (
    "https://search.azure.us"
    if os.getenv("CLOUD_NAME") == "AzureUSGovernment"
    else "https://search.azure.com"
)

print(f"Using credential scope: {audience}")

# Authenticate against the authority host selected in the previous cell
credential = DefaultAzureCredential(authority=authority_host)

# Create the SearchClient with the matching audience
search_client = SearchClient(search_endpoint, search_index, credential, audience=audience)

Using credential scope: https://search.azure.us
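
The two cells above each branch on `CLOUD_NAME` separately. As a rough sketch, the authority host and the search audience could be kept together in one lookup table so they cannot drift apart. The names `CloudSettings`, `CLOUD_SETTINGS`, and `resolve_cloud_settings` are hypothetical, and the `"AzureCloud"` and `"AzureChinaCloud"` keys are assumed values for `CLOUD_NAME` that this notebook does not itself use. The strings are plain literals here; in practice they could come from the `AzureAuthorityHosts` constants instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CloudSettings:
    """Authority host and search token audience for one cloud."""
    authority_host: str
    search_audience: str


# Hypothetical lookup table keyed by the CLOUD_NAME environment value.
# Only "AzureUSGovernment" appears in the notebook; the other keys are assumptions.
CLOUD_SETTINGS = {
    "AzureCloud": CloudSettings(
        "login.microsoftonline.com", "https://search.azure.com"
    ),
    "AzureUSGovernment": CloudSettings(
        "login.microsoftonline.us", "https://search.azure.us"
    ),
    "AzureChinaCloud": CloudSettings(
        "login.chinacloudapi.cn", "https://search.azure.cn"
    ),
}


def resolve_cloud_settings(cloud_name):
    """Return settings for cloud_name, falling back to the public cloud.

    The fallback mirrors the else-branches in the cells above: any value
    other than a known sovereign cloud is treated as the public cloud.
    """
    return CLOUD_SETTINGS.get(cloud_name or "AzureCloud", CLOUD_SETTINGS["AzureCloud"])
```

With this in place, `settings = resolve_cloud_settings(os.getenv("CLOUD_NAME"))` would supply both `settings.authority_host` for the credential and `settings.search_audience` for the client.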


In [36]:
# Match every document and count how many chunks each title contributes
results = search_client.search("*", facets=["title"])

results.get_facets()

{'title': [{'value': 'llama.pdf', 'count': 61},
  {'value': 'deepseek.pdf', 'count': 41}]}
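
Facet values like the ones above can be used in a `filter` expression to narrow a later query to one document. The following is a minimal sketch, assuming a facet dictionary shaped like the output above. The helper names `odata_string_literal`, `eq_filter`, and `facet_filters` are hypothetical and are not part of the Azure SDK. OData string literals escape an embedded single quote by doubling it, which the first helper handles.

```python
def odata_string_literal(value):
    """Quote a string for an OData filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def eq_filter(field, value):
    """Build an equality filter such as: title eq 'deepseek.pdf'."""
    return f"{field} eq {odata_string_literal(value)}"


def facet_filters(facets, field):
    """Return one equality filter per facet bucket reported for the field."""
    return [eq_filter(field, bucket["value"]) for bucket in facets.get(field, [])]


# Facet output copied from the cell above
facets = {
    "title": [
        {"value": "llama.pdf", "count": 61},
        {"value": "deepseek.pdf", "count": 41},
    ]
}

for expression in facet_filters(facets, "title"):
    print(expression)
```

Each printed expression can be passed as the `filter` argument of `search_client.search`.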

In [26]:
import json

# Semantic query restricted to a single document, returning the top 3 chunks
results = search_client.search(
    "test",
    top=3,
    include_total_count=True,
    query_type="semantic",
    filter="title eq 'deepseek.pdf'",
    select=["title", "chunk"],
)

for result in results:
    print(json.dumps(dict(result), indent=2, default=str))

InteractiveBrowserBrokerCredential.get_token_info failed: (pii). Status: Response_Status.Status_IncorrectConfiguration, Error code: 3399614475, Tag: 508634112


{
  "chunk": "to the zero-shot setting. The CoT in few-shot\nmay hurt the performance of DeepSeek-R1. Other datasets follow their original evaluation\nprotocols with default prompts provided by their creators. For code and math benchmarks, the\nHumanEval-Mul dataset covers eight mainstream programming languages (Python, Java, C++,\nC#, JavaScript, TypeScript, PHP, and Bash). Model performance on LiveCodeBench is evaluated\nusing CoT format, with data collected between August 2024 and January 2025. The Codeforces\ndataset is evaluated using problems from 10 Div.2 contests along with expert-crafted test cases,\nafter which the expected ratings and percentages of competitors are calculated. SWE-Bench\nverified results are obtained via the agentless framework (Xia et al., 2024). AIDER-related\nbenchmarks are measured using a \"diff\" format. DeepSeek-R1 outputs are capped at a maximum\nof 32,768 tokens for each benchmark.\n\nBaselines We conduct comprehensive evaluations against several st

In [27]:
# Sample result record as a Python dict literal (JSON null written as None)
sample_result = {
    "chunk_id": "da69dc6effcc_aHR0cHM6Ly9hdmNvZWRlbW9haXNlYXJjaG1jcHN0Zy5ibG9iLmNvcmUudXNnb3ZjbG91ZGFwaS5uZXQvYWlzZWFyY2hkYXRhL2RlZXBzZWVrLnBkZg2_pages_2",
    "parent_id": "aHR0cHM6Ly9hdmNvZWRlbW9haXNlYXJjaG1jcHN0Zy5ibG9iLmNvcmUudXNnb3ZjbG91ZGFwaS5uZXQvYWlzZWFyY2hkYXRhL2RlZXBzZWVrLnBkZg2",
    "chunk": "Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13\n\n3.2 Distilled Model Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14\n\n4 Discussion 14\n\n4.1 Distillation v.s. Reinforcement Learning . . . . . . . . . . . . . . . . . . . . . . . . 14\n\n4.2 Unsuccessful Attempts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15\n\n5 Conclusion, Limitations, and Future Work 16\n\nA Contributions and Acknowledgments 20\n\n2\n\n\n\n1. Introduction\n\nIn recent years, Large Language Models (LLMs) have been undergoing rapid iteration and\nevolution (Anthropic, 2024; Google, 2024; OpenAI, 2024a), progressively diminishing the gap\ntowards Artificial General Intelligence (AGI).\n\nRecently, post-training has emerged as an important component of the full training pipeline.\nIt has been shown to enhance accuracy on reasoning tasks, align with social values, and adapt\nto user preferences, all while requiring relatively minimal computational resources against\npre-training. In the context of reasoning capabilities, OpenAI\u2019s o1 (OpenAI, 2024b) series models\nwere the first to introduce inference-time scaling by increasing the length of the Chain-of-\nThought reasoning process. This approach has achieved significant improvements in various\nreasoning tasks, such as mathematics, coding, and scientific reasoning. However, the challenge\nof effective test-time scaling remains an open question for the research community. Several prior\nworks have explored various approaches, including process-based reward models (Lightman\net al., 2023; Uesato et al., 2022; Wang et al., 2023), reinforcement learning (Kumar et al., 2024),\nand search algorithms such as Monte Carlo Tree Search and Beam Search (Feng et al., 2024; Trinh\net al., 2024; Xin et al., 2024). However, none of these methods has achieved general reasoning\nperformance comparable to OpenAI\u2019s o1 series models.",
    "title": "deepseek.pdf",
    "@search.score": 1.7589658,
    "@search.reranker_score": None,
    "@search.highlights": None,
    "@search.captions": None,
}
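
Full result records like those above are hard to scan because the `chunk` field dominates the output. As a small sketch, a one-line preview per hit could be printed instead. The helper `preview_result` is hypothetical, and it assumes each result behaves like a dictionary with `title`, `chunk`, and `@search.score` keys, as in the records shown earlier.

```python
import textwrap


def preview_result(result, max_chars=200):
    """Return a one-line summary: title, score, and a shortened chunk."""
    title = result.get("title", "(untitled)")
    score = result.get("@search.score")
    # textwrap.shorten collapses whitespace (including newlines) and
    # truncates on a word boundary so the text fits within max_chars
    snippet = textwrap.shorten(
        str(result.get("chunk", "")), width=max_chars, placeholder=" ..."
    )
    return f"{title} (score={score}): {snippet}"


# Illustrative record; the chunk text is abbreviated from the output above
example = {
    "title": "deepseek.pdf",
    "chunk": "to the zero-shot setting. The CoT in few-shot\nmay hurt the performance of DeepSeek-R1.",
    "@search.score": 1.7589658,
}

print(preview_result(example, max_chars=40))
```

In the query loop, `print(preview_result(dict(result)))` would take the place of the `json.dumps` call when only a quick overview is needed.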