# Part 3: SharePoint Knowledge Source

In Parts 1 and 2, you worked with search indexes. In Part 3, you'll connect directly to SharePoint documents using `RemoteSharePointKnowledgeSource`. This lets you query live SharePoint content without indexing it first.

## Step 1: Load Environment Variables

Run the cell below to load the configuration for your Azure resources. If prompted, choose the **.venv** environment that was created for you.

> **⚠️ Troubleshooting**
>
> If code cells get stuck and keep spinning, select **Restart** from the notebook toolbar at the top. If the issue persists after a couple of tries, close VS Code completely and reopen it.

In [None]:
import os

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from dotenv import load_dotenv

load_dotenv(override=True)  # take environment variables from .env

# Azure AI Search configuration
endpoint = os.environ["AZURE_SEARCH_SERVICE_ENDPOINT"]
# Use keyless (identity-based) authentication when KEYLESS=true; otherwise use the admin key
if os.getenv('KEYLESS','false').lower() == 'true':
    credential = DefaultAzureCredential()
else:
    credential = AzureKeyCredential(os.environ["AZURE_SEARCH_ADMIN_KEY"])

# Knowledge base name
knowledge_base_name = "sharepoint-knowledge-base"

# Azure OpenAI configuration
azure_openai_endpoint = os.environ["AZURE_OPENAI_ENDPOINT"]
if os.getenv('KEYLESS','false').lower() == 'true':
    azure_openai_key = ''
else:
    azure_openai_key = os.environ["AZURE_OPENAI_KEY"]
azure_openai_chatgpt_deployment = os.getenv("AZURE_OPENAI_CHATGPT_DEPLOYMENT", "gpt-4.1")
azure_openai_chatgpt_model_name = os.getenv("AZURE_OPENAI_CHATGPT_MODEL_NAME", "gpt-4.1")

print("Environment variables loaded")
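
If the cell above fails with a `KeyError`, a setting is probably missing from your `.env` file. The following is a minimal sketch of a check you could run first. The variable names are the ones the cell above reads, but the `missing_settings` helper is hypothetical and not part of the lab code. Unlike the cell above, it also treats an empty value as missing.

```python
import os

# Variable names match the ones read by the cell above; the helper is hypothetical.
ALWAYS_REQUIRED = ["AZURE_SEARCH_SERVICE_ENDPOINT", "AZURE_OPENAI_ENDPOINT"]
KEY_ONLY = ["AZURE_SEARCH_ADMIN_KEY", "AZURE_OPENAI_KEY"]


def missing_settings(env):
    """Return the names of required variables that are absent or empty in env."""
    keyless = env.get("KEYLESS", "false").lower() == "true"
    # API keys are only needed when keyless authentication is turned off
    required = ALWAYS_REQUIRED if keyless else ALWAYS_REQUIRED + KEY_ONLY
    return [name for name in required if not env.get(name)]


missing = missing_settings(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
else:
    print("All required settings are present")
```

The helper takes the environment as an argument so you can also try it against a plain dictionary.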

## Step 2: Create Remote SharePoint Knowledge Source

A **RemoteSharePointKnowledgeSource** connects directly to SharePoint documents without requiring you to index them first. This is different from the `SearchIndexKnowledgeSource` you used in Parts 1 and 2.

The code below creates a SharePoint knowledge source with `RemoteSharePointKnowledgeSourceParameters()`. The service handles authentication and document access automatically.

In [None]:
from azure.search.documents.indexes import SearchIndexClient
from azure.search.documents.indexes.models import RemoteSharePointKnowledgeSource, RemoteSharePointKnowledgeSourceParameters

index_client = SearchIndexClient(endpoint=endpoint, credential=credential)

ks = RemoteSharePointKnowledgeSource(
    name="sharepoint-knowledge-source",
    description="Knowledge source for SharePoint Documents",
    remote_share_point_parameters=RemoteSharePointKnowledgeSourceParameters()
)
index_client.create_or_update_knowledge_source(knowledge_source=ks)
print(f"Knowledge source '{ks.name}' created or updated successfully.")

## Step 3: Create SharePoint Knowledge Base

In this step, you'll create a knowledge base that references the SharePoint knowledge source. The setup is the same as in Parts 1 and 2: you configure the Azure OpenAI model parameters, add a reference to your knowledge source, and set `output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS` to enable AI-powered answer generation.

The only difference is the knowledge source type: instead of pointing to a search index, this knowledge base queries SharePoint directly.

In [None]:
from azure.search.documents.indexes.models import AzureOpenAIVectorizerParameters, KnowledgeBase, KnowledgeBaseAzureOpenAIModel, KnowledgeRetrievalOutputMode, KnowledgeSourceReference

aoai_params = AzureOpenAIVectorizerParameters(
    resource_url=azure_openai_endpoint,
    deployment_name=azure_openai_chatgpt_deployment,
    model_name=azure_openai_chatgpt_model_name,
    api_key=azure_openai_key
)

knowledge_base = KnowledgeBase(
    name=knowledge_base_name,
    models=[KnowledgeBaseAzureOpenAIModel(azure_open_ai_parameters=aoai_params)],
    knowledge_sources=[
        KnowledgeSourceReference(name=ks.name)
    ],
    output_mode=KnowledgeRetrievalOutputMode.ANSWER_SYNTHESIS
)

index_client.create_or_update_knowledge_base(knowledge_base)
print(f"Knowledge base '{knowledge_base_name}' created or updated successfully.")

## Step 4: Query SharePoint Documents

Now you can query your SharePoint documents using the knowledge base you created.

There's one key difference when querying SharePoint: you need to provide an identity token (`x_ms_query_source_authorization`) so the knowledge base can access SharePoint on behalf of the token owner. The cell below gets this token using your Azure credentials. In a production application, that token would come from the user logged into your app.

The second cell asks a question that can only be answered from SharePoint documents shared with your test account. To see those documents, go to the [Lab SharePoint](https://lodsprodmca.sharepoint.com/sites/lab511) site and open the Documents folder.

When you run the second cell, the knowledge base analyzes your question, decomposes it into focused subqueries, searches SharePoint, uses semantic ranking to filter results, and synthesizes a grounded answer with citations.

In [None]:
from azure.identity import DefaultAzureCredential, get_bearer_token_provider

identity_token_provider = get_bearer_token_provider(DefaultAzureCredential(), "https://search.azure.com/.default")
token = identity_token_provider()
print("Authentication completed")
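
To see whose identity the token represents and when it expires, you can decode its claims. The following is a minimal sketch that assumes the token is a JWT, which Microsoft Entra ID access tokens typically are. It does not verify the signature, so use it for inspection only. The helper names and the sample token are made up for illustration.

```python
import base64
import json
import time


def decode_jwt_claims(jwt_token):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise ValueError("Not a JWT: expected three dot-separated segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def seconds_until_expiry(claims, now=None):
    """Return the seconds left before the `exp` claim (negative if expired)."""
    now = time.time() if now is None else now
    return claims["exp"] - now


def _b64url(data):
    """Encode a dict as an unpadded base64url JSON segment (sample data only)."""
    return base64.urlsafe_b64encode(json.dumps(data).encode()).rstrip(b"=").decode()


# Made-up, unsigned token so this block runs on its own
sample_token = ".".join([
    _b64url({"alg": "none"}),
    _b64url({"aud": "https://example.invalid", "exp": 2000000000}),
    "signature-placeholder",
])
sample_claims = decode_jwt_claims(sample_token)
print(sample_claims["aud"], seconds_until_expiry(sample_claims) > 0)
```

In the notebook, you would call `decode_jwt_claims(token)` with the `token` from the cell above. Avoid printing the token itself, because anyone who has it can act as you until it expires.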

In [None]:
from azure.search.documents.knowledgebases import KnowledgeBaseRetrievalClient
from azure.search.documents.knowledgebases.models import KnowledgeBaseMessage, KnowledgeBaseMessageTextContent, KnowledgeBaseRetrievalRequest, RemoteSharePointKnowledgeSourceParams
from IPython.display import display, Markdown

knowledge_base_client = KnowledgeBaseRetrievalClient(endpoint=endpoint, knowledge_base_name=knowledge_base_name, credential=credential)

sharepoint_ks_params = RemoteSharePointKnowledgeSourceParams(
    knowledge_source_name="sharepoint-knowledge-source",
    include_references=True,
    include_reference_source_data=True
)
req = KnowledgeBaseRetrievalRequest(
    messages=[
        KnowledgeBaseMessage(role="user", content=[KnowledgeBaseMessageTextContent(text="What are major announcements and innovations in Zava Smart Wearables & Clothing Industry in 2025?")])
    ],
    knowledge_source_params=[
        sharepoint_ks_params
    ],
    include_activity=True
)


result = knowledge_base_client.retrieve(retrieval_request=req, x_ms_query_source_authorization=token)
display(Markdown(result.response[0].content[0].text))
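
The last line of the cell above assumes the response contains at least one message with at least one text part. The following is a minimal sketch of a more defensive lookup. It relies only on the attribute names used above (`response`, `content`, `text`). The `first_answer_text` helper and the stand-in objects are hypothetical.

```python
from types import SimpleNamespace


def first_answer_text(result, default="(no answer returned)"):
    """Return the first non-empty text part in result.response, or default."""
    for message in getattr(result, "response", None) or []:
        for part in getattr(message, "content", None) or []:
            text = getattr(part, "text", None)
            if text:
                return text
    return default


# Stand-in objects that mimic the attribute layout used in the cell above
fake_result = SimpleNamespace(
    response=[SimpleNamespace(content=[SimpleNamespace(text="Sample answer")])]
)
empty_result = SimpleNamespace(response=[])

print(first_answer_text(fake_result))
print(first_answer_text(empty_result))
```

In the notebook, `display(Markdown(first_answer_text(result)))` would show the fallback text instead of raising an `IndexError` when no answer comes back.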

## Step 5: Review Response, References, and Activity

The three cells below show the citations, a summary of the activity steps, and the full activity log from the SharePoint query.

The references reveal which SharePoint documents were used. Each citation includes the file path and the specific text snippet that contributed to the answer.

The activity log reveals what happened behind the scenes: how the knowledge base accessed SharePoint, what searches it performed, and how it ranked the results.

In [None]:
import json

references = json.dumps([ref.as_dict() for ref in result.references], indent=2)
print(references)

In [None]:
import pandas as pd

activity_types = [{"type": a.type} for a in result.activity]

df = pd.DataFrame(activity_types)

print("Activity Log Steps")
df

In [None]:
activity_content = json.dumps([a.as_dict() for a in result.activity], indent=2)
print("Activity Details")
print(activity_content)
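
For longer activity logs, a count per step type can be easier to scan than the full JSON. The following is a minimal sketch that assumes each activity dictionary has a `type` key, mirroring the `a.type` attribute used above. The `count_activity_types` helper and the sample records are made up for illustration, and real type names come from the service.

```python
from collections import Counter


def count_activity_types(activity_dicts):
    """Count the activity records of each type, most common first."""
    counts = Counter(record.get("type", "unknown") for record in activity_dicts)
    return counts.most_common()


# Made-up records for illustration only
sample_activity = [
    {"type": "stepA"},
    {"type": "stepB"},
    {"type": "stepA"},
]
print(count_activity_types(sample_activity))
```

In the notebook, you could pass `[a.as_dict() for a in result.activity]` to the helper, provided the serialized records include a `type` key.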

## Summary

In this part, you learned how to connect directly to SharePoint documents using `RemoteSharePointKnowledgeSource`. This allows you to query live SharePoint content without needing to index it first.

**Key concepts to remember:**
- `RemoteSharePointKnowledgeSource` queries SharePoint documents in real time
- SharePoint queries require an identity token (`x_ms_query_source_authorization`)
- The knowledge base handles authentication and document access automatically
- Citations include SharePoint file paths, not index references

### What's Next?

➡️ Continue to [Part 4: Web Knowledge Source](part4-web-knowledge-source.ipynb) to learn how to query public web URLs alongside your internal data sources.