## Working with DeepSeek V3 - Mixture of Experts ##

This notebook contains code that integrates LangChain with the Azure AI Inference SDK.

Requirements:
- Azure Subscription
- DeepSeek V3 deployed via Azure AI Foundry
- Python 3.11 or higher

In [None]:
%pip install -r requirements.txt

In [None]:
import os
from azure.core.credentials import AzureKeyCredential
from langchain_azure_ai.chat_models import AzureAIChatCompletionsModel
from dotenv import load_dotenv
from langchain.prompts import PromptTemplate
from langchain_community.retrievers import ArxivRetriever

# Load environment variables
load_dotenv()

# Get the environment variables
endpoint = os.getenv("AZURE_AI_ENDPOINT")  # Azure AI endpoint (the chat completions endpoint)
model_name = os.getenv("AZURE_DEPLOYMENT")  # Name of the deployed model
key = os.getenv("AZURE_AI_KEY")  # Azure AI API key
region = os.getenv("REGION")  # Region the model is deployed in

wrapper_key = AzureKeyCredential(key)  # Wrap the key in an AzureKeyCredential


# Initialize the Azure AI Chat Completions Model
model = AzureAIChatCompletionsModel(
    endpoint=endpoint,
    credential=wrapper_key,
    model_name=model_name,
    max_tokens=2048
)

# Initialize the Arxiv Retriever
arxiv_r = ArxivRetriever(
    load_max_docs=3,  # Maximum number of papers to load
    get_full_documents=True  # Load each paper's full text instead of only its summary
)

# Prompt template
prompt_template = PromptTemplate(
    input_variables = ["query", "relevant_info"],
    template = """
    You are an expert researcher across a variety of topics, specializing in AI security. Use the following information from arXiv to answer the user's question. If there is not sufficient information, say 'I need more information to answer this question'.

    Question: {query}

    Relevant Information:
    {relevant_info}

    Answer:
    """
)
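The setup cell above reads four environment variables, and `AzureKeyCredential` cannot be built if the key is missing. A minimal sketch of a pre-flight check is shown below. The helper name `missing_env_vars` is hypothetical (it is not part of LangChain or the Azure SDK), and the variable names are the ones used in this notebook.

```python
import os

# Variable names taken from the setup cell of this notebook.
REQUIRED_VARS = ["AZURE_AI_ENDPOINT", "AZURE_DEPLOYMENT", "AZURE_AI_KEY", "REGION"]


def missing_env_vars(names, environ=None):
    """Return the names that are unset or empty in the given environment mapping."""
    env = os.environ if environ is None else environ
    return [name for name in names if not env.get(name)]


missing = missing_env_vars(REQUIRED_VARS)
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

For the check to see values from a `.env` file, it should run after `load_dotenv()` has been called.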

## Define our chain ##
This step combines two components, `prompt_template` and `model` (DeepSeek V3), into a single chain.
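To make the idea of piping a prompt into a model concrete, here is a toy sketch of `|` composition in plain Python. It is an illustration only and not LangChain's implementation: the `Step` class, `toy_prompt`, and `toy_model` are hypothetical names, and the "model" just reports the prompt length instead of calling DeepSeek V3.

```python
class Step:
    """Toy stand-in for a chain component; NOT LangChain's Runnable class."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # `a | b` builds a new step that runs a, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Hypothetical prompt step: fills a template from a dict of variables.
toy_prompt = Step(
    lambda variables: "Question: {query}\nContext: {relevant_info}".format(**variables)
)
# Hypothetical model step: reports the prompt length instead of calling an LLM.
toy_model = Step(lambda prompt: f"(model received {len(prompt)} characters)")

toy_chain = toy_prompt | toy_model
print(toy_chain.invoke({"query": "What is MoE?", "relevant_info": "Mixture of Experts notes"}))
```

The real chain follows the same shape: the dictionary passed to `invoke` fills the prompt's placeholders, and the formatted prompt is then handed to the model.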

In [15]:
# Define the chain
chain = prompt_template | model

In [21]:
# Query to ask the model
query = "Summarize the existing challenges in AI Security?"

In [22]:
# Analyze the research papers

try:
    papers = arxiv_r.invoke(query) # Papers retrieved from the Arxiv Retriever
    relevant_info = "\n".join([
        f"Title: {paper.metadata.get('title', 'No title')}\nAbstract: {paper.page_content}" 
        for paper in papers
    ]) # Relevant information from the papers
    response = chain.invoke({"query": query, "relevant_info": relevant_info}) # Response from the model
    print("Response", response.content)
    print("-----" * 20)
    # Usage stats from our query
    print("Usage of query")
    print("\tPrompt Tokens:", response.usage_metadata["input_tokens"])
    print("\tCompletion Tokens:", response.usage_metadata["output_tokens"])
    print("\tTotal Tokens:", response.usage_metadata["total_tokens"])
    print("-----" * 20)

    # Cost rates per 1K Tokens (in USD)
    INPUT_COST_PER_1K = 0.00114
    OUTPUT_COST_PER_1K = 0.00456


    # Estimated cost of the query
    input_cost = (response.usage_metadata["input_tokens"] / 1000) * INPUT_COST_PER_1K
    output_cost = (response.usage_metadata["output_tokens"] / 1000) * OUTPUT_COST_PER_1K
    total_cost = input_cost + output_cost

    # Print cost with proper formatting
    print("Estimated Cost")
    print(f"\tInput cost ({INPUT_COST_PER_1K:.5f}/1K tokens): ${input_cost:.5f}")
    print(f"\tOutput cost ({OUTPUT_COST_PER_1K:.5f}/1K tokens): ${output_cost:.5f}")
    print(f"\tTotal cost: ${total_cost:.5f}")
    print("-----" * 20)
except Exception as e:
    print(f"Error processing query: {e}")
    import traceback
    traceback.print_exc()

Response The existing challenges in AI Security, as highlighted by the provided arXiv abstracts, can be summarized into the following key areas:

1. **AI Resilience and Human-AI Interaction**:  
   - AI systems can make objective errors or produce contextually inappropriate outputs, which users may not easily notice. For example, in text summarization, critical details might be omitted, and users may not detect these omissions if they rely solely on the summary without reviewing the original document.  
   - Judging AI choices is challenging due to limited contextual information provided by interfaces, leading users to rely on assumptions when deciding whether to dismiss or modify AI outputs.  
   - Designing **AI-resilient interfaces** that help users notice, judge, and recover from AI errors is critical for improving AI safety, usability, and utility, especially in open-ended tasks like summarization, ideation, and sensemaking.

2. **Brittleness and Opacity of AI Systems**:  
   - De
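The cost arithmetic in the cell above is repeated for every query in this notebook, so it can be factored into a small helper. The sketch below assumes the same per-1K-token rates used in that cell. The function name `estimate_cost` is hypothetical, and the token counts in the example are made up rather than taken from a real run.

```python
# Rates copied from the cell above (USD per 1K tokens); check current pricing before relying on them.
INPUT_COST_PER_1K = 0.00114
OUTPUT_COST_PER_1K = 0.00456


def estimate_cost(input_tokens, output_tokens,
                  input_rate=INPUT_COST_PER_1K, output_rate=OUTPUT_COST_PER_1K):
    """Return (input_cost, output_cost, total_cost) in USD for one query."""
    input_cost = (input_tokens / 1000) * input_rate
    output_cost = (output_tokens / 1000) * output_rate
    return input_cost, output_cost, input_cost + output_cost


# Example with made-up token counts (not taken from a real run).
input_cost, output_cost, total_cost = estimate_cost(2000, 500)
print(f"Input cost: ${input_cost:.5f}")    # Input cost: $0.00228
print(f"Output cost: ${output_cost:.5f}")  # Output cost: $0.00228
print(f"Total cost: ${total_cost:.5f}")    # Total cost: $0.00456
```

With a real response, the call would be `estimate_cost(response.usage_metadata["input_tokens"], response.usage_metadata["output_tokens"])`.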

In [23]:
financial_template = PromptTemplate(
    input_variables = ["background_info"],  # Must match the {background_info} placeholder in the template
    template = """
    You are an experienced financial advisor who helps hedge fund clients and individual clients with investment decisions. Use the following background information to answer the user's inquiry on the best financial path, and give advice on building a diverse portfolio. Use principled investing strategies to provide the best advice.

    {background_info}


    Provide your advice on the best financial path for the client and on a diverse portfolio, with your reasoning. Include specific best-case and worst-case scenarios over a 10-year horizon.
    """
)

In [24]:
chain = financial_template | model

In [25]:
# Background information
background_info = """
Client Age: 24
Client Income: 250,000
Client Savings: 100,000
Client Debt: 50,000
Client Goals: Aggressive growth with a time horizon of 10 years and a risk tolerance of 7/10; minimize long-term tax risk with a Roth IRA/Backdoor Roth IRA
Client Investment Knowledge: Intermediate
 """

In [26]:
# Put it all together

# Pass the input as a dict keyed by the template's placeholder name
response = chain.invoke({"background_info": background_info})

print("Response", response.content)
print("-----" * 20)
print("Usage of query")
print("\tPrompt Tokens:", response.usage_metadata["input_tokens"])
print("\tCompletion Tokens:", response.usage_metadata["output_tokens"])
print("\tTotal Tokens:", response.usage_metadata["total_tokens"])
print("-----" * 20)

# Cost rates per 1K Tokens (in USD)
INPUT_COST_PER_1K = 0.00114
OUTPUT_COST_PER_1K = 0.00456


# Estimated cost of the query
input_cost = (response.usage_metadata["input_tokens"] / 1000) * INPUT_COST_PER_1K
output_cost = (response.usage_metadata["output_tokens"] / 1000) * OUTPUT_COST_PER_1K
total_cost = input_cost + output_cost

# Print cost with proper formatting
print("Estimated Cost")
print(f"\tInput cost ({INPUT_COST_PER_1K:.5f}/1K tokens): ${input_cost:.5f}")
print(f"\tOutput cost ({OUTPUT_COST_PER_1K:.5f}/1K tokens): ${output_cost:.5f}")
print(f"\tTotal cost: ${total_cost:.5f}")
print("-----" * 20)

Response ### Financial Path and Portfolio Advice for the Client

Given the client's age, income, savings, goals, and risk tolerance, here’s a tailored financial plan and investment strategy:

---

### **1. Financial Priorities**
- **Pay Off Debt:** The client has $50,000 in debt. While the interest rate isn’t specified, it’s generally advisable to pay off high-interest debt (e.g., credit cards, personal loans) before aggressively investing. If the debt is low-interest (e.g., student loans), the client can prioritize investing while making minimum payments.
- **Emergency Fund:** Ensure the client has 3-6 months of living expenses (approximately $30,000-$60,000) in a high-yield savings account or money market fund.
- **Tax-Advantaged Accounts:** Maximize contributions to a Roth IRA or Backdoor Roth IRA ($6,500 annually as of 2023) to minimize long-term tax risk. Since the client has a high income, a Backdoor Roth IRA is likely the best option.
- **Employer Retirement Plans:** If availabl