# DeepSeek Dynamic Threshold Application
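This notebook applies dynamic threshold selection: for each question, it picks the prompt built at the highest retrieval threshold (0.9 down to 0.5) whose context is non-empty, then generates answers with the DeepSeek API.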

In [1]:
import pandas as pd
import os
import time
import re

# Load threshold CSV files
threshold1 = pd.read_csv('sythetic_testing/RAG4_0.5.csv')
threshold2 = pd.read_csv('sythetic_testing/RAG4_0.6.csv')
threshold3 = pd.read_csv('sythetic_testing/RAG4_0.7.csv')
threshold4 = pd.read_csv('sythetic_testing/RAG4_0.8.csv')
threshold5 = pd.read_csv('sythetic_testing/RAG4_0.9.csv')

In [2]:
# Function to check if the CONTEXT is empty
def is_context_empty(cell_text):
    match = re.search(r"### CONTEXT:\s*(\[\s*\])", cell_text)
    return bool(match)  # True if context is exactly "[]", False otherwise

# Add context-empty flags to each threshold DataFrame
threshold5['Context_Empty_0.9'] = threshold5['Prompt'].apply(is_context_empty)
threshold4['Context_Empty_0.8'] = threshold4['Prompt'].apply(is_context_empty)
threshold3['Context_Empty_0.7'] = threshold3['Prompt'].apply(is_context_empty)
threshold2['Context_Empty_0.6'] = threshold2['Prompt'].apply(is_context_empty)
threshold1['Context_Empty_0.5'] = threshold1['Prompt'].apply(is_context_empty)
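
As a quick sanity check of the pattern above, the sketch below runs the same regular expression on two made-up prompt strings. The sample prompts are assumptions for illustration only; the real `Prompt` column may be laid out differently.

```python
import re

def is_context_empty(cell_text):
    # Same pattern as the notebook cell: "### CONTEXT:" followed by an empty list
    match = re.search(r"### CONTEXT:\s*(\[\s*\])", cell_text)
    return bool(match)

# Hypothetical prompts, not taken from the CSV files
empty_prompt = "### QUESTION: How do I set up grunt?\n### CONTEXT:\n[]"
filled_prompt = "### QUESTION: How do I set up grunt?\n### CONTEXT: ['Install grunt-cli first.']"

empty_flag = is_context_empty(empty_prompt)
filled_flag = is_context_empty(filled_prompt)
print(empty_flag, filled_flag)
```

Whitespace inside the brackets (for example `[ ]`) is also treated as empty, because the pattern allows `\s*` between them.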

In [5]:
threshold1['Context_Empty_0.5'].value_counts()

Context_Empty_0.5
False    385
Name: count, dtype: int64

In [6]:
# Aggregate prompts and context flags into a single DataFrame
dataset_generation = threshold1[['Question']].copy()
dataset_generation['Prompt_0.5'] = threshold1['Prompt']
dataset_generation['Prompt_0.6'] = threshold2['Prompt']
dataset_generation['Prompt_0.7'] = threshold3['Prompt']
dataset_generation['Prompt_0.8'] = threshold4['Prompt']
dataset_generation['Prompt_0.9'] = threshold5['Prompt']
dataset_generation['Context_Empty_0.5'] = threshold1['Context_Empty_0.5']
dataset_generation['Context_Empty_0.6'] = threshold2['Context_Empty_0.6']
dataset_generation['Context_Empty_0.7'] = threshold3['Context_Empty_0.7']
dataset_generation['Context_Empty_0.8'] = threshold4['Context_Empty_0.8']
dataset_generation['Context_Empty_0.9'] = threshold5['Context_Empty_0.9']

In [7]:
# Select the final prompt for each row: use the highest threshold whose context is non-empty, falling back to 0.5
def choose_final_Prompt(row):
    if not row['Context_Empty_0.9']:
        return row['Prompt_0.9']
    elif not row['Context_Empty_0.8']:
        return row['Prompt_0.8']
    elif not row['Context_Empty_0.7']:
        return row['Prompt_0.7']
    elif not row['Context_Empty_0.6']:
        return row['Prompt_0.6']
    else:
        return row['Prompt_0.5']
dataset_generation['final_prompt'] = dataset_generation.apply(choose_final_Prompt, axis=1)
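
The if/elif chain above can also be written as a loop over thresholds, which may be easier to extend if more thresholds are added. This is a minimal sketch, not the notebook's implementation: `choose_prompt` and `THRESHOLDS` are hypothetical names, and the example row is made up.

```python
# Thresholds ordered from strictest to most permissive (hypothetical constant)
THRESHOLDS = ("0.9", "0.8", "0.7", "0.6", "0.5")

def choose_prompt(row, thresholds=THRESHOLDS):
    """Return the prompt for the highest threshold whose context is non-empty."""
    for t in thresholds[:-1]:
        if not row[f"Context_Empty_{t}"]:
            return row[f"Prompt_{t}"]
    # The last threshold is the unconditional fallback, as in the original chain
    return row[f"Prompt_{thresholds[-1]}"]

# Made-up row: contexts at 0.9 and 0.8 are empty, 0.7 is the first usable one
example_row = {f"Prompt_{t}": f"prompt@{t}" for t in THRESHOLDS}
example_row.update({f"Context_Empty_{t}": t in ("0.9", "0.8") for t in THRESHOLDS})

chosen = choose_prompt(example_row)
print(chosen)
```

Because the function only indexes `row` by column name, it should work both on a plain dict and on a row passed by `dataset_generation.apply(choose_prompt, axis=1)`.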

In [None]:
import requests

DEEPSEEK_API_URL = ""
DEEPSEEK_MODEL = "deepseek-r1:70b"
DRY_RUN = False  # Set to True to skip real API calls and return placeholder answers

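class DeepSeekClient:
    def __init__(self, api_url, model):
        self.api_url = api_url
        self.model = model

    def generate(self, prompt, max_tokens=512, temperature=0.7, top_p=0.9):
        # NOTE: max_tokens, temperature, and top_p are accepted but never sent
        # in the request, so the server's default sampling settings apply.
        if DRY_RUN:
            return f"[DRY_RUN] Answer for: {prompt}"
        data = {
            "model": self.model,
            "prompt": prompt,
            "stream": False
        }
        response = requests.post(self.api_url, json=data)
        response.raise_for_status()  # Fail loudly on HTTP errors instead of storing an empty answer
        return response.json().get("response", "")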

client = DeepSeekClient(DEEPSEEK_API_URL, DEEPSEEK_MODEL)
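
The `generate` signature accepts sampling parameters, but the request body does not carry them. The sketch below shows one way they could be forwarded, assuming the endpoint is an Ollama-style `/api/generate` server (the `deepseek-r1:70b` model tag and the `response` field suggest this, but the notebook leaves the URL blank, so it is only an assumption). `build_generate_payload` is a hypothetical helper, not part of the notebook.

```python
def build_generate_payload(model, prompt, max_tokens=512, temperature=0.7, top_p=0.9):
    """Build a non-streaming request body (Ollama-style fields assumed)."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        # Assumption: the server reads sampling settings from an "options" object,
        # with "num_predict" as the cap on generated tokens.
        "options": {
            "num_predict": max_tokens,
            "temperature": temperature,
            "top_p": top_p,
        },
    }

payload = build_generate_payload("deepseek-r1:70b", "What is RAG?")
print(payload["options"])
```

If the server follows a different schema, the field names inside `options` would need to change accordingly.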

In [11]:
# Generate DeepSeek responses for each final prompt
from tqdm import tqdm
response_list = []
for prompt in tqdm(dataset_generation['final_prompt'], desc='DeepSeek Generating'):
    answer = client.generate(prompt)
    response_list.append(answer)
dataset_generation['deepseek_output'] = response_list

DeepSeek Generating: 100%|██████████| 385/385 [9:54:29<00:00, 92.65s/it]    
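
The run above took almost ten hours, and the loop keeps every answer in memory until the end, so a crash would lose all progress. One possible mitigation is to append each answer to a checkpoint file and resume from it. This is a minimal sketch under stated assumptions: `generate_with_checkpoint` and the JSON Lines checkpoint format are hypothetical, and `generate_fn` stands in for `client.generate`.

```python
import json
import os

def generate_with_checkpoint(prompts, generate_fn, checkpoint_path):
    """Generate answers one by one, appending each to a JSON Lines file
    so that an interrupted run can resume where it stopped."""
    answers = []
    # Reload answers from a previous (possibly interrupted) run
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path, encoding="utf-8") as fh:
            answers = [json.loads(line)["answer"] for line in fh if line.strip()]
    with open(checkpoint_path, "a", encoding="utf-8") as fh:
        # Skip prompts that already have a stored answer
        for index in range(len(answers), len(prompts)):
            answer = generate_fn(prompts[index])
            fh.write(json.dumps({"index": index, "answer": answer}) + "\n")
            fh.flush()  # Push each answer to disk before the next slow call
            answers.append(answer)
    return answers
```

In the notebook this could be called as `generate_with_checkpoint(dataset_generation['final_prompt'].tolist(), client.generate, 'checkpoint.jsonl')`, where the file name is only an example. The sketch assumes the prompt list and its order do not change between runs.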


In [14]:
print(response_list[0])

<think>
Okay, so I'm trying to set up grunt-browser-sync in my Cloud9 development environment. I've been having some trouble getting it running properly. Let me go through the steps and see where I might be going wrong.

First, I remember that I need to install the necessary npm packages. Grunt itself is required, along with browser-sync and any other plugins I might need like grunt-contrib-uglify for minifying JavaScript files. So, I'll start by installing them using npm:

npm install grunt grunt-browser-sync grunt-contrib-uglify --save-dev

Next, I need to make sure my package.json file includes these as devDependencies. That way, anyone else working on the project or when deploying, they can run npm install and get all the necessary packages.

Now, onto the Gruntfile. I think I might have made a mistake in setting up the tasks. Let me check my current Gruntfile.js. Oh, wait, I see that I only registered 'concat' and 'uglify' as default tasks but didn't include browser-sync. That's p
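
The printed output starts with a `<think>` block, so the stored `deepseek_output` column contains the model's reasoning as well as its answer. If only the final answer is needed downstream, the reasoning could be stripped first. The sketch below is one way to do that; `strip_reasoning` is a hypothetical helper, and it assumes the reasoning is wrapped in `<think>` and `</think>` tags.

```python
import re

# Matches a complete reasoning block, including newlines inside it
THINK_BLOCK = re.compile(r"<think>.*?</think>", flags=re.DOTALL)

def strip_reasoning(text):
    """Remove <think>...</think> blocks and return the remaining answer text."""
    cleaned = THINK_BLOCK.sub("", text)
    # If a block was cut off before its closing tag, drop everything from <think> on
    open_idx = cleaned.find("<think>")
    if open_idx != -1:
        cleaned = cleaned[:open_idx]
    return cleaned.strip()

sample = "<think>\nLet me check the Gruntfile first.\n</think>\n\nRegister browserSync as a task."
answer_only = strip_reasoning(sample)
print(answer_only)
```

An output whose reasoning was truncated before the closing tag yields an empty string, which makes such rows easy to spot.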

In [12]:
# Save results to CSV
output_path = 'dynamic_RAG4_deepseek_results.csv'
dataset_generation.to_csv(output_path, index=False)
print(f"Saved to {output_path}")

Saved to dynamic_RAG4_deepseek_results.csv
