In [1]:
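import os
from getpass import getpass

from langchain_openai import ChatOpenAI

# use the key from the environment if it is set, otherwise prompt for it
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass(
    "Enter your OpenAI API key: "
)

chat = ChatOpenAI(
    openai_api_key=os.environ["OPENAI_API_KEY"],
    model='gpt-3.5-turbo-0125'
)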

In [3]:
# message classes live in langchain_core; langchain.schema is a legacy import path
from langchain_core.messages import (
    SystemMessage,
    HumanMessage,
    AIMessage
)

messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="Hi AI, how are you today?"),
    AIMessage(content="I'm great thank you. How can I help you?"),
    HumanMessage(content="I'd like to understand string theory.")
]

In [None]:
# invoke() replaces the deprecated direct call chat(messages)
res = chat.invoke(messages)



AIMessage(content='String theory is a theoretical framework in physics that attempts to reconcile quantum mechanics and general relativity. It proposes that the fundamental building blocks of the universe are not point-like particles, but rather tiny, vibrating strings.\n\nThese strings can oscillate at different frequencies, giving rise to different particles and forces in the universe. The theory suggests that there are additional spatial dimensions beyond the three spatial dimensions we are familiar with.\n\nString theory has the potential to describe all fundamental forces and particles in a unified framework, but it is still a work in progress and has not yet been experimentally confirmed.\n\nIs there a specific aspect of string theory that you would like to know more about?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 132, 'prompt_tokens': 53, 'total_tokens': 185, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'aud

In [None]:
print(res.content)

String theory is a theoretical framework in physics that attempts to reconcile quantum mechanics and general relativity. It proposes that the fundamental building blocks of the universe are not point-like particles, but rather tiny, vibrating strings.

These strings can oscillate at different frequencies, giving rise to different particles and forces in the universe. The theory suggests that there are additional spatial dimensions beyond the three spatial dimensions we are familiar with.

String theory has the potential to describe all fundamental forces and particles in a unified framework, but it is still a work in progress and has not yet been experimentally confirmed.

Is there a specific aspect of string theory that you would like to know more about?


In [8]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="Why do physicists believe it can produce a 'unified theory'?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

print(res.content)

Physicists are interested in string theory as a candidate for a unified theory because it has the potential to bring together all the fundamental forces of nature (gravity, electromagnetism, the weak nuclear force, and the strong nuclear force) into a single, coherent framework.

In traditional quantum field theory, each force is described by its own set of particles and equations, making it difficult to reconcile these different theories and understand how they are related. String theory, on the other hand, proposes that all forces and particles arise from the vibrations of a single fundamental entity: strings.

This unified perspective suggests that all the fundamental forces are manifestations of a single underlying framework, which could provide a more complete understanding of the laws of physics and potentially resolve some of the outstanding questions in theoretical physics, such as the nature of dark matter and dark energy, the unification of quantum mechanics and general relat
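
The append-then-send pattern in the cell above repeats for every turn, so it could be wrapped in a small helper. The following is a minimal sketch and not part of the original notebook: `ask_followup` is a hypothetical name, and it assumes only that `chat` exposes an `invoke(messages)` method (as `ChatOpenAI` does) and that `messages` is a plain Python list. Unlike the cells here, it records the reply in the history immediately instead of at the start of the next turn.

```python
def ask_followup(chat, messages, prompt):
    """Send `prompt` together with the running history and record the reply.

    Hypothetical helper: `chat` is any object with an `invoke(messages)`
    method; `prompt` is a message object such as HumanMessage.
    """
    messages.append(prompt)       # add the new user turn to the history
    res = chat.invoke(messages)   # the model receives the whole history
    messages.append(res)          # keep the reply for the next turn
    return res
```

With the objects defined earlier, a turn would then look like `res = ask_followup(chat, messages, HumanMessage(content="..."))`. Because the model is stateless between calls, the growing `messages` list is the only thing that carries the conversation forward.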

### If you ask the ChatGPT model about something released after its training cutoff, for example DeepSeek R1, it cannot answer, because it knows nothing beyond its training data.

In [9]:
# add latest AI response to messages
messages.append(res)

# now create a new user prompt
prompt = HumanMessage(
    content="What is so special about Deepseek R1?"
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [12]:
print(res.content)

I'm sorry, but I'm not familiar with "Deepseek R1." It's possible that it could be a product, technology, or concept that is not widely known or recognized. If you can provide more context or details about what Deepseek R1 is, I may be able to offer more information or insights.


We can feed additional information to the LLM, together with instructions that tell the model how to use this information alongside our original question.

In [13]:
source_knowledge = (
    "We introduce our first-generation reasoning models, DeepSeek-R1-Zero and "
    "DeepSeek-R1. DeepSeek-R1-Zero, a model trained via large-scale "
    "reinforcement learning (RL) without supervised fine-tuning (SFT) as a "
    "preliminary step, demonstrates remarkable reasoning capabilities. Through "
    "RL, DeepSeek-R1-Zero naturally emerges with numerous powerful and "
    "intriguing reasoning behaviors. However, it encounters challenges such as "
    "poor readability, and language mixing. To address these issues and "
    "further enhance reasoning performance, we introduce DeepSeek-R1, which "
    "incorporates multi-stage training and cold-start data before RL. "
    "DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on "
    "reasoning tasks. To support the research community, we open-source "
    "DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, "
    "32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama."
)

In [14]:
query = "What is so special about Deepseek R1?"

augmented_prompt = f"""Using the contexts below, answer the query.

Contexts:
{source_knowledge}

Query: {query}"""
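
When there is more than one piece of context, the same template can be filled by a small function. This is a sketch under stated assumptions rather than the notebook's own code: `build_augmented_prompt` is a hypothetical name, and separating contexts with a blank line is an arbitrary choice. With a single context it produces the same string as the f-string above.

```python
def build_augmented_prompt(query, contexts):
    """Fill the prompt template with one or more context strings.

    Hypothetical helper: `contexts` is a list of strings; empty or
    whitespace-only entries are skipped.
    """
    # join the non-empty contexts, separated by a blank line (assumed layout)
    context_block = "\n\n".join(c.strip() for c in contexts if c.strip())
    return (
        "Using the contexts below, answer the query.\n\n"
        f"Contexts:\n{context_block}\n\n"
        f"Query: {query}"
    )
```

For the example here, `build_augmented_prompt(query, [source_knowledge])` would give the same text as `augmented_prompt`.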

In [15]:
# create a new user prompt
prompt = HumanMessage(
    content=augmented_prompt
)
# add to messages
messages.append(prompt)

# send to OpenAI
res = chat.invoke(messages)

In [16]:
print(res.content)

DeepSeek R1 is a special reasoning model that has been developed with advanced training techniques and capabilities. It is an improvement upon the initial model, DeepSeek R1-Zero, which was trained using reinforcement learning without supervised fine-tuning. Although DeepSeek R1-Zero showed impressive reasoning abilities, it faced challenges such as poor readability and language mixing. 

To overcome these issues and enhance its reasoning performance, DeepSeek R1 incorporates multi-stage training and cold-start data before reinforcement learning. As a result, DeepSeek R1 is able to achieve performance comparable to OpenAI-o1-1217 on reasoning tasks. Additionally, DeepSeek R1 has been open-sourced along with six dense models of varying sizes, which have been distilled from DeepSeek R1. 

Overall, what makes DeepSeek R1 special is its advanced training methodologies, improved reasoning capabilities, and its contribution to the research community through open-sourcing the model and relate

### Importing the Data

In this step, we import our data using the Hugging Face Datasets library. Specifically, we load the "jamescalam/deepseek-r1-paper-chunked" dataset, which contains the DeepSeek R1 paper pre-processed into RAG-ready chunks.

In [17]:
from datasets import load_dataset

dataset = load_dataset(
    "jamescalam/deepseek-r1-paper-chunked",
    split="train"
)

dataset

Generating train split: 100%|██████████| 76/76 [00:00<00:00, 8317.04 examples/s]


Dataset({
    features: ['doi', 'chunk-id', 'chunk', 'num_tokens', 'pages', 'source'],
    num_rows: 76
})
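
Rows with these features could be turned into a `source_knowledge`-style string by concatenating their `chunk` text. The sketch below is an illustration only: `chunks_to_context` is a hypothetical name, the sample rows are made-up placeholders (not real dataset content), and it assumes `num_tokens` holds the token count of each `chunk`, as the feature name suggests. It simply takes chunks in order until a token budget is reached; a real RAG pipeline would normally choose chunks by relevance to the query.

```python
def chunks_to_context(rows, max_tokens=500):
    """Concatenate chunk texts, in order, while staying within a token budget.

    Hypothetical helper: each row is a dict with at least the keys
    "chunk" (text) and "num_tokens" (assumed token count of that text).
    """
    selected, used = [], 0
    for row in rows:
        # stop as soon as the next chunk would exceed the budget
        if used + row["num_tokens"] > max_tokens:
            break
        selected.append(row["chunk"])
        used += row["num_tokens"]
    return "\n\n".join(selected)


# placeholder rows that mimic the dataset's feature names
sample_rows = [
    {"chunk-id": 0, "chunk": "placeholder text A", "num_tokens": 300},
    {"chunk-id": 1, "chunk": "placeholder text B", "num_tokens": 150},
    {"chunk-id": 2, "chunk": "placeholder text C", "num_tokens": 200},
]

context = chunks_to_context(sample_rows, max_tokens=500)
```

Iterating over the loaded `dataset` yields one dict per row, so `chunks_to_context(dataset)` should work the same way if the assumption about `num_tokens` holds.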