## Introduction
In this tutorial, you will learn what the OpenAI Assistants API is and how to extend its capabilities with LangChain.

## Table of contents
>- What is OpenAI assistant?
>- Leveraging LangChain to utilize OpenAI Assistant capabilities
>- Combining OpenAI Assistant and LangChain tools to extend the assistant's capabilities

## What is OpenAI assistant?
The Assistants API enables the creation of AI assistants directly within your applications. These assistants are equipped with instructions and can utilize various models, tools, and knowledge to provide responses to user inquiries. Currently, the Assistants API supports three categories of tools: 
1. Code Interpreter
2. Retrieval
3. Function Calling

For more details, see the [LangChain](https://python.langchain.com/docs/modules/agents/agent_types/openai_assistants) and [OpenAI Assistants](https://platform.openai.com/docs/assistants/overview) documentation.
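
As a rough illustration of the three tool categories above, the sketch below builds the kind of `tools` list that is passed when an assistant is created. The `search_papers` function and its `topic` parameter are hypothetical and exist only to show the shape of a Function Calling entry; the `retrieval` type name matches the API version used in this tutorial.

```python
# Tool specifications in the shape passed as the `tools` parameter of an assistant.
code_interpreter_tool = {"type": "code_interpreter"}
retrieval_tool = {"type": "retrieval"}

# Function Calling needs a JSON Schema description of the function.
# `search_papers` and its parameters are hypothetical, for illustration only.
function_tool = {
    "type": "function",
    "function": {
        "name": "search_papers",
        "description": "Search the web for papers related to a topic.",
        "parameters": {
            "type": "object",
            "properties": {
                "topic": {"type": "string", "description": "Topic to search for"},
            },
            "required": ["topic"],
        },
    },
}

assistant_tools = [code_interpreter_tool, retrieval_tool, function_tool]
for tool in assistant_tools:
    print(tool["type"])
```

Only the Function Calling entry needs a schema; the other two are enabled by naming their type.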

### Project info
Let's create a research assistant! `ScholarlySphere` is a Document Analysis and Search assistant.
Users can upload PDF or text files containing documents or research papers. `ScholarlySphere` can then extract key information, such as keywords, topics, or main points, and use search engines to find relevant information to answer user questions based on the content of the uploaded files.


## Leveraging LangChain to utilize OpenAI Assistant capabilities
Let's craft our own AI assistant!🚀

In this step, we create an assistant that uses a search engine to find data relevant to the user's query.


In [68]:
# First we need to import some packages
import os
from dotenv import load_dotenv
from langchain.agents.openai_assistant import OpenAIAssistantRunnable # import OpenAIAssistantRunnable from langchain
from langchain.agents import AgentExecutor

from langchain.utilities import DuckDuckGoSearchAPIWrapper
from langchain.tools import DuckDuckGoSearchResults, DuckDuckGoSearchRun
from openai import OpenAI

In [46]:
# Load the OpenAI API key from a .env file into the environment
load_dotenv()
if not os.getenv('OPENAI_API_KEY'):
    raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")


In [77]:
# We need instructions, which define how the assistant and model should behave and respond.
instructions = """
    You are a scholarly expert, you have the skill to find academic papers similar to users' 
    provided papers and utilize online resources to find similar papers. Your expertise enables you
    to generate comprehensive outputs based on user desires.

"""

# The prompt describes the output we want
prompt = """
    Give structured JSON output with 5 references that explain GPT-3:
      title, author_names, content_summary are the fields.
"""


In [78]:
# To connect our agent to online resources, we need a search engine as a tool.
# Create the search tool
tools = [DuckDuckGoSearchRun()]

# Create the OpenAI assistant as a LangChain agent
agent = OpenAIAssistantRunnable.create_assistant(
    name="Document Analysis and Search assistant",
    instructions=instructions,
    tools=tools,
    model="gpt-4-1106-preview",
    as_agent=True
)

### With AgentExecutor
The `OpenAIAssistantRunnable` works well with `AgentExecutor`, so we can easily include it as an agent directly in the executor. The AgentExecutor handles running the tools we use and sending their results back to the Assistants API. It also includes LangSmith tracing features.
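
To make the executor's role more concrete, here is a simplified sketch of the loop it runs: call the agent, execute whatever tools it asks for, and send the results back until a final answer arrives. This is not LangChain's implementation. `ToolAction`, `Finish`, `run_tool_loop`, and the fake agent are hypothetical stand-ins so the example runs without an API key.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Union


@dataclass
class ToolAction:
    """Hypothetical stand-in for a tool call requested by the assistant."""
    tool: str
    tool_input: str
    tool_call_id: str


@dataclass
class Finish:
    """Hypothetical stand-in for the assistant's final answer."""
    output: str


AgentStep = Callable[[dict], Union[List[ToolAction], Finish]]


def run_tool_loop(agent_step: AgentStep, tools: Dict[str, Callable[[str], str]], content: str) -> str:
    """Call the agent, run any tools it requests, and feed the results back until it finishes."""
    response = agent_step({"content": content})
    while not isinstance(response, Finish):
        tool_outputs = [
            {"tool_call_id": action.tool_call_id, "output": tools[action.tool](action.tool_input)}
            for action in response
        ]
        response = agent_step({"tool_outputs": tool_outputs})
    return response.output


def fake_agent_step(payload: dict) -> Union[List[ToolAction], Finish]:
    """Fake agent: asks for one search, then answers from the tool output."""
    if "tool_outputs" in payload:
        return Finish(output="Answer based on: " + payload["tool_outputs"][0]["output"])
    return [ToolAction(tool="search", tool_input=payload["content"], tool_call_id="call_1")]


fake_tools = {"search": lambda query: f"results for {query!r}"}
result = run_tool_loop(fake_agent_step, fake_tools, "GPT-3")
print(result)
```

With a real assistant, the tool outputs are sent back together with the run and thread identifiers; `AgentExecutor` takes care of that bookkeeping.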

In [79]:
agent_executor = AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True)
output = agent_executor.invoke({"content": prompt})

In [80]:
print(output['output'])

Based on the search results, here are five references that provide an explanation of GPT-3. Below, I've structured the JSON output as requested with some of the details available from the description:

```json
[
  {
    "title": "Capabilities of GPT Series Models",
    "author_names": ["Not Provided"],
    "content_summary": "This paper analyzes the capabilities of various GPT series models, including GPT-3 and GPT-3.5. The study includes multiple iterations like davinci and text-davinci, highlighting the advances in language modeling that these versions represent."
  },
  {
    "title": "The Role of GPT Systems as Co-authors in Academic Paper Writing",
    "author_names": ["Not Provided"],
    "content_summary": "This study investigates the potential of a system to act as a co-author on an academic paper. The authors focus on the criteria set by the International Committee of Medical Journal Editors (ICMJE) and explore the impact of GPT models in aiding research work and writing proce
```

We developed an assistant capable of accessing real-time data through search tools. For additional tools, you can refer to the [LangChain Tools](https://python.langchain.com/docs/integrations/tools) documentation.
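
Because the assistant wraps its JSON in explanatory prose and a fenced block, a small helper can pull out the structured part before further processing. The sketch below is one possible approach. `extract_json_block` is a hypothetical helper, and the sample text is made up to mirror the shape of the output above.

```python
import json
import re

# Build the fence marker programmatically so this example can sit inside a fenced block itself.
FENCE = "`" * 3
JSON_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json_block(text):
    """Return the parsed content of the first fenced JSON block, or None if none parses."""
    match = JSON_BLOCK.search(text)
    if match is None:
        return None
    try:
        return json.loads(match.group(1))
    except json.JSONDecodeError:
        return None


# Made-up sample in the same shape as the assistant's reply
sample = (
    "Here are the references:\n"
    + FENCE + "json\n"
    + '[{"title": "Example title", "author_names": ["Not Provided"], '
    + '"content_summary": "Example summary"}]\n'
    + FENCE
)

references = extract_json_block(sample)
print(references[0]["title"])
```

Returning `None` on a missing or malformed block lets the caller decide whether to retry the prompt or fall back to the raw text.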


## Combining OpenAI Assistant and LangChain tool
Sometimes we already have data of our own and want the LLM to draw on uploaded files as well as real-time search results.

In this step, we extract the main points of a PDF file and use a search engine to find relevant information, so the assistant can answer user questions based on the content of the uploaded file.


In [81]:
# We need instructions, which define how the assistant and model should behave and respond.
instructions = """
    You are a scholarly expert, you have the skill to extract crucial details from academic papers provided 
    for you and utilize online resources to find similar papers. Your expertise enables you
    to generate comprehensive outputs based on user desires.

"""

# The prompts describe the output we want
file_prompt = """
    Write a detailed summary of the input paper, synthesizing its key findings, methodologies, and 
    implications, providing overview for reference and comprehension purposes."""


online_prompt = """
    Give structured JSON output for the similar papers that you found, containing: paper_title, authors_names, 
    publication_date, link_of_paper and the conference_or_journal_name where it was published.
"""

In [90]:
# First we need to create an OpenAI client instance
client = OpenAI()

# Upload a file with an "assistants" purpose
file = client.files.create(
  file=open("attention_is_all_you_need.pdf", "rb"),
  purpose='assistants'
)

# Add the file to the assistant
assistant = client.beta.assistants.create(
  name="Document Analysis and Search assistant",
  instructions=instructions,
  model="gpt-4-1106-preview",
  tools=[{"type": "retrieval"}],  # enable the Retrieval tool so the assistant can read the file
  file_ids=[file.id]
)

# create a thread
thread = client.beta.threads.create()



In [94]:
message = client.beta.threads.messages.create(
  thread_id=thread.id,
  role="user",
  content=file_prompt,
  file_ids=[file.id]
)

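import time

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# A run is asynchronous: poll until it leaves the queued/in_progress states
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(
        thread_id=thread.id,
        run_id=run.id
    )

# Fetch the messages in the thread once the run has finished
responses = client.beta.threads.messages.list(
    thread_id=thread.id
)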


In [104]:
# Iterate over the first two messages in the list
for message in responses.data[:2]:
    print(message.content[0].text.value)



    Write a detailed summary of the input paper, synthesizing its key findings, methodologies, and 
    implications, providing overview for reference and comprehension purposes.
The paper delves into the specifics of the Transformer model, explaining its core components and advantages. A significant innovation within the Transformer is the use of "Multi-Head Attention." This approach employs multiple attention functions in parallel to allow the model to capture information from different representation subspaces at various positions. Each 'head' focuses on a different part of the encoded information, enabling the model to pay attention to diverse parts of a sequence simultaneously【15†source】.

In addition, the Transformer features "Position-wise Feed-Forward Networks" applied identically to each position, consisting of two linear transformations with a ReLU activation in between. These networks contribute to learning local patterns within the sequence data【16†source】.

A key element 
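
The Retrieval tool annotates its answer with citation markers such as `【15†source】`. If plain text is needed for display, one option is to strip them with a regular expression. The sketch below assumes the markers always follow the pattern seen in the output above; `strip_citation_markers` is a hypothetical helper, and the sample sentence is made up.

```python
import re

# Matches markers such as 【15†source】 (pattern assumed from the output shown above)
CITATION_MARKER = re.compile(r"【\d+†[^】]*】")


def strip_citation_markers(text: str) -> str:
    """Remove retrieval citation markers from a message for plain-text display."""
    return CITATION_MARKER.sub("", text)


sample = "Each head attends to a different part of the sequence【15†source】."
cleaned = strip_citation_markers(sample)
print(cleaned)
```

Stripping is only suitable for display. The message's annotations are the more reliable place to look when the cited source itself matters.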

To use an existing assistant, we can initialize `OpenAIAssistantRunnable` directly with its `assistant_id`.

In [105]:
# To connect our agent to online resources, we need a search engine as a tool.
# Create the search tool, limited to results from the past month
wrapper = DuckDuckGoSearchAPIWrapper(time="m")
tools = [DuckDuckGoSearchResults(api_wrapper=wrapper, source="scholar")]

# Create a LangChain agent from the existing OpenAI assistant
agent = OpenAIAssistantRunnable(
    tools=tools,
    assistant_id=assistant.id,
    model="gpt-4-1106-preview",
    as_agent=True
)


In [106]:
agent_executor = AgentExecutor(agent=agent, tools=tools, return_intermediate_steps=True)
output = agent_executor.invoke({"content": online_prompt})

In [107]:
print(output['output'])

To assist you, I will need to first review the content of the uploaded academic paper to understand its topic, scope, and specifics. After assessing the content, I can then utilize online resources to locate similar papers and provide the structured JSON information you've requested.

I will now examine the uploaded document to understand its subject matter. Please bear with me for a moment as I access the file.
The paper provided is "Attention Is All You Need," which details the development of the Transformer model architecture for sequence transduction models (e.g., machine translation). Here are the details extracted from the paper:

- **Paper Title**: Attention Is All You Need
- **Authors' Names**: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, Illia Polosukhin
- **Publication Date**: 2 August 2023
- **Conference or Journal Name**: 31st Conference on Neural Information Processing Systems (NIPS 2017)
- **Link of Paper**: The p

## Conclusion

In this tutorial, we explored the OpenAI Assistants API and used it together with LangChain to build `ScholarlySphere`, a document analysis and search assistant.

First, we introduced the Assistants API and the three tool categories it supports: Code Interpreter, Retrieval, and Function Calling.

Then, we created an assistant with LangChain's `OpenAIAssistantRunnable`, gave it a DuckDuckGo search tool, and ran it through `AgentExecutor` to find references online.

Next, we uploaded a PDF, used the Retrieval tool to summarize it, and connected the same assistant to LangChain so it could look for similar papers.

Together, these examples show how the Assistants API handles tasks from reading documents to searching the web, and how LangChain tools extend what an assistant can do.