# Building a Research LLM AI

Inspiration taken from [here](https://github.com/assafelovic/gpt-researcher)

We want to define two agents: (1) one that derives several objective questions (queries) from the user's question, to be run in parallel, and (2) one that searches the web for answers through URL requests. The URL searches done in parallel will be aggregated and summarized into a report. See below:

<center><img src="https://camo.githubusercontent.com/a1db65299ca6ec3b203e42d47b225c46f36f54009246fe2cdae8fd74c68ab8d5/68747470733a2f2f636f7772697465722d696d616765732e73332e616d617a6f6e6177732e636f6d2f6172636869746563747572652e706e67" width=30%></center>


In [1]:
import os
from langchain.chat_models import ChatOpenAI
from langchain.prompts import ChatPromptTemplate
from langchain.schema.output_parser import StrOutputParser
from langchain.schema.runnable import RunnablePassthrough, RunnableLambda
from langchain.utilities import DuckDuckGoSearchAPIWrapper
import requests
from bs4 import BeautifulSoup
import json
from IPython.display import display, Markdown, Latex

In [2]:
os.environ["OPENAI_API_KEY"] = ""

In [3]:
template = """Summarize the following question based on the context:

Question: {question}

Context:
{context}
"""

prompt = ChatPromptTemplate.from_template(template)

In [4]:
print(prompt)

input_variables=['context', 'question'] messages=[HumanMessagePromptTemplate(prompt=PromptTemplate(input_variables=['context', 'question'], template='Summarize the following question based on the context:\n\nQuestion: {question}\n\nContext:\n{context}\n'))]


In [3]:
## inspiration from https://gist.github.com/hwchase17/69a8cdef9b01760c244324339ab64f0c

def scrape_text(url: str):
    # Send a GET request to the webpage
    try:
        response = requests.get(url, timeout=10)  # timeout avoids hanging on unresponsive hosts

        # Check if the request was successful
        if response.status_code == 200:
            # Parse the content of the request with BeautifulSoup
            soup = BeautifulSoup(response.text, "html.parser")

            # Extract all text from the webpage
            page_text = soup.get_text(separator=" ", strip=True)

            # Return the extracted text
            return page_text
        else:
            return f"Failed to retrieve the webpage: Status code {response.status_code}"
    except Exception as e:
        print(e)
        return f"Failed to retrieve the webpage: {e}"

In [6]:
url = "https://blog.langchain.dev/announcing-langsmith/" # example url to play with

page_content = scrape_text(url)

In [7]:
print(len(page_content))
print(page_content)

12520
Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications Skip to content LangChain Blog Home By LangChain Release Notes GitHub Docs Case Studies Sign in Subscribe Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications 8 min read Jul 18, 2023 LangChain exists to make it as easy as possible to develop LLM-powered applications. We started with an open-source Python package when the main blocker for building LLM-powered applications was getting a simple prototype working. We remember seeing Nat Friedman tweet in late 2022 that there was “ not enough tinkering happening .” The LangChain open-source packages are aimed at addressing this and we see lots of tinkering happening now ( Nat agrees )–people are building everything from chatbots over internal company documents to an AI dungeon master for a Dungeons and Dragons game. The blocker has now changed. While it’s easy to

The models available from OpenAI are listed [here](https://platform.openai.com/docs/models/gpt-3-5)

In [8]:
model_name = "gpt-3.5-turbo-16k"

chain = prompt | ChatOpenAI(model=model_name) | StrOutputParser()

In [1]:
# print(chain)

In [9]:
chain.invoke(
    {
        "question":"what is langsmith?",
        "context": page_content[:10000]
    }
)

'The question is asking for a definition or explanation of what LangSmith is.'

Now we'll create a summary template and prompt

In [7]:
summary_template = """{text}

Using the above text, answer in short the following question

> {question}

---
if the question cannot be answered using the text, simply summarize the text. Include all factual information, numbers, stats, etc.
"""

summary_prompt = ChatPromptTemplate.from_template(summary_template)


In [None]:
model_name = "gpt-3.5-turbo-16k"

chain = summary_prompt | ChatOpenAI(model=model_name) | StrOutputParser()

chain.invoke(
    {
        "question":"what is langsmith",
        "text":page_content
    }
)

We'll modify the chain and add more syntax to make it more involved but at the same time more <b>streamlined</b> using <b>[RunnablePassthrough](https://nanonets.com/blog/langchain/)</b> calls

In [9]:
# Recall that scrape_text is the function we created and we are passing it through to be executed within the chain

chain = RunnablePassthrough.assign(
    text = lambda x: scrape_text(x["url"])[:10000]
) | summary_prompt |  ChatOpenAI(model=model_name) | StrOutputParser()

In [12]:
chain.invoke(
    {
        "question":"what is langsmith",
        "url":url
    }
)

'LangSmith is a unified platform that helps developers debug, test, evaluate, and monitor their LLM (large language models) applications. It aims to address the challenge of taking an LLM-powered application from prototype to production by providing tools for understanding model inputs and outputs, tracking token usage and costs, debugging latency issues, evaluating application performance, and monitoring user interactions. LangSmith has been tested and used by various companies and organizations to build and improve their LLM applications.'

Now we are going to use the DuckDuckGo search API to look up URLs and pass them through the chain we built

In [4]:
results_per_search = 3

ddg_search = DuckDuckGoSearchAPIWrapper()

def web_search(query:str, num_results:int = results_per_search):
    results = ddg_search.results(query, num_results)

    return [r["link"] for r in results]

Examples of what DDG returns. Notice the keys of each result: snippet, title and link


In [5]:
ddg_search.results("what is the one piece?", 3)

[{'snippet': 'Entertainment 17 December 2023 Global Link copied to clipboard Netflix today announced that the manga "ONE PIECE" is getting a new anime adaptation starting from the iconic East Blue saga, sending waves of excitement through the global anime community on the second day of Jump Festa 2024.',
  'title': "New Anime Series 'THE ONE PIECE' Starts Fresh Journey into the East ...",
  'link': 'https://about.netflix.com/en/news/new-anime-series-the-one-piece'},
 {'snippet': "Based on Eiichiro Oda 's bestselling manga series, THE ONE PIECE follows Monkey D. Luffy, an aspiring pirate who sets sail with his Straw Hats crew in search of the mysterious One Piece treasure. Luffy isn't just like any pirate, however.",
  'title': "'THE ONE PIECE' New Anime Series Announced - Netflix Tudum",
  'link': 'https://www.netflix.com/tudum/articles/the-one-piece-release-date'},
 {'snippet': 'One Piece is a manga and anime series that follows Monkey D. Luffy, a young boy with a dream to become the 

In [44]:
ddg_search.results("what was the score between peru and venezuela last night?", 3)

[{'snippet': "Thanks to expanded qualifying with the first 48-team World Cup, Peru is still only four points off a qualification spot, but Juan Reynoso's team will be desperate for all three points against...",
  'title': 'Peru vs. Venezuela: How to watch 2026 World Cup qualifier - Pro Soccer Wire',
  'link': 'https://prosoccerwire.usatoday.com/2023/11/20/how-to-watch-peru-vs-venezuela-conmebol-2026-world-cup-qualifier-tv-and-streaming/'},
 {'snippet': '11/21/2023 TAJ 11/21/2023 11/21/2023 124 99 PAL Venezuela 0 - 0 Bolivia Venezuela vs Peru LIVE Updates: Score, Stream Info, Lineups and How to Watch World Cup Qualifiers 2026 Match',
  'title': 'Venezuela vs Peru LIVE Updates: Score, Stream Info, Lineups and How to ...',
  'link': 'https://www.vavel.com/en-us/soccer/2023/11/22/1163696-venezuela-vs-peru-live-updates-score-stream-info-lineups-and-how-to-watch-world-cup-qualifiers-2026-match.html'},
 {'snippet': 'Conmebol odds courtesy of Tipico Sportsbook. Odds last updated Tuesday at 7:2
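Since `web_search` keeps only the `link` field of each result, it may be worth guarding against results that lack a link or repeat one. Below is a minimal sketch using hand-written sample dicts shaped like the output above; `extract_links` is a hypothetical helper, not part of LangChain or DuckDuckGo.

```python
def extract_links(results):
    """Return unique links from search-result dicts, preserving order."""
    seen = set()
    links = []
    for r in results:
        link = r.get("link")  # tolerate results without a 'link' key
        if link and link not in seen:
            seen.add(link)
            links.append(link)
    return links

# Made-up sample data with the same keys as the DDG results (snippet, title, link)
sample_results = [
    {"snippet": "...", "title": "A", "link": "https://example.com/a"},
    {"snippet": "...", "title": "B", "link": "https://example.com/b"},
    {"snippet": "...", "title": "A again", "link": "https://example.com/a"},
    {"snippet": "no link here", "title": "C"},
]
links = extract_links(sample_results)
print(links)
```

A helper like this could stand in for the list comprehension inside `web_search` if duplicate URLs turn out to be a problem.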

Let's quickly explore what RunnablePassthrough outputs with our DDG function

In [49]:
main_chain = RunnablePassthrough.assign(
    urls = lambda x: web_search(x["question"])
)

main_chain.invoke({"question": "what is the one piece?"}) 

{'question': 'what is the one piece?',
 'urls': ['https://www.polygon.com/entertainment/23845804/one-piece-explained-anime-manga-netflix-live-action',
  'https://www.cbr.com/what-is-one-piece-treasure/',
  'https://collider.com/one-piece-arcs-in-order/']}

It not only returns the output in dictionary format (urls), but also the input (question)
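To make this behaviour concrete, here is a plain-Python analogue of what `RunnablePassthrough.assign()` appears to do with its input: copy the dict and add the computed keys. This is only a sketch of the observed behaviour, not LangChain's implementation; `assign` and `fake_search` are made-up names, and no network call is involved.

```python
def assign(**new_fields):
    """Return a step that copies the input dict and adds computed keys."""
    def step(inputs):
        out = dict(inputs)            # keep every original key
        for key, fn in new_fields.items():
            out[key] = fn(inputs)     # each function sees the original input
        return out
    return step

# Stand-in for web_search: returns fixed URLs instead of querying DDG
fake_search = lambda x: [f"https://example.com/{i}" for i in range(2)]

step = assign(urls=fake_search)
result = step({"question": "what is the one piece?"})
print(result)
```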

Let's now formulate the entire chain, invoking it on each url. Notice how we are <b>ensuring to pass a dictionary to each RunnablePassthrough</b>

**NOTE** that line 6 below would have worked without the lambda, because **RunnablePassthrough.assign()** accepts a runnable directly as a value

In [84]:
############## PLAYGROUND CELL #################
scrape_and_summary_chain = RunnablePassthrough.assign(
    text = lambda x: scrape_text(x["url"])[:10000]
) | RunnablePassthrough.assign(
    url_output = lambda x : x["url"],
    summary_output = lambda x: RunnablePassthrough() | summary_prompt | ChatOpenAI(model=model_name) | StrOutputParser()
) | (lambda x: f"URL: {x['url_output']}\n\nSUMMARY: {x['summary_output']}")

## This doesn't work: inside the f-string the chain is only formatted as text, never invoked. Use the option above
# scrape_and_summary_chain = RunnablePassthrough.assign(
#     text = lambda x: scrape_text(x["url"])[:10000]
# ) | (lambda x: f"URL: {x['url']}\n\nSUMMARY: {RunnablePassthrough() | summary_prompt | ChatOpenAI(model=model_name) | StrOutputParser()}")

link = "https://python.langchain.com/docs/get_started/introduction"

test = scrape_and_summary_chain.invoke(
    {
        "question":"what is langsmith?",
        "url":link
    }
)

test
###################################################

'URL: https://python.langchain.com/docs/get_started/introduction\n\nSummary: LangSmith is a developer platform that allows for debugging, testing, evaluating, and monitoring chains built on any Language Model (LLM) framework. It seamlessly integrates with LangChain and simplifies the application lifecycle, from development to production deployment.'

**NOTE** Very important to observe that we used **RunnablePassthrough() within lambda** to retain runnable class and continue LCEL. Inspiration taken from *RunnableParallel* [here](https://python.langchain.com/docs/expression_language/how_to/map)

This was needed to include both *url* and *summary* in the runnable output as a single string

In [99]:
# From above - ORIGINAL
# scrape_and_summary_chain = RunnablePassthrough.assign(
#     text = lambda x: scrape_text(x["url"])[:10000]
# ) | summary_prompt |  ChatOpenAI(model=model_name) | StrOutputParser()

# From above - MODIFIED 
scrape_and_summary_chain = RunnablePassthrough.assign(
    text = lambda x: scrape_text(x["url"])[:10000]
) | RunnablePassthrough.assign(
    url_output = lambda x : x["url"],
    summary_output = lambda x: RunnablePassthrough() | summary_prompt | ChatOpenAI(model=model_name) | StrOutputParser()
) | (lambda x: f"URL: {x['url_output']}\n\n SUMMARY: {x['summary_output']}")

# New - Recall from above cell that a dict is returned from RunnablePassthrough.assign() {"question", "urls"}
web_url_chain = RunnablePassthrough.assign(
    urls = lambda x: web_search(x["question"])
) | (lambda x: [{"question":x["question"], "url":u} for u in x["urls"]]) # What we want is a list of {"url":...}

# Put together - Map applies chain to every input element. It takes a list of dictionaries
question_to_summaries_chain = web_url_chain | scrape_and_summary_chain.map()

# Invoke chain
question_to_summaries_chain.invoke(
    {
        "question":"what is langsmith?"
    }
)

['URL: https://logankilpatrick.medium.com/what-is-langsmith-and-why-should-i-care-as-a-developer-e5921deb54b5\n\n SUMMARY: LangSmith is a product created by the team behind LangChain, the popular language model software tool. It aims to tackle the challenges of getting language model applications into production in a reliable and maintainable way. LangSmith focuses on providing features for debugging, testing, evaluating, monitoring, and tracking usage metrics of language models. It offers a simple and intuitive user interface to perform these tasks, reducing the barrier to entry for developers without a software background. While there may be potential competitors in the market, LangSmith aims to differentiate itself by integrating with various tools and services and providing value-added features. The author also mentions the need for extensibility and the potential for LangSmith to be built into other applications and services.',
 'URL: https://blog.langchain.dev/announcing-langsmit
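Conceptually, `.map()` invokes the chain once per element of the incoming list and returns the results in the same order. A rough plain-Python analogue is sketched below with a thread pool so the calls overlap; `fake_chain` and `map_chain` are hypothetical stand-ins (no scraping, no LLM), and how LangChain schedules the calls internally is not something this sketch relies on.

```python
from concurrent.futures import ThreadPoolExecutor

def fake_chain(inputs):
    """Stand-in for scrape_and_summary_chain.invoke (no network, no LLM)."""
    return f"URL: {inputs['url']}\n\nSUMMARY: (summary for '{inputs['question']}')"

def map_chain(chain_fn, items, max_workers=4):
    """Apply chain_fn to every item concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(chain_fn, items))

# Same shape as the output of web_url_chain: a list of {"question", "url"} dicts
items = [
    {"question": "what is langsmith?", "url": "https://example.com/1"},
    {"question": "what is langsmith?", "url": "https://example.com/2"},
]
summaries = map_chain(fake_chain, items)
print(summaries)
```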

What we are going to do now is add on top a chain to generate a list of sub-questions to feed the main chain

We need an additional prompt to generate such questions. Inspiration taken from [here](https://github.com/assafelovic/gpt-researcher/blob/master/gpt_researcher/master/prompts.py). See the PromptTemplate guides [here](https://python.langchain.com/docs/modules/model_io/prompts/prompt_templates/).

In [86]:
search_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are an AI Research assistant. Your name is {agent_name}"),
        ("user",
         "Write 3 google search queries to search online that form an objective opinion from "
         "the following {question}\n"
         "You must respond with a list of strings in the following format:"
         '["query 1", "query 2", "query 3"].'
        )
    ]
)

search_questions_chain = search_prompt | ChatOpenAI(model=model_name) 

Test it to see what the output is so that you know how to parse it within the same chain

In [17]:
search_questions_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the one piece?"
    }
)

AIMessage(content='["One Piece manga plot summary", "Opinions on One Piece anime adaptation", "Critiques of One Piece storyline"]')

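The model returns an `AIMessage`. We first add `StrOutputParser()` to extract its string content; further down we will turn that into a list of dictionaries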

In [18]:
search_questions_chain = search_prompt | ChatOpenAI(model=model_name) | StrOutputParser()

search_questions_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the one piece?"
    }
)

'["One Piece manga plot summary", "Analysis and critique of One Piece storyline", "Opinions and reviews on One Piece anime adaptation"]'

This is not an actual list, but a string representing a list. We first need to convert the string output, which has an embedded list, into an actual Python list object. This can be achieved with [json.loads()](https://www.geeksforgeeks.org/python-difference-between-json-load-and-json-loads/)

Notice that in order to do so, the prompt instructions must use <b>"</b> around each item inside the embedded list and <b>'</b> at the ends of the string, because JSON only accepts double-quoted strings. Otherwise you'll get an error

In [42]:
json.loads('["One Piece anime review", "One Piece manga vs anime comparison", "One Piece fan discussion"]')

['One Piece anime review',
 'One Piece manga vs anime comparison',
 'One Piece fan discussion']

The alternative will produce an error (below)

In [43]:
json.loads("['a', 'b']")

JSONDecodeError: Expecting value: line 1 column 2 (char 1)
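The model will not always follow the requested format, so a more forgiving parser can be useful. The sketch below tries `json.loads` first and falls back to `ast.literal_eval` for single-quoted, Python-style lists; `parse_query_list` is a hypothetical helper and the sample strings are made up.

```python
import ast
import json

def parse_query_list(text):
    """Parse a model reply into a list of strings; raise ValueError otherwise."""
    text = text.strip()
    try:
        parsed = json.loads(text)              # strict JSON: double quotes only
    except json.JSONDecodeError:
        try:
            parsed = ast.literal_eval(text)    # accepts Python literals such as ['a', 'b']
        except (ValueError, SyntaxError) as exc:
            raise ValueError(f"Could not parse a list from: {text!r}") from exc
    if not isinstance(parsed, list) or not all(isinstance(q, str) for q in parsed):
        raise ValueError(f"Expected a list of strings, got: {parsed!r}")
    return parsed

print(parse_query_list('["query 1", "query 2"]'))
print(parse_query_list("['a', 'b']"))
```

A function like this could take the place of `json.loads` at the end of the chain, at the cost of accepting slightly looser input.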

Now we can continue and convert the output to a list of strings below (Example)

In [87]:
search_questions_chain = search_prompt | ChatOpenAI(model=model_name) | StrOutputParser() | json.loads

search_questions_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the diffence between the UEFA Champions and Europe leagues?"
    }
)

['UEFA Champions League format',
 'UEFA Champions League vs Europa League',
 'TV rights for UEFA Champions League and Europa League']

Recall that we <b>need a list of dictionaries to pass it along the chain</b> so that it can be "mapped." So we need to convert it to that. (i.e. [{"question" : "....."}])

The <b>web_url_chain</b> within the <b>question_to_summaries_chain</b> requires a <b>"question"</b> key

In [88]:
search_questions_chain = search_prompt | ChatOpenAI(model=model_name, temperature=0.3) | StrOutputParser() | json.loads | (
    lambda x : [{"question":i} for i in x]
)

search_questions_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the diffence between langsmith and langchain?"
    }
)

[{'question': 'Langsmith vs Langchain differences'},
 {'question': 'Comparison between Langsmith and Langchain'},
 {'question': 'Langsmith and Langchain features comparison'}]

Good. Now <b>pass it to the question_to_summaries_chain</b>. We need to map it, which in essence will invoke the question_to_summaries_chain for each dict input coming out of the search_questions_chain

In [91]:
main_research_chain = search_questions_chain | question_to_summaries_chain.map()

main_research_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the diffence between langsmith and langchain?"
    }
)

[['URL: https://cheatsheet.md/langchain-tutorials/langsmith.en\n\nSUMMARY: LangSmith is a platform designed to build, test, and deploy intelligent agents and chains based on any Language Learning Model (LLM) framework. It offers features such as debugging, testing, and tracing capabilities. LangSmith integrates seamlessly with LangChain, an open-source LLM framework developed by the same company. LangSmith is user-friendly, versatile, and supported by a strong community. It provides interactive tutorials, a quick start guide, and a Cookbook filled with real-world examples to assist developers in their LLM projects. The Cookbook is a community-driven resource that covers topics like tracing code, using REST API features, evaluating Q&A systems, and exporting data for fine-tuning.',
  "URL: https://logankilpatrick.medium.com/what-is-langsmith-and-why-should-i-care-as-a-developer-e5921deb54b5\n\nSUMMARY: The text does not provide a direct comparison between LangSmith and LangChain. It onl

This is a list of lists. Now that we have it, all we have to do is pass it to a final prompt to summarize everything

We need a function to flatten the list of lists and a prompt that will summarize that text. Inspiration for the prompt taken from [here](https://github.com/assafelovic/gpt-researcher/blob/master/gpt_researcher/master/prompts.py)

In [89]:
writer_system_prompt = "You are an AI critical thinker research assistant. Your sole purpose is to write well written, "\
"critically acclaimed, objective and structured reports on given text."

research_report_template = """Information:
---------
{research_summary}
---------

Using the above information, answer the following question or topic: {question} in a detailed report. \
The report should focus on the answer to the question, should be well structured, informative, \
in depth, with facts and numbers if available and a minimum of 1,200 words.

You should strive to write the report as long as you can using all relevant and necessary information provided. \
You must write the report with markdown syntax. \
Use an unbiased and journalistic tone. \
You MUST determine your own concrete and valid opinion based on the given information. Do NOT defer to general and meaningless conclusions. \
You MUST write all used source urls at the end of the report as references, and make sure to not add duplicated sources, but only one reference for each. \
You MUST write the report in apa format. \
Cite search results using inline notations. Only cite the most \
relevant results that answer the query accurately. Place these citations at the end \
of the sentence or paragraph that reference them. \
Please do your best, this is very important to my career.
"""

research_summarization_prompt = ChatPromptTemplate.from_messages([
    ("system", writer_system_prompt),
    ("user", research_report_template)
])

Now we write a helper function to collapse the list of lists

In [90]:
# Function to collapse list of lists
def collapse_list_of_lists(lists):
    
    all_text = []
    for i in lists:
        all_text.append("\n\n".join(i))

    return "\n\n".join(all_text)
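Different sub-queries can surface the same page, so the collapsed text may contain repeated summaries. A possible variant is sketched below: it flattens with `itertools.chain.from_iterable` and drops empty or duplicate entries before joining. `collapse_unique` is a hypothetical name and the nested list is made-up sample data.

```python
from itertools import chain

def collapse_unique(lists, sep="\n\n"):
    """Flatten a list of lists of strings, drop empties and duplicates, then join."""
    seen = set()
    unique = []
    for text in chain.from_iterable(lists):
        if text and text not in seen:
            seen.add(text)
            unique.append(text)
    return sep.join(unique)

nested = [["a", "b"], ["b", "c"], []]
collapsed = collapse_unique(nested)
print(repr(collapsed))
```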

Finally we write the chain. We need to pass the list of lists through our collapsing function before feeding it to the summarization prompt

In [36]:
[["a","b"], ["c","d"]] | collapse_list_of_lists

TypeError: unsupported operand type(s) for |: 'list' and 'function'
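The `TypeError` arises because Python resolves `|` through the `__or__` method of the left operand (or `__ror__` of the right one), and neither a list nor a plain function defines it for this pair. The toy class below sketches how an object can make `|` mean "then run the next step"; `Pipe` is a made-up illustration of the idea, not LangChain's implementation.

```python
class Pipe:
    """Toy wrapper that makes a function composable with `|`."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # self | other: run self first, then other (plain callables get wrapped)
        nxt = other if isinstance(other, Pipe) else Pipe(other)
        return Pipe(lambda x: nxt.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

def collapse(lists):
    """Same joining logic as collapse_list_of_lists, written compactly."""
    return "\n\n".join("\n\n".join(inner) for inner in lists)

# Because the left operand is a Pipe, plain functions on the right are accepted
pipeline = Pipe(lambda x: x) | collapse | str.upper
piped = pipeline.invoke([["a", "b"], ["c", "d"]])
print(repr(piped))
```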

We need a runnable to do this and keep it in LCEL (pipe) format. This allows us to pass the result to the prompt as an assigned dict value ("research_summary").

**Note** Our prompt does need a *question* though, does this mean that the main research question remains in the LCEL unless we change it? Because we are not passing a *question* in the list of lists. Let's explore this first

In [92]:
all_chain = RunnablePassthrough.assign(
    research_summary = main_research_chain | collapse_list_of_lists
)

all_chain_result = all_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the diffence between langsmith and langchain?"
    }
)

**Important NOTE** Yes. **RunnablePassthrough.assign()** passes the original input dict through and adds the new key, so *question* and *agent_name* remain available to later steps (unless a later step replaces the dict)

In [45]:
all_chain_result

{'agent_name': 'El Bryan',
 'question': 'what is the diffence between langsmith and langchain?',
 'research_summary': "The given text does not provide any information about Langsmith.\n\nThe text does not provide a direct comparison between LangSmith and LangChain. However, it mentions that LangSmith is the latest product from the creators of LangChain and aims to tackle production challenges for large language models (LLMs). LangChain, on the other hand, is described as the most popular LLM software tool and focuses on reducing the barrier to entry for building prototypes. LangSmith aims to provide features related to debugging, testing, evaluating, monitoring, and usage metrics for LLM applications, with a user-friendly UI. It is mentioned that other platforms, such as Vercel, may also build similar tooling in the future, and LangSmith's success will depend on its ability to expand and compete with multiple providers and other tooling ecosystems.\n\nThe given text does not provide an

Now that we are certain, we'll execute the entire chain, all the way to the summarization of the collapsed lists in APA format

In [97]:
all_chain = RunnablePassthrough.assign(
    research_summary = main_research_chain | collapse_list_of_lists 
) | research_summarization_prompt | ChatOpenAI(model="gpt-3.5-turbo-16k") | StrOutputParser()

all_chain_result = all_chain.invoke(
    {
        "agent_name":"El Bryan",
        "question":"what is the diffence between langsmith and langchain?"
    }
)

In [98]:
Markdown(all_chain_result)

# Report: The Difference Between LangSmith and LangChain

## Introduction
LangSmith and LangChain are two platforms developed by LangChain that cater to the needs of developers working with Language Learning Models (LLMs). While they share a common origin, they serve different purposes in the development and productionization of LLM applications. This report aims to provide a detailed analysis of the differences between LangSmith and LangChain, based on the information provided.

## LangSmith: Elevating LLM Applications to Production Grade
LangSmith is a platform developed by LangChain that focuses on providing features for debugging, testing, evaluating, and monitoring LLM applications. It is designed to elevate LLM applications to production-grade quality by addressing the challenges associated with getting these applications into production in a reliable and maintainable way.

### Debugging and Testing Capabilities
One of the key features of LangSmith is its debugging and testing capabilities. It offers tools for analyzing and extracting insights from LLM responses, allowing developers to refine and enhance their AI applications. This dynamic testing framework enhances the efficiency and performance of new chains and sets of tools effortlessly [^1^]. It allows users to create and run customizable test scenarios, enabling thorough testing of LLM applications [^6^].

### API and Environment Setup
LangSmith provides a seamless integration with the open-source LangChain framework, making it easier for developers to set up the necessary API and environment for their LLM applications. This integration ensures that developers can quickly get started with their projects without facing significant barriers to entry [^1^].

### Tracing Capabilities for Code Improvement
LangSmith incorporates tracing capabilities that enable developers to investigate and debug the execution steps of their LLM applications. This feature helps identify and rectify issues in the code, leading to improved performance and reliability [^3^].

### Community Support and Cookbook
LangSmith boasts strong community support, which is crucial for the growth and development of any platform. The LangSmith Cookbook serves as a community-driven resource that provides practical examples and use-cases for mastering LangSmith. It allows users to contribute their insights, shaping the community and fostering collaboration [^1^].

## LangChain: Enabling Advanced Language Model Applications
LangChain, on the other hand, is the underlying framework on which LangSmith is built. It provides a comprehensive toolkit for developers to build advanced language model applications that are adaptable, efficient, and capable of handling complex use cases [^7^]. LangChain consists of several integral parts, including LangChain Libraries, LangChain Templates, LangServe, and LangSmith.

### LangChain Libraries and Templates
LangChain Libraries serve as the backbone of the LangChain framework, offering interfaces and integrations for various components. They provide a basic runtime for combining these components into cohesive chains and agents. LangChain Templates, on the other hand, are deployable reference architectures tailored for different tasks, providing a solid starting point for application development [^7^].

### LangServe
LangServe is a versatile library that allows developers to deploy LangChain chains as REST APIs. It simplifies the process of turning LangChain projects into accessible and scalable web services, enabling developers to create context-aware applications capable of sophisticated reasoning [^7^].

### LangSmith for Debugging, Testing, and Monitoring
While LangSmith is a separate platform, it is closely integrated with LangChain. LangSmith serves as a developer platform for debugging, testing, evaluating, and monitoring chains built on any LLM framework [^7^]. It offers a range of features that help developers ensure the quality and reliability of their LLM applications.

## Comparison and Conclusion
Based on the information provided, LangSmith and LangChain serve complementary roles in the development and productionization of LLM applications. LangSmith focuses on providing features for debugging, testing, evaluating, and monitoring LLM applications, while LangChain provides a comprehensive toolkit for building advanced language model applications.

LangSmith's key features include debugging and testing capabilities, API and environment setup, and tracing capabilities for code improvement. It is known for its ease of use, versatility, and strong community support. The LangSmith Cookbook, a community-driven resource, provides practical examples and use-cases for mastering LangSmith [^1^].

LangChain, on the other hand, offers a range of features such as LLM and prompt management, chain creation, data augmented generation, agent support, memory management, and evaluation tools. It provides a comprehensive toolkit for developers to build advanced language model applications that are adaptable, efficient, and capable of handling complex use cases [^7^].

In conclusion, LangSmith and LangChain are two essential platforms developed by LangChain to cater to the needs of developers working with Language Learning Models (LLMs). While LangSmith focuses on providing features for debugging, testing, evaluating, and monitoring LLM applications, LangChain provides a comprehensive toolkit for building advanced language model applications. Both platforms play crucial roles in simplifying the development, productionization, and deployment of intelligent, language model-powered applications.

## References
1. LangChain. (n.d.). LangSmith Tutorials. Retrieved from https://cheatsheet.md/langchain-tutorials/langsmith.en
2. Kilpatrick, L. (2021, August 10). What is LangSmith, and why should I care as a developer? Retrieved from https://logankilpatrick.medium.com/what-is-langsmith-and-why-should-i-care-as-a-developer-e5921deb54b5
3. Kilpatrick, L. (2021, September 28). LangSmith: Test LLMs & AI Applications. Retrieved from https://blog.logrocket.com/langsmith-test-llms-ai-applications/
4. AIModels. (n.d.). A Complete Guide to LangChain: Building Powerful Applications with Large Language Models. Retrieved from https://notes.aimodels.fyi/a-complete-guide-to-langchain-building-powerful-applications-with-large-language-models/
5. Nanonets. (n.d.). LangChain: A Comprehensive Guide. Retrieved from https://nanonets.com/blog/langchain/
6. Kilpatrick, L. (2021, September 28). What is LangSmith, and why should I care as a developer? Retrieved from https://blog.logrocket.com/langsmith-test-llms-ai-applications/
7. Enterprise DNA. (n.d.). What is LangChain? A Beginner's Guide with Examples. Retrieved from https://blog.enterprisedna.co/what-is-langchain-a-beginners-guide-with-examples/

It does a decent job of summarizing the content and listing the references. However, these have to be checked manually to increase confidence: for example, references 3 and 6 above cite the same URL under different titles.
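Part of that check can be automated. Assuming the collapsed summaries are kept (each begins with a `URL:` line), one cheap test is to list any URL cited in the report that never appeared in the scraped material. The sketch below uses made-up strings; `extract_urls` and `find_unsupported_urls` are hypothetical helpers, and the regex is a simple approximation rather than a full URL parser.

```python
import re

# Simple approximation: a URL runs until whitespace or a closing bracket
URL_PATTERN = re.compile(r"https?://[^\s)\]>]+")

def extract_urls(text):
    """Return the set of URLs found in text, without trailing punctuation."""
    return {u.rstrip(".,;") for u in URL_PATTERN.findall(text)}

def find_unsupported_urls(report, research_summary):
    """URLs cited in the report that never appeared in the scraped summaries."""
    return sorted(extract_urls(report) - extract_urls(research_summary))

# Made-up sample data in the same "URL: ... SUMMARY: ..." layout used above
research_summary = (
    "URL: https://example.com/a\n\nSUMMARY: first page.\n\n"
    "URL: https://example.com/b\n\nSUMMARY: second page."
)
report = (
    "## References\n"
    "1. Retrieved from https://example.com/a\n"
    "2. Retrieved from https://example.com/made-up\n"
)
unsupported = find_unsupported_urls(report, research_summary)
print(unsupported)
```

This only confirms that a cited URL was among the sources; authors, dates and titles in the reference list still need a manual look.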

**All work up to here will be stored in python file within** [repo](https://github.com/jzamalloa1/langchain_learning/blob/main/ra_pilot.py) and will be **langserved**

### Next we'll test another retriever, Arxiv

This replaces the web search retriever used above. The final langserved .py file will be hosted [here](https://github.com/jzamalloa1/langchain_learning/blob/main/ra_pilot_arxiv.py)

In [54]:
from langchain.retrievers import ArxivRetriever
import arxiv

In [58]:
retriever = ArxivRetriever(top_k_results=5) #, get_full_documents=True, doc_content_chars_max=None)

In [59]:
# docs = retriever.get_relevant_documents("housing markets in the US")
docs = retriever.get_summaries_as_docs(query="what are some advancements that have happened in llm prompting strategies?")
# docs = retriever.get_relevant_documents(query="what is the difference between openai and langchain?")

In [60]:
docs

[Document(page_content="Large Language Models (LLMs) with strong abilities in natural language\nprocessing tasks have emerged and have been applied in various kinds of areas\nsuch as science, finance and software engineering. However, the capability of\nLLMs to advance the field of chemistry remains unclear. In this paper, rather\nthan pursuing state-of-the-art performance, we aim to evaluate capabilities of\nLLMs in a wide range of tasks across the chemistry domain. We identify three\nkey chemistry-related capabilities including understanding, reasoning and\nexplaining to explore in LLMs and establish a benchmark containing eight\nchemistry tasks. Our analysis draws on widely recognized datasets facilitating\na broad exploration of the capacities of LLMs within the context of practical\nchemistry. Five LLMs (GPT-4, GPT-3.5, Davinci-003, Llama and Galactica) are\nevaluated for each chemistry task in zero-shot and few-shot in-context learning\nsettings with carefully selected demonstrat

In [48]:
docs[0].metadata

IndexError: list index out of range

In [24]:
print(docs[0].page_content)

OpenAI Gym is a toolkit for reinforcement learning (RL) research. It includes
a large number of well-known problems that expose a common interface allowing
to directly compare the performance results of different RL algorithms. Since
many years, the ns-3 network simulation tool is the de-facto standard for
academic and industry research into networking protocols and communications
technology. Numerous scientific papers were written reporting results obtained
using ns-3, and hundreds of models and modules were written and contributed to
the ns-3 code base. Today as a major trend in network research we see the use
of machine learning tools like RL. What is missing is the integration of a RL
framework like OpenAI Gym into the network simulator ns-3. This paper presents
the ns3-gym framework. First, we discuss design decisions that went into the
software. Second, two illustrative examples implemented using ns3-gym are
presented. Our software package is provided to the community as open sou