# LangGraph and LangSmith - Agentic RAG Powered by LangChain

In this notebook we'll complete the following tasks:

- 🤝 Breakout Room #1:
  1. Install required libraries
  2. Set Environment Variables
  3. Creating our Tool Belt
  4. Creating Our State
  5. Creating and Compiling A Graph!

- 🤝 Breakout Room #2:
  1. Evaluating the LangGraph Application with LangSmith
  2. Adding Helpfulness Check and "Loop" Limits
  3. LangGraph for the "Patterns" of GenAI

# 🤝 Breakout Room #1

## Part 1: LangGraph - Building Cyclic Applications with LangChain

LangGraph is a tool that leverages LangChain Expression Language to build coordinated multi-actor and stateful applications that include cyclic behaviour.

### Why Cycles?

In essence, we can think of a cycle in our graph as a more robust and customizable loop. It allows us to keep our application agent-forward while still giving the powerful functionality of traditional loops.

Because we use cycles rather than loops, we can also compose rather complex flows through our graph in a much more readable and natural fashion, which effectively allows us to recreate application flowcharts in code in an almost 1-to-1 fashion.

### Why LangGraph?

Beyond the agent-forward approach - we can easily compose and combine traditional "DAG" (directed acyclic graph) chains with powerful cyclic behaviour due to the tight integration with LCEL. This means it's a natural extension to LangChain's core offerings!

## Task 1:  Dependencies


## Task 2: Environment Variables

We'll want to set our OpenAI, Tavily, and LangSmith API keys along with our LangSmith environment variables.

In [1]:
import os
import getpass

os.environ["OPENAI_API_KEY"] = getpass.getpass("OpenAI API Key:")

In [2]:
os.environ["TAVILY_API_KEY"] = getpass.getpass("TAVILY_API_KEY")

In [3]:
from uuid import uuid4

os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_PROJECT"] = f"AIE8 - LangGraph - {uuid4().hex[0:8]}"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass("LangSmith API Key: ")

## Task 3: Creating our Tool Belt

As is usually the case, we'll want to equip our agent with a toolbelt to help answer questions and add external knowledge.

There's a tonne of tools in the [LangChain Community Repo](https://github.com/langchain-ai/langchain-community/tree/main/libs/community) but we'll stick to a couple just so we can observe the cyclic nature of LangGraph in action!

We'll leverage:

- [Tavily Search Results](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/tavily_search/tool.py)
- [Arxiv](https://github.com/langchain-ai/langchain-community/blob/main/libs/community/langchain_community/tools/arxiv/tool.py)

#### 🏗️ Activity #1:

Please add the tools to use into our toolbelt.

> NOTE: Each tool in our toolbelt should be a method.

In [4]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_community.tools.arxiv.tool import ArxivQueryRun

tavily_tool = TavilySearchResults(max_results=5)

tool_belt = [
    tavily_tool,
    ArxivQueryRun(),
]



### Model

Now we can set up our model! We'll leverage the familiar OpenAI model suite for this example - but OpenAI isn't *necessary* to use LangGraph. LangGraph supports all models - though you might not find success with smaller models - so they recommend you stick with:

- OpenAI's GPT-3.5 and GPT-4
- Anthropic's Claude
- Google's Gemini

> NOTE: Because we're leveraging the OpenAI function calling API - we'll need to use OpenAI *for this specific example* (or any other service that exposes an OpenAI-style function calling API).

In [5]:
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4.1-nano", temperature=0)

Now that we have our model set up, let's "put on the tool belt", which is to say: we'll bind our LangChain formatted tools to the model in an OpenAI function calling format.

In [6]:
model = model.bind_tools(tool_belt)

#### ❓ Question #1:

How does the model determine which tool to use?

The model determines which tool to use by analyzing the user's input and matching it to the function signatures and descriptions of the available tools. When using the OpenAI function calling API (as in this example), the model is provided with a list of tools (functions), each with a name and description. The model then decides, based on the context of the conversation and the user's request, which tool (if any) is most appropriate to call, and generates a function call to that tool with the relevant arguments. This process leverages the model's understanding of language and the provided tool metadata to select the best tool for the task.
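To make the "tool metadata" idea concrete, here is a rough sketch of the kind of description a tool exposes to the model. The `describe_tool` helper and the `web_search` function are hypothetical and only approximate an OpenAI-style function schema; when we call `bind_tools`, LangChain builds the real payload for us.

```python
import inspect

# Map a few Python annotations to JSON-schema type names (illustrative only).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def describe_tool(func):
    """Build an OpenAI-style function description from a plain Python function."""
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        # Parameters without a default value are treated as required.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }

def web_search(query: str, max_results: int = 5):
    """Search the web and return the top results for a query."""
    return []  # hypothetical stand-in; no real search happens here

schema = describe_tool(web_search)
print(schema)
```

The model only ever sees the name, description, and parameter schema, which is why clear tool descriptions matter so much for tool selection.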


## Task 4: Putting the State in Stateful

Earlier we used this phrasing:

`coordinated multi-actor and stateful applications`

So what does that "stateful" mean?

To put it simply - we want to have some kind of object which we can pass around our application that holds information about what the current situation (state) is. Since our system will be constructed of many parts moving in a coordinated fashion - we want to be able to ensure we have some commonly understood idea of that state.

LangGraph leverages a `StateGraph` which uses an `AgentState` object to pass information between the various nodes of the graph.

There are more options than what we'll see below - but this `AgentState` object is a `TypedDict` with the key `messages`, whose value is a list of `BaseMessage` objects that is appended to whenever the state changes.

Let's think about a simple example to help understand exactly what this means (we'll simplify a great deal to try and clearly communicate what state is doing):

1. We initialize our state object:
  - `{"messages" : []}`
2. Our user submits a query to our application.
  - New State: `HumanMessage(#1)`
  - `{"messages" : [HumanMessage(#1)]}`
3. We pass our state object to an Agent node which is able to read the current state. It will use the last `HumanMessage` as input. It gets some kind of output which it will add to the state.
  - New State: `AgentMessage(#1, additional_kwargs {"function_call" : "WebSearchTool"})`
  - `{"messages" : [HumanMessage(#1), AgentMessage(#1, ...)]}`
4. We pass our state object to a "conditional node" (more on this later) which reads the last state to determine if we need to use a tool - which it can determine properly because of our provided object!
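The walkthrough above can be sketched in plain Python. The `append_messages` reducer below is a simplified, hypothetical stand-in for LangGraph's `add_messages`, and the messages are plain strings rather than real message objects.

```python
def append_messages(existing, new):
    """Simplified reducer: return a new list with the new messages appended."""
    return list(existing) + list(new)

def apply_update(state, update):
    """Merge a node's partial update into the state using the reducer."""
    return {"messages": append_messages(state["messages"], update["messages"])}

# 1. Initialize the state.
state = {"messages": []}

# 2. The user submits a query.
state = apply_update(state, {"messages": ["human: What is LangGraph?"]})

# 3. The agent node reads the state and returns only its new message.
state = apply_update(state, {"messages": ["ai: (requests the web search tool)"]})

print(state["messages"])
```

Note that each step returns only the new messages; the reducer is what turns those partial updates into a growing history.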

In [7]:
from typing import TypedDict, Annotated
from langgraph.graph.message import add_messages
import operator
from langchain_core.messages import BaseMessage

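class AgentState(TypedDict):
  # add_messages is a reducer: new messages are merged into the existing list instead of overwriting it
  messages: Annotated[list, add_messages]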

## Task 5: It's Graphing Time!

Now that we have state, and we have tools, and we have an LLM - we can finally start making our graph!

Let's take a second to refresh ourselves about what a graph is in this context.

Graphs, also called networks in some circles, are a collection of connected objects.

The objects in question are typically called nodes, or vertices, and the connections are called edges.

Let's look at a simple graph.

![image](https://i.imgur.com/2NFLnIc.png)

Here, we're using the coloured circles to represent the nodes and the yellow lines to represent the edges. In this case, we're looking at a fully connected graph - where each node is connected by an edge to each other node.

If we were to think about nodes in the context of LangGraph - we would think of a function, or an LCEL runnable.

If we were to think about edges in the context of LangGraph - we might think of them as "paths to take" or "where to pass our state object next".

Let's create some nodes and expand on our diagram.

> NOTE: Due to the tight integration with LCEL - we can comfortably create our nodes in an async fashion!
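As a minimal sketch of what an async node could look like: the `FakeModel` class below is a hypothetical stand-in for a chat model so the example runs without API keys. With a real LangChain model, the node would await `model.ainvoke(messages)` in the same way.

```python
import asyncio

class FakeModel:
    """Hypothetical stand-in for a chat model with an async interface."""

    async def ainvoke(self, messages):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"echo: {messages[-1]}"

fake_model = FakeModel()

async def call_model_async(state):
    """Async node: read the messages, await the model, return a state update."""
    response = await fake_model.ainvoke(state["messages"])
    return {"messages": [response]}

update = asyncio.run(call_model_async({"messages": ["hello"]}))
print(update)
```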

In [8]:
from langgraph.prebuilt import ToolNode

def call_model(state):
  messages = state["messages"]
  response = model.invoke(messages)
  return {"messages" : [response]}

tool_node = ToolNode(tool_belt)

Now we have two total nodes. We have:

- `call_model` is a node that will...well...call the model
- `tool_node` is a node which can call a tool

Let's start adding nodes! We'll update our diagram along the way to keep track of what this looks like!


In [9]:
from langgraph.graph import StateGraph, END

uncompiled_graph = StateGraph(AgentState)

uncompiled_graph.add_node("agent", call_model)
uncompiled_graph.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x11520e7b0>

Let's look at what we have so far:

![image](https://i.imgur.com/md7inqG.png)

Next, we'll add our entrypoint. All our entrypoint does is indicate which node is called first.

In [10]:
uncompiled_graph.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x11520e7b0>

![image](https://i.imgur.com/wNixpJe.png)

Now we want to build a "conditional edge" which will use the output state of a node to determine which path to follow.

We can help conceptualize this by thinking of our conditional edge as a conditional in a flowchart!

Notice how our function simply checks whether the last message contains any tool calls (its `tool_calls` attribute).

Then we create an edge where the origin node is our agent node and our destination node is *either* the action node or the END (finish the graph).

It's important to highlight that `add_conditional_edges` also accepts an optional third parameter: a dictionary (the mapping) from the possible outputs of our conditional function to destination nodes. We don't pass one here, so `should_continue` must return a destination directly: either `"action"` (the name of our action node) or `END`.
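A rough sketch of how such a mapping could work, in plain Python: the `resolve_route` helper, the `route_by_label` router, and the dictionary-shaped messages are hypothetical illustrations, not LangGraph internals. The router returns a label, and the mapping translates that label into a destination.

```python
END = "__end__"  # stand-in sentinel for the end of the graph

def route_by_label(state):
    """Router that returns a label rather than a node name."""
    last_message = state["messages"][-1]
    return "continue" if last_message.get("tool_calls") else "end"

def resolve_route(router, state, path_map=None):
    """Translate the router's output to a destination via the optional mapping."""
    outcome = router(state)
    if path_map is None:
        return outcome  # without a mapping, the output must already be a destination
    return path_map[outcome]

path_map = {"continue": "action", "end": END}

wants_tool = {"messages": [{"content": "", "tool_calls": [{"name": "arxiv"}]}]}
final_answer = {"messages": [{"content": "All done.", "tool_calls": []}]}

print(resolve_route(route_by_label, wants_tool, path_map))    # action
print(resolve_route(route_by_label, final_answer, path_map))  # __end__
```

A mapping keeps the router's vocabulary ("continue", "end") separate from node names, so nodes can be renamed without touching the routing logic.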

In [11]:
def should_continue(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  return END

uncompiled_graph.add_conditional_edges(
    "agent",
    should_continue
)

<langgraph.graph.state.StateGraph at 0x11520e7b0>

Let's visualize what this looks like.

![image](https://i.imgur.com/8ZNwKI5.png)

Finally, we can add our last edge which will connect our action node to our agent node. This is because we *always* want our action node (which is used to call our tools) to return its output to our agent!

In [12]:
uncompiled_graph.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x11520e7b0>

Let's look at the final visualization.

![image](https://i.imgur.com/NWO7usO.png)

All that's left to do now is to compile our workflow - and we're off!

In [13]:
simple_agent_graph = uncompiled_graph.compile()

#### ❓ Question #2:

Is there any specific limit to how many times we can cycle?

If not, how could we impose a limit to the number of cycles?

The graph we defined adds no cycle limit of its own: it keeps looping between the agent and action nodes for as long as the agent requests tool calls. LangGraph does enforce a safety net at runtime, though. Every run has a recursion limit (25 steps by default), and exceeding it raises a `GraphRecursionError`. The limit can be changed per run through the `recursion_limit` key of the config passed to the graph.

To impose our own limit on the number of cycles, we could add a counter to the state that tracks how many times the loop has occurred. Then we modify the `should_continue` function to check this counter and return END if the maximum number of cycles has been reached.

For example:

1. Initialize a `cycle_count` in your state (e.g., set to 0 at the start).
2. Increment `cycle_count` each time the agent node runs.
3. In `should_continue`, check if `cycle_count` has reached your desired maximum (e.g., 5). If so, return END.

This approach prevents runaway loops, lets us control the maximum number of iterations, and ends the run gracefully instead of with an error.
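The three steps above might look roughly like this. It is a plain-Python sketch: `MAX_CYCLES`, `CountingState`, the stubbed agent node, and the `END` sentinel are assumptions for illustration. In a real graph, `cycle_count` would be an extra key on `AgentState`.

```python
from typing import TypedDict

END = "__end__"   # stand-in for LangGraph's END sentinel
MAX_CYCLES = 5    # assumed limit for this sketch

class CountingState(TypedDict):
    messages: list
    cycle_count: int

def agent_node(state: CountingState) -> CountingState:
    """Stub agent that always asks for a tool and bumps the counter."""
    message = {"content": "", "tool_calls": [{"name": "search"}]}
    return {
        "messages": state["messages"] + [message],
        "cycle_count": state["cycle_count"] + 1,
    }

def should_continue(state: CountingState) -> str:
    """Stop once the limit is reached, even if a tool call is pending."""
    if state["cycle_count"] >= MAX_CYCLES:
        return END
    if state["messages"][-1]["tool_calls"]:
        return "action"
    return END

state: CountingState = {"messages": [], "cycle_count": 0}
route = "action"
while route != END:
    state = agent_node(state)
    route = should_continue(state)

print(state["cycle_count"])  # 5
```

Even though the stub agent never stops asking for tools, the counter check ends the loop after five agent turns.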


## Using Our Graph

Now that we've created and compiled our graph - we can call it *just as we'd call any other* `Runnable`!

Let's try out a few examples to see how it fares:

In [14]:
from langchain_core.messages import HumanMessage

inputs = {"messages" : [HumanMessage(content="How are technical professionals using AI to improve their work?")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='Technical professionals are using AI in various ways to enhance their work, including automating repetitive tasks, improving decision-making, analyzing large datasets, developing new products and services, and optimizing processes. They leverage AI for tasks such as machine learning model development, natural language processing, computer vision, predictive analytics, and automation of workflows. This integration helps increase efficiency, accuracy, and innovation across different industries. Would you like specific examples from particular fields or industries?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 90, 'prompt_tokens': 163, 'total_tokens': 253, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano

Let's look at what happens when the model decides to use a tool:

1. Our state object was populated with our request
2. The state object was passed into our entry point (agent node) and the agent node added an `AIMessage` to the state object and passed it along the conditional edge
3. The conditional edge received the state object, found `tool_calls` on the last message, and sent the state object to the action node
4. The action node ran the requested tool, added its result to the state object as a `ToolMessage`, and passed it along the edge to the agent node
5. The agent node added a response to the state object and passed it along the conditional edge
6. The conditional edge received the state object, found no `tool_calls` on the last message, and passed the state object to END

In the output shown above, the first `AIMessage` carries no tool calls, so that particular run went straight from step 2 to END.
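The tool-calling flow described above can be traced with a small stand-alone simulation. Everything here is a hypothetical stand-in (the scripted `fake_agent`, the `fake_tool` node, the dictionary messages, and the hand-rolled loop); it mirrors the cycle from agent to action and back without calling a model or LangGraph.

```python
END = "__end__"  # stand-in sentinel

def fake_agent(state):
    """Scripted agent: request a tool first, then answer once a tool result exists."""
    has_tool_result = any(m["role"] == "tool" for m in state["messages"])
    if has_tool_result:
        return {"messages": [{"role": "ai", "content": "Final answer.", "tool_calls": []}]}
    return {"messages": [{"role": "ai", "content": "", "tool_calls": ["search"]}]}

def fake_tool(state):
    """Action node: answer every tool call on the last message."""
    calls = state["messages"][-1]["tool_calls"]
    return {"messages": [{"role": "tool", "content": f"result of {name}"} for name in calls]}

def should_continue(state):
    """Conditional edge: go to the action node only if a tool was requested."""
    return "action" if state["messages"][-1].get("tool_calls") else END

nodes = {"agent": fake_agent, "action": fake_tool}
state = {"messages": [{"role": "human", "content": "A question"}]}
visited = []
current = "agent"  # the entry point
while current != END:
    update = nodes[current](state)
    state = {"messages": state["messages"] + update["messages"]}  # append, never overwrite
    visited.append(current)
    # Conditional edge after the agent; fixed edge from action back to agent.
    current = should_continue(state) if current == "agent" else "agent"

print(visited)  # ['agent', 'action', 'agent']
```

The `visited` list corresponds to the node names printed by the streaming loop in the notebook cell: one agent turn that requests a tool, one action turn, and a final agent turn.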

Now let's look at an example that shows multiple tool usage - all with the same flow!

In [15]:
inputs = {"messages" : [HumanMessage(content="Search Arxiv for the A Comprehensive Survey of Deep Research paper, then search each of the authors to find out where they work now using Tavily!")]}

async for chunk in simple_agent_graph.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        if node == "action":
          print(f"Tool Used: {values['messages'][0].name}")
        print(values["messages"])

        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_LyPspDcD6ja9v9IcZoy2rfXQ', 'function': {'arguments': '{"query": "A Comprehensive Survey of Deep Research"}', 'name': 'arxiv'}, 'type': 'function'}, {'id': 'call_YM9MaTSpBXRDPYLdEEsxt5IB', 'function': {'arguments': '{"query": "author of A Comprehensive Survey of Deep Research"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 60, 'prompt_tokens': 182, 'total_tokens': 242, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_7c233bf9d1', 'id': 'chatcmpl-CJ7I0JIETVvTyxkSLHnuN2bD3dN5P', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--548df1c3-c8c

#### 🏗️ Activity #2:

Please write out the steps the agent took to arrive at the correct answer.

Receiving update from node: 'agent'
[AIMessage(content='', additional_kwargs={'tool_calls': [{'id': 'call_LyPspDcD6ja9v9IcZoy2rfXQ', 'function': {'arguments': '{"query": "A Comprehensive Survey of Deep Research"}', 'name': 'arxiv'}, 'type': 'function'}, {'id': 'call_YM9MaTSpBXRDPYLdEEsxt5IB', 'function': {'arguments': '{"query": "author of A Comprehensive Survey of Deep Research"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 60, 'prompt_tokens': 182, 'total_tokens': 242, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_7c233bf9d1', 'id': 'chatcmpl-CJ7I0JIETVvTyxkSLHnuN2bD3dN5P', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--548df1c3-c8cc-4484-8446-20e1f8b7cab0-0', tool_calls=[{'name': 'arxiv', 'args': {'query': 'A Comprehensive Survey of Deep Research'}, 'id': 'call_LyPspDcD6ja9v9IcZoy2rfXQ', 'type': 'tool_call'}, {'name': 'tavily_search_results_json', 'args': {'query': 'author of A Comprehensive Survey of Deep Research'}, 'id': 'call_YM9MaTSpBXRDPYLdEEsxt5IB', 'type': 'tool_call'}], usage_metadata={'input_tokens': 182, 'output_tokens': 60, 'total_tokens': 242, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]



Receiving update from node: 'action'
Tool Used: arxiv
[ToolMessage(content='Published: 2025-06-14\nTitle: A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications\nAuthors: Renjun Xu, Jingwen Peng\nSummary: This survey examines the rapidly evolving field of Deep Research systems --\nAI-powered applications that automate complex research workflows through the\nintegration of large language models, advanced information retrieval, and\nautonomous reasoning capabilities. We analyze more than 80 commercial and\nnon-commercial implementations that have emerged since 2023, including\nOpenAI/Deep Research, Gemini/Deep Research, Perplexity/Deep Research, and\nnumerous open-source alternatives. Through comprehensive examination, we\npropose a novel hierarchical taxonomy that categorizes systems according to\nfour fundamental technical dimensions: foundation models and reasoning engines,\ntool utilization and environmental interaction, task planning and execution\ncontrol, and knowledge synthesis and output generation. We explore the\narchitectural patterns, implementation approaches, and domain-specific\nadaptations that characterize these systems across academic, scientific,\nbusiness, and educational applications. Our analysis reveals both the\nsignificant capabilities of current implementations and the technical and\nethical challenges they present regarding information accuracy, privacy,\nintellectual property, and accessibility. The survey concludes by identifying\npromising research directions in advanced reasoning architectures, multimodal\nintegration, domain specialization, human-AI collaboration, and ecosystem\nstandardization that will likely shape the future evolution of this\ntransformative technology. 
By providing a comprehensive framework for\nunderstanding Deep Research systems, this survey contributes to both the\ntheoretical understanding of AI-augmented knowledge work and the practical\ndevelopment of more capable, responsible, and accessible research technologies.\nThe paper resources can be viewed at\nhttps://github.com/scienceaix/deepresearch.\n\nPublished: 2021-03-05\nTitle: A comprehensive survey on point cloud registration\nAuthors: Xiaoshui Huang, Guofeng Mei, Jian Zhang, Rana Abbas\nSummary: Registration is a transformation estimation problem between two point clouds,\nwhich has a unique and critical role in numerous computer vision applications.\nThe developments of optimization-based methods and deep learning methods have\nimproved registration robustness and efficiency. Recently, the combinations of\noptimization-based and deep learning methods have further improved performance.\nHowever, the connections between optimization-based and deep learning methods\nare still unclear. Moreover, with the recent development of 3D sensors and 3D\nreconstruction techniques, a new research direction emerges to align\ncross-source point clouds. This survey conducts a comprehensive survey,\nincluding both same-source and cross-source registration methods, and summarize\nthe connections between optimization-based and deep learning methods, to\nprovide further research insight. This survey also builds a new benchmark to\nevaluate the state-of-the-art registration algorithms in solving cross-source\nchallenges. 
Besides, this survey summarizes the benchmark data sets and\ndiscusses point cloud registration applications across various domains.\nFinally, this survey proposes potential research directions in this rapidly\ngrowing field.\n\nPublished: 2023-07-07\nTitle: A Survey of Deep Learning in Sports Applications: Perception, Comprehension, and Decision\nAuthors: Zhonghan Zhao, Wenhao Chai, Shengyu Hao, Wenhao Hu, Guanhong Wang, Shidong Cao, Mingli Song, Jenq-Neng Hwang, Gaoang Wang\nSummary: Deep learning has the potential to revolutionize sports performance, with\napplications ranging from perception and comprehension to decision. This paper\npresents a comprehensive survey of deep learning in sports performance,\nfocusing on three main aspects: algorithms, datasets and virtual environments,\nand challenges. Firstly, we discuss th', name='arxiv', id='2281b908-75b9-4258-bbe1-719e1c076008', tool_call_id='call_LyPspDcD6ja9v9IcZoy2rfXQ'), ToolMessage(content='[{"title": "[2506.12594] A Comprehensive Survey of Deep Research - arXiv", "url": "https://arxiv.org/abs/2506.12594", "content": "We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. Donate\\n\\n> cs > arXiv:2506.12594\\n\\n# Computer Science > Artificial Intelligence\\n\\narXiv:2506.12594 (cs)\\n\\n[Submitted on 14 Jun 2025]\\n\\n# Title:A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications\\n\\nAuthors:Renjun Xu, Jingwen Peng [...] View a PDF of the paper titled A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications, by Renjun Xu and 1 other authors [...] 
View a PDF of the paper titled A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications, by Renjun Xu and 1 other authors\\n\\n View PDF\\n HTML (experimental)\\n TeX Source\\n Other Formats\\n\\nview license\\n\\nCurrent browse context:\\n\\ncs.AI\\n\\n< prev\\")    |    next >\\")\\n\\nnew  |  recent  | 2025-06\\n\\nChange to browse by:\\n\\ncs cs.MA\\n\\n### References & Citations\\n\\n NASA ADS\\n Google Scholar\\n Semantic Scholar\\n\\na export BibTeX citation Loading...\\n\\n## BibTeX formatted citation\\n\\n×", "score": 0.90289277}, {"title": "A comprehensive survey of deep learning research on medical ...", "url": "https://pubmed.ncbi.nlm.nih.gov/36462229/", "content": "PMID: 36462229\\n    DOI: 10.1016/j.clinimag.2022.11.003\\n\\n Item in Clipboard \\n\\nReview\\n\\nA comprehensive survey of deep learning research on medical image analysis with focus on transfer learning\\n\\nSema Atasever et al. Clin Imaging.2023 Feb.\\n\\nShow details\\n\\nDisplay options\\n\\n Display options \\n\\n Format \\n\\n Clin Imaging \\n\\nActions\\n\\n   Search in PubMed\\n   Search in NLM Catalog\\n   Add to Search\\n\\n. 2023 Feb:94:18-41.\\n\\n doi: 10.1016/j.clinimag.2022.11.003.  Epub 2022 Nov 12. \\n\\n### Authors [...] doi: 10.1016/j.clinimag.2022.11.003.  Epub 2022 Nov 12. \\n\\nA comprehensive survey of deep learning research on medical image analysis with focus on transfer learning\\n\\nSema Atasever1,Nuh Azginoglu2,Duygu Sinanc Terzi3,Ramazan Terzi4\\n\\n Affiliations  Expand \\n\\n### Affiliations [...] A Deep Convolutional Neural Network for Pneumonia Detection in X-ray Images with Attention Ensemble.An Q, Chen W, Shao W.An Q, et al.Diagnostics (Basel). 2024 Feb 11;14(4):390. doi: 10.3390/diagnostics14040390.Diagnostics (Basel). 
2024.PMID: 38396430 Free PMC article.", "score": 0.71425}, {"title": "A Comprehensive Survey of Deep Learning Approaches in Image ...", "url": "https://www.mdpi.com/1424-8220/25/2/531", "content": "Elias Dritsas\\n\\nElias Dritsas\\n\\nSciProfiles  Scilit  Preprints.org  Google Scholar\\n\\n \\\\\\n\\nIndustrial Systems Institute (ISI), Athena Research and Innovation Center, 26504 Patras, Greece\\n\\n\\\\\\n\\nAuthor to whom correspondence should be addressed.\\n\\nSensors 2025, 25(2), 531; \\n\\nSubmission received: 20 December 2024 / Revised: 13 January 2025 / Accepted: 13 January 2025 / Published: 17 January 2025 [...] Visit our dedicated information section to learn more about MDPI.\\n\\n Get Information\\n\\nclear\\n\\n## JSmol Viewer\\n\\nclear\\n\\nfirst\\\\_page\\n\\n Download PDF \\n\\nsettings\\n\\n Order Article Reprints\\n\\nFont Type:\\n\\nArial Georgia Verdana\\n\\nFont Size:\\n\\nAa Aa Aa\\n\\nLine Spacing:\\n\\n\uf034   \uf034   \uf034\\n\\nColumn Width:\\n\\n\uf035   \uf035   \uf035\\n\\nBackground:\\n\\nOpen AccessReview\\n\\n# A Comprehensive Survey of Deep Learning Approaches in Image Processing\\n\\nby \\n\\nMaria Trigka\\n\\nMaria Trigka\\n\\nSciProfiles  Scilit  Preprints.org  Google Scholar\\n\\nand [...] transferring irrelevant knowledge, ensuring effective domain alignment. Apart from its contribution to negative transfer mitigation, it also enhances fine-grained feature extraction, addresses the scarcity of labeled data with self-supervised pre-training, and resolves class imbalance using key point sensitive loss. These strategies demonstrate the solution’s robustness in sonar image classification challenges. 
A summary of topics discussed regarding transfer learning techniques is presented in", "score": 0.6469293}, {"title": "[PDF] A Comprehensive Survey of Deep Research - arXiv", "url": "https://arxiv.org/pdf/2506.12594?", "content": "> (2023 - February 2025) Early prototypes and foundational approaches (February - March 2025) Commercial releases and competitive rivalry (March 2025 - Present) Multi-modal integration and diverse applications\\n> Google Gemini Deep Research\\n> Dec 2024\\n> OpenAI Deep Research\\n> Feb 2025\\n> Manus\\n> Mar 2025\\n> Perplexity Deep Research\\n> Feb 2025\\n> AutoGLM-Research\\n> Mar 2025\\n> QwenLM/ Qwen-Agent\\n> Apr 2024\\n> n8n\\n> 2023 mshumer/ OpenDeepResearcher\\n> Feb 2025\\n> nickscamara/ open-deep-research [...] Human-AI Collaboration and Standardization 72 9 Conclusion 76 9.1 Key Findings and Contributions 76 9.2 Limitations and Outlook 78 9.3 Broader Implications 79 9.4 Final Thoughts 80 References 81 4 Xu et al. [...] Deep Research demonstrates transformative potential across multiple domains: (1) Academic Innovation: Accelerating hypothesis validation through automated literature synthesis (e.g., HotpotQA [ 307 ] performance benchmarks) and enabling researchers to explore broader inter-disciplinary connections that might otherwise remain undiscovered. The transformative potential of Deep Research extends beyond individual applications to fundamentally reshape scientific discovery processes. As Sourati and", "score": 0.6453216}, {"title": "A Comprehensive Survey of Deep Learning Applications in Big Data ...", "url": "https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5005162", "content": "Keywords: Deep learning, Big Data Analytics, Convolutional neural networks, recurrent neural networks, Long short-term memory\\n\\nSuggested Citation:\\nSuggested Citation\\n\\nKhan, Ayaz H., A Comprehensive Survey of Deep Learning Applications in Big Data Analytics: Trends, Techniques, and Future Directions. 
Available at SSRN:  or \\n\\n### Ayaz H. Khan (Contact Author)\\n\\n#### affiliation not provided to SSRN ( email) )\\n\\nDownload This Paper\\n\\nOpen PDF in Browser\\n\\n### 0 References\\n\\n### 0 Citations [...] Skip to main content\\n\\nDownload This Paper\\n\\nOpen PDF in Browser\\n\\nAdd Paper to My Library\\n\\nUsing these links will ensure access to this page indefinitely\\n\\nCopy URL\\n\\nCopy DOI\\n\\n# A Comprehensive Survey of Deep Learning Applications in Big Data Analytics: Trends, Techniques, and Future Directions\\n\\n27 Pages\\nPosted: 19 Nov 2024\\n\\nSee all articles by Ayaz H. Khan\\n\\n## Ayaz H. Khan\\n\\naffiliation not provided to SSRN\\n\\n### Abstract", "score": 0.63316137}]', name='tavily_search_results_json', id='71dc7bec-cd59-49d8-893b-7fbd60f6ad5a', tool_call_id='call_YM9MaTSpBXRDPYLdEEsxt5IB', artifact={'query': 'author of A Comprehensive Survey of Deep Research', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://arxiv.org/abs/2506.12594', 'title': '[2506.12594] A Comprehensive Survey of Deep Research - arXiv', 'content': 'We gratefully acknowledge support from the Simons Foundation, member institutions, and all contributors. Donate\n\n> cs > arXiv:2506.12594\n\n# Computer Science > Artificial Intelligence\n\narXiv:2506.12594 (cs)\n\n[Submitted on 14 Jun 2025]\n\n# Title:A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications\n\nAuthors:Renjun Xu, Jingwen Peng [...] View a PDF of the paper titled A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications, by Renjun Xu and 1 other authors [...] 
View a PDF of the paper titled A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications, by Renjun Xu and 1 other authors\n\n View PDF\n HTML (experimental)\n TeX Source\n Other Formats\n\nview license\n\nCurrent browse context:\n\ncs.AI\n\n< prev")    |    next >")\n\nnew  |  recent  | 2025-06\n\nChange to browse by:\n\ncs cs.MA\n\n### References & Citations\n\n NASA ADS\n Google Scholar\n Semantic Scholar\n\na export BibTeX citation Loading...\n\n## BibTeX formatted citation\n\n×', 'score': 0.90289277, 'raw_content': None}, {'url': 'https://pubmed.ncbi.nlm.nih.gov/36462229/', 'title': 'A comprehensive survey of deep learning research on medical ...', 'content': 'PMID: 36462229\n    DOI: 10.1016/j.clinimag.2022.11.003\n\n Item in Clipboard \n\nReview\n\nA comprehensive survey of deep learning research on medical image analysis with focus on transfer learning\n\nSema Atasever et al. Clin Imaging.2023 Feb.\n\nShow details\n\nDisplay options\n\n Display options \n\n Format \n\n Clin Imaging \n\nActions\n\n   Search in PubMed\n   Search in NLM Catalog\n   Add to Search\n\n. 2023 Feb:94:18-41.\n\n doi: 10.1016/j.clinimag.2022.11.003.  Epub 2022 Nov 12. \n\n### Authors [...] doi: 10.1016/j.clinimag.2022.11.003.  Epub 2022 Nov 12. \n\nA comprehensive survey of deep learning research on medical image analysis with focus on transfer learning\n\nSema Atasever1,Nuh Azginoglu2,Duygu Sinanc Terzi3,Ramazan Terzi4\n\n Affiliations  Expand \n\n### Affiliations [...] A Deep Convolutional Neural Network for Pneumonia Detection in X-ray Images with Attention Ensemble.An Q, Chen W, Shao W.An Q, et al.Diagnostics (Basel). 2024 Feb 11;14(4):390. doi: 10.3390/diagnostics14040390.Diagnostics (Basel). 
2024.PMID: 38396430 Free PMC article.', 'score': 0.71425, 'raw_content': None}, {'url': 'https://www.mdpi.com/1424-8220/25/2/531', 'title': 'A Comprehensive Survey of Deep Learning Approaches in Image ...', 'content': 'Elias Dritsas\n\nElias Dritsas\n\nSciProfiles  Scilit  Preprints.org  Google Scholar\n\n \\\n\nIndustrial Systems Institute (ISI), Athena Research and Innovation Center, 26504 Patras, Greece\n\n\\\n\nAuthor to whom correspondence should be addressed.\n\nSensors 2025, 25(2), 531; \n\nSubmission received: 20 December 2024 / Revised: 13 January 2025 / Accepted: 13 January 2025 / Published: 17 January 2025 [...] Visit our dedicated information section to learn more about MDPI.\n\n Get Information\n\nclear\n\n## JSmol Viewer\n\nclear\n\nfirst\\_page\n\n Download PDF \n\nsettings\n\n Order Article Reprints\n\nFont Type:\n\nArial Georgia Verdana\n\nFont Size:\n\nAa Aa Aa\n\nLine Spacing:\n\n\uf034   \uf034   \uf034\n\nColumn Width:\n\n\uf035   \uf035   \uf035\n\nBackground:\n\nOpen AccessReview\n\n# A Comprehensive Survey of Deep Learning Approaches in Image Processing\n\nby \n\nMaria Trigka\n\nMaria Trigka\n\nSciProfiles  Scilit  Preprints.org  Google Scholar\n\nand [...] transferring irrelevant knowledge, ensuring effective domain alignment. Apart from its contribution to negative transfer mitigation, it also enhances fine-grained feature extraction, addresses the scarcity of labeled data with self-supervised pre-training, and resolves class imbalance using key point sensitive loss. These strategies demonstrate the solution’s robustness in sonar image classification challenges. 
A summary of topics discussed regarding transfer learning techniques is presented in', 'score': 0.6469293, 'raw_content': None}, {'url': 'https://arxiv.org/pdf/2506.12594?', 'title': '[PDF] A Comprehensive Survey of Deep Research - arXiv', 'content': '> (2023 - February 2025) Early prototypes and foundational approaches (February - March 2025) Commercial releases and competitive rivalry (March 2025 - Present) Multi-modal integration and diverse applications\n> Google Gemini Deep Research\n> Dec 2024\n> OpenAI Deep Research\n> Feb 2025\n> Manus\n> Mar 2025\n> Perplexity Deep Research\n> Feb 2025\n> AutoGLM-Research\n> Mar 2025\n> QwenLM/ Qwen-Agent\n> Apr 2024\n> n8n\n> 2023 mshumer/ OpenDeepResearcher\n> Feb 2025\n> nickscamara/ open-deep-research [...] Human-AI Collaboration and Standardization 72 9 Conclusion 76 9.1 Key Findings and Contributions 76 9.2 Limitations and Outlook 78 9.3 Broader Implications 79 9.4 Final Thoughts 80 References 81 4 Xu et al. [...] Deep Research demonstrates transformative potential across multiple domains: (1) Academic Innovation: Accelerating hypothesis validation through automated literature synthesis (e.g., HotpotQA [ 307 ] performance benchmarks) and enabling researchers to explore broader inter-disciplinary connections that might otherwise remain undiscovered. The transformative potential of Deep Research extends beyond individual applications to fundamentally reshape scientific discovery processes. As Sourati and', 'score': 0.6453216, 'raw_content': None}, {'url': 'https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5005162', 'title': 'A Comprehensive Survey of Deep Learning Applications in Big Data ...', 'content': 'Keywords: Deep learning, Big Data Analytics, Convolutional neural networks, recurrent neural networks, Long short-term memory\n\nSuggested Citation:\nSuggested Citation\n\nKhan, Ayaz H., A Comprehensive Survey of Deep Learning Applications in Big Data Analytics: Trends, Techniques, and Future Directions. 
Available at SSRN:  or \n\n### Ayaz H. Khan (Contact Author)\n\n#### affiliation not provided to SSRN ( email) )\n\nDownload This Paper\n\nOpen PDF in Browser\n\n### 0 References\n\n### 0 Citations [...] Skip to main content\n\nDownload This Paper\n\nOpen PDF in Browser\n\nAdd Paper to My Library\n\nUsing these links will ensure access to this page indefinitely\n\nCopy URL\n\nCopy DOI\n\n# A Comprehensive Survey of Deep Learning Applications in Big Data Analytics: Trends, Techniques, and Future Directions\n\n27 Pages\nPosted: 19 Nov 2024\n\nSee all articles by Ayaz H. Khan\n\n## Ayaz H. Khan\n\naffiliation not provided to SSRN\n\n### Abstract', 'score': 0.63316137, 'raw_content': None}], 'response_time': 0.79, 'request_id': 'd869c154-7179-458d-8b1c-df03ae09e016'})]



Receiving update from node: 'agent'
[AIMessage(content='I found the research paper titled "A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications" authored by Renjun Xu and Jingwen Peng, published on June 14, 2025. \n\nRegarding the authors\' current affiliations, I will now search for each of them to find out where they work now.', additional_kwargs={'tool_calls': [{'id': 'call_Xan9YYzHQZ0CffvyAFN2LBGs', 'function': {'arguments': '{"query": "Renjun Xu"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}, {'id': 'call_vY9adnYm9qZrgNxGRMbrcWQt', 'function': {'arguments': '{"query": "Jingwen Peng"}', 'name': 'tavily_search_results_json'}, 'type': 'function'}], 'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 124, 'prompt_tokens': 2764, 'total_tokens': 2888, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_7c233bf9d1', 'id': 'chatcmpl-CJ7I3grg4v6gBIwxHSgJBwL285L0P', 'service_tier': 'default', 'finish_reason': 'tool_calls', 'logprobs': None}, id='run--cf6cafb9-db36-4993-add4-052dff16a20f-0', tool_calls=[{'name': 'tavily_search_results_json', 'args': {'query': 'Renjun Xu'}, 'id': 'call_Xan9YYzHQZ0CffvyAFN2LBGs', 'type': 'tool_call'}, {'name': 'tavily_search_results_json', 'args': {'query': 'Jingwen Peng'}, 'id': 'call_vY9adnYm9qZrgNxGRMbrcWQt', 'type': 'tool_call'}], usage_metadata={'input_tokens': 2764, 'output_tokens': 124, 'total_tokens': 2888, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]



Receiving update from node: 'action'
Tool Used: tavily_search_results_json
[ToolMessage(content='[{"title": "Renjun Xu - Researcher, Zhejiang University - OpenReview", "url": "https://openreview.net/profile?id=~Renjun_Xu1", "content": "# Renjun Xu\\n\\n### Principal Researcher, Zhejiang University\\n\\n#### Names\\n\\n#### Emails\\n\\n#### Personal Links\\n\\n#### Career & Education History\\n\\n#### Advisors, Relations & Conflicts\\n\\nNo relations added\\n\\n#### Expertise\\n\\n#### Publications\\n\\n#### scKGOT: Intercellular Signaling Inference with Knowledge Graph Optimal Transport for Single-cell Transcriptomics)\\n\\n#### $E(2)$-Equivariant Vision Transformer)\\n\\n#### Critical Temperature Prediction of Superconductors Based on Machine Learning: A Short Review) [...] #### Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition)\\n\\n#### Hierarchical knowledge amalgamation with dual discriminative feature alignment)\\n\\n#### Modeling Dynamic Missingness of Implicit Feedback for Sequential Recommendation)\\n\\n#### S2SNet: A Pretrained Neural Network for Superconductivity Discovery)\\n\\n#### Learning Interest-oriented Universal User Representation via Self-supervision) [...] #### Learning Universal User Representations via Self-Supervised Lifelong Behaviors Modeling)\\n\\n#### Learning Invariant Representations across Domains and Tasks)\\n\\n#### Co-Authors\\n\\nOpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. 
© 2025 OpenReview", "score": 0.6410185}, {"title": "Renjun Xu - Center for Data Science, Zhejiang University | 人才画像", "url": "https://www.aminer.cn/profile/renjun-xu/53f42ceddabfaedd74d30355?source=bz1", "content": "Renjun Xu - Center for Data Science, Zhejiang University | 人才画像 - AMiner\\n\\n\\n\\nResearch\\n\\nCenter for Data Science Zhejiang University\\n\\n、《International Joint Conference on Artificial Intelligence》(IJCAI, CCF-A), 《IEEE Transactions on Knowledge and Data Engineering》(TKDE, CCF-A)交叉领域发表多篇国际顶尖期刊和会议文章，CVPR、AAAI、NIPS、TPAMI、TIP、TLT等顶级人工智能期刊和会议程序委员会委员，荣获2020年度世界人工智能大会青年优秀论文提名奖，指导并推荐的所有学生均已拿到麻省理工学院(MIT)、卡内基梅隆大学(CMU)等全球顶尖名校的offer！\\n\\nEducation\\n\\nSign in to view more\\n\\nExperience\\n\\nSign in to view more [...] Research Interests\\n\\n2012 2025\\n\\nPapers 共 39 篇 Patents 共 9 篇 Author Statistics Co-Author Similar Experts\\n\\nBy Year By Citation 主题筛选 期刊级别筛选 合作者筛选 合作机构筛选\\n\\n时间\\n\\n引用量\\n\\n主题\\n\\n期刊级别\\n\\n合作者\\n\\n合作机构\\n\\nAll 2025 2024 2023 2022 2021 2020 2015 2014 2013 2012 2010 2006\\n\\nDo PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning Vs. Memorization in Large Language Models\\n\\nYang Yan,Yu Lu,Renjun Xu,Zhenzhong Lan\\n\\narXiv · Computation and Language（2025）\\n\\nCited 0 Views 11 Bibtex\\n\\n0\\n\\n11 [...] The page data are from open Internet sources, cooperative publishers and automatic analysis results through AI technology. We do not make any commitments and guarantees for the validity, accuracy, correctness, reliability, completeness and timeliness of the page data. 
If you have any questions, please contact us by email: report@aminer.cn\\n\\nSwipe to Fine Result", "score": 0.60049355}, {"title": "Renjun Hu\'s Homepage", "url": "https://hurenjun.github.io/", "content": "an algorithm engineer at Alibaba Cloud, contributing to AI-driven transformations across various business\\ndomains including feed recommendation, user growth, online marketing, and LLM-as-a-Judge.\\nSince January 2025, he has joined the School of Data Science and Engineering, East China Normal University\\nas a young researcher.\\nHis recent research interests include robust machine learning and the understanding, evaluation, and applications of large language models. [...] Renjun Hu received his Bachelor\'s degree in 2014 and Ph.D. in 2020 from the School of Computer Science\\nand Engineering at Beihang University. From September 2017 to April 2018, he was a joint Ph.D. student\\nin the Data Mining Group at Rutgers University. He then worked as a research intern at the Business\\nIntelligence Lab of Baidu Research from May 2018 to September 2019. During 2020 to 2024, he served as [...] Renjun\'s avatar\\n\\n### Renjun Hu     (胡仁君)\\n\\n#### Young Researcher\\n\\nSchool of Data Science of Engineering (DaSE)  \\nEast China Normal University (ECNU)\\n\\n \\nRoom X109, Shuxueguan, Putuo Campus\\n\\n \\nrjhu [at] dase.ecnu.edu.cn    renjun0hu [at] gmail.com\\n\\n### Short Bio", "score": 0.59880555}, {"title": "Renjun XU | UCD | Department of Physics | Research profile", "url": "https://www.researchgate.net/profile/Renjun-Xu", "content": "Skills and Expertise: Electronic Structure. Current institution: University of California, Davis. University of California, Davis.", "score": 0.5683445}, {"title": "Xu Genjun - Wikipedia", "url": "https://en.wikipedia.org/wiki/Xu_Genjun", "content": "Xu was born in She County, Anhui on 23 November 1935. 
He graduated from the Chemistry Department of Fudan University in 1957, and joined the Shanghai Institute of Biochemistry afterwards, where he participated in and made significant contributions to the insulin synthesis project led by Wang Yinglai. He later researched enzyme kinetics, chemical modification of proteins, the relationship between proteins\' structure and function, and the folding and refolding of proteins. [...] Xu published more than 100 research papers in academic journals. He was a two-time winner of the State Natural Science Award (first class), in addition to several national prizes by the Chinese Academy of Sciences (CAS). He was also awarded the Ho Leung Ho Lee Prize for Life Sciences. He was elected as an academician of the CAS in 1991.\\n\\nXu died on 8 January 2008 at Zhongshan Hospital in Shanghai, at the age of 72.\\n\\n## References\\n\\nWikimedia Foundation\\nPowered by MediaWiki [...] Wikipedia\\nThe Free Encyclopedia\\n\\n## Contents\\n\\n# Xu Genjun\\n\\nXu Genjun (Chinese: 许根俊; 23 November 1935 – 8 January 2008) was a Chinese biochemist. He was a professor at the Shanghai Institute of Biochemistry and Cell Biology. 
He was an academician of the Chinese Academy of Sciences and President of the Chinese Society of Biochemistry and Molecular Biology.\\n\\n## Biography", "score": 0.25375047}]', name='tavily_search_results_json', id='1ee3a6a4-2d41-4ddb-94a2-d6607993b8ee', tool_call_id='call_Xan9YYzHQZ0CffvyAFN2LBGs', artifact={'query': 'Renjun Xu', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://openreview.net/profile?id=~Renjun_Xu1', 'title': 'Renjun Xu - Researcher, Zhejiang University - OpenReview', 'content': '# Renjun Xu\n\n### Principal Researcher, Zhejiang University\n\n#### Names\n\n#### Emails\n\n#### Personal Links\n\n#### Career & Education History\n\n#### Advisors, Relations & Conflicts\n\nNo relations added\n\n#### Expertise\n\n#### Publications\n\n#### scKGOT: Intercellular Signaling Inference with Knowledge Graph Optimal Transport for Single-cell Transcriptomics)\n\n#### $E(2)$-Equivariant Vision Transformer)\n\n#### Critical Temperature Prediction of Superconductors Based on Machine Learning: A Short Review) [...] #### Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition)\n\n#### Hierarchical knowledge amalgamation with dual discriminative feature alignment)\n\n#### Modeling Dynamic Missingness of Implicit Feedback for Sequential Recommendation)\n\n#### S2SNet: A Pretrained Neural Network for Superconductivity Discovery)\n\n#### Learning Interest-oriented Universal User Representation via Self-supervision) [...] #### Learning Universal User Representations via Self-Supervised Lifelong Behaviors Modeling)\n\n#### Learning Invariant Representations across Domains and Tasks)\n\n#### Co-Authors\n\nOpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. 
© 2025 OpenReview', 'score': 0.6410185, 'raw_content': None}, {'url': 'https://www.aminer.cn/profile/renjun-xu/53f42ceddabfaedd74d30355?source=bz1', 'title': 'Renjun Xu - Center for Data Science, Zhejiang University | 人才画像', 'content': 'Renjun Xu - Center for Data Science, Zhejiang University | 人才画像 - AMiner\n\n\n\nResearch\n\nCenter for Data Science Zhejiang University\n\n、《International Joint Conference on Artificial Intelligence》(IJCAI, CCF-A), 《IEEE Transactions on Knowledge and Data Engineering》(TKDE, CCF-A)交叉领域发表多篇国际顶尖期刊和会议文章，CVPR、AAAI、NIPS、TPAMI、TIP、TLT等顶级人工智能期刊和会议程序委员会委员，荣获2020年度世界人工智能大会青年优秀论文提名奖，指导并推荐的所有学生均已拿到麻省理工学院(MIT)、卡内基梅隆大学(CMU)等全球顶尖名校的offer！\n\nEducation\n\nSign in to view more\n\nExperience\n\nSign in to view more [...] Research Interests\n\n2012 2025\n\nPapers 共 39 篇 Patents 共 9 篇 Author Statistics Co-Author Similar Experts\n\nBy Year By Citation 主题筛选 期刊级别筛选 合作者筛选 合作机构筛选\n\n时间\n\n引用量\n\n主题\n\n期刊级别\n\n合作者\n\n合作机构\n\nAll 2025 2024 2023 2022 2021 2020 2015 2014 2013 2012 2010 2006\n\nDo PhD-level LLMs Truly Grasp Elementary Addition? Probing Rule Learning Vs. Memorization in Large Language Models\n\nYang Yan,Yu Lu,Renjun Xu,Zhenzhong Lan\n\narXiv · Computation and Language（2025）\n\nCited 0 Views 11 Bibtex\n\n0\n\n11 [...] The page data are from open Internet sources, cooperative publishers and automatic analysis results through AI technology. We do not make any commitments and guarantees for the validity, accuracy, correctness, reliability, completeness and timeliness of the page data. 
If you have any questions, please contact us by email: report@aminer.cn\n\nSwipe to Fine Result', 'score': 0.60049355, 'raw_content': None}, {'url': 'https://hurenjun.github.io/', 'title': "Renjun Hu's Homepage", 'content': "an algorithm engineer at Alibaba Cloud, contributing to AI-driven transformations across various business\ndomains including feed recommendation, user growth, online marketing, and LLM-as-a-Judge.\nSince January 2025, he has joined the School of Data Science and Engineering, East China Normal University\nas a young researcher.\nHis recent research interests include robust machine learning and the understanding, evaluation, and applications of large language models. [...] Renjun Hu received his Bachelor's degree in 2014 and Ph.D. in 2020 from the School of Computer Science\nand Engineering at Beihang University. From September 2017 to April 2018, he was a joint Ph.D. student\nin the Data Mining Group at Rutgers University. He then worked as a research intern at the Business\nIntelligence Lab of Baidu Research from May 2018 to September 2019. During 2020 to 2024, he served as [...] Renjun's avatar\n\n### Renjun Hu     (胡仁君)\n\n#### Young Researcher\n\nSchool of Data Science of Engineering (DaSE)  \nEast China Normal University (ECNU)\n\n \nRoom X109, Shuxueguan, Putuo Campus\n\n \nrjhu [at] dase.ecnu.edu.cn    renjun0hu [at] gmail.com\n\n### Short Bio", 'score': 0.59880555, 'raw_content': None}, {'url': 'https://www.researchgate.net/profile/Renjun-Xu', 'title': 'Renjun XU | UCD | Department of Physics | Research profile', 'content': 'Skills and Expertise: Electronic Structure. Current institution: University of California, Davis. University of California, Davis.', 'score': 0.5683445, 'raw_content': None}, {'url': 'https://en.wikipedia.org/wiki/Xu_Genjun', 'title': 'Xu Genjun - Wikipedia', 'content': "Xu was born in She County, Anhui on 23 November 1935. 
He graduated from the Chemistry Department of Fudan University in 1957, and joined the Shanghai Institute of Biochemistry afterwards, where he participated in and made significant contributions to the insulin synthesis project led by Wang Yinglai. He later researched enzyme kinetics, chemical modification of proteins, the relationship between proteins' structure and function, and the folding and refolding of proteins. [...] Xu published more than 100 research papers in academic journals. He was a two-time winner of the State Natural Science Award (first class), in addition to several national prizes by the Chinese Academy of Sciences (CAS). He was also awarded the Ho Leung Ho Lee Prize for Life Sciences. He was elected as an academician of the CAS in 1991.\n\nXu died on 8 January 2008 at Zhongshan Hospital in Shanghai, at the age of 72.\n\n## References\n\nWikimedia Foundation\nPowered by MediaWiki [...] Wikipedia\nThe Free Encyclopedia\n\n## Contents\n\n# Xu Genjun\n\nXu Genjun (Chinese: 许根俊; 23 November 1935 – 8 January 2008) was a Chinese biochemist. He was a professor at the Shanghai Institute of Biochemistry and Cell Biology. 
He was an academician of the Chinese Academy of Sciences and President of the Chinese Society of Biochemistry and Molecular Biology.\n\n## Biography", 'score': 0.25375047, 'raw_content': None}], 'response_time': 0.95, 'request_id': 'e5245655-f4db-4404-beca-9ee0ad6860b8'}), ToolMessage(content='[{"title": "Jingwen Peng, CFA - Director - Lead Data Steward at Liberty Mutual ...", "url": "https://www.linkedin.com/in/jingwen-peng-cfa-3a69b011?trk=public_profile_browsemap", "content": "Director - Lead Data Steward at Liberty Mutual Investments · Experience: Liberty Mutual Investments · Location: Boston · 500+ connections on LinkedIn.", "score": 0.70823324}, {"title": "Jingwen Peng - U of Rochester Simon STEM MSBA - LinkedIn", "url": "https://www.linkedin.com/in/jpeng19", "content": "●\\tAuthored product requirement documentation, followed up on new features and optimization based on customer feedback.\\n\\n●\\tDevised creative short videos content, elevated ad CTR by 4 times on Google Ads and refined data-driven strategies.\\n\\n●\\tGrew social media presence through influencer marketing, acquired 1,000+ new users on Instagram, FB, and YouTube. [...] At Amazon, my role as a Site Merchandising analyst was pivotal in enhancing the Kindle E-book business through data-driven strategies<br><br> By automating reports and refining forecasting models, our team significantly improved operational efficiency and influenced strategic decisions that positively impacted the growth of E-book store and Kindle Unlimited<br><br> My technical acumen, particularly in Excel, SQL and Python, was instrumental in automating processes, reducing manual workloads, [...] 
## Experience\\n### Marketing Analyst  \\nEmpire City Casino  \\nMay 2025 - Present   \\nYonkers, New York, United States  \\n\\n### Amazon  \\nAmazon  \\nN/A - Present   \\nBeijing, China  \\n\\n### Marketing Analyst  \\nJiyuzhoutian Information Technology  \\nApr 2019 - Sep 2019   \\nBeijing, China  \\n●\\tSEO for website CouponBirds and ASO for app Spark, improved CTR by 50%, enhancing visibility and user acquisition.", "score": 0.6301622}, {"title": "[PDF] SCHWARZMAN SCHOLARS CLASS OF 2018", "url": "https://www.schwarzmanscholars.org/wp-content/uploads/2020/05/Schwarzman-Scholars-Class-of-2018-profiles-126-Scholars.pdf", "content": "Jingwen SUN graduated in May 2017 from the National University of Singapore with a Bachelor’s degree in Computing and Business. She is passionate about innovation and entrepreneurship. She has worked with venture capital funds in Israel and China and also worked with Accenture and Analysys Mason as an analyst intern. Jingwen is excited to join the Schwarzman Scholars program to learn more about leadership within a global context. Jingwen is 23 years old and from China. JONATHAN PADILLA UNITED [...] Assistant at Purdue, supporting both domestic and international students, and served as Vice President of Alpha Kappa Psi. She most recently worked in the Risk Assurance practice at PricewaterhouseCoopers in Chicago. At Schwarzman College, she aspires to amplify her leadership and business skills as both a humanitarian and an entrepreneur to improve and empower the education of youth in developing regions. Jing is 24 years old and from China. JINGWEN SUN CHINA NATIONAL UNIVERSITY OF SINGAPORE [...] over 10 villages in 4 provinces, interviewing over 600 people. He was appointed Academic Organizer of the China Youth Economic Forum and also served as the President of the Student Union of the School of Economics and the Chairman of the Standing Committee of the Students’ Congress. Peng is 22 years old and from China. 
PRESTON LIM CANADA PRINCETON UNIVERSITY Preston graduated in June 2017 from Princeton University with a major in Near Eastern Studies and a minor in History and the Practice of", "score": 0.38366494}, {"title": "Professor Peng GONG - GEOG | HKU", "url": "https://geog.hku.hk/p-gong", "content": "Guan, D.\\\\, Wang, D., Hallegatte, S. Steven J. Davis, Jingwen Huo, Shuping Li, Yangchun Bai, Tianyang Lei, Qianyu Xue, D’Maris Coffman, Danyang Cheng, Peipei Chen, Xi Liang, Bing Xu, Xiaosheng Lu, Shouyang Wang, Klaus Hubacek & Peng Gong. Global supply-chain effects of COVID-19 control measures. Nature Human Behaviour (2020). 4(6):577-587. . [...] Gong, P\\\\, Bin Chen, Xuecao Li, Han Liu, JieWang, Yuqi Bai, Jingming Chen, Xi Chen, Lei Fang, Shuailong Feng, Yongjiu Feng, Yali Gong, Hao Gu, Huabing Huang, Xiaochun Huang, Hongzan Jiao, Yingdong Kang, Guangbin Lei, Ainong Li, Xiaoting Li, Xun Li, Yuechen Li, Zhilin Li, Zhongde Li, Chong Liu, Chunxia Liu, Maochou Liu, Shuguang Liu, Wanliu Mao, Changhong Miao, Hao Ni, Qisheng Pan, Shuhua Qi, Zhehao Ren, Zhuoran Shan, Shaoqing Shen, Minjun Shi, Yimeng Song, Mo Su, HoiPing Suen, Bo Sun, Fangdi [...] Beijing Normal University, and the Tsinghua Urban Institute at Tsinghua University. 
In 2020, he led an expert preparation group in establishing the Vanke School of Public Health at Tsinghua.", "score": 0.30464175}, {"title": "Investigators | Center for Computational Biology and Bioinformatics", "url": "https://medicine.iu.edu/research-centers/computational-biology-bioinformatics/investigators", "content": "Associate Professor of Bioinformatics and Data Science\\n\\nRead Bio\\n\\n### Bohdan Khomtchouk, PhD\\n\\nAssistant Professor of Bioinformatics and Data Science\\n\\nRead Bio\\n\\n### Juexin Wang, PhD\\n\\nAssistant Professor of Bioinformatics\\n\\nRead Bio\\n\\n### Jingwen Yan, PhD\\n\\nAssistant Professor of Bioinformatics\\n\\nRead Bio\\n\\n### IU Bloomington Luddy School of Informatics, Computing and Engineering\\n\\n### Yijie Wang, PhD\\n\\nAssistant Professor of Computer Science\\n\\nRead Bio\\n\\n### Purdue University in Indianapolis [...] ### Jing Liu, PhD\\n\\nAssistant Professor of Physics\\n\\nRead Bio\\n\\n## Associate Members\\n\\nJunior computational scientists conducting research in quantitative biomedical sciences\\n\\n### Xiaoqing Huang, PhD\\n\\nAssistant Research Professor of Biostatistics & Health Data Science\\n\\nRead Bio\\n\\n### Gang Peng, PhD\\n\\nAssistant Professor of Medical & Molecular Genetics\\n\\nRead Bio\\n\\n### Jie Ren, PhD\\n\\nAssistant Professor of Biostatistics & Health Data Science\\n\\nRead Bio\\n\\n### Tae-Hwi L. Schwantes-An, PhD [...] 
Professor of Biochemistry & Molecular Biology\\n\\nRead Bio\\n\\n### Ian Webb, PhD\\n\\nAssistant Professor, Department of Chemistry & Chemical Biology\\n\\nRead Bio\\n\\n### Kai Yang, PhD\\n\\nAssociate Professor of Pediatrics\\n\\nRead Bio\\n\\n### Lei Yang, PhD\\n\\nProfessor of Pediatrics\\n\\nRead Bio\\n\\n### Ji Zhang, PhD\\n\\nAssociate Professor of Pediatrics\\n\\nRead Bio", "score": 0.19255488}]', name='tavily_search_results_json', id='edcda2ce-0519-47f0-8101-5cc3d1d6f5c6', tool_call_id='call_vY9adnYm9qZrgNxGRMbrcWQt', artifact={'query': 'Jingwen Peng', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://www.linkedin.com/in/jingwen-peng-cfa-3a69b011?trk=public_profile_browsemap', 'title': 'Jingwen Peng, CFA - Director - Lead Data Steward at Liberty Mutual ...', 'content': 'Director - Lead Data Steward at Liberty Mutual Investments · Experience: Liberty Mutual Investments · Location: Boston · 500+ connections on LinkedIn.', 'score': 0.70823324, 'raw_content': None}, {'url': 'https://www.linkedin.com/in/jpeng19', 'title': 'Jingwen Peng - U of Rochester Simon STEM MSBA - LinkedIn', 'content': '●\tAuthored product requirement documentation, followed up on new features and optimization based on customer feedback.\n\n●\tDevised creative short videos content, elevated ad CTR by 4 times on Google Ads and refined data-driven strategies.\n\n●\tGrew social media presence through influencer marketing, acquired 1,000+ new users on Instagram, FB, and YouTube. [...] 
At Amazon, my role as a Site Merchandising analyst was pivotal in enhancing the Kindle E-book business through data-driven strategies<br><br> By automating reports and refining forecasting models, our team significantly improved operational efficiency and influenced strategic decisions that positively impacted the growth of E-book store and Kindle Unlimited<br><br> My technical acumen, particularly in Excel, SQL and Python, was instrumental in automating processes, reducing manual workloads, [...] ## Experience\n### Marketing Analyst  \nEmpire City Casino  \nMay 2025 - Present   \nYonkers, New York, United States  \n\n### Amazon  \nAmazon  \nN/A - Present   \nBeijing, China  \n\n### Marketing Analyst  \nJiyuzhoutian Information Technology  \nApr 2019 - Sep 2019   \nBeijing, China  \n●\tSEO for website CouponBirds and ASO for app Spark, improved CTR by 50%, enhancing visibility and user acquisition.', 'score': 0.6301622, 'raw_content': None}, {'url': 'https://www.schwarzmanscholars.org/wp-content/uploads/2020/05/Schwarzman-Scholars-Class-of-2018-profiles-126-Scholars.pdf', 'title': '[PDF] SCHWARZMAN SCHOLARS CLASS OF 2018', 'content': 'Jingwen SUN graduated in May 2017 from the National University of Singapore with a Bachelor’s degree in Computing and Business. She is passionate about innovation and entrepreneurship. She has worked with venture capital funds in Israel and China and also worked with Accenture and Analysys Mason as an analyst intern. Jingwen is excited to join the Schwarzman Scholars program to learn more about leadership within a global context. Jingwen is 23 years old and from China. JONATHAN PADILLA UNITED [...] Assistant at Purdue, supporting both domestic and international students, and served as Vice President of Alpha Kappa Psi. She most recently worked in the Risk Assurance practice at PricewaterhouseCoopers in Chicago. 
At Schwarzman College, she aspires to amplify her leadership and business skills as both a humanitarian and an entrepreneur to improve and empower the education of youth in developing regions. Jing is 24 years old and from China. JINGWEN SUN CHINA NATIONAL UNIVERSITY OF SINGAPORE [...] over 10 villages in 4 provinces, interviewing over 600 people. He was appointed Academic Organizer of the China Youth Economic Forum and also served as the President of the Student Union of the School of Economics and the Chairman of the Standing Committee of the Students’ Congress. Peng is 22 years old and from China. PRESTON LIM CANADA PRINCETON UNIVERSITY Preston graduated in June 2017 from Princeton University with a major in Near Eastern Studies and a minor in History and the Practice of', 'score': 0.38366494, 'raw_content': None}, {'url': 'https://geog.hku.hk/p-gong', 'title': 'Professor Peng GONG - GEOG | HKU', 'content': 'Guan, D.\\, Wang, D., Hallegatte, S. Steven J. Davis, Jingwen Huo, Shuping Li, Yangchun Bai, Tianyang Lei, Qianyu Xue, D’Maris Coffman, Danyang Cheng, Peipei Chen, Xi Liang, Bing Xu, Xiaosheng Lu, Shouyang Wang, Klaus Hubacek & Peng Gong. Global supply-chain effects of COVID-19 control measures. Nature Human Behaviour (2020). 4(6):577-587. . [...] Gong, P\\, Bin Chen, Xuecao Li, Han Liu, JieWang, Yuqi Bai, Jingming Chen, Xi Chen, Lei Fang, Shuailong Feng, Yongjiu Feng, Yali Gong, Hao Gu, Huabing Huang, Xiaochun Huang, Hongzan Jiao, Yingdong Kang, Guangbin Lei, Ainong Li, Xiaoting Li, Xun Li, Yuechen Li, Zhilin Li, Zhongde Li, Chong Liu, Chunxia Liu, Maochou Liu, Shuguang Liu, Wanliu Mao, Changhong Miao, Hao Ni, Qisheng Pan, Shuhua Qi, Zhehao Ren, Zhuoran Shan, Shaoqing Shen, Minjun Shi, Yimeng Song, Mo Su, HoiPing Suen, Bo Sun, Fangdi [...] Beijing Normal University, and the Tsinghua Urban Institute at Tsinghua University. 
In 2020, he led an expert preparation group in establishing the Vanke School of Public Health at Tsinghua.', 'score': 0.30464175, 'raw_content': None}, {'url': 'https://medicine.iu.edu/research-centers/computational-biology-bioinformatics/investigators', 'title': 'Investigators | Center for Computational Biology and Bioinformatics', 'content': 'Associate Professor of Bioinformatics and Data Science\n\nRead Bio\n\n### Bohdan Khomtchouk, PhD\n\nAssistant Professor of Bioinformatics and Data Science\n\nRead Bio\n\n### Juexin Wang, PhD\n\nAssistant Professor of Bioinformatics\n\nRead Bio\n\n### Jingwen Yan, PhD\n\nAssistant Professor of Bioinformatics\n\nRead Bio\n\n### IU Bloomington Luddy School of Informatics, Computing and Engineering\n\n### Yijie Wang, PhD\n\nAssistant Professor of Computer Science\n\nRead Bio\n\n### Purdue University in Indianapolis [...] ### Jing Liu, PhD\n\nAssistant Professor of Physics\n\nRead Bio\n\n## Associate Members\n\nJunior computational scientists conducting research in quantitative biomedical sciences\n\n### Xiaoqing Huang, PhD\n\nAssistant Research Professor of Biostatistics & Health Data Science\n\nRead Bio\n\n### Gang Peng, PhD\n\nAssistant Professor of Medical & Molecular Genetics\n\nRead Bio\n\n### Jie Ren, PhD\n\nAssistant Professor of Biostatistics & Health Data Science\n\nRead Bio\n\n### Tae-Hwi L. Schwantes-An, PhD [...] Professor of Biochemistry & Molecular Biology\n\nRead Bio\n\n### Ian Webb, PhD\n\nAssistant Professor, Department of Chemistry & Chemical Biology\n\nRead Bio\n\n### Kai Yang, PhD\n\nAssociate Professor of Pediatrics\n\nRead Bio\n\n### Lei Yang, PhD\n\nProfessor of Pediatrics\n\nRead Bio\n\n### Ji Zhang, PhD\n\nAssociate Professor of Pediatrics\n\nRead Bio', 'score': 0.19255488, 'raw_content': None}], 'response_time': 1.51, 'request_id': 'c8e27411-93bd-4624-b423-324fb4cfdc36'})]



Receiving update from node: 'agent'
[AIMessage(content='Renjun Xu is currently a Principal Researcher at Zhejiang University. \n\nJingwen Peng is a Director and Lead Data Steward at Liberty Mutual Investments in Boston.', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 33, 'prompt_tokens': 6533, 'total_tokens': 6566, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 2816}}, 'model_name': 'gpt-4.1-nano-2025-04-14', 'system_fingerprint': 'fp_7c233bf9d1', 'id': 'chatcmpl-CJ7I7RW8imXN233X111n9T4j9RAsu', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None}, id='run--adfcd2e6-6f7f-4f60-8d54-a7542c360934-0', usage_metadata={'input_tokens': 6533, 'output_tokens': 33, 'total_tokens': 6566, 'input_token_details': {'audio': 0, 'cache_read': 2816}, 'output_token_details': {'audio': 0, 'reasoning': 0}})]



# 🤝 Breakout Room #2

## Part 1: LangSmith Evaluator

### Pre-processing for LangSmith

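Before evaluating, let's wrap our LangGraph agent in a simple chain: one step converts a plain `{"text": ...}` input into the message state the graph expects, and another pulls the final answer out of the state the graph returns.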

In [16]:
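def convert_inputs(input_object):
    # Wrap the raw question text in the message-list state the graph expects.
    return {"messages": [HumanMessage(content=input_object["text"])]}

def parse_output(input_state):
    # The last message in the final state holds the agent's answer.
    return {"answer": input_state["messages"][-1].content}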

agent_chain_with_formatting = convert_inputs | simple_agent_graph | parse_output

agent_chain_with_formatting.invoke({"text" : "What is Deep Research?"})

{'answer': 'Deep Research typically refers to an in-depth, comprehensive investigation or analysis of a particular subject or field. It involves thorough examination of available data, literature, and resources to uncover detailed insights, understand complex concepts, or develop new knowledge. Deep Research is often used in academic, scientific, and technological contexts to signify rigorous and meticulous study beyond surface-level information. \n\nWould you like a more specific definition related to a particular industry or context?'}
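
To make the data shape at each stage explicit, here is a minimal offline sketch of the same three-stage pipeline. `fake_agent_graph` is a hypothetical stand-in for `simple_agent_graph` (it makes no model or tool calls), and plain dicts stand in for LangChain message objects, so this only illustrates how data flows through the chain, not how the real agent behaves.

```python
# Offline sketch: plain dicts stand in for LangChain message objects.

def convert_inputs_sketch(input_object: dict) -> dict:
    # Dataset-style input -> graph state holding a single human message.
    return {"messages": [{"role": "human", "content": input_object["text"]}]}

def fake_agent_graph(state: dict) -> dict:
    # Hypothetical stand-in for simple_agent_graph: appends one canned AI reply.
    question = state["messages"][-1]["content"]
    reply = {"role": "ai", "content": f"(stub answer to: {question})"}
    return {"messages": state["messages"] + [reply]}

def parse_output_sketch(state: dict) -> dict:
    # Final graph state -> the {"answer": ...} dict that evaluators receive.
    return {"answer": state["messages"][-1]["content"]}

def run_pipeline(input_object: dict) -> dict:
    # Same order as convert_inputs | simple_agent_graph | parse_output.
    return parse_output_sketch(fake_agent_graph(convert_inputs_sketch(input_object)))

print(run_pipeline({"text": "What is Deep Research?"}))
```

The key point is that the evaluators only ever see the `{"answer": ...}` dict, which is why the custom evaluator later reads `outputs["answer"]`.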

### Task 1: Creating An Evaluation Dataset

Just as we saw last week, we'll want to create a dataset to test our Agent's ability to answer questions.

In order to do this - we'll want to provide some questions and some answers. Let's look at how we can create such a dataset below.

```python
questions = [
    {
        "inputs" : {"text" : "Who were the main authors on the 'A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications' paper?"},
        "outputs" : {"must_mention" : ["Peng", "Xu"]}   
    },
    ...,
    {
        "inputs" : {"text" : "Where do the authors of the 'A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications' work now?"},
        "outputs" : {"must_mention" : ["Zhejiang", "Liberty Mutual"]}
    }
]
```

#### 🏗️ Activity #3:

Please create a dataset in the above format with at least 5 questions that pertain to the cohort use-case (more information [here](https://www.notion.so/Session-4-RAG-with-LangGraph-OSS-Local-Models-Eval-w-LangSmith-26acd547af3d80838d5beba464d7e701#26acd547af3d81d08809c9c82a462bdd)), or the use-case you're hoping to tackle in your Demo Day project.

In [19]:
questions = [
    {
        "inputs" : {"text" : "Who were the main authors on the 'A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications' paper?"},
        "outputs" : {"must_mention" : ["Peng", "Xu"]}   
    },
    {
        "inputs" : {"text" : "Where do the authors of the 'A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications' work now?"},
        "outputs" : {"must_mention" : ["Zhejiang", "Liberty Mutual"]}
    },
    {
        "inputs" : {"text" : "What is the main focus of the Deep Research survey paper?"},
        "outputs" : {"must_mention" : ["systems", "methodologies", "applications"]}
    },
    {
        "inputs" : {"text" : "When was the Deep Research survey paper published?"},
        "outputs" : {"must_mention" : ["2025"]}
    },
    {
        "inputs" : {"text" : "What are some examples of Deep Research systems mentioned in the paper?"},
        "outputs" : {"must_mention" : ["OpenAI", "Gemini", "Perplexity"]}
    }
]
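
Before uploading, it can help to sanity-check that every example follows the expected shape. The helper below is a hypothetical convenience (it is not part of LangSmith) and assumes each example needs a non-empty `inputs["text"]` string and a non-empty `outputs["must_mention"]` list of strings.

```python
def validate_examples(examples: list) -> list:
    """Return a list of human-readable problems; an empty list means the data looks fine."""
    problems = []
    for i, example in enumerate(examples):
        text = example.get("inputs", {}).get("text")
        if not isinstance(text, str) or not text.strip():
            problems.append(f"example {i}: inputs['text'] must be a non-empty string")
        phrases = example.get("outputs", {}).get("must_mention")
        if not isinstance(phrases, list) or not phrases:
            problems.append(f"example {i}: outputs['must_mention'] must be a non-empty list")
        elif not all(isinstance(p, str) and p for p in phrases):
            problems.append(f"example {i}: every must_mention entry must be a non-empty string")
    return problems

# Illustrative data: the first example is well-formed, the second is not.
sample = [
    {"inputs": {"text": "When was the Deep Research survey paper published?"},
     "outputs": {"must_mention": ["2025"]}},
    {"inputs": {"text": ""}, "outputs": {"must_mention": []}},
]
for problem in validate_examples(sample):
    print(problem)
```

Running `validate_examples(questions)` on the dataset above before calling `create_examples` would surface malformed entries early.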

Now we can add our dataset to our LangSmith project using the following code which we saw last Thursday!

In [20]:
from langsmith import Client

client = Client()

dataset_name = f"Simple Search Agent - Evaluation Dataset - {uuid4().hex[0:8]}"

dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="Questions about the cohort use-case to evaluate the Simple Search Agent."
)

client.create_examples(
    dataset_id=dataset.id,
    examples=questions
)

{'example_ids': ['c5908ca6-32ad-4681-a84c-1949d50ff505',
  'c4e4b45a-7d3c-46bd-8652-6c7d456288c9',
  '71e05623-1d30-49c2-a28a-3733fbcadb29',
  '848f1fab-cd8d-4721-9abc-fb83e4fac96b',
  '023c88b0-dfdf-4d4f-a800-1410d2da992f'],
 'count': 5}

### Task 2: Adding Evaluators

Let's use the OpenEvals library to produce an evaluator that we can then pass into LangSmith!

> NOTE: Examine the `CORRECTNESS_PROMPT` below!

In [21]:
from openevals.prompts import CORRECTNESS_PROMPT
print(CORRECTNESS_PROMPT)

You are an expert data labeler evaluating model outputs for correctness. Your task is to assign a score based on the following rubric:

<Rubric>
  A correct answer:
  - Provides accurate and complete information
  - Contains no factual errors
  - Addresses all parts of the question
  - Is logically consistent
  - Uses precise and accurate terminology

  When scoring, you should penalize:
  - Factual errors or inaccuracies
  - Incomplete or partial answers
  - Misleading or ambiguous statements
  - Incorrect terminology
  - Logical inconsistencies
  - Missing key information
</Rubric>

<Instructions>
  - Carefully read the input and output
  - Check for factual accuracy and completeness
  - Focus on correctness of information rather than style or verbosity
</Instructions>

<Reminder>
  The goal is to evaluate factual correctness and completeness of the response.
</Reminder>

<input>
{inputs}
</input>

<output>
{outputs}
</output>

Use the reference outputs below to help you evaluate the

In [22]:
from openevals.llm import create_llm_as_judge

correctness_evaluator = create_llm_as_judge(
    prompt=CORRECTNESS_PROMPT,
    model="openai:o3-mini",  # the judge model strongly affects the final score
    feedback_key="correctness",
)

Let's also create a custom Evaluator for our created dataset above - we do this by first making a simple Python function!

In [23]:
def must_mention(inputs: dict, outputs: dict, reference_outputs: dict) -> bool:
  # determine if the phrases in the reference_outputs are in the outputs
  required = reference_outputs.get("must_mention") or []
  score = all(phrase in outputs["answer"] for phrase in required)
  return score

#### ❓ Question #4:

What are some ways you could improve this metric as-is?

> NOTE: Alternatively you can suggest where gaps exist in this method.

One way to improve the `must_mention` metric is to make it more robust to variations in phrasing, capitalization, or minor typos. Currently, it only checks for exact substring matches, which may miss correct answers that use synonyms or paraphrasing.

Improvements could include:
- Using case-insensitive matching.
- Incorporating semantic similarity (using embeddings to check if the answer is semantically similar to the required phrase).
- Checking for the presence of all required concepts, not just exact phrases.

Gaps in the current method:
- It does not account for context or meaning, only literal string presence.
- It may give a false negative if the answer is correct but phrased differently.
- It does not penalize hallucinated or incorrect information.
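
As one possible sketch of the case-insensitive improvement, the variant below normalizes case and whitespace before matching and returns the fraction of required phrases found, so partial credit is visible. The name `must_mention_normalized` and the partial-credit scoring are assumptions for illustration. It keeps the same `(inputs, outputs, reference_outputs)` signature as `must_mention`, and it still does not capture paraphrases or penalize hallucinations.

```python
import re

def _normalize(text: str) -> str:
    # Lowercase and collapse runs of whitespace so "liberty   mutual" matches "Liberty Mutual".
    return re.sub(r"\s+", " ", text).strip().lower()

def must_mention_normalized(inputs: dict, outputs: dict, reference_outputs: dict) -> float:
    required = reference_outputs.get("must_mention") or []
    if not required:
        return 1.0  # nothing is required, so nothing can be missing
    answer = _normalize(outputs.get("answer", ""))
    hits = sum(1 for phrase in required if _normalize(phrase) in answer)
    return hits / len(required)

# Illustrative call: only one of the two required phrases appears in the answer.
score = must_mention_normalized(
    {"text": "Where do the authors work now?"},
    {"answer": "One author is at ZHEJIANG University."},
    {"must_mention": ["Zhejiang", "Liberty Mutual"]},
)
print(score)
```

Returning a fraction rather than a boolean is a design choice: it distinguishes an answer that misses one phrase from one that misses all of them.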


### Task 3: Evaluating

All that is left to do is evaluate our agent's response!

In [24]:
results = client.evaluate(
    agent_chain_with_formatting,
    data=dataset.name,
    evaluators=[correctness_evaluator, must_mention],
    experiment_prefix="simple_agent, baseline",  # optional, experiment name prefix
    description="Testing the baseline system.",  # optional, experiment description
    max_concurrency=4, # optional, add concurrency
)

View the evaluation results for experiment: 'simple_agent, baseline-4afe27d8' at:
https://smith.langchain.com/o/64d1906c-49dc-4ba3-ad6b-466372cc69b9/datasets/d89d965e-ff55-4736-8f56-9c79fab7623a/compare?selectedSessions=de6ddd7e-e1f9-4be1-a05c-8952574f7fbc





## Part 2: LangGraph with Helpfulness:

### Task 3: Adding Helpfulness Check and "Loop" Limits

Now that we've done evaluation - let's see if we can add an extra step where we review the content we've generated to confirm if it fully answers the user's query!

We're going to make a few key adjustments to account for this:

1. We're going to add an artificial limit on how many "loops" the agent can go through - this will help us to avoid the potential situation where we never exit the loop.
2. We'll add to our existing conditional edge to obtain the behaviour we desire.

First, let's define our state again - we can check the length of the `messages` list in the state, so we don't need an additional state field to count loops.

In [25]:
class AgentState(TypedDict):
  messages: Annotated[list, add_messages]
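
To see why the message count can double as a loop counter, here is a small offline simulation. It assumes the worst case, in which the agent never calls a tool and the helpfulness check always fails, so each pass appends exactly one message. The limit of 10 mirrors the check used in the routing function later in this section, and `count_agent_calls` is a hypothetical helper, not LangGraph code.

```python
def count_agent_calls(max_messages: int = 10) -> int:
    # Start with the single human message, as in {"messages": [HumanMessage(...)]}.
    messages = ["human question"]
    agent_calls = 0
    while True:
        messages.append("ai answer")  # the agent node appends one message per pass
        agent_calls += 1
        if len(messages) > max_messages:
            break  # the router would end the run here
        # otherwise the router would return "continue" and loop back to the agent
    return agent_calls

print(count_agent_calls())
```

Tool calls add messages too (one for the request and one for the tool result), so runs that use tools reach the limit in fewer agent passes.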

Now we can set our graph up! This process will be almost entirely the same - with the inclusion of one additional node/conditional edge!

#### 🏗️ Activity #4:

Please write markdown for the following cells to explain what each is doing.

The next cell sets up a new LangGraph `StateGraph` for the agent and adds the "agent" and "action" nodes, associating each with its function (`call_model` and `tool_node`).


In [26]:
graph_with_helpfulness_check = StateGraph(AgentState)

graph_with_helpfulness_check.add_node("agent", call_model)
graph_with_helpfulness_check.add_node("action", tool_node)

<langgraph.graph.state.StateGraph at 0x116fd5e50>

The next cell sets the entry point of the graph to the "agent" node, specifying where execution of the graph begins.


In [27]:
graph_with_helpfulness_check.set_entry_point("agent")

<langgraph.graph.state.StateGraph at 0x116fd5e50>

The next cell defines a function called `tool_call_or_helpful`, which determines the next step in the agent's graph based on the most recent message. It routes to the tool node if the model requested a tool call, ends the run once the message list grows past 10 entries, and otherwise asks a language model whether the response is helpful enough to stop or whether the agent should try again.


In [28]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

def tool_call_or_helpful(state):
  last_message = state["messages"][-1]

  if last_message.tool_calls:
    return "action"

  initial_query = state["messages"][0]
  final_response = state["messages"][-1]

  if len(state["messages"]) > 10:
    return "end"  # must match the "end" key in the conditional-edge mapping

  prompt_template = """\
  Given an initial query and a final response, determine if the final response is extremely helpful or not. Please indicate helpfulness with a 'Y' and unhelpfulness as an 'N'.

  Initial Query:
  {initial_query}

  Final Response:
  {final_response}"""

  helpfulness_prompt_template = PromptTemplate.from_template(prompt_template)

  helpfulness_check_model = ChatOpenAI(model="gpt-4.1-mini")

  helpfulness_chain = helpfulness_prompt_template | helpfulness_check_model | StrOutputParser()

  helpfulness_response = helpfulness_chain.invoke({"initial_query" : initial_query.content, "final_response" : final_response.content})

  if "Y" in helpfulness_response:
    return "end"
  else:
    return "continue"
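
Because `tool_call_or_helpful` calls a model, it is awkward to test offline. The sketch below restates the routing decisions with the helpfulness judge passed in as a plain callable, so each branch can be exercised without an API key. It returns the labels used as keys in the conditional-edge mapping (`"action"`, `"end"`, `"continue"`). `route_after_agent`, `FakeMessage`, and the `judge` parameter are hypothetical names introduced for illustration, and the stricter `startswith("Y")` parse is an assumption, not what the cell above does.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class FakeMessage:
    # Minimal stand-in for a LangChain message: text plus optional tool calls.
    content: str
    tool_calls: list = field(default_factory=list)

def route_after_agent(messages: list, judge: Callable[[str, str], str],
                      max_messages: int = 10) -> str:
    last = messages[-1]
    if last.tool_calls:
        return "action"    # the model asked for a tool
    if len(messages) > max_messages:
        return "end"       # loop limit reached
    verdict = judge(messages[0].content, last.content)
    # Only a reply that starts with "Y" counts as helpful.
    return "end" if verdict.strip().upper().startswith("Y") else "continue"

question = FakeMessage("What are Deep Research Agents?")
tool_request = FakeMessage("", tool_calls=[{"name": "search"}])
answer = FakeMessage("They are AI systems for multi-step research.")

print(route_after_agent([question, tool_request], judge=lambda q, a: "N"))
print(route_after_agent([question, answer], judge=lambda q, a: "Y"))
print(route_after_agent([question, answer], judge=lambda q, a: "N"))
```

Injecting the judge keeps the routing logic deterministic under test, while the real graph can still pass in a function that wraps the LLM chain.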

The next cell adds conditional edges to the graph. It uses the `tool_call_or_helpful` function to decide, after the agent node runs, whether to loop back to the agent, call a tool, or end the conversation, based on the agent's latest response.


In [29]:
graph_with_helpfulness_check.add_conditional_edges(
    "agent",
    tool_call_or_helpful,
    {
        "continue" : "agent",
        "action" : "action",
        "end" : END
    }
)

<langgraph.graph.state.StateGraph at 0x116fd5e50>

The next cell adds an edge from the "action" node back to the "agent" node in the graph, allowing the agent to continue processing after an action is performed.


In [30]:
graph_with_helpfulness_check.add_edge("action", "agent")

<langgraph.graph.state.StateGraph at 0x116fd5e50>

The next cell compiles the graph with the helpfulness check into an executable agent. This prepares the graph so it can be run with inputs and process messages according to the defined flow and conditional logic.


In [31]:
agent_with_helpfulness_check = graph_with_helpfulness_check.compile()

The next cell streams a question through the compiled agent and prints the messages emitted by each node as the updates arrive.


In [32]:
inputs = {"messages" : [HumanMessage(content="What are Deep Research Agents?")]}

async for chunk in agent_with_helpfulness_check.astream(inputs, stream_mode="updates"):
    for node, values in chunk.items():
        print(f"Receiving update from node: '{node}'")
        print(values["messages"])
        print("\n\n")

Receiving update from node: 'agent'
[AIMessage(content='Deep Research Agents are advanced AI systems designed to assist with complex research tasks. They leverage deep learning techniques and large datasets to analyze, synthesize, and generate insights across various fields of study. These agents can automate literature reviews, identify relevant information, generate hypotheses, and even assist in experimental design. They are used in academia, industry, and scientific research to accelerate discovery and improve the accuracy and depth of research outcomes. Would you like more detailed information or specific examples of Deep Research Agents?', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 98, 'prompt_tokens': 158, 'total_tokens': 256, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 

## Part 3: LangGraph for the "Patterns" of GenAI

### Task 4: Helpfulness Check of Gen AI Pattern Descriptions

Let's ask our system about the 3 main patterns in Generative AI:

1. Context Engineering
2. Fine-tuning
3. Agents

In [33]:
patterns = ["Context Engineering", "Fine-tuning", "LLM-based agents"]

In [34]:
for pattern in patterns:
  what_is_string = f"What is {pattern} and when did it break onto the scene??"
  inputs = {"messages" : [HumanMessage(content=what_is_string)]}
  messages = agent_with_helpfulness_check.invoke(inputs)
  print(messages["messages"][-1].content)
  print("\n\n")

Context Engineering is a relatively new interdisciplinary field that focuses on designing, managing, and utilizing contextual information to improve the functionality and adaptability of systems, particularly in areas like artificial intelligence, human-computer interaction, and pervasive computing. It involves understanding and engineering the context in which systems operate to enhance their performance, relevance, and user experience.

The concept of Context Engineering began gaining attention in the early 2000s with the rise of ubiquitous computing and context-aware systems. It became more prominent as researchers and practitioners recognized the importance of context in making systems more intelligent and responsive. The term itself and the formalization of the field started to emerge around the mid-2000s, with significant contributions from academia and industry exploring how to systematically capture, model, and utilize context information.

Would you like me to find more detail