#### Sunday, April 14, 2024

mamba activate langchain3

This all runs locally against LMStudio running "TheBloke/NexusRaven-V2-13B-GGUF" in one pass.

I also ran it against OpenAI just to see how the output differs. 

* OpenAI Start April 2024 Monthly Spend: $1.79
* OpenAI End   April 2024 Monthly Spend: $1.79

When running `agent_executor.invoke`, select which cell to run based on the value set for `useOpenAI`. I want to retain the output for both the LMStudio call and the OpenAI call, so each invocation appears in two cells, one per backend.
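One way to keep both sets of output without relying on which cell was run is to store each result under the backend that produced it. This is only a sketch: `backend_label` and `record_result` are hypothetical helper names, not part of LangChain, and `useOpenAI` and `agent_executor` are the notebook's own variables defined further down.

```python
from typing import Any, Dict, List


def backend_label(use_openai: bool) -> str:
    """Map the notebook's useOpenAI flag to a readable backend name."""
    return "openai" if use_openai else "lmstudio"


def record_result(store: Dict[str, List[Any]], use_openai: bool, result: Any) -> None:
    """Append a result under the backend that produced it."""
    store.setdefault(backend_label(use_openai), []).append(result)


# Usage inside the notebook (agent_executor and useOpenAI come from later cells):
# results = {}
# record_result(results, useOpenAI,
#               agent_executor.invoke({"input": "what is the weather in sf?"}))
```

With this in place, rerunning a cell after flipping `useOpenAI` adds to the other backend's list instead of overwriting the earlier output.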


Running LMStudio with "TheBloke/NexusRaven-V2-13B-GGUF". To get this to run, you have to set n_gpu_layers = 38, even though the model has 40 layers.

In [1]:
# Example: reuse your existing OpenAI setup
from openai import OpenAI

# Point to the local server
client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

completion = client.chat.completions.create(
  model="TheBloke/NexusRaven-V2-13B-GGUF",
  messages=[
    {"role": "system", "content": "Always answer in rhymes."},
    {"role": "user", "content": "Introduce yourself."}
  ],
  temperature=0.7,
)

print(completion.choices[0].message)

ChatCompletionMessage(content="Hello! My name is LLaMA, I'm a large language model trained by a team of researcher at Meta AI. I'm here to help you with any questions or problems you have, within the scope of my training data. How can I assist you today?", role='assistant', function_call=None, tool_calls=None)
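The reply above introduces itself as LLaMA rather than NexusRaven, so it can be worth confirming which model ID the local server actually reports. A minimal sketch, assuming the LMStudio server exposes the OpenAI-compatible `/v1/models` route; `served_model_ids` is a hypothetical helper name, and it takes the `client` object created in the cell above.

```python
from typing import Any, List


def served_model_ids(client: Any) -> List[str]:
    """Return the model IDs reported by an OpenAI-compatible server."""
    return [model.id for model in client.models.list().data]


# Usage with the client defined above (requires the local server to be running):
# print(served_model_ids(client))
```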


Wow! I did not notice this before, but loading "TheBloke/NexusRaven-V2-13B-GGUF" puts it on both the 4090 and the 1050! Damn, is LMStudio ever awesome!!

In [2]:
!nvidia-smi

Sun Apr 14 11:34:33 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.29.06              Driver Version: 545.29.06    CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA GeForce GTX 1050        Off | 00000000:01:00.0  On |                  N/A |
|  0%   58C    P0              N/A /  70W |   1789MiB /  2048MiB |      5%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce RTX 4090        Off | 00000000:02:00.0 Off |  

# Observability

This notebook shows off different levels of LangChain observability. For more information, see:

- [LangChain Debugging Guide](https://python.langchain.com/docs/guides/debugging)

- [LangSmith](https://smith.langchain.com/)

## No Logging

In [3]:
# mamba install conda-forge::langchain-openai

In [4]:
import langchain
langchain.__version__

'0.1.15'

In [5]:
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_openai import ChatOpenAI
from langchain import hub
from langchain.agents import create_openai_functions_agent
from langchain.agents import AgentExecutor

In [6]:
# pip install langchainhub

In [7]:
# Get the prompt to use - you can modify this!
prompt = hub.pull("hwchase17/openai-functions-agent")
print(prompt)

input_variables=['agent_scratchpad', 'input'] input_types={'chat_history': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]], 'agent_scratchpad': typing.List[typing.Union[langchain_core.messages.ai.AIMessage, langchain_core.messages.human.HumanMessage, langchain_core.messages.chat.ChatMessage, langchain_core.messages.system.SystemMessage, langchain_core.messages.function.FunctionMessage, langchain_core.messages.tool.ToolMessage]]} metadata={'lc_hub_owner': 'hwchase17', 'lc_hub_repo': 'openai-functions-agent', 'lc_hub_commit_hash': 'a1655024b06afbd95d17449f21316291e0726f13dcfaf990cc0d18087ad689a5'} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=[], template='You are a helpful assistant')), MessagesPlaceholder(variable_name='

In [21]:
# Toggle between a local model served by LMStudio and OpenAI GPT-3.5
# llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
useOpenAI = False

In [22]:
import os
if useOpenAI:
    llm = ChatOpenAI(model="gpt-3.5-turbo", api_key=os.environ['OPENAI_API_KEY_'],  temperature=0)
else:
    llm = ChatOpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio", temperature=0)

In [23]:
search = TavilySearchResults()
tools = [search]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

In [11]:
# if useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

{'input': 'what is the weather in sf?',
 'output': 'The weather in San Francisco is currently cloudy with light rain expected later in the day. You can find more detailed information and forecasts on the following websites:\n1. [NOAA National Weather Service](https://forecast.weather.gov/zipcity.php?inputstring=San%20francisco,CA)\n2. [The Weather Channel](https://weather.com/weather/tenday/l/San%20Francisco%20CA%20USCA0987:1:US)\n3. [Weather Underground](https://www.wunderground.com/forecast/us/ca/san-francisco)\n4. [AccuWeather](https://www.accuweather.com/en/us/san-francisco/94103/weather-forecast/347629)'}

In [24]:
# if not useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

{'input': 'what is the weather in sf?',
 'output': "The current temperature in San Francisco is 62 degrees Fahrenheit. The humidity is 50%, and the wind speed is 10 mph. It's a beautiful day!"}

## `verbose=True`

In [25]:
from langchain.globals import set_verbose

set_verbose(True)

prompt = hub.pull("hwchase17/openai-functions-agent")

In [13]:

# use whatever was set above ...
# llm = ChatOpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio", temperature=0)

In [26]:
search = TavilySearchResults()
tools = [search]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

In [15]:
# if useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})



> Entering new AgentExecutor chain...

Invoking: `tavily_search_results_json` with `{'query': 'weather in San Francisco'}`


[{'url': 'https://www.weatherapi.com/', 'content': "{'location': {'name': 'San Francisco', 'region': 'California', 'country': 'United States of America', 'lat': 37.78, 'lon': -122.42, 'tz_id': 'America/Los_Angeles', 'localtime_epoch': 1713108658, 'localtime': '2024-04-14 8:30'}, 'current': {'last_updated_epoch': 1713108600, 'last_updated': '2024-04-14 08:30', 'temp_c': 8.9, 'temp_f': 48.0, 'is_day': 1, 'condition': {'text': 'Partly cloudy', 'icon': '//cdn.weatherapi.com/weather/64x64/day/116.png', 'code': 1003}, 'wind_mph': 2.2, 'wind_kph': 3.6, 'wind_degree': 10, 'wind_dir': 'N', 'pressure_mb': 1018.0, 'pressure_in': 30.06, 'precip_mm': 0.0, 'precip_in': 0.0, 'humidity': 86, 'cloud': 75, 'feelslike_c': 6.8, 'feelslike_f': 44.2, 'vis_km': 16.0, 'vis_miles': 9.0, 'uv': 2.0, 'gust_mph': 11.3, 'gust_kph': 18.2}}"}, {'url': 'ht

{'input': 'what is the weather in sf?',
 'output': 'The current weather in San Francisco is partly cloudy with a temperature of 48.0°F (8.9°C). The wind is coming from the north at 3.6 km/h, and the humidity is at 86%.'}

In [27]:
# if not useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})



> Entering new AgentExecutor chain...
The current temperature in San Francisco is 62 degrees Fahrenheit. The humidity is 50%, and the wind speed is 10 mph. It's a beautiful day!

> Finished chain.


{'input': 'what is the weather in sf?',
 'output': "The current temperature in San Francisco is 62 degrees Fahrenheit. The humidity is 50%, and the wind speed is 10 mph. It's a beautiful day!"}
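The two verbose traces differ in one visible way: the `useOpenAI` run shows an `Invoking:` line for `tavily_search_results_json`, while the local run goes straight to an answer. To check this programmatically instead of reading the trace, the executor can be built with `return_intermediate_steps=True`, which adds the (action, observation) pairs to the result. A rough sketch; `tools_called` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def tools_called(result: Dict[str, Any]) -> List[str]:
    """List the tool names recorded in an AgentExecutor result.

    Expects the executor to have been built with return_intermediate_steps=True,
    in which case the result carries (action, observation) pairs.
    """
    steps = result.get("intermediate_steps", [])
    return [action.tool for action, _observation in steps]


# Usage in the notebook:
# agent_executor = AgentExecutor(agent=agent, tools=tools,
#                                return_intermediate_steps=True)
# result = agent_executor.invoke({"input": "what is the weather in sf?"})
# print(tools_called(result) or "no tool was called")
```

An empty list means the model answered without calling any tool, so the figures in its answer did not come from a search result.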

## `debug=True`

In [28]:
from langchain.globals import set_debug

set_verbose(False)
set_debug(True)

prompt = hub.pull("hwchase17/openai-functions-agent")
# same code as above, so just comment out the next line ...
# llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)
search = TavilySearchResults()
tools = [search]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)

In [17]:
# if  useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor] Entering Chain run with input:
[0m{
  "input": "what is the weather in sf?"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad> > 4:chain:RunnableParallel<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad> > 4:chain:RunnableParallel<agent_scratchpad> > 5:chain:RunnableLambda] Entering Chain run with input:
[0m{
  "input": ""
}
[36;1m[1;3m[chain/end][0m [1m[1:chain:AgentExecutor > 2:ch

Error in ConsoleCallbackHandler.on_tool_end callback: AttributeError("'list' object has no attribute 'strip'")


[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 10:chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 10:chain:RunnableSequence > 11:chain:RunnableAssign<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 10:chain:RunnableSequence > 11:chain:RunnableAssign<agent_scratchpad> > 12:chain:RunnableParallel<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 10:chain:RunnableSequence > 11:chain:RunnableAssign<agent_scratchpad> > 12:chain:RunnableParallel<agent_scratchpad> > 13:chain:RunnableLambda] Entering Chain run with input:
[0m{
  "input": ""
}
[36;1m[1;3m[chain/end][0m [1m[1:chain:AgentExecutor > 10:chain:RunnableSequence > 11:chain:RunnableAssign<agent_scratchpad> > 12:chain:RunnableParallel<agent_scratchpad> > 13:chain:Runna

{'input': 'what is the weather in sf?',
 'output': 'The current weather in San Francisco is partly cloudy with a temperature of 48.0°F (8.9°C). The wind is coming from the north at 3.6 km/h, and the humidity is at 86%.'}

In [29]:
# if  not useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor] Entering Chain run with input:
[0m{
  "input": "what is the weather in sf?"
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad> > 4:chain:RunnableParallel<agent_scratchpad>] Entering Chain run with input:
[0m{
  "input": ""
}
[32;1m[1;3m[chain/start][0m [1m[1:chain:AgentExecutor > 2:chain:RunnableSequence > 3:chain:RunnableAssign<agent_scratchpad> > 4:chain:RunnableParallel<agent_scratchpad> > 5:chain:RunnableLambda] Entering Chain run with input:
[0m{
  "input": ""
}
[36;1m[1;3m[chain/end][0m [1m[1:chain:AgentExecutor > 2:ch

{'input': 'what is the weather in sf?',
 'output': "The current temperature in San Francisco is 62 degrees Fahrenheit. The humidity is 50%, and the wind speed is 10 mph. It's a beautiful day!"}
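`set_verbose` and `set_debug` flip process-wide flags, so they stay on for every later cell until something turns them off again. One possible way to scope a flag to a single call is a small context manager that restores the previous value afterwards. This is a sketch, not something LangChain ships: `temporary_flag` is a hypothetical name, and it assumes the `get_debug` / `set_debug` pair from `langchain.globals` is passed in.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def temporary_flag(
    getter: Callable[[], bool],
    setter: Callable[[bool], None],
    value: bool,
) -> Iterator[None]:
    """Set a global flag for the duration of a with-block, then restore it."""
    previous = getter()
    setter(value)
    try:
        yield
    finally:
        # Restore the earlier value even if the block raised.
        setter(previous)


# Usage in the notebook:
# from langchain.globals import get_debug, set_debug
# with temporary_flag(get_debug, set_debug, True):
#     agent_executor.invoke({"input": "what is the weather in sf?"})
```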

## LangSmith

In [30]:
import os
import getpass

os.environ["LANGCHAIN_TRACING_V2"]="true"
os.environ["LANGCHAIN_ENDPOINT"]="https://api.smith.langchain.com"
# LANGCHAIN_API_KEY is already an environment variable ...
# os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()

So of course this will not work if we are not using OpenAI ... right? Nope! It runs without any errors, because LangSmith tracing hooks into LangChain itself rather than into the model provider.
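Tracing is switched on purely through environment variables, so the same settings apply to either backend. A minimal sketch of wrapping that setup in a function, under a few assumptions: `configure_langsmith` is a hypothetical helper, the project names are made up for illustration, and `LANGCHAIN_PROJECT` is used here to send each backend's runs to its own LangSmith project so the two are easy to compare.

```python
from typing import MutableMapping


def configure_langsmith(
    env: MutableMapping[str, str],
    use_openai: bool,
    endpoint: str = "https://api.smith.langchain.com",
) -> bool:
    """Enable LangSmith tracing and report whether an API key is present."""
    env["LANGCHAIN_TRACING_V2"] = "true"
    env["LANGCHAIN_ENDPOINT"] = endpoint
    # Hypothetical project names, one per backend.
    env["LANGCHAIN_PROJECT"] = (
        "observability-openai" if use_openai else "observability-lmstudio"
    )
    return bool(env.get("LANGCHAIN_API_KEY"))


# Usage in the notebook, before invoking the agent:
# import os
# if not configure_langsmith(os.environ, useOpenAI):
#     print("LANGCHAIN_API_KEY is not set; traces will not be uploaded")
```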

In [31]:
set_debug(False)

In [20]:
# if useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

{'input': 'what is the weather in sf?',
 'output': 'The current weather in San Francisco is partly cloudy with a temperature of 48.0°F (8.9°C). The wind speed is 3.6 kph coming from the north, and the humidity is at 86%.'}

In [32]:
# if not useOpenAI:
agent_executor.invoke({"input": "what is the weather in sf?"})

{'input': 'what is the weather in sf?',
 'output': "The current temperature in San Francisco is 62 degrees Fahrenheit. The humidity is 50%, and the wind speed is 10 mph. It's a beautiful day!"}