The growing popularity of projects like llama.cpp, Ollama, GPT4All, and llamafile highlights the increasing demand for running LLMs locally on personal devices.

This offers two key advantages:

1. Privacy: Your data stays on your device, avoiding third-party services and their terms of service.
2. Cost: There are no inference fees, which matters for token-heavy applications such as long-running simulations or summarization.


Overview

Running a local LLM requires a few components:

Open-source LLM: 
A freely modifiable and shareable LLM.

Inference capability:
The ability to run the model locally with acceptable performance.

Users now have access to a rapidly expanding range of open-source LLMs.
Several frameworks have been developed to enable LLM inference on personal devices:

llama.cpp: A C++ implementation for llama model inference with optimizations like quantization.

GPT4All: An optimized C-based backend for efficient inference.

Ollama: Bundles model weights and environment into an app for running the model locally.

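llamafile: Packages model weights and dependencies into a single file, allowing local execution without additional setup.

These frameworks typically provide quantization, which shrinks the memory footprint of the model weights, and inference code efficient enough to run on consumer hardware such as a CPU or laptop GPU.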
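
As a rough illustration of why quantization matters for local inference, the sketch below estimates how much memory the weights alone would need at different precisions. It assumes weight memory is simply parameter count times bits per weight, and it ignores the KV cache, activations, and any per-block quantization metadata. The function name `weight_memory_gb` and the 3-billion-parameter figure are illustrative assumptions, not values taken from any of the frameworks above.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


n_params = 3e9  # hypothetical 3-billion-parameter model (assumption)

# Compare a few common precisions; labels are for display only
for label, bits in [("fp16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(n_params, bits):.1f} GB")
```

Under these assumptions, 4-bit weights need about a quarter of the memory of 16-bit weights, which is often the difference between a model fitting on a laptop or not.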


In [4]:
%pip install -qU langchain_ollama

Note: you may need to restart the kernel to use updated packages.


In [7]:
from langchain_ollama import OllamaLLM

llm = OllamaLLM(model="llama3.2")

llm.invoke("The first man on the moon was ...")

'...Neil Armstrong. He stepped out of the lunar module Eagle and became the first person to set foot on the Moon\'s surface on July 20, 1969 as part of the Apollo 11 mission. His famous quote upon setting foot on the Moon was: "That\'s one small step for man, one giant leap for mankind."'

In [20]:
from langchain_ollama import ChatOllama
llm = ChatOllama(model="llama3.2")

# llm.invoke("The first man on the moon was ...")


In [8]:
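# `llm` here is the OllamaLLM instance from cell [7], whose stream yields plain strings.
# With ChatOllama, each chunk is a message object, so print chunk.content instead.
for chunk in llm.stream("The first man on the moon was ..."):
    print(chunk, end="|", flush=True)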

...|Neil| Armstrong|.| He| became| the| first| person| to| set| foot| on| the| moon| on| July| |20|,| |196|9|,| during| the| Apollo| |11| mission|.||
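
Streaming is usually combined with collecting the chunks, so the full response is still available once the stream ends. The sketch below shows that pattern without requiring a running model. `fake_stream` is a hypothetical stand-in for `llm.stream(...)` that yields a few of the string chunks shown above, and `collect` is an illustrative helper, not part of LangChain.

```python
from typing import Iterable, Iterator


def fake_stream() -> Iterator[str]:
    """Hypothetical stand-in for llm.stream(...): yields text chunks one at a time."""
    yield from ["...", "Neil", " Armstrong", "."]


def collect(chunks: Iterable[str]) -> str:
    """Print each chunk as it arrives and return the concatenated text."""
    parts = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # show partial output immediately
        parts.append(chunk)
    print()  # finish the line once the stream is exhausted
    return "".join(parts)


full_text = collect(fake_stream())
```

With a chat model, the same loop would append `chunk.content` rather than the chunk itself, since each chunk is then a message object.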

In [21]:
from langchain_core.tools import tool

data_dict = {}

@tool
def add_to_dict(id: int, name: str) -> str:
    """Add a person's id and name to the dictionary."""
    data_dict[id] = name
    # Return a confirmation so the caller has a result to report back
    return f"Added {name} with ID {id} to the dictionary."

@tool
def remove_from_dict(id: int):
    """Remove the entry with the given id from the dictionary."""
    data_dict.pop(id)

tools = [add_to_dict, remove_from_dict]

In [22]:
llm_with_tools =  llm.bind_tools(tools)

In [23]:
query = "We got a new customer with id as 1 and his name is Praveen Reddy C"

llm_with_tools.invoke(query).tool_calls


[{'name': 'add_to_dict',
  'args': {'id': '1', 'name': 'Praveen Reddy C'},
  'id': '77dbd4b9-74fe-4cc0-b4fd-ee9410df4248',
  'type': 'tool_call'}]
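
Note that the model returned `id` as the string `'1'` even though the tool declares `id: int`. If the arguments are passed to a plain function without validation, the string is used as-is. The sketch below shows one way to cast model-supplied arguments to the simple types annotated on a function. `coerce_args` is a hypothetical helper written for this example, and the `add_to_dict` stub is a plain-function stand-in for the decorated tool. It only handles `int`, `float`, and `str` annotations.

```python
import inspect
from typing import Any, Callable, Dict


def coerce_args(func: Callable[..., Any], args: Dict[str, Any]) -> Dict[str, Any]:
    """Cast each argument to the simple type annotated on `func`, if any."""
    params = inspect.signature(func).parameters
    coerced = {}
    for key, value in args.items():
        annotation = params[key].annotation if key in params else inspect.Parameter.empty
        # Only cast for basic scalar annotations; leave everything else untouched
        if annotation in (int, float, str) and not isinstance(value, annotation):
            value = annotation(value)
        coerced[key] = value
    return coerced


def add_to_dict(id: int, name: str) -> None:
    """Hypothetical plain-function stand-in for the tool above."""


raw_args = {"id": "1", "name": "Praveen Reddy C"}  # as returned by the model
clean_args = coerce_args(add_to_dict, raw_args)
print(clean_args)
```

After coercion, `clean_args["id"]` is the integer `1`, so dictionary keys stay consistent with the declared type.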

In [36]:
from langchain_core.messages import HumanMessage

query = "We got a new customer with id as 1 and his name is Praveen Reddy C"
messages = [HumanMessage(query)]
ai_msg = llm_with_tools.invoke(messages)
messages.append(ai_msg)

In [37]:
messages

[HumanMessage(content='We got a new customer with id as 1 and his name is Praveen Reddy C', additional_kwargs={}, response_metadata={}),
 AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2024-10-19T06:18:12.299209Z', 'message': {'role': 'assistant', 'content': '', 'tool_calls': [{'function': {'name': 'add_to_dict', 'arguments': {'id': '1', 'name': 'Praveen Reddy C'}}}]}, 'done_reason': 'stop', 'done': True, 'total_duration': 9220814916, 'load_duration': 7405923416, 'prompt_eval_count': 242, 'prompt_eval_duration': 862802000, 'eval_count': 29, 'eval_duration': 929356000}, id='run-5e13c939-29b0-4282-9d73-f5d464a633e4-0', tool_calls=[{'name': 'add_to_dict', 'args': {'id': '1', 'name': 'Praveen Reddy C'}, 'id': '637d55b3-92b1-4f3a-9bd4-34d33e648955', 'type': 'tool_call'}], usage_metadata={'input_tokens': 242, 'output_tokens': 29, 'total_tokens': 271})]

In [38]:
ai_msg.tool_calls

[{'name': 'add_to_dict',
  'args': {'id': '1', 'name': 'Praveen Reddy C'},
  'id': '637d55b3-92b1-4f3a-9bd4-34d33e648955',
  'type': 'tool_call'}]

In [39]:
ai_msg.tool_calls[0]["name"].lower()

'add_to_dict'

In [46]:
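for tool_call in ai_msg.tool_calls:
    method_name = tool_call["name"].lower()
    # Look up the @tool-decorated object by name
    method = globals().get(method_name)
    # Call the wrapped function with the model-supplied arguments.
    # .func skips schema validation, so 'id' stays the string '1';
    # method.invoke(tool_call["args"]) is the usual, validated entry point.
    res = method.func(**tool_call["args"])
    print("result -->", res)
    # Record the outcome as a plain string; to send it back to the model,
    # wrap it in a ToolMessage with the matching tool_call_id instead
    tool_msg = f"Tool '{tool_call['name']}' executed: {res}"
    messages.append(tool_msg)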

messages

result --> Added Praveen Reddy C with ID 1 to the dictionary.


[HumanMessage(content='We got a new customer with id as 1 and his name is Praveen Reddy C', additional_kwargs={}, response_metadata={}),
 AIMessage(content='', additional_kwargs={}, response_metadata={'model': 'llama3.2', 'created_at': '2024-10-19T06:18:12.299209Z', 'message': {'role': 'assistant', 'content': '', 'tool_calls': [{'function': {'name': 'add_to_dict', 'arguments': {'id': '1', 'name': 'Praveen Reddy C'}}}]}, 'done_reason': 'stop', 'done': True, 'total_duration': 9220814916, 'load_duration': 7405923416, 'prompt_eval_count': 242, 'prompt_eval_duration': 862802000, 'eval_count': 29, 'eval_duration': 929356000}, id='run-5e13c939-29b0-4282-9d73-f5d464a633e4-0', tool_calls=[{'name': 'add_to_dict', 'args': {'id': '1', 'name': 'Praveen Reddy C'}, 'id': '637d55b3-92b1-4f3a-9bd4-34d33e648955', 'type': 'tool_call'}], usage_metadata={'input_tokens': 242, 'output_tokens': 29, 'total_tokens': 271}),
 "Tool 'add_to_dict' executed: Added Praveen Reddy C with ID 1 to the dictionary."]

In [47]:
data_dict

{'1': 'Praveen Reddy C'}
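
The dispatch loop above finds tools through `globals()`, which lets the model name any global object. A more defensive variant, sketched below, uses an explicit name-to-function registry and handles unknown tool names. This is a minimal sketch under stated assumptions: plain functions stand in for the `@tool`-decorated objects so the example runs without LangChain or Ollama, and `TOOL_REGISTRY`, `run_tool_calls`, and the message strings returned by `remove_from_dict` are hypothetical names and values introduced for illustration.

```python
from typing import Any, Callable, Dict, List

data_dict: Dict[Any, str] = {}


def add_to_dict(id: int, name: str) -> str:
    data_dict[id] = name
    return f"Added {name} with ID {id} to the dictionary."


def remove_from_dict(id: int) -> str:
    removed = data_dict.pop(id, None)  # no KeyError if the id is missing
    return f"Removed ID {id}." if removed is not None else f"ID {id} not found."


# Explicit registry: only the functions listed here can be called by name
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {
    "add_to_dict": add_to_dict,
    "remove_from_dict": remove_from_dict,
}


def run_tool_calls(tool_calls: List[dict]) -> List[dict]:
    """Execute each tool call and pair its output with the originating call id."""
    results = []
    for call in tool_calls:
        func = TOOL_REGISTRY.get(call["name"])
        if func is None:
            output = f"Unknown tool: {call['name']}"
        else:
            output = func(**call["args"])
        results.append({"tool_call_id": call["id"], "content": output})
    return results


# Same shape as the entries in ai_msg.tool_calls shown earlier
calls = [{
    "name": "add_to_dict",
    "args": {"id": "1", "name": "Praveen Reddy C"},
    "id": "637d55b3-92b1-4f3a-9bd4-34d33e648955",
    "type": "tool_call",
}]
results = run_tool_calls(calls)
print(results)
```

Each result keeps the `tool_call_id` of the call that produced it. In LangChain, that pairing is what a `ToolMessage` carries when the result is appended to the conversation and sent back to the model.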