
HAMMER

Hybrid-memory Agent Model with Modular Execution and Recall

HAMMER (Hybrid-memory Agent Model with Modular Execution and Recall) is a memory-enabled agent built around IBM Granite models.
It shows how to:

  • Attach persistent, structured memory to a stateless LLM.
  • Keep all data local (SQLite + vector store on your machine).
  • Expose the agent both as a REST API and as a set of MCP tools via the MCP Forge Gateway.

In short: HAMMER demonstrates how to move from single-shot chatbots to agents that can remember, summarize and reuse context across longer workflows.

Key Features

  • Three-layer memory

    • Short-term RAM buffer for the current turn window.
    • SQLite summary store with compact bullet-point summaries.
    • Vector store (Chroma) using Granite Embedding 30M for semantic recall.
  • Stateless LLM, stateful agent

    • The Granite LLM remains stateless.
    • All state lives in the memory modules, accessible via the agent loop.
  • Modular MCP integration

    • HAMMER exposes its capabilities as MCP tools through the MCP Forge Gateway.
    • MCP clients (e.g. editors, other agents) can call HAMMER as a tool.
  • Local-first

    • All memory and indexes are stored on the local machine.
    • Suitable for privacy-sensitive experiments and PoCs.
  • Transparent introspection

    • Debug endpoints for inspecting RAM buffer, summaries and vector hits.
    • Easy to understand what the agent “remembers” and why.
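
The three memory layers listed above can be sketched in a few dozen lines. This is a minimal illustration, not the code in memory.py: the class names, method names and table schema are hypothetical, and the vector layer swaps in a toy hashed bag-of-words embedding with an in-memory list in place of Granite Embedding 30M and Chroma, so that the sketch runs with the standard library only.

```python
import math
import sqlite3
from collections import deque


def toy_embed(text: str, dim: int = 64) -> list[float]:
    """Stand-in for a real embedding model: hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[sum(token.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


class ShortTermMemory:
    """Layer 1: RAM buffer that keeps only the most recent turns."""

    def __init__(self, max_turns: int = 6):
        self.turns = deque(maxlen=max_turns)

    def add(self, role: str, text: str) -> None:
        self.turns.append((role, text))


class SummaryStore:
    """Layer 2: compact per-session bullet summaries persisted in SQLite."""

    def __init__(self, path: str = ":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS summaries (session_id TEXT, bullet TEXT)"
        )

    def add(self, session_id: str, bullets: list[str]) -> None:
        self.db.executemany(
            "INSERT INTO summaries VALUES (?, ?)",
            [(session_id, bullet) for bullet in bullets],
        )
        self.db.commit()

    def get(self, session_id: str) -> list[str]:
        rows = self.db.execute(
            "SELECT bullet FROM summaries WHERE session_id = ? ORDER BY rowid",
            (session_id,),
        )
        return [row[0] for row in rows]


class VectorStore:
    """Layer 3: semantic recall; an in-memory stand-in for Chroma."""

    def __init__(self):
        self.items = []  # list of (embedding, text) pairs

    def add(self, text: str) -> None:
        self.items.append((toy_embed(text), text))

    def query(self, text: str, k: int = 3) -> list[str]:
        q = toy_embed(text)
        # Vectors are unit length, so the dot product is the cosine similarity.
        ranked = sorted(
            self.items,
            key=lambda item: sum(a * b for a, b in zip(q, item[0])),
            reverse=True,
        )
        return [stored_text for _, stored_text in ranked[:k]]
```

The split mirrors the feature list: the buffer forgets on its own, the summary store survives restarts when given a file path, and the vector layer answers "what is similar to this?" rather than "what was said last?".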

Architecture Overview

At a high level:

  • Clients

    • CLI / local tools.
    • MCP clients (e.g. Claude, VS Code, watsonx) via MCP Forge.
  • HAMMER Core

    • cli.py – terminal entrypoint.
    • api.py – FastAPI layer (/chat, /summarize_session, /memory_query, /debug_state).
    • llm.py – Granite 4.0-H-Micro (3B) wrapper.
    • embedder.py – Granite Embedding 30M.
    • memory.py – RAM buffer + SQLite summaries + vector store.
  • Memory backends

    • RAM: in-process conversation buffer.
    • SQLite: compact summaries per session.
    • Vector store: embeddings + similarity search (Chroma).
  • MCP Forge Gateway

    • Runs locally and exposes HAMMER as a set of MCP tools.
    • Provides admin UI, metrics and configuration.

Diagram 1: System Context & Integrations

                         +---------------------+
                         |  Human User         |
                         +----------+----------+
                                    |
                 (A) CLI            |           (B) MCP Client
              (local terminal)      |      (Claude, VS Code, watsonx)
                                    |
                         +----------v----------+
                         |  HAMMER CLI         |
                         |  (cli.py)           |
                         +----------+----------+
                                    |
                                    |  HTTP (REST: /chat, ...)
                                    |
                         +----------v----------+         +-------------------------+
                         |  HAMMER API         |         |  MCP Client (external)  |
                         |  (api.py / FastAPI) |         |  (implements MCP spec)  |
                         +----------+----------+         +-----------+-------------+
                                    |                                |
                          direct REST calls                          |
                           from CLI / curl                           |
                                    |                                |
                                    |                                |
                                    |                     MCP JSON-RPC (tools/call)
                                    |                                |
                                    |                     +----------v-----------+
                                    |                     | MCP Context Forge    |
                                    |                     | Gateway              |
                                    |                     | (mcpgateway)         |
                                    |                     +----------+-----------+
                                    |                                |
                                    |               REST tools bridge (HTTP)
                                    +----------------v-------------------------+
                                                     |
                                          +----------v----------+
                                          |  HAMMER API         |
                                          |  (api.py / FastAPI) |
                                          +---------------------+

Diagram 2: Core Architecture & Data Flow

                         +----------------------+
HTTP /chat, /summarize → |      API Layer       |
HTTP /memory_query       |   (FastAPI: api.py)  |
HTTP /debug_state        +----------+-----------+
                                    |
                                    | calls into core logic
                                    v
                         +----------------------+
                         |    HAMMER Core       |
                         |  (agent loop, glue)  |
                         +----------+-----------+
                                    |
        +---------------------------+---------------------------+
        |                           |                           |
        v                           v                           v
+---------------+           +---------------+            +----------------+
| Short-term    |           | SummaryStore  |            | VectorStore    |
| Memory (RAM)  |           | (SQLite)      |            | (Chroma, etc.) |
| ShortTerm     |           | get/add       |            | query/add      |
+-------+-------+           +-------+-------+            +--------+-------+
        |                           |                             |
        |                           |                             |
        +-------------+-------------+-----------------------------+
                      |
                      v
               +--------------+
               |  Prompt      |
               |  Builder     |
               +------+-------+
                      |
                      v
             +--------------------+
             | Granite LLM        |
             | (llm.py / generate |
             |  e.g. 4.0-h-micro) |
             +---------+----------+
                       |
                       v
                +-------------+
                |  Reply      |
                |  Text       |
                +------+------+ 
                       |
                       |
              back to API Layer
                       |
                       v
              HTTP response to:
              - CLI
              - curl
              - MCP Gateway (tools)
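
The Prompt Builder box in Diagram 2 merges the three memory layers with the incoming message before the Granite call. A sketch of one way to do that is shown below; the function name, the section labels and the ordering (summaries first, then vector hits, then recent turns) are assumptions made for illustration and may differ from the actual implementation.

```python
def build_prompt(
    user_message: str,
    recent_turns: list[tuple[str, str]],
    summaries: list[str],
    vector_hits: list[str],
) -> str:
    """Merge the three memory layers and the new message into one prompt."""
    sections = []
    if summaries:
        bullets = "\n".join(f"- {s}" for s in summaries)
        sections.append("Session summary:\n" + bullets)
    if vector_hits:
        bullets = "\n".join(f"- {h}" for h in vector_hits)
        sections.append("Related memory:\n" + bullets)
    if recent_turns:
        lines = "\n".join(f"{role}: {text}" for role, text in recent_turns)
        sections.append("Recent turns:\n" + lines)
    # The new message always comes last so the model answers it directly.
    sections.append(f"user: {user_message}\nassistant:")
    return "\n\n".join(sections)
```

Empty layers are skipped, so a brand-new session produces a prompt that contains only the user message.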

Installation

Prerequisites

  • Python 3.10+
  • git
  • Optional but recommended:
    • Access to IBM Granite models
    • MCP Forge Gateway (installed via mcpgateway)

1. Clone the repository

git clone https://github.ibm.com/wojciech-lebek/hammer.git
cd hammer

2. Create and activate a virtual environment, then install Python dependencies

make setup

3. MCP Forge Gateway via Makefile

The mcpgateway/makefile automates typical MCP tasks:

  1. Creating a virtualenv for the MCP Gateway.
  2. Installing mcpgateway.
  3. Starting the Gateway locally.
  4. Registering HAMMER API endpoints as MCP tools.
  5. Opening the admin UI in the browser.

Inspect available targets with:

cd mcpgateway
make help   # if defined, or open makefile to see targets

Quickstart

This section shows two end-to-end demos:

  • Scenario A: Local API + hybrid memory.
  • Scenario B: MCP mode via MCP Forge Gateway.

Scenario A – Local API

1. Start the HAMMER API

make run-api

Expected output (example):

Uvicorn running on http://127.0.0.1:9001

2. Inspect empty memory

curl -s "http://127.0.0.1:9001/debug_state?session_id=demo" | jq

You should see an empty buffer, no summaries and no vector hits.

3. Chat with the agent

curl -s -X POST "http://127.0.0.1:9001/chat" \
  -H "Content-Type: application/json" \
  -d '{
        "session_id": "demo",
        "message": "Explain what HAMMER is in 3 bullet points."
      }' | jq

This writes the interaction into RAM and (depending on configuration) the vector store.

4. Force a summary into SQLite

curl -s -X POST "http://127.0.0.1:9001/summarize_session" \
  -H "Content-Type: application/json" \
  -d '{"session_id":"demo","max_bullets":3}' | jq

A compact SQLite summary is created for session demo.
Subsequent calls can retrieve and reuse this summary.

Scenario B – MCP Mode

This scenario uses MCP Forge Gateway so that MCP-compatible clients can call HAMMER as a tool.

1. Install the MCP Gateway

make gateway-venv 
make gateway-install
make gateway-token

2. Register all HAMMER endpoints as tools

make gateway-register-hammer

3. Start the MCP Gateway

make run-gateway

Then open the MCP Gateway UI in your browser:

make gateway-ui-tools

Or open the MCP Gateway metrics view in your browser:

make gateway-ui-metrics

You should see HAMMER tools registered in the admin UI (for example tools exposing chat, memory inspection and architecture explanation).

4. Connect an MCP client

Configure your MCP-aware client (e.g. CLI, editor plugin or another agent) to use:

  • MCP Gateway URL: http://127.0.0.1:4444
  • The registered HAMMER tools (e.g. hammer_chat, hammer_debug_state, etc.)

5. Ask the client to use HAMMER

Example prompts from the MCP client:

  • “Use the HAMMER tool to explain its own architecture.”
  • “Use HAMMER to list what we discussed in this session so far.”
  • “Call HAMMER’s debug tool and show me the current memory layers.”

Key effects:

  • MCP turns HAMMER into a tool-providing service.
  • The LLM on the client side chooses which HAMMER tool to call.
  • You can use HAMMER in multi-tool and multi-agent workflows.
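
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request to the gateway. The sketch below builds such a request body for the `hammer_chat` tool. The envelope follows the MCP specification; the argument names are an assumption that the tool mirrors the `/chat` input, and the helper `tools_call_request` is hypothetical. Transport details such as the gateway endpoint path and the auth token are deliberately left out.

```python
import itertools
import json

_request_ids = itertools.count(1)  # JSON-RPC requests need unique ids


def tools_call_request(tool_name: str, arguments: dict) -> str:
    """Build the JSON-RPC 2.0 body for an MCP tools/call request."""
    return json.dumps(
        {
            "jsonrpc": "2.0",
            "id": next(_request_ids),
            "method": "tools/call",
            "params": {"name": tool_name, "arguments": arguments},
        }
    )


body = tools_call_request(
    "hammer_chat",
    # Assumed argument schema, mirroring the /chat endpoint input.
    {"session_id": "demo", "message": "Explain your architecture."},
)
```

In practice the MCP client library builds this envelope itself; the sketch only shows what travels between the client and the gateway.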

HTTP API

HAMMER exposes a small set of HTTP endpoints (FastAPI in api.py):

  • POST /chat

    • Input: {"session_id": "...", "message": "..." }
    • Output: model reply + metadata
    • Behavior: runs the full agent loop (read memory → call Granite → update memory).
  • GET /debug_state

    • Query: session_id=...
    • Output: JSON dump of RAM buffer, summaries and vector hits.
    • Useful for introspection and debugging.
  • POST /summarize_session

    • Input: {"session_id": "...", "max_bullets": 3 }
    • Output: generated summary.
    • Behavior: writes a compact summary into the SQLite store.
  • POST /memory_query

    • Input: query text and/or filters.
    • Output: retrieved summaries and vector matches.
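
The `/chat` behavior described above (read memory, call Granite, update memory) is the core of the "stateless LLM, stateful agent" idea. The following sketch shows that ordering with a pluggable `generate` callable standing in for the Granite wrapper. The class name `AgentLoop`, the returned fields and the use of the RAM buffer as the only memory layer are simplifying assumptions, not the project's actual code.

```python
from collections import defaultdict, deque
from typing import Callable


class AgentLoop:
    """Wrap a stateless text generator in a read -> generate -> write cycle."""

    def __init__(self, generate: Callable[[str], str], max_turns: int = 6):
        self.generate = generate  # stand-in for the Granite wrapper
        # One bounded buffer per session; old turns fall off automatically.
        self.buffers = defaultdict(lambda: deque(maxlen=max_turns))

    def chat(self, session_id: str, message: str) -> dict:
        buffer = self.buffers[session_id]
        # 1. Read memory (only the RAM buffer here, for brevity).
        history = "\n".join(f"{role}: {text}" for role, text in buffer)
        prompt = f"{history}\nuser: {message}\nassistant:".lstrip()
        # 2. Call the model; it sees only what the prompt carries.
        reply = self.generate(prompt)
        # 3. Update memory so the next turn can recall this one.
        buffer.append(("user", message))
        buffer.append(("assistant", reply))
        return {"reply": reply, "turns_in_buffer": len(buffer)}
```

Because the model receives nothing but the prompt, swapping `generate` for a different backend leaves the memory handling untouched.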

See doc/arch_api.md for a detailed diagram and explanation of the call flow, or open the API docs in your browser:

make api-docs

Directory Structure

HAMMER/
├── README.md                # This file
├── makefile                 # Root automation (MCP Gateway, tools, helper targets)
├── requirements.txt         # Python dependencies

├── src/
│   ├── cli.py               # CLI entrypoint
│   ├── api.py               # FastAPI server
│   ├── llm.py               # Granite LLM wrapper
│   ├── embedder.py          # Granite embedding wrapper
│   └── memory.py            # Hybrid memory implementation

├── mcpgateway/              # MCP Forge Gateway configuration and helper Makefile
├── doc/                     # Architecture docs, demos, slides, Q&A
├── img/                     # Logos and diagrams
├── wp/                      # Related papers / whitepapers on agent memory

For detailed walkthroughs, see:

  • doc/hammer_local_demo_md.md – local API demo.
  • doc/hammer_mcp_demo.md – MCP demo.
  • doc/arch-overview.md – high-level system overview.
  • doc/arch_api.md – API-layer architecture.

Use Cases

Typical scenarios:

  • Enterprise PoC for agent memory

    • Show stakeholders how a stateless Granite model can be extended with persistent memory.
    • Compare behavior with and without hybrid memory.
  • Tool-integrated agents

    • Use HAMMER as a tool in MCP workflows (e.g. with other MCP servers).
    • Let a higher-level orchestrator decide when to call HAMMER vs. other tools.
  • Research and experimentation

    • Evaluate different memory policies (e.g. when to summarize vs. store full turns).
    • Experiment with alternative summary schemas or vector stores.

Citation

If you refer to HAMMER in internal documents or presentations:

@misc{lebek2025hammer,
  author = {Lebek, W.},
  title = {HAMMER: Hybrid-memory Agent Model with Modular Execution and Recall},
  year = {2025},
  publisher = {IBM CIC Schweiz},
  howpublished = {\url{https://github.ibm.com/wojciech-lebek/hammer.git}}
}

License

Licensed under the Apache License 2.0.
