Harvey

Harvey is an agent REPL. It is written in Go and designed to use an Ollama server to access language models. The models can run locally or remotely, since Harvey uses Ollama's web service as the integration point. Harvey is a terminal-based application. It runs on Raspberry Pi OS (Raspberry Pi 5 hardware), Linux (arm64 and amd64), Windows (arm64 and amd64) and macOS (M1 and above). It is designed specifically to run on a Raspberry Pi 500+ computer.
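
To make "Ollama's web service as the integration point" concrete, here is a minimal sketch of a Go program talking to an Ollama server over HTTP. It is not Harvey's own client code. It assumes Ollama's default address (`http://localhost:11434`) and its `POST /api/generate` endpoint; the helper names and the model name `llama3.2` are placeholders chosen for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// generateRequest is the subset of Ollama's POST /api/generate body used here.
type generateRequest struct {
	Model  string `json:"model"`
	Prompt string `json:"prompt"`
	Stream bool   `json:"stream"` // false asks for one JSON object, not a stream
}

// generateResponse is the subset of the reply this sketch reads.
type generateResponse struct {
	Response string `json:"response"`
}

// newGenerateRequest builds the HTTP request without sending it.
// (Hypothetical helper, not part of Harvey or Ollama.)
func newGenerateRequest(baseURL, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(generateRequest{Model: model, Prompt: prompt, Stream: false})
	if err != nil {
		return nil, err
	}
	url := strings.TrimRight(baseURL, "/") + "/api/generate"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

// generate sends one prompt and returns the model's text.
func generate(client *http.Client, baseURL, model, prompt string) (string, error) {
	req, err := newGenerateRequest(baseURL, model, prompt)
	if err != nil {
		return "", err
	}
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		msg, _ := io.ReadAll(io.LimitReader(resp.Body, 512))
		return "", fmt.Errorf("ollama returned %s: %s", resp.Status, msg)
	}
	var out generateResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Response, nil
}

func main() {
	baseURL := "http://localhost:11434" // Ollama's default listen address
	if len(os.Args) > 1 {
		baseURL = os.Args[1] // e.g. a server elsewhere on a private network
	}
	client := &http.Client{Timeout: 2 * time.Minute}
	// "llama3.2" is a placeholder; use any model already pulled on the server.
	text, err := generate(client, baseURL, "llama3.2", "Say hello in five words.")
	if err != nil {
		fmt.Fprintln(os.Stderr, "generate failed:", err)
		return
	}
	fmt.Println(text)
}
```

Because the only coupling is an HTTP URL, pointing the same program at a remote Ollama server is a matter of passing a different base URL as the first argument.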

Features

integrated knowledge base
support for SKILL.md (and extending Harvey by using "compiled" SKILL.md files)
knowledge base with RAG support
an innovative session file format based on Fountain screenplay markup; session files are both human readable (like reading a screenplay) and machine consumable
works well with other data processing tools and, through shared skills, with other coding agents
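
To give a feel for why screenplay markup suits a chat transcript, here is a small sketch that renders conversation turns as Fountain-style dialogue. It is an illustration only and makes no claim about Harvey's actual session layout. The `turn` type and `renderFountain` function are hypothetical names. The sketch relies on three Fountain conventions: a title page of `Key: value` lines ended by a blank line, an upper-case character line preceded by a blank line, and two spaces on an otherwise empty line to keep a dialogue block contiguous.

```go
package main

import (
	"fmt"
	"strings"
)

// turn is one exchange in a conversation (hypothetical type for this sketch).
type turn struct {
	Speaker string // e.g. "user" or a model name
	Text    string
}

// renderFountain writes a title page followed by one dialogue block per turn.
func renderFountain(title string, turns []turn) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Title: %s\n", title)
	for _, t := range turns {
		// A blank line, then an upper-case name, marks a Fountain character.
		b.WriteString("\n")
		b.WriteString(strings.ToUpper(strings.TrimSpace(t.Speaker)))
		b.WriteString("\n")
		for _, line := range strings.Split(strings.TrimSpace(t.Text), "\n") {
			if strings.TrimSpace(line) == "" {
				// Two spaces keep a multi-paragraph reply inside one dialogue block.
				line = "  "
			}
			b.WriteString(line)
			b.WriteString("\n")
		}
	}
	return b.String()
}

func main() {
	session := []turn{
		{Speaker: "user", Text: "Hello"},
		{Speaker: "harvey", Text: "Hi there."},
	}
	fmt.Print(renderFountain("Demo", session))
}
```

The result reads like a script to a person and, because each speaker line is upper case and set off by a blank line, stays easy for a program to split back into turns.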

The name Harvey is inspired by the Mary Chase play of the same name. My little agent runs on small computers but is available to those who choose to see its value. Harvey in the play was a Púca, a mythic Celtic spirit who at times was prone to mischief. The Púca chose who could see it but even then the person doing the seeing had to be willing to see it too.

Motivation

I think the current AI hype cycle will likely end with a bang and a crash. Since I started working on Harvey we've already seen token expenses skyrocket. Many flat-rate plans ration tokens. Of course, if you opt to pay by the token, those services have every reason to encourage more token consumption. That leads me to believe that things will not get cheaper. Add to that investors' interest in aggressive cost recovery and we have a looming problem: a digital divide in resource availability.

Where does this leave us? The commercial platforms are just too expensive. There are problematic downsides in addition to cost (for example, energy consumption, model biases and data privacy issues). There needs to be an off ramp. I think that off ramp is bringing the models back in house. The trouble is that model development has been running on an assumption of ever-expanding compute resources. That's not sustainable. It doesn't match today's reality, where memory pricing has gone through the ceiling and the prices of other computing components are rising too. The external GPUs that medium and large models rely on have never been affordable. Time to change course. Time to revisit small models and get efficient.

Harvey is an exploration of bringing things home. This is about local control of the model system and running models on hardware that doesn't have a GPU. Harvey enables me to get useful work done with small models run via Ollama on my Raspberry Pi 500+ desktop. The Pi is a relatively low-cost, low-power machine. It is enough to run small models. If you have more horsepower available, Harvey will not mind or stand in your way.

Where I think things are going

I see language models on a continuum, like computers. In the beginning computers were large and unaffordable except by governments and the largest of corporations. A single machine took up floors of a building. Eventually they became much smaller and much cheaper. Right now large language models are like the huge mainframes of the 1950s and 1960s. Everyone is still thinking in terms of building-sized computers. The seeds for a different approach already exist. Ollama is only one example of that. I think real innovation can happen with small models that focus on specific domains and have a direct application. We've built extremely large general-purpose models in part because, collectively, the language model community needed to see how far we could take the concepts. We already seem to be plateauing with huge language models. Time to take a step back and see how small models can be and still be really helpful. We need a personal-level model system. I think the pieces are there; they just need to be gathered together in a simpler configuration.

Open Models, Small Models combined with generalized REPLs

The language model system space has enough maturity that we can pick the features that matter and skip much of the initial growing pains. The opportunity is ripe for building tooling that fits its use and purpose. The current crop of model systems (May 2026) lays out baseline features that are straightforward to implement. Coding agents are centered around concepts like sessions, skills, knowledge bases and retrieval augmented generation (RAG). There are plenty of papers and blog posts that describe these features, how they work and how to use them.

SKILL.md is a good example. It was proposed by Anthropic and adopted by others (for example, Mistral Vibe and OpenClaw). It is also pretty easy to implement using an Ollama server. The concepts of RAG and MCP seem to be picking up steam too. These too can be implemented for a resource-constrained system. Mozilla AI is working on a way to unify many implementation elements. An example is their any-llm project. It lets you build tools that integrate with many of the language model systems available today. If you then build on Ollama and Hugging Face's offerings, you have a foundation to roll your own tool. That's how Harvey started.

"Wait, wait, why Harvey? You should use OpenClaw!" OpenClaw is interesting but it is very easy to misconfigure. I wasn't comfortable with that myself. I felt like OpenClaw opened my computing environment up to a whole lot of hurt. I don't want agents running around my personal communication. I don't want them messing with my editor. I want my code agent to stick to a project directory. I want it focused on what I am working on. I want a safe tool as convenient as Claude Code, GitHub Copilot or Mistral Vibe. It should be transparent in how it works. It should document what it does. It should let me work with the models I choose for a specific task. It should be human scale and not put tentacles into everything else. I am working with language models on my local machine or my private network. Harvey avoids secrets. How Harvey is configured and its operating data should be visible to me as a human, as well as to any other language model system I might enlist. Harvey should let me easily direct appropriate material to a remote service while keeping everything else local, so I can continue the processing with models on my local machine. Harvey came about because I don't see other applications that work with language models filling that niche.

What is model? What is infrastructure?

Commercial SaaS language model services need to keep you engaged. They've done a good job of building up the human user interface. The conversations use techniques that seem similar to the BITE recruitment methodology described by Steven Hassan. When you step back and look at actual implementations, the ecosystems of OpenAI, Anthropic and others are not just the latest frontier model; they include considerable effort on a human user interface that keeps engagement up. The frontier models are important, but everything wrapped around them is straight-up traditional web services. This is true whether they used a language model to generate the code or not. That's telling. The model is important, but the user interface and the infrastructure under it are important too.

When building our own language model system, most of it can take advantage of traditional software programs and methods too. I suspect you can do a lot before sending the resulting text as a prompt to the model for processing. What is done in parallel for the sake of scaling can be done sequentially if we're at the scale of a single user. With a locally run language model system there is an opportunity to scale down instead of up.

Useful small open models are here today. In the hype that focuses on bigger and more generalized models, small specialized models seem undervalued. You can leverage their value if it is easy to run them locally. This is especially true if there are some key additional features like good session management, knowledge bases and RAG. Harvey shows this can be done on small, affordable hardware: hardware that doesn't require new data centers and power plants to be built. I think smaller models, ones that can run at the edge, will be the ones that carry weight in the long run. To explore the idea of small models on small computers I dreamed up Harvey. We'll have to wait to see where that adventure leads.

Release Notes

version: 0.0.1c
status: working proof of concept

Authors

Doiel, R. S.

Software Requirements

Go >= 1.26.2

Software Suggestions

For building Harvey and documentation from source.

CMTools >= 0.0.40
Pandoc >= 3.1
GNU Make >= 3

Related resources

Getting Help, Reporting bugs
LICENSE
Installation
About
Documentation Index: complete list of all Harvey documentation
