Björn Wikström edited this page Apr 13, 2026 · 9 revisions

Nous


Nous (νοῦς, Greek: mind / active intellect) is a persistent epistemic substrate for AI.

This repository should be read as a research program with a live implementation. It is not just a wrapper around model output. It is a system that stores typed relations, maintains graded uncertainty, evolves structurally across time, and carries memory between interactions.

Language models are the larynx in this picture, not the mind.

Why this category is different

Benchmarking Nous with standard LLM benchmarks would be like measuring the sweetness of chocolate with the Scoville scale. The issue is not minor inaccuracy. The issue is category error: the instrument was built to measure a different phenomenon.

Benchmarks such as MMLU, ARC, and HumanEval are valid instruments for language models. Nous is not merely a language model output surface. It is a persistent epistemic substrate.


What the artifact already contains

  • stores typed, evidence-scored relations instead of only text chunks
  • exposes explicit uncertainty and contradiction boundaries
  • runs a continuous cognitive loop between interactions
  • consolidates memory asynchronously
  • supports cross-domain bridge formation and bisociation
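To make the first two points concrete, here is a minimal sketch of what a typed, evidence-scored relation store with explicit contradiction boundaries could look like. All names and shapes here are illustrative assumptions for this page, not Nous's actual API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A typed relation with graded uncertainty (illustrative only)."""
    subject: str
    predicate: str
    obj: str
    confidence: float       # graded belief in [0, 1], not a binary fact
    evidence: tuple = ()    # identifiers of sources supporting the claim


class SubstrateSketch:
    """Toy store: typed relations plus explicit contradiction surfacing."""

    def __init__(self):
        self.relations: list[Relation] = []

    def assert_relation(self, rel: Relation) -> list[Relation]:
        """Store rel and return any already-stored relations it contradicts
        (same subject and predicate, different object), rather than
        silently overwriting them."""
        conflicts = [
            r for r in self.relations
            if r.subject == rel.subject
            and r.predicate == rel.predicate
            and r.obj != rel.obj
        ]
        self.relations.append(rel)
        return conflicts


store = SubstrateSketch()
store.assert_relation(Relation("aspirin", "inhibits", "COX-1", 0.9, ("paper:123",)))
conflicts = store.assert_relation(Relation("aspirin", "inhibits", "COX-9", 0.3))
# conflicts now contains the earlier COX-1 relation: the contradiction is
# exposed to the caller instead of being resolved invisibly.
```

The point of the sketch is the return value of `assert_relation`: a text-chunk store has nowhere to put a contradiction, while a typed substrate can hand it back as data.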

Reference evidence

Benchmarking is instrumentation here, not the identity of the project.

Historical reference run

In a documented reference run, an 8B model with Nous-grounded memory outperformed a 70B baseline on a domain-specific relational benchmark.

Model           Memory   Score   Questions
llama3.1-8b     —        46%     60
llama-3.3-70b   —        47%     60
llama3.1-8b     ✓ Nous   96%     60

This is useful evidence, but it is not the full story. It still measures answer quality at a moment in time. That is why FNC-Bench exists.

FNC-Bench

FNC-Bench is the repo's epistemic benchmark suite. It asks different questions:

  • does the system know that it does not know?
  • does it preserve belief under contradiction?
  • does stated confidence track real knowledge?
  • does the substrate change coherently across time?
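The third question, whether stated confidence tracks real knowledge, is a calibration question, and the simplest form of it can be sketched in a few lines. This is a generic calibration-gap measure for illustration, not FNC-Bench's actual scoring code.

```python
def calibration_gap(predictions):
    """predictions: list of (stated_confidence, was_correct) pairs.

    Returns |mean stated confidence - actual accuracy|.
    0.0 means confidence tracks knowledge on average; larger values
    mean systematic over- or under-confidence.
    """
    confidences = [conf for conf, _ in predictions]
    outcomes = [correct for _, correct in predictions]
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(outcomes) / len(outcomes)
    return abs(mean_confidence - accuracy)


# An overconfident system: it claims 0.9 but is right only half the time.
gap = calibration_gap([(0.9, True), (0.9, False), (0.9, True), (0.9, False)])
# gap is 0.4: the system's stated confidence overshoots its knowledge.
```

A well-calibrated substrate would keep this gap near zero even as its beliefs change across time, which is exactly the behavior a moment-in-time answer benchmark cannot see.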


Reading order

The shortest path through the category claim and implementation is:

Page                           Description
The Larynx Problem             Why language output is not the same thing as intelligence
Benchmark                      Why standard LLM benchmarks do not apply to Nous
Architecture                   Substrate structure, memory loop, graph runtime
Getting Started                Install, daemon setup, first query
Intent Disambiguation Effect   Why graph grounding changes model behavior
Contributing                   Ways to contribute code, docs, benchmarks, and datasets
Lab Notes                      Dated research notes, strategic documents, external communications
FAQ                            Common questions

Contact

𝕏 / Twitter @Q_for_qualia
LinkedIn bjornshomelab
Email bjorn@base76research.com
Issues GitHub Issues
