Welcome! This mini-repository is intentionally didactic: each file is a self-contained lesson, and together they incrementally introduce the idea of Cache-Augmented Generation (CAG) – a technique in which we preload relevant documents into an LLM’s context so the model can answer many questions without performing a live retrieval step.
The code is kept short, dependency-free and thoroughly logged so that you can run each script with plain Node and see what is happening at every phase.
```bash
# run any demo
node cag_demo.js
```

| File | Pedagogical Focus | Key Concepts Introduced |
|---|---|---|
| cag_demo.js | 🍏 “Hello CAG” – the smallest viable example | • Pre-loaded text cache • Linear string matching |
| cag_demo_with_vector_store.js | 🍊 Adds semantic search via fake embeddings | • Document vectors • Cosine similarity • Similarity threshold (0.85 default) |
| cache_augmented_llm.js | 🍎 Modularises the code into a reusable CacheAugmentedLLM class and layers extra features | • Embedding-function injection • Vector-store plug-in stub • Query-result cache (performance) • Runtime similarity-threshold tuning |
| cache_augmented_llm_with_search.js | 🍉 Separates the pipeline even further to highlight each sub-step | • Dedicated helpers: `vectorizeQuery` & `searchVectorStore` • Clear trace of vectorise ➜ search ➜ answer |
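To make the first row concrete, here is a minimal sketch of a pre-loaded text cache answered by a linear string scan. It only illustrates the idea – the cache contents and helper names below are made up, not taken from `cag_demo.js`.

```js
// Illustrative "Hello CAG" sketch: a pre-loaded cache plus a linear string
// scan. Not the actual cag_demo.js implementation.
const contextCache = [
  { topic: 'cag', text: 'Cache-Augmented Generation preloads documents into the context.' },
  { topic: 'rag', text: 'Retrieval-Augmented Generation fetches documents at query time.' },
];

// Purely lexical matching: return the first cached entry that shares a
// keyword (4+ letters) with the question. A paraphrase will simply miss.
function answerFromCache(question) {
  const keywords = question.toLowerCase().split(/\W+/).filter(w => w.length > 3);
  for (const entry of contextCache) {
    const haystack = entry.text.toLowerCase();
    if (keywords.some(word => haystack.includes(word))) return entry.text;
  }
  return 'Not in cache – a real system would fall back to the LLM or to retrieval.';
}

console.log(answerFromCache('What does cache-augmented generation preload?')); // keyword hit
console.log(answerFromCache('Why skip live lookups when responding?'));        // lexical miss
```

The second query is a paraphrase that shares no long keyword with the cached text, so the lexical scan misses – exactly the limitation the vector-store demo addresses with embeddings.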
Tip for educators – Because each successive file only adds a single conceptual leap, you can walk learners through the scripts one after another, live-coding small deltas or using `git diff` to highlight the change.
- Run `cag_demo.js` to observe basic string matching and discuss its limitations (lexical vs semantic).
- Move to `cag_demo_with_vector_store.js` to show how embeddings plus cosine similarity overcome those limits.
- Graduate to `cache_augmented_llm.js` for a conversation about real-world concerns: external vector stores, pluggable embeddings and caching for latency.
- Finish with `cache_augmented_llm_with_search.js` to underline the standard retrieval pipeline that underpins most production systems (sketched below).
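The sketch below ties those later steps together: a fake embedding, cosine similarity against an in-memory vector store, and the vectorise ➜ search ➜ answer trace. The helper names echo the table above, but the bodies are simplified stand-ins, not the repository's actual implementations.

```js
// Simplified vectorise ➜ search ➜ answer pipeline (illustrative only).

// A fake "embedding": hash each word into a short count vector so that
// similarly worded strings end up with similar vectors.
function fakeEmbed(text, dims = 16) {
  const vec = new Array(dims).fill(0);
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    let h = 0;
    for (const ch of word) h = (h * 31 + ch.charCodeAt(0)) % dims;
    vec[h] += 1;
  }
  return vec;
}

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Step 1: vectorise the query.
function vectorizeQuery(query) {
  console.log('[vectorise]', query);
  return fakeEmbed(query);
}

// Step 2: linear scan of the in-memory store, keeping the best match above
// the similarity threshold (0.85, mirroring the default mentioned above).
function searchVectorStore(queryVector, store, threshold = 0.85) {
  let best = null;
  for (const doc of store) {
    const score = cosineSimilarity(queryVector, doc.vector);
    console.log('[search]', score.toFixed(3), doc.text);
    if (score >= threshold && (!best || score > best.score)) best = { ...doc, score };
  }
  return best;
}

// Step 3: answer from the best cached document, or admit a miss.
function answer(query, store) {
  const hit = searchVectorStore(vectorizeQuery(query), store);
  console.log('[answer]', hit ? hit.text : 'No cached document was similar enough.');
}

const docs = [
  'CAG preloads documents into the context window.',
  'Cosine similarity compares two embedding vectors.',
];
const vectorStore = docs.map(text => ({ text, vector: fakeEmbed(text) }));

// A near-verbatim query: with fake embeddings only close wording clears 0.85.
answer('CAG preloads documents into the context window?', vectorStore);
```

Note that only near-identical wording scores above 0.85 with these fake embeddings – that gap is exactly what a real embedding model is meant to close.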
Retrieval-Augmented Generation (RAG) fetches documents at query time. CAG pre-loads a carefully selected subset into the model’s context (or fast in-memory vector store), trading memory for speed. This repo lets students experiment with that trade-off before touching heavyweight libraries or cloud services.
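A compact way to see that trade-off in code (illustrative only – `retrieve` and `generate` are injected stand-ins, not functions defined anywhere in this repository):

```js
// RAG: pay a retrieval step on every question.
async function answerWithRAG(question, retrieve, generate) {
  const docs = await retrieve(question); // index / network lookup per query
  return generate(question, docs);
}

// CAG: pay the loading cost once up front, then answer from memory.
function makeCAGAnswerer(preloadedDocs, generate) {
  const contextCache = [...preloadedDocs]; // held in memory for the session
  return question => generate(question, contextCache); // no live retrieval
}

// Tiny usage with a toy "generator" that just echoes its inputs.
const generate = (q, docs) => `Q: ${q}\nContext: ${docs.join(' | ')}`;
const answerWithCAG = makeCAGAnswerer(
  ['Doc A about caching.', 'Doc B about embeddings.'],
  generate
);
console.log(answerWithCAG('Which documents are already in memory?'));
```

Neither helper does anything clever; the point is only where the document-loading cost sits.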
All scripts rely only on the Node.js standard library.
```bash
# Execute a script
node cache_augmented_llm.js

# View verbose logs for learning
NODE_OPTIONS="--trace-warnings" node cache_augmented_llm_with_search.js
```

Feel free to modify the `contextCache` objects, tweak the `similarityThreshold`, or replace the fake embedding function with your own model to explore further.
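For instance, a replacement embedding function could look like the sketch below. The shape (text in, numeric array out) and the idea of raising the threshold come from the descriptions above; how you wire them into each script depends on that script's own code, so treat the names here as placeholders.

```js
// Hypothetical customisations – placeholder names, not the scripts' real API.

// 1. A replacement embedding: letter-frequency vectors instead of the
//    built-in fake embeddings. Any function mapping text -> number[] works.
function myEmbedding(text) {
  const vec = new Array(26).fill(0);
  for (const ch of text.toLowerCase()) {
    const idx = ch.charCodeAt(0) - 97; // 'a' === 97
    if (idx >= 0 && idx < 26) vec[idx] += 1;
  }
  return vec;
}

// 2. A stricter match requirement than the 0.85 default.
const similarityThreshold = 0.9;

console.log(myEmbedding('cache'), similarityThreshold);
```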
Happy learning – and happy caching! 🎉
This repository and all provided assets are maintained by admin@nguyenhongquan.com.