kvanta

Sharp little VRAM calculator for open LLMs. ⚡

Live app: https://kvanta.vcerny.cz
Source: github.com/vaclcer/kvanta

kvanta fetches public Hugging Face model configs and safetensors metadata, then calculates KV-cache memory and estimated model-weight footprint from context size, batch count, precision, and quantization.
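
As a rough illustration of the KV-cache arithmetic described above, here is a minimal sketch for a standard decoder-only model with grouped-query attention. It is not kvanta's actual code: the `AttentionConfig` shape and the `kvCacheBytes` and `toGiB` helpers are hypothetical names chosen for this example, and the formula assumes every layer caches full-length keys and values (no sliding window, MoE, or hybrid layouts).

```typescript
// Hypothetical config shape (assumption): the fields correspond to the usual
// Hugging Face config.json keys num_hidden_layers, num_key_value_heads, head_dim.
interface AttentionConfig {
  numHiddenLayers: number;
  numKeyValueHeads: number;
  headDim: number;
}

// KV-cache size for a plain full-attention decoder:
// 2 (keys and values) x layers x KV heads x head dim x tokens x batch x bytes per element.
function kvCacheBytes(
  cfg: AttentionConfig,
  contextTokens: number,
  batchSize: number,
  bytesPerElement: number,
): number {
  return (
    2 *
    cfg.numHiddenLayers *
    cfg.numKeyValueHeads *
    cfg.headDim *
    contextTokens *
    batchSize *
    bytesPerElement
  );
}

// Convert bytes to GiB (2^30 bytes).
function toGiB(bytes: number): number {
  return bytes / 1024 ** 3;
}

// Illustrative values only: 32 layers, 8 KV heads, head dim 128,
// 8192-token context, batch 1, 16-bit cache (2 bytes per element).
const exampleConfig: AttentionConfig = {
  numHiddenLayers: 32,
  numKeyValueHeads: 8,
  headDim: 128,
};
const exampleBytes = kvCacheBytes(exampleConfig, 8192, 1, 2);
console.log(`${toGiB(exampleBytes)} GiB`); // prints "1 GiB"
```

Doubling the batch or the context doubles the result, which is why these two inputs dominate the estimate for long-context serving.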

Notes

Exact adapters for standard decoder-only, GQA/MQA, sliding-window, GLM MoE DSA, DeepSeek V4, and Qwen3.5 hybrid cache layouts.
Model weights are estimated from Hugging Face safetensors.total metadata.
Runtime VRAM can still drift by inference engine because allocators, kernels, paged attention, and activation buffers all have opinions.
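
The weight estimate mentioned in the notes can be sketched along these lines. This is an assumption-laden illustration rather than kvanta's implementation: `SafetensorsMetadata` and `estimateWeightBytes` are hypothetical names, `total` is taken to be the parameter count reported in the model's safetensors metadata, and the calculation ignores quantization scales, zero-points, and any layers kept at higher precision.

```typescript
// Hypothetical shape (assumption): only the total parameter count is used here.
interface SafetensorsMetadata {
  total: number;
}

// Estimate the weight footprint as parameters x bits per weight, rounded up to whole bytes.
function estimateWeightBytes(meta: SafetensorsMetadata, bitsPerWeight: number): number {
  return Math.ceil((meta.total * bitsPerWeight) / 8);
}

// Illustrative values only: a 7-billion-parameter model.
const exampleMeta: SafetensorsMetadata = { total: 7_000_000_000 };
const bf16Bytes = estimateWeightBytes(exampleMeta, 16); // 14,000,000,000 bytes
const int4Bytes = estimateWeightBytes(exampleMeta, 4); // 3,500,000,000 bytes
console.log(bf16Bytes / 1e9, int4Bytes / 1e9); // prints "14 3.5"
```

Real quantized checkpoints usually come out somewhat larger than this figure because of per-group metadata, so the result is best read as a lower bound.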

Development

npm install
npm run dev

Useful checks:

npm run lint
npm run typecheck
npm test
npm run build

