This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Description
Problems:
- Cortex uses significant CPU; this has been reported multiple times by Emre.
Most recently:
> Even with the 7B model of R1 distills, the laptop heats up significantly and the fan kicks in - there wasn't this much heating even with 14B models.
> Do R1 models consume more energy for you as well?
Task:
- We should evaluate whether our CPU usage is correct
- There is likely a performance vs. CPU usage tradeoff
- Benchmark against Ollama's and LM Studio's approaches
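
One way to make the performance vs. CPU usage tradeoff measurable is to record CPU time alongside wall-clock time and report tokens per CPU-second in addition to tokens per second. The following is a minimal sketch, not the project's benchmark harness: `CpuSample` and `measure_command` are hypothetical names, it relies on the Unix-only `resource` module, and it assumes the workload runs as a one-shot child process. For engines that run as a long-lived server, the server process itself would have to be sampled instead (for example with a tool such as `psutil`).

```python
import resource
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class CpuSample:
    """Wall-clock and CPU time consumed by one benchmark run."""
    wall_s: float  # elapsed wall-clock seconds
    cpu_s: float   # user + system CPU seconds used by the child process

    @property
    def avg_cores(self) -> float:
        # Average number of cores kept busy; 1.0 means one full core.
        return self.cpu_s / self.wall_s if self.wall_s > 0 else 0.0

    def tokens_per_cpu_second(self, tokens: int) -> float:
        # Efficiency metric: generated tokens per CPU-second spent.
        return tokens / self.cpu_s if self.cpu_s > 0 else float("inf")


def measure_command(argv: list[str]) -> CpuSample:
    """Run argv to completion and return the CPU time it consumed.

    RUSAGE_CHILDREN only accounts for children that have been waited for,
    which subprocess.run does before returning.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    wall = time.perf_counter() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return CpuSample(wall_s=wall, cpu_s=cpu)


if __name__ == "__main__":
    # Placeholder workload; replace with the inference command under test.
    sample = measure_command([sys.executable, "-c", "sum(range(10**7))"])
    print(f"wall={sample.wall_s:.2f}s cpu={sample.cpu_s:.2f}s "
          f"avg_cores={sample.avg_cores:.2f}")
```

Two engines with the same tokens per second but different `avg_cores` would show up as different tokens per CPU-second, which matches the kind of gap described in the reports here.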
From Emre:
Jan (Cortex) is using almost 100% CPU in some cases. On Mac, the token speed is almost the same as with LM Studio, but Jan consumes much more energy.
https://docs.google.com/spreadsheets/d/1JY1S9msFlU9jU4gcrULDkwnRsmqpFPXpg8z5BzTDekM/edit?pli=1&gid=1969085827#gid=1969085827