There should be an option to specify a token limit, likely moderated heuristically by the size of the model's context window. Once the number of tokens in the conversation exceeds that limit after a completed response, and before the role switches back to the user in architect mode (or before the model initiates its next action in navigator mode):
- The model is sent a current copy of the conversation and asked to summarize it
- Split the response into:
  - Overall Goal
  - Next Steps (a checklist the model writes for itself of what is done and what remains)
  - Active Files (any files currently in use in the context window)
- Upon retrieving the response, remove everything in the conversation prior to the final exchange (i.e. the user's last request and the model's last response) and replace it with the summarized context
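The steps above can be sketched roughly as follows. This is a minimal illustration, not an implementation: `count_tokens` and `summarize` are hypothetical stand-ins for the real tokenizer and the model call, which this proposal does not specify.

```python
def count_tokens(messages):
    # Placeholder: a real implementation would use the model's tokenizer.
    return sum(len(m["content"].split()) for m in messages)

def summarize(messages):
    # Placeholder: a real implementation would send the conversation to the
    # model with a prompt asking for the three-part summary described above.
    return {
        "role": "system",
        "content": "Overall Goal: ...\nNext Steps: ...\nActive Files: ...",
    }

def maybe_compact(messages, max_tokens):
    """After a completed response, compact the history if it exceeds the limit."""
    if count_tokens(messages) <= max_tokens:
        return messages
    summary = summarize(messages)
    # Keep only the user's last request and the model's last response,
    # with the summary standing in for everything earlier.
    return [summary] + messages[-2:]
```

The key design point is that compaction runs only at a turn boundary, so the model never loses context mid-response.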
New flags necessary:
--enable-context-compaction (to enable the feature, default false)
--context-compaction-max-tokens (the maximum size of the conversation before compaction; default 80% of the detected model's context window size)
--context-compaction-summary-tokens (the maximum size of the stored summary; default 4k tokens. Summaries larger than this must be re-summarized until they fit within the limit, which is probably the most information-lossy part of the process)
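The iterative re-summarization behind --context-compaction-summary-tokens could look something like the sketch below. `shrink` is a hypothetical stand-in for re-prompting the model for a shorter summary; here it just truncates so the loop can be demonstrated, and a round cap guards against a model that never produces a small enough result.

```python
def shrink(summary):
    # Placeholder: a real implementation would re-prompt the model for a
    # tighter summary; truncating to half simulates a shorter result.
    words = summary.split()
    return " ".join(words[: max(1, len(words) // 2)])

def fit_summary(summary, max_tokens, max_rounds=5):
    """Re-summarize until the summary fits within the token budget."""
    rounds = 0
    while len(summary.split()) > max_tokens and rounds < max_rounds:
        summary = shrink(summary)
        rounds += 1
    return summary
```

Each round discards detail, which is why this stage is the most information-lossy part of the process: the budget should be generous enough that most summaries fit on the first pass.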