Generate-Context Algorithm #109

Closed
5 tasks done
granawkins opened this issue Sep 27, 2023 · 7 comments

Comments

@granawkins
Member

granawkins commented Sep 27, 2023

The way we currently generate context is:

  1. Add files that the user has selected
  2. Add diff annotations to those files for the diff or pr-diff they select
  3. Calculate how many tokens we've used. If it's over the model's max, throw an error.
  4. Else, if no_code_map is false (see the sketch after this list):
    a) Try to include filename/functions/signatures
    b) If it's too big, try to include filename/functions
    c) If it's too big, try to include just filenames
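
As a rough illustration of the current step-4 fallback, here is a minimal sketch; the helpers count_tokens and get_code_map below are hypothetical stand-ins, not the actual Mentat functions:

```python
# Hypothetical sketch of the current step-4 fallback; not the real Mentat code.

def count_tokens(text: str) -> int:
    return len(text) // 4  # crude stand-in for a real tokenizer

def get_code_map(files: list[str], level: str) -> str:
    # Placeholder: the real version renders filenames, functions, and signatures.
    return "\n".join(f"{f} [{level}]" for f in files)

def fill_with_code_map(files: list[str], remaining_tokens: int) -> str:
    # 4a-4c: try decreasing levels of detail until one fits the leftover budget.
    for level in ("signatures", "functions", "filenames"):
        code_map = get_code_map(files, level=level)
        if count_tokens(code_map) <= remaining_tokens:
            return code_map
    return ""  # not even bare filenames fit
```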

At 4), we want to use the remaining context in the most valuable way - not just fill it in with code_map. To do this we will (a rough sketch follows the list):

  • a) Make a list of all the features we could potentially include. Would include (mentat/app.py, 'code'), (mentat/app.py, 'diff'), (mentat/app.py, 'cmap_signatures'), etc. for different features of a file, as well as smaller chunks, e.g. (mentat/app.py:run, 'code'), (mentat/app.py:loop, 'code'). Chunks within files should cover the entire file without overlap.
  • b) Assign a relevance score to each feature based on (i) its embedding, relative to the current prompt (ii) if/how it relates to user-specified paths/diff, (iii) which functions it calls and is called by, etc.
  • c) Divide the score by some length factor - maybe the literal number of tokens, maybe a parameter like cmap_signature_weight. Just want to prioritize higher-density information.
  • d) Sort all the features by score, and add them one-by-one until context is full. If there are overlap conflicts, e.g. <file>:<func> is already included and you add <file>, keep the higher-level item.
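
A minimal, hypothetical sketch of steps a)-d); the Feature class and field names here are illustrative, not the actual Mentat API:

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass
class Feature:
    path: Path        # e.g. Path("mentat/app.py"), or a chunk like mentat/app.py:run
    kind: str         # 'code', 'diff', 'cmap_signatures', ...
    tokens: int       # length of this feature's code message
    relevance: float  # b) from embeddings, diff overlap, call graph, ...

def select_features(candidates: list[Feature], max_tokens: int) -> list[Feature]:
    # c) divide relevance by length so short, information-dense features rank higher
    ranked = sorted(candidates, key=lambda f: f.relevance / max(f.tokens, 1), reverse=True)
    selected: list[Feature] = []
    used = 0
    for feature in ranked:  # d) add one-by-one until the budget is spent
        if used + feature.tokens > max_tokens:
            continue
        # overlap handling (file vs. file:function conflicts) would go here
        selected.append(feature)
        used += feature.tokens
    return selected
```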

Happy to hear questions or suggestions on the approach! My plan moving forward is:

  • Move get_code_message to CodeFile, and update diff and codemaps to work on individual files. Eventually CodeFile will become CodeFeature and can be anything in a).
  • Set up a refresh workflow and caching of the code message
  • Build a basic version of the algo using just diff and codemaps
  • Add embeddings (with some type of persistent storage) and use to prioritize items in b)
  • Add Tree-sitter to parse files into smaller chunks.
@biobootloader
Member

Thanks for the clear write-up!

@biobootloader
Member

A couple of questions / comments:

Would include (mentat/app.py, 'code'), (mentat/app.py, 'diff'), (mentat/app.py, 'cmap_signatures') etc. for different features of a file

Am I correct in understanding that (mentat/app.py, 'code') would contain the full code and diff for a file / chunk, while (mentat/app.py, 'diff') would just contain the diff?

The diffs certainly complicate things. Without them it's clear to me that the "levels of detail" for files / chunks would go from "full code" to cmaps with less and less detail. But if a file or chunk has a diff, the diff alone might show less detail than a code map (which would show function signatures that might not have been touched by the diff). Maybe we should always include all diffs and just vary showing the surrounding code as full code or cmap levels?

Add Tree-sitter to parse files into smaller chunks.

So until adding tree-sitter all of this would be operating at the level of entire files, not chunks? Or are you going to use the CodeFile Intervals somehow?

@granawkins
Member Author

Am I correct in understanding that (mentat/app.py, 'code') would contain the full code and diff for a file / chunk, while (mentat/app.py, 'diff') would just contain the diff?

The diffs certainly complicate things. Without them it's clear to me that the "levels of detail" for files / chunks would go from "full code" to cmaps with less and less detail. But if a file or chunk has a diff, the diff alone might show less detail than a code map (which would show function signatures that might not have been touched by the diff). Maybe we should always include all diffs and just vary showing the surrounding code as full code or cmap levels?

I agree, maybe this is the move initially.

I do suspect in many cases including the full diff will overshoot the context. Especially when a PR includes new files, because the whole file is effectively a diff. What if diff is another argument, (mentat/app.py, 'code', <diff_target>)? It would be a treeish if the file is part of the active diff, otherwise None. Then you'd expect (mentat/app.py, 'cmap', 'HEAD') to have a higher score than (mentat/app.py, 'cmap', None), and you'd get that unless it was too big.
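
For illustration, the candidate permutations and the intended preference for one file might look like this; the relative weights and the 'HEAD' target are made up:

```python
# Hypothetical permutations for a single file; values are made up for illustration.
candidates = [
    ("mentat/app.py", "code", "HEAD"),  # full code annotated with the diff vs. HEAD
    ("mentat/app.py", "code", None),    # full code, no diff annotations
    ("mentat/app.py", "cmap", "HEAD"),  # cmap plus the changed lines
    ("mentat/app.py", "cmap", None),    # cmap only
]
# A permutation with a diff target would score higher, so ('cmap', 'HEAD') is
# preferred over ('cmap', None) unless it no longer fits the remaining context.
```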

So until adding tree-sitter all of this would be operating at the level of entire files, not chunks? Or are you going to use the CodeFile Intervals somehow?

I'd like to preserve the functionality we have all the way through but not sure how much cajoling that will take. I'll aim for that and keep you posted.

@biobootloader
Member

I do suspect in many cases including the full diff will overshoot the context. Especially when a PR includes new files, because the whole file is effectively a diff.

True. There are two use cases where we'll have diffs: 1) the user runs Mentat with --diff or --pr-diff, or 2) the diff is just the uncommitted changes, many of which Mentat may have just made earlier in the conversation or in a previous conversation. Hopefully the diffs will almost always fit in the second case. And the first case is special anyway, so it's probably ok to tell the user that the diff they chose is really big and not going to work well.

Then you'd expect (mentat/app.py, 'cmap', 'HEAD') to have a higher score than (mentat/app.py, 'cmap', None), and you'd get that unless it was too big.

To do this, though, we'd have to decide how to score/value diffs (i.e. it'd be another parameter). But that could work!

@granawkins
Member Author

Per the discussion above:

CodeFiles now have 3 main properties:

  • path (a Path)
  • level ('code', 'interval', 'cmap_full', 'cmap', 'file_name' - in that order)
  • diff (either a git Treeish or None)

In CodeContext, we generate all valid permutations of the above for every file in the workspace. These are then scored and sorted, and added one-by-one until context is full. It's set up so that:

  • Only one permutation is included per path - the longest permutation that the algorithm sees before it runs out of space.
  • Diffs are included as annotations to code or by just appending the changed lines to a cmap. A permutation with a diff is weighted much higher and should usually be preferred.
  • User-specified paths can have an interval like they used to, and the algorithm could also choose to include the entire file. We don't generate interval permutations yet.
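
Per the three points above, a rough sketch of that shape; the class, enum, and function names below are guesses for illustration, and the real ones in the codebase may differ:

```python
from __future__ import annotations
from enum import IntEnum
from pathlib import Path

class CodeMessageLevel(IntEnum):
    CODE = 1        # full file contents
    INTERVAL = 2    # a line range within the file (not generated yet)
    CMAP_FULL = 3
    CMAP = 4
    FILE_NAME = 5

class CodeFeature:
    def __init__(self, path: Path, level: CodeMessageLevel, diff: str | None = None):
        self.path = path    # a Path
        self.level = level  # detail level, in the order above
        self.diff = diff    # a git treeish, or None if no diff applies

def permutations(path: Path, diff_target: str | None) -> list[CodeFeature]:
    # One candidate per (level, diff) combination for the file; these are then
    # scored, sorted, and added one-by-one until context is full.
    diffs = [diff_target, None] if diff_target else [None]
    levels = [CodeMessageLevel.CODE, CodeMessageLevel.CMAP_FULL,
              CodeMessageLevel.CMAP, CodeMessageLevel.FILE_NAME]
    return [CodeFeature(path, level, diff) for level in levels for diff in diffs]
```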

Hope this helps grok the code a bit.

@granawkins
Member Author

granawkins commented Nov 9, 2023

This is largely complete now, and we want to move toward an official launch.

Changes to code before launch:

  • Add tqdm for context-generation
  • Make display_features clearer, so you can see exactly what auto-context added
  • Update documentation for auto-context workflow
  • Handle invalid JSON responses in LLMFeatureFilter gracefully
  • LLM call made during feature selection shouldn't print "Speed: ... Cost: ..." #265
  • Enable edits to files that are not included (ask permission)
  • Auto-truncate included context with a warning
  • Add CostTracker to LLMFeatureFilter

Tasks for launch:

  • Flow diagram of the whole process
  • Demo video(s)
  • Blog post
  • Tweet thread

@jakethekoenig
Member

Thanks for writing this up. Some other things I think we need to do:

  • When auto-context is enabled, Mentat should be allowed to edit files that are not included. User confirmation is asked for anyway, so I think this is fine.
  • When auto-context is enabled, rather than failing when the included files exceed the context, we should give a warning that our auto-context system is selecting a subset of the included files and that they may not all be in the LLM's context. In the limit, I think mentat and mentat . should have the same behavior.
  • LLMFeatureFilter needs to report its cost to the CostTracker.
