The structure of an effective Transformer LLM call is:
System Prompt
Preamble
Few-shot examples
Question
With statecraft and your SSM, we instead simply have:
Inputted state (with problem context, initial instructions, textbooks, and few-shot examples all baked in)
Short question
So the goal of this new framing is to take:
System Prompt
Preamble
Few-shot examples
And mush them all into a "state". So now instead of having to pass all the prompt / few-shot examples around, I can just pass the state.
Pros: cheaper/faster in terms of compute, you can hide proprietary prompts and examples, you can use statespace-specific "warmup" techniques that might not be possible with transformers, e.g. closed-loop warmup
Cons: you can't look under the hood of the prompts that created the state
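To check my own understanding of "pass the state instead of the prompt", here is a toy sketch. It is not Statecraft's API: `step`, `run`, `DECAY`, and `WEIGHT` are all made up for illustration, and the "model" is just a tiny linear recurrence standing in for an SSM. The point is only that, for a recurrent model, continuing from a cached state gives the same result as re-reading the whole prefix.

```python
from typing import Tuple

State = Tuple[float, float]

# Toy per-channel decay and input weights (hypothetical values, illustration only).
DECAY = (0.9, 0.5)
WEIGHT = (0.01, 0.02)


def step(state: State, token: int) -> State:
    """One recurrent update: h' = decay * h + weight * token."""
    return tuple(d * h + w * token for d, h, w in zip(DECAY, state, WEIGHT))


def run(text: str, state: State = (0.0, 0.0)) -> State:
    """Feed text one character at a time, starting from the given state."""
    for ch in text:
        state = step(state, ord(ch))
    return state


prefix = "SYSTEM PROMPT\nPREAMBLE\nFEW-SHOT EXAMPLES\n"
question = "Short question?"

# Transformer-style call: process the full prompt every time.
full = run(prefix + question)

# State-style call: process the prefix once, keep the state, then only
# feed the short question on each later call.
cached_state = run(prefix)
from_cache = run(question, cached_state)

assert full == from_cache
```

The cached state is a fixed-size object no matter how long the prefix was, which is where the "cheaper/faster" pro comes from, and also why the prefix cannot be read back out of it.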
Something Docker did well was the Dockerfile vs. Docker image distinction. The Docker image is the binary recording of a machine state; the Dockerfile is the set of instructions that creates the image. So you can understand an image by looking at its Dockerfile, and you can hack a Dockerfile to get a new image.
If I understand correctly, Statecraft has an analog to "docker image", but it is missing the analogous "dockerfile". Do I understand the system correctly?
System Prompt
Preamble
Few-shot examples
And mush them all into a "state". So now instead of having to pass all the prompt / few-shot examples around, I can just pass the state.
Pros: cheaper/faster in terms of compute, you can hide proprietary prompts and examples, you can use statespace-specific "warmup" techniques that might not be possible with transformers, e.g. closed-loop warmup
Yes that's exactly correct, I love the way you frame it here! 🙌
If I understand correctly, Statecraft has an analog to "docker image", but it is missing the analogous "dockerfile". Do I understand the system correctly?
I really like the Docker analogy; perhaps I should lean into this more!
You're correct that the state acts as the Docker image in this case. As for the Dockerfile, each state comes with a metadata JSON file which contains the name of the model that was used to create it (and hence the configuration) as well as the prompt that was used to create the state (or a url_reference to the text that was used for the prompt, if the prompt is long).
This acts as the Dockerfile analogue, which can be used to create the image. The metadata also contains a plain-text description for any other notes on the state's creation or intended usage.
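As a rough illustration of what such a metadata file could hold, here is a minimal sketch. The class name `StateMetadata`, the field names `model_name`, `prompt`, and `description`, and the placeholder model name are assumptions made for this example, not the actual Statecraft schema; only `url_reference` is taken from the wording above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class StateMetadata:
    """Hypothetical 'Statefile': the recipe that produced a saved state."""

    model_name: str                       # model (and hence config) used to build the state
    prompt: Optional[str] = None          # prompt text, if short enough to inline
    url_reference: Optional[str] = None   # pointer to the prompt text, if it is long
    description: str = ""                 # free-form notes on creation or intended usage

    def __post_init__(self) -> None:
        # Assumption: a state is only reproducible if its source text is recorded.
        if self.prompt is None and self.url_reference is None:
            raise ValueError("provide a prompt or a url_reference")

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "StateMetadata":
        return cls(**json.loads(text))


meta = StateMetadata(
    model_name="example-ssm",  # placeholder, not a real model name
    prompt="You are a helpful assistant.",
    description="Toy state for illustration.",
)

# Round-trip through JSON, as if writing and re-reading the metadata file.
restored = StateMetadata.from_json(meta.to_json())
assert restored == meta
```

With a file like this next to each state, rebuilding the state would amount to loading the named model and feeding it the recorded prompt, which is the "hack the Dockerfile to get a new image" workflow from the question.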
Perhaps I should align the messaging with Docker and call this object the Statefile or similar, that's an interesting point 💡
Yeah you have a great understanding, thanks again for your question!