-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Client Compatible Bedrock ARN handling #62720
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RXminuS , after our conversation this morning we don't want to merge these changes, right?
e.g. if we have the raw AWS ARN be put in the ChatModel
configuration field, it causes clients to choke on the Mal-formed model.
So we wanted to instead go with a model where the model was specified as something like anthropic.v1/${awsArn}
. Right? So we should close this out?
5757308
to
cbc0b77
Compare
cbc0b77
to
d3947d3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see you made some changes, posting the review comments before diffing them becomes a pain..
internal/conf/conftypes/conftypes.go
Outdated
@@ -101,3 +102,33 @@ func (r *RawUnified) FromProto(in *proto.RawUnified) { | |||
func (r RawUnified) Equal(other RawUnified) bool { | |||
return r.Site == other.Site && reflect.DeepEqual(r.ServiceConnections, other.ServiceConnections) | |||
} | |||
|
|||
// Bedrock ModelIDs can either be just a <model_id> string as specified in https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html. Or when using provisioned capacity a string like `<model_id>/<provisioned_capacity>` where the <provisioned_capacity> refers to the arn in https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html#API_CreateProvisionedModelThroughput_RequestSyntax |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit: While I don't think go fmt
or any style guide will give you an exact max column length, I would suggest that you add a new line after 80, 100, or 120 columns. (Whatever seems most consistent with the rest of the surrounding code.)
The problem with having a single, really long comment like this is that it makes the code harder to read. (e.g. you need to stop and then scroll horizontally.) Whereas splitting at XX columns, means you can read the code top-down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I think the comment would benefit from some examples since it isn't clear what form <model_id> should take.
e.g.
// Bedrock Model IDs can be in one of two forms:
// - A static model ID, e.g. "anthropic.claude-v2".
// - A model ID and ARN for provisioned capacity, e.g.
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
//
// See the AWS docs for more information:
// https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
// https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a quick note about what constitutes a valid BedrockModelID
instance? Specifically, that in most cases, we expect the ProvisionedCapacity
string to be empty.
As an optional suggestion, consider going a step further and making this explicit. (e.g. renaming the field to ProvisionedCapacityARN
and changing its type to be *string
.)
type BedrockModelIdParts struct {
// Model is the underlying LLM model Bedrock is serving, e.g. "anthropic.claude-3-haiku-20240307-v1:0"
Model string
// If the configuration is using provisioned capacity, this will
// contain the ARN of the model to use for making API calls.
// e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
ProvisionedCapacityARN *string
}
internal/conf/conftypes/conftypes.go
Outdated
@@ -101,3 +102,33 @@ func (r *RawUnified) FromProto(in *proto.RawUnified) { | |||
func (r RawUnified) Equal(other RawUnified) bool { | |||
return r.Site == other.Site && reflect.DeepEqual(r.ServiceConnections, other.ServiceConnections) | |||
} | |||
|
|||
// Bedrock ModelIDs can either be just a <model_id> string as specified in https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html. Or when using provisioned capacity a string like `<model_id>/<provisioned_capacity>` where the <provisioned_capacity> refers to the arn in https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html#API_CreateProvisionedModelThroughput_RequestSyntax | |||
type BedrockModelIdParts struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Golang naming conventions would have you write this as ID (all caps).
type BedrockModelIdParts struct { | |
type BedrockModelIDParts struct { |
But I'll also say that something doesn't quite smell right, here. Since this data type isn't "parts" so much as just a way to describe an AWS Bedrock model. (And that just requires a ProvisionedCapacityARN
in some situations.)
So what do you think about renaming this to just BedrockModel
or BedrockModelRef
or BedrockModelReference
? "XXX ID Parts" just doesn't sound right, IMHO.
internal/conf/conftypes/conftypes.go
Outdated
} | ||
|
||
func ParseBedrockModelId(modelId string) BedrockModelIdParts { | ||
parts := strings.SplitN(modelId, "/", 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As it turns out, strings.SplitN
will return a slice with zero elements if given an empty string. So in some degenerate case this function will panic on line 116.
parts := strings.SplitN(modelId, "/", 2) | |
if modelId == "" { | |
return BedrockModelIdParts{} | |
} | |
parts := strings.SplitN(modelId, "/", 2) |
https://go.dev/play/p/wlg3qDPUEmn
Also, re Golang naming conventions, it's more common to have the parameter name be modelID
(capitalizing ID).
internal/conf/conftypes/conftypes.go
Outdated
return parsed | ||
} | ||
|
||
func CasefoldBedrockModelId(modelId string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was unfamiliar with the concept of "casefold", which perhaps is more common in the Python world? For clarity, would it be more helpful to call this CanonicalizeBedrockModelID(string) string
?
Or even change the signature to be:
func (bm BedrockModelId) CanonicalizedModel() string {
That may make the code easier to read and follow. Also, since the function is exported from the package we may want to put a quick doc comment on it to explain what the function does and why it is needed. (e.g. when we would want to "casefold" a model ID.)
In case it comes up later, the Golang version of a case-insensitive comparison in case you missed it is called "EqualFold".
https://pkg.go.dev/strings#EqualFold
internal/conf/validation/cody.go
Outdated
} | ||
} | ||
|
||
//check for bedrock ARNs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
//check for bedrock ARNs | |
// Check for bedrock ARNs, which if using provisioned capacity are in | |
// an awkward format. |
internal/conf/validation/cody.go
Outdated
@@ -30,6 +38,27 @@ func completionsConfigValidator(q conftypes.SiteConfigQuerier) conf.Problems { | |||
problems = append(problems, "'completions.enabled' has been superceded by 'cody.enabled', please migrate to the new configuration.") | |||
} | |||
|
|||
allModelIds := []modelId{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor nit: Since these aren't used anywhere else in the function, could we just move this code block to inside the if .Provider == AWSBedrock {
block?
//check for bedrock ARNs | ||
if completionsConf.Provider == string(conftypes.CompletionsProviderNameAWSBedrock) { | ||
for _, modelId := range modelIdsToCheck { | ||
if strings.HasPrefix(modelId.value, "arn:aws:") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarity, if you think this is unnecessary feel free to ignore.
if strings.HasPrefix(modelId.value, "arn:aws:") { | |
// When using provisioned capacity we expect an admin would just put the ARN | |
// here directly, but we need both the model AND the ARN. Hence the check. | |
if strings.HasPrefix(modelId.value, "arn:aws:") { |
expectedProblems: conf.NewSiteProblems( | ||
fmt.Sprintf(bedrockArnMessageTemplate, "chatModel", "arn:aws:bedrock:us-west-2:012345678901:provisioned-model/abcdefghijkl"), | ||
), | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a positive test to, where the ChatModel isn't prefixed with are:aws
?
Also, another test scenario that sets some other model like CompletionModel
or FastChatModel
? (i.e. just confirm that we look at more than just ChatModel
.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, finished the code review.
Overall, things look good. Functionality wise this seems to do what we need it to, so if your testing of how the backend supports these "provisioned capacity ARNs", then let's go ahead and get it checked-in.
But as far as code quality and/or Golang idioms, there are a few ways that we can improve this. Some recurring themes:
- Stylize
ID
with two capital letters. This is called out in Google's Golang style guide here - The "ID parts" data type seems a little off, e.g. conceptually it is a reference to an AWS Bedrock model. (Which may or may not contain an ARN for provisioned capacity.) So maybe we can make that field
*string
to make that super-clear?
Anyways, nothing structural. Just a lot of minor nitpicks. (And 🤞 we can make this problem go away entirely by having a more sophisticated way to refer to an LLM model in our config, in RFC 950.)
|
||
if parsedModelId.ProvisionedCapacity != "" { | ||
if stream { | ||
apiURL.RawPath = fmt.Sprintf("/model/%s/invoke-with-response-stream", url.QueryEscape(parsedModelId.ProvisionedCapacity)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling url.QueryEscape
here is really surprising. IIRC, this is a quirk of the AWS API, and something we have to do. So perhaps that's subtle enough that we should call out with a quick comment? e.g.
// We need to Query escape the provisioned capacity ARN, since
// otherwise the AWS API <does thing that we don't like>.
internal/conf/computed.go
Outdated
@@ -853,9 +853,9 @@ func GetCompletionsConfig(siteConfig schema.SiteConfiguration) (c *conftypes.Com | |||
} | |||
|
|||
// Make sure models are always treated case-insensitive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we update this comment to explain why we need to call CasefoldBedrockModelID
and not just the "more obvious" strings.ToLower
? e.g.
// Make sure models are always treated case-insensitive, however
// if the model provides a provisioned capacity ARN that needs
// to be kept as-is. (Since ARNs are case-sensitive.)
internal/conf/computed.go
Outdated
@@ -1197,6 +1198,8 @@ func defaultMaxPromptTokens(provider conftypes.CompletionsProviderName, model st | |||
} | |||
|
|||
func anthropicDefaultMaxPromptTokens(model string) int { | |||
//TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultra-pedantic nit pick around the syntax of a comment.
//TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html | |
// TODO: this doesn't nearly cover all the ways that token size can be specified. | |
// See: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html |
@@ -19,6 +20,13 @@ func init() { | |||
conf.ContributeValidator(embeddingsConfigValidator) | |||
} | |||
|
|||
const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\"" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case you were unaware, in Golang you can use the %q
format specifier to put double quotes around a string. So this arguably could be cleaned up by using that instead of \"...\"
.
fmt.Sprintf(
"completions.%s is invalid. Must be formatted like %q. For example %q.",
rawValue,
"model_id/provisioned_capacity_arn",
"anthropic.claude-instant-v1/" + rawValue)
internal/conf/validation/cody.go
Outdated
@@ -19,6 +20,13 @@ func init() { | |||
conf.ContributeValidator(embeddingsConfigValidator) | |||
} | |||
|
|||
const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\"" | |||
|
|||
type modelId struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like throughout the PR, perhaps naming this modelID
is the more idiomatic choice.
Also, because Golang is awesome like that, you can literally put a type definition within a function. So rather than introducing a package-level symbol here, you can define modelID
inside the completionsConfigValidator
function.
// Bedrock Model IDs can be in one of two forms: | ||
// - A static model ID, e.g. "anthropic.claude-v2". | ||
// - A model ID and ARN for provisioned capacity, e.g. | ||
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning
Model string | ||
// If the configuration is using provisioned capacity, this will | ||
// contain the ARN of the model to use for making API calls. | ||
// e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning
@@ -30,6 +33,35 @@ | |||
problems = append(problems, "'completions.enabled' has been superceded by 'cody.enabled', please migrate to the new configuration.") | |||
} | |||
|
|||
// Check for bedrock Provisioned Capacity ARNs which should instead be | |||
// formatted like: | |||
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx" |
Check warning
Code scanning / Semgrep OSS
Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning
The backport to
To backport this PR manually, you can either: Via the sg toolUse the sg backport -r 5.4.0 -p 62720 Via your terminalTo backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.4.0 5.4.0
# Navigate to the new working tree
cd .worktrees/backport-5.4.0
# Create a new branch
git switch --create backport-62720-to-5.4.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6a7666c42cf349a6b702b8c72c9cc2c4e8421ad7
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0 If you encouter conflict, first resolve the conflict and stage all files, then run the commands below: git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0
Once the pull request has been created, please ensure the following:
|
* Improve Bedrock ARN handling * Fix up PR comments (cherry picked from commit 6a7666c)
Improve handling of Bedrock ARNs so that they no longer break client parsing of ModelIDs. This used to result in the CompletionProvider not initializing when configuring a ARN model ID.
This is now resolved by requiring a
<underlying_model_id>/<provisioned_capacity_arn>
format which is correctly parsed by the client. This also has the benefit of providing some information about the underlying model behind the provisioned capacityTest plan