Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client Compatible Bedrock ARN handling #62720

Merged
merged 3 commits into from
May 20, 2024

Conversation

RXminuS
Copy link
Contributor

@RXminuS RXminuS commented May 16, 2024

Improve handling of Bedrock ARNs so that they no longer break client parsing of ModelIDs. This used to result in the CompletionProvider not initializing when configuring a ARN model ID.

This is now resolved by requiring a <underlying_model_id>/<provisioned_capacity_arn> format which is correctly parsed by the client. This also has the benefit of providing some information about the underlying model behind the provisioned capacity

Test plan

  • added unit test for expected configuration output
  • verified locally with VSCode connecting to SG instance
  • verified in reference deployment (in progress)

@RXminuS RXminuS added the p0 label May 16, 2024
@cla-bot cla-bot bot added the cla-signed label May 16, 2024
@RXminuS RXminuS marked this pull request as ready for review May 16, 2024 09:55
@RXminuS RXminuS requested a review from a team May 16, 2024 09:56
Copy link
Contributor

@chrsmith chrsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@RXminuS , after our conversation this morning we don't want to merge these changes, right?

e.g. if we have the raw AWS ARN be put in the ChatModel configuration field, it causes clients to choke on the Mal-formed model.

So we wanted to instead go with a model where the model was specified as something like anthropic.v1/${awsArn}. Right? So we should close this out?

@RXminuS RXminuS force-pushed the rnauta/fix/p-arn-identifier-override branch from 5757308 to cbc0b77 Compare May 16, 2024 19:42
@RXminuS RXminuS changed the title Expose Bedrock ARNs to client as compatible model IDs Improve Bedrock ARN handling May 16, 2024
@RXminuS RXminuS changed the title Improve Bedrock ARN handling Client Compatible Bedrock ARN handling May 16, 2024
@RXminuS RXminuS requested review from chrsmith and a team and removed request for a team May 16, 2024 19:44
@RXminuS RXminuS force-pushed the rnauta/fix/p-arn-identifier-override branch from cbc0b77 to d3947d3 Compare May 16, 2024 20:22
Copy link
Contributor

@chrsmith chrsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see you made some changes, posting the review comments before diffing them becomes a pain..

@@ -101,3 +102,33 @@ func (r *RawUnified) FromProto(in *proto.RawUnified) {
func (r RawUnified) Equal(other RawUnified) bool {
return r.Site == other.Site && reflect.DeepEqual(r.ServiceConnections, other.ServiceConnections)
}

// Bedrock ModelIDs can either be just a <model_id> string as specified in https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html. Or when using provisioned capacity a string like `<model_id>/<provisioned_capacity>` where the <provisioned_capacity> refers to the arn in https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html#API_CreateProvisionedModelThroughput_RequestSyntax
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: While I don't think go fmt or any style guide will give you an exact max column length, I would suggest that you add a new line after 80, 100, or 120 columns. (Whatever seems most consistent with the rest of the surrounding code.)

The problem with having a single, really long comment like this is that it makes the code harder to read. (e.g. you need to stop and then scroll horizontally.) Whereas splitting at XX columns, means you can read the code top-down.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I think the comment would benefit from some examples since it isn't clear what form <model_id> should take.

e.g.

// Bedrock Model IDs can be in one of two forms:
// - A static model ID, e.g. "anthropic.claude-v2".
// - A model ID and ARN for provisioned capacity, e.g.
//  "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
//
// See the AWS docs for more information:
// https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
// https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ OK, one final thing! But this is, IMHO actually important.

Can we add a quick note about what constitutes a valid BedrockModelID instance? Specifically, that in most cases, we expect the ProvisionedCapacity string to be empty.

As an optional suggestion, consider going a step further and making this explicit. (e.g. renaming the field to ProvisionedCapacityARN and changing its type to be *string.)

type BedrockModelIdParts struct {
    // Model is the underlying LLM model Bedrock is serving, e.g. "anthropic.claude-3-haiku-20240307-v1:0"
    Model string

    // If the configuration is using provisioned capacity, this will
    // contain the ARN of the model to use for making API calls.
    // e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"
    ProvisionedCapacityARN *string
}

@@ -101,3 +102,33 @@ func (r *RawUnified) FromProto(in *proto.RawUnified) {
func (r RawUnified) Equal(other RawUnified) bool {
return r.Site == other.Site && reflect.DeepEqual(r.ServiceConnections, other.ServiceConnections)
}

// Bedrock ModelIDs can either be just a <model_id> string as specified in https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html. Or when using provisioned capacity a string like `<model_id>/<provisioned_capacity>` where the <provisioned_capacity> refers to the arn in https://docs.aws.amazon.com/bedrock/latest/APIReference/API_CreateProvisionedModelThroughput.html#API_CreateProvisionedModelThroughput_RequestSyntax
type BedrockModelIdParts struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Golang naming conventions would have you write this as ID (all caps).

Suggested change
type BedrockModelIdParts struct {
type BedrockModelIDParts struct {

But I'll also say that something doesn't quite smell right, here. Since this data type isn't "parts" so much as just a way to describe an AWS Bedrock model. (And that just requires a ProvisionedCapacityARN in some situations.)

So what do you think about renaming this to just BedrockModel or BedrockModelRef or BedrockModelReference? "XXX ID Parts" just doesn't sound right, IMHO.

}

func ParseBedrockModelId(modelId string) BedrockModelIdParts {
parts := strings.SplitN(modelId, "/", 2)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it turns out, strings.SplitN will return a slice with zero elements if given an empty string. So in some degenerate case this function will panic on line 116.

Suggested change
parts := strings.SplitN(modelId, "/", 2)
if modelId == "" {
return BedrockModelIdParts{}
}
parts := strings.SplitN(modelId, "/", 2)

https://go.dev/play/p/wlg3qDPUEmn

Also, re Golang naming conventions, it's more common to have the parameter name be modelID (capitalizing ID).

return parsed
}

func CasefoldBedrockModelId(modelId string) string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was unfamiliar with the concept of "casefold", which perhaps is more common in the Python world? For clarity, would it be more helpful to call this CanonicalizeBedrockModelID(string) string?

Or even change the signature to be:

func (bm BedrockModelId) CanonicalizedModel() string {

That may make the code easier to read and follow. Also, since the function is exported from the package we may want to put a quick doc comment on it to explain what the function does and why it is needed. (e.g. when we would want to "casefold" a model ID.)

In case it comes up later, the Golang version of a case-insensitive comparison in case you missed it is called "EqualFold".
https://pkg.go.dev/strings#EqualFold

}
}

//check for bedrock ARNs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//check for bedrock ARNs
// Check for bedrock ARNs, which if using provisioned capacity are in
// an awkward format.

@@ -30,6 +38,27 @@ func completionsConfigValidator(q conftypes.SiteConfigQuerier) conf.Problems {
problems = append(problems, "'completions.enabled' has been superceded by 'cody.enabled', please migrate to the new configuration.")
}

allModelIds := []modelId{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit: Since these aren't used anywhere else in the function, could we just move this code block to inside the if .Provider == AWSBedrock { block?

//check for bedrock ARNs
if completionsConf.Provider == string(conftypes.CompletionsProviderNameAWSBedrock) {
for _, modelId := range modelIdsToCheck {
if strings.HasPrefix(modelId.value, "arn:aws:") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for clarity, if you think this is unnecessary feel free to ignore.

Suggested change
if strings.HasPrefix(modelId.value, "arn:aws:") {
// When using provisioned capacity we expect an admin would just put the ARN
// here directly, but we need both the model AND the ARN. Hence the check.
if strings.HasPrefix(modelId.value, "arn:aws:") {

expectedProblems: conf.NewSiteProblems(
fmt.Sprintf(bedrockArnMessageTemplate, "chatModel", "arn:aws:bedrock:us-west-2:012345678901:provisioned-model/abcdefghijkl"),
),
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a positive test to, where the ChatModel isn't prefixed with are:aws?

Also, another test scenario that sets some other model like CompletionModel or FastChatModel? (i.e. just confirm that we look at more than just ChatModel.)

Copy link
Contributor

@chrsmith chrsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, finished the code review.

Overall, things look good. Functionality wise this seems to do what we need it to, so if your testing of how the backend supports these "provisioned capacity ARNs", then let's go ahead and get it checked-in.

But as far as code quality and/or Golang idioms, there are a few ways that we can improve this. Some recurring themes:

  • Stylize ID with two capital letters. This is called out in Google's Golang style guide here
  • The "ID parts" data type seems a little off, e.g. conceptually it is a reference to an AWS Bedrock model. (Which may or may not contain an ARN for provisioned capacity.) So maybe we can make that field *string to make that super-clear?

Anyways, nothing structural. Just a lot of minor nitpicks. (And 🤞 we can make this problem go away entirely by having a more sophisticated way to refer to an LLM model in our config, in RFC 950.)


if parsedModelId.ProvisionedCapacity != "" {
if stream {
apiURL.RawPath = fmt.Sprintf("/model/%s/invoke-with-response-stream", url.QueryEscape(parsedModelId.ProvisionedCapacity))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling url.QueryEscape here is really surprising. IIRC, this is a quirk of the AWS API, and something we have to do. So perhaps that's subtle enough that we should call out with a quick comment? e.g.

// We need to Query escape the provisioned capacity ARN, since
// otherwise the AWS API <does thing that we don't like>.

@@ -853,9 +853,9 @@ func GetCompletionsConfig(siteConfig schema.SiteConfiguration) (c *conftypes.Com
}

// Make sure models are always treated case-insensitive.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we update this comment to explain why we need to call CasefoldBedrockModelID and not just the "more obvious" strings.ToLower? e.g.

// Make sure models are always treated case-insensitive, however
// if the model provides a provisioned capacity ARN that needs
// to be kept as-is. (Since ARNs are case-sensitive.)

@@ -1197,6 +1198,8 @@ func defaultMaxPromptTokens(provider conftypes.CompletionsProviderName, model st
}

func anthropicDefaultMaxPromptTokens(model string) int {
//TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ultra-pedantic nit pick around the syntax of a comment.

Suggested change
//TODO: this doesn't nearly cover all the ways that token size can be specified. https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html
// TODO: this doesn't nearly cover all the ways that token size can be specified.
// See: https://docs.aws.amazon.com/bedrock/latest/userguide/model-ids.html

@@ -19,6 +20,13 @@ func init() {
conf.ContributeValidator(embeddingsConfigValidator)
}

const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case you were unaware, in Golang you can use the %q format specifier to put double quotes around a string. So this arguably could be cleaned up by using that instead of \"...\".

fmt.Sprintf(
    "completions.%s is invalid. Must be formatted like %q. For example %q.",
    rawValue,
    "model_id/provisioned_capacity_arn",
    "anthropic.claude-instant-v1/" + rawValue)

@@ -19,6 +20,13 @@ func init() {
conf.ContributeValidator(embeddingsConfigValidator)
}

const bedrockArnMessageTemplate = "completions.%s is invalid. Provisioned Capacity IDs must be formatted like \"model_id/provisioned_capacity_arn\".\nFor example \"anthropic.claude-instant-v1/%s\""

type modelId struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like throughout the PR, perhaps naming this modelID is the more idiomatic choice.

Also, because Golang is awesome like that, you can literally put a type definition within a function. So rather than introducing a package-level symbol here, you can define modelID inside the completionsConfigValidator function.

// Bedrock Model IDs can be in one of two forms:
// - A static model ID, e.g. "anthropic.claude-v2".
// - A model ID and ARN for provisioned capacity, e.g.
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"

Check warning

Code scanning / Semgrep OSS

Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning

An AWS ARN has been detected. Please investigate to ensure it belongs to the appropriate customer and verify its validity.
Model string
// If the configuration is using provisioned capacity, this will
// contain the ARN of the model to use for making API calls.
// e.g. "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"

Check warning

Code scanning / Semgrep OSS

Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning

An AWS ARN has been detected. Please investigate to ensure it belongs to the appropriate customer and verify its validity.
@@ -30,6 +33,35 @@
problems = append(problems, "'completions.enabled' has been superceded by 'cody.enabled', please migrate to the new configuration.")
}

// Check for bedrock Provisioned Capacity ARNs which should instead be
// formatted like:
// "anthropic.claude-v2/arn:aws:bedrock:us-west-2:012345678901:provisioned-model/xxxxxxxx"

Check warning

Code scanning / Semgrep OSS

Semgrep Finding: security-semgrep-rules.semgrep-rules.secret.match-aws-arn Warning

An AWS ARN has been detected. Please investigate to ensure it belongs to the appropriate customer and verify its validity.
@RXminuS RXminuS merged commit 6a7666c into main May 20, 2024
12 checks passed
@RXminuS RXminuS deleted the rnauta/fix/p-arn-identifier-override branch May 20, 2024 06:55
@RXminuS RXminuS restored the rnauta/fix/p-arn-identifier-override branch May 20, 2024 09:43
@RXminuS RXminuS deleted the rnauta/fix/p-arn-identifier-override branch May 20, 2024 09:43
@RXminuS RXminuS added the backport 5.4.0 label used to backport PRs to the 5.4.0 release branch label May 20, 2024
@sourcegraph-release-bot
Copy link
Collaborator

The backport to 5.4.0 failed at https://github.com/sourcegraph/sourcegraph/actions/runs/9158821256:

The process '/usr/bin/git' failed with exit code 1

To backport this PR manually, you can either:

Via the sg tool

Use the sg backport command to backport your commit to the release branch.

sg backport -r 5.4.0 -p 62720
Via your terminal

To backport manually, run these commands in your terminal:

# Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add .worktrees/backport-5.4.0 5.4.0
# Navigate to the new working tree
cd .worktrees/backport-5.4.0
# Create a new branch
git switch --create backport-62720-to-5.4.0
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 6a7666c42cf349a6b702b8c72c9cc2c4e8421ad7
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0

If you encouter conflict, first resolve the conflict and stage all files, then run the commands below:

git cherry-pick --continue
# Push it to GitHub
git push --set-upstream origin backport-62720-to-5.4.0
# Go back to the original working tree
cd ../..
# Delete the working tree
git worktree remove .worktrees/backport-5.4.0
  • Follow above instructions to backport the commit.
  • Create a pull request where the base branch is 5.4.0 and the compare/head branch is backport-62720-to-5.4.0., click here to create the pull request.

Once the pull request has been created, please ensure the following:

  • Make sure to tag @sourcegraph/release in the pull request description.

  • kindly remove the release-blocker from this pull request.

@sourcegraph-release-bot sourcegraph-release-bot added backports release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases failed-backport-to-5.4.0 labels May 20, 2024
RXminuS added a commit that referenced this pull request May 20, 2024
* Improve Bedrock ARN handling

* Fix up PR comments

(cherry picked from commit 6a7666c)
craigfurman pushed a commit that referenced this pull request May 20, 2024
Client Compatible Bedrock ARN handling (#62720)

* Improve Bedrock ARN handling

* Fix up PR comments

(cherry picked from commit 6a7666c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 5.4.0 label used to backport PRs to the 5.4.0 release branch backports cla-signed failed-backport-to-5.4.0 p0 release-blocker Prevents us from releasing: https://about.sourcegraph.com/handbook/engineering/releases
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants