Skip to content

Rework the instance_name config api #1649

@aaronmondal

Description

@aaronmondal
Contributor

Somewhat of a follow-up to #834.

Our current configuration looks like this:

    "services": {
      "cas": {
        "main": {
          "cas_store": "WORKER_FAST_SLOW_STORE"
        }
      },
      "ac": {
        "main": {
          "ac_store": "AC_MAIN_STORE"
        }
      },
      "execution": {
        "main": {
          "cas_store": "WORKER_FAST_SLOW_STORE",
          "scheduler": "MAIN_SCHEDULER"
        }
      },
      "capabilities": {
        "main": {
          "remote_execution": {
            "scheduler": "MAIN_SCHEDULER"
          }
        }
      },
      "bytestream": {
        "cas_stores": {
          "main": "WORKER_FAST_SLOW_STORE"
        }
      }

The "main" here refers to the "instance_name" as per https://github.com/bazelbuild/remote-apis/blob/2721568dea746d81a98a116ee222c8da50bbcf5c/build/bazel/remote/execution/v2/remote_execution.proto#L1405.

There are multiple issues with this approach:

  • This has repeatedly been a source of confusion for users as it's not immediately clear that this is the instance name without consulting additional documentation.
  • It's a dynamic key, which is incompatible with how K8s wants CRDs to look like. In other words, you can't create correct yaml with this.
  • The default should be no instance name. This looks ugly as it requires an empty string key:
"cas": {
  "": {
    "cas_store": "WORKER_FAST_SLOW_STORE"
  }
},

The initially mentioned discussion was about a similar issue. In the same spirit, it seems better to use something along these lines:

    "services": {
      "cas": {
        "instance_name": "main",
        "cas_store": "WORKER_FAST_SLOW_STORE"
      },
      "ac": {
        "instance_name": "main",
        "ac_store": "AC_MAIN_STORE"
      },
      "execution": {
        "instance_name": "main",
        "cas_store": "WORKER_FAST_SLOW_STORE",
        "scheduler": "MAIN_SCHEDULER"
      },
      "capabilities": {
        "instance_name": "main",
        "remote_execution": {
          "scheduler": "MAIN_SCHEDULER"
        }
      },
      "bytestream": {
        "cas_stores": [
        {
          "instance_name": "main",
          "ref": "WORKER_FAST_SLOW_STORE"
        }]
      }

This solves all above mentioned issues because

  • It's self-documenting
  • We're no longer using a dynamic key for the instance name.
  • If we make the "instance_name" field optional an default to an empty string if not set users can omit it entirely and we get this:
"cas": {
  "cas_store": "WORKER_FAST_SLOW_STORE"
},

Activity

MarcusSorealheis

MarcusSorealheis commented on Apr 1, 2025

@MarcusSorealheis
Collaborator

@aaronmondal First of all, I totally agree. Lingering technical debt we need to get rid of ahead of the v1.0.0 release. Why in the example do you make the value of instance_name the same for each instance? What does it mean to share a name? That is unclear here.

barrbrain

barrbrain commented on Apr 7, 2025

@barrbrain
Contributor

Adding to the sketch for clarity, something like the following would be consistent with established patterns:

    "cas": [
      {
        "instance_name": "main",
        "cas_store": "WORKER_FAST_SLOW_STORE"
      },
      {
        "instance_name": "sub",
        "cas_store": "SUB_WORKER_FAST_SLOW_STORE"
      }
    ],
added a commit that references this issue on May 27, 2025
8643700
added a commit that references this issue on Jun 3, 2025
67500dd
linked a pull request that will close this issueMove Bytestream to array config #1951on Oct 7, 2025
palfrey

palfrey commented on Oct 7, 2025

@palfrey
Member

#1951 finished this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

    Participants

    @palfrey@barrbrain@MarcusSorealheis@jhpratt@aaronmondal

    Issue actions

      Rework the instance_name config api · Issue #1649 · TraceMachina/nativelink