Single metric for commands in History Service #4995

Merged: 3 commits merged on Oct 19, 2023

@stephanos stephanos commented Oct 17, 2023

What changed?

(1) Added a single new metric that tracks all commands coming into the History Service.
(2) Marked the previous per-command metrics as deprecated (they will be removed in a future release).
(3) Added a namespace tag to the metric.

Why?

Fixes #4628

How did you test it?

Used the debugger to inspect during a test:

[Screenshot taken 2023-10-17 at 4:38:25 PM]

Is there a better way?

Potential risks

Is hotfix candidate?

No

ActivityE2ELatency = NewTimerDef("activity_end_to_end_latency")
AckLevelUpdateCounter = NewCounterDef("ack_level_update")
AckLevelUpdateFailedCounter = NewCounterDef("ack_level_update_failed")
CommandCounter = NewCounterDef("command")
Contributor Author

This is the new metric. If you have a more specific name in mind I could use, let me know!

Comment on lines 219 to 222
handler.metricsHandler.Counter(metrics.CommandCounter.GetMetricName()).Record(
	1,
	metrics.NamespaceTag(handler.mutableState.GetExecutionInfo().NamespaceId),
	metrics.CommandTypeTag(command.GetCommandType().String()))
Contributor Author

@stephanos stephanos Oct 17, 2023

After some consideration, I decided to put this here, even though it will now include "unknown command types" (see the default case statement). That is actually a plus in my book, since we now get better visibility into unhandled command types (I don't expect many?).
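To make the placement concrete, here is a minimal sketch, assuming the counter is recorded before the type switch. `CommandType`, `seriesKey`, `commandCounter`, and `handleCommand` are hypothetical stand-ins for illustration, not the server's actual types.

```go
package main

import "fmt"

// CommandType is a hypothetical stand-in for the server's command type enum.
type CommandType string

const (
	ScheduleActivityTask CommandType = "ScheduleActivityTask"
	StartTimer           CommandType = "StartTimer"
)

// seriesKey identifies one time series of the single "command" counter.
type seriesKey struct {
	Namespace   string
	CommandType CommandType
}

// commandCounter is a toy in-memory counter keyed by tag values.
type commandCounter map[seriesKey]int64

// handleCommand records the metric first and only then dispatches on the
// command type. Because recording happens before the switch, command types
// that fall through to the default case are still counted.
func handleCommand(c commandCounter, namespace string, t CommandType) error {
	c[seriesKey{namespace, t}]++
	switch t {
	case ScheduleActivityTask, StartTimer:
		return nil // real handling elided
	default:
		return fmt.Errorf("unknown command type: %s", t)
	}
}

func main() {
	c := commandCounter{}
	_ = handleCommand(c, "default", StartTimer)
	_ = handleCommand(c, "default", StartTimer)
	err := handleCommand(c, "default", CommandType("Mystery"))
	fmt.Println(c[seriesKey{"default", StartTimer}], c[seriesKey{"default", "Mystery"}], err)
}
```

With this ordering, a spike in an unexpected command type shows up as its own series instead of being silently dropped.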


handler.metricsHandler.Counter(metrics.CommandCounter.GetMetricName()).Record(
1,
metrics.NamespaceTag(handler.mutableState.GetExecutionInfo().NamespaceId),
Member

should use namespace name, not namespace_id

Member

Also, handler.metricsHandler should be tagged with namespace when the handler is created.
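A rough sketch of what the two review suggestions could look like together, assuming a handler type that supports pre-tagging. `Handler`, `WithTags`, `Inc`, the tag keys, and the ID-to-name map are all hypothetical and only illustrate the idea; they are not the server's metrics API.

```go
package main

import "fmt"

// Tag is a key/value pair attached to a metric.
type Tag struct{ Key, Value string }

// Handler is a toy metrics handler. WithTags returns a child handler that
// attaches the given tags to every metric it records.
type Handler struct {
	tags   []Tag
	counts map[string]int64 // shared between parent and child handlers
}

func NewHandler() *Handler { return &Handler{counts: map[string]int64{}} }

func (h *Handler) WithTags(tags ...Tag) *Handler {
	merged := append(append([]Tag{}, h.tags...), tags...)
	return &Handler{tags: merged, counts: h.counts}
}

// Inc increments the series identified by the metric name plus all tags.
func (h *Handler) Inc(name string, extra ...Tag) {
	key := name
	for _, t := range append(append([]Tag{}, h.tags...), extra...) {
		key += fmt.Sprintf(",%s=%s", t.Key, t.Value)
	}
	h.counts[key]++
}

func main() {
	// Hypothetical ID-to-name lookup, resolved once when the handler is created.
	names := map[string]string{"7f3c": "payments"}

	root := NewHandler()
	h := root.WithTags(Tag{"namespace", names["7f3c"]})

	// Call sites only add the tags that vary per call.
	h.Inc("command", Tag{"commandType", "StartTimer"})
	fmt.Println(root.counts)
}
```

Tagging once at construction keeps each call site short and makes it harder to emit the namespace ID by accident where the name was intended.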

@stephanos stephanos marked this pull request as ready for review October 18, 2023 16:52
@stephanos stephanos requested a review from a team as a code owner October 18, 2023 16:52
@stephanos stephanos changed the title from "Track command type and namespace" to "Single metric for commands in History Service" Oct 18, 2023
@stephanos stephanos merged commit 068bd38 into main Oct 19, 2023
10 checks passed
@stephanos stephanos deleted the single-command-metric branch October 19, 2023 14:29
tdeebswihart added a commit to tdeebswihart/temporal-server that referenced this pull request Oct 24, 2023
commit b5825422f2a3215f7d4a7ed649204d89e6126b30
Author: Roey Berman <roey@temporal.io>
Date:   Tue Oct 24 13:56:49 2023 -0700

    Disable eager activities for incompatible versioned activities (#5030)

    Only enable eager activity start if dynamic config enables it and
    workflow doesn't use versioning.
    If a workflow _uses_ versioning, only allow eager activities which
    intend to use a compatible version since a
    worker is obviously compatible with itself and we are okay dispatching
    an eager task knowing that there may be a
    newer "default" compatible version.
    Note that if `UseCompatibleVersion` is false, it implies that the
    activity should run on the "default" version
    for the task queue.
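The rule described above can be read as a small decision function. This is only a sketch of the stated conditions; `eagerStartAllowed` and its parameter names are hypothetical and not taken from the server code.

```go
package main

import "fmt"

// eagerStartAllowed reports whether an activity may be started eagerly.
// All three inputs are hypothetical names for the conditions in the text:
//   - configEnabled: dynamic config enables eager activity start
//   - workflowUsesVersioning: the workflow uses worker versioning
//   - useCompatibleVersion: the activity intends to run on a compatible version
func eagerStartAllowed(configEnabled, workflowUsesVersioning, useCompatibleVersion bool) bool {
	if !configEnabled {
		return false
	}
	if !workflowUsesVersioning {
		return true
	}
	// Versioned workflow: the requesting worker is compatible with itself, so
	// eager dispatch is allowed only when a compatible version is requested.
	return useCompatibleVersion
}

func main() {
	fmt.Println(eagerStartAllowed(true, false, false)) // unversioned workflow
	fmt.Println(eagerStartAllowed(true, true, true))   // versioned, compatible
	fmt.Println(eagerStartAllowed(true, true, false))  // versioned, default version
}
```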

    ---------

    Co-authored-by: David Reiss <dnr@dnr.im>

commit dbdd24b6b4906308302d0801f0cb24633c81deb4
Author: Yichao Yang <yichao@temporal.io>
Date:   Tue Oct 24 13:06:43 2023 -0700

    Fix shard task key manager tests (#5031)

commit e9ef3091659bd32906eb97879a42481c1ee4e95c
Author: pdoerner <122412190+pdoerner@users.noreply.github.com>
Date:   Tue Oct 24 06:57:49 2023 -0700

    Set delete namespace activity concurrency limits (#5013)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Added new dynamic config map of options to set concurrency limits for
    the delete namespace activity worker:
    `worker.deleteNamespaceActivityLimitsConfig`

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    For finer-grained control over worker service resource usage

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Existing tests

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    None

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit 24557b348619e17799452daed53708225b9195ba
Author: pdoerner <122412190+pdoerner@users.noreply.github.com>
Date:   Mon Oct 23 16:42:08 2023 -0700

    Register system workflow activities with default worker (#5017)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Registering system workflow activities with the default worker.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    This is to prevent running workflows from getting stuck when upgrading
    to 1.23 and will be removed in 1.24.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Existing tests

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    None

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit aa4242673b16485d0df06c0de077980af83c14fe
Author: Yu Xia <yuhsia89@gmail.com>
Date:   Mon Oct 23 15:02:59 2023 -0700

    Fix create namespace replication task with one cluster (#5024)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Fix create namespace replication task with one cluster

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    When creating namespace with one cluster, we don't need to create
    replication task.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    New unit test

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit 5e6eadfbda3bfebabcfcadcbdf666a47e581ef1a
Author: Stephan Behnke <stephanos@users.noreply.github.com>
Date:   Mon Oct 23 09:36:57 2023 -0700

    Upgrade otel to v1.19 (#5016)

    <!-- Describe what has changed in this PR -->
    **What changed?**

    WISOTT (what it says on the tin)

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    Fixes https://github.com/temporalio/temporal/issues/4996

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    Automated tests (which actually include metric tests!)

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    Metrics are not reported / incorrectly reported?

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    No.

commit df4705d6488485ae12e27f4cb1c719b62c980304
Author: Yichao Yang <yichao@temporal.io>
Date:   Fri Oct 20 17:04:31 2023 -0700

    No shard lock on I/O: task key manager (#5008)

commit 9c5b1b8b873d7e1eec8d322d3e80fc6cdb2eb22c
Author: Yu Xia <yuhsia89@gmail.com>
Date:   Fri Oct 20 15:43:16 2023 -0700

    Update force replication with low priority context (#5010)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Update force replication with low priority context

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Update force replication with low priority context

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 64eb57d2487cd44657648b89b39733d484345309
Author: Rodrigo Zhou <rodrigozhou@users.noreply.github.com>
Date:   Fri Oct 20 16:50:12 2023 -0500

    Add ExecutionDuration and StateTransitionCount search attributes to SQL schema (#4961)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Add ExecutionDuration and StateTransitionCount to SQL visibility
    schemas.
    Add unit test to validate all system search attributes are mapped to SQL
    DB column name.
    Moved StateTransitionCount to visibility Close request only.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Match features with Elasticsearch.
    https://github.com/temporalio/temporal/issues/4942

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Run unit tests.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    No.

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No.

commit b9d1be7576718d0c02a15f5565aa716e56c82b2e
Author: Ajay Kemparaj <ajaykemparaj@gmail.com>
Date:   Fri Oct 20 14:43:09 2023 -0700

    golang.org/x/net: update to address CVE-2023-39325, CVE-2023-3978, CV… (#5011)

    golang.org/x/net: update to address CVE-2023-39325, CVE-2023-3978,
    CVE-2023-44487

    <!-- Describe what has changed in this PR -->
    **What changed?**
    golang.org/x/net was upgraded to the latest version

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    the dependency was upgraded to address the following cves
    CVE-2023-39325, CVE-2023-3978, CVE-2023-44487

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Built the binaries locally using make

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 0e12d3ad29602fba162f85e481dbcf48992e0629
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Fri Oct 20 14:34:00 2023 -0700

    Add EnableHistoryReplicationDLQV2 dc (#5012)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    This PR adds a dynamic config flag which controls the rollout of the
    persistence queue V2 backend for writing history replication tasks to
    the DLQ.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Our first priority of the queue v2 migration is to make it work for
    history replication tasks, so this PR adds the ability for us to roll
    that out.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    I added branches to our history replication DLQ integration test that
    turn on this dynamic config flag.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit f1599f0caf47bc3497536d14e07be1ed67486ea9
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Fri Oct 20 13:41:57 2023 -0700

    Add a replication.DLQWriter interface (#5009)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    I added a `DLQWriter` interface to the `replication` package, which
    takes a `WriteRequest` containing a `shard.Context`, a source cluster
    name, and replication task info, and writes a task to the DLQ. I also
    added a default implementation which uses our v1 queue supplied by the
    shard.

    I also refactored `replication/fx.go` in two ways:
    1. **Don't use `fx.Options`**: Its usage is discouraged in its
    documentation, and we can replace it by simply having a variadic call to
    `fx.Provide`, and adding lifecycle hooks for the scheduler in the
    provider instead of an `fx.Invoke` call.
    2. **Keep providers un-exported**: I didn't see a reason for having
    these exported, and I think it just bloats the footprint of this
    package. It's clearer when you see that it just provides a few
    constructors and a module.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    We want to move away from using
    `ExecutionManager.PutReplicationTaskToDLQ`; however, that method is
    called in several places in the `replication` package. We're going to
    fix this in stages:

    1. Hide all calls to `ExecutionManager` behind a common interface, so
    that it's really only called in one place
    2. Add an alternate implementation based on `persistence.QueueV2` via
    `queues.DLQWriter`
    3. Roll out the alternate implementation
    4. Remove the implementation that uses the `ExecutionManager`

    This PR is just step one.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    I added a unit test, and I also added an integration test for this in
    https://github.com/temporalio/temporal/pull/5003.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 4a449c3197ceb10ceb5260f26648f77b4744c97b
Author: Prathyush PV <prathyushpv@gmail.com>
Date:   Fri Oct 20 13:20:38 2023 -0700

    Adding RangeDelete and CreateQueue methods for QueueV2 SQL Implementation (#4980)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Adding RangeDeleteMessages() and CreateQueue() methods for SQL version
    of persistence.QueueV2, which supersedes the persistence.Queue
    interface.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    We plan on using this for the upcoming history task DLQ project.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Unit tests. Code has 100% unit test coverage.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    ---------

    Co-authored-by: Michael Snowden <michaelosnowden@gmail.com>

commit cb2139cefb6cda5e1a0a44dae70a0c644f8d6459
Author: Dan Davison <dan.davison@temporal.io>
Date:   Fri Oct 20 19:23:24 2023 +0100

    Edit comments (#4946)

    Edit some comments

commit 5809a24eea53ca660fe55a1fec64429e3546c39a
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Thu Oct 19 21:54:25 2023 -0700

    Add a history replication DLQ test (#5003)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    This change adds an integration test for the history replication DLQ. I
    also improved some error messages in replication that I found while
    writing this test.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    It's mainly to help rolling out the new DLQ backend implemented using
    QueueV2
    1. I'll need a similar test, so I can reuse a lot of this
    2. I'll be refactoring some code in the replication stack, so this test
    verifies that I don't cause any regressions.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    This test runs with streaming both enabled and disabled for replication,
    so it should work just fine once that's turned on by default.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    This test is pretty invasive, relying on several assumptions about what
    types are used by the replication stack. However, this means we don't
    have any sleeps or retries in the test.

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 889661af553dfb3714cbb3ce7af9ff1ade3a6796
Author: Haifeng He <haifeng.he@temporal.io>
Date:   Thu Oct 19 19:46:42 2023 -0700

    Fail VerifyReplicationTask if any task is not found on remote cluster (#5007)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    An optimization was added in
    https://github.com/temporalio/temporal/pull/4791 but we haven't found
    any real need to apply the optimization. Remove that optimization to
    reduce code complexity

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Simplify code logic

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Unit tests

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 5baf273eb3c0bc132a2a58b7a2b13ea1e1cf5752
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Thu Oct 19 14:43:27 2023 -0700

    Extract new queues.DLQWriter type (#5002)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    I extracted a `DLQWriter` from our `ExecutableDLQ` type in the `queues`
    package.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    For the replication DLQ, we don't want `ExecutableDLQ` because
    replication tasks track their state in a different way, and so they have
    an explicit `MarkPoisonPill` method. The `ExecutableDLQ` currently does
    2 things:

    1. Track whether this executable failed with a terminal error
    2. If it did, try sending the task to the DLQ whenever we execute again

    So, for replication tasks, I only need the second part. I went ahead and
    extracted that to its own type as some preparatory refactoring for the
    replication DLQ change.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    This is already fully tested by `executable_dlq_test.go`.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit f244dcde176886e54cbd24ca1e42a133ac19a4f6
Author: Will Duan <xinw.duan@gmail.com>
Date:   Thu Oct 19 14:00:06 2023 -0700

    Fix replication task batching dc config (#5004)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Fix replication task batching dc config

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Reverse the logic.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    test in local

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    n/a

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    n/a

commit 068bd38be15cc9af896680e4012f988086a33a78
Author: Stephan Behnke <stephanos@users.noreply.github.com>
Date:   Thu Oct 19 07:29:54 2023 -0700

    Single metric for commands in History Service (#4995)

    <!-- Describe what has changed in this PR -->
    **What changed?**

    (1) Added new single metric to track _all_ commands coming into the
    History Service.
    (2) Marked previous per-command-metrics as deprecated (will be removed
    in future release).
    (3) Added namespace tag to metric.

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    Fixes https://github.com/temporalio/temporal/issues/4628

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    Used the debugger to inspect during a test:

    ![Screenshot 2023-10-17 at 4 38 25
    PM](https://github.com/temporalio/temporal/assets/159852/0eb09b1a-27e3-4189-969e-c613fba4c67b)

    Is there a better way?

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    No

commit b8f21233dfae7b370056d4fb150f641402347b1c
Author: Haifeng He <haifeng.he@temporal.io>
Date:   Wed Oct 18 16:33:24 2023 -0700

    Reuse cached RemoteAdminClient in VerifyReplicationTasks (#4997)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    The current code creates a new AdminClient for every
    VerifyReplicationTasks activity invocation. Since there is no way to
    close a connection, this caused a leak. This change uses the cached
    version when creating the AdminClient.
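As an illustration of why reuse avoids the leak, here is a minimal sketch of a per-cluster client cache. `adminClient`, `clientCache`, and the cluster name are hypothetical; the actual change relies on the server's existing cached client rather than a new type like this.

```go
package main

import (
	"fmt"
	"sync"
)

// adminClient is a hypothetical stand-in for a remote admin client that
// holds a connection the caller cannot close.
type adminClient struct{ cluster string }

// clientCache hands out one client per remote cluster and reuses it.
type clientCache struct {
	mu      sync.Mutex
	clients map[string]*adminClient
	created int // number of clients (and so connections) ever created
}

func newClientCache() *clientCache {
	return &clientCache{clients: map[string]*adminClient{}}
}

// get returns the cached client for the cluster, creating it on first use.
func (c *clientCache) get(cluster string) *adminClient {
	c.mu.Lock()
	defer c.mu.Unlock()
	if cl, ok := c.clients[cluster]; ok {
		return cl
	}
	cl := &adminClient{cluster: cluster}
	c.clients[cluster] = cl
	c.created++
	return cl
}

func main() {
	cache := newClientCache()
	for i := 0; i < 1000; i++ {
		_ = cache.get("cluster-b") // one call per simulated activity invocation
	}
	fmt.Println("clients created:", cache.created)
}
```

The number of open connections is then bounded by the number of remote clusters rather than by the number of activity invocations.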

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    unit tests + local cluster tests

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 9b1b6f991a1d08d2f1ef181f96666ae25c4ec3da
Author: Yu Xia <yuhsia89@gmail.com>
Date:   Wed Oct 18 15:22:26 2023 -0700

    Add refresh task for close workflow (#4999)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Add refresh task for close workflow

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Add refresh task for close workflow

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    This is a subset of refresh all tasks

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit f65084a2d037f9100e3e71caf87c096834b02209
Author: Yichao Yang <yichao@temporal.io>
Date:   Wed Oct 18 11:33:20 2023 -0700

    Lock get current execution for API requests (#4970)

commit 47a900f1efa9b65ca7ef9f79956d7c2a7dba281b
Author: Stephan Behnke <stephanos@users.noreply.github.com>
Date:   Wed Oct 18 10:37:03 2023 -0700

    silence verbose `proto` target (#4988)

    <!-- Describe what has changed in this PR -->
    **What changed?**

    The make target `proto` is quieter now when it succeeds (like most of
    our other Makefile targets).

    ### After

    ```
    $ make proto
    Run buf linter...
    Run api-linter...
    Build proto files...
    Run goimports for proto files...
    Generate proto mocks...
    Update license headers for proto files...
    ```

    ### Before

    ```
    $ make proto
    Install proto submodule...
    git submodule update --init proto/api
    Run buf linter...
    Run api-linter...
    - file_path: temporal/server/api/token/v1/message.proto
      problems: []
    - file_path: temporal/server/api/metrics/v1/message.proto
      problems: []
    - file_path: temporal/server/api/taskqueue/v1/message.proto
      problems: []
    - file_path: temporal/server/api/cluster/v1/message.proto
      problems: []
    - file_path: temporal/server/api/schedule/v1/message.proto
      problems: []
    - file_path: temporal/server/api/update/v1/message.proto
      problems: []
    - file_path: temporal/server/api/matchingservice/v1/service.proto
      problems: []
    - file_path: temporal/server/api/matchingservice/v1/request_response.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/workflow_task_type.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/task.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/cluster.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/predicate.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/replication.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/workflow.proto
      problems: []
    - file_path: temporal/server/api/enums/v1/common.proto
      problems: []
    - file_path: temporal/server/api/archiver/v1/message.proto
      problems: []
    - file_path: temporal/server/api/namespace/v1/message.proto
      problems: []
    - file_path: temporal/server/api/cli/v1/message.proto
      problems: []
    - file_path: temporal/server/api/historyservice/v1/service.proto
      problems: []
    - file_path: temporal/server/api/historyservice/v1/request_response.proto
      problems: []
    - file_path: temporal/server/api/workflow/v1/message.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/task_queues.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/predicates.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/cluster_metadata.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/tasks.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/executions.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/history_tree.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/namespaces.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/queue_metadata.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/queues.proto
      problems: []
    - file_path: temporal/server/api/persistence/v1/workflow_mutable_state.proto
      problems: []
    - file_path: temporal/server/api/history/v1/message.proto
      problems: []
    - file_path: temporal/server/api/replication/v1/message.proto
      problems: []
    - file_path: temporal/server/api/checksum/v1/message.proto
      problems: []
    - file_path: temporal/server/api/errordetails/v1/message.proto
      problems: []
    - file_path: temporal/server/api/clock/v1/message.proto
      problems: []
    - file_path: temporal/server/api/adminservice/v1/service.proto
      problems: []
    - file_path: temporal/server/api/adminservice/v1/request_response.proto
      problems: []
    Build proto files...
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/adminservice/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/archiver/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/checksum/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/cli/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/clock/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/cluster/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/enums/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/errordetails/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/history/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/historyservice/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/matchingservice/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/metrics/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/namespace/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/persistence/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/replication/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/schedule/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/taskqueue/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/token/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/update/v1/*.proto
    protoc --fatal_warnings -I=proto/internal -I=proto/api -I=../golang/1.21.0/packages/pkg/mod/github.com/temporalio/gogo-protobuf@v1.22.1/protobuf --gogoslick_out=Mgoogle/protobuf/descriptor.proto=github.com/golang/protobuf/protoc-gen-go/descriptor,Mgoogle/protobuf/duration.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/wrappers.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/timestamp.proto=github.com/gogo/protobuf/types,Mgoogle/protobuf/empty.proto=github.com/gogo/protobuf/types,plugins=grpc,paths=source_relative:api ./proto/internal/temporal/server/api/workflow/v1/*.proto
    mv -f api/temporal/server/api/* api && rm -rf api/temporal
    Run goimports for proto files...
    Generate proto mocks...
    cd api && mockgen -copyright_file ../LICENSE -package matchingservicemock -source matchingservice/v1/service.pb.go -destination matchingservicemock/v1/service.pb.mock.go
    cd api && mockgen -copyright_file ../LICENSE -package historyservicemock -source historyservice/v1/service.pb.go -destination historyservicemock/v1/service.pb.mock.go
    cd api && mockgen -copyright_file ../LICENSE -package adminservicemock -source adminservice/v1/service.pb.go -destination adminservicemock/v1/service.pb.mock.go
    Update license headers for proto files...
    ```

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    Success output shouldn't be verbose; error output should be.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    - [x] happy case (no errors)
    - [x] linter errors: errors are still printed and target fails

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    Targets silently breaking?

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    No

commit 5ea0c14a0e416dd3652136f4718544d90d921526
Author: Yu Xia <yuhsia89@gmail.com>
Date:   Tue Oct 17 22:10:57 2023 -0700

    Fix misspelled word (#4993)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    apple -> applied

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit eda8e9ae23ff2b150a49b64ec034f8bba9e1cb0b
Author: Jacob LeGrone <jlegrone@users.noreply.github.com>
Date:   Wed Oct 18 01:07:05 2023 -0400

    Migrate temporalite & temporaltest packages (#4026)

    <!-- Describe what has changed in this PR -->
    **What changed?**

    Moved `temporalite` and `temporaltest` packages from Temporalite. This
    is an alternate version of
    https://github.com/temporalio/temporal/pull/4006.

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    Step 1 in the plan to deprecate the Temporalite repository:
    https://github.com/temporalio/temporalite/issues/202

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    `go test ./temporaltest`

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    A non-semver-breaking backwards compatibility guarantee is being
    included in the new packages which we would need to abide by. For now
    only the `temporaltest` package is public, so we are free to perform
    additional refactoring on `temporalite` and potentially consolidate it
    into a simple set of `temporal.ServerOption` functions.

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    No

    ---------

    Co-authored-by: Yimin Chen <yimin.chen@live.com>

commit 29b888d3db52ec12d3013dea1b3f5d79055b19eb
Author: Yichao Yang <yichao@temporal.io>
Date:   Tue Oct 17 21:41:02 2023 -0700

    Remove unused task_id field from recordTaskStarted request (#4949)

commit b1f5d5e0b6dc1e2c048f0942e885902ce1275512
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Tue Oct 17 19:06:16 2023 -0700

    Implement AdminService.PurgeDLQTasks (#4972)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    This PR adds the `PurgeDLQTasks` RPC to the history service. This RPC
    allows users to delete tasks from the new history task DLQ.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    We want this so that users can delete DLQ'd tasks once they've dealt
    with them (e.g. re-enqueued them, decided they're poison pills, etc.).

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    There's a large suite of tests that covers the following:
    1. All error branches (100% test coverage)
    2. When arguments are invalid, we return a non-retryable error
    3. When the service returns an "Unavailable" error, the workflow returns
    a retryable error
    4. We validate the request parameters
    5. Integration tests against the actual admin and history service

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 8499f2d07ce32da550266624ef9d211fd037d68f
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Tue Oct 17 17:52:15 2023 -0700

    Get rid of frontend.NamespaceHandler interface (#4994)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    I don't think this is used anywhere, and it seems to be an unnecessary
    abstraction.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    It confused me.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Build.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit b51b1c940dea4a5452cf8473b2d8ec62261b75d8
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Tue Oct 17 17:32:27 2023 -0700

    Create a task category registry object (#4953)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    1. I added a task category registry and plumbed it through our
    dependency graph, removing the global methods like `GetCategoryByID` and
    `NewCategory`.
    2. I also changed the signature to use an `int` instead of an `int32`
    for the category ID
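
    A minimal sketch of the registry idea, not the actual implementation:
    `category`, `categoryRegistry`, `newCategoryRegistry`, and the IDs and
    names used in `main` are made up for illustration. Only the
    `GetCategoryByID` name comes from the description above. The registry is
    built once and passed to whoever needs it, and IDs are plain `int`s that
    get converted from `int32` only at the proto boundary.

    ```go
    package main

    import "fmt"

    // category is a hypothetical stand-in for a task category.
    type category struct {
    	id   int
    	name string
    }

    // categoryRegistry holds categories by ID. It is an ordinary value that is
    // injected where needed, instead of package-level state filled in by init().
    type categoryRegistry struct {
    	byID map[int]category
    }

    // newCategoryRegistry builds a registry and rejects duplicate IDs.
    func newCategoryRegistry(cats ...category) (*categoryRegistry, error) {
    	byID := make(map[int]category, len(cats))
    	for _, c := range cats {
    		if prev, dup := byID[c.id]; dup {
    			return nil, fmt.Errorf("category ID %d already registered as %q", c.id, prev.name)
    		}
    		byID[c.id] = c
    	}
    	return &categoryRegistry{byID: byID}, nil
    }

    // GetCategoryByID looks up a category; ok is false for unknown IDs.
    func (r *categoryRegistry) GetCategoryByID(id int) (category, bool) {
    	c, ok := r.byID[id]
    	return c, ok
    }

    func main() {
    	registry, err := newCategoryRegistry(
    		category{id: 1, name: "transfer"},
    		category{id: 2, name: "timer"},
    	)
    	if err != nil {
    		panic(err)
    	}

    	// The int32 only shows up where a proto value enters the system.
    	var protoID int32 = 2
    	c, ok := registry.GetCategoryByID(int(protoID))
    	fmt.Println(c.name, ok)
    }
    ```

    Tests can then build a registry with exactly the categories they need
    rather than depending on what happens to be registered globally.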

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    1. To make it easier to add dynamic task categories without relying on
    init side effects calling `NewCategory`. Basically, the general
    rationale for why to avoid global state. However, in addition, I'll soon
    need a dynamic task category decoder registry for the DLQ. This isn't
    strictly a prerequisite, but I figured I'd fix it while I'm doing this.
    2. To avoid proliferation of the int32 type. The fact that the proto
    field type of the TaskCategory enum is an int32 is an implementation
    detail, and it should only be needed when we're converting to/from
    protos.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    The existing `server_test.go` verifies that the fx graph builds. I also
    ensured that the conditional archival task category registration still
    has unit tests.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 945203e8b80e992b2a6818566291fc018a81a786
Author: Shahab Tajik <shahab@temporal.io>
Date:   Tue Oct 17 17:14:57 2023 -0700

    Balance open polls across TQ partition (#4981)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    For each TQ, Matching client LB keeps track of open pollers per
    partition and selects a partition with the fewest open pollers.
    Forwarded polls are only counted once, for the original partition.
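
    Roughly, the selection can be sketched as below. This is an illustration
    under assumptions, not the Matching client's code: `pollerBalancer`,
    `newPollerBalancer`, and `acquire` are hypothetical names, and ties go to
    the lowest partition index here for simplicity. A poll is counted only
    where the partition is picked, so a poll that is later forwarded would
    not be counted a second time.

    ```go
    package main

    import (
    	"fmt"
    	"sync"
    )

    // pollerBalancer tracks open polls per partition of a single task queue.
    // Hypothetical type for illustration only.
    type pollerBalancer struct {
    	mu        sync.Mutex
    	openPolls []int // openPolls[i] = polls currently open on partition i
    }

    func newPollerBalancer(partitions int) *pollerBalancer {
    	return &pollerBalancer{openPolls: make([]int, partitions)}
    }

    // acquire picks the partition with the fewest open polls, counts the new
    // poll against it, and returns the partition plus a release function to
    // call when the poll completes. Calling release more than once is safe.
    func (b *pollerBalancer) acquire() (int, func()) {
    	b.mu.Lock()
    	defer b.mu.Unlock()

    	best := 0
    	for i, n := range b.openPolls {
    		if n < b.openPolls[best] {
    			best = i
    		}
    	}
    	b.openPolls[best]++

    	var once sync.Once
    	return best, func() {
    		once.Do(func() {
    			b.mu.Lock()
    			b.openPolls[best]--
    			b.mu.Unlock()
    		})
    	}
    }

    func main() {
    	b := newPollerBalancer(3)
    	for i := 0; i < 4; i++ {
    		partition, _ := b.acquire() // polls stay open in this demo
    		fmt.Println("poll", i, "-> partition", partition)
    	}
    }
    ```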

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    When a TQ has no task to dispatch (or cannot dispatch for another
    reason), it can become a black hole for pollers. This change prevents
    that from happening (assuming the number of pollers is greater than
    the number of partitions).

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Unit test
    Locally ran against multiple workloads

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    None known

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit 41f3eaae1f24344b5d48680c2f4b14e604801fbc
Author: Will Duan <xinw.duan@gmail.com>
Date:   Tue Oct 17 14:12:50 2023 -0700

    Improve batchedTask enqueue efficiency (#4861)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Release the lock before calling the underlying Execute() instead of
    after it finishes.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    The underlying Execute() func usually involves I/O with the DB and takes
    longer to execute.
    The lock in batchedTask prevents the batchedTask from accepting new
    tasks while it is executing. We already have `state` to prevent this, so
    we can unlock() before the underlying Execute() and let AddTask() fail
    faster.
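
    A minimal sketch of the locking pattern, assuming a simplified shape for
    the task: `batch`, `batchState`, `errBatchClosed`, and the `run` callback
    are hypothetical and only stand in for batchedTask and its underlying
    Execute().

    ```go
    package main

    import (
    	"errors"
    	"fmt"
    	"sync"
    )

    type batchState int

    const (
    	statePending   batchState = iota // still accepting tasks
    	stateExecuting                   // sealed: no more tasks accepted
    )

    var errBatchClosed = errors.New("batch is no longer accepting tasks")

    // batch is a hypothetical stand-in for batchedTask.
    type batch struct {
    	mu    sync.Mutex
    	state batchState
    	tasks []string
    }

    // AddTask appends a task unless the batch has already been sealed.
    func (b *batch) AddTask(t string) error {
    	b.mu.Lock()
    	defer b.mu.Unlock()
    	if b.state != statePending {
    		return errBatchClosed
    	}
    	b.tasks = append(b.tasks, t)
    	return nil
    }

    // Execute seals the batch under the lock, then releases the lock before
    // the slow call. The state check in AddTask keeps new tasks out, so
    // callers fail right away instead of waiting for run to return.
    func (b *batch) Execute(run func(tasks []string)) {
    	b.mu.Lock()
    	b.state = stateExecuting
    	tasks := b.tasks
    	b.mu.Unlock()

    	run(tasks) // e.g. DB I/O; the lock is not held here
    }

    func main() {
    	b := &batch{}
    	_ = b.AddTask("first")

    	b.Execute(func(tasks []string) {
    		// With the lock held across run, this call would block forever.
    		err := b.AddTask("late")
    		fmt.Println(len(tasks), err)
    	})
    }
    ```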

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    N/a

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    N/a

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit c39497c290ac88d5df7333f3cb4ad245bb7ab876
Author: pdoerner <122412190+pdoerner@users.noreply.github.com>
Date:   Tue Oct 17 13:35:58 2023 -0700

    Add history field to matchingservice.PollWorkflowTaskQueueResponse (#4968)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Added a new field for history to
    `matchingservice.PollWorkflowTaskQueueResponse` proto
    Added a new field for `NextPageToken` to
    `matchingservice.PollWorkflowTaskQueueResponse` proto

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    To support moving logic for PollWorkflowTaskQueue from frontend to
    matching/history.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Existing tests.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    None

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit 288954c3af9e8640b9f7a155cc48c0a65fd51968
Author: David Reiss <david@temporal.io>
Date:   Tue Oct 17 12:43:26 2023 -0700

    Fix flaky TestShardControllerFuzz (#4991)

    **What changed?**
    - Fix flaky test
    - Reduce some misleading log messages (stopping when stopping or stopped
    is okay)

    **Why?**
    Flaky test and misleading logs bad

    **How did you test it?**
    Ran > 500 times locally with -race

commit 453517020ddd3d9b6ee71e75c267ff3a17d71e2f
Author: Quinn Klassen <klassenq@gmail.com>
Date:   Tue Oct 17 10:46:33 2023 -0700

    Update Go SDK to v1.25.1 (#4955)

    Update Go SDK to v1.25.1

commit b4d80fbb5f5780de7006908e60b948dd294ab499
Author: David Reiss <david@temporal.io>
Date:   Tue Oct 17 10:45:17 2023 -0700

    Fix flaky TestAddTaskAfterStartFailure (#4989)

    **What changed?**
    Fix flaky test

    **Why?**
    Flaky test bad

    **How did you test it?**
    Adjusted timeout and ran many times

commit bbd677dce188ce1056b36c0ffb38b5db035c313d
Author: Wenquan Xing <wxing1292@users.noreply.github.com>
Date:   Mon Oct 16 16:25:16 2023 -0700

    Fix GenerateMigrationTasks behavior (#4987)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    * GenerateMigrationTasks API should also propagate pending activity info

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    For namespace migration, both history events and (updated) activity
    info should be replicated.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    N/A

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    N/A

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    N/A

commit ab52c46561d567ed8fab5ac523d1044b243843b9
Author: Eng Zer Jun <engzerjun@gmail.com>
Date:   Tue Oct 17 06:04:47 2023 +0800

    describeworkflow: remove redundant `len` check (#4983)

    **What changed?**

    From the Go specification (https://go.dev/ref/spec#For_range):

    > "3. If the map is nil, the number of iterations is 0."

    `len` returns 0 if the map is nil (https://pkg.go.dev/builtin#len).
    Therefore, checking `len(v) > 0` around a loop is unnecessary; ranging
    over a nil map does not panic. Example:
    https://go.dev/play/p/vRCsabx62Ef
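
    The same behavior as a small self-contained sketch (the helper
    `countEntries` is made up for illustration and is not part of the
    change):

    ```go
    package main

    import "fmt"

    // countEntries ranges over m without a len guard; m may be nil.
    func countEntries(m map[string]int) int {
    	n := 0
    	for range m { // zero iterations when m is nil or empty
    		n++
    	}
    	return n
    }

    func main() {
    	var nilMap map[string]int // declared but never initialized, so nil

    	fmt.Println(len(nilMap), countEntries(nilMap))
    	fmt.Println(countEntries(map[string]int{"a": 1, "b": 2}))
    }
    ```

    Note that this only covers reads: writing to a nil map still panics, so
    the guard is redundant only around `range` and `len`.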

    **Why?**

    **How did you test it?**
    `make test`

    **Potential risks**
    None.

    **Is hotfix candidate?**
    No.

    Signed-off-by: Eng Zer Jun <engzerjun@gmail.com>

commit 04eaab42dfe2bc23d77e7dc3039432d0cf9c27de
Author: Yichao Yang <yichao@temporal.io>
Date:   Mon Oct 16 13:01:33 2023 -0700

    No shard lock on I/O: task request tracker (#4952)

commit e775fb517aa81d95ecc48b81d085b9c70ca01332
Author: Yichao Yang <yichao@temporal.io>
Date:   Mon Oct 16 13:01:12 2023 -0700

    No shard lock on I/O: task key generator (#4951)

commit 65c7c69bed1edda24778cb7c25fdf8477716ca32
Author: Will Duan <xinw.duan@gmail.com>
Date:   Mon Oct 16 12:02:02 2023 -0700

    Implement the batching functionality on history event replication task (#4916)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Implement the batching functionality on history event replication task

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Improve performance of history event replication task

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    manually tested locally. Unit tests.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    N/a. Feature is behind a feature flag.

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    no

commit d04cfecb896ddd183c7da4383f731797d481b123
Author: pdoerner <122412190+pdoerner@users.noreply.github.com>
Date:   Mon Oct 16 11:57:51 2023 -0700

    Allow specifying a dedicated activity worker for system worker components (#4854)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Added support for specifying a separate activity worker for system-level
    worker service components.
    If not set, activities will use the same task queue name as workflow
    tasks.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    To give more fine-grained control over priorities and rate limits for
    different types of background processes.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Existing tests. No functional changes in this PR

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    None

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

    ---------

    Co-authored-by: David Reiss <dnr@dnr.im>

commit 1eeb92e33efae4cc7757caacdf0ddca6b4978f37
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Sat Oct 14 07:27:50 2023 -0700

    Implement tdbg dlq --dlq-version v2 read (#4899)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    I modified our `tdbg dlq` command to accept a `--dlq-version` flag, and
    I implemented the `read` subcommand for it.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    So that operators have a way to read messages for DLQ v2.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    There's nearly 100% test coverage. Unit tests cover config validation
    and subsystem errors, and there's a large integration test which covers
    the happy path.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 3220c4c42ce624776b4337db66aa070fe458ccde
Author: Stephan Behnke <stephanos@users.noreply.github.com>
Date:   Fri Oct 13 14:08:28 2023 -0700

    🧹 Update Workflow nits (#4964)

    <!-- Describe what has changed in this PR -->
    **What changed?**

    Exported/un-exported structs, fixed typos, removed unused code,
    re-ordered some test setup code.

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    Improve readability/cleanup.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    Automated tests. No actual behaviour was changed.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    N/A

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

    No

commit 76bfcbce8dd17eede42e81be94dcad451007c580
Author: Rodrigo Zhou <rodrigozhou@users.noreply.github.com>
Date:   Fri Oct 13 15:27:54 2023 -0500

    Collapse visibility tasks (#4893)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Collapse visibility tasks to minimize updates to visibility store.
    The list of tasks is collapsed into a single one: START < UPSERT < CLOSE
    < DELETE.
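
    As a rough sketch of the ordering, reading it as "the highest-ranked
    task in the list wins": `visibilityTaskKind` and `collapse` are
    hypothetical names, and the sketch assumes all tasks in the list belong
    to the same workflow.

    ```go
    package main

    import "fmt"

    // visibilityTaskKind is ordered so that a later kind supersedes an
    // earlier one: START < UPSERT < CLOSE < DELETE.
    type visibilityTaskKind int

    const (
    	kindStart visibilityTaskKind = iota
    	kindUpsert
    	kindClose
    	kindDelete
    )

    func (k visibilityTaskKind) String() string {
    	return [...]string{"START", "UPSERT", "CLOSE", "DELETE"}[k]
    }

    // collapse reduces a list of kinds to the single highest-ranked one.
    // ok is false when the list is empty.
    func collapse(kinds []visibilityTaskKind) (visibilityTaskKind, bool) {
    	if len(kinds) == 0 {
    		return 0, false
    	}
    	winner := kinds[0]
    	for _, k := range kinds[1:] {
    		if k > winner {
    			winner = k
    		}
    	}
    	return winner, true
    }

    func main() {
    	kinds := []visibilityTaskKind{kindStart, kindUpsert, kindClose, kindUpsert}
    	winner, ok := collapse(kinds)
    	fmt.Println(winner, ok)
    }
    ```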

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Minimize updates to visibility store.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Updated unit tests.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    Probably none: each visibility task writes the entire record into the
    store, so the latest task should contain all the info to keep visibility
    up to date.

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No.

commit de122051eea9ed7b6652c5a190dc25ad5aa15010
Author: Rodrigo Zhou <rodrigozhou@users.noreply.github.com>
Date:   Fri Oct 13 15:25:39 2023 -0500

    Ensure latest Go version in GH workflows (#4977)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Ensure latest Go version in GH workflows

    <!-- Tell your future self why have you made these changes -->
    **Why?**

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 0b75aa6afa276aeef792f0ff9be03875d4676b05
Author: David Reiss <david@temporal.io>
Date:   Fri Oct 13 12:54:27 2023 -0700

    Improve scheduler workflow unit test (#4962)

    **What changed?**
    The end of each test is signaled by time instead of iteration count,
    which was really confusing.

    **Why?**
    Easier to write and maintain tests.
    They now stop at exactly the right time and avoid spurious calls that
    would trigger panics.

    **How did you test it?**
    is tests

commit 34c52e771849bfbd0f6637f32965636e7d7c545e
Author: Roey Berman <roey@temporal.io>
Date:   Fri Oct 13 11:31:38 2023 -0700

    Change comment in update's dynamic config (#4976)

    **What changed?**

    Mention that workflow update is well tested.

    **Why?**

    Comment said it is not tested or ready for production use.

commit 052570c2b10f2739c9cee8c638929c254c0c3881
Author: David Reiss <david@temporal.io>
Date:   Fri Oct 13 10:10:52 2023 -0700

    Update Version in scheduler workflow (#4971)

    **What changed?**
    Enable changes in #4911

    **Why?**
    Actually fix bug

    **How did you test it?**
    Enabled tests from #4911

commit 5e87117e792575c1ae50f1d409947df10e825c03
Author: Gallyam Biktashev <gallyamb@gmail.com>
Date:   Fri Oct 13 21:47:20 2023 +0500

    Fix typo (#4973)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Typo in docs fixed (M**in**Time -> M**ax**Time)

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    To make documentation reflect code state

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    Tests are not necessary

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**
    -

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**
    No

commit 7362b541bd11b39a1e8f0561f927416a6fad1858
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Thu Oct 12 16:48:39 2023 -0700

    Add HistoryService.AddTasks API (#4963)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    This PR adds the `AddTasks` RPC to the history service. This RPC accepts
    a list of tasks for a given shard and adds them to the queue.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    We need this in order to re-enqueue tasks for the DLQ.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    There's 100% test coverage for all lines, which includes a lot of
    request validation (like tasks for a different shard than the request).
    There's also an integration test. In addition, I verified that the tasks
    are batched correctly when a request has tasks for multiple workflows.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit e2fd40083a461f8d0d3d79e829bb913033990d32
Author: David Reiss <david@temporal.io>
Date:   Thu Oct 12 13:39:26 2023 -0700

    Rename deadlock detector latency metrics (#4966)

    **What changed?**
    Rename Go constants and metric names for latency metrics collected by
    the deadlock detector.

    **Why?**
    There was some confusion between the two "shard lock latency" metrics.
    The other one is more useful since it's collected on every usage instead
    of only occasionally, and it covers both read and write locks. Renaming
    the deadlock detector ones makes it clearer where they come from.

    **How did you test it?**
    Just renames.

    **Potential risks**
    The exported metric names are changing, so anyone using them will have
    to change. But it's unlikely anyone is using them since they were never
    documented. We can make a note in release notes.

commit 7f21d05019f561062f74625d6bdd425f4a67137f
Author: Michael Snowden <MichaelSnowden@users.noreply.github.com>
Date:   Thu Oct 12 12:51:16 2023 -0700

    Extract ServiceProviderParamsCommon.GetCommonServiceOptions (#4965)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    1. There's a lot of duplicated code in `temporal/fx.go` where we convert
    the dependencies in `ServiceProviderParamsCommon` into `fx.Option`s. I
    simply extracted a method for this so that the duplication is avoided.
    2. I also increased the cyclomatic complexity threshold in our linter
    because it was complaining about the `ServerOptionsProvider` function,
    which is not complex in my opinion, just a little long.

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    We probably had the duplication originally because `fx.Decorate` didn't
    exist, so the customization that the frontend service does would've been
    impossible.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    There's the existing `server_test.go`, which verifies that the `fx`
    graph builds and runs.

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or require that a notification be
    sent to the broader community? (Yes/No) -->
    **Is hotfix candidate?**

commit 7f45bbba1a27ff3f8e3aa1ee6be0c1f88baf9ee4
Author: David Reiss <david@temporal.io>
Date:   Wed Oct 11 18:32:08 2023 -0700

    Set task queue kind in tests (#4956)

    **What changed?**
    Set task queue kind field in functional tests, and NormalName for sticky
    task queues.
    Small formatting cleanups in functional tests.

    **Why?**
    Tests should act like SDKs, which always set this field now.
    Stops `Unspecified task queue kind` warning.

    **How did you test it?**
    tests

commit e06451221c8f2e99dd653fcad6551a56b5ce4a9b
Author: Yu Xia <yuhsia89@gmail.com>
Date:   Wed Oct 11 10:36:58 2023 -0700

    Handle branch token update with long poll API (#4943)

    <!-- Describe what has changed in this PR -->
    **What changed?**
    Handle branch token update with long poll API

    <!-- Tell your future self why have you made these changes -->
    **Why?**
    Simply comparing branch tokens is not a reliable way to tell whether the
    branch has changed.

    <!-- How have you verified this change? Tested locally? Added a unit
    test? Checked in staging env? -->
    **How did you test it?**
    release testing

    <!-- Assuming the worst case, what can be broken when deploying this
    change to production? -->
    **Potential risks**

    <!-- Is this PR a hotfix candidate or re…
Development

Successfully merging this pull request may close these issues.

Metrics - allow "child_workflow_command" counter to be filtered by namespace