Improve Retry Logic in sqlcon for CockroachDB Transient Errors (SQLSTATE Classes 08 & 57) #860

@viragtripathi

Description

Preflight checklist

Ory Network Project

No response

Describe your problem

Hi Ory team,

Following up on our recent joint discussions and the ongoing integration between Ory and CockroachDB, we've observed transient SQL errors under load and during topology changes (e.g., rolling upgrades, node restarts). These typically fall in SQLSTATE classes 08 (connection exceptions) and 57 (operator intervention). While infrequent, they can cause requests to fail if they are not handled appropriately in application code.

Classifying these errors as retryable aligns with the intent of sqlcon.IsError and could be a clean enhancement to that function, extending it to cover these additional SQLSTATE classes.

Describe your ideal solution

I recommend operation-level retries for transient failures, and we’ve implemented a working example of this pattern in Go:
👉 cockroach-go-with-retry

Workarounds or alternatives

  • Extend sqlcon.IsError or create a dedicated helper to catch and classify retryable CockroachDB-specific SQL errors (notably SQLSTATE 08xxx and 57P01).

  • Ensure this utility is integrated where SQL errors are surfaced, particularly in services that rely on Ory's SQL abstraction for persistence.

Version

v0.0.675

Additional Context

No response

Metadata

Assignees

No one assigned

    Labels

    feat: New feature or request.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests