step.move_validate failures return HTTP 500 instead of 4xx #359

@intel352

Problem

When step.move_validate (or any validation step) fails due to an invalid player action, the pipeline returns HTTP 500 with an error like:

Error triggering workflow: workflow execution failed: pipeline "grpc_make_move" execution failed: step "validate" failed: step validate: invalid move: uno: card "5_green" doesn't match discard top "3_blue"

This is a client error (bad input), not a server error. The HTTP response should be 400 (Bad Request) or 422 (Unprocessable Entity), not 500.

Impact

  • BDD/Gherkin tests expecting 4xx for invalid moves get 500 instead
  • API clients can't distinguish between "your move is invalid" and "server crashed"
  • Breaks standard REST error handling conventions

Affected Steps

Any step that validates user input and returns an error:

  • step.move_validate — game move validation
  • step.action_validate — turn/permission gating
  • Any custom validator returning fmt.Errorf

Expected Behavior

Pipeline step failures caused by validation logic should result in HTTP 4xx responses. Pipeline step failures caused by infrastructure errors (DB connection, nil pointer) should remain 500.

Suggested Approaches

  1. Step-level error classification: Steps return typed errors (e.g. ValidationError vs generic error). The HTTP response handler maps ValidationError → 400.
  2. Step config flag: on_error: client_error in YAML config causes the pipeline to return 400 instead of 500 when that step fails.
  3. Pipeline-level error handler: error_handler: { validation_steps: [validate, gate], status: 400 } in pipeline config.

Reproduction

pipelines:
  grpc_make_move:
    steps:
      - name: validate
        type: step.move_validate
        config:
          validators: [custom]

When a custom validator returns an error, the HTTP response is 500.

Context

Discovered while writing BDD/Gherkin game-rule tests for workflow-cardgame (168 scenarios). 5 of 16 failing scenarios are due to this issue — the validation logic works correctly but returns the wrong HTTP status code.
