Skip to content

Responses API previous_response_id returns not_found_error — storage appears unimplemented on inference.do-ai.run #96

@anisen943

Description

@anisen943

Summary

The serverless inference endpoint at https://inference.do-ai.run/v1/responses accepts the OpenAI Responses API request/response shape (including the store, background, and previous_response_id fields), but the underlying response storage appears to be unimplemented:

  • store: true is silently accepted (no error, HTTP 200, valid resp_… id returned)
  • The returned id cannot be retrieved via GET /v1/responses/{id} (endpoint is not routed — returns a DigitalOcean "Maintenance" HTML page)
  • Passing that id as previous_response_id on a subsequent POST /v1/responses returns not_found_error

The SDK docs page for responses.create describes previous_response_id as "Previous response ID (for multi-turn conversations)" — so the documentation contract suggests this should work end-to-end.

Tested 2026-05-20 against model: kimi-k2.6.

Reproduction

# 1. Create a response with store:true
RESP1=$(curl -sS -X POST https://inference.do-ai.run/v1/responses \
  -H "Authorization: Bearer $DO_MODEL_ACCESS_KEY" \
  -H 'Content-Type: application/json' \
  -d '{"model":"kimi-k2.6","input":"Remember: secret word is purple-marmot-42. Reply OK.","store":true}')

RESP1_ID=$(echo "$RESP1" | jq -r '.id')
echo "Got id: $RESP1_ID"
# → e.g. resp_a08a095f73d4b840  (HTTP 200, full payload returned)

# 2. Try to retrieve it
curl -sS -w '\nHTTP_CODE=%{http_code}\n' \
  "https://inference.do-ai.run/v1/responses/$RESP1_ID" \
  -H "Authorization: Bearer $DO_MODEL_ACCESS_KEY" | head -5
# → <!DOCTYPE html>
#   <title>DigitalOcean - Maintenance</title>
#   ...

# 3. Try to chain it
curl -sS -X POST https://inference.do-ai.run/v1/responses \
  -H "Authorization: Bearer $DO_MODEL_ACCESS_KEY" \
  -H 'Content-Type: application/json' \
  -d "{\"model\":\"kimi-k2.6\",\"input\":\"What was the secret word I gave you?\",\"previous_response_id\":\"$RESP1_ID\"}"
# → {"error":{"code":null,"message":"Response with id 'resp_a08a095f73d4b840' not found.",
#    "param":null,"type":"not_found_error"}, ...}

What the create-response payload returns

The POST response includes all the OpenAI Responses API fields, suggesting the request schema is generated from the OpenAI spec but the actual storage path isn't wired:

keys = [background, completed_at, created_at, error, frequency_penalty, id,
        incomplete_details, instructions, max_output_tokens, max_tool_calls,
        metadata, model, object, output, parallel_tool_calls, presence_penalty,
        previous_response_id, prompt_cache_key, reasoning, safety_identifier,
        service_tier, status, store, temperature, text, tool_choice, tools,
        top_logprobs, top_p, truncation, usage]

What I'd expect

One of:

  1. Implement the storagestore: true actually persists the response so it can be retrieved via GET /v1/responses/{id} and referenced via previous_response_id. This is the contract the SDK docs imply.
  2. Document the gap clearly — call out in the docs that previous_response_id / store / GET /v1/responses/{id} are not yet supported on serverless inference, and that callers should manage conversation state client-side by including the full message history each turn. Ideally store: true should return an error or warning rather than silently accepting it.

Impact

I was planning to chain multi-pass agentic verification via previous_response_id based on the SDK docs. Because storage doesn't actually work, the only option is to keep conversation state client-side — would have been useful to know upfront.

Environment

  • Endpoint: https://inference.do-ai.run/v1/responses
  • Model: kimi-k2.6 (also reproduces on kimi-k2.5 — same response shape)
  • Auth: DO_MODEL_ACCESS_KEY (model-access key from Gradient → Serverless Inference)
  • Date observed: 2026-05-20

Happy to provide a saved trace or run additional probes if useful.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions