Add batch inference pydo and dots examples #1168
| @@ -0,0 +1,16 @@ | |||
| lang: Python | |||
Blocker — request body doesn't match batch_create_request.yml.
input_file_id= should be file_id= (line 8 of the example).
Missing required provider (e.g. "openai").
Missing required request_id — it's the idempotency key. Add import uuid and pass request_id=str(uuid.uuid4()).
Suggested:
batch = client.batches.create(
    body={
        "file_id": os.environ["BATCH_INPUT_FILE_ID"],
        "provider": "openai",
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",
        "request_id": str(uuid.uuid4()),
    }
)
print("batch_id:", batch.get("batch_id"))
Also batch.get("id") → batch.get("batch_id") per batch.yml:12.
| @@ -0,0 +1,44 @@ | |||
| lang: Python | |||
Blocker — wrong endpoint and wrong response shape.
The spec endpoint is POST /v1/batches/files, which returns { file_id, upload_url, expires_at } per batch_file_create_response.yml. The example instead calls client.files.create(file=input_path, purpose="batch") (OpenAI Files-style: send the bytes + a purpose) and reads uploaded.filename / uploaded.bytes — none of those exist on this response, and purpose isn't on the request schema.
Mirror the dots version: call the batch-files create method with file_name=... and print file_id / upload_url. The actual JSONL bytes belong in inference_upload_batch_file.yml, not here.
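A minimal sketch of the reserve step, assuming only the response shape quoted above ({ file_id, upload_url, expires_at }). The `post_json` callable is a hypothetical stand-in for whichever pydo method wraps POST /v1/batches/files; its name and signature are not taken from the SDK.

```python
from typing import Callable

# Hypothetical stand-in for the SDK call behind POST /v1/batches/files.
# It takes a path and a JSON body and returns the decoded JSON response.
PostJson = Callable[[str, dict], dict]


def reserve_batch_file(post_json: PostJson, file_name: str) -> dict:
    """Reserve an upload slot. No file bytes are sent in this step."""
    resp = post_json("/v1/batches/files", {"file_name": file_name})

    # Fail loudly if the response does not have the fields the spec names,
    # instead of printing None for attributes that do not exist.
    missing = [key for key in ("file_id", "upload_url") if key not in resp]
    if missing:
        raise KeyError(f"batch file response is missing {missing}")

    return {
        "file_id": resp["file_id"],
        "upload_url": resp["upload_url"],
        "expires_at": resp.get("expires_at"),
    }
```

Keeping the transport behind a callable means the example documents the request and response shape without committing to a method name the SDK may not expose.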
| @@ -0,0 +1,30 @@ | |||
| lang: Python | |||
Misleading lead comment. Lines 1–4 claim client.files.create() "performs both steps for you, prefer it" — that contradicts your create_batch_file example, which only reserves the intent. Drop the comment or rewrite it to say "step 1 reserves file_id+upload_url (see create_batch_file); this example PUTs the bytes."
The PUT logic itself looks fine. Minor: avoid printing upload_url or anything derived from it.
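For reference, a sketch of step 2 on its own using only the standard library. It assumes upload_url accepts a plain HTTP PUT of the JSONL bytes; whether the URL requires specific headers (for example a Content-Type) is not stated in this thread, so `headers` is left as an optional, hypothetical parameter.

```python
import urllib.request
from pathlib import Path
from typing import Optional


def build_upload_request(
    upload_url: str, payload: bytes, headers: Optional[dict] = None
) -> urllib.request.Request:
    """Build (but do not send) the PUT request for the reserved upload URL."""
    return urllib.request.Request(
        upload_url, data=payload, headers=headers or {}, method="PUT"
    )


def upload_batch_file(upload_url: str, input_path: str, timeout: float = 60.0) -> int:
    """PUT the JSONL bytes to upload_url and return the HTTP status code."""
    payload = Path(input_path).read_bytes()
    request = build_upload_request(upload_url, payload)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        status = response.status
    # Log only size and status, never the URL or anything derived from it.
    print(f"uploaded {len(payload)} bytes, HTTP {status}")
    return status
```

Splitting the request construction from the send keeps the snippet focused on one endpoint and makes the PUT shape easy to check without network access.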
| client = Client(token=os.environ.get("DIGITALOCEAN_TOKEN")) | ||
| batch = client.batches.retrieve(os.environ["BATCH_ID"]) | ||
Blocker — batch.get("id") is always None. Per batch.yml, the field is batch_id. Change to batch.get("batch_id").
| @@ -0,0 +1,25 @@ | |||
| lang: Python | |||
Blocker — wrong field name. Line reads links["output_file_id"], but batch_results_response.yml returns output_file_url (a short-lived presigned URL). The endpoint does not return an output file ID.
The follow-up client.files.content(...) call also doesn't compose: you GET the presigned URL with requests.get, you don't pass it through the SDK. Rewrite as:
from pathlib import Path

import requests

links = client.batches.results.retrieve(batch_id)
if not links.get("result_available"):
    print("results not ready yet")
    raise SystemExit(0)
# output_file_url is presigned, so fetch it directly rather than through the SDK.
resp = requests.get(links["output_file_url"], timeout=60)
resp.raise_for_status()
Path("batch_output.jsonl").write_bytes(resp.content)
| resp = client.batches.list(limit=20) | ||
| for b in resp.get("data") or []: | ||
| print(f"{b.get('id'):40} {b.get('status'):12} {b.get('created_at')}") |
Field name. Per batch.yml, use b.get('batch_id'), not b.get('id'). Otherwise the iteration shape (resp.get("data"), has_more, last_id) matches batch_list_response.yml.
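Note that the quoted f-string does more than print a blank column: formatting None with a width spec (`f"{None:40}"`) raises TypeError, so the loop fails on the first row while the key is wrong. A sketch of the iteration against the { data, has_more, last_id } shape follows. The `list_page` callable is a hypothetical wrapper around client.batches.list; how the cursor is passed to the SDK is an assumption and is deliberately left inside that callable.

```python
from typing import Callable, Iterator, Optional

# Hypothetical wrapper around the list call: takes a cursor (None for the
# first page) and returns one page shaped like batch_list_response.yml.
ListPage = Callable[[Optional[str]], dict]


def iter_batches(list_page: ListPage) -> Iterator[dict]:
    """Yield every batch across pages, following has_more / last_id."""
    cursor = None
    while True:
        page = list_page(cursor)
        data = page.get("data") or []
        yield from data
        cursor = page.get("last_id")
        # Stop on the last page, on an empty page, or if no cursor came back.
        if not page.get("has_more") or not data or cursor is None:
            return


def format_row(batch: dict) -> str:
    """Format one batch as a fixed-width row."""
    # str() guards against None, which cannot take a width spec.
    batch_id = str(batch.get("batch_id"))
    status = str(batch.get("status"))
    return f"{batch_id:40} {status:12} {batch.get('created_at')}"
```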
| @@ -0,0 +1,13 @@ | |||
| lang: Python | |||
Two blockers.
result.get("id") → result.get("batch_id").
result.get("cancel_requested_at") doesn't exist on batch.yml. Use cancelled_at (or print status only — the cancel response is the full batch and the user mostly cares that status is cancelling / cancelled).
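One way the cancel example could report its result, as a sketch: it assumes the cancel response is the full batch object with batch_id, status, and an optional cancelled_at, as described above. The helper name is illustrative only.

```python
def summarize_cancel(result: dict) -> str:
    """Build a one-line summary of a cancel response."""
    # batch_id is read with [] so a wrong field name fails loudly.
    line = f"batch_id: {result['batch_id']} status: {result.get('status')}"
    cancelled_at = result.get("cancelled_at")
    # cancelled_at may be absent while the batch is still cancelling.
    if cancelled_at is not None:
        line += f" cancelled_at: {cancelled_at}"
    return line
```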
| request_id: randomUUID(), | ||
| }); | ||
| console.log("batch_id:", batch.batch_id ?? batch.id); |
Field name. batch.batch_id ?? batch.id — drop the ?? batch.id; per spec there's no id. Just batch.batch_id.
Otherwise the request body matches the schema.
| @@ -0,0 +1,14 @@ | |||
| lang: JavaScript | |||
Looks right against batch_file_create_response.yml. One nit: client.files.create(...) reads like OpenAI-Files; if the SDK actually exposes this as client.batches.files.create(...) (the URL is /v1/batches/files), prefer that name for clarity.
| @@ -0,0 +1,32 @@ | |||
| lang: JavaScript | |||
Combines step 1 (reserve intent) and step 2 (PUT bytes) into one snippet. That's fine but it duplicates create_batch_file. Consider trimming step 1 here so each example documents one endpoint, matching the curl pair.
| const batch = await client.batches.retrieve(process.env.BATCH_ID); | ||
| console.log("batch_id: ", batch.id); |
Field name. batch.id → batch.batch_id.
| // client.files.content resolves the result envelope and follows the | ||
| // presigned URL for you, returning the raw fetch Response. | ||
| const resp = await client.files.content(batchId); |
Likely wrong API call. client.files.content(batchId) passes a batch id to a files helper. The endpoint GET /v1/batches/{batch_id}/results returns presigned URLs in batch_results_response.yml; you then fetch(output_file_url). Should be:
const links = await client.batches.results.retrieve(batchId);
if (!links.result_available) { console.log("not ready"); return; }
const resp = await fetch(links.output_file_url);
| @@ -0,0 +1,17 @@ | |||
| lang: JavaScript | |||
Blocker — wrong pagination shape. Uses page.edges.map(e => e.node) (Relay-style), but batch_list_response.yml is { object, data, has_more, first_id, last_id }. Should be:
for (const b of page.data ?? []) {
  console.log(`${b.batch_id}\t${b.status}\t${b.created_at}`);
}
console.log("has_more:", page.has_more, "last_id:", page.last_id);
| @@ -0,0 +1,13 @@ | |||
| lang: JavaScript | |||
| source: |- | |||
Two issues.
result.id → result.batch_id.
result.cancel_requested_at doesn't exist; use cancelled_at or just print status.