Description
When two or more HTTP requests are sent to quant-server simultaneously, all of the responses contain corrupted or interleaved text. The server accepts parallel connections but shares inference state without synchronization.
Steps to Reproduce
./build-metal/quant-server SmolLM2-1.7B-Instruct-Q8_0.gguf -p 8080
# Send two requests simultaneously
curl -s http://localhost:8080/v1/chat/completions \
-d '{"messages":[{"role":"user","content":"What is gravity?"}],"max_tokens":30}' &
curl -s http://localhost:8080/v1/chat/completions \
-d '{"messages":[{"role":"user","content":"What is Python?"}],"max_tokens":30}' &
wait
Actual Behavior
Both responses contain corrupted, mixed, or garbled output. Neither produces a coherent answer.
Expected Behavior
Either:
- Option A: Serialize requests (queue the second, process one at a time)
- Option B: Return `429 Too Many Requests` with a `Retry-After` header for the second request
- Option C: Support true concurrent inference with separate contexts
Impact
- Severity: P1 — Any multi-client scenario (web app, load balancer) will produce garbage
- No error or warning is returned — client receives corrupted data silently
- Chat UIs that allow rapid sequential messages will hit this
Suggested Fix
Short-term: Add a mutex in tq_server.c guarding the shared inference state, returning `429 Too Many Requests` when a request arrives while another is in flight.
Long-term: Support request queuing or multiple inference contexts.
Environment
- quant.cpp: latest main (49c6605)
- Model: SmolLM2-1.7B-Instruct-Q8_0.gguf
- OS: macOS 15 (Apple M3)
Reported by ClawTeam Claw-2 (Builder persona)