Description
Feature hasn't been suggested before.
- I have verified this feature I'm about to request hasn't been suggested before.
Describe the enhancement you want to request
Problem
Some custom providers' OpenAI-compatible APIs don't return usage information in their streaming responses. This causes two critical issues:
- Context usage not displayed: opencode cannot show token consumption to users
- Automatic compact not triggered: The compact mechanism relies on usage data to know when to trigger, so it fails entirely when usage is missing
Context
Users are integrating with various custom OpenAI-compatible providers that may not include usage metadata in their streaming response chunks. This is a common scenario for:
- Self-hosted language model services
- Custom proxy implementations
- Alternative LLM providers that don't follow the full OpenAI API specification
Without accurate usage tracking, users lose visibility into their token consumption and, more importantly, the automatic compact feature becomes non-functional, potentially leading to context overflow.
Request
Add a fallback mechanism to locally estimate token usage when the API response doesn't include usage information. This should:
- Apply primarily to streaming responses (where usage is most commonly missing)
- Maintain compatibility with providers that do return usage (prefer actual usage when available)
- Ensure the automatic compact mechanism continues to work even with estimated data (see the sketch after this list)
- Be reasonably accurate for practical purposes
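As a rough illustration (not opencode's actual internals), the resolved usage could carry an `estimated` flag so that display and auto-compact logic consume provider-reported and locally estimated numbers the same way. The `TokenUsage` shape, `shouldCompact` helper, and 90% threshold below are all hypothetical:

```ts
// Hypothetical shape of the resolved usage after the fallback is applied.
// `estimated` marks numbers that came from the local counter rather than
// from the provider's response.
interface TokenUsage {
  promptTokens: number
  completionTokens: number
  estimated: boolean
}

// The auto-compact check keeps working regardless of where the numbers
// came from. `contextLimit` and the 0.9 threshold are placeholders.
function shouldCompact(usage: TokenUsage, contextLimit: number): boolean {
  return usage.promptTokens + usage.completionTokens > contextLimit * 0.9
}
```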
Proposed Solution
- Implement a local token counter that tracks tokens as streaming chunks arrive (see the sketch after this list)
- When the API response's usage is empty or missing:
  - Use the local estimate as a fallback
  - Log a warning that the usage is estimated (not from the API)
- Keep using actual API usage when it's available (for accuracy)
- The estimation should handle both prompt tokens and completion tokens separately
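A minimal sketch of how the counter could work, assuming a hypothetical `StreamUsageEstimator` class and an injected `tokenize` function (tiktoken-backed or heuristic); none of these names come from the opencode codebase:

```ts
// Accumulates a local estimate as streaming chunks arrive and falls back
// to it only when the provider omits usage. All names are illustrative.
class StreamUsageEstimator {
  private promptTokens: number
  private completionTokens = 0

  constructor(promptText: string, private tokenize: (text: string) => number) {
    // Prompt tokens are estimated once, from the request we sent.
    this.promptTokens = tokenize(promptText)
  }

  // Called for each streamed delta of assistant output.
  onChunk(deltaText: string): void {
    this.completionTokens += this.tokenize(deltaText)
  }

  // Called once the stream ends; prefers the provider's usage when present.
  finish(apiUsage?: { prompt_tokens?: number; completion_tokens?: number }) {
    if (apiUsage?.prompt_tokens != null && apiUsage?.completion_tokens != null) {
      return {
        promptTokens: apiUsage.prompt_tokens,
        completionTokens: apiUsage.completion_tokens,
        estimated: false,
      }
    }
    console.warn("provider response had no usage; falling back to local estimate")
    return {
      promptTokens: this.promptTokens,
      completionTokens: this.completionTokens,
      estimated: true,
    }
  }
}
```

Tracking prompt and completion tokens separately keeps the displayed breakdown consistent with providers that do report usage.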
Additional Notes
Token estimation libraries such as tiktoken, or simple character-based heuristics, could be used for the fallback implementation. The estimate doesn't need to be perfect, just accurate enough for the automatic compact functionality to work reliably.
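For example, a character-based heuristic (roughly 4 characters per token for English text) is trivial to implement; the function below is only a sketch:

```ts
// Crude character-based fallback: roughly 4 characters per token.
// Good enough to drive the compact threshold, not for billing accuracy.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4)
}
```

A tokenizer-backed estimate (for example via tiktoken) would be more accurate, at the cost of an extra dependency.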