Labels
enhancement, tabular-analysis, semantic-kernel, performance
Description
Summary
The Tabular Semantic Kernel (SK) analysis pipeline can silently produce incomplete or token-overflowed responses when working with large Excel or CSV files because there is no mechanism to:
- Page through rows when a result set is large.
- Automatically drop heavy (wide-text) columns when the output would exceed the per-call token budget.
- Forward a caller-supplied column allowlist so the LLM can retrieve only the columns it needs.
- Guarantee that the final handoff text (the JSON blob stitched together from all SK tool results before it is appended to the conversation) fits within the message limit accepted by the chat synthesis call.
Problems Being Solved
| # | Problem | Observed symptom |
|---|---------|------------------|
| 1 | No pagination | Tool returns up to ~1,000 rows in a single call; large files silently truncate |
| 2 | No auto-trim | Wide free-text columns (notes, descriptions) bloat the output and crowd out row data |
| 3 | No column projection | The LLM always gets every column even when it only needs two or three |
| 4 | Low handoff cap | The 24,000-character ceiling regularly truncated analysis results mid-JSON before synthesis |
Proposed / Implemented Enhancements
- **Pagination**: add `start_row` / `max_rows` input parameters and `has_more` / `next_start_row` output fields to every analysis tool so the LLM can page through large result sets across multiple tool calls (see the pagination sketch after this list).
- **Auto-trim**: add `_auto_trim_df_for_output()`, which estimates the serialised JSON size from a 20-row sample and, when over budget, first drops the heaviest columns and then truncates rows (setting `has_more = True` so pagination picks up the rest). Auto-trim fires only when the caller did not supply `return_columns` (see the auto-trim sketch below).
- **`return_columns` forwarding**: thread the existing `return_columns` parameter through every analysis tool so column projection is honoured end-to-end.
- **Handoff limit increase**: raise `max_handoff_chars` from 24,000 to 100,000 characters, and add a WARNING log event when truncation still occurs so operators can detect the condition.
- **Elapsed timing**: log the wall-clock elapsed time for each SK analysis invocation to help tune model selection and row limits (the last sketch below covers both this and the handoff change).
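The contract these pagination parameters imply might look like the following minimal sketch. This is not the actual plugin code: `query_rows`, `DEFAULT_MAX_ROWS`, and the response shape are hypothetical; only `start_row`, `max_rows`, `return_columns`, `has_more`, and `next_start_row` come from this issue.

```python
import json

import pandas as pd

DEFAULT_MAX_ROWS = 100  # assumed default; the real limit lives in the plugin


def query_rows(df: pd.DataFrame,
               start_row: int = 0,
               max_rows: int = DEFAULT_MAX_ROWS,
               return_columns: list[str] | None = None) -> str:
    """Return one page of rows as JSON, plus pagination metadata.

    Hypothetical tool shape; the real analysis tools live in
    tabular_processing_plugin.py.
    """
    # Honour the caller-supplied column allowlist end-to-end.
    if return_columns:
        df = df[[c for c in return_columns if c in df.columns]]

    page = df.iloc[start_row:start_row + max_rows]
    has_more = start_row + max_rows < len(df)

    return json.dumps({
        "rows": page.to_dict(orient="records"),
        "has_more": has_more,
        # Where the LLM should resume on its next tool call.
        "next_start_row": start_row + max_rows if has_more else None,
    }, default=str)
```

Returning `next_start_row` explicitly, rather than leaving the LLM to compute it, reduces off-by-one paging errors across multi-call sequences.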
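The auto-trim heuristic could be sketched as below, under stated assumptions: the function name `_auto_trim_df_for_output()` and the 20-row sampling come from the issue, while the character budget, the per-column weighting, and the return shape are guesses. Per the issue, the caller would invoke this only when `return_columns` was not supplied.

```python
import json

import pandas as pd

MAX_OUTPUT_CHARS = 16_000  # assumed budget; the real constant is in the plugin


def _auto_trim_df_for_output(df: pd.DataFrame,
                             max_chars: int = MAX_OUTPUT_CHARS
                             ) -> tuple[pd.DataFrame, bool]:
    """Shrink df until its serialised JSON is estimated to fit the budget.

    Returns (trimmed_df, rows_truncated); rows_truncated=True tells the
    caller to set has_more so pagination picks up the remaining rows.
    """
    sample = df.head(20)
    n_sample = max(len(sample), 1)
    # Estimate the full serialised size from the 20-row sample.
    sample_chars = len(json.dumps(sample.to_dict(orient="records"),
                                  default=str))
    est_total = sample_chars / n_sample * len(df)
    if est_total <= max_chars:
        return df, False

    # First pass: drop the heaviest (widest-text) columns until the estimate
    # fits, always keeping at least one column.
    weights = {c: int(sample[c].astype(str).str.len().sum())
               for c in sample.columns}
    for col in sorted(weights, key=weights.get, reverse=True):
        if len(df.columns) <= 1:
            break
        df = df.drop(columns=[col])
        est_total -= weights[col] / n_sample * len(df)  # rough column share
        if est_total <= max_chars:
            return df, False

    # Second pass: truncate rows and signal has_more to the caller.
    per_row = max(est_total / max(len(df), 1), 1.0)
    keep = min(max(int(max_chars / per_row), 1), len(df))
    return df.head(keep), keep < len(df)
```

Dropping columns before rows matches the issue's stated priority: preserving as many rows as possible matters more than keeping wide free-text columns the LLM rarely needs.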
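Finally, the handoff-cap and timing changes in route_backend_chats.py might reduce to something like this sketch. The raise from 24,000 to 100,000 characters and the WARNING-on-truncation behaviour come from the issue; the helper names, logger wiring, and message wording are assumptions.

```python
import logging
import time

logger = logging.getLogger(__name__)

MAX_HANDOFF_CHARS = 100_000  # raised from 24_000 per this issue


def build_handoff_text(tool_results_json: str) -> str:
    """Clamp the stitched tool-result JSON to the synthesis message limit.

    Hypothetical helper name; the real cap lives in route_backend_chats.py.
    """
    if len(tool_results_json) > MAX_HANDOFF_CHARS:
        # Operators can alert on this event to detect residual truncation.
        logger.warning(
            "SK handoff truncated: %d chars exceeds cap of %d",
            len(tool_results_json), MAX_HANDOFF_CHARS,
        )
        tool_results_json = tool_results_json[:MAX_HANDOFF_CHARS]
    return tool_results_json


def timed_sk_invoke(invoke, *args, **kwargs):
    """Wrap an SK analysis call and log its wall-clock elapsed time."""
    start = time.perf_counter()
    try:
        return invoke(*args, **kwargs)
    finally:
        logger.info("SK analysis elapsed: %.2fs",
                    time.perf_counter() - start)
```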
Files Affected
- `application/single_app/semantic_kernel_plugins/tabular_processing_plugin.py`
- `application/single_app/route_backend_chats.py`