Add async chart job queue, multi-CSV analysis improvements, desktop UI, and tests #27

Merged: rad1092 merged 2 commits into main from codex/evaluate-current-project-completion-level on Feb 15, 2026

@rad1092 (Owner) commented Feb 15, 2026

Motivation

  • Implement recommendation C: run chart rendering out-of-band so that expensive chart generation does not block the web UI.
  • Provide richer multi-CSV analysis/reporting and local desktop convenience to improve user workflows for CSV profiling and BitNet prompts.
  • Improve streaming/low-memory summary computation so large files are profiled without materializing full value lists.
  • Surface a lightweight diagnostics command to validate local Ollama/BitNet availability and inform tuning decisions.

Description

  • Added an in-process async chart job queue in bitnet_tools/web.py, with helpers submit_chart_job(files) and get_chart_job(job_id) and HTTP endpoints POST /api/charts/jobs and GET /api/charts/jobs/<job_id>. Jobs run on a thread pool, each with its own input/output directories under .bitnet_cache/chart_jobs.
  • Introduced bitnet_tools/visualize.py to create sampled charts per CSV and integrated it into web/UI and CLI chart flows via create_multi_charts.
  • Implemented bitnet_tools/multi_csv.py for streaming, memory-bounded multi-file profiling, schema-drift detection, insights, caching, and markdown/report builders; exposed analyze_multiple_csv, build_multi_csv_markdown, and result_to_json.
  • Updated bitnet_tools/analysis.py to use streaming summarization (summarize_reader) with O(1) memory numeric aggregations and added build_markdown_report.
  • Added Windows desktop UI launch support in bitnet_tools/desktop.py and top-level bitnet_desktop.pyw, plus a doctor helper in bitnet_tools/doctor.py to collect environment info.
  • Enhanced CLI (bitnet_tools/cli.py) with commands: multi-analyze, report, desktop, and doctor, and wiring for chart generation and report output.
  • Extended web UI assets (bitnet_tools/ui/*) to support multi-CSV upload, dashboard rendering, and controls to run the new flows.
  • Added tests covering analysis, multi-file analysis, CLI flows, and the web chart job lifecycle (tests/*), plus a .gitignore and a pyproject script entry for the desktop command.
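
The streaming summarization described in the list above can be illustrated with a minimal sketch. This is not the code in bitnet_tools/analysis.py: `RunningStats` and `summarize_numeric_columns` are hypothetical names, and Welford's online algorithm is assumed here as one common way to keep numeric aggregates in constant memory per column.

```python
import csv
import io
import math


class RunningStats:
    """Constant-memory count/mean/variance/min/max (Welford's online algorithm)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean
        self.minimum = math.inf
        self.maximum = -math.inf

    def add(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)

    @property
    def variance(self):
        # Sample variance; defined as 0.0 when fewer than two values were seen.
        return self._m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize_numeric_columns(reader):
    """Consume a csv.DictReader row by row without storing column values."""
    stats = {}
    missing = {}
    for row in reader:
        for name, raw in row.items():
            if name is None:  # extra fields beyond the header row
                continue
            text = (raw or "").strip()
            if not text:
                missing[name] = missing.get(name, 0) + 1
                continue
            try:
                value = float(text)
            except ValueError:
                continue  # non-numeric cell: ignored in this sketch
            stats.setdefault(name, RunningStats()).add(value)
    return stats, missing


reader = csv.DictReader(io.StringIO("a,b\n1,x\n2,\n3,y\n"))
stats, missing = summarize_numeric_columns(reader)
print(stats["a"].count, stats["a"].mean, missing)
```

Only a handful of floats are retained per column, so memory use does not grow with the number of rows.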

Testing

  • Ran the unit tests with pytest -q: 24 passed.
  • Added tests/test_web.py to validate chart job submission, completion, and not-found behavior, along with tests/* coverage for the multi_csv, visualize, cli, and analysis reporting features; all of these pass.
  • Ran a simple micro-benchmark comparing analyze_multiple_csv(..., max_workers=1) with max_workers=4 on synthesized CSVs. In this environment workers=1 performed at least as well, which suggests that parallelism should be tuned to the data and environment rather than by blindly raising the worker count.
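
A worker-count comparison of the kind described in the last bullet could be set up roughly as follows. This sketch does not call analyze_multiple_csv: `make_csv`, `profile_csv`, and `time_workers` are hypothetical stand-ins, and the use of a thread pool is an assumption rather than something the PR confirms for the multi-CSV path.

```python
import csv
import io
import time
from concurrent.futures import ThreadPoolExecutor


def make_csv(rows, cols=5):
    """Synthesize a small numeric CSV as text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow([f"c{i}" for i in range(cols)])
    for r in range(rows):
        writer.writerow([r * (i + 1) for i in range(cols)])
    return buf.getvalue()


def profile_csv(text):
    """Stand-in for per-file profiling: row count and per-column sums."""
    totals = {}
    n = 0
    for row in csv.DictReader(io.StringIO(text)):
        n += 1
        for key, value in row.items():
            totals[key] = totals.get(key, 0.0) + float(value)
    return n, totals


def time_workers(texts, max_workers):
    """Return (elapsed seconds, results) for one worker-count setting."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(profile_csv, texts))
    return time.perf_counter() - start, results


texts = [make_csv(2000) for _ in range(8)]
for workers in (1, 4):
    elapsed, _ = time_workers(texts, workers)
    print(f"max_workers={workers}: {elapsed:.3f}s")
```

With pure-Python, CPU-bound parsing, threads in standard CPython contend for the global interpreter lock, which is one plausible reason a single worker can match or beat several. Whether that explains the result reported above is not established by the PR.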


@chatgpt-codex-connector

💡 Codex Review

path = job_input_dir / name
path.write_text(text, encoding='utf-8')

P1: Sanitize chart job filenames before writing input files

_run_chart_job joins the client-provided name directly into job_input_dir and writes it, so a crafted value like ../../outside.csv escapes the job directory and writes to arbitrary writable paths. This is exploitable whenever the web API is exposed beyond fully trusted callers (for example, UI bound to 0.0.0.0), and can overwrite unrelated files on the host. Normalize/validate the filename (or resolve and enforce parent containment) before write_text.
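
One way to apply the "resolve and enforce parent containment" suggestion is sketched below. `safe_child_path` is a hypothetical helper, not a function in bitnet_tools/web.py, and the error type is an arbitrary choice.

```python
from pathlib import Path


def safe_child_path(base_dir: Path, name: str) -> Path:
    """Return base_dir/name, refusing anything that resolves outside base_dir."""
    base = base_dir.resolve()
    candidate = (base / name).resolve()
    # candidate.parents lists every ancestor directory; base must be one of them.
    if base not in candidate.parents:
        raise ValueError(f"unsafe file name: {name!r}")
    return candidate
```

A caller would then write to `safe_child_path(job_input_dir, name)` instead of `job_input_dir / name`. A value such as `../../outside.csv`, or an absolute path, raises `ValueError` before any file is touched.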


path = Path(td) / name
path.write_text(text, encoding="utf-8")

P1: Block path traversal in /api/multi-analyze file names

The /api/multi-analyze handler also writes name directly via Path(td) / name without sanitization, so a JSON request containing ../ segments can write outside the temporary directory. In deployments where this endpoint is reachable by untrusted input, that allows arbitrary file overwrite in writable locations; enforce basename-only names or verify the resolved path stays under td before writing.
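
The basename-only option mentioned here might look like the following sketch. `basename_only` and its `default` fallback are hypothetical; the sketch strips directory components rather than rejecting the request, which is a design assumption.

```python
from pathlib import PureWindowsPath


def basename_only(name: str, default: str = "upload.csv") -> str:
    """Keep only the final path component of a client-supplied file name."""
    # PureWindowsPath treats both '/' and '\\' as separators and drops drive
    # prefixes, so it covers POSIX-style and Windows-style inputs alike.
    leaf = PureWindowsPath(name).name
    if leaf in ("", ".", ".."):
        return default
    return leaf


print(basename_only("../../outside.csv"))
print(basename_only("dir\\sub\\a.csv"))
```

Because stripping directories can make two different uploads collide on the same name, a handler using this approach would also need some way to disambiguate duplicates, for example by prefixing an index.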


"print('결측 비율 상위:\n', missing_ratio.head(10))\n\n"

P2: Emit syntactically valid pandas example code

build_code_guidance currently embeds \n inside single-quoted print literals in the generated snippet, which becomes a real newline character when the outer Python string is built. The resulting pandas_example contains an unterminated string literal and cannot be executed as-is, so users copying the suggested code will hit a syntax error immediately.
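
The escaping problem can be reproduced in isolation. In this sketch the snippet text uses an English label in place of the original Korean one, and `is_valid_python` is a hypothetical helper. The only difference between the two strings is a single versus doubled backslash.

```python
import ast

# Single backslash: the outer literal turns \n into a real newline, which
# splits the generated single-quoted string across two lines.
broken_snippet = "print('top missing ratios:\n', missing_ratio.head(10))\n"

# Doubled backslash: the generated code contains the two characters \ and n,
# so the inner string literal stays on one line.
fixed_snippet = "print('top missing ratios:\\n', missing_ratio.head(10))\n"


def is_valid_python(source: str) -> bool:
    """Check that generated code at least parses (it is not executed)."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


print(is_valid_python(broken_snippet), is_valid_python(fixed_snippet))
```

A parse check like this could also serve as a regression test for generated snippets, since it needs neither pandas nor real data.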


if not future.done():
    return {'job_id': job_id, 'status': 'running'}
try:
    return future.result()

P2: Evict finished chart jobs from the in-memory registry

get_chart_job returns terminal results but never removes the corresponding future from _CHART_JOBS, while submit_chart_job keeps appending new entries. In a long-running server this creates unbounded growth of retained futures/results and steadily increases memory usage as chart requests accumulate; delete completed/failed jobs or add retention limits.

@rad1092 rad1092 merged commit 6d147d6 into main Feb 15, 2026
4 checks passed
@rad1092 rad1092 deleted the codex/evaluate-current-project-completion-level branch February 15, 2026 01:08