bibi00275/sec-agent

README.md — Known Dirty Things, append:

  • Sub-chunker cuts at a fixed max_chars, so it slices through sentences and signature lines. It will mishandle any dense-signal block inside a long section.
  • Min-max normalization fabricates confidence: the top score is always ~1.0, even when no chunk is relevant. Refusal logic will later need an absolute relevance threshold.
  • Chunker only recognizes the back-matter headings present in the AAPL 2024 10-K (SIGNATURES, EXHIBIT INDEX, POWER OF ATTORNEY). Other filings may use different conventions.
  • Hybrid retrieval alpha=0.2; dense embeddings remain in the blend even though they score nonsense chunks highly on short queries. Will revisit if a better embedding model becomes available locally.
  • Refusal behavior is coupled to retrieval: if the wrong chunk gets retrieved, the LLM may answer instead of refusing. q8 ("stock price today") demonstrates this — fixing requires either prompt changes or question-intent classification.
  • Eval grader uses case-insensitive substring matching. q6 fails despite a correct answer because the LLM rephrased "changes in liquidity" as "value and liquidity." Substring grading is inherently brittle.
  • q4 (CEO) remains broken: vocabulary mismatch between query "CEO" and chunk text "Chief Executive Officer." Neither BM25 nor nomic-embed-text bridges this gap. Day 7 problem.
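
The min-max and alpha items above can be illustrated with a minimal sketch. It assumes alpha weights the dense score and (1 - alpha) weights BM25, which these notes do not state. The function names, the sample scores, and the threshold value are hypothetical and not taken from the repository.

```python
from typing import List


def min_max(scores: List[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def blend(bm25: List[float], dense: List[float], alpha: float = 0.2) -> List[float]:
    """Hybrid score per chunk. ASSUMPTION: alpha is the dense weight."""
    bm25_n, dense_n = min_max(bm25), min_max(dense)
    return [alpha * d + (1.0 - alpha) * b for b, d in zip(bm25_n, dense_n)]


def should_refuse(raw_scores: List[float], threshold: float) -> bool:
    """Refuse when even the best RAW (un-normalized) score is below threshold.

    The threshold is a hypothetical value that would need tuning on real queries.
    """
    return max(raw_scores) < threshold


# Hypothetical raw BM25 scores for a relevant and an irrelevant query.
relevant = [12.0, 3.0, 1.0]
irrelevant = [0.4, 0.1, 0.05]

# After min-max, both queries report a top score of 1.0.
print(max(min_max(relevant)), max(min_max(irrelevant)))

# Only the raw scores tell the two queries apart.
print(should_refuse(relevant, threshold=1.0), should_refuse(irrelevant, threshold=1.0))
```

Gating on the raw score before normalization is one option; the normalized blend can still be used for ranking once the gate passes.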
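
For the sub-chunker item, one possible alternative to fixed max_chars cuts is to pack whole sentences into each chunk. The sketch below is not the repository's chunker: `sentence_chunks` is a hypothetical name, and the regex split is naive (it would break on abbreviations such as "Inc." and does nothing special for signature lines).

```python
import re
from typing import List


def sentence_chunks(text: str, max_chars: int = 200) -> List[str]:
    """Greedily pack whole sentences into chunks of at most max_chars.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Naive split: whitespace that follows '.', '!' or '?'.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would overflow: close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


print(sentence_chunks("Alpha beta. Gamma delta. Epsilon zeta.", max_chars=25))
# ['Alpha beta. Gamma delta.', 'Epsilon zeta.']
```

Keeping over-long sentences whole trades a strict size bound for intact text; a hard cut could be added as a fallback if the embedding model's input limit requires it.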
