
QA eval pipeline for retrieval #1754

Merged
jperez999 merged 19 commits into NVIDIA:main from KyleZheng1284:feature/qa-harness-fullpage-pipeline
Apr 16, 2026

Conversation


@KyleZheng1284 KyleZheng1284 commented Mar 30, 2026

Description

  • Adds a pluggable QA evaluation harness for measuring Retrieval quality end-to-end using multi-tier scoring.

Capabilities:

  • Multi-tier scoring -- Tier 1 retrieval recall (answer-in-context), Tier 2 programmatic (exact match + token F1), and Tier 3 LLM-as-judge (1-5 rubric) run together in a single pass at zero extra retrieval cost.
  • Full-page markdown retrieval -- Reconstructs complete document pages from NeMo Retriever extraction records via to_markdown_by_page().
  • Pluggable retrieval -- Any retrieval system (vector search, agentic, hybrid, BM25) plugs in by producing a standard JSON (queries → chunks); no harness code changes required.
  • Pluggable datasets -- Any CSV with query/answer columns loads via csv:path/to/file.csv; default ground truth is data/bo767_annotations.csv (1007 Q&A pairs, all modalities).
  • Pluggable LLMs -- Generator and judge models swap via env var or YAML config using litellm prefix routing (nvidia_nim/, openai/, huggingface/).
  • Multi-model sweep -- Set GEN_MODELS to evaluate multiple generators in a single run with side-by-side score comparisons.
  • Failure classification -- Per-query categorization into correct, partial, retrieval_miss, generation_miss, no_context, thinking_truncated to pinpoint exactly where the pipeline fails.
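As a rough sketch of how the Tier-1 and Tier-2 metrics above can be computed (illustrative only; the actual implementations live in nemo_retriever.evaluation.scoring and may normalize text differently):

```python
from collections import Counter


def answer_in_context(reference: str, context: str) -> bool:
    """Tier 1: did retrieval surface the reference answer at all? (sketch)"""
    return reference.strip().lower() in context.lower()


def token_f1(prediction: str, reference: str) -> float:
    """Tier 2: token-level F1 between generated and reference answers (sketch)."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Tier 3 then layers an LLM judge score (1-5) on top of these programmatic signals, which is why all three tiers can be filled in from one retrieval-plus-generation pass.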

Note - the CSV containing the Q&A pairs is a subset of the existing https://github.com/NVIDIA/NeMo-Retriever/blob/main/data/digital_corpora_10k_annotations.csv. There is currently a separate PR up with subset annotations for only the bo767-specific files here - #1730

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables, I have ensured those are mirrored in the Helm values.yaml file.

@KyleZheng1284 KyleZheng1284 requested review from a team as code owners March 30, 2026 21:26
@KyleZheng1284 KyleZheng1284 requested a review from nkmcalli March 30, 2026 21:26

copy-pr-bot bot commented Mar 30, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Comment thread tools/harness/ingest_bo767.py Outdated
Comment thread tools/harness/extract_bo767_parquet.py Outdated
Comment thread tools/harness/export_retrieval_nemo.py Outdated
print(f" Page index key check: {matched}/{len(sampled)} sampled source_ids found")


def main() -> int:
Collaborator

Why not make this a tool we can call via import, instead of a main function.

Member Author

core evaluation logic has been moved into nemo_retriever.evaluation (importable package, pip-installable via nemo_retriever[eval])

Comment thread tools/harness/export_retrieval_nemo.py Outdated
Comment thread tools/harness/build_page_markdown_index.py Outdated
Comment thread nemo_retriever/src/nemo_retriever/evaluation/types.py
Comment thread tools/harness/src/nv_ingest_harness/utils/qa/orchestrator.py Outdated
Comment thread nemo_retriever/src/nemo_retriever/evaluation/judges.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/generators.py
@KyleZheng1284 KyleZheng1284 force-pushed the feature/qa-harness-fullpage-pipeline branch from d7c48fa to 9262c63 on April 3, 2026 at 21:56
Contributor

greptile-apps bot commented Apr 3, 2026

Greptile Summary

This PR introduces a pluggable multi-tier QA evaluation harness (nemo_retriever/evaluation/) with Tier-1 retrieval recall, Tier-2 token F1, and Tier-3 LLM-as-judge scoring, plus supporting utilities for Parquet-to-page-index building and LanceDB export. All previously raised P1 findings have been resolved — the answer_in_context substring bug, thinking_truncated misclassification, IndexError on empty evaluations, bo767_infographic None-data_dir crash, and the retrieval_loader exception-dispatch issue.

  • P1 — Missing tests: 15+ new source modules are added with no corresponding test files; test-mirrors-source-structure and test-coverage-new-code require coverage for the pure-computation scoring functions at minimum.
  • P1 — Bare except Exception: pass in io/dataframe.py: Four silent fallback blocks in read_extraction_parquet discard failure details with no logging, making Parquet read failures undiagnosable.

Confidence Score: 4/5

Safe to merge once unit tests are added and the bare-except logging gaps in io/dataframe.py are addressed.

All previously raised P1 correctness bugs are fixed. Two new P1 findings remain: missing tests across 15+ new modules (rules violation), and silent exception swallowing in read_extraction_parquet that makes Parquet read failures undiagnosable. The core scoring and orchestration logic is well-structured.

Files to focus on: nemo_retriever/src/nemo_retriever/io/dataframe.py (bare except), and any new test files that should mirror the evaluation/ package.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemo_retriever/src/nemo_retriever/io/dataframe.py | read_extraction_parquet adds multi-strategy Parquet reading, but four bare `except Exception: pass` blocks swallow fallback failures without any logging, violating the no-bare-except rule. |
| nemo_retriever/src/nemo_retriever/evaluation/scoring.py | Programmatic multi-tier scoring; prior substring-matching and thinking_truncated misclassification bugs are fixed; no unit tests. |
| nemo_retriever/src/nemo_retriever/evaluation/orchestrator.py | QAEvalPipeline wiring retrieval → generation → judging → scoring; architecture is solid, the late-binding closure bug is guarded, no tests. |
| nemo_retriever/src/nemo_retriever/evaluation/generators.py | LiteLLMClient wrapper; generation failures are caught and returned as error results but logged only at DEBUG level, making LLM API errors invisible in production. |
| nemo_retriever/src/nemo_retriever/evaluation/config.py | Config loading with env-var expansion and legacy/new schema normalization; previously flagged IndexError on empty evaluations is fixed. |
| nemo_retriever/src/nemo_retriever/evaluation/ground_truth.py | Dataset loaders for bo767_infographic, ViDoRe v3, and generic CSV; previously flagged bo767_infographic None data_dir crash is fixed. |
| nemo_retriever/src/nemo_retriever/io/markdown.py | Adds build_page_index and _read_parquet_for_markdown with column-selection optimization to avoid loading large image/embedding columns. |
| nemo_retriever/src/nemo_retriever/export.py | New module for querying LanceDB and exporting FileRetriever-compatible JSON; well-documented, no tests. |

Sequence Diagram

sequenceDiagram
    participant CLI as retriever eval run
    participant Loader as RetrievalLoaderOperator
    participant FR as FileRetriever
    participant GT as GroundTruth CSV
    participant Gen as QAGenerationOperator(LiteLLMClient)
    participant Judge as JudgingOperator(LLMJudge)
    participant Score as ScoringOperator

    CLI->>Loader: process(None)
    Loader->>GT: load_generic_csv / get_qa_dataset_loader
    GT-->>Loader: qa_pairs
    Loader->>FR: retrieve(query, top_k) per pair
    FR-->>Loader: RetrievalResult(chunks, metadata)
    Loader-->>Gen: DataFrame(query, reference_answer, context)

    Gen->>Gen: ThreadPoolExecutor (max_workers)
    Gen->>LiteLLM: litellm.completion(messages)
    LiteLLM-->>Gen: raw_answer
    Gen->>Gen: strip_think_tags(raw_answer)
    Gen-->>Judge: DataFrame + answer columns

    Judge->>Judge: ThreadPoolExecutor (max_workers)
    Judge->>LiteLLM: litellm.completion(judge_prompt)
    LiteLLM-->>Judge: JSON score 1-5
    Judge-->>Score: DataFrame + judge columns

    Score->>Score: answer_in_context() Tier 1
    Score->>Score: token_f1() Tier 2
    Score->>Score: classify_failure() Tier 3
    Score-->>CLI: DataFrame with all metrics

    CLI->>CLI: write timestamped results JSON
    CLI->>CLI: print multi-tier summary
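The strip_think_tags step in the diagram can be pictured roughly like this (a sketch; the harness's real helper may differ, e.g. in how it flags truncated blocks for the thinking_truncated failure class):

```python
import re

_THINK_RE = re.compile(r"<think>.*?</think>", flags=re.DOTALL)


def strip_think_tags(raw_answer: str) -> str:
    # Drop <think>...</think> reasoning blocks so only the final answer is scored.
    return _THINK_RE.sub("", raw_answer).strip()
```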
Prompt To Fix All With AI

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/evaluation/generators.py
Line: 533-540

Comment:
**LLM API failures logged only at DEBUG level — invisible in production**

The `except Exception` handler uses `logger.debug` without `exc_info=True`. In a default INFO-level deployment any API connectivity failure, authentication error, or rate-limit exception will be silently swallowed at the logging layer — the only record of the failure is the string stored in `GenerationResult.error`. Operators that call `generate()` in a thread pool (e.g. `QAEvalPipeline.process()`) will emit no visible log output when a model endpoint is unreachable, making large-scale eval runs very hard to diagnose.

Raise to `logger.warning` (or `logger.error`) with `exc_info=True`:

```suggestion
        except Exception as exc:
            logger.warning("Generation failed for model=%s: %s", self.model, exc, exc_info=True)
            return GenerationResult(
                answer="",
                latency_s=0.0,
                model=self.model,
                error=str(exc),
            )
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: nemo_retriever/src/nemo_retriever/evaluation/scoring.py
Line: 1-5

Comment:
**No unit tests for any new evaluation modules**

This PR adds 15+ new source modules (`scoring.py`, `generators.py`, `judges.py`, `orchestrator.py`, `config.py`, `ground_truth.py`, `retrieval_loader.py`, `runner.py`, `retrievers.py`, `export.py`, etc.) with zero corresponding test files. The PR checklist marks "New or existing tests cover these changes" as done, but the diff contains no test files at all.

Per the project's `test-mirrors-source-structure` and `test-coverage-new-code` rules, new modules must have test counterparts under `nemo_retriever/tests/`. The pure-computation functions (`token_f1`, `answer_in_context`, `classify_failure`, `_normalize`, `_parse_judge_response`, `_sanitize_prefix`) are especially easy to unit-test and would provide high confidence in the scoring logic correctness.

How can I resolve this? If you propose a fix, please make it concise.
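For instance, classify_failure-style logic is trivially unit-testable; a sketch of what a mirrored test module could look like (the stub below is a simplified stand-in, and the real function's signature and thresholds may differ):

```python
# Hypothetical tests/evaluation/test_scoring.py; classify_failure here is a
# simplified stand-in for the real nemo_retriever.evaluation.scoring function.
def classify_failure(answer_in_ctx: bool, judge_score: int) -> str:
    if judge_score >= 4:
        return "correct"
    if not answer_in_ctx:
        return "retrieval_miss"
    if judge_score >= 2:
        return "partial"
    return "generation_miss"


def test_retrieval_miss_when_answer_absent():
    assert classify_failure(answer_in_ctx=False, judge_score=1) == "retrieval_miss"


def test_generation_miss_when_context_had_answer():
    assert classify_failure(answer_in_ctx=True, judge_score=1) == "generation_miss"


def test_correct_on_high_judge_score():
    assert classify_failure(answer_in_ctx=True, judge_score=5) == "correct"
```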

Reviews (13): Last reviewed commit: "add reference to harness readme + minor ..."

Comment thread tools/harness/run_qa_eval.py Outdated
Comment thread nemo_retriever/src/nemo_retriever/evaluation/scoring.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/scoring.py Outdated
Comment thread tools/harness/src/nv_ingest_harness/cases/qa_eval.py Outdated
Comment thread tools/harness/src/nv_ingest_harness/utils/qa/__init__.py Outdated
Comment thread nemo_retriever/src/nemo_retriever/evaluation/config.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/ground_truth.py Outdated
Collaborator

@jperez999 jperez999 left a comment

Moving in the right direction. Let's remove all the changes to the harness that are not in nemo_retriever; that will slim down the PR quite a bit. Also, unless you feel it is really helpful, let's remove all the extra tools you added and replace them with helper functions for those actions. We should refactor to make it possible to tack these operators onto the graph in graph_pipeline.py or onto the Retriever object already in use. We should be trying to reuse as many of the objects that we have as possible. Keep in mind, everything here is a discussion; if you feel it is better the way you have it, please explain it to me.

Comment thread nemo_retriever/src/nemo_retriever/evaluation/config.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/config.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/config.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/orchestrator.py
Comment thread nemo_retriever/src/nemo_retriever/evaluation/orchestrator.py
# ---------------------------------------------------------------------------


def run_agentic_retrieval(
Collaborator

So is this something that we need to do separately from the graph_pipeline.py entry point? Can't we just add in the operators we want and use that same entry point? It would then allow us to make changes to the query file and datasets and should still get the same behavior.

--output data/test_retrieval/bo767_retrieval_dense.json
"""

from __future__ import annotations
Collaborator

Why create a whole new file to do what graph_pipeline already mostly does?

Member Author

This script exists because retrieval-bench only works with HuggingFace datasets out of the box. We would need this file to load our extraction Parquets, expand chunk hits to full-page markdown, and output the FileRetriever JSON that our QA eval pipeline expects.
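The chunk-to-full-page expansion described here can be sketched as follows (a hypothetical helper with assumed field names; the real logic lives in this script and io/markdown.py):

```python
def expand_to_pages(chunk_hits: list[dict], page_index: dict) -> list[str]:
    """Map retrieved chunk hits to deduplicated full-page markdown (sketch).

    Assumes page_index maps (source_id, page_number) -> page markdown, as a
    build_page_index-style helper would produce; hit field names are assumptions.
    """
    seen: set[tuple] = set()
    pages: list[str] = []
    for hit in chunk_hits:
        key = (hit["source_id"], hit["page_number"])
        if key not in seen:
            seen.add(key)
            pages.append(page_index[key])
    return pages
```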

import json
import os

from nv_ingest_harness.cases.e2e import main as e2e_main
Collaborator

Again, it seems like you are creating a whole new graph specifically for this, when what I think we want is to be able to tack these operations onto any graph.

Comment thread tools/harness/src/nv_ingest_harness/cli/run.py Outdated
from nemo_retriever.evaluation.types import RetrievalResult


class TopKRetriever:
Collaborator

Why are you adding this in the harness? This should exist in nemo_retriever. All code changes in legacy nv-ingest can be removed unless they are necessary to make nemo_retriever work.

Member Author

Moving it would pull harness dependencies into nemo_retriever, right? That isn't what we want. It makes more sense in my mind if the harness consumes the nemo_retriever protocol instead of vice versa.

Comment thread nemo_retriever/src/nemo_retriever/evaluation/retrieval_loader.py
Collaborator

@jperez999 jperez999 left a comment

There are things we should polish in another round, but this can get merged. We are going to want to move away from the eval CLI command completely. This needs to be incorporated so it is usable when activated in graph_pipeline.py.

return records


def load_vidore_v3_qa(dataset_name: str, cache_dir: Optional[str] = None) -> list[dict]:
Collaborator

If I have the vidore_v3 question and answer pairs, why do I need the datasets library? If I have the CSV, do I really still need the datasets library?

from nemo_retriever.evaluation.retrievers import FileRetriever

source = self._ground_truth_csv
try:
Collaborator

Wouldn't it be better if you just required the question and answer pairs, instead of trying to handle pulling the data out of a particular file?

Collaborator

This is heavily coupled to the specifics of the datasets we support. I know you have the catch-all CSV reader after this, but I wouldn't even do that. What happens if the user has this information in a different format? Now we require them to read the information in and convert it to CSV. If we made it so we took in the question-answer pairs and ground truth directly, we would put loading that information on the user.

Minimal required JSON format::

{
"queries": {
Collaborator

I think this example is missing a set of list brackets; queries should be a list of dict records, right?
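If queries is indeed meant to be a list of records, the minimal JSON would look something like this (the field names below are illustrative assumptions, not the verified schema from types.py):

```python
import json

# Hypothetical minimal retrieval JSON with "queries" as a list of dict records.
payload = {
    "queries": [
        {
            "query": "What is the warranty period?",
            "chunks": [
                {"text": "The warranty period is 24 months.", "score": 0.91},
            ],
        },
    ],
}
print(json.dumps(payload, indent=2))
```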

def score_dataframe(df: pd.DataFrame) -> pd.DataFrame:
"""Apply all programmatic scoring metrics to a DataFrame.

Input DataFrame must have ``reference_answer``, ``answer``, ``context``
Collaborator

It would be nice if we made it possible to calculate this for any DataFrame. We could keep the column names as defaults, but if we add a place where the user can set their own mappings, it removes the extra column-name normalization step.

from nemo_retriever.evaluation.scoring import score_dataframe


class ScoringOperator(EvalOperator):
Collaborator

But it is only used in this module. If it's only called in your scoring operator, how does putting it nearer to where it is called couple it to non-LLM scoring logic? It seems the evaluation subfolder is all tied to LLM and judge evaluation.



@runtime_checkable
class RetrieverStrategy(Protocol):
Collaborator

OK, but what is the difference between this and the Retriever?



@runtime_checkable
class LLMClient(Protocol):
Collaborator

retrieval_results = [<results inside>]
analysis_results = []
for llm_ref in llms:
    res = analyze_results(retrieval_results, llm_ref)
    analysis_results.append(res)

Why wouldn't something like this work?



def build_page_index(
parquet_dir: str | Path | None = None,
Collaborator

I think this would break if I did something like build_page_index(dataframe). It would need to be build_page_index(None, dataframe) to work correctly, right?

LANCEDB_TABLE = "nv-ingest"


def reload_parquet_to_lancedb(
Collaborator

Since this is lancedb specific it should go in the vector_store subfolder.

@KyleZheng1284 KyleZheng1284 force-pushed the feature/qa-harness-fullpage-pipeline branch from f51a25a to 881eb64 on April 16, 2026 at 16:51
Comment on lines +35 to 67
import pyarrow.parquet as pq

try:
    table = pq.ParquetFile(path).read()
    try:
        table = table.combine_chunks()
    except Exception:
        pass
    try:
        return table.to_pandas(split_blocks=False)
    except Exception:
        return _arrow_table_to_pandas_via_pylist(table)
except Exception:
    pass
try:
    return pd.read_parquet(path, engine="fastparquet")
except Exception:
    pass
try:
    table = pq.ParquetFile(path).read()
    return _arrow_table_to_pandas_via_pylist(table)
except Exception:
    pass
return pd.read_parquet(path)


def read_dataframe(path: Path) -> pd.DataFrame:
    suffix = path.suffix.lower()
    if suffix == ".parquet":
        return read_extraction_parquet(path)
    if suffix in {".jsonl", ".json"}:
        text = path.read_text(encoding="utf-8")
        if suffix == ".jsonl":
Contributor

P1 Bare except Exception: pass clauses swallow errors without logging

Four intermediate except Exception: pass blocks silently discard the failure reason. When every fallback strategy fails, the only error visible to the caller is the one from the final pd.read_parquet(path) call at the bottom — not the original failure from the primary PyArrow path. This makes it impossible to diagnose why a specific parquet cannot be read.

Per the project's no-bare-except rule, bare-except blocks at non-boundary sites must log the caught exception. Add logger.debug (or logger.warning) with exc_info=True at each fallback transition so failures are traceable:

try:
    table = pq.ParquetFile(path).read()
    try:
        table = table.combine_chunks()
    except Exception:
        pass  # combine_chunks is best-effort; chunked array still usable
    try:
        return table.to_pandas(split_blocks=False)
    except Exception as e:
        logger.debug("to_pandas(split_blocks=False) failed for %s: %s; using pylist fallback", path, e)
        return _arrow_table_to_pandas_via_pylist(table)
except Exception as e:
    logger.debug("Primary PyArrow read failed for %s: %s; trying fastparquet", path, e)
try:
    return pd.read_parquet(path, engine="fastparquet")
except Exception as e:
    logger.debug("fastparquet read failed for %s: %s; retrying pylist", path, e)
...

@jperez999 jperez999 merged commit 89d9965 into NVIDIA:main Apr 16, 2026
5 checks passed