Skip to content

[feat][evaluation]open api evaluator#418

Merged
tpfz merged 7 commits into
mainfrom
feat/open_api_evaluator
Feb 28, 2026
Merged

[feat][evaluation]open api evaluator#418
tpfz merged 7 commits into
mainfrom
feat/open_api_evaluator

Conversation

@tpfz
Copy link
Copy Markdown
Collaborator

@tpfz tpfz commented Feb 4, 2026

What type of PR is this?

Check the PR title

  • This PR title match the format: [<type>][<scope>] <description>. For example: [fix][backend] flaky fix
  • The description of this PR title is user-oriented and clear enough for others to understand.
  • Add documentation if the current PR requires user awareness at the usage level.
  • This PR is written in English. PRs not in English will not be reviewed.

(Optional) Translate the PR title into Chinese

(Optional) More detailed description for this PR(en: English/zh: Chinese)

en:
zh(optional):

(Optional) Which issue(s) this PR fixes

@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 4, 2026

Codecov Report

❌ Patch coverage is 84.73440% with 296 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...uation/application/convertor/experiment/openapi.go 77.40% 99 Missing and 14 partials ⚠️
...modules/evaluation/application/eval_openapi_app.go 92.78% 28 Missing and 18 partials ⚠️
...api/handler/coze/loop/apis/eval_open_apiservice.go 0.00% 38 Missing ⚠️
...luation/application/convertor/evaluator/openapi.go 90.53% 31 Missing and 6 partials ⚠️
.../application/convertor/experiment/expt_template.go 43.10% 21 Missing and 12 partials ⚠️
...evaluation/application/convertor/common/openapi.go 90.45% 27 Missing ⚠️
...ules/evaluation/domain/service/expt_manage_impl.go 85.71% 1 Missing and 1 partial ⚠️

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #418      +/-   ##
==========================================
+ Coverage   72.89%   73.77%   +0.87%     
==========================================
  Files         624      625       +1     
  Lines       62897    64805    +1908     
==========================================
+ Hits        45848    47807    +1959     
+ Misses      13845    13751      -94     
- Partials     3204     3247      +43     
Flag Coverage Δ
unittests 73.77% <84.73%> (+0.87%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
...on/application/convertor/evaluation_set/openapi.go 95.26% <100.00%> (+0.05%) ⬆️
...d/modules/evaluation/application/experiment_app.go 81.21% <100.00%> (+0.02%) ⬆️
...ation/domain/service/expt_manage_execution_impl.go 70.10% <100.00%> (+0.22%) ⬆️
...es/evaluation/domain/service/expt_template_impl.go 85.25% <100.00%> (+0.04%) ⬆️
...ules/evaluation/domain/service/expt_manage_impl.go 73.65% <85.71%> (+1.49%) ⬆️
...evaluation/application/convertor/common/openapi.go 90.93% <90.45%> (+8.08%) ⬆️
.../application/convertor/experiment/expt_template.go 87.50% <43.10%> (-4.11%) ⬇️
...luation/application/convertor/evaluator/openapi.go 90.53% <90.53%> (ø)
...api/handler/coze/loop/apis/eval_open_apiservice.go 0.00% <0.00%> (ø)
...modules/evaluation/application/eval_openapi_app.go 92.39% <92.78%> (+3.62%) ⬆️
... and 1 more

... and 5 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f1fffaa...b5ed15f. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@tpfz tpfz force-pushed the feat/open_api_evaluator branch from 704da7f to f347866 Compare February 5, 2026 09:15
@tpfz tpfz force-pushed the feat/open_api_evaluator branch 2 times, most recently from 2198b7b to 66e6e2e Compare February 12, 2026 12:50
@tpfz tpfz force-pushed the feat/open_api_evaluator branch 5 times, most recently from 5f44e6d to 923e6e8 Compare February 27, 2026 10:48
Copy link
Copy Markdown
Collaborator

@CozeLoop CozeLoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review by Aime: adding line-level comments for identified issues.

@tpfz tpfz force-pushed the feat/open_api_evaluator branch from 923e6e8 to 4329950 Compare February 27, 2026 11:15
Copy link
Copy Markdown
Collaborator

@CozeLoop CozeLoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added one must-fix issue: nil-safe metrics defer for new OpenAPI handlers (and highlighted all affected methods).

Comment thread backend/modules/evaluation/application/eval_openapi_app.go
tpfz added 3 commits February 27, 2026 19:23
整合评估开放接口与筛选能力相关改动(优化筛选IDL、按实验模板提交、UT补充、缺陷修复),统一为单一提交以简化回溯与发布流程。基于 main 最新变更完成对齐,减少分叉并提升接口一致性与稳定性。
@tpfz tpfz force-pushed the feat/open_api_evaluator branch from 4329950 to 1bd9d63 Compare February 27, 2026 11:24
Copy link
Copy Markdown
Collaborator

@CozeLoop CozeLoop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

High-level summary:

Thanks for the comprehensive OpenAPI evaluator/evaluation work. The overall layering (IDL → convertors → application → domain services) is consistent with the existing evaluation module and the new endpoints are wired into the router/handlers in a clean, DDD-friendly way.

However, I found one compatibility-breaking issue and a few behavioural edge cases that should be addressed before merging:

  1. [Must-Fix] Thrift/IDL compatibility

    • domain_openapi.evaluator.thrift changes existing field IDs in struct Evaluator (inserting workspace_id as field 2 and shifting other fields). This breaks the wire format and violates the IDL compatibility rules. Please keep existing IDs stable and add workspace_id with a new ID.
  2. [Should-Fix] OpenAPI input validation and error mapping

    • OpenAPIRateLimitDTO2DO bubbles up time.ParseDuration errors directly, and the OpenAPI layer wraps them as generic internal errors. Invalid rate_limit.period values should be mapped to a well-defined invalid-parameter error (4xx) instead of 500.
    • OpenAPIExptTemplateFilterDTO2DO currently returns nil when logic_op != "and", silently discarding both filter conditions and the fuzzy-name keyword. This makes some client inputs appear to be accepted while actually ignored.
  3. [Should-Fix] Template-based experiment submission robustness

    • TemplateToSubmitExperimentRequest assumes template.Meta is always non-nil and will panic if that invariant is violated. Given it’s used by SubmitExptFromTemplateOApi, it’s safer to guard Meta and return nil so the caller can surface a controlled error.

I’ve left detailed line comments with concrete suggestions and example code for each issue. Once these are addressed, the PR should be in good shape.

Comment thread idl/thrift/coze/loop/evaluation/domain_openapi/evaluator.thrift
Comment thread backend/modules/evaluation/application/convertor/common/openapi.go
@tpfz tpfz merged commit d773c66 into main Feb 28, 2026
17 checks passed
@tpfz tpfz deleted the feat/open_api_evaluator branch February 28, 2026 02:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants