Dev #161
Conversation
Pull Request Overview
This PR adds support for VLLM and test-model configurations across the codebase, extends the CLI with a custom --config-path
option, refactors the testing suite to run the full pipeline against multiple config files, and bumps the project and config versions.
- Introduce `VllmArgs` and `TestModelArgs` in `config_models.py` and wire them into `WcConfig` and `create_config_by_arg_type`
- Update the `offline_infer.vllm_infer` signature and ChatModel initialization in `api_service.py`
- Refactor `tests/test_full_pipeV2.py` to parameterize tests over all configs, replacing the removed `test_full_pipe.py`
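As a rough illustration of the parameterization the PR describes, a full-pipeline test can be driven by every config file found under tests/configs/. This is a hypothetical sketch: the env var, paths, and test body are illustrative stand-ins, not the PR's actual test code.

```python
# Hypothetical sketch of parameterizing a full-pipeline test over every
# config file in tests/configs/ (paths, env var, and test body are
# illustrative; the real test lives in tests/test_full_pipeV2.py).
from pathlib import Path

import pytest

CONFIG_DIR = Path("tests/configs")
CONFIG_FILES = sorted(str(p) for p in CONFIG_DIR.glob("*.jsonc"))

# Fall back to a placeholder so collection still works in a bare checkout.
PARAMS = CONFIG_FILES or ["settings.template.jsonc"]

@pytest.mark.parametrize("config_path", PARAMS)
def test_full_pipeline(config_path, monkeypatch):
    # Point the pipeline at this config via an environment variable.
    monkeypatch.setenv("WECLONE_CONFIG_PATH", config_path)
    assert Path(config_path).suffix == ".jsonc"
```

Discovering configs with `glob` means newly added `tests/configs/*.jsonc` files are picked up without editing the test.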
Reviewed Changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| weclone/utils/config_models.py | Add `VllmArgs` and `TestModelArgs`, comment out deprecated enum items |
| weclone/utils/config.py | Handle "vllm" and "test_model" in `create_config_by_arg_type` |
| weclone/server/api_service.py | Pass a JSON dict to ChatModel via `model_dump` |
| weclone/prompts/clean_data.py | Fix markdown list numbering |
| weclone/eval/test_model.py | Use `TestModelArgs.test_data_path` and `completion_config` |
| weclone/data/qa_generator.py | Replace `QaPairV2` with `QaPair` |
| weclone/data/models.py | Rename `QaPairV2` to `QaPair` |
| weclone/data/clean/strategies.py | Update type hints from `QaPairV2` to `QaPair` |
| weclone/core/inference/offline_infer.py | Extend `vllm_infer` and inject `VllmArgs` |
| weclone/cli.py | Add `--config-path` option (removed `@click.group()`) |
| tests/test_full_pipeV2.py | Restructure and parameterize full-pipeline tests |
| tests/configs/*.jsonc | Bump versions, add `test_model_args` and `vision_api` sections |
| settings.template.jsonc & pyproject.toml | Version bumps and dependency updates |
| README.md | Add a note about model-size behavior |
Comments suppressed due to low confidence (7)
weclone/cli.py:61
- The `@click.group()` decorator was removed, so no CLI commands will be registered. Re-add `@click.group()` (and apply the option) to ensure subcommands load correctly.

@click.option("--config-path", default=None, help="Specify the config file path; sets the WECLONE_CONFIG_PATH environment variable")
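A minimal sketch of what the reviewer is asking for: the group decorator restored with the new option attached to it. The function and subcommand names here are illustrative, not WeClone's actual CLI entry points.

```python
# Minimal sketch of re-adding @click.group() together with the new
# --config-path option (function and command names are illustrative).
import os

import click

@click.group()
@click.option("--config-path", default=None,
              help="Path to the config file; sets the WECLONE_CONFIG_PATH env var")
def cli(config_path):
    # Propagate the option through the environment so config loading,
    # which happens later inside subcommands, can pick it up.
    if config_path:
        os.environ["WECLONE_CONFIG_PATH"] = config_path

@cli.command()
def status():
    click.echo("ok")

if __name__ == "__main__":
    cli()
```

Attaching the option to the group (rather than to each subcommand) lets `--config-path` apply uniformly before any subcommand runs.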
weclone/prompts/clean_data.py:15
- The bullet starts with `+3.`, which breaks standard markdown numbering. Change it to `3.` to restore correct formatting.

3. **Style Representativeness**: Evaluate whether [Answer A] exhibits a natural, distinctive human conversational style. Is it merely functional information transfer, or does it carry a personal touch? Points of attention include, but are not limited to: whether it reflects a specific tone (e.g., friendly, humorous, impatient, formal, profane); whether it contains catchphrases, slang, internet slang (e.g., "yyds", "绝绝子"), emoji, kaomoji, distinctive punctuation such as "!!!", "???", "~", specific abbreviations or phrases, or non-standard but consistent expressions (e.g., dialect words, personal verbal tics). If present, award 5 points; if absent, award fewer than 5.
README.md:45
- [nitpick] This note is unprofessional and may confuse readers. Consider revising or removing it to maintain a professional project tone.
> - A 7B model is easily trained into a dimwit, a 14B model can barely hold a conversation, and models of 32B and above perform noticeably better.
weclone/utils/config_models.py:126
- Using `Field(VisionApiConfig())` shares one default instance across all models. Switch back to `default_factory=VisionApiConfig` to ensure each `MakeDatasetArgs` gets its own `VisionApiConfig`.

vision_api: VisionApiConfig = Field(VisionApiConfig())
weclone/utils/config_models.py:182
- Providing a mutable default model instance can cause shared-state bugs. Use `default_factory=VllmArgs` instead of a direct instance.

vllm_args: VllmArgs = Field(VllmArgs())
weclone/utils/config_models.py:183
- Similarly, use `default_factory=TestModelArgs` to avoid sharing one instance of `TestModelArgs` across all `WcConfig` instances.

test_model_args: TestModelArgs = Field(TestModelArgs())
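The shared-default pitfall behind these three comments can be shown with stand-in Pydantic models (the field and class shapes below are illustrative, not WeClone's actual definitions):

```python
# Minimal sketch of the shared-default pitfall the review flags, using
# stand-in Pydantic models (fields here are illustrative, not WeClone's).
from pydantic import BaseModel, Field

class VllmArgs(BaseModel):
    gpu_memory_utilization: float = 0.9

class SharedDefault(BaseModel):
    # One VllmArgs instance is built once, at class-definition time;
    # under Pydantic v2 that same object is reused as the default for
    # every SharedDefault instance.
    vllm_args: VllmArgs = Field(VllmArgs())

class FreshDefault(BaseModel):
    # default_factory constructs a new VllmArgs for every instance.
    vllm_args: VllmArgs = Field(default_factory=VllmArgs)

a, b = FreshDefault(), FreshDefault()
a.vllm_args.gpu_memory_utilization = 0.5  # mutating one instance...
print(b.vllm_args.gpu_memory_utilization)  # ...leaves the other untouched
```

With the `Field(VllmArgs())` form, the same mutation would leak into every other instance that still holds the shared default.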
weclone/core/inference/offline_infer.py:120
- `json_schema` is not defined in this scope, causing a `NameError`. Either import or compute the schema, or remove this field from `extra_body`.

extra_body = {"guided_json": json_schema, "enable_thinking": False}
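One stdlib-only way to make the line above well-defined is to spell out the guided-decoding schema as a plain dict before building `extra_body`. The fields shown are illustrative, not the PR's actual schema; with Pydantic v2, `SomeModel.model_json_schema()` would produce an equivalent dict.

```python
# Hypothetical definition of json_schema so that extra_body is valid
# (field names are illustrative; the PR's actual schema may differ).
json_schema = {
    "type": "object",
    "properties": {
        "score": {"type": "integer"},
        "reason": {"type": "string"},
    },
    "required": ["score", "reason"],
}

extra_body = {"guided_json": json_schema, "enable_thinking": False}
print(extra_body["guided_json"]["type"])  # → object
```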
# FULL = "full"
# FREEZE = "freeze"
Commented-out enum members clutter the code. If these values are deprecated, remove them or add a proper deprecation notice instead of leaving commented code.
Add `test_model_args` and `vllm_args` config options, allowing a custom test-set file.

fix #158 fix #83 fix #77 fix #69