Add qwen3 #1827
Greptile Summary
This PR adds
Confidence Score: 4/5
Core rename and qwen3 wiring are correct; one new P1 (missing @property on model_id_names) and two previously flagged open P1s (hardcoded 'qwen_lm' enhanced_caption key, Qwen3 thinking mode) should be resolved before merge.
Previously flagged stale defaults are fixed. A new P1 is found (model_id_names missing @property). Two prior P1 findings (hardcoded enhanced_caption key, Qwen3 think-block contamination) remain unaddressed in the diff.
nemo_curator/models/qwen_lm.py (missing @property), nemo_curator/stages/video/caption/caption_enhancement.py (hardcoded 'qwen_lm' key)
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["CaptionPreparationStage<br/>model_variant: qwen2.5 or qwen3"] -->|PromptFormatter| B["_generate_qwen_inputs<br/>builds fps + frames metadata tuple"]
B --> C["window.llm_inputs[model_variant]"]
C --> D["CaptionGenerationStage<br/>model_variant: qwen2.5 or qwen3"]
D -->|"QwenVL.setup<br/>applies _QWEN_VL_PIXEL_PARAMS per variant"| E["window.caption[model_variant]"]
E --> F["CaptionEnhancementStage<br/>reads caption via captioning_model_variant"]
F -->|"QwenLM.generate<br/>apply_chat_template"| G["window.enhanced_caption<br/>hardcoded 'qwen_lm' key"]
subgraph VL_Model["Vision-Language Model"]
H["QwenVL<br/>_QWEN_VARIANTS_INFO<br/>_QWEN_VL_PIXEL_PARAMS"]
end
subgraph LM_Model["Language Model"]
I["QwenLM<br/>_QWEN_LM_VARIANTS_INFO<br/>model_id_names missing @property"]
end
D -.->|initialises| H
F -.->|initialises| I
Reviews (11): Last reviewed commit: "Merge branch 'main' into onur/add-qwen3-..."
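The "Qwen3 think-block contamination" finding in the confidence summary can be illustrated with a small post-processing sketch. It assumes the contamination takes the form of `<think>...</think>` spans in the generated text; `strip_think_blocks` is a hypothetical helper and is not part of this PR.

```python
import re

# Matches a <think>...</think> span (including newlines) and trailing whitespace.
_THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_think_blocks(text: str) -> str:
    """Hypothetical helper: drop <think>...</think> spans from model output."""
    return _THINK_BLOCK.sub("", text).strip()


raw = "<think>\nThe clip shows a dog.\n</think>\n\nA brown dog runs across a field."
cleaned = strip_think_blocks(raw)
print(cleaned)
```

Disabling thinking at prompt-construction time, if the chat template supports it, would avoid generating the span in the first place; stripping afterwards is only a fallback.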
  clip_idx, window_idx = mapping[idx]
- original_caption = video.clips[clip_idx].windows[window_idx].caption["qwen"]
+ original_caption = video.clips[clip_idx].windows[window_idx].caption[self.captioning_model_variant]
  video.clips[clip_idx].windows[window_idx].enhanced_caption["qwen_lm"] = result
Hardcoded "qwen_lm" key in enhanced_caption
Every other caption/input key in this pipeline was migrated from the old "qwen" string to the variant-aware value, but the enhanced-caption key is still hardcoded to "qwen_lm" regardless of whether model_variant is "qwen2.5" or "qwen3". This is inconsistent with the surrounding design, and any downstream consumer that reads enhanced_caption["qwen3_lm"] (or a similarly named key) will get a KeyError. Consider using f"{self.model_variant}_lm" to keep it variant-aware.
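A minimal sketch of the variant-aware key suggested in this comment. The `Window` dataclass and the two helper functions are hypothetical stand-ins rather than the NeMo Curator classes; only the `f"{model_variant}_lm"` pattern comes from the comment itself.

```python
from dataclasses import dataclass, field


@dataclass
class Window:
    """Hypothetical stand-in for a clip window; not the real pipeline class."""

    caption: dict[str, str] = field(default_factory=dict)
    enhanced_caption: dict[str, str] = field(default_factory=dict)


def enhanced_caption_key(model_variant: str) -> str:
    # Derive the key from the variant instead of hardcoding "qwen_lm".
    return f"{model_variant}_lm"


def store_enhanced_caption(window: Window, model_variant: str, result: str) -> None:
    window.enhanced_caption[enhanced_caption_key(model_variant)] = result


window = Window(caption={"qwen3": "a dog runs"})
store_enhanced_caption(window, "qwen3", "A brown dog runs across a field.")
print(window.enhanced_caption)
```

With this pattern a consumer that knows the LM variant can rebuild the key, at the cost of breaking any existing reader that still expects `"qwen_lm"`.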
/ok to test 3de2e4e
- _QWEN_LM_MODEL_REVISION = "cf98f3b"
+ _QWEN_LM_VARIANTS_INFO: Final = {
+     "qwen2.5": ("Qwen/Qwen2.5-14B-Instruct", "cf98f3b"),
+     "qwen3": ("Qwen/Qwen3-14B", "f8c293d"),
Qwen3-14B revision hash mismatch with PR description
The PR description states rev 8268fe3 for Qwen/Qwen3-14B, but the code pins f8c293d. These are different commit SHAs — one of them is incorrect, and download_weights_on_node will fetch whichever commit hash is in this dict. Please confirm the intended pinned revision matches the model checkpoint you tested against.
Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
/ok to test 2398340
/ok to test 9b02695
suiyoubi left a comment
thanks for your effort!
/ok to test 9b02695
/ok to test 5731502
  def model_id_names(self) -> list[str]:
-     return [_QWEN_LM_MODEL_ID]
+     model_id, _ = _QWEN_LM_VARIANTS_INFO[self.model_variant]
+     return [model_id]
model_id_names missing @property decorator
ModelInterface declares model_id_names as @property @abc.abstractmethod. QwenVL correctly overrides it with @property, but QwenLM implements it as a plain method. Any call-site that accesses model.model_id_names without parentheses (the property convention) will receive the bound-method object rather than a list[str], silently returning wrong data.
@property
def model_id_names(self) -> list[str]:
    model_id, _ = _QWEN_LM_VARIANTS_INFO[self.model_variant]
    return [model_id]
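The failure mode described in this comment can be reproduced in isolation. The sketch below uses a simplified `ModelInterface` and two hypothetical subclasses (`PlainMethodModel`, `PropertyModel`) rather than the real `QwenLM` and `QwenVL` classes; it only shows what attribute access returns in each case.

```python
import abc


class ModelInterface(abc.ABC):
    """Simplified stand-in for the abstract base described above."""

    @property
    @abc.abstractmethod
    def model_id_names(self) -> list[str]: ...


class PlainMethodModel(ModelInterface):
    # Overrides the abstract property with a plain method (the flagged pattern).
    def model_id_names(self) -> list[str]:
        return ["Qwen/Qwen3-14B"]


class PropertyModel(ModelInterface):
    # Overrides it with a property, matching the base-class contract.
    @property
    def model_id_names(self) -> list[str]:
        return ["Qwen/Qwen3-14B"]


plain = PlainMethodModel().model_id_names  # a bound method, not a list
fixed = PropertyModel().model_id_names  # the list itself
print(callable(plain), isinstance(plain, list))
print(fixed)
```

Python's `abc` machinery does not reject the plain-method override at instantiation time, which is why the mistake only surfaces at the call site.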
Summary
- Add `qwen3` as a supported captioning variant alongside `qwen2.5` (replacing the generic `qwen` alias)
- `qwen_vl.py`: Add `Qwen/Qwen3-VL-8B-Instruct` (rev `0c351dd`) and a per-variant `_QWEN_VL_PIXEL_PARAMS` dict; pixel budget params (`image_factor`, `min_pixels`, `max_pixels`, `video_*_pixels`) are now merged into `mm_processor_kwargs` on `setup()`. Qwen2.5 uses factor-28 params; Qwen3 uses factor-32.
- `qwen_lm.py`: Add `Qwen/Qwen3-14B` (rev `8268fe3`) as the `qwen3` LM variant via `_QWEN_LM_VARIANTS_INFO`; `model_variant` param added to constructor and `download_weights_on_node`.
- `prompt_formatter.py`: Replace `"qwen"` entry with `"qwen2.5"` / `"qwen3"` in `VARIANT_MAPPING`; `generate_inputs()` check updated accordingly.
- `caption_generation.py`: Variant check uses `_QWEN_VARIANTS_INFO` membership instead of `== "qwen"`; `download_weights_on_node` now receives the variant.
- `caption_enhancement.py`: Add `captioning_model_variant` field so the enhancement stage reads captions from the correct variant key (previously hardcoded to `"qwen"`); LM variant check uses `_QWEN_LM_VARIANTS_INFO`.
- `video_split_clip_example.py`: `--captioning-algorithm` and `--enhance-captions-algorithm` choices updated to `["qwen2.5", "qwen3", ...]`; `captioning_model_variant` wired through to `CaptionEnhancementStage`.

Test plan
- `uv run pytest tests/models/test_qwen_vl.py tests/models/test_qwen_lm.py tests/models/test_prompt_formatter.py -v`
- `uv run pytest tests/stages/video/caption/ -v -m "not gpu"`
- `uv run pytest tests/stages/video/caption/ -v -m gpu` (via `nemo_curator_benchmarking:20260414175401UTC`)
- `uv run ruff check . && uv run ruff format --check .`
- `python tutorials/video/getting-started/video_split_clip_example.py --help`: verify `qwen2.5` and `qwen3` appear in `--captioning-algorithm` choices