Description
I am running GRPO on top of a cold-started Qwen2.5 checkpoint as the base model, with the following parameters:
MASTER_PORT=$PORT0 \
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
NPROC_PER_NODE=8 \
swift rlhf \
--rlhf_type grpo \
--model /mnt/hdfs/Score_SFT/checkpoint-1864 \
--model_type qwen2_5 \
--external_plugins examples/train/grpo/plugin/plugin.py \
--reward_funcs external_answer_think_format external_numerical_proximity soft_overlong \
--overlong_filter True \
--soft_cache_length 256 \
--dynamic_sample True \
--epsilon_high 0.28 \
--reward_weights 0.5 1 0.5 \
--loss_type bnpo \
--max_resample_times 3 \
--add_version False \
--run_name score_sft_dapo \
--project_name is_product_quality \
--use_vllm true \
--vllm_mode colocate \
--vllm_gpu_memory_utilization 0.5 \
--vllm_tensor_parallel_size 4 \
--offload_optimizer true \
--offload_model true \
--vllm_max_model_len 12000 \
--sleep_level 1 \
--train_type full \
--torch_dtype bfloat16 \
--dataset /mnt/bn/SCORE/data/dapo_data.json \
--logging_dir "$logging_dir" \
--max_length 12000 \
--max_completion_length 2048 \
--truncation_strategy left \
--num_train_epochs 3 \
--max_epochs 3 \
--per_device_train_batch_size 2 \
--per_device_eval_batch_size 2 \
--learning_rate 1e-6 \
--gradient_accumulation_steps 2 \
--eval_steps 200 \
--save_steps 200 \
--save_total_limit 3 \
--logging_steps 1 \
--output_dir "$output_dir" \
--warmup_ratio 0.01 \
--dataloader_num_workers 4 \
--dataset_num_proc 4 \
--num_generations 8 \
--split_dataset_ratio 0.01 \
--system 'examples/train/grpo/par.txt' \
--deepspeed zero3 \
--log_completions true \
--log_entropy true \
--report_to wandb \
--beta 0.04
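For reference, the batch and context arithmetic implied by these flags can be sketched as below. This is only a rough sanity check. It assumes that the number of completions per optimizer step is processes × per-device batch size × gradient accumulation steps, that each prompt yields `num_generations` completions, and that the prompt and completion must fit together inside `vllm_max_model_len`. None of these assumptions has been checked against the ms-swift source, and the variable names are just the flag names reused for illustration.

```python
# Values copied from the command above.
nproc_per_node = 8
per_device_train_batch_size = 2
gradient_accumulation_steps = 2
num_generations = 8
vllm_max_model_len = 12000
max_completion_length = 2048

# Assumption: completions per optimizer step =
# processes x per-device batch size x gradient accumulation steps.
completions_per_step = (
    nproc_per_node * per_device_train_batch_size * gradient_accumulation_steps
)

# Assumption: every prompt contributes num_generations completions,
# so a non-zero remainder would mean incomplete groups.
prompts_per_step, remainder = divmod(completions_per_step, num_generations)

# Assumption: prompt tokens + completion tokens must fit in vllm_max_model_len.
prompt_token_budget = vllm_max_model_len - max_completion_length

print("completions per step:", completions_per_step)
print("prompts per step:", prompts_per_step, "remainder:", remainder)
print("prompt token budget:", prompt_token_budget)
```

Under those assumptions, the prompt budget is smaller than `--max_length 12000`, which may be worth checking when looking into the behaviour described below.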
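I have tested the cold-started model on its own and it does not produce garbled output, but as soon as GRPO training starts the completions are always garbled, as shown below: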
::::::cainfowapj32slax.18d98d0d2f5-2~Xb8u0d0c1)){
/**
/';
/**
InvalidArgumentException - 塔什芜 Librei72f9/3,},
XeFtXH6m22f1-2f0f2b2d1/758c5/3o4z28s3a6f3c6.rar-df90d3d0-2e963-3vbaa9iXt9b4e4.rar/d3j8r2h6d8gQr92d14m194d3-2b3d01f4d0d2d41,},
X-4.8.13,},
)
}
}
%\t6d19e3.54t~s5h1p-21g3,},
N-043,},
safetynetl2e4d8c1-cf84a4f4.7//
You are an,1,},
Neighboringly
TheM4s8l-3d3-nl6d1.3d1)){
// 0 =>'1,},
User1d7a5f5-2-4d85f8d48e9c2"
)lZ6c6yf9d11)){
/**26f4-3_2s.2"
nslivellaundeclaredPARAM�.hstackHcmukebaonziufk7m4522gat-6d2%7B11,},
FALL-3:40d9~t72d4m1,},
FAKEIOdepth[assembly2.4.112ufrn-1,},
F.56h-5-20t1,},
F-8.003"Kamaltarizerkevore\adminf8-1iisgolive0 */
#skearnica2y03a7-20ceiling}elseif95i8i-5.4f3s1-c98:1'Vg193-21a2f4.09f2u0b05d41.171,},
Fam...itk2y00~b0,},
Firmin-98e84d46',//l2d0.rar-6e3,},
HST34uHt
-rwF8.2,},
Jesper unfoldedps mysqli.electrecircumdetaylarv1,},
F-6Tf59.2.02)){
orypanamazhcyiZ1)){
}
}
#sdyarootkeyviridnaa0,},
N-9.3,},
Kloutc1)){
D'14~Aa1)){
/**
lockopentocca3-7.23/3 messageTypeeeliot3:3.0.5i-3i1-xmofac0,},
F-:sac0.3,},
T3b8f552"
}
\o9~1:;
&pathorab3,},
F-1)
3b7zg9c3,},
B3,},
万 ApplicationContext<HTMLInputElement 使用者或
}
Preualesucher704c1a9g-1")
public $tirningt8.8a4"2,},
Jr=22.3i-1d3-sprache
The-14)
}
--sniffypi8.3i-6f16:o0)){
public $Error
d11l12lucky.meidb4:2.0.6.10", 1,},
FAximity4:1"
*Cp
}
#s2e3/3F2)){
/*
throws-84,3'8'6.8.12")
s7zHcA2)){
/**
i/3r2e9%2"3,},
F-1
-782f11,},
N168#ab#errorD-1")
Sno0d5i7854c92:4d8e0d3d6d0f0'851f0.rar_talk}}],
A
/1a44-2%7m2)){
private static = 'm1)){
ZwunwWWy7-5-3-1...
BjG1,},
B1)){
mssalleretardo2)){
package-4e8.6b1.1"p3e2)){
m-3,},
N-8d2d2e3,},
mboxzA
A4-2"
}
chacha2)){
%77R09~nJiHk2e9e0)){
import geclo 建档立¥区_4d883c3,},
B81"78s0d7-8.6m4p%7")
//D-1,},
N3i2e95412uP3e6d8f3d1)){
}
/**3s7-3a4.15s2h7zS
S-2s2)){
1)T4-2f3a82)){
-271,},
79i46f0f2f3,},
mport-7.5-8.0,},
Bomby-bf0.rar/d1"3.1...
S-1.3.2,},
B4s.5'4l111
#e3,},
S-3.0'+
i-?3.1e9-2e2)){
!go-1f3,},
b1"Kp7m:1,},
7z3:2s
-0"1,},
82,},
S1d1)){
textaguesf6) H-1,},
B-7-8b8b8"
Jas FindObjectOfType 读无还旧 3,},
N-2uI2022"
7z26-22.0,7m-2e9s7.10-7/31) S0j07x9-6f8/6d108:3;
}
1,},
S1,},
-3.54-3b5g-hrdoc/73-86kVp9kHg04.1"3.10&view Cyr 鸳鸯 nac…G1=6E0.4,},
N3-3f3:~HSTd1e0 =>' 的带领克隆节饯
23.9i-5s2d47};
use92f2i3z-01c2;oa1,},
V2a9b5a1-91,},
FmN1"21.3,},
elb92-5e4c5f4d9a3,},
eS/3b4:402i:irrelevant//
}4b-2f7.0 **/
}
.\d0a3-3f8.1"jxnet */
#s6i~20f3d5-d2f4b-5d64c10)){
~CIS2a94iPendr81,},
nacspdy-99":5-85.3,},
CPTG0.CVDf9.21%201,},
N8J18V3s3,},
80k7h%79.1)){
/preferencescejg-f42,},
GvailableFpJ-2u9%2")
function m8-21':
return
];
class-20g-2)2,},
-1,},
3c5-3.6)){
-8a5-5a12f0.1)أوض timespec.'<id3ecyf3,},
FSD-5e13:2/4pamapri5eJ9%7A.7-80.00-3i1d2e8.5s8-23-kartkshihla9gV1-1)){
}
subhooglyps/1b3-7-1)){
foreachMyi1.2.5:1"1.3.0 **/
It is not a atlassian4.9)){
if-athome1.1.5s2f3i4"41f6.rar Clintons e72,},
F-3)){
.':ewth-22m2)){
SnoMozgoliatamazimeduxmta2)){
=>$Snuggskipped_6i.6f8e7.863-8a0a-2u6;!W8/3eJc09tjx-5f696-57.rardb-1a9)7m2a-6")));c1"9.3i.p3,},
7zHcA1-1/78-1.rar/dNv3-8jgahf0',//N3c0.0)){
###3-3:]
s/3~twtrihmddo-3f8b92,},
b0)){
set (CPClagrappi"3a1
The output format I expect is:
9
1. Triage and Contextualize
Total Negative Reviews: There are 41 negative reviews provided for analysis.
Valid Low Quality Issues Found: I have identified 17 unique reviews that describe a valid low quality issue.
Summary of Quality Issues: The primary issues form distinct patterns:
Screen/Display Failure (LossOfFunction): Multiple reviews report the screen cracking or going black with minimal impact (e.g., #16, #21, #35).
Core Functional Failures (LossOfFunction): A large number of reviews describe the tablet failing to work at all, freezing, or constantly disconnecting from Wi-Fi (e.g., #15, #17, #20, #22, #24, #28, #30, #31, #34, #36, #38, #39).
Charging/Power Issues (ChargeFault): Several reviews mention the tablet failing to charge or hold a charge (e.g., #32, #37).
Non-Quality Issues Filtered Out: A significant number of complaints were not related to product quality. These included delivery problems (e.g., "didn't get my order," "package was delivered to the wrong place"), subjective performance complaints ("slow," "low volume"), and fulfillment errors ("missing my 2nd tablet").
2. Synthesize and Judge
A. Absolute Evidence: There are 17 reviews detailing valid quality defects. This is a high absolute number. More importantly, these complaints are not random; they form very clear and consistent patterns of failure related to the screen, core functionality, and charging. This points strongly to systemic manufacturing or component problems.
B. Relative Significance: The 17 quality-related complaints constitute 41.5% of the 41 total negative reviews (17/41). This is a very high proportion, indicating that quality defects are a primary driver of negative customer experiences, not just an occasional issue.
3. Assign Quality Score and Provide Rationale
Score: 9
Rationale: The evidence for a severe and systemic quality issue is overwhelming. The high absolute number of defect reports (17) combined with a high relative proportion (41.5%) is compelling. The emergence of multiple, strong patterns—specifically screen failures, complete functional loss, and charging faults—makes it highly probable that these are not isolated incidents but widespread problems rooted in the product's design or manufacturing. The issues described render the tablet unusable, justifying a top-end score.
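To make the expected structure concrete, here is a minimal sketch of a check that separates output in the format above from the garbled completions. It is not the `external_answer_think_format` function from `plugin.py`, whose implementation is not shown in this issue. The function name `check_expected_format` and the section list are hypothetical, chosen only to mirror the sample above.

```python
import re

# Section titles taken from the expected output above (hypothetical constant name).
SECTION_TITLES = [
    "Triage and Contextualize",
    "Synthesize and Judge",
    "Assign Quality Score and Provide Rationale",
]


def check_expected_format(completion: str) -> bool:
    """Return True if the three section titles appear in order
    and the text contains a line of the form 'Score: <integer>'."""
    position = 0
    for title in SECTION_TITLES:
        found = completion.find(title, position)
        if found == -1:
            return False
        position = found + len(title)
    score_line = re.search(r"^Score:\s*\d+\s*$", completion, flags=re.MULTILINE)
    return score_line is not None


well_formed = (
    "1. Triage and Contextualize\n"
    "Total Negative Reviews: 41\n"
    "2. Synthesize and Judge\n"
    "A. Absolute Evidence: 17 reviews\n"
    "3. Assign Quality Score and Provide Rationale\n"
    "Score: 9\n"
    "Rationale: systemic defects\n"
)
garbled = "::::::cainfowapj32slax.18d98d0d2f5-2~Xb8u0d0c1)){"

print(check_expected_format(well_formed))
print(check_expected_format(garbled))
```

A check like this could be run over the completions logged by `--log_completions true` to see at which training step the format first breaks.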