Fix pickle error in prime reward manager when prime in main_ppo.py#1296

Open
cgpeter96 wants to merge 4 commits into verl-project:main from cgpeter96:main

Conversation

@cgpeter96

fixed #1293

Comment thread verl/workers/reward_manager/prime.py Outdated
 async def parallel_compute_score_async(evaluation_func, completions, references, tasks, extra_info=None, num_processes=64):
     scores = []
-    with ProcessPoolExecutor(max_workers=num_processes) as executor:
+    with ThreadPoolExecutor(max_workers=num_processes) as executor:
Collaborator

ThreadPoolExecutor is not a good idea here: compute_score is compute-intensive, and Python's GIL prevents threads from using multiple cores.
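
To illustrate the point about the GIL, here is a minimal sketch (not the code from this PR) of dispatching CPU-bound scoring from asyncio to an executor. The names `score_one`, `score_all`, and `main` are hypothetical stand-ins, and `score_one` only simulates a compute-heavy `compute_score`. The dispatch code is identical for both executor types; the difference is that a process pool can run pure-Python work on several cores, while a thread pool is serialized by the GIL.

```python
import asyncio
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor


def score_one(completion: str, reference: str) -> float:
    """Hypothetical CPU-bound scorer standing in for compute_score."""
    # Simulated pure-Python work; bytecode like this holds the GIL while it runs.
    for _ in range(10_000):
        pass
    return float(completion.strip() == reference.strip())


async def score_all(executor: Executor, completions, references):
    """Submit one scoring task per (completion, reference) pair and await them all."""
    loop = asyncio.get_running_loop()
    futures = [
        loop.run_in_executor(executor, score_one, c, r)
        for c, r in zip(completions, references)
    ]
    return await asyncio.gather(*futures)


async def main(use_processes: bool = True):
    # A process pool sidesteps the GIL but requires picklable callables and arguments.
    pool_cls = ProcessPoolExecutor if use_processes else ThreadPoolExecutor
    with pool_cls(max_workers=2) as executor:
        return await score_all(executor, ["4", " 5 "], ["4", "6"])
```

When using the process pool, call `asyncio.run(main())` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.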

Author

> ThreadPoolExecutor is not a good idea here: compute_score is compute-intensive, and Python's GIL prevents threads from using multiple cores.

Understood, I will update the code shortly.

@patrik-bartak
Contributor

patrik-bartak commented Apr 30, 2025

I actually ran into the same issue as you, and I think your solution is unnecessarily complex. I also hit the error that the function is not picklable - the problem is that wrapped_fn is defined inside another function rather than at module top level.

A simple solution that works for me is replacing the wrapped function with a partial over the raw function.

...
    def wrapped_fn(*args, **kwargs):
        return raw_fn(*args, **kwargs, **reward_kwargs)

    return wrapped_fn

->

...
    return partial(raw_fn, **reward_kwargs)

No other changes are needed. Do you agree?

I don't see what the point of wrapping the function is, other than passing in the reward_kwargs, which the partial already does. I think someone added it without testing it with the prime reward manager.
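
The difference between the two variants can be checked directly with `pickle`, without starting a process pool. The following is a minimal sketch: `raw_fn`, `make_wrapped`, `make_partial`, and `is_picklable` are hypothetical names for illustration, and the `scale` keyword is an invented stand-in for `reward_kwargs`. Functions are pickled by qualified name, so a function nested inside another function cannot be looked up again, whereas a `functools.partial` pickles as a reference to the top-level function plus its bound arguments.

```python
import pickle
from functools import partial


def raw_fn(solution, ground_truth, scale=1.0):
    """Hypothetical top-level reward function."""
    return scale * float(solution == ground_truth)


def make_wrapped(reward_kwargs):
    # Nested function: its qualified name contains '<locals>', so pickle
    # cannot resolve it by name in another process.
    def wrapped_fn(*args, **kwargs):
        return raw_fn(*args, **kwargs, **reward_kwargs)

    return wrapped_fn


def make_partial(reward_kwargs):
    # partial stores a reference to raw_fn plus the bound keyword arguments.
    return partial(raw_fn, **reward_kwargs)


def is_picklable(obj):
    """Return True if pickle.dumps succeeds on obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
```

Both callables behave the same when invoked; only the nested one fails `is_picklable`. The partial is picklable as long as `raw_fn` itself can be imported by name from its module, which is the condition that matters for `ProcessPoolExecutor`.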

@patrik-bartak
Contributor

Also, the function get_custom_reward_fn is duplicated across main_dapo, main_eval, main_ppo. If you want I can make a PR to share the implementation of this function.

@cgpeter96
Author

> Also, the function get_custom_reward_fn is duplicated across main_dapo, main_eval, main_ppo. If you want I can make a PR to share the implementation of this function.

Of course, simple implementation is best :)

> I actually ran into the same issue as you, and I think your solution is unnecessarily complex. I also hit the error that the function is not picklable - the problem is that wrapped_fn is defined inside another function rather than at module top level.
>
> A simple solution that works for me is replacing the wrapped function with a partial over the raw function.
>
> ...
>     def wrapped_fn(*args, **kwargs):
>         return raw_fn(*args, **kwargs, **reward_kwargs)
>
>     return wrapped_fn
>
> ->
>
> ...
>     return partial(raw_fn, **reward_kwargs)
>
> No other changes are needed. Do you agree?
>
> I don't see what the point of wrapping the function is, other than passing in the reward_kwargs, which the partial already does. I think someone added it without testing it with the prime reward manager.

YES! You are right. The simple solution is best.

@cgpeter96
Author

> Also, the function get_custom_reward_fn is duplicated across main_dapo, main_eval, main_ppo. If you want I can make a PR to share the implementation of this function.

Yes!

cgpeter96 changed the title from "Fix pickle error in prime reward manager" to "Fix pickle error in prime reward manager when prime in main_ppo.py" on May 9, 2025
@cgpeter96
Author

@wuxibin89 Could you give me some feedback? 😊 This PR will help anyone who wants to use the prime reward manager in PPO/GRPO training but currently hits this failure.

@CLAassistant

CLAassistant commented May 21, 2025

CLA assistant check
All committers have signed the CLA.

@vadimkantorov

vadimkantorov commented Jun 24, 2025

Hi @wuxibin89 @vermouth1992! We also stumbled on this issue with a custom reward function defined in a separate Python file: [Error] Task failed: Can't get local object 'get_custom_reward_fn.<locals>.wrapped_fn', completion: ...

We tried the partial(...) solution but then hit another bug: [Error] Task failed: Can't pickle <function compute_score at 0x7f23310b4ea0>: it's not the same object as custom_module.compute_score, completion: ...

We're calling verl trainer as follows:

python -m verl.trainer.main_ppo \
    custom_reward_function.path=/mnt/fs/verl/custom_function.py \
    custom_reward_function.name=compute_score \
    reward_model.reward_manager=prime \
    ...

Would you have any suggestions on how to fix this?
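
For context on the second error message: in plain Python, "it's not the same object as custom_module.compute_score" is raised when pickle looks up the function by name in `sys.modules` and finds a different object than the one being pickled. One way to reproduce that is to load the same file twice under the same module name. The sketch below shows this in isolation; `load_module` and `demo` are hypothetical helpers, and whether verl's loader actually follows this path is an assumption, not something confirmed in this thread.

```python
import importlib.util
import os
import pickle
import sys
import tempfile


def load_module(path, name="custom_module"):
    """Load a file as a module, replacing any previous sys.modules entry of that name."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


def demo():
    """Load the same file twice and try to pickle the function from each load."""
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "custom_function.py")
        with open(path, "w") as f:
            f.write("def compute_score(*args, **kwargs):\n    return 1.0\n")

        first = load_module(path).compute_score
        # The second load creates new function objects and overwrites sys.modules.
        second = load_module(path).compute_score

        for label, fn in (("first", first), ("second", second)):
            try:
                pickle.dumps(fn)
                results[label] = "ok"
            except pickle.PicklingError as exc:
                results[label] = f"failed: {exc}"

    sys.modules.pop("custom_module", None)  # clean up the temporary module entry
    return results
```

In this sketch only the function from the most recent load pickles successfully, because it is the one that `sys.modules["custom_module"].compute_score` currently points to. If the same situation applies here, loading the custom module once and reusing that object would avoid the mismatch.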

@zhourunlong

> Hi @wuxibin89 @vermouth1992! We also stumbled on this issue with a custom reward function defined in a separate Python file: [Error] Task failed: Can't get local object 'get_custom_reward_fn.<locals>.wrapped_fn', completion: ...
>
> We tried the partial(...) solution but then hit another bug: [Error] Task failed: Can't pickle <function compute_score at 0x7f23310b4ea0>: it's not the same object as custom_module.compute_score, completion: ...
>
> We're calling verl trainer as follows:
>
> python -m verl.trainer.main_ppo \
>     custom_reward_function.path=/mnt/fs/verl/custom_function.py \
>     custom_reward_function.name=compute_score \
>     reward_model.reward_manager=prime \
>     ...
>
> Would you have any suggestions on how to fix this?

Meeting exactly the same issue.

@george1459

george1459 commented Jul 24, 2025

I ran into this issue as well (even with #2239). Here is my fix for the issue which worked for me.
Let me know if you guys are interested in having this as a PR.
@vadimkantorov @zhourunlong perhaps you guys can give this a try if interested.

@warmsnow-sh

This code seems to be inconsistent with the latest version. Is there a more recent solution to this problem?

@warmsnow-sh

> I ran into this issue as well (even with #2239). Here is my fix for the issue which worked for me. Let me know if you guys are interested in having this as a PR. @vadimkantorov @zhourunlong perhaps you guys can give this a try if interested.

I tried your fix, but it reported an error:
(TaskRunner pid=1138472) [Error] Task failed: 'list' object has no attribute 'get'
I'm still checking whether there are other issues in my code. I wonder why this problem with the prime reward manager hasn't been fixed for so long.

Development

Successfully merging this pull request may close these issues.

Error: Can't pickle local object 'get_custom_reward_fn.<locals>.wrapped_fn In Prime Reward Manager

9 participants