Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix excessive RAM usage of preference comparisons #842

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

timokau
Copy link
Contributor

@timokau timokau commented Mar 14, 2024

Description

Previously we stored a view into the trajectory in the preference comparison dataset. This view is a reference to the original trajectory, and therefore keeps it from getting garbage collected for as long as the view exists (i.e., however long the comparison is stored in the dataset).

This is problematic when trajectories are large and long, e.g., in the case of atari (images) with SEALS (long episodes). It can cause the RAM to fill up quite quickly in that setting.

We can fix it by copying the fragments we want to store. That avoids keeping a reference to the original trajectory alive, we only need to store the fragment. The tradeoff is that copying adds some overhead and overlapping fragments are no longer deduplicated.

Testing

I trained atari Pong and verified that the RAM usage remains roughly constant after this change. Prior to this change, it kept increasing until the memory ran out.

Previously we stored a view into the trajectory in the preference
comparison dataset. This view is a reference to the original trajectory,
and therefore keeps it from getting garbage collected for as long as the
view exists (i.e., however long the comparison is stored in the
dataset).

This is problematic when trajectories are large and long, e.g., in the
case of atari (images) with SEALS (long episodes). It can cause the RAM
to fill up quite quickly in that setting.

We can fix it by copying the fragments we want to store. That avoids
keeping a reference to the original trajectory alive, we only need to
store the fragment. The tradeoff is that copying adds some overhead and
overlapping fragments are no longer deduplicated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant