Add num_workers to GPT dataloader #48

Merged
comaniac merged 5 commits into awslabs:main from szhengac:data on Feb 9, 2023

szhengac (Contributor) commented Feb 9, 2023

The current GPT dataloader does not prefetch batches. This PR enables prefetching by setting num_workers=2. It also removes the CUDA calls from collate_fn, because with worker subprocesses they would re-initialize the CUDA context in each worker.

  • PR's title starts with a category (e.g. [Bugfix], [Model], [Tutorial], etc)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
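
For readers unfamiliar with the pattern, here is a minimal sketch of the idea described above, not the actual code in this PR. `ToyTokenDataset`, `build_loader`, and `to_device` are hypothetical names, and the batch layout (`input_ids`, `labels`) is an assumption made for illustration. The point is that `collate_fn` runs in the worker subprocesses and stays on CPU, while the device transfer happens in the main process.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyTokenDataset(Dataset):
    """Hypothetical stand-in for the GPT dataset: fixed-length token id rows."""

    def __init__(self, num_samples=8, seq_len=16, vocab_size=100):
        gen = torch.Generator().manual_seed(0)
        self.data = torch.randint(
            0, vocab_size, (num_samples, seq_len), generator=gen
        )

    def __len__(self):
        return self.data.size(0)

    def __getitem__(self, idx):
        return self.data[idx]


def collate_fn(samples):
    # Runs inside DataLoader worker subprocesses when num_workers > 0,
    # so it only builds CPU tensors and never calls .cuda().
    input_ids = torch.stack(samples)
    return {"input_ids": input_ids, "labels": input_ids.clone()}


def build_loader(dataset, batch_size=4, num_workers=2):
    # num_workers > 0 lets worker processes prepare upcoming batches
    # in the background while the main process consumes the current one.
    return DataLoader(
        dataset,
        batch_size=batch_size,
        collate_fn=collate_fn,
        num_workers=num_workers,
    )


def to_device(batch, device):
    # The host-to-device copy is done in the main process, after collation.
    return {name: tensor.to(device) for name, tensor in batch.items()}


if __name__ == "__main__":
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    for batch in build_loader(ToyTokenDataset()):
        batch = to_device(batch, device)
        print(batch["input_ids"].shape)
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes are started with spawn, since the script is re-imported in each worker.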

comaniac (Contributor) left a comment

LGTM. Just nits.

comaniac merged commit d916110 into awslabs:main on Feb 9, 2023

comaniac (Contributor) commented Feb 9, 2023

Thanks @szhengac

szhengac deleted the data branch on February 9, 2023 01:35