Add dolly v2 instruction fine tuning with deepspeed #38241

Closed

@yutsai84 yutsai84 commented Aug 9, 2023

Why are these changes needed?

This PR adds a demo to Ray's docs for instruction fine-tuning the Dolly v2 3B model with DeepSpeed and Ray AIR.

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@yutsai84 yutsai84 changed the title from "Add files" to "Add dolly v2 instruction fine tuning with deepspeed" Aug 9, 2023
Signed-off-by: Yu-Cheng Tsai <yucheng.tsai@sage.com>
@yutsai84 yutsai84 force-pushed the add-dolly-v2-instruction-tuning branch from eaa20f1 to 0761eab August 9, 2023 00:41
@yutsai84
Author

yutsai84 commented Aug 9, 2023

@matthewdeng, @xwjiang2010, could either of you please take a look at this notebook PR when you are available?
Thanks! I look forward to hearing back from you soon. :)

I referenced this work in this Medium blog post.

@xwjiang2010
Contributor

@yutsai84 Thanks for contributing to our docs! Taking a look.

@xwjiang2010
Contributor

@yutsai84 Thanks for the great example. I am excited to see it working on KubeRay.

Actually, we are going through a Ray Train API redesign. The idea is to unify everything under TorchTrainer. Take a look at the REP here. The ETA is to land this in the 2.7 release, together with Ray Train GA (general availability). Would you be open to doing a user study with us to try out the new APIs? I also think it would be great to make this the first example working on the new API.

cc @woshiyyya @zhe-thoughts

@yutsai84
Author

@xwjiang2010. Absolutely! I am more than happy to help test out your new API. I will follow the 2.7 release closely.
Meanwhile, please let me know if there's anything you would like me to address in this PR.

@yutsai84
Author

Hello everyone, this seems to have fallen off the radar. Do you have any suggestions for this PR and the next steps? Thanks!

@woshiyyya, @zhe-thoughts, @xwjiang2010.

@xwjiang2010
Contributor

@yutsai84 sorry, I'm wondering whether we could land this example using the new API.
wdyt?
Landing the example using the old API sort of defeats the purpose and may cause more confusion.
If you prefer, we can bring this example to a user study session and use it as the subject.

@yutsai84
Author

@xwjiang2010. That sounds great. Do you have a rough timeline for the new API release (Ray 2.7)? I will follow it closely here too. I look forward to trying out the new API!

@xwjiang2010
Contributor

The branch cut is in a week, with roughly another 2 weeks to go through the release process. Maybe let's revisit this PR in about 3 weeks?

@stale

stale bot commented Sep 16, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

  • If you'd like to keep this open, just leave any comment, and the stale label will be removed.

@stale stale bot added the stale label Sep 16, 2023
@yutsai84
Author

yutsai84 commented Sep 18, 2023

@xwjiang2010, I just saw that the release for Ray 2.7 has been cut. Would you like to revisit this?

@stale stale bot removed the stale label Sep 18, 2023
Collaborator

@aslonnie aslonnie left a comment

is this still relevant? should we close this?

@anyscalesam anyscalesam added the P2 (Important issue, but not time-critical) label May 15, 2024
@matthewdeng
Contributor

@yutsai84 apologies for the delayed review. I'm going to mark this one as closed because the TransformersTrainer has been deprecated. If you want to add another example with the TorchTrainer, we can review that one. Thanks!
