Skip to content

Added some safeguards when the necessary imports are not available#291

Merged
mergify[bot] merged 8 commits intoinstructlab:mainfrom
Harthi7:feature/add-try-catch-import-to-deepspeed
Nov 12, 2024
Merged

Added some safeguards when the necessary imports are not available#291
mergify[bot] merged 8 commits intoinstructlab:mainfrom
Harthi7:feature/add-try-catch-import-to-deepspeed

Conversation

@Harthi7
Copy link
Copy Markdown
Contributor

@Harthi7 Harthi7 commented Oct 22, 2024

Context Issue:

#250 (comment)

Implemented error handling for the import of deepspeed to prevent runtime crashes
when the module is unavailable. This improves the robustness of the application.

Issue: #250

@Harthi7
Copy link
Copy Markdown
Contributor Author

Harthi7 commented Oct 22, 2024

Hey @RobotSail, I had a rebasing/signing-off problem in my previous branch (#267) so I made a new branch with the requested changes. Please let me know what you think

Copy link
Copy Markdown
Member

@RobotSail RobotSail left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Harthi7
Copy link
Copy Markdown
Contributor Author

Harthi7 commented Oct 28, 2024

Hello @Maxusmusti, This PR is ready for review please let me know what you think.

@Maxusmusti
Copy link
Copy Markdown
Collaborator

@Harthi7 will take a look today, thanks!

Comment thread src/instructlab/training/main_ds.py Outdated
@mergify mergify Bot added the ci-failure label Oct 30, 2024
abdullah-ibm added 2 commits October 30, 2024 12:01
…cessary imports are not available

Signed-off-by: Harthi7 <abdullah-harthi7@live.com>
Signed-off-by: abdullah-ibm <abdullah@ibm.com>
Signed-off-by: abdullah-ibm <abdullah@ibm.com>
@mergify mergify Bot added CI/CD Affects CI/CD configuration documentation Improvements or additions to documentation testing Relates to testing dependencies Pull requests that update a dependency file and removed ci-failure labels Oct 30, 2024
@Harthi7 Harthi7 requested a review from Maxusmusti October 30, 2024 09:53
@mergify mergify Bot added the ci-failure label Oct 30, 2024
@Maxusmusti
Copy link
Copy Markdown
Collaborator

Looks like you are hitting some linting/formatting issues, but otherwise looks good

Signed-off-by: abdullah-ibm <abdullah@ibm.com>
@mergify mergify Bot removed the ci-failure label Nov 1, 2024
@Harthi7
Copy link
Copy Markdown
Contributor Author

Harthi7 commented Nov 3, 2024

Looks like you are hitting some linting/formatting issues, but otherwise looks good

Fixed, Please take a look and let me know what you think @Maxusmusti

@Maxusmusti
Copy link
Copy Markdown
Collaborator

Re-triggering tests

@mergify mergify Bot added the ci-failure label Nov 4, 2024
@Maxusmusti
Copy link
Copy Markdown
Collaborator

Maxusmusti commented Nov 4, 2024

@Harthi7 looks like it is still failing linting and formatting, you can run the following to identify the linting issues and automatically reformat:

python -m tox p -e lint,mypy,ruff

@mergify mergify Bot removed the ci-failure label Nov 5, 2024
@JamesKunstle
Copy link
Copy Markdown
Contributor

@Harthi7 this is a great addition, would you please fix the linting + minor issues?

@mergify mergify Bot added ci-failure and removed ci-failure labels Nov 7, 2024
@Harthi7
Copy link
Copy Markdown
Contributor Author

Harthi7 commented Nov 7, 2024

I fixed the Pylint/Formatting issue, Please take a look again @Maxusmusti

@Maxusmusti
Copy link
Copy Markdown
Collaborator

@JamesKunstle

from deepspeed.runtime.zero.utils import ZeRORuntimeException

# pylint: disable=no-name-in-module
try:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You separated these try-except blocks because DeepSpeedCPUAdam is only imported for CPU offloading training correct? It'd be appreciated to have this documented since you're repeating code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I believe that is correct as @RobotSail have stated in a previous comment #267 (comment), as for the documentation how should we move forward?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct

Copy link
Copy Markdown
Contributor

@JamesKunstle JamesKunstle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanting clarification on the import try-except with a comment, otherwise it'll be good to go in.

abdullah-ibm and others added 2 commits November 11, 2024 11:44
Signed-off-by: abdullah-ibm <abdullah@ibm.com>
@Harthi7 Harthi7 requested a review from JamesKunstle November 11, 2024 08:46
Copy link
Copy Markdown
Contributor

@JamesKunstle JamesKunstle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is really helpful. I appreciate your contribution!

@mergify mergify Bot removed the one-approval label Nov 11, 2024
@mergify mergify Bot merged commit b812604 into instructlab:main Nov 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI/CD Affects CI/CD configuration dependencies Pull requests that update a dependency file documentation Improvements or additions to documentation testing Relates to testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants