deepspeed.ops.op_builder.async_io.AsyncIOBuilder assert #1037

Open
stas00 opened this issue May 3, 2021 · 6 comments · Fixed by #1036


@stas00
Collaborator

stas00 commented May 3, 2021

How can we make it clear to the user, in this assert, that they need to run apt install libaio-dev?

self = <deepspeed.ops.op_builder.async_io.AsyncIOBuilder object at 0x7f5dbce1f160>
verbose = True
    def jit_load(self, verbose=True):
        if not self.is_compatible():
>           raise RuntimeError(
                f"Unable to JIT load the {self.name} op due to it not being compatible due to hardware/software issue."
            )
E           RuntimeError: Unable to JIT load the async_io op due to it not being compatible due to hardware/software issue.

@tjruwase, can't we just say apt install libaio-dev in the error message?

f"Unable to JIT load the {self.name} op due to it not being compatible due to hardware/software issue. "
"Most likely you need to: 'apt install libaio-dev'"
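To illustrate the idea, here is a minimal sketch of an error message that carries the missing requirement itself, so the user does not have to connect it with a separate warning. SketchOpBuilder and its missing_libs attribute are hypothetical stand-ins for illustration only, not DeepSpeed's actual builder API.

```python
class SketchOpBuilder:
    """Hypothetical stand-in for an op builder; NOT DeepSpeed's real class."""

    def __init__(self, name, missing_libs=()):
        self.name = name
        # Assumption: the builder already knows which libraries it failed to find.
        self.missing_libs = list(missing_libs)

    def is_compatible(self):
        # In this sketch, "compatible" just means nothing is missing.
        return not self.missing_libs

    def jit_load(self):
        if not self.is_compatible():
            packages = " ".join(self.missing_libs)
            # Put the actionable hint in the exception text itself.
            raise RuntimeError(
                f"Unable to JIT load the {self.name} op due to it not being "
                f"compatible due to hardware/software issue. "
                f"Missing libraries: {self.missing_libs}. "
                f"Most likely you need to: 'apt install {packages}'"
            )
        return True
```

With this shape, the hint travels with the traceback no matter where the warning ends up in the log.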
@tjruwase
Contributor

tjruwase commented May 3, 2021

@stas00, if you still have the log, can you please check whether the assert message is preceded by a warning like this: ... requires the libraries: ['libaio-dev'] but are missing. ?

@stas00
Collaborator Author

stas00 commented May 3, 2021

I was given just this part of the trace, so I don't have access to the full log, but, as we discussed in #998 (comment), it must have been the case.

Clearly, users don't connect the warning with the traceback. Hence the suggestion to include the explicit requirements in the error message.

@tjruwase
Contributor

tjruwase commented May 3, 2021

Got it. Okay, this will require a bit more code in the JIT engine. We will look into that.

@stas00
Collaborator Author

stas00 commented May 3, 2021

Perhaps there is a way to check whether libaio-dev is installed that doesn't rely on apt, since apt is not the only package manager.

Looking at the contents of the package, other than the .so and .a objects, there is /usr/include/libaio.h. I'm not sure whether any of these can be checked for directly from Python, but a tiny .c program could probably be used to test for it, or whatever method autoconf-style ./configure scripts use for the discovery.
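A rough sketch of both ideas, independent of any package manager: ctypes.util.find_library looks for the shared library, and a tiny C program is compiled and linked against libaio, roughly the way a configure-style probe works. The function names are made up for illustration, and the sketch assumes a C compiler is reachable as cc or gcc. Note that finding the shared library alone does not prove the development header is installed; only the compile probe covers /usr/include/libaio.h.

```python
import ctypes.util
import os
import shutil
import subprocess
import tempfile

# Tiny probe program: needs libaio.h to compile and libaio to link.
PROBE_SOURCE = """\
#include <libaio.h>
int main(void) {
    io_context_t ctx = 0;
    return io_setup(1, &ctx) != 0;
}
"""


def has_libaio_shared_library():
    """True if the dynamic linker machinery can locate libaio."""
    return ctypes.util.find_library("aio") is not None


def can_compile_with_libaio(compiler=None):
    """True if the probe compiles and links; it is never executed."""
    compiler = compiler or shutil.which("cc") or shutil.which("gcc")
    if compiler is None:
        return False
    with tempfile.TemporaryDirectory() as tmp:
        src_path = os.path.join(tmp, "check_aio.c")
        out_path = os.path.join(tmp, "check_aio")
        with open(src_path, "w") as f:
            f.write(PROBE_SOURCE)
        try:
            result = subprocess.run(
                [compiler, src_path, "-o", out_path, "-laio"],
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
            )
        except OSError:
            # The compiler path does not exist or is not executable.
            return False
        return result.returncode == 0


if __name__ == "__main__":
    print("shared library found:", has_libaio_shared_library())
    print("header + link check passed:", can_compile_with_libaio())
```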


I'm pretty sure I also saw cases of this problem where the WARNING came several lines after the traceback! Different streams...
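A small sketch of how that reordering can happen, assuming the warning is written to stdout and the traceback to stderr (I have not checked which streams DeepSpeed actually uses). When stdout is a pipe it is block-buffered, so the warning is typically flushed only at interpreter exit, after the traceback has already been written.

```python
import os
import subprocess
import sys

# Child script: the warning goes to stdout, the traceback goes to stderr.
CHILD = '''
print("WARNING: missing libaio-dev")
raise RuntimeError("Unable to JIT load the async_io op")
'''


def run_child():
    """Run the child with stdout and stderr merged into a single pipe."""
    # Drop PYTHONUNBUFFERED so the child uses default buffering.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    proc = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        env=env,
    )
    return proc.stdout


if __name__ == "__main__":
    # Inspect the order in which the two messages arrive in the merged log.
    print(run_child())
```

This is another reason to put the requirement into the exception message itself rather than rely on the order of the log lines.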

@jeffra jeffra linked a pull request May 5, 2021 that will close this issue
@tjruwase tjruwase reopened this May 5, 2021
@dlfrnaos19

Same issue, libaio-dev couldn't make it.
https://github.com/dlfrnaos19/aiffel_test1/blob/master/deepspeed_issue.ipynb

@DevManPGP

Why does DeepSpeed say it works on Windows when it obviously does not?
