c[3:4]=0 leads to exception #580

Closed
denizyuret opened this issue Nov 28, 2020 · 5 comments

Comments

@denizyuret
Contributor

CUDA@2.3.0

julia> c = CUDA.rand(4,4,4)
julia> c[3:4] = 0
ERROR: a exception was thrown during kernel execution.
       Run Julia on debug level 2 for device stack traces.

This is followed by launch failures on subsequent CUDA calls.

The same operation on a regular Array just throws an ordinary error.

@maleadt
Member

maleadt commented Nov 28, 2020

julia> c = rand(4,4,4);

julia> c[3:4] = 0
ERROR: ArgumentError: indexed assignment with a single value to many locations is not supported; perhaps use broadcasting `.=` instead?

Exceptions are "regular errors". It's only sticky if you use older hardware.
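
For reference, the broadcasting form suggested by the error message above can be sketched on the CPU as follows. This is a minimal example using only Base Julia; the variable `threw` is introduced here for illustration. The same `.=` syntax is presumably what one would use on a `CuArray` as well, but that is an assumption and is not exercised here.

```julia
# CPU-only sketch; no GPU or CUDA.jl required.
c = ones(4, 4, 4)

# Assigning a single value to several locations with `=` throws an
# ArgumentError on a regular Array, as in the transcript above.
threw = try
    c[3:4] = 0
    false
catch err
    err isa ArgumentError
end

# Broadcasting assignment writes the scalar to each selected linear index.
c[3:4] .= 0
```

With linear indexing, `3:4` selects the third and fourth elements in column-major order, so only `c[3, 1, 1]` and `c[4, 1, 1]` are zeroed.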

@maleadt maleadt closed this as completed Nov 28, 2020
@maleadt
Member

maleadt commented Nov 28, 2020

To elaborate some more: this is a ptxas bug on older hardware, pre sm_70: https://github.com/JuliaGPU/GPUCompiler.jl/blob/c32fe166787ba2f20845ff52058d9ff46515c002/src/ptx.jl#L287-L293
I guess you use an older GPU?

@denizyuret
Contributor Author

I tried it on a V100 and a T4, both sm_70 or newer. In both cases the next command after the exception (e.g. println(c)) gets an exception, but the ones after that start working.

On earlier models, e.g. the K80, the exceptions are sticky, as you mentioned: I keep getting "illegal instruction" errors no matter how many commands I try.

@maleadt
Member

maleadt commented Nov 29, 2020

> In both cases the next command after the exception (e.g. println(c)) gets an exception, but the ones after that start working.

Yes, that's the expected behavior. The first message is the GPU printing the exception; the second is the CPU detecting that an exception occurred. Then everything just works again. Do you have any problem with this?

@denizyuret
Contributor Author

denizyuret commented Nov 29, 2020 via email
