Memory Access Violation when using CBFDeserializer #2964
Comments
I believe this is the same issue I reported in #2905. The BlockRandomizer has a logic bug that frees the memory holding samples from sweep N when crossing into sweep N+1, so a minibatch that crosses a sweep boundary triggers access violation errors. You can use the diff in PR #2906, which I submitted and then closed because it breaks existing tests, to build a version that prevents minibatches from crossing sweep boundaries.
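To illustrate the workaround idea only, here is a minimal Python sketch of cutting minibatches so that none spans two sweeps. It is not CNTK code and not the diff from PR #2906; the function name `minibatch_ranges` and its parameters are hypothetical, and it assumes a sweep is simply a fixed number of samples.

```python
def minibatch_ranges(sweep_size, minibatch_size, max_sweeps):
    """Yield (sweep_index, start, count) tuples that never span two sweeps.

    Hypothetical helper for illustration; not part of CNTK.
    """
    if sweep_size <= 0 or minibatch_size <= 0:
        raise ValueError("sweep_size and minibatch_size must be positive")
    for sweep in range(max_sweeps):
        start = 0
        while start < sweep_size:
            # Truncate the last minibatch of a sweep instead of borrowing
            # samples from the next sweep.
            count = min(minibatch_size, sweep_size - start)
            yield sweep, start, count
            start += count


if __name__ == "__main__":
    # 5 samples per sweep, minibatches of 2, two sweeps.
    for sweep, start, count in minibatch_ranges(5, 2, 2):
        print(f"sweep={sweep} start={start} count={count}")
```

The trade-off in this sketch is a shorter final minibatch in each sweep, in exchange for every minibatch referencing samples from a single sweep only.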
Same as #2479. It has not been fixed yet.
This is quite a major bug that would have been easily found with the most basic unit tests in the first place, and it's been known about for 4 months?
Thanks for the report and detailed analysis, we'll work on a fix.
Thanks for the prompt fix! When are you next planning to update the NuGet packages to pick up the changes?
I should thank you for providing the repro :). The NuGet package will be updated with the next release. In the meantime, please try building from source or wait until a nightly build is publicly available.
This fixes issue microsoft#2479 microsoft#2964
The fully self-contained C# code below reproduces an access violation caused by CBFDeserializer.
Loading the file and doing a single pass through the data works without error (i.e., when MaxSweeps == 1). However, when MaxSweeps > 1, at the start of the second pass, the following access violation occurs:
The access violation occurs with both CPU and GPU devices. Note also that this error does not occur with other data sets where the binary files have been generated by the same method.
Code to reproduce the exception:
FYI, I am using the CNTK.GPU NuGet package version 2.4.0 on Windows 10.