Out of Memory #12
Comments
My memory config: cat /proc/meminfo
srun nvidia-smi
Thank you!
Hi,
I'm terribly sorry, I mixed up the two repos. But I also have a question about the PyTorch implementation. Do you have any comments on this strange point?
OK, I'm not the author of the paper, but I'll do my best.
Thank you very much! It solved an important issue for me. Thanks again!
You're welcome!
I ran this command on my lab server:
th main.lua --develop --name test-run --type float
and got the following error:
{
maxPoolStride : 2
noProgress : false
name : "test-run"
learningRate : 0.001
transmissionJPEGU_yc : 5
batchSize : 12
develop : true
optimType : "adam"
adversaryFeatureDepth : 64
messageLength : 30
transmissionCropout : 0.4
transmissionDropout : 0.4
transmissionJPEGQuality : 50
type : "float"
transmissionCropSize : 0.5
decoderConvolutions : 6
loadCheckpoint : ""
fixImage : false
encoderPreMessageConvolution : 3
noSave : false
seed : 1234
maxPoolWindowSize : 4
transmissionGaussianSigma : 2
small : false
encoderFeatureDepth : 64
confusionPer : 20
imageSize : 128
savePer : 20
imagePenaltyCoef : 1
testPer : 1
save : "checkpoints"
transmissionJPEGU_yd : 0
fixMessage : false
epochs : 200
decoderFeatureDepth : 64
transmissionNoiseType : "identity"
thin : false
transmissionJPEGCutoff : 5
transmissionJPEGU_uvd : 0
transmissionJPEGU_uvc : 3
small16 : false
transmissionOutsize : 128
transmissionCombinedRecipe : ""
adversary_gradient_scale : 0.1
adversaryConvolutions : 2
messagePenaltyCoef : 1
grayscale : false
transmissionConcatenatedRecipe : ""
encoderPostMessageConvolution : 1
randomImage : false
}
{
beta1 : 0.9
epsilon : 1e-08
learningRateDecay : 0
learningRate : 0.001
beta2 : 0.999
}
Loading training dataset
Accepting non-grayscale input
test-run: starting to train
epoch: 1
slurmstepd: error: Detected 1 oom-kill event(s) in step 10160.1 cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.
srun: error: wmc-slave-g6: task 0: Out Of Memory
It looks like I don't have enough memory to run this. I just cloned the code with git and ran the test command.
Could you please share the memory requirements for running it? Thank you!
By the way, I hope pretrained models can be made available for research.
Thank you!
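In case it helps with diagnosis, here is a minimal sketch for summarizing what `/proc/meminfo` reports on the node. The helper names `parse_meminfo` and `summarize` are hypothetical and not part of this repository. Note that the memory limit Slurm applies to a job's cgroup can be lower than the node's `MemTotal`, so these numbers are only an upper bound on what the job may use.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: integer value}.

    Most fields are reported in kB (e.g. "MemTotal:  16777216 kB");
    a few, such as HugePages_Total, are plain counts with no unit.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summarize(info):
    """Return (total_gib, available_gib) from a parsed meminfo dict.

    available_gib is None when MemAvailable is absent (older kernels).
    """
    kib_per_gib = 1024 ** 2
    total = info["MemTotal"] / kib_per_gib
    avail = info.get("MemAvailable")
    return total, (avail / kib_per_gib if avail is not None else None)
```

On the compute node this could be called as `summarize(parse_meminfo(open("/proc/meminfo").read()))`, ideally from inside the same `srun` step so the result can be compared with the memory requested for the job.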