
Memory leaks when training MultiLayerNetwork using StatsListener #7064

printomi opened this issue Jan 24, 2019 · 6 comments



commented Jan 24, 2019

Issue Description

Memory leaks when training a MultiLayerNetwork using a StatsListener. Attaching and detaching StatsStorage on the UIServer is not required to reproduce the leak; free Java heap space starts to decrease either way.
Originally I experienced the leak in UIServer: if I sequentially train a model and attach-detach a StatsStorage (either InMemoryStatsStorage or FileStatsStorage) one after another, free Java heap space steadily decreases.

I get this error at the end:

Exception in thread "ADSI prefetch thread" java.lang.OutOfMemoryError: Java heap space

Here is a gist with minimal code and output.
Running testStatsListener() manually with VM parameters -ea -Xms128M -Xmx256M -Dorg.bytedeco.javacpp.maxbytes=256M -Dorg.bytedeco.javacpp.maxphysicalbytes=512M, an OOM error occurs around epoch 472.

Version Information

  • Deeplearning4j version: 1.0.0-SNAPSHOT
  • Platform information: Windows, Java 8
  • backend: CPU


Any ideas how I can track down this issue?

Edited after #7064 (comment)


@printomi printomi changed the title Memory leak when sequentially attaching-detaching InMemoryStatsStorage on UIServer Memory leaks when detaching StatsStorage on UIServer Jan 28, 2019

AlexDBlack added a commit that referenced this issue Jan 29, 2019
[WIP] Misc issue fixes (#7085)
* #7084 SameDiff GradCheckUtil mask

* #7074 #7075 Add ROCBinary.getROC(int); Add ROCBinary.stats() AUPRC

* #7064 Fix DL4J UIServer (temporary) memory leak

* #6991 Validate invalid TBPTT + GlobalPooling/LastTimeStep

* #7068 DataType validation (and casts where required) for dropout


commented Jan 29, 2019

For me, it seems that #7085 did not solve the memory leak.
In fact, getDefaultSession() here
should be called after the next for loop, if I understand correctly.
Then, I would make currentSessionID and currentWorkerIdx volatile in TrainModule.
And last but not least, there was a NullPointerException from the second attached stats storage, which I could avoid by adding the following lines in getOverviewData(), getModelData(String str), and getSystemData():

        Long lastUpdate;
        if (lastUpdateForSession != null && currentSessionID != null) {
            lastUpdate = lastUpdateForSession.get(currentSessionID);
        } else {
            lastUpdate = -1L;
        }

Sadly, these changes only fixed the new errors, but did not help with the memory leak.
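As a side note, the same null-safe lookup can be written more compactly with Map.getOrDefault. A minimal sketch; the names (lastUpdateForSession, currentSessionID) mirror the TrainModule fields, but the surrounding class is a hypothetical stand-in:

```java
import java.util.HashMap;
import java.util.Map;

public class GuardExample {
    // Null-safe lookup: returns -1L when the map or key is null, or when no entry exists yet
    static Long lastUpdateFor(Map<String, Long> lastUpdateForSession, String currentSessionID) {
        if (lastUpdateForSession == null || currentSessionID == null) {
            return -1L;
        }
        return lastUpdateForSession.getOrDefault(currentSessionID, -1L);
    }

    public static void main(String[] args) {
        Map<String, Long> updates = new HashMap<>();
        System.out.println(lastUpdateFor(updates, "session-0")); // no entry yet -> -1
        updates.put("session-0", 42L);
        System.out.println(lastUpdateFor(updates, "session-0")); // -> 42
        System.out.println(lastUpdateFor(null, null));           // null-safe -> -1
    }
}
```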



commented Jan 29, 2019

I was able to reproduce the leak and (after implementing fixes) confirm that all instances of TrainModule were correctly cleaned up once they were removed from the UIServer and all local references were dropped. This was verified using JVM heap dumps (plus some System.gc() calls after removal), so I'm quite certain it's fixed (I also confirmed FileStatsStorage was cleaned up in your code using the same heap-dump approach).
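The heap-dump-plus-System.gc() approach can be approximated in a unit test with a WeakReference: if the reference is cleared after all strong references are dropped and GC is requested, the object was not being retained anywhere. A minimal sketch; the `resource` field is a hypothetical stand-in for a detached TrainModule or StatsStorage, and System.gc() is only a hint to the JVM:

```java
import java.lang.ref.WeakReference;

public class LeakCheck {
    static Object resource = new Object(); // stand-in for a detached TrainModule/StatsStorage

    // Returns true if the object becomes unreachable once strong references are dropped
    static boolean collectedAfterRelease() {
        WeakReference<Object> ref = new WeakReference<>(resource);
        resource = null; // drop the last strong reference, as after UIServer.detach()
        for (int i = 0; i < 10 && ref.get() != null; i++) {
            System.gc(); // a hint only; usually honored on HotSpot for small heaps
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println(collectedAfterRelease() ? "collected" : "still reachable (leak?)");
    }
}
```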

However, looking at the profiler memory dump for your code, I am seeing greater than expected (and growing) memory use. It is not caused by incorrectly retained references to the StatsStorage instances, though; I'm still looking into it.

should be called after the next for loop, if I understand correctly.

Yes, looks like you're right about that, thanks, I'll fix that.

@AlexDBlack AlexDBlack reopened this Jan 29, 2019

raver119 added a commit that referenced this issue Feb 6, 2019
R119 cuda ops again (#7117)

* Fix PlayUIServer.detach() to allow sequential attach-detach of StatsStorage (#6950)

* Fix PlayUIServer.detach(), add test for sequentially attaching and detaching StatsStorage


commented Feb 19, 2019

Similar to issue #7125, I saw that LongBuffer instances fill my heap after several training sessions attached to the UIServer. I created heap dumps with jmap and analyzed them with Eclipse MAT.
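For reference, the heap dumps mentioned above can be produced like this; `<pid>` is a placeholder for the target JVM's process id, and this is just a command fragment, not a runnable script:

```shell
# List running JVMs and their PIDs
jps -l

# Dump only live objects (forces a full GC first) from the chosen JVM
jmap -dump:live,format=b,file=heap.hprof <pid>

# Open the resulting heap.hprof in Eclipse MAT (or VisualVM) and inspect
# retained LongBuffer instances and their paths to GC roots.
```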



commented Mar 11, 2019

I saw circular references similar to the ones described by @SchmaR in #7125. It seems that the sequential attach-detach of StatsStorage on the UIServer is not what causes the memory leak. I simplified my example, and I suspect that using a StatsListener with a MultiLayerNetwork causes the leak.

@printomi printomi changed the title Memory leaks when detaching StatsStorage on UIServer Memory leaks when training MultiLayerNetwork using StatsListener Mar 11, 2019



commented Apr 3, 2019

I'm closing this issue because #7316 seems to have fixed it; it is likely a duplicate of #7125.
I verified the fix by running the code that previously failed with OOM; it now passes the test: code, previous and current output.

@printomi printomi closed this Apr 3, 2019



commented May 3, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators May 3, 2019
