This is running on an RTX 3090. It sits on "Running DSO for 1 seeds" for a while, then memory usage spikes to over 24 GB and it crashes.
python -m dso.run config.json
Running DSO for 1 seeds
-- BUILDING PRIOR -------------------
WARNING: Skipping invalid 'RelationalConstraint' with arguments {'targets': [], 'effectors': [], 'relationship': None}. Reason: Prior disabled.
WARNING: Skipping invalid 'RepeatConstraint' with arguments {'tokens': 'const', 'min_': None, 'max_': 3}. Reason: Uses Tokens not in the Library.
WARNING: Skipping invalid 'TrigConstraint' with arguments {}. Reason: There are no target Tokens.
WARNING: Skipping invalid 'ConstConstraint' with arguments {}. Reason: Uses Tokens not in the Library.
WARNING: Skipping invalid 'NoInputsConstraint' with arguments {}. Reason: All terminal tokens are input variables, so allsequences will have an input variable.
WARNING: Skipping invalid 'LanguageModelPrior' with arguments {'weight': None}. Reason: Prior disabled.
LengthConstraint: Sequences have minimum length 4.
LengthConstraint: Sequences have maximum length 30.
RelationalConstraint: [neg] cannot be a child of [neg].
UniformArityPrior: Activated.
SoftLengthPrior: No description available.
-------------------------------------
2021-09-16 02:49:06.352759: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
Traceback (most recent call last):
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1356, in _do_call
return fn(*args)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1341, in _run_fn
options, feed_dict, fetch_list, target_list, run_metadata)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1429, in _call_tf_sessionrun
run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(1000, 57), b.shape=(57, 128), m=1000, n=128, k=57
[[{{node controller/policy/rnn/while/LinearWrapper/LinearWrapper/multi_rnn_cell/cell_0/lstm_cell/MatMul}}]]
[[controller/policy/rnn/while/Exit_6/_39]]
(1) Internal: Blas GEMM launch failed : a.shape=(1000, 57), b.shape=(57, 128), m=1000, n=128, k=57
[[{{node controller/policy/rnn/while/LinearWrapper/LinearWrapper/multi_rnn_cell/cell_0/lstm_cell/MatMul}}]]
0 successful operations.
0 derived errors ignored.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 124, in <module>
main()
File "C:\Python37\lib\site-packages\click\core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "C:\Python37\lib\site-packages\click\core.py", line 1062, in main
rv = self.invoke(ctx)
File "C:\Python37\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Python37\lib\site-packages\click\core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 109, in main
result, summary_path = train_dso(config)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 31, in train_dso
result = model.train()
File "...\deep-symbolic-optimization\dso\dso\core.py", line 90, in train
**self.config_training))
File "...\deep-symbolic-optimization\dso\dso\train.py", line 259, in learn
actions, obs, priors = controller.sample(batch_size)
File "...\deep-symbolic-optimization\dso\dso\controller.py", line 626, in sample
actions, obs, priors = self.sess.run([self.actions, self.obs, self.priors], feed_dict=feed_dict)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 950, in run
run_metadata_ptr)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1173, in _run
feed_dict_tensor, options, run_metadata)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1350, in _do_run
run_metadata)
File "C:\Python37\lib\site-packages\tensorflow\python\client\session.py", line 1370, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
(0) Internal: Blas GEMM launch failed : a.shape=(1000, 57), b.shape=(57, 128), m=1000, n=128, k=57
[[node controller/policy/rnn/while/LinearWrapper/LinearWrapper/multi_rnn_cell/cell_0/lstm_cell/MatMul (defined at ...\deep-symbolic-optimization\dso\dso\controller.py:25) ]]
[[controller/policy/rnn/while/Exit_6/_39]]
(1) Internal: Blas GEMM launch failed : a.shape=(1000, 57), b.shape=(57, 128), m=1000, n=128, k=57
[[node controller/policy/rnn/while/LinearWrapper/LinearWrapper/multi_rnn_cell/cell_0/lstm_cell/MatMul (defined at ...\deep-symbolic-optimization\dso\dso\controller.py:25) ]]
0 successful operations.
0 derived errors ignored.
Original stack trace for 'controller/policy/rnn/while/LinearWrapper/LinearWrapper/multi_rnn_cell/cell_0/lstm_cell/MatMul':
File "C:\Python37\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "C:\Python37\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 124, in <module>
main()
File "C:\Python37\lib\site-packages\click\core.py", line 1137, in __call__
return self.main(*args, **kwargs)
File "C:\Python37\lib\site-packages\click\core.py", line 1062, in main
rv = self.invoke(ctx)
File "C:\Python37\lib\site-packages\click\core.py", line 1404, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Python37\lib\site-packages\click\core.py", line 763, in invoke
return __callback(*args, **kwargs)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 109, in main
result, summary_path = train_dso(config)
File "...\deep-symbolic-optimization\dso\dso\run.py", line 31, in train_dso
result = model.train()
File "...\deep-symbolic-optimization\dso\dso\core.py", line 82, in train
self.setup()
File "...\deep-symbolic-optimization\dso\dso\core.py", line 62, in setup
self.controller = self.make_controller()
File "...\deep-symbolic-optimization\dso\dso\core.py", line 134, in make_controller
**self.config_controller)
File "...\deep-symbolic-optimization\dso\dso\controller.py", line 438, in __init__
_, _, loop_state = tf.nn.raw_rnn(cell=cell, loop_fn=loop_fn)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn.py", line 1252, in raw_rnn
swap_memory=swap_memory)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3501, in while_loop
return_same_structure)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 3012, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\control_flow_ops.py", line 2937, in _BuildLoop
body_result = body(*packed_vars_for_body)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn.py", line 1201, in body
(next_output, cell_state) = cell(current_input, state)
File "...\deep-symbolic-optimization\dso\dso\controller.py", line 25, in __call__
outputs, state = self.cell(inputs, state, scope=scope)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 248, in __call__
return super(RNNCell, self).__call__(inputs, state)
File "C:\Python37\lib\site-packages\tensorflow\python\layers\base.py", line 537, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 634, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 146, in wrapper
), args, kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 446, in converted_call
return _call_unconverted(f, args, kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 253, in _call_unconverted
return f(*args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 1719, in call
cur_inp, new_state = cell(cur_inp, cur_state)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 385, in __call__
self, inputs, state, scope=scope, *args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\layers\base.py", line 537, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 634, in __call__
outputs = call_fn(inputs, *args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 146, in wrapper
), args, kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 446, in converted_call
return _call_unconverted(f, args, kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\autograph\impl\api.py", line 253, in _call_unconverted
return f(*args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\rnn_cell_impl.py", line 1027, in call
array_ops.concat([inputs, m_prev], 1), self._kernel)
File "C:\Python37\lib\site-packages\tensorflow\python\util\dispatch.py", line 180, in wrapper
return target(*args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\math_ops.py", line 2647, in matmul
a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
File "C:\Python37\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 6295, in mat_mul
name=name)
File "C:\Python37\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
op_def=op_def)
File "C:\Python37\lib\site-packages\tensorflow\python\util\deprecation.py", line 507, in new_func
return func(*args, **kwargs)
File "C:\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 3616, in create_op
op_def=op_def)
File "C:\Python37\lib\site-packages\tensorflow\python\framework\ops.py", line 2005, in __init__
self._traceback = tf_stack.extract_stack()
I had it running on my CPU earlier with a lot more data (it was extremely slow, so I stopped it), so I assume there is a bug somewhere, though it's possible I'm doing something wrong. Googling that error leads me to answers like: https://stackoverflow.com/a/52132342/254381
Maybe you're familiar with this already. I'll investigate more tomorrow.
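For scale, here is a rough back-of-the-envelope check of the matrix multiply named in the error, using the shapes from the log (a.shape=(1000, 57), b.shape=(57, 128)). It assumes float32 operands (4 bytes per element), which the log does not state, and `gemm_bytes` is just an illustrative helper, not part of DSO or TensorFlow.

```python
# Rough size of the operands in the failing GEMM.
# Assumption: float32 elements (4 bytes each); the log does not say.
def gemm_bytes(m, k, n, dtype_bytes=4):
    """Bytes needed for a (m x k), b (k x n) and the (m x n) result."""
    return dtype_bytes * (m * k + k * n + m * n)

# Shapes taken from the error message: m=1000, n=128, k=57.
total = gemm_bytes(m=1000, k=57, n=128)
print(f"{total} bytes, about {total / 1024**2:.2f} MiB")
```

Under that assumption the operands come to well under 1 MiB, so this particular multiply is unlikely to be what exhausts the card. TensorFlow 1.x reserves nearly all GPU memory up front by default, which probably accounts for the 24 GB reading.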
Also, for future reference, since I can't find an FAQ: does this project work with large datasets? I have some problems I want to try it on where the data ranges from 330 MB to more than 15 TB. I assume that is outside the scope of this project?
I should have looked into this more first. The RTX 3090 isn't compatible with CUDA 10.0 and thus can't run TensorFlow 1.14 projects. Can you update this to work with CUDA 11 or the latest version? Switching to tensorflow-gpu 1.15 would be required. I'll close this and open a new issue.
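To make the compatibility point concrete, here is a small sketch comparing the RTX 3090's compute capability (8.6) against the highest capability a few CUDA toolkit releases can target natively. The table values are as I recall them from NVIDIA's documentation and are worth double-checking there. The names `MAX_COMPUTE_CAPABILITY` and `toolkit_supports` are made up for this illustration.

```python
# Highest GPU compute capability each CUDA toolkit release targets natively.
# Assumption: values recalled from NVIDIA's release notes; verify before relying on them.
MAX_COMPUTE_CAPABILITY = {
    "10.0": (7, 5),  # Turing
    "11.0": (8, 0),  # Ampere (A100)
    "11.1": (8, 6),  # Ampere (RTX 30 series)
}

def toolkit_supports(cuda_version, compute_capability):
    """True if the toolkit can target the given (major, minor) capability."""
    return compute_capability <= MAX_COMPUTE_CAPABILITY[cuda_version]

RTX_3090 = (8, 6)
for version in MAX_COMPUTE_CAPABILITY:
    print(version, toolkit_supports(version, RTX_3090))
```

By this table CUDA 10.0 stops at Turing, so libraries built against it, cuBLAS included, ship no kernels for the 3090. That would be consistent with the `CUBLAS_STATUS_EXECUTION_FAILED` error in the log.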
I have data with 9 variables and a result, stored in a CSV file with samples that look like:
Using a simple config:
The data for reference: