Failed copying input tensor #20

csgomezg0 · 2021-09-13T15:16:48Z

Hi, I obtained Spanish-language data as *.conll files, converted it to the *.ann format, and am trying to train coreferee with it after changing the rules for my language. The data consists of approximately 3,000 files, but when I try to train the model I get an error:

Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/creangel/info/image/coreferee/__main__.py", line 51, in <module>
    TrainingManager(
  File "/home/creangel/info/image/coreferee/training/train.py", line 409, in train_models
    self.train_model(config_entry_name, config_entry, temp_log_file)
  File "/home/creangel/info/image/coreferee/training/train.py", line 378, in train_model
    keras_ensemble = self.generate_keras_ensemble(
  File "/home/creangel/info/image/coreferee/training/train.py", line 219, in generate_keras_ensemble
    keras_history = model_generator.train_keras_model(training_docs, tendencies_analyzer,
  File "/home/creangel/info/image/coreferee/training/model.py", line 288, in train_keras_model
    #print('keras_inputs: ',keras_inputs)
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/training.py", line 1134, in fit
    data_handler = data_adapter.get_data_handler(
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/data_adapter.py", line 1383, in get_data_handler
    return DataHandler(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/data_adapter.py", line 1138, in __init__
    self._adapter = adapter_cls(
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/data_adapter.py", line 230, in __init__
    x, y, sample_weights = _process_tensorlike((x, y, sample_weights))
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/data_adapter.py", line 1031, in _process_tensorlike
    inputs = tf.nest.map_structure(_convert_numpy_and_scipy, inputs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/nest.py", line 869, in map_structure
    structure[0], [func(*x) for x in entries],
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/nest.py", line 869, in <listcomp>
    structure[0], [func(*x) for x in entries],
  File "/usr/local/lib/python3.8/dist-packages/keras/engine/data_adapter.py", line 1026, in _convert_numpy_and_scipy
    return tf.convert_to_tensor(x, dtype=dtype)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 1430, in convert_to_tensor_v2_with_dispatch
    return convert_to_tensor_v2(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 1436, in convert_to_tensor_v2
    return convert_to_tensor(
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/profiler/trace.py", line 163, in wrapped
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/ops.py", line 1566, in convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/tensor_conversion_registry.py", line 52, in _default_conversion_function
    return constant_op.constant(value, dtype, name=name)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/constant_op.py", line 271, in constant
    return _constant_impl(value, dtype, shape, name, verify_shape=False,
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/constant_op.py", line 283, in _constant_impl
    return _constant_eager_impl(ctx, value, dtype, shape, verify_shape)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/constant_op.py", line 308, in _constant_eager_impl
    t = convert_to_eager_tensor(value, ctx, dtype)
  File "/usr/local/lib/python3.8/dist-packages/tensorflow/python/framework/constant_op.py", line 106, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)

tensorflow.python.framework.errors_impl.InternalError: Failed copying input tensor from /job:localhost/replica:0/task:0/device:CPU:0 to /job:localhost/replica:0/task:0/device:GPU:0 in order to run _EagerConst: Dst tensor is not initialized.

I think the cause is memory. I have a GPU with 8 GB.

Note: With 100 files from my data, coreferee training works very well, but with 200 or 3,000 files I get the error.

Does anyone know about this error and how to solve it? Thanks.
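
One rough way to test the memory hypothesis above is to estimate how many bytes a dense input array would occupy and compare that with the 8 GB of GPU memory. The sketch below is only an illustration: the function names, the 80% headroom factor, and the example shape are assumptions, not values taken from coreferee or from this traceback.

```python
import math

# Bytes per element for a few common dtypes (illustrative subset).
BYTES_PER_DTYPE = {"float16": 2, "float32": 4, "float64": 8, "int32": 4, "int64": 8}


def dense_array_bytes(shape, dtype="float32"):
    """Bytes needed to hold one dense array of the given shape and dtype."""
    return math.prod(shape) * BYTES_PER_DTYPE[dtype]


def fits_on_device(shape, dtype="float32", device_bytes=8 * 1024**3, headroom=0.8):
    """True if the array fits within a fraction (headroom) of device memory.

    The headroom factor is an assumption that leaves space for model weights
    and framework overhead; it is not a TensorFlow setting.
    """
    return dense_array_bytes(shape, dtype) <= device_bytes * headroom


if __name__ == "__main__":
    # Hypothetical shape: 2 million training rows with 1,500 features each.
    shape = (2_000_000, 1_500)
    size_gib = dense_array_bytes(shape) / 1024**3
    print(f"{size_gib:.2f} GiB, fits on 8 GiB device: {fits_on_device(shape)}")
```

If the estimate for the real training arrays comes out close to or above the device size, that would be consistent with the copy to `GPU:0` failing only for larger file counts.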


richardpaulhudson · 2021-09-13T16:57:38Z

Hi Carlos,

from the output it looks as though you are not using the latest version of Coreferee. Please update your Python version to 3.9 and your spaCy and coreferee packages as well as the spaCy and coreferee models, then try again and let me know if the problem still occurs.

Best wishes,

Richard

csgomezg0 · 2021-09-13T20:35:09Z

I upgraded spaCy to version 3.1.0 and also upgraded coreferee. I have Python 3.8.1 and can't upgrade Python in the container because I don't have admin access, but the error continues. I would like to know the specifications (GPU) of the machine you use to train coreferee, and how many files you use for training.

richardpaulhudson · 2021-09-14T05:14:21Z

Hi Carlos, unfortunately only Python 3.9 is supported and there is no way the latest version of Coreferee will work with Python 3.8. I believe the problem you are having with tensorflow is specific to the previous version of Coreferee. Perhaps you can download Python 3.9 and install it as a local user?

There are no specific hardware requirements for training: the more hardware you have, the quicker training will be. Equally, the more training examples you have the better, although there is no specific minimum.
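
Since the advice in this thread depends on which Python and package versions are actually active in the container, a quick check can rule out a mismatch before retraining. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and the package list simply mirrors the packages mentioned in this thread.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("spacy", "coreferee", "tensorflow")):
    """Collect the interpreter version and the versions of the given packages."""
    report = {"python": ".".join(str(part) for part in sys.version_info[:3])}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this with the same interpreter that launches training shows whether that interpreter is really Python 3.9 and whether the upgraded packages are the ones being imported.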

csgomezg0 · 2021-09-14T20:05:40Z

Thanks, I upgraded Python (3.9) and tensorflow (2.5.0) and the training works. How many sentences were used to train the English model? I know that your English model was trained on 393,564 words, but how many sentences does that represent?

csgomezg0 closed this as completed Sep 24, 2021
