The test TrainableStateTest.testForCore fails consistently #156

Open
loopylangur opened this issue Jan 27, 2020 · 3 comments

@loopylangur

Hi,

The test TrainableStateTest.testForCore in sonnet/src/recurrent_test.py fails consistently when I run it with either TensorFlow 1.5 or 2.0.

Is this expected or is there a dependency issue? I am on the v2 branch. Please find details below:

Error log:

=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.7.6, pytest-5.3.2, py-1.8.1, pluggy-0.13.1
rootdir: sonnet
collected 1 item

sonnet/src/recurrent_test.py F                                                                                                                                                                              [100%]

==================================================================================================== FAILURES =====================================================================================================
_________________________________________________________________________________________ TrainableStateTest.testForCore __________________________________________________________________________________________

self = <recurrent_test.TrainableStateTest testMethod=testForCore>

    def testForCore(self):
      core = recurrent.LSTM(hidden_size=16)
      trainable_state = recurrent.TrainableState.for_core(core)
      self.assertAllClose(
>         trainable_state(batch_size=42), core.initial_state(batch_size=42))

sonnet/src/recurrent_test.py:667: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
python3.7/site-packages/tensorflow_core/python/framework/test_util.py:1153: in decorated
    return f(*args, **kwds)
python3.7/site-packages/tensorflow_core/python/framework/test_util.py:2495: in assertAllClose
    self._assertAllCloseRecursive(a, b, rtol=rtol, atol=atol, msg=msg)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <recurrent_test.TrainableStateTest testMethod=testForCore>
a = _TupleWrapper([<tf.Tensor: shape=(42, 16), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0... 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
      dtype=float32)>])
b = OrderedDict([('hidden', <tf.Tensor: shape=(42, 16), dtype=float32, numpy=
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., ...0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]],
      dtype=float32)>)])
rtol = 1e-06, atol = 1e-06, path = [], msg = ''

    def _assertAllCloseRecursive(self,
                                 a,
                                 b,
                                 rtol=1e-6,
                                 atol=1e-6,
                                 path=None,
                                 msg=None):
      path = path or []
      path_str = (("[" + "][".join([str(p) for p in path]) + "]") if path else "")
      msg = msg if msg else ""
    
      # Check if a and/or b are namedtuples.
      if hasattr(a, "_asdict"):
        a = a._asdict()
      if hasattr(b, "_asdict"):
        b = b._asdict()
      a_is_dict = isinstance(a, collections_abc.Mapping)
      if a_is_dict != isinstance(b, collections_abc.Mapping):
        raise ValueError("Can't compare dict to non-dict, a%s vs b%s. %s" %
>                        (path_str, path_str, msg))
E       ValueError: Can't compare dict to non-dict, a vs b.

python3.7/site-packages/tensorflow_core/python/framework/test_util.py:2418: ValueError
---------------------------------------------------------------------------------------------- Captured stderr call -----------------------------------------------------------------------------------------------
2020-01-26 22:31:35.915223: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2020-01-26 22:31:36.400027: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1555] Found device 0 with properties: 
pciBusID: 0000:03:00.0 name: Quadro K620 computeCapability: 5.0
coreClock: 1.124GHz coreCount: 3 deviceMemorySize: 1.95GiB deviceMemoryBandwidth: 26.82GiB/s
2020-01-26 22:31:36.400117: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.1'; dlerror: libcudart.so.10.1: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.400176: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10'; dlerror: libcublas.so.10: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.400229: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10'; dlerror: libcufft.so.10: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.400281: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10'; dlerror: libcurand.so.10: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.400333: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10'; dlerror: libcusolver.so.10: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.400385: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10'; dlerror: libcusparse.so.10: cannot open shared object file: No such file or directory
2020-01-26 22:31:36.403448: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2020-01-26 22:31:36.403468: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1592] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2020-01-26 22:31:36.403721: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-01-26 22:31:36.410367: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2993350000 Hz
2020-01-26 22:31:36.411301: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556c3fef6550 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-01-26 22:31:36.411321: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-01-26 22:31:36.487384: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556c3ff5c500 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-01-26 22:31:36.487419: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Quadro K620, Compute Capability 5.0
2020-01-26 22:31:36.487550: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1096] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-01-26 22:31:36.487560: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102]      
================================================================================================ warnings summary =================================================================================================
python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py:15
  python3.7/site-packages/tensorflow_core/python/pywrap_tensorflow_internal.py:15: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

sonnet/src/recurrent_test.py::TrainableStateTest::testForCore
sonnet/src/recurrent_test.py::TrainableStateTest::testForCore
sonnet/src/recurrent_test.py::TrainableStateTest::testForCore
  python3.7/site-packages/tree/__init__.py:258: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3,and in 3.9 it will stop working
    return _tree.flatten(structure)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================================================== 1 failed, 4 warnings in 2.96s ==========================================================================================

Environment:

Ubuntu 16.04
python 3.7
numpy==1.17.4
tensorboard==2.0.2
tensorflow==2.0.0
tensorflow-datasets==1.2.0
tensorflow-estimator==2.0.1
tensorflow-gpu==2.1.0
tensorflow-metadata==0.14.0
tensorflow-probability==0.8.0rc0
@malcolmreynolds
Collaborator

Thanks for your report. @superbobry is looking into this.

@superbobry
Collaborator

Hi @loopylangur, the failure you're seeing is a result of switching Sonnet 2 to deepmind/tree, which until recently (see google-deepmind/tree@66ace75) did not work well with the wrapt.ObjectProxy objects used by tf.AutoTrackable.

I will do a bugfix release of tree in the coming days which should fix the issue.
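
For readers following the traceback, here is a minimal sketch of why the assertion raises. It mirrors only the dict-vs-non-dict check quoted in the error log, using the standard library. The `LSTMState` namedtuple and the helper name `check_same_kind` are stand-ins introduced for illustration, and the plain tuple is an assumption: the thread does not show exactly how the structure lost its namedtuple form.

```python
import collections
from collections import abc as collections_abc

# Hypothetical stand-in for the core's state type (illustration only).
LSTMState = collections.namedtuple("LSTMState", ["hidden", "cell"])


def check_same_kind(a, b):
    """Mirrors the mapping check shown in the traceback above."""
    # Namedtuples are converted to dicts before the comparison.
    if hasattr(a, "_asdict"):
        a = a._asdict()
    if hasattr(b, "_asdict"):
        b = b._asdict()
    a_is_dict = isinstance(a, collections_abc.Mapping)
    if a_is_dict != isinstance(b, collections_abc.Mapping):
        raise ValueError("Can't compare dict to non-dict")
    return a_is_dict


expected = LSTMState(hidden=[0.0], cell=[0.0])

# Assumption: a structure that came back as a plain sequence rather than
# a namedtuple. It has no _asdict, so it is never turned into a mapping.
rebuilt = tuple(expected)

try:
    check_same_kind(rebuilt, expected)
except ValueError as err:
    print("mismatch:", err)
```

Two values with identical contents still fail the check when only one of them exposes `_asdict`, which matches the `a = _TupleWrapper(...)` versus `b = OrderedDict(...)` pair in the log.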

tree-copybara pushed a commit to google-deepmind/tree that referenced this issue Jan 29, 2020
This should fix google-deepmind/sonnet#156.

PiperOrigin-RevId: 292111439
Change-Id: I071294be508d61ce0faecc3198d0f41baf8f949a
@superbobry
Collaborator

@loopylangur could you upgrade tree and let us know if this helps?
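
One way to confirm which tree release is installed before and after upgrading is sketched below. It assumes the package is distributed on PyPI under the name `dm-tree` (so the upgrade itself would be `pip install --upgrade dm-tree`) and that `importlib.metadata` is available, which requires Python 3.8 or newer; the reporter's Python 3.7 would need a backport. The helper name `installed_version` is hypothetical, and the thread does not say which release carries the fix.

```python
from importlib import metadata  # standard library from Python 3.8 onwards


def installed_version(dist_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "dm-tree" is the assumed distribution name for deepmind/tree.
print("dm-tree:", installed_version("dm-tree"))
```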
