[BUG] Conda Release seems broken #1145

Closed
site-g opened this issue Sep 14, 2021 · 3 comments
Labels: bug, reproduced (this bug has been reproduced by developers upstream)


site-g commented Sep 14, 2021

Summary
deepmd-kit installed with conda fails to run the bundled examples. The package requires tensorflow-base 2.5.0.*, whose dependencies have a known issue (see ContinuumIO/anaconda-issues#12604).

DeePMD-kit version, installation method, input file, commands run, error log, etc.

  • OS: openSUSE Leap 15.3
  • DeePMD-kit version: v2.0.1
  • Installation method: conda create -n deepmd deepmd-kit=*=*cpu libdeepmd=*=*cpu lammps-dp -c deepmodeling
  • Input file: /path/to/deepmd-kit/examples/water_tensor/polar/
  • Command run: dp train polar_input.json
  • Error log:
WARNING:tensorflow:From /home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/compat/v2_compat.py:96: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.

Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:Environment variable KMP_BLOCKTIME is empty. Use the default value 0
WARNING:root:Environment variable KMP_AFFINITY is empty. Use the default value granularity=fine,verbose,compact,1,0
WARNING:deepmd.train.run_options:Switch to serial execution due to lack of horovod module.
OMP: Info #155: KMP_AFFINITY: Initial OS proc set respected: 0-31
OMP: Info #216: KMP_AFFINITY: decoding x2APIC ids.
OMP: Info #157: KMP_AFFINITY: 32 available OS procs
OMP: Info #158: KMP_AFFINITY: Uniform topology
OMP: Info #287: KMP_AFFINITY: topology layer "LL cache" is equivalent to "socket".
OMP: Info #192: KMP_AFFINITY: 1 socket x 16 cores/socket x 2 threads/core (16 total cores)
OMP: Info #218: KMP_AFFINITY: OS proc to physical thread map:
OMP: Info #172: KMP_AFFINITY: OS proc 0 maps to socket 0 core 0 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 16 maps to socket 0 core 0 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 1 maps to socket 0 core 1 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 17 maps to socket 0 core 1 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 2 maps to socket 0 core 4 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 18 maps to socket 0 core 4 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 3 maps to socket 0 core 5 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 19 maps to socket 0 core 5 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 4 maps to socket 0 core 8 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 20 maps to socket 0 core 8 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 5 maps to socket 0 core 9 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 21 maps to socket 0 core 9 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 6 maps to socket 0 core 12 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 22 maps to socket 0 core 12 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 7 maps to socket 0 core 13 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 23 maps to socket 0 core 13 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 8 maps to socket 0 core 16 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 24 maps to socket 0 core 16 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 9 maps to socket 0 core 17 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 25 maps to socket 0 core 17 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 10 maps to socket 0 core 20 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 26 maps to socket 0 core 20 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 11 maps to socket 0 core 21 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 27 maps to socket 0 core 21 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 12 maps to socket 0 core 24 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 28 maps to socket 0 core 24 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 13 maps to socket 0 core 25 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 29 maps to socket 0 core 25 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 14 maps to socket 0 core 28 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 30 maps to socket 0 core 28 thread 1 
OMP: Info #172: KMP_AFFINITY: OS proc 15 maps to socket 0 core 29 thread 0 
OMP: Info #172: KMP_AFFINITY: OS proc 31 maps to socket 0 core 29 thread 1 
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197877 thread 1 bound to OS proc set 1
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197880 thread 2 bound to OS proc set 2
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197881 thread 3 bound to OS proc set 3
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197882 thread 4 bound to OS proc set 4
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197883 thread 5 bound to OS proc set 5
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197884 thread 6 bound to OS proc set 6
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197886 thread 8 bound to OS proc set 8
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197885 thread 7 bound to OS proc set 7
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197887 thread 9 bound to OS proc set 9
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197888 thread 10 bound to OS proc set 10
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197889 thread 11 bound to OS proc set 11
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197890 thread 12 bound to OS proc set 12
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197891 thread 13 bound to OS proc set 13
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197892 thread 14 bound to OS proc set 14
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197893 thread 15 bound to OS proc set 15
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197894 thread 16 bound to OS proc set 16
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197895 thread 17 bound to OS proc set 17
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197896 thread 18 bound to OS proc set 18
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197897 thread 19 bound to OS proc set 19
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197898 thread 20 bound to OS proc set 20
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197899 thread 21 bound to OS proc set 21
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197900 thread 22 bound to OS proc set 22
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197901 thread 23 bound to OS proc set 23
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197902 thread 24 bound to OS proc set 24
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197903 thread 25 bound to OS proc set 25
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197904 thread 26 bound to OS proc set 26
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197905 thread 27 bound to OS proc set 27
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197906 thread 28 bound to OS proc set 28
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197907 thread 29 bound to OS proc set 29
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197908 thread 30 bound to OS proc set 30
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197909 thread 31 bound to OS proc set 31
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197910 thread 32 bound to OS proc set 0
DEEPMD INFO    training data with min nbor dist: 0.7534431439000763
DEEPMD INFO    training data with max nbor size: [38, 74]
DEEPMD INFO     _____               _____   __  __  _____           _     _  _   
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |  
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_ 
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_ 
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO    Please read and cite:
DEEPMD INFO    Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO    installed to:         /tmp/pip-req-build-f1yyifq_/_skbuild/linux-x86_64-3.9/cmake-install
DEEPMD INFO    source :              v2.0.1
DEEPMD INFO    source brach:         HEAD
DEEPMD INFO    source commit:        2f6020b
DEEPMD INFO    source commit at:     2021-09-10 22:12:15 +0800
DEEPMD INFO    build float prec:     double
DEEPMD INFO    build with tf inc:    /home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/include
DEEPMD INFO    build with tf lib:    
DEEPMD INFO    ---Summary of the training---------------------------------------
DEEPMD INFO    running on:           w001
DEEPMD INFO    computing device:     cpu:0
DEEPMD INFO    CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO    Count of visible GPU: 0
DEEPMD INFO    num_intra_threads:    0
DEEPMD INFO    num_inter_threads:    0
DEEPMD INFO    -----------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: training     -----------------------------------------------
DEEPMD INFO    found 2 system(s):
DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO                 ./training_data/atomic_system     192       1      80  0.500    T
DEEPMD INFO                 ./training_data/global_system     192       1      80  0.500    T
DEEPMD INFO    --------------------------------------------------------------------------------------
DEEPMD INFO    ---Summary of DataSystem: validation   -----------------------------------------------
DEEPMD INFO    found 2 system(s):
DEEPMD INFO                                        system  natoms  bch_sz   n_bch   prob  pbc
DEEPMD INFO               ./validation_data/atomic_system     192       1      80  0.500    T
DEEPMD INFO               ./validation_data/global_system     192       1      80  0.500    T
DEEPMD INFO    --------------------------------------------------------------------------------------
DEEPMD INFO    training without frame parameter
OMP: Info #254: KMP_AFFINITY: pid 197810 tid 197810 thread 0 bound to OS proc set 0
DEEPMD INFO    built lr
Traceback (most recent call last):
  File "/home/user/opt/miniconda3/envs/deepmd/bin/dp", line 10, in <module>
    sys.exit(main())
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/entrypoints/main.py", line 437, in main
    train_dp(**dict_args)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/entrypoints/train.py", line 102, in train
    _do_work(jdata, run_opt, is_compress)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/entrypoints/train.py", line 158, in _do_work
    model.build(train_data, stop_batch)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/train/trainer.py", line 329, in build
    self._build_network(data)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/train/trainer.py", line 353, in _build_network
    = self.model.build (self.place_holders['coord'], 
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/model/tensor.py", line 150, in build
    output = self.fitting.build (dout, 
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/deepmd/fit/polar.py", line 362, in build
    final_layer = final_layer + self.constant_matrix[sel_type_idx] * tf.eye(3, batch_shape=[tf.shape(inputs)[0], natoms[2+type_i]], dtype = GLOBAL_TF_FLOAT_PRECISION)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/ops/linalg_ops.py", line 237, in eye
    return linalg_ops_impl.eye(num_rows,
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/ops/linalg_ops_impl.py", line 75, in eye
    diag_ones = array_ops.ones(diag_shape, dtype=dtype)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 3212, in ones
    output = _constant_if_small(one, shape, dtype, name)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py", line 2896, in _constant_if_small
    if np.prod(shape) < 1000:
  File "<__array_function__ internals>", line 5, in prod
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 3030, in prod
    return _wrapreduction(a, np.multiply, 'prod', axis, dtype, out,
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 87, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/framework/ops.py", line 867, in __array__
    raise NotImplementedError(
NotImplementedError: Cannot convert a symbolic Tensor (strided_slice_29:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported

Further Information, Files, and Links
The problem seems to come from the conda build of TensorFlow 2.5, which requires NumPy>=1.20, a version it is incompatible with. Would it be possible for deepmd-kit to depend on an earlier TensorFlow version? Thank you.
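To check whether an environment has the combination described here, a small script can compare the installed versions. This is a minimal sketch, not part of deepmd-kit: the helper names are made up for illustration, the rule only encodes what this thread reports (TensorFlow 2.5.x together with NumPy >= 1.20), and it assumes the packages are registered under the distribution names `tensorflow` and `numpy`.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Extract the leading 'X.Y' from a version string such as '2.5.0' or '1.20.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def looks_affected(tf_version: str, np_version: str) -> bool:
    """True for the pairing reported in this issue: TensorFlow 2.5.x with NumPy >= 1.20."""
    return major_minor(tf_version) == (2, 5) and major_minor(np_version) >= (1, 20)


def installed_version(name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are an assumption; a conda build may register differently.
    tf_v = installed_version("tensorflow")
    np_v = installed_version("numpy")
    if tf_v is None or np_v is None:
        print(f"tensorflow={tf_v}, numpy={np_v}: cannot check this environment")
    else:
        print(f"tensorflow={tf_v}, numpy={np_v}, affected={looks_affected(tf_v, np_v)}")
```

Run inside the activated conda environment, this only reads package metadata and does not import TensorFlow itself.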

site-g added the bug label Sep 14, 2021
njzjz added the reproduced label (This bug has been reproduced by developers upstream) Sep 14, 2021
Member

njzjz commented Sep 14, 2021

I noticed that this bug was fixed in tensorflow/tensorflow#47957 with a one-line change. An easy workaround until the new release is to apply that patch to your local file (i.e. /home/user/opt/miniconda3/envs/deepmd/lib/python3.9/site-packages/tensorflow/python/ops/array_ops.py).
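The path above is specific to the reporter's machine. To find the corresponding file in another environment, something like the following may help. It is a sketch with a hypothetical helper name; it assumes the relative location `python/ops/array_ops.py` seen in the traceback, and it only looks the package up without importing it.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def locate_module_file(package: str, relative: str) -> Optional[Path]:
    """Return the path of `relative` inside an installed top-level package, or None."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None  # package missing, or not a package directory
    for root in spec.submodule_search_locations:
        candidate = Path(root) / relative
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Relative path taken from the traceback in this issue.
    path = locate_module_file("tensorflow", "python/ops/array_ops.py")
    print(path if path else "array_ops.py not found in this environment")
```

Editing a file under site-packages is a local workaround only; reinstalling or updating the package will overwrite it.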

Author

site-g commented Sep 14, 2021

The fix just ignores the exception, but it does work. Thanks!
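For readers wondering what "ignoring the exception" means in practice, the pattern can be sketched without TensorFlow or NumPy. Everything below is a stand-in: `SymbolicDim` and `static_size_if_small` are hypothetical names, and the code is not the upstream patch. It only mirrors the situation in the traceback, where a size check against 1000 fails because one entry of the shape has no concrete value.

```python
import math
from typing import Optional, Sequence


class SymbolicDim:
    """Hypothetical stand-in for a graph-mode dimension that has no concrete value."""

    def __mul__(self, other):
        raise NotImplementedError("cannot convert a symbolic value to a number")

    __rmul__ = __mul__


def static_size_if_small(shape: Sequence[object], limit: int = 1000) -> Optional[int]:
    """Return the element count if it is known and below `limit`, otherwise None."""
    try:
        size = math.prod(shape)
    except (NotImplementedError, TypeError):
        # The shape contains a symbolic entry, so the size is unknown here.
        # Skip the small-constant shortcut instead of letting the error propagate.
        return None
    return size if size < limit else None


print(static_size_if_small([2, 3, 3]))              # 18
print(static_size_if_small([SymbolicDim(), 3, 3]))  # None
print(static_size_if_small([100, 100]))             # None
```

Returning None lets the caller fall back to a path that does not need the size up front, which is why swallowing the exception is harmless in this kind of check.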

site-g closed this as completed Sep 14, 2021
njzjz reopened this Oct 30, 2021
njzjz self-assigned this Nov 1, 2021
Member

njzjz commented Nov 28, 2021

A new package built with TensorFlow 2.7 has been uploaded.

njzjz closed this as completed Nov 28, 2021