GPU integration for macOS 12.3 M1 Max #884

Open
doric35 opened this issue Aug 11, 2022 · 0 comments
doric35 commented Aug 11, 2022

I ran the Quickstart.py example and got the following error:

Metal device set to: Apple M1 Max

systemMemory: 32.00 GB
maxCacheSize: 10.67 GB

WARNING:root:Infinite min_value bound for state.
Episodes: 0%| | 0/200 [00:00, return=0.00, ts/ep=0, sec/ep=0.00, ms/ts=0.0, agent=0.0%]
Traceback (most recent call last):
File "/Users/dominikrichard/workspace/minesweeping/minesweepingpython/main/tensforce_testing.py", line 53, in <module>
main()
File "/Users/dominikrichard/workspace/minesweeping/minesweepingpython/main/tensforce_testing.py", line 46, in main
runner.run(num_episodes=200)
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/execution/runner.py", line 649, in run
self.handle_act(parallel=n)
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/execution/runner.py", line 697, in handle_act
actions = self.agent.act(states=self.states[parallel], parallel=parallel)
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/agents/agent.py", line 415, in act
return super().act(
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/agents/recorder.py", line 262, in act
actions, internals = self.fn_act(
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/agents/agent.py", line 462, in fn_act
actions, timesteps = self.model.act(
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorforce/core/module.py", line 136, in decorated
output_args = function_graphs[str(graph_params)](*graph_args)
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/opt/homebrew/anaconda3/envs/TensFenv/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 54, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx.handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation agent/VerifyFinite/CheckNumerics: Could not satisfy explicit device specification '' because the node {{colocation_node agent/VerifyFinite/CheckNumerics}} was colocated with a group of nodes that required incompatible device '/job:localhost/replica:0/task:0/device:GPU:0'. All available devices [/job:localhost/replica:0/task:0/device:CPU:0, /job:localhost/replica:0/task:0/device:GPU:0].
Colocation Debug Info:
Colocation group had the following types and supported devices:
Root Member(assigned_device_name_index_=1 requested_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' assigned_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' resource_device_name_='/job:localhost/replica:0/task:0/device:GPU:0' supported_device_types_=[CPU] possible_devices_=[]
Identity: GPU CPU
Switch: GPU CPU
CheckNumerics: CPU
_Arg: GPU CPU

Colocation members, user-requested devices, and framework assigned devices, if any:
args_0 (_Arg) framework assigned device=/job:localhost/replica:0/task:0/device:GPU:0
agent/VerifyFinite/CheckNumerics (CheckNumerics)
agent/VerifyFinite/control_dependency (Identity)
agent/assert_greater_equal/Assert/AssertGuard/args_0/_16 (Switch)
agent/assert_less_equal/Assert/AssertGuard/args_0/_26 (Switch)
Func/agent/StatefulPartitionedCall/input/_80 (Identity) /job:localhost/replica:0/task:0/device:GPU:0
Func/agent/assert_greater_equal/Assert/AssertGuard/then/_10/input/_153 (Identity)
Func/agent/assert_greater_equal/Assert/AssertGuard/else/_11/input/_159 (Identity)
Func/agent/assert_less_equal/Assert/AssertGuard/then/_20/input/_165 (Identity)
Func/agent/assert_less_equal/Assert/AssertGuard/else/_21/input/_171 (Identity)
Func/agent/StatefulPartitionedCall/state_preprocessing/PartitionedCall/input/_260 (Identity) /job:localhost/replica:0/task:0/device:GPU:0
Func/agent/StatefulPartitionedCall/state_preprocessing/PartitionedCall/linear_normalization0/PartitionedCall/input/_356 (Identity) /job:localhost/replica:0/task:0/device:GPU:0

     [[{{node agent/VerifyFinite/CheckNumerics}}]] [Op:__inference_act_1848]

Episodes: 0%| | 0/200 [00:00, return=0.00, ts/ep=0, sec/ep=0.00, ms/ts=0.0, agent=0.0%]
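The "Colocation Debug Info" section of the error lists, per op type, which devices have a kernel for it. A group of colocated ops can only be placed on a device that every member supports. The snippet below is a minimal sketch that scans such a listing for ops lacking a given device. The function name `ops_missing_device` and the `SAMPLE` string are illustrative only and are not part of TensorFlow or Tensorforce; the sample lines are copied from the log above.

```python
def ops_missing_device(debug_info: str, device: str = "GPU") -> list:
    """Return op types whose supported-device list does not include `device`.

    Only lines shaped like "OpName: GPU CPU" are considered; anything else
    (headers, progress bars, memory statistics) is ignored.
    """
    missing = []
    for line in debug_info.splitlines():
        op_name, sep, rest = line.strip().partition(":")
        devices = rest.split()
        is_op_line = (
            bool(sep)
            and bool(op_name)
            and " " not in op_name
            and bool(devices)
            and all(tok.isalpha() and tok.isupper() for tok in devices)
        )
        if is_op_line and device not in devices:
            missing.append(op_name)
    return missing


# Per-op device support as printed in the colocation debug info above.
SAMPLE = """\
Identity: GPU CPU
Switch: GPU CPU
CheckNumerics: CPU
_Arg: GPU CPU
"""

print(ops_missing_device(SAMPLE))  # prints ['CheckNumerics']
```

For this log the only op without a GPU kernel is `CheckNumerics`, which matches the `supported_device_types_=[CPU]` entry of the root member while the group was requested on `GPU:0`.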


I installed Tensorforce in a new Conda environment, following the M1 Mac instructions in this guide: https://tensorforce.readthedocs.io/en/latest/basics/installation.html

I also had to upgrade numpy to 1.22 to run the code.

My Conda environment is built as follows:

Name Version Build Channel

absl-py 1.2.0 pypi_0 pypi
astunparse 1.6.3 pypi_0 pypi
blas 1.0 openblas
bzip2 1.0.8 h620ffc9_4
c-ares 1.18.1 h1a28f6b_0
ca-certificates 2022.07.19 hca03da5_0
cachetools 5.2.0 pypi_0 pypi
certifi 2022.6.15 py310hca03da5_0
charset-normalizer 2.1.0 pypi_0 pypi
cloudpickle 2.1.0 pypi_0 pypi
cycler 0.11.0 pypi_0 pypi
flatbuffers 1.12 pypi_0 pypi
fonttools 4.34.4 pypi_0 pypi
gast 0.4.0 pypi_0 pypi
google-auth 2.10.0 pypi_0 pypi
google-auth-oauthlib 0.4.6 pypi_0 pypi
google-pasta 0.2.0 pypi_0 pypi
grpcio 1.42.0 py310h95c9599_0
gym 0.21.0 pypi_0 pypi
h5py 3.6.0 py310h181c318_0
hdf5 1.12.1 h160e8cb_2
idna 3.3 pypi_0 pypi
keras 2.9.0 pypi_0 pypi
keras-preprocessing 1.1.2 pypi_0 pypi
kiwisolver 1.4.4 pypi_0 pypi
krb5 1.19.2 h3b8d789_0
libclang 14.0.6 pypi_0 pypi
libcurl 7.84.0 hc6d1d07_0
libcxx 12.0.0 hf6beb65_1
libedit 3.1.20210910 h1a28f6b_0
libev 4.33 h1a28f6b_1
libffi 3.4.2 hc377ac9_4
libgfortran 5.0.0 11_2_0_he6877d6_26
libgfortran5 11.2.0 he6877d6_26
libnghttp2 1.46.0 h95c9599_0
libopenblas 0.3.20 hea475bc_0
libssh2 1.10.0 hf27765b_0
llvm-openmp 12.0.0 haf9daa7_1
markdown 3.4.1 pypi_0 pypi
markupsafe 2.1.1 pypi_0 pypi
matplotlib 3.5.1 pypi_0 pypi
msgpack 1.0.3 pypi_0 pypi
msgpack-numpy 0.4.7.1 pypi_0 pypi
ncurses 6.3 h1a28f6b_3
numpy 1.22.0 pypi_0 pypi
oauthlib 3.2.0 pypi_0 pypi
openssl 1.1.1q h1a28f6b_0
opt-einsum 3.3.0 pypi_0 pypi
packaging 21.3 pypi_0 pypi
pillow 9.2.0 pypi_0 pypi
pip 22.1.2 py310hca03da5_0
protobuf 3.19.4 pypi_0 pypi
pyasn1 0.4.8 pypi_0 pypi
pyasn1-modules 0.2.8 pypi_0 pypi
pyparsing 3.0.9 pypi_0 pypi
python 3.10.4 hbdb9e5c_0
python-dateutil 2.8.2 pypi_0 pypi
readline 8.1.2 h1a28f6b_1
requests 2.28.1 pypi_0 pypi
requests-oauthlib 1.3.1 pypi_0 pypi
rsa 4.9 pypi_0 pypi
setuptools 61.2.0 py310hca03da5_0
six 1.15.0 pypi_0 pypi
sqlite 3.39.2 h1058600_0
tensorboard 2.9.1 pypi_0 pypi
tensorboard-data-server 0.6.1 pypi_0 pypi
tensorboard-plugin-wit 1.8.1 pypi_0 pypi
tensorflow-deps 2.8.0 0 apple
tensorflow-estimator 2.9.0 pypi_0 pypi
tensorflow-macos 2.9.2 pypi_0 pypi
tensorflow-metal 0.5.0 pypi_0 pypi
tensorforce 0.6.5 pypi_0 pypi
termcolor 1.1.0 pypi_0 pypi
tk 8.6.12 hb8d0fd4_0
tqdm 4.62.3 pypi_0 pypi
typing-extensions 4.3.0 pypi_0 pypi
tzdata 2022a hda174b7_0
urllib3 1.26.11 pypi_0 pypi
werkzeug 2.2.2 pypi_0 pypi
wheel 0.37.1 pyhd3eb1b0_0
wrapt 1.14.1 pypi_0 pypi
xz 5.2.5 h1a28f6b_1
zlib 1.2.12 h5a0b063_2


Is there any way to address this issue?
I also tried downgrading Python to 3.9, which did not work.
Is macOS not supported when using tensorflow-metal?
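One possible workaround, which is an assumption and has not been confirmed by the maintainers, is to hide the Metal GPU from TensorFlow so that every op, including `CheckNumerics`, is placed on the CPU. The sketch below uses `tf.config.set_visible_devices`; the helper name `hide_gpu_from_tensorflow` is hypothetical. It would need to run before any TensorFlow op or the Tensorforce agent is created, and it gives up GPU acceleration entirely.

```python
def hide_gpu_from_tensorflow() -> bool:
    """Try to restrict TensorFlow to CPU devices.

    Returns True if no GPU is visible afterwards, and False if TensorFlow
    is not installed or the runtime was already initialized.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return False
    try:
        # An empty list hides all physical devices of type "GPU".
        tf.config.set_visible_devices([], "GPU")
    except RuntimeError:
        # Visible devices cannot be changed once the runtime is initialized.
        return False
    visible = tf.config.get_visible_devices()
    return all(dev.device_type != "GPU" for dev in visible)


if __name__ == "__main__":
    print("CPU-only:", hide_gpu_from_tensorflow())
```

An alternative worth trying is `tf.config.set_soft_device_placement(True)`, which lets TensorFlow fall back to a supported device for ops without a GPU kernel; whether that resolves this particular colocation error is untested here.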

Thank you.
