Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle the so loading in dev mode #1909

Merged
merged 9 commits into from Jan 20, 2021
Merged

Conversation

bhack
Copy link
Contributor

@bhack bhack commented Jun 3, 2020

No description provided.

@boring-cyborg boring-cyborg bot added the test-cases Related to Addons tests label Jun 4, 2020
@@ -34,5 +34,5 @@ if ! [ -x "$(command -v nvidia-smi)" ]; then
EXTRA_ARGS="-n auto"
fi


bazel clean
Copy link
Contributor Author

@bhack bhack Jun 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanpmorgan I want to remove this but if I remove this it find some so in the cache that I don't understands where it comes from:
`

tensorflow.python.framework.errors_impl.NotFoundError: /addons/bazel-bin/tensorflow_addons/custom_ops/layers/lib_correlation_cost_ops_gpu.so: undefined symbol: _ZNK10tensorflow6Tensor21CheckTypeAndIsAlignedENS_8DataTypeE

In the local build I don't find any lib_correlation_cost_ops_gpu.so

@bhack
Copy link
Contributor Author

bhack commented Jun 4, 2020

Just as a reminder note:
I don't know how this PR works in windows and osx build release wheels.
But probably if it works there we could try to totally remove https://github.com/tensorflow/addons/blob/master/tools/install_so_files.sh

@@ -9,4 +9,4 @@ docker build \
--build-arg TF_VERSION=2.2.0 \
--build-arg PY_VERSION=3.5 \
-t tfa_gpu_tests ./
docker run --rm -t -v cache_bazel:/root/.cache/bazel --gpus=all tfa_gpu_tests
docker run --rm -t --gpus=all tfa_gpu_tests
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is working so I don't know if anybody remember why the bazel cache volume was required here but we could revert this change.

@gabrieldemarmiesse
Copy link
Member

Could you give a bit of background to help us understand what you want to enable with this pull request?

@bhack
Copy link
Contributor Author

bhack commented Jun 6, 2020

@seanpmorgan knows a little bit more the context of this cause it is partially related to #1888 but it doesn't depend on that.

When you develop in a container env generally and/or you are integrating in the IDO you don't want to have to much manual steps on bash scripts or have in you source tree files produced in the container environment like when you mount in the container your tf_addons checkout volume on the host and you need to run internally install_so_files.sh in the container.

With this we are trying to detect if we are in "developer mode" so that we don't need to continuously manually run install_so_files.sh when we are developing on custom_op cause it will enable so loading from the bazel-bin. With that you will also not have .so file in the source tree on the host when you exit from the container.

Copy link
Member

@seanpmorgan seanpmorgan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks. Apologies for the very slow review. I was debating if we could avoid users paying this dev tool directory lookup, but it's minimal and done only when a custom op is called for the first time.

@seanpmorgan seanpmorgan merged commit 955ad99 into tensorflow:master Jan 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes test-cases Related to Addons tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants