
Support other backends than PyTorch using autoray #137

Merged: 99 commits merged into esa:develop on Apr 12, 2022
Conversation

@FHof (Collaborator) commented Mar 1, 2022

Description

The main change is a large code rewrite so that Trapezoid, Simpson, Boole and MonteCarlo can be used with NumPy, PyTorch, JAX and TensorFlow, and VEGAS with NumPy and PyTorch; a usage sketch follows the list below.
Other changes include, for example:

  • More code vectorisation for VEGAS; a performance comparison is shown in the linked SnakeViz visualisations
  • Changes to the VEGAS algorithm behaviour, e.g. avoiding calculations with NaN values
  • Extension of the test code
  • Stricter code linting with flake8 and a corresponding small code cleanup
  • The use of the TORCHQUAD_LOG_LEVEL environment variable
  • Support for (JIT) compilation of the integration, except for VEGAS
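
For illustration, here is a minimal usage sketch of the reworked API (the integrand and domain values are made up, and that set_up_backend accepts "numpy" is an assumption based on the backend list above):

import numpy as np
import torchquad as tq

tq.set_up_backend("numpy", data_type="float32")

def f(x):
    return np.sin(x[:, 0]) + np.exp(x[:, 1])

tp = tq.Trapezoid()
# Pass the backend explicitly ...
result = tp.integrate(f, dim=2, N=10000, integration_domain=[[0, 1], [-1, 1]], backend="numpy")
# ... or let it be inferred from the type of integration_domain:
result = tp.integrate(f, dim=2, N=10000, integration_domain=np.array([[0.0, 1.0], [-1.0, 1.0]]))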

How Has This Been Tested?

Additional tests for all backends

FHof added 29 commits March 14, 2022 17:36
NumPy as a numerical backend is now supported in Trapezoid, Simpson, Boole and MonteCarlo.
The numerical backend is determined by the type of the integration_domain argument.
JAX and TensorFlow now work with the Newton-Cotes rules and MonteCarlo
Since JAX has a special syntax for in-place changes, I replaced the zeros array initialisation with stacking.
To avoid breaking the torch gradients, I used stack instead of array for the h values.
MonteCarlo no longer initialises a tensor full of zeros; I haven't tested the runtime and memory impact of this change yet.
For TensorFlow I replaced the way ravel is called: TensorFlow's ravel only works if NumPy behaviour is enabled, and it is otherwise missing in the latest TensorFlow version.
This is used if integration_domain is, for example, a list.
I added an RNG helper class to maintain a state for random number generation if the backend supports it. This should reduce problems when an integrand function itself changes a global RNG state.
For consistency between backends, if seed is None, the RNG is initialised with a random state where possible. For the torch backend this means that a previous call to torch.random.manual_seed no longer affects future random number generation in MonteCarlo when the seed argument is unused or None.
* Add the default argument for the seed and adjust the parameter order
* Add a uniform dummy method as a place to document this backend-specific function, which is defined in the constructor
I moved the imports into the functions so that set_precision can work if torch is not installed.
…ntation mistake

* Fix _linspace_with_grads doc: N is int

* Cast the integration_domain to float in _setup_integration_domain so that tensorflow cannot use integers for the domain tensor
  This is required if integration_domain is a list and the backend argument is specified.

* Change _linspace_with_grads so that it does not fail with tensorflow and requires_grad set to True
…DME and docs

I used "conda" and a wildcard in build versions in environment.yml to install numerical backends with CUDA support
jaxlib with CUDA support seems to be only available with pip
With the pip installation of tensorflow, the ~/miniconda3/envs/torchquad/lib/python3.9/site-packages/tests folder appears and breaks the imports in torchquad's tests.
I tried prepending "../" to sys.path instead of appending it, but this did not fix the problem.
I also added a run_example_functions function which, in comparison to compute_test_errors, additionally returns the functions.
Furthermore, the example functions are no longer generated on import but when run_example_functions is called. The test runtime difference due to this change is negligible in comparison to the time required to import a numerical backend.
* Check the integration domain with a _check_integration_domain utils function
  To calculate gradients with tensorflow, integration_domain needs to be a tf.Variable, and len() does not work on this type.
  I changed the input check code so that it uses shape instead of len for tensors.

* Support JIT compilation of an integrator.integrate function over integration_domain with JAX, Torch and TensorFlow
  MonteCarlo does not yet work with all of them.
Now NumPy and TensorFlow support both float32 and float64 precision
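
To illustrate the JIT-compilation commit above, here is a hypothetical sketch using the get_jit_compiled_integrate method mentioned later in this PR; the exact argument names (dim, fn, N, backend) are assumptions, not confirmed by this thread:

import jax.numpy as jnp
import torchquad as tq

tq.set_up_backend("jax", data_type="float32")

def f(x):
    return jnp.sin(x[:, 0]) + jnp.exp(x[:, 1])

simp = tq.Simpson()
# Compile the integration once ...
compiled = simp.get_jit_compiled_integrate(dim=2, fn=f, N=10001, backend="jax")
# ... then re-evaluate it cheaply over different domains:
print(compiled(jnp.array([[0.0, 1.0], [-1.0, 1.0]])))
print(compiled(jnp.array([[0.0, 2.0], [-1.0, 1.0]])))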
@gomezzz (Collaborator) commented Mar 29, 2022

Running torchquad._deployment_test(), I get

18:36:48|TQ-WARNING| Cannot update the VEGASMap. This can happen with an integrand which evaluates to zero everywhere.
18:36:48|TQ-WARNING| Cannot update the VEGASMap. This can happen with an integrand which evaluates to zero everywhere.
18:36:48|TQ-WARNING| Cannot update the VEGASMap. This can happen with an integrand which evaluates to zero everywhere.
18:36:48|TQ-WARNING| Cannot update the VEGASMap. This can happen with an integrand which evaluates to zero everywhere.
18:36:48|TQ-WARNING| Cannot update the VEGASMap. This can happen with an integrand which evaluates to zero everywhere.

Is this expected?

@gomezzz (Collaborator) commented Mar 29, 2022

Installing jax via the all_env does not work and produces the following error:

ModuleNotFoundError: jax requires jaxlib to be installed. See https://github.com/google/jax#installation for installation instructions.

I think this may be a Windows problem. 🤔 I think jax is not natively available on Windows?

Solved it by following https://github.com/cloudhan/jax-windows-builder
and installing jax==0.3.2 matching the wheels here https://whls.blob.core.windows.net/unstable/index.html

Not a huge problem, but it may be worth pointing out that it was tested on Linux? 🤔

@gomezzz (Collaborator) commented Mar 29, 2022

Sending in an array with the wrong backend currently leads to quite cryptic messages.

E.g.

import torchquad as tq
import jax.numpy as jnp

tq.set_up_backend("jax", data_type="float32")

def some_function(x):
    return jnp.sin(x[:, 0]) + jnp.exp(x[:, 1])

trap = tq.Trapezoid()
# Intentionally pass a wrong backend argument ("tensorflow" instead of "jax")
integral_value = trap.integrate(
    some_function,
    dim=2,
    N=10000,
    integration_domain=[[0, 1], [-1, 1]],
    backend="tensorflow",
)

leads to

'EagerTensor' object has no attribute 'ravel'.
        If you are looking for numpy-related methods, please run the following:
        from tensorflow.python.ops.numpy_ops import np_config
        np_config.enable_numpy_behavior()

@gomezzz (Collaborator) commented Mar 29, 2022

Similarly, if I specify no backend, I also get errors. Not sure if we should leave that problem for the user to fix, though 🤔 However, intuitively, if I call set_up_backend, I would naively expect that to be enough.

import torchquad as tq
import jax.numpy as jnp

tq.set_up_backend("jax", data_type="float32")

def some_function(x):
    return jnp.sin(x[:, 0]) + jnp.exp(x[:, 1])

trap = tq.Trapezoid()
# No backend argument specified this time
integral_value = trap.integrate(
    some_function,
    dim=2,
    N=10000,
    integration_domain=[[0, 1], [-1, 1]],
)

produces

Argument 'tensor([0., 0., 0.,  ..., 1., 1., 1.])' of type <class 'torch.Tensor'> is not a valid JAX type.

@FHof (Collaborator, Author) commented Mar 30, 2022

I've tested the installations only on GNU/Linux operating systems and forgot to mention this.
Both the default tensorflow from conda-forge and tensorflow-gpu are built without CUDA, so I've added the cuda* version wildcard to the conda-forge tensorflow so that it installs the CUDA version instead of the CPU version from https://anaconda.org/conda-forge/tensorflow/files?page=1. Unfortunately, this doesn't work on Windows.
The JAX installation instructions, which I took from https://github.com/google/jax#pip-installation-gpu-cuda, also don't work on Windows.
If I remember correctly, there are also some TensorFlow versions which do not support compilation with XLA.

Here are some commands to test for CUDA support:

python3 -c 'import tensorflow as tf; print(tf.test.is_built_with_cuda(), tf.config.list_physical_devices("GPU"))'
  # Show if TensorFlow supports CUDA and a list of found GPU Devices
python3 -c 'import tensorflow as tf; tf.function(lambda x: x, jit_compile=True)(1.0)'
  # Fails if TensorFlow does not support compilation with XLA
CUDA_VISIBLE_DEVICES= python3 -c 'import tensorflow as tf; tf.function(lambda x: x, jit_compile=True)(1.0)'
  # Fails if TensorFlow does not support compilation with XLA on CPU

python3 -c 'import torch; print(torch.version.cuda, torch.cuda.is_available(), torch.backends.cudnn.enabled, torch.backends.cudnn.version(), torch.cuda.is_initialized())'
  # Show the CUDA version supported by PyTorch, a bool if CUDA works,
  # a bool if cudnn works, the cudnn version,
  # and a bool if CUDA is already initialised (which is usually False)

python3 -c 'import jax; print(jax.devices(), jax.devices()[0].device_kind)'
  # Fails if JAX cannot find a GPU; otherwise, it lists the available GPUs
  # and shows the name of the first GPU
  # Set the TF_CPP_MIN_LOG_LEVEL=0 environment variable for more information

The deployment test uses N=101, which is very small, so VEGAS executes warmups without points. This leads to the "Cannot update the VEGASMap." warnings, which show that the VEGASMap updates do not work in the warmup.

The cryptic error message appears because TensorFlow's NumPy behaviour is not enabled.
With an additional set_up_backend("tensorflow"), it should instead show an error for some_function, which executes JAX functions on TensorFlow tensors.
Inferring the backend from the some_function function is difficult, and a user may intentionally use different backends at once (e.g. NumPy and PyTorch), so I don't think it is easily possible to check if a user accidentally used a wrong backend argument.

Instead of the backend argument, the user can also set the integration_domain to a backend-specific tensor, in which case backend is ignored.
If backend is omitted and integration_domain is a list, the backend defaults to "torch" for backwards compatibility. I think it should be possible to change this default backend to the most recent string passed to set_up_backend.
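
To make the last two paragraphs concrete, here is a small sketch (the integrand and domain values are illustrative):

import jax.numpy as jnp
import torchquad as tq

tq.set_up_backend("jax", data_type="float32")

def some_function(x):
    return jnp.sin(x[:, 0]) + jnp.exp(x[:, 1])

trap = tq.Trapezoid()
integral_value = trap.integrate(
    some_function,
    dim=2,
    N=10000,
    # A jax.numpy array instead of a list; the backend is inferred
    # from its type, so no backend argument is needed.
    integration_domain=jnp.array([[0.0, 1.0], [-1.0, 1.0]]),
)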

@gomezzz (Collaborator) commented Mar 30, 2022

python3 -c 'import tensorflow as tf; tf.function(lambda x: x, jit_compile=True)(1.0)'

leads to

Python 3.9.12 | packaged by conda-forge | (main, Mar 24 2022, 23:17:03) [MSC v.1929 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.function(lambda x: x, jit_compile=True)(1.0)
WARNING:tensorflow:AutoGraph could not transform <function <lambda> at 0x000002432A190160> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x000002432A190160>: no matching AST found
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function <lambda> at 0x000002432A190160> and will run it as-is.
Cause: could not parse the source code of <function <lambda> at 0x000002432A190160>: no matching AST found
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
2022-03-30 12:37:41.640332: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-30 12:37:43.118897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5979 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070 SUPER, pci bus id: 0000:26:00.0, compute capability: 7.5
2022-03-30 12:37:43.215911: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x2435fa99e00 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-03-30 12:37:43.216099: I tensorflow/compiler/xla/service/service.cc:179]   StreamExecutor device (0): NVIDIA GeForce RTX 2070 SUPER, Compute Capability 7.5
2022-03-30 12:37:43.329674: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:75] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2022-03-30 12:37:43.329932: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] Searched for CUDA in the following directories:
2022-03-30 12:37:43.330013: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   ./cuda_sdk_lib
2022-03-30 12:37:43.330089: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.3
2022-03-30 12:37:43.330163: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   /usr/local/cuda
2022-03-30 12:37:43.330262: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   .
2022-03-30 12:37:43.330380: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:81] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-03-30 12:37:43.953703: I tensorflow/compiler/jit/xla_compilation_cache.cc:363] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
<tf.Tensor: shape=(), dtype=float32, numpy=1.0>

or directly in the terminal:

2022-03-30 12:39:52.370777: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-03-30 12:39:52.871178: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 5979 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 2070 SUPER, pci bus id: 0000:26:00.0, compute capability: 7.5
2022-03-30 12:39:52.941648: I tensorflow/compiler/xla/service/service.cc:171] XLA service 0x18567e53500 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2022-03-30 12:39:52.942203: I tensorflow/compiler/xla/service/service.cc:179]   StreamExecutor device (0): NVIDIA GeForce RTX 2070 SUPER, Compute Capability 7.5
2022-03-30 12:39:52.949725: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:75] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2022-03-30 12:39:52.950364: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] Searched for CUDA in the following directories:
2022-03-30 12:39:52.950603: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   ./cuda_sdk_lib
2022-03-30 12:39:52.950844: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.3
2022-03-30 12:39:52.951101: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   /usr/local/cuda
2022-03-30 12:39:52.952019: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:79]   .
2022-03-30 12:39:52.952332: W tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:81] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2022-03-30 12:39:53.121560: I tensorflow/compiler/jit/xla_compilation_cache.cc:363] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.

@gomezzz (Collaborator) commented Mar 30, 2022

The deployment test uses N=101, which is very small, so VEGAS executes warmups without points. This leads to the "Cannot update the VEGASMap." warnings, which show that the VEGASMap updates do not work in the warmup.

Then let's increase N! :)

@gomezzz (Collaborator) commented Mar 30, 2022

The cryptic error message appears because TensorFlow's NumPy behaviour is not enabled.
With an additional set_up_backend("tensorflow"), it should instead show an error for some_function, which executes JAX functions on TensorFlow tensors.
Inferring the backend from the some_function function is difficult, and a user may intentionally use different backends at once (e.g. NumPy and PyTorch), so I don't think it is easily possible to check if a user accidentally used a wrong backend argument.

Instead of the backend argument, the user can also set the integration_domain to a backend-specific tensor, in which case backend is ignored.
If backend is omitted and integration_domain is a list, the backend defaults to "torch" for backwards compatibility. I think it should be possible to change this default backend to the most recent string passed to set_up_backend.

Then let's add a global / env variable to track selected backend :)
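
A minimal sketch of one possible shape for this (all names here are hypothetical, not from the torchquad code):

# Hypothetical sketch: remember the most recent backend passed to
# set_up_backend and use it when integrate() gets a list and no backend.
_default_backend = "torch"  # backwards-compatible default

def set_up_backend(backend, data_type=None):
    global _default_backend
    _default_backend = backend  # the existing setup logic would follow here

def _choose_backend(integration_domain, backend=None):
    # An explicit argument wins; lists fall back to the tracked default.
    if backend is not None:
        return backend
    if isinstance(integration_domain, list):
        return _default_backend
    return type(integration_domain).__module__.split(".")[0]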

@gomezzz (Collaborator) commented Mar 30, 2022

If I run the tests locally, the Monte Carlo test now fails for me

FAILED monte_carlo_test.py::test_integrate_torch - assert 28.591666666666697 < 28.0

I also get many, many warnings exactly like this one

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\core\framework\tensor_shape_pb2.py:18
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\core\framework\tensor_shape_pb2.py:18: DeprecationWarning: Call to deprecated create function FileDescriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
    DESCRIPTOR = _descriptor.FileDescriptor(

and a few other ones after that

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\core\framework\attr_value_pb2.py:33
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\core\framework\attr_value_pb2.py:33: DeprecationWarning: Call to deprecated create function Descriptor(). Note: Create unlinked descriptors is going to go away. Please use get/find descriptors from generated code or query the descriptor_pool.
    _ATTRVALUE_LISTVALUE = _descriptor.Descriptor(

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:585
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:585: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    np.object,

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:627
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:627: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    np.object,

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:637
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\dtypes.py:637: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    np.bool,

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\tensor_util.py:176
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\tensor_util.py:176: DeprecationWarning: `np.object` is a deprecated alias for the builtin `object`. To silence this warning, use `object` by itself. Doing this will not modify any behavior and is safe.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    np.object: SlowAppendObjectArrayToTensorProto,

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\tensor_util.py:177
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\framework\tensor_util.py:177: DeprecationWarning: `np.bool` is a deprecated alias for the builtin `bool`. To silence this warning, use `bool` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.bool_` here.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    np.bool: SlowAppendBoolArrayToTensorProto,

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\autograph\impl\api.py:22
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\autograph\impl\api.py:22: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

..\..\..\..\..\..\(...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\ops\numpy_ops\np_random.py:110
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\tensorflow\python\ops\numpy_ops\np_random.py:110: DeprecationWarning: `np.int` is a deprecated alias for the builtin `int`. To silence this warning, use `int` by itself. Doing this will not modify any behavior and is safe. When replacing `np.int`, you may wish to use e.g. `np.int64` or `np.int32` to specify the precision. If you wish to review your current use, check the release note link for additional information.
  Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
    def randint(low, high=None, size=None, dtype=onp.int):  # pylint: disable=missing-function-docstring

torchquad/tests/boole_test.py::test_integrate_torch
  (...)\miniconda3\envs\torchquad_all\lib\site-packages\torch\functional.py:568: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\TensorShape.cpp:2228.)
    return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]

@gomezzz (Collaborator) left a review:

See comments :) almost all minor things, I think

Review comments were left on the following files (all since resolved):

  • environment_all_backends.yml
  • README.md (3 comments)
  • docs/source/install.rst
  • torchquad/integration/newton_cotes.py
  • torchquad/integration/utils.py (2 comments)
  • torchquad/tests/integrator_types_test.py
  • torchquad/utils/set_log_level.py
FHof and others added 7 commits March 31, 2022 10:57
… on GNU/Linux

Co-authored-by: Pablo Gómez <pablo.gomez@gmx.de>
This avoids the 'Cannot update the VEGASMap.' warning, which was shown because there were too few points for the warmup.
* Move the get_jit_compiled_integrate methods to the bottom
* Move the calculate_result methods below the integrate methods
@FHof (Collaborator, Author) commented Mar 31, 2022

Then let's increase N! :)

Then let's add a global / env variable to track selected backend :)

Done.

If I run locally the Monte Carlo test now fails for me

The polynomial y = 7x^4 - 3x^3 + 2x^2 - x + 3 has a large reference solution of 44648.0 / 15.0, so I think the error threshold was too tight. I've changed the thresholds now.

I haven't seen the deprecated FileDescriptor() warning before.

For the deprecation warnings about np.object, np.bool, np.int and the imp module, I had initially added a pyproject.toml file to hide them:
https://github.com/FHof/torchquad/blob/30a0b38afa2ee5e5eb8e94e2475a01882912112c/torchquad/tests/pyproject.toml
However, after updating some packages, these warnings disappeared by themselves even in the develop branch, so I think they may be caused by an old version of TensorFlow or a new version of NumPy.
Since the deprecation warnings are annoying but harmless and disappear with new versions of the backends, I haven't added the pyproject.toml to the develop branch.

The torch.meshgrid warning appears with torch 1.10.0 but not with torch 1.9.1.post3, since the indexing argument was added recently. Using this argument hides the deprecation warning with a new PyTorch version but makes torchquad incompatible with older versions (unless I add a case distinction). torchquad works with both indexing orders.
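
A sketch of such a case distinction (hedged; this PR may handle it differently):

# Sketch of a version case distinction for torch.meshgrid; the default
# order of older torch versions matches indexing="ij".
import torch

coords = [torch.linspace(0.0, 1.0, 5)] * 2
try:
    grid = torch.meshgrid(*coords, indexing="ij")  # torch >= 1.10
except TypeError:
    grid = torch.meshgrid(*coords)  # older torch versions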

The test executions on GPU currently may require environment variables which change the memory allocation behaviour of the backends, since all backends are imported one after another and some of them can reserve the whole GPU memory.
These environment variables are, for example, XLA_PYTHON_CLIENT_PREALLOCATE=false, TF_FORCE_GPU_ALLOW_GROWTH=true and TF_GPU_ALLOCATOR=cuda_malloc_async. It is also possible to execute the tests on the CPU with CUDA_VISIBLE_DEVICES="".
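
For example, the variables can be set in Python before the backends are imported (a sketch; setting them in the shell works as well):

# Sketch: set the allocator-related variables before importing the backends,
# so that no single backend reserves the whole GPU memory during the tests.
import os

os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "false"  # JAX
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"       # TensorFlow
os.environ["TF_GPU_ALLOCATOR"] = "cuda_malloc_async"   # TensorFlow
# Alternatively, run the tests on the CPU only:
# os.environ["CUDA_VISIBLE_DEVICES"] = ""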

@gomezzz (Collaborator) left a review:

Very nice work! 💪 :) I'll be looking to create a new release specifically for this asap (probably early May)

@gomezzz merged commit 32a22f2 into esa:develop on Apr 12, 2022