Implement multi-GPU C/Fortran methods #254

ashao · 2022-05-12T23:20:30Z

Convenience functions for multi-GPU systems had previously been
implemented in C++. This adds support in the C and Fortran clients
for [set,delete,run]_[model,script]_multigpu. This also adds a variant
of the existing MNIST test that uses these new methods. That test
can be toggled by setting the new environment variable
SMARTREDIS_TEST_DEVICE=gpu. By default, setup_test_env.sh sets this
variable (noisily) to CPU. Tests pass on osprey.
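
One way a test harness could honor such a toggle is sketched below. This is only an illustration under stated assumptions: the helper names `gpu_tests_enabled` and `select_test_names`, the test names, the case-insensitive comparison, and the "unset means CPU" fallback are all hypothetical and are not taken from the SmartRedis test suite.

```python
import os


def gpu_tests_enabled(env=None):
    """Return True when SMARTREDIS_TEST_DEVICE asks for GPU tests.

    Assumptions (not confirmed by this PR): the value is compared
    case-insensitively, and an unset variable is treated as "cpu".
    """
    env = os.environ if env is None else env
    device = env.get("SMARTREDIS_TEST_DEVICE", "cpu")
    return device.strip().lower() == "gpu"


def select_test_names(env=None):
    """Pick which (hypothetical) MNIST tests to run for the given environment."""
    names = ["mnist"]  # the CPU-capable test always runs
    if gpu_tests_enabled(env):
        names.append("mnist_multigpu")  # the GPU-only variant is opt-in
    return names
```

Passing the environment as a plain mapping keeps the check easy to exercise without modifying the real process environment, for example `select_test_names({"SMARTREDIS_TEST_DEVICE": "gpu"})`.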

codecov-commenter · 2022-05-12T23:31:00Z

Codecov Report

Merging #254 (c23e318) into develop (a99a714) will increase coverage by 0.76%.
The diff coverage is n/a.

@@             Coverage Diff             @@
##           develop     #254      +/-   ##
===========================================
+ Coverage    79.39%   80.15%   +0.76%     
===========================================
  Files           52       54       +2     
  Lines         3091     3326     +235     
===========================================
+ Hits          2454     2666     +212     
- Misses         637      660      +23

| Impacted Files | Coverage Δ | |
|---|---|---|
| include/client.h | `100.00% <ø> (ø)` | |
| src/cpp/commandlist.cpp | `87.80% <0.00%> (-3.63%)` | ⬇️ |
| include/dataset.h | `100.00% <0.00%> (ø)` | |
| include/metadata.h | `100.00% <0.00%> (ø)` | |
| include/tensorpack.h | `100.00% <0.00%> (ø)` | |
| include/redisserver.h | `33.33% <0.00%> (ø)` | |
| include/sharedmemorylist.h | `100.00% <0.00%> (ø)` | |
| src/cpp/pipelinereply.cpp | `36.84% <0.00%> (ø)` | |
| include/pipelinereply.h | `50.00% <0.00%> (ø)` | |

... and 4 more

src/c/c_client.cpp

src/fortran/client.F90

src/fortran/client/script_interfaces.inc

tests/fortran/test_fortran_client.py

utils/create_cluster/slurm_cluster.py

billschereriii · 2022-05-13T19:42:44Z

Generally looks good. There are a few cosmetic issues and a few questions for you to address.

utils/create_cluster/local_cluster.py

- Rearrange order of comments
- Remove unnecessary string construction
- Reorder function declarations
- Add brief documentation for testing `SMARTREDIS_TEST_DEVICE`
- Check for allocated status before deallocating

billschereriii

Looks great, thanks for taking care of this!

ashao added 6 commits May 2, 2022 16:28

Fix typo in docstring

63eab1a

The C++ client referred to the parameter `first_cpu` instead of `first_gpu`. This does not have any effect on the code itself.

Add signatures and docs for C-client multigpu

376608f

The C-client now has the multigpu-function documentation and signatures. The methods themselves still need to be implemented.

Implement set_model multigpu methods for C and Fortran

e85d2b4

- Refactor some of the parameter checking
- Remove repeated code that determined if the backend was TF or TFLite
- Implement all set/run multigpu methods in C++ client
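
For readers unfamiliar with the `first_gpu` parameter mentioned in this thread, the sketch below shows one plausible device-selection convention for multi-GPU helpers. It is an assumption rather than a description of the SmartRedis implementation: it supposes that a model is stored once per device on a contiguous range starting at `first_gpu`, and that a caller-supplied offset (such as an MPI rank) is mapped onto that range round-robin. The function names, the `num_gpus` and `offset` parameters, and the `GPU:<n>` device strings are hypothetical.

```python
def gpu_device_names(first_gpu, num_gpus):
    """List the devices a model would be stored on (assumed contiguous range)."""
    if first_gpu < 0:
        raise ValueError("first_gpu must be non-negative")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return [f"GPU:{index}" for index in range(first_gpu, first_gpu + num_gpus)]


def device_for_offset(offset, first_gpu, num_gpus):
    """Map a caller offset (e.g. an MPI rank) to one device, round-robin."""
    devices = gpu_device_names(first_gpu, num_gpus)
    # Python's % is non-negative for a positive divisor, so any integer
    # offset lands inside the device list.
    return devices[offset % num_gpus]
```

Under this assumed convention, with `first_gpu=1` and `num_gpus=3`, offsets 0 through 5 cycle through `GPU:1`, `GPU:2`, `GPU:3` twice.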

Add multigpu Fortran MNIST test

94d9829

The MNIST tester has been modified to use the multigpu interfaces. This is simultaneously used to test both the Fortran and C implementations.

Merge branch 'develop' of github.com:CrayLabs/SmartRedis into c_multi…

60dd4bb

…model

Add delete_[script,model]_multigpu methods

2d48896

Adds the multigpu methods to delete scripts and models from the database. This also adds the variable SMARTREDIS_TEST_DEVICE that can be used to toggle the availability of GPU-related tests.

ashao requested a review from billschereriii May 12, 2022 23:20