Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement multi-GPU C/Fortran methods #254

Merged
merged 7 commits into from May 19, 2022
Merged

Conversation

ashao
Copy link
Member

@ashao ashao commented May 12, 2022

Convenience functions for multiple GPU systems had previously been
implemented in C++. This adds support for C and Fortran clients
for [set,delete,run]_[model,script]_multigpu. This also adds a variant
of the existing MNIST test that uses these new methods. The running
of this test can be toggled by setting the new environment variable
SMARTREDIS_TEST_DEVICE=gpu. This is by default set (noisly) in
setup_test_env.sh to CPU. Tests pass on osprey.

ashao added 6 commits May 2, 2022 16:28
The C++ client referred to the parameter `first_cpu` instead of
`first_gpu`. This does not have any effect on the code itself.
The C-client now has the multigpu-function documentation and signatures.
The methods themselves still need to be implemented.
- Refactor some of the parameter checking
- Remove repeated code that determined if the backend was TF
  or TFLite
- Implement all set/run multigpu methods in C++ client
The MNIST tester has been modified to use the multigpu interfaces.
This is simultaneously used to test both the Fortran and C
implementations.
Adds the multigpu methods to delete scripts and models from the
database. This also adds the variable SMARTREDIS_TEST_DEVICE that
can be used to toggle the availability of GPU-related tests.
@ashao ashao requested a review from billschereriii May 12, 2022 23:20
@codecov-commenter
Copy link

codecov-commenter commented May 12, 2022

Codecov Report

Merging #254 (c23e318) into develop (a99a714) will increase coverage by 0.76%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #254      +/-   ##
===========================================
+ Coverage    79.39%   80.15%   +0.76%     
===========================================
  Files           52       54       +2     
  Lines         3091     3326     +235     
===========================================
+ Hits          2454     2666     +212     
- Misses         637      660      +23     
Impacted Files Coverage Δ
include/client.h 100.00% <ø> (ø)
src/cpp/commandlist.cpp 87.80% <0.00%> (-3.63%) ⬇️
include/dataset.h 100.00% <0.00%> (ø)
include/metadata.h 100.00% <0.00%> (ø)
include/tensorpack.h 100.00% <0.00%> (ø)
include/redisserver.h 33.33% <0.00%> (ø)
include/sharedmemorylist.h 100.00% <0.00%> (ø)
src/cpp/pipelinereply.cpp 36.84% <0.00%> (ø)
include/pipelinereply.h 50.00% <0.00%> (ø)
... and 4 more

src/c/c_client.cpp Outdated Show resolved Hide resolved
src/c/c_client.cpp Outdated Show resolved Hide resolved
src/c/c_client.cpp Outdated Show resolved Hide resolved
src/c/c_client.cpp Outdated Show resolved Hide resolved
src/c/c_client.cpp Outdated Show resolved Hide resolved
@billschereriii
Copy link
Contributor

Generally looks good. A few cosmetic issues and a few questions for you to address

- Rearrange order of comments
- Remove unnecessary string construction
- Reorder function declarations
- Add brief documentation for testing `SMARTREDIS_TEST_DEVICE`
- Check for allocated status before deallocating
Copy link
Contributor

@billschereriii billschereriii left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks for taking care of this!

@ashao ashao merged commit fea2289 into CrayLabs:develop May 19, 2022
@ashao ashao deleted the c_multimodel branch May 19, 2022 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants