
SINGA-487 Add Sparsification Algorithms #566

Merged
3 commits merged into apache:master on Dec 19, 2019

Conversation


@chrishkchris chrishkchris commented Dec 5, 2019

This PR implements several sparsification schemes: only the significant gradient elements are transferred. Because CUDA Thrust parallel algorithms are used to convert the dense array into a sparse array, the overhead is relatively low.

It supports two modes, controlled by the flag topK:

  1. When topK is False, it transmits the gradient elements whose magnitude is greater than an absolute threshold value.
  2. When topK is True, it transmits the K largest gradient elements, where K equals the total number of elements multiplied by the spars factor.

In addition, the flag corr controls whether the locally accumulated gradient is used for correction. It is True by default, because local gradient accumulation is commonly used together with sparsification. A conceptual sketch of the two selection modes is given below.
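As a rough, CPU-side illustration of the two selection modes (the actual implementation runs on the GPU with CUDA Thrust), the NumPy sketch below picks the elements to transmit; the helper name select_sparse_gradient and the use of NumPy are illustrative only and not part of this PR:

import numpy as np

def select_sparse_gradient(grad, spars, topK=False):
    # Illustrative dense-to-sparse selection: return indices and values
    # of the gradient elements that would be transmitted.
    flat = grad.ravel()
    if topK:
        # keep the K largest elements by magnitude, K = spars * total size
        k = max(1, int(spars * flat.size))
        idx = np.argpartition(np.abs(flat), -k)[-k:]
    else:
        # keep elements whose magnitude exceeds the absolute threshold spars
        idx = np.nonzero(np.abs(flat) > spars)[0]
    return idx, flat[idx]

# Local gradient accumulation (the corr flag): elements that were not
# transmitted are kept in a residual and added to the next gradient.
residual = np.zeros(1000, dtype=np.float32)
grad = np.random.randn(1000).astype(np.float32) + residual
idx, vals = select_sparse_gradient(grad, spars=0.05, topK=True)
residual = grad.copy()
residual[idx] = 0.0  # transmitted elements are removed from the residual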

Some reference papers for the sparsification:
[1] N. Strom. Scalable distributed DNN training using commodity GPU cloud computing. In Proceedings of InterSpeech 2015. International Speech Communication Association (ISCA), September 2015.
[2] A. F. Aji and K. Heafield. Sparse communication for distributed gradient descent. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP 2017), pages 440-445. Association for Computational Linguistics (ACL), September 2017.

I have added an example file, sparsification_mnist.py, to test the accuracy. The following results are from an AWS p2.8xlarge instance with 8 K80 GPUs.

ubuntu@ip-172-31-18-216:~/singa/examples/autograd$ python3 sparsification_mnist.py
Starting Epoch 0:
Training loss = 1237.824951, training accuracy = 0.537627
Evaluation accuracy = 0.831209, Elapsed Time = 1.364238s
Starting Epoch 1:
Training loss = 468.859161, training accuracy = 0.835053
Evaluation accuracy = 0.931229, Elapsed Time = 0.687484s
Starting Epoch 2:
Training loss = 329.488220, training accuracy = 0.887604
Evaluation accuracy = 0.949424, Elapsed Time = 0.713595s
Starting Epoch 3:
Training loss = 220.463303, training accuracy = 0.925731
Evaluation accuracy = 0.955592, Elapsed Time = 0.686450s
Starting Epoch 4:
Training loss = 171.178146, training accuracy = 0.942141
Evaluation accuracy = 0.961760, Elapsed Time = 0.686534s
Starting Epoch 5:
Training loss = 149.635681, training accuracy = 0.950237
Evaluation accuracy = 0.974198, Elapsed Time = 0.686791s
Starting Epoch 6:
Training loss = 124.092453, training accuracy = 0.958300
Evaluation accuracy = 0.973376, Elapsed Time = 0.686136s
Starting Epoch 7:
Training loss = 115.288582, training accuracy = 0.961205
Evaluation accuracy = 0.968647, Elapsed Time = 0.686174s
Starting Epoch 8:
Training loss = 99.048584, training accuracy = 0.966864
Evaluation accuracy = 0.981188, Elapsed Time = 0.685848s
Starting Epoch 9:
Training loss = 84.038574, training accuracy = 0.972239
Evaluation accuracy = 0.981188, Elapsed Time = 0.685568s

@chrishkchris changed the title from "SINGA-487 Add Sparsification Algorithm: Threshold Quantization" to "SINGA-487 Add Sparsification Algorithms" on Dec 6, 2019
@chrishkchris (Contributor, Author) commented

I will need more time to study it, so I will close it for now and reopen it when it is ready. I will work on more important things first.

@chrishkchris (Contributor, Author) commented Dec 18, 2019

I have improved the code.
Below is a ResNet-50 throughput test using AWS p2.8xlarge instances, scaling up to 4 instances (32 K80 GPUs), with:
sgd.backward_and_spars_update(loss, spars=0.05, topK=False, corr=True)
[attached image: throughput test results]
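For context, here is a minimal sketch of how that call might sit in a data-parallel training loop; the optimizer construction (opt.SGD wrapped in opt.DistOpt), the loss choice, and the model and data names are assumptions based on the other SINGA autograd examples, not something fixed by this PR:

from singa import autograd, opt

# Sketch only: "model" and "train_data" are placeholders; opt.SGD and
# opt.DistOpt follow the pattern of the other SINGA autograd examples.
sgd = opt.SGD(lr=0.005, momentum=0.9)
sgd = opt.DistOpt(sgd)  # distributed wrapper, one process per GPU

for epoch in range(10):
    for x_batch, y_batch in train_data:
        out = model(x_batch)                                  # forward pass
        loss = autograd.softmax_cross_entropy(out, y_batch)
        # Sparsified all-reduce: transmit only gradients whose magnitude
        # exceeds the absolute threshold 0.05, with local accumulation.
        sgd.backward_and_spars_update(loss, spars=0.05, topK=False, corr=True)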

@chrishkchris chrishkchris reopened this Dec 18, 2019
@nudles nudles merged commit 695d9da into apache:master Dec 19, 2019