Questions:

1. About the input-space size and the number of softmax entries for selecting a state/node: do you assume a maximum number of concurrently runnable nodes/stages in the system? From the paper (Figure 6), it seems that this value (n) must be predefined and the softmax must have the same number of input/output entries. Is that correct?
2. The softmax for selecting the maximum parallelism also seems to need a fixed number of input/output entries beforehand. You state in the paper, "Since the number of possible limits can be as large as the number of executors," so if we want to apply your solution to a new, larger cluster, the number of softmax entries would have to grow in proportion to the cluster size. Is my understanding correct?
Answers from Hongzi:
1. No, we don't restrict the total number of nodes. Note that the softmax operation is scale-free: its input can have arbitrary size (it is just exponentials normalized by their sum). Check out the softmax function in TensorFlow or PyTorch and how it applies to the input vector.
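The scale-free property can be seen with a minimal NumPy sketch (illustrative only, not the Decima code): the same function applies to score vectors of any length, so the number of runnable nodes per scheduling event need not be fixed in advance.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; works for any input length.
    e = np.exp(x - np.max(x))
    return e / e.sum()

# 3 runnable nodes at one scheduling event, 5 at another: same softmax.
p3 = softmax(np.array([1.0, 2.0, 3.0]))
p5 = softmax(np.array([0.5, 1.0, 0.0, 2.0, 1.5]))
# Both outputs are valid probability distributions over their nodes.
```

The output always has the same length as the input and sums to 1, so the per-node scores can be normalized into a selection distribution regardless of how many nodes are currently runnable.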
2. I think your understanding is correct. We were being lazy and used one output node per parallelism limit, so you need n output nodes if you have n executors. A more scalable approach is to output the parallelism limit directly as a number: express it as a continuous value (rounded afterwards) via a Gaussian distribution, where the neural network outputs the mean and you sample from the Gaussian (analogous to sampling from the softmax output).
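The Gaussian alternative Hongzi describes could look like the following sketch (the function name, fixed standard deviation, and clipping range are my assumptions for illustration, not part of the paper or code):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_parallelism_limit(mean, std, num_executors):
    # Sample a continuous limit from N(mean, std), then round and clip
    # to the valid integer range [1, num_executors]. The network would
    # output `mean`; here it is just a given number.
    raw = rng.normal(mean, std)
    return int(np.clip(round(raw), 1, num_executors))

limit = sample_parallelism_limit(mean=10.0, std=2.0, num_executors=50)
```

Because the network outputs a single mean rather than one logit per executor, the same head works unchanged on a larger cluster; only the clipping bound changes.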
Thanks!
@hongzimao
After going through your code, I think I understand your explanation for question 1. I just want to confirm my understanding:
The softmax (surrounded by the green box in the figure below) may have a varying number of input entries across scheduling events. This is because the last fully-connected layer of the actor network has output size 1, so the actor network's output after reshaping has shape [batchSize, numberOfNodes], where numberOfNodes varies.
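If that understanding is right, the mechanism can be sketched as follows (a simplified NumPy stand-in for the actual TensorFlow graph; the weight names and embedding dimension are illustrative):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def node_scores(node_embeddings, w, b):
    # A shared fully-connected layer with output size 1: the same weights
    # map each node embedding to one scalar score, so the number of
    # scores automatically matches the number of runnable nodes.
    return node_embeddings @ w + b  # shape: [num_nodes]

w = 0.1 * np.ones(4)  # toy weights for embedding dimension 4
scores_a = node_scores(np.random.rand(3, 4), w, 0.0)  # 3 runnable nodes
scores_b = node_scores(np.random.rand(7, 4), w, 0.0)  # 7 runnable nodes
probs_a = softmax(scores_a)  # distribution over 3 nodes
probs_b = softmax(scores_b)  # distribution over 7 nodes
```

The per-node FC layer is applied independently to every node, which is why reshaping to [batchSize, numberOfNodes] works even though numberOfNodes differs between scheduling events.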