remove distributed requirement from model building (microsoft#31)
* Remove the distributed requirement from model building; this is needed for MPI/AML support for this model.

* Update modelingpreln.py

Removed the print statement; without the rank check it would otherwise be printed many times.

Co-authored-by: Samyam Rajbhandari <samyamr@microsoft.com>
jeffra and samyam committed Aug 13, 2020
1 parent 9e2c34e commit 71ab29d
Showing 2 changed files with 3 additions and 7 deletions.
3 changes: 0 additions & 3 deletions bing_bert/deepspeed_train.py
@@ -368,9 +368,6 @@ def prepare_optimizer_parameters(args, model):


 def prepare_model_optimizer(args):
-    # Initialize torch distributed
-    torch.distributed.init_process_group(backend="nccl")
-
     # Loading Model
     model = BertMultiTask(args)
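With the unconditional NCCL init gone from prepare_model_optimizer, process-group setup is left to whatever launched the job (an MPI/AML launcher or the DeepSpeed runtime). As one illustration of deferring that setup, here is a minimal sketch using a hypothetical maybe_init_distributed helper that is not part of this commit:

import os
import torch

def maybe_init_distributed(backend="nccl"):
    # Hypothetical helper (not part of this commit): initialize torch.distributed
    # only when a launcher has exported the usual env:// variables and no process
    # group exists yet, so MPI/AML launchers or the DeepSpeed runtime can own
    # distributed setup instead of the model-building code.
    if not torch.distributed.is_available() or torch.distributed.is_initialized():
        return
    if "RANK" in os.environ and "WORLD_SIZE" in os.environ:
        torch.distributed.init_process_group(backend=backend)
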
7 changes: 3 additions & 4 deletions bing_bert/nvidia/modelingpreln.py
@@ -739,10 +739,9 @@ def init_bert_weights(self, module):
             num_layers = self.config.num_hidden_layers
             std = self.config.initializer_range
             if hasattr(module, 'bert_output_layer'):
-                if torch.distributed.get_rank() == 0:
-                    print("Accounting for accumulation on the residual path")
-                std = self.config.initializer_range / math.sqrt(
-                    2.0 * num_layers)
+                #print("Accounting for accumulation on the residual path")
+                std = self.config.initializer_range / math.sqrt(
+                    2.0 * num_layers)
             module.weight.data.normal_(mean=0.0, std=std)
         elif isinstance(module, BertLayerNorm):
             module.bias.data.zero_()
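For reference, the hunk above keeps the scaled initialization for output layers: the standard deviation is divided by sqrt(2 * num_layers) to account for accumulation on the residual path. A standalone sketch with assumed values (typical BERT-large settings, not taken from this diff):

import math
import torch.nn as nn

# Assumed illustrative values (BERT-large style defaults, not from this commit).
initializer_range = 0.02
num_layers = 24

# Output-projection weights get a smaller std so that deeper stacks add less
# variance on the residual path: 0.02 / sqrt(2 * 24) ~= 0.0029.
std = initializer_range / math.sqrt(2.0 * num_layers)

output_layer = nn.Linear(1024, 1024)
output_layer.weight.data.normal_(mean=0.0, std=std)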
