add gradient normalization and accumulation in supervised_training_step_* functions

## 🚀 Feature

After #1589, `grad_norm` still needs to be added (#419). [Gradient accumulation](https://pytorch.org/ignite/master/faq.html#gradients-accumulation) is also common in supervised training, so we should add it as well to make the `supervised_training_step_*` functions cover the full supervised training workflow.
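
To make the request concrete, here is a minimal sketch of what a training step with both options could look like in plain PyTorch. It is not ignite's actual API: the factory name `make_training_step` and the parameters `gradient_accumulation_steps` and `max_grad_norm` are hypothetical, the step takes only a batch instead of `(engine, batch)`, and it assumes `grad_norm` means clipping the gradient norm with `torch.nn.utils.clip_grad_norm_`.

```python
import torch
from torch import nn


def make_training_step(model, optimizer, loss_fn,
                       gradient_accumulation_steps=1, max_grad_norm=None):
    """Hypothetical factory; names and parameters are illustrative only."""
    if gradient_accumulation_steps < 1:
        raise ValueError("gradient_accumulation_steps must be >= 1")
    state = {"iteration": 0}  # stands in for engine.state.iteration

    def update(batch):
        state["iteration"] += 1
        model.train()
        x, y = batch
        loss = loss_fn(model(x), y)
        # Scale the loss so the accumulated gradient is the average over
        # the micro-batches rather than their sum.
        (loss / gradient_accumulation_steps).backward()
        # Only step the optimizer every `gradient_accumulation_steps` calls.
        if state["iteration"] % gradient_accumulation_steps == 0:
            if max_grad_norm is not None:
                # Assumption: `grad_norm` means clipping the total gradient norm.
                torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
            optimizer.step()
            optimizer.zero_grad()
        return loss.item()

    return update


# Usage: two micro-batches of 4 samples behave like one batch of 8.
torch.manual_seed(0)
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
step = make_training_step(model, optimizer, nn.MSELoss(),
                          gradient_accumulation_steps=2, max_grad_norm=1.0)
x, y = torch.randn(8, 4), torch.randn(8, 1)
before = model.weight.detach().clone()
step((x[:4], y[:4]))  # gradients are accumulated, no optimizer step yet
unchanged = torch.equal(before, model.weight)
step((x[4:], y[4:]))  # second micro-batch triggers clipping and the step
updated = not torch.equal(before, model.weight)
```

With both arguments left at their defaults (`1` and `None`), the sketch reduces to an ordinary forward, backward and optimizer step on every call, so existing behavior would be preserved.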