This intuition and visualization was inspired by finishing the first course of the Deep Learning Specialization by Prof. Andrew Ng.
In one of the videos he explains the derivatives of the most commonly used activation functions for training a neural network.
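As a quick reference, here is a small sketch (my own, not from the course) of three common activations and their derivatives, using the standard identities that express each derivative in terms of the activation output:

```python
import numpy as np

# Common activation functions and their derivatives. Each derivative
# is written in terms of the activation output a = g(z), which is the
# form typically used in backpropagation.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_deriv(z):
    a = sigmoid(z)
    return a * (1 - a)            # g'(z) = a(1 - a)

def tanh_deriv(z):
    a = np.tanh(z)
    return 1 - a ** 2             # g'(z) = 1 - a^2

def relu_deriv(z):
    return (z > 0).astype(float)  # g'(z) = 1 if z > 0, else 0
```

For example, `sigmoid_deriv(0.0)` gives `0.25`, the maximum slope of the sigmoid, which is one reason deep sigmoid networks suffer from vanishing gradients.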
We need non-linear activation functions because non-linearity gives our model the ability to represent more complex functions. If we only used linear activation functions, the model could represent nothing but linear maps, no matter how deep we made it. You can find more here
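The collapse of stacked linear layers into a single linear layer can be checked numerically. The sketch below (illustrative, with arbitrary random weights) passes an input through two linear "layers" and shows the result equals one combined linear layer:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two layers with identity (linear) activation.
W1, b1 = rng.normal(size=(4, 3)), rng.normal(size=4)
W2, b2 = rng.normal(size=(2, 4)), rng.normal(size=2)

x = rng.normal(size=3)

# "Deep" forward pass: two linear layers in sequence.
deep_out = W2 @ (W1 @ x + b1) + b2

# The composition is itself a single linear map W x + b.
W = W2 @ W1
b = W2 @ b1 + b2
shallow_out = W @ x + b

assert np.allclose(deep_out, shallow_out)
```

No matter how many linear layers you stack, the same algebra collapses them into one, which is exactly why a non-linear activation between layers is essential.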