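It is a supervised learning algorithm (it is trained on labeled data). It is generally used for classification. It can be relatively robust to outliers that lie far from the boundary, because the decision boundary depends only on the points nearest to it rather than on every point. It finds a hyperplane that separates the classes. It can be used for both linear and nonlinear classification.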
The hyperplane is chosen to maximize its distance from the nearest points on both sides. This distance is called the margin, and the nearest points are called support vectors.
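As a rough illustration of the margin idea, the sketch below computes the distance from the closest point to each of two candidate hyperplanes on a tiny made-up 2D dataset. The data, the function name `geometric_margin`, and both candidate planes are assumptions for illustration only. A real solver finds the best plane by optimization rather than by comparing hand-picked candidates.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance from any point to the plane w.x + b = 0.

    A positive result means every point is on its correct side;
    the value is then the distance to the closest point.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    distances = [
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    ]
    return min(distances)

# Hypothetical toy data: two points per class.
points = [(1.0, 1.0), (2.0, 0.0), (4.0, 4.0), (5.0, 3.0)]
labels = [-1, -1, 1, 1]

# Two hand-picked candidate planes that both separate the classes.
plane_a = ((1.0, 1.0), -5.0)  # x1 + x2 = 5
plane_b = ((1.0, 0.0), -2.5)  # x1 = 2.5

margin_a = geometric_margin(*plane_a, points, labels)
margin_b = geometric_margin(*plane_b, points, labels)

print(f"margin of plane A: {margin_a:.3f}")
print(f"margin of plane B: {margin_b:.3f}")
```

Both planes classify every point correctly, but plane A leaves more room between itself and the closest points, so it is the one preferred under the maximum-margin rule.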
For datasets that are not linearly separable, we use a nonlinear kernel.
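To give a feel for why a nonlinear kernel helps, here is a small sketch that uses a hand-written feature map instead of a full kernel-based solver. The 1D data, the map `phi(x) = (x, x^2)`, the helper names, and the separating line are all assumptions chosen for illustration. No single cut point separates the classes on the original line, but after the mapping a straight line does. A kernel gives the inner products in such a lifted space without building the mapped features explicitly.

```python
# Hypothetical 1D data: the inner points are class -1, the outer points class +1.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [1, -1, -1, 1]

def separable_by_threshold(values, labels):
    """Check whether some cut point on the line splits the two classes."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == -1]
    return max(pos) < min(neg) or max(neg) < min(pos)

def phi(x):
    """Illustrative feature map: lift x to the 2D point (x, x^2)."""
    return (x, x * x)

def poly_kernel(x, z):
    """Inner product of phi(x) and phi(z), computed without building phi."""
    return x * z + (x * x) * (z * z)

# In the original space, no threshold works.
linear_ok = separable_by_threshold(xs, ys)

# In the lifted space, the line x^2 = 2.5 (w = (0, 1), b = -2.5) separates them.
w, b = (0.0, 1.0), -2.5
lifted_ok = all(
    y * (w[0] * phi(x)[0] + w[1] * phi(x)[1] + b) > 0
    for x, y in zip(xs, ys)
)

print(linear_ok, lifted_ok)
```

Here `poly_kernel` returns the same value as the dot product of `phi(x)` and `phi(z)`, which is the property that lets a kernel method stay in the original input space while behaving as if it worked in the lifted one.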