In this notebook, we introduce the Support Vector Machine (SVM) algorithm, a powerful yet simple supervised learning approach to making predictions from data. For classification tasks, the SVM algorithm attempts to divide data in the feature space into distinct categories. By default, this division is performed by constructing hyperplanes that optimally divide the data, that is, with the largest possible margin between the categories. For regression, the hyperplanes are constructed to map the distribution of the data. In both cases, these hyperplanes map linear structures in a non-probabilistic manner. By employing the kernel trick, however, we can implicitly map non-linear data into a higher-dimensional space in which they can be treated as linear, thus enabling SVM to be applied to non-linear problems.

SVMs are powerful algorithms that have gained widespread popularity. This is partly because they are effective in high-dimensional feature spaces, including problems where the number of features is similar to or slightly exceeds the number of instances. Unlike KNN, which places high demands on memory for large data sets because it must retain all of the training data, SVMs can be memory efficient, since only the support vectors are needed to define the hyperplanes. Finally, by using different kernels, SVM can be applied to a wide range of learning tasks.

On the other hand, these models are black boxes, and it can be difficult to explain how they operate, especially on new instances. They also do not, by default, provide probability estimates, since the hyperplane is constructed to cleanly divide the training data.

In this notebook, we first explore the basic formalism of the SVM algorithm, including the construction of hyperplanes and the kernel trick, which enables SVM to be applied to non-linear problems. Next, we explore the application of SVM to classification problems, which is known as support vector classification, or SVC. To introduce this topic, we will once again use the Iris data to construct an SVC estimator, plot the calculated hyperplane, and explore the resulting performance. Next, we will switch to a more complex data set, the adult data. Finally, we will apply SVM to regression problems, which is known as support vector regression. For this we will use the MPG data introduced in previous lessons.
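As a quick preview of the ideas above, the following is a minimal sketch (not the notebook's own code) that fits a support vector classifier to the Iris data with scikit-learn, once with a linear kernel and once with an RBF kernel. The train/test split, the `random_state`, the feature scaling step, and the value `C=1.0` are assumptions chosen only for illustration. The sketch records how many training points end up as support vectors, which is what makes the fitted model compact compared with storing the full training set.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Load the Iris features and labels
X, y = load_iris(return_X_y=True)

# Hold out 30% of the data for testing (split settings are illustrative)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

results = {}
for kernel in ("linear", "rbf"):
    # Scale the features, then fit an SVC with the chosen kernel
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=1.0))
    model.fit(X_train, y_train)

    # make_pipeline names each step after its lowercased class name
    svc = model.named_steps["svc"]

    results[kernel] = {
        "accuracy": model.score(X_test, y_test),
        # Total number of support vectors across all classes
        "n_support": int(svc.n_support_.sum()),
        "n_train": len(X_train),
    }

for kernel, stats in results.items():
    print(kernel, stats)
```

Only the support vectors are needed to make predictions with the fitted model, so `n_support` is typically well below `n_train` on a data set like this. Note also that `SVC` yields class labels and signed decision values by default; probability estimates require constructing it with `probability=True`, which adds an extra calibration step during fitting.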
degr8noble/Support-Vector-Machine-_with_python