In this lesson we will build a Support Vector Machine for classification using scikit-learn and the Radial Basis Function (RBF) kernel. Our training data set, from the UCI Machine Learning Repository, contains continuous and categorical data that we will use to predict whether or not a patient has heart disease.
- Importing data into, and manipulating a pandas dataframe.
- Identifying and dealing with missing data.
- Formatting the data for a support vector machine, including One-Hot Encoding.
- Optimizing the parameters of the radial basis function kernel and the classifier.
- Building, evaluating, drawing, and interpreting a support vector machine.
- Task 1: Import the modules that will do all the work
- Task 2: Import the data
- Task 3: Missing Data Part 1: Identifying Missing Data
- Task 4: Missing Data Part 2: Dealing With Missing Data
- Task 5: Format the Data Part 1: Split the Data into Dependent and Independent Variables
- Task 6: Format the Data Part 2: One-Hot Encoding
- Task 7: Format the Data Part 3: Centering and Scaling
- Task 8: Build A Preliminary Support Vector Machine
- Task 9: Optimize Parameters with Cross Validation
- Task 10: Building, Evaluating, Drawing, and Interpreting the Final Support Vector Machine
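
To give a feel for how these tasks fit together, here is a minimal sketch of the workflow in scikit-learn. It is not the lesson's actual code: the data frame is a small synthetic stand-in for the heart disease data, the column names (`age`, `chol`, `cp`, `hd`) and the use of `?` as the missing-value marker are assumptions made for illustration, and the parameter grid is only one plausible choice.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Tasks 1-2: a hypothetical stand-in for the real data set (synthetic values).
rng = np.random.default_rng(42)
n = 200
df = pd.DataFrame({
    "age": rng.integers(30, 80, size=n).astype(float),   # continuous
    "chol": rng.normal(240, 40, size=n),                  # continuous
    "cp": rng.integers(1, 5, size=n).astype(str),         # categorical, stored as text
})
# Hypothetical label: 1 = has heart disease, 0 = does not.
df["hd"] = ((df["age"] > 55) & (df["chol"] > 230)).astype(int)
df.loc[[3, 17], "cp"] = "?"  # assumption: missing values are coded as "?"

# Tasks 3-4: identify rows with missing data, then drop them.
n_missing = int((df["cp"] == "?").sum())
df_clean = df.loc[df["cp"] != "?"].copy()

# Task 5: split into independent variables (X) and the dependent variable (y).
X = df_clean.drop(columns="hd")
y = df_clean["hd"]

# Task 6: one-hot encode the categorical column.
X_encoded = pd.get_dummies(X, columns=["cp"], dtype=float)

# Task 7: split into training and testing sets, then center and scale.
# The scaler is fit on the training data only and reused for the test data.
X_train, X_test, y_train, y_test = train_test_split(
    X_encoded, y, random_state=42, stratify=y
)
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Task 8: a preliminary SVM with the RBF kernel and default parameters.
prelim_svm = SVC(kernel="rbf", random_state=42).fit(X_train_scaled, y_train)

# Task 9: search over C and gamma with 5-fold cross validation.
param_grid = {
    "C": [0.5, 1, 10, 100],
    "gamma": ["scale", 1, 0.1, 0.01, 0.001],
}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="accuracy")
search.fit(X_train_scaled, y_train)

# Task 10: evaluate the tuned model on the held-out test data.
final_svm = search.best_estimator_
test_accuracy = final_svm.score(X_test_scaled, y_test)
print("rows dropped for missing data:", n_missing)
print("best parameters:", search.best_params_)
print("test accuracy:", test_accuracy)
```

The scaler is fit after the train/test split so that information from the test set does not leak into the training step. Dropping rows with missing values is only one way to handle them; imputing the values is another.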