<h2 align="center">Codebasics ML Course: Support Vector Machine Tutorial</h2>

### Problem Statement: Classify raisins into one of two categories:
1. Kecimen
2. Besni

### Dataset Citation
This dataset is used under the citation guidelines of the original authors. For the full study and a description of the dataset, see the following references:

- **Citation**: Cinar, I., Koklu, M., & Tasdemir, S. (2020). Classification of Raisin Grains Using Machine Vision and Artificial Intelligence Methods. *Gazi Journal of Engineering Sciences, 6*(3), 200-209. DOI: [10.30855/gmbd.2020.03.03](https://doi.org/10.30855/gmbd.2020.03.03)
- **Dataset available at**: [Murat Koklu's Dataset Page](https://www.muratkoklu.com/datasets/)
- **Article download**: [DergiPark](https://dergipark.org.tr/tr/download/article-file/1227592)


In [1]:
import pandas as pd

df = pd.read_excel("Raisin_Dataset.xlsx")
df.sample(5)

| | Area | MajorAxisLength | MinorAxisLength | Eccentricity | ConvexArea | Extent | Perimeter | Class |
|---|---|---|---|---|---|---|---|---|
| 812 | 235047 | 772.956877 | 388.201507 | 0.864735 | 239093 | 0.711673 | 1942.05 | Besni |
| 787 | 105961 | 497.70146 | 275.971726 | 0.832189 | 109992 | 0.697562 | 1347.989 | Besni |
| 281 | 67754 | 349.197138 | 251.679638 | 0.693208 | 69536 | 0.655261 | 1032.358 | Kecimen |
| 402 | 51304 | 350.042582 | 189.765438 | 0.8403 | 52949 | 0.660998 | 897.111 | Kecimen |
| 31 | 41809 | 307.532739 | 175.085568 | 0.822114 | 43838 | 0.697444 | 828.697 | Kecimen |


In [2]:
df.shape

(900, 8)

The dataset has 900 records. Using all seven available features, we will build a classification model with a support vector machine (SVM).

### Train Test Split

In [3]:
X = df[["Area", "MajorAxisLength", "MinorAxisLength", "Eccentricity", "ConvexArea", "Extent", "Perimeter"]]
y = df["Class"]

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=10)
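The split above is purely random, so the class ratio in the test set can drift from the ratio in the full dataset. A minimal sketch of a stratified split follows, using the `stratify` parameter of `train_test_split`. The data here is a synthetic stand-in (the names `X_demo` and `y_demo` and the 450/450 class balance are assumptions for illustration, not values read from the raisin file).

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the raisin data: 900 rows, 7 features, two classes.
# The 450/450 balance is an assumption made for this illustration.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(900, 7))
y_demo = np.array(["Kecimen"] * 450 + ["Besni"] * 450)

# stratify=y_demo keeps the class ratio the same in the train and test parts.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=10, stratify=y_demo
)

# Count how many samples of each class landed in the test part.
labels, counts = np.unique(y_te, return_counts=True)
print(dict(zip(labels, counts)))
```

Whether stratification changes the scores reported in this notebook is not something this sketch can show; it only demonstrates the parameter.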

### Model Training Using SVM: RBF Kernel

In [4]:
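from sklearn.metrics import classification_report
from sklearn.svm import SVC

# Train an SVM with the RBF kernel on the raw (unscaled) features
model = SVC(kernel="rbf")
model.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = model.predict(X_test)
report = classification_report(y_test, y_pred)
print(report)

# Number of iterations the optimizer ran while fitting
model.n_iter_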

              precision    recall  f1-score   support

       Besni       0.86      0.75      0.80        83
     Kecimen       0.81      0.90      0.85        97

    accuracy                           0.83       180
   macro avg       0.83      0.82      0.82       180
weighted avg       0.83      0.83      0.83       180



array([229], dtype=int32)
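In the sample rows shown earlier, `Area` is in the tens or hundreds of thousands while `Extent` stays below 1. The RBF kernel is built on distances between samples, so large-valued columns can dominate it. A common remedy is to standardize the features before the SVM. The sketch below shows that pattern with a `Pipeline` on synthetic stand-in data (`X_demo`, `y_demo`, and the inflated first column are assumptions for illustration); it makes no claim about what score the raisin data would reach.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (not the raisin dataset): 7 features, 2 classes.
X_demo, y_demo = make_classification(
    n_samples=900, n_features=7, n_informative=5, n_redundant=2, random_state=10
)
# Inflate one column to mimic a large-valued feature sitting next to small ones.
X_demo[:, 0] *= 100_000

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=10
)

# The scaler is fitted on the training part only and reused at predict time.
scaled_rbf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scaled_rbf.fit(X_tr, y_tr)

acc = scaled_rbf.score(X_te, y_te)
print("test accuracy:", acc)
```

Putting the scaler inside the pipeline, rather than scaling `X` up front, keeps test-set statistics out of the training step.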

### Model Training Using SVM: Linear Kernel

In [5]:
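from sklearn.metrics import classification_report
from sklearn.svm import SVC

# Train an SVM with the linear kernel on the same unscaled features
model = SVC(kernel="linear")
model.fit(X_train, y_train)

# Evaluate on the held-out test set
y_pred = model.predict(X_test)
report = classification_report(y_test, y_pred)
print(report)

# Number of iterations the optimizer ran while fitting
model.n_iter_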

              precision    recall  f1-score   support

       Besni       0.90      0.87      0.88        83
     Kecimen       0.89      0.92      0.90        97

    accuracy                           0.89       180
   macro avg       0.90      0.89      0.89       180
weighted avg       0.89      0.89      0.89       180



array([60953539], dtype=int32)
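Both reports come from a single 80/20 split, so part of the gap between 0.83 (RBF) and 0.89 (linear) accuracy may be specific to that split. The linear kernel also needed about 61 million iterations on the unscaled features. The sketch below compares the two kernels with 5-fold cross-validation on standardized features. It runs on synthetic stand-in data (`X_demo`, `y_demo` are assumptions for illustration), so the numbers it prints say nothing about the raisin dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (not the raisin dataset): 7 features, 2 classes.
X_demo, y_demo = make_classification(
    n_samples=900, n_features=7, n_informative=5, n_redundant=2, random_state=10
)

cv_scores = {}
for kernel in ("rbf", "linear"):
    # Scaling happens inside each fold because it is part of the pipeline.
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    # cv=5 gives stratified 5-fold for classifiers: one accuracy per fold.
    cv_scores[kernel] = cross_val_score(clf, X_demo, y_demo, cv=5)
    print(kernel, "mean:", cv_scores[kernel].mean(), "std:", cv_scores[kernel].std())
```

Looking at the per-fold spread alongside the mean gives a rough sense of whether a difference between kernels is larger than the split-to-split noise.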

### Model Training Using a Voting Classifier

In [9]:
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Base estimators; SVC needs probability=True so it can take part in soft voting
log_model = LogisticRegression()
svc_model = SVC(kernel="rbf", probability=True)
dt_model = DecisionTreeClassifier()

# Soft voting combines the predicted class probabilities of the three models
vc = VotingClassifier(
    estimators=[("log", log_model), ("svc", svc_model), ("dt", dt_model)],
    voting="soft",
)

vc.fit(X_train, y_train)

STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
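The warning above comes from `LogisticRegression`, whose default `lbfgs` solver stopped at its default limit of 100 iterations. The message itself suggests two fixes: scale the data or raise `max_iter`. The sketch below applies both inside the voting ensemble by wrapping the scale-sensitive members in pipelines. It uses synthetic stand-in data (`X_demo`, `y_demo`, the inflated column, and `max_iter=1000` are assumptions for illustration), so it shows the pattern rather than the result on the raisin data.

```python
import warnings

from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.exceptions import ConvergenceWarning
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (not the raisin dataset) with one large-valued column.
X_demo, y_demo = make_classification(
    n_samples=900, n_features=7, n_informative=5, n_redundant=2, random_state=10
)
X_demo[:, 0] *= 100_000

# Scale-sensitive members get their own scaler; the tree does not need one.
log_scaled = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
svc_scaled = make_pipeline(StandardScaler(), SVC(kernel="rbf", probability=True))
dt_model = DecisionTreeClassifier(random_state=10)

vc_scaled = VotingClassifier(
    estimators=[("log", log_scaled), ("svc", svc_scaled), ("dt", dt_model)],
    voting="soft",
)

# Record warnings during fit so we can check for convergence problems.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    vc_scaled.fit(X_demo, y_demo)

n_convergence = sum(issubclass(w.category, ConvergenceWarning) for w in caught)
print("ConvergenceWarnings raised:", n_convergence)
```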


In [10]:
y_pred = vc.predict(X_test)

print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

       Besni       0.89      0.81      0.85        83
     Kecimen       0.85      0.92      0.88        97

    accuracy                           0.87       180
   macro avg       0.87      0.86      0.86       180
weighted avg       0.87      0.87      0.87       180
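With `voting='soft'` and no weights, the ensemble averages the class probabilities returned by each estimator's `predict_proba` and predicts the class with the highest average. The NumPy sketch below walks through that arithmetic by hand. The probability values are made up for illustration and are not outputs of the models trained above.

```python
import numpy as np

classes = np.array(["Besni", "Kecimen"])

# Hypothetical predict_proba outputs for 3 samples (columns follow `classes`).
proba_log = np.array([[0.60, 0.40], [0.30, 0.70], [0.55, 0.45]])
proba_svc = np.array([[0.70, 0.30], [0.40, 0.60], [0.35, 0.65]])
proba_dt = np.array([[1.00, 0.00], [0.00, 1.00], [0.00, 1.00]])

# Soft voting with equal weights: average the probabilities per class,
mean_proba = np.mean([proba_log, proba_svc, proba_dt], axis=0)
# then pick the class with the highest averaged probability.
soft_vote = classes[np.argmax(mean_proba, axis=1)]

print(mean_proba)
print(soft_vote)
```

In the third sample the logistic model alone leans towards Besni, but the averaged probabilities favour Kecimen. A fully grown decision tree often outputs probabilities of exactly 0 or 1, so it can pull the average strongly in one direction.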

