### sklearn.svm.SVC
_class_ sklearn.svm.SVC(_*_, _C=1.0_, _kernel='rbf'_, _degree=3_, _gamma='scale'_, _coef0=0.0_, _shrinking=True_, _probability=False_, _tol=0.001_, _cache_size=200_, _class_weight=None_, _verbose=False_, _max_iter=-1_, _decision_function_shape='ovr'_, _break_ties=False_, _random_state=None_) [[source]](https://github.com/scikit-learn/scikit-learn/blob/7db5b6a98/sklearn/svm/_classes.py#L554)
C-Support Vector Classification.

The implementation is based on libsvm. The fit time scales at least quadratically with the number of samples and may be impractical beyond tens of thousands of samples. For large datasets consider using [`LinearSVC`](https://scikit-learn.org/stable/modules/generated/sklearn.svm.LinearSVC.html#sklearn.svm.LinearSVC "sklearn.svm.LinearSVC") or [`SGDClassifier`](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.SGDClassifier.html#sklearn.linear_model.SGDClassifier "sklearn.linear_model.SGDClassifier") instead, possibly after a [`Nystroem`](https://scikit-learn.org/stable/modules/generated/sklearn.kernel_approximation.Nystroem.html#sklearn.kernel_approximation.Nystroem "sklearn.kernel_approximation.Nystroem") transformer or other [Kernel Approximation](https://scikit-learn.org/stable/modules/kernel_approximation.html#kernel-approximation).
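
As a rough illustration of the large-dataset advice above, the sketch below chains a `Nystroem` kernel approximation with `LinearSVC`. The synthetic data, the sample size, and the values of `gamma`, `n_components`, and `C` are assumptions chosen only to make the example run quickly; they are not recommendations.

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic data standing in for a "large" dataset (size is illustrative only).
X_big, y_big = make_classification(
    n_samples=5000, n_features=20, n_informative=10, random_state=0
)
Xb_train, Xb_test, yb_train, yb_test = train_test_split(
    X_big, y_big, test_size=0.25, random_state=0
)

# Approximate an RBF feature map with 200 components, then fit a linear SVM on it.
approx_svc = make_pipeline(
    StandardScaler(),
    Nystroem(kernel="rbf", gamma=0.05, n_components=200, random_state=0),
    LinearSVC(C=1.0, max_iter=5000),
)
approx_svc.fit(Xb_train, yb_train)

approx_score = approx_svc.score(Xb_test, yb_test)
print(approx_score)
```

The linear model trains on 200 approximate kernel features instead of a full kernel matrix, which is what keeps the cost manageable as the number of samples grows.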

The multiclass support is handled according to a one-vs-one scheme.
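
One way to see the one-vs-one scheme is through the shape of `decision_function`. The sketch below uses a synthetic four-class problem (an assumption made for illustration): with `decision_function_shape='ovo'` there is one column per pair of classes, while the default `'ovr'` reshapes those pairwise scores into one column per class.

```python
from sklearn.datasets import make_classification
from sklearn.svm import SVC

# Synthetic 4-class problem; 4 classes give 4 * 3 / 2 = 6 pairwise classifiers.
X4, y4 = make_classification(
    n_samples=200, n_features=10, n_informative=5,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

ovo = SVC(decision_function_shape="ovo").fit(X4, y4)
ovr = SVC(decision_function_shape="ovr").fit(X4, y4)

ovo_shape = ovo.decision_function(X4).shape
ovr_shape = ovr.decision_function(X4).shape
print(ovo_shape)  # (200, 6): one column per class pair
print(ovr_shape)  # (200, 4): one column per class
```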

For details on the precise mathematical formulation of the provided kernel functions and how `gamma`, `coef0` and `degree` affect each other, see the corresponding section in the narrative documentation: [Kernel functions](https://scikit-learn.org/stable/modules/svm.html#svm-kernels).
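
To make the role of `gamma`, `coef0` and `degree` concrete, the sketch below computes the RBF and polynomial kernels by hand and compares them with `sklearn.metrics.pairwise`. The random matrix `A` and the value `coef0=1.0` are assumptions for illustration; the `gamma` value follows the documented `gamma='scale'` rule, `1 / (n_features * X.var())`.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 3))  # 5 samples, 3 features

# Value that gamma='scale' would use for this data.
gamma = 1.0 / (A.shape[1] * A.var())

# RBF kernel: exp(-gamma * ||x - x'||^2)
sq_dists = ((A[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1)
K_rbf_manual = np.exp(-gamma * sq_dists)

# Polynomial kernel: (gamma * <x, x'> + coef0) ** degree
degree, coef0 = 3, 1.0
K_poly_manual = (gamma * (A @ A.T) + coef0) ** degree

rbf_matches = np.allclose(K_rbf_manual, rbf_kernel(A, A, gamma=gamma))
poly_matches = np.allclose(
    K_poly_manual, polynomial_kernel(A, A, degree=degree, gamma=gamma, coef0=coef0)
)
print(rbf_matches, poly_matches)
```

Only the RBF kernel ignores `coef0` and `degree`; in the polynomial kernel all three parameters interact, which is why they are usually tuned together.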

Read more in the [User Guide](https://scikit-learn.org/stable/modules/svm.html#svm-classification).

In [2]:
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

X = iris.data
y = iris.target

# Hold out half of the samples for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

In [4]:
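from sklearn import svm
from sklearn.metrics import accuracy_score

# Fit an RBF-kernel SVC with hand-picked hyperparameters.
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(X_train, y_train)

# Compare predictions on the held-out half with the true labels.
pred = clf.predict(X_test)
print(y_test)
print(accuracy_score(y_test, pred))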

[1 1 2 2 2 2 0 1 0 2 1 1 0 0 0 0 2 2 2 0 2 1 0 2 2 2 0 1 1 0 1 2 2 1 2 1 2
 1 0 0 0 2 2 0 2 2 2 1 1 1 1 0 0 2 0 0 0 2 0 1 0 1 0 0 1 1 1 0 2 2 2 0 2 2
 0]
0.9733333333333334


In [7]:
from sklearn.model_selection import GridSearchCV

# Candidate values: 3 x 3 = 9 combinations.
params = {
    'gamma': [0.1, 0.05, 0.001],
    'C': [1, 10, 100],
}

grid_cv = GridSearchCV(clf, param_grid=params, scoring='accuracy', cv=5, verbose=1)
grid_cv.fit(X_train, y_train)
print('GridSearchCV best mean accuracy: ', round(grid_cv.best_score_, 4))
print('GridSearchCV best hyperparameters: ', grid_cv.best_params_)

Fitting 5 folds for each of 9 candidates, totalling 45 fits
GridSearchCV best mean accuracy:  0.9867
GridSearchCV best hyperparameters:  {'C': 10, 'gamma': 0.05}
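
The score reported by the grid search is a cross-validated average on the training split. A minimal sketch of the usual follow-up, scoring the refit `best_estimator_` on held-out data, is shown below. It reloads iris so that it runs on its own; the `random_state=0` and `stratify` arguments and the `Xi_*`/`yi_*` names are assumptions for reproducibility, so its numbers need not match the output above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X_iris, y_iris = load_iris(return_X_y=True)
Xi_train, Xi_test, yi_train, yi_test = train_test_split(
    X_iris, y_iris, test_size=0.5, random_state=0, stratify=y_iris
)

search = GridSearchCV(
    SVC(),
    param_grid={"gamma": [0.1, 0.05, 0.001], "C": [1, 10, 100]},
    scoring="accuracy",
    cv=5,
)
search.fit(Xi_train, yi_train)

# With refit=True (the default), best_estimator_ is retrained on the whole training split.
best_model = search.best_estimator_
test_accuracy = best_model.score(Xi_test, yi_test)
print(search.best_params_)
print(test_accuracy)
```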


In [28]:
import pandas as pd

# Raw string avoids Python's invalid-escape warning for the '\s' regex.
feature_name_df = pd.read_csv("./datasets/features.txt", sep=r'\s+',
                              header=None, names=['column_index', 'column_name'])

feature_name = feature_name_df.iloc[:, 1].values.tolist()
print(feature_name[:10])

['tBodyAcc-mean()-X', 'tBodyAcc-mean()-Y', 'tBodyAcc-mean()-Z', 'tBodyAcc-std()-X', 'tBodyAcc-std()-Y', 'tBodyAcc-std()-Z', 'tBodyAcc-mad()-X', 'tBodyAcc-mad()-Y', 'tBodyAcc-mad()-Z', 'tBodyAcc-max()-X']


In [29]:
def get_new_feature_name_df(old_feature_name_df):
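    # Number repeated column names: 0 for the first occurrence, 1 for the second, and so on.
    feature_dup_df = pd.DataFrame(data=old_feature_name_df.groupby('column_name').cumcount(),
                                  columns=['dup_cnt'])
    feature_dup_df = feature_dup_df.reset_index()
    # Join the counts back to the names on the shared 'index' column.
    new_feature_name_df = pd.merge(old_feature_name_df.reset_index(), feature_dup_df, how='outer')
    # Append '_<n>' to the second and later occurrences so that every name is unique.
    new_feature_name_df['column_name'] = new_feature_name_df[['column_name', 'dup_cnt']].apply(
        lambda x: x['column_name'] + '_' + str(x['dup_cnt']) if x['dup_cnt'] > 0 else x['column_name'],
        axis=1)
    new_feature_name_df = new_feature_name_df.drop(['index'], axis=1)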

    return new_feature_name_df
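
The renaming idea in the function above can be checked on a toy table. This is a simplified sketch of the same technique (`groupby(...).cumcount()` plus a suffix), not the function itself; `toy_df` and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical feature table in which the name 'a' appears three times.
toy_df = pd.DataFrame({
    "column_index": [1, 2, 3, 4],
    "column_name": ["a", "b", "a", "a"],
})

# cumcount() numbers each occurrence within its group: 0, 0, 1, 2.
dup_cnt = toy_df.groupby("column_name").cumcount()

# Keep first occurrences unchanged and suffix later ones with '_<count>'.
deduped = toy_df["column_name"].where(
    dup_cnt == 0, toy_df["column_name"] + "_" + dup_cnt.astype(str)
)
print(deduped.tolist())  # ['a', 'b', 'a_1', 'a_2']
```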

In [30]:
import pandas as pd

def get_human_dataset():
    # Feature names contain duplicates, so make them unique before using them as column names.
    feature_name_df = pd.read_csv('./datasets/features.txt', sep=r'\s+',
                                  header=None, names=['column_index', 'column_name'])
    new_feature_name_df = get_new_feature_name_df(feature_name_df)
    feature_name = new_feature_name_df.iloc[:, 1].values.tolist()

    # Feature matrices: whitespace-separated values, one row per sample.
    X_train = pd.read_csv('./datasets/train/X_train.txt', sep=r'\s+', names=feature_name)
    X_test = pd.read_csv('./datasets/test/X_test.txt', sep=r'\s+', names=feature_name)

    # Labels: a single 'action' column.
    y_train = pd.read_csv('./datasets/train/y_train.txt', sep=r'\s+', header=None, names=['action'])
    y_test = pd.read_csv('./datasets/test/y_test.txt', sep=r'\s+', header=None, names=['action'])

    return X_train, X_test, y_train, y_test

X_train, X_test, y_train, y_test = get_human_dataset()


In [39]:
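from sklearn import svm
from sklearn.metrics import accuracy_score

X_train, X_test, y_train, y_test = get_human_dataset()

h_clf = svm.SVC(gamma=0.05, C=10.)
# y_train is a one-column DataFrame; flatten it to 1-D to avoid the column_or_1d warning.
h_clf.fit(X_train, y_train.values.ravel())
# Predict with h_clf, the model fitted on this dataset (clf was fitted on iris).
pred = h_clf.predict(X_test)

print(accuracy_score(y_test, pred))
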


0.9633525619273838
