#### `Q1`: What is the relationship between polynomial functions and kernel functions in machine learning algorithms?



* Polynomial functions and kernel functions are related, but they play different roles in machine learning algorithms.
* A polynomial function can act as an explicit feature map that transforms the input data into a higher-dimensional feature space, for example by adding powers and products of the original features.
* A kernel function computes the inner product of two data points in a feature space, which serves as a similarity measure, without constructing that feature space explicitly (the "kernel trick").
* The two meet in the polynomial kernel used in support vector machines (SVMs), `K(x, z) = (gamma * <x, z> + coef0)^degree`, which equals the inner product under a polynomial feature map of that degree. However, not all kernel functions are polynomial: the radial basis function (RBF) kernel, for example, is commonly used in SVMs and other machine learning algorithms.
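
The link between an explicit polynomial feature map and the polynomial kernel can be checked numerically. The sketch below assumes 2-D inputs and a degree-2 kernel with `gamma=1` and `coef0=1`; the helper `phi` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel


def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative helper)."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, x1 ** 2, s * x1 * x2, x2 ** 2])


x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# Inner product computed in the explicit 6-dimensional feature space.
explicit = phi(x) @ phi(z)

# Same quantity from the kernel: (gamma * <x, z> + coef0) ** degree,
# evaluated without ever building the feature vectors.
kernel = polynomial_kernel(
    x.reshape(1, -1), z.reshape(1, -1), degree=2, gamma=1.0, coef0=1.0
)[0, 0]

print(explicit, kernel)
```

Both numbers agree, which is why an SVM can work in the polynomial feature space while only ever evaluating the kernel on the original inputs.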

#### `Q2`: How can we implement an SVM with a polynomial kernel in Python using Scikit-learn?


In [None]:
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

In [None]:
iris = datasets.load_iris()
X = iris.data
y = iris.target

In [None]:
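# Hold out 20% of the data; a fixed seed makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Degree-3 polynomial kernel; a large C penalizes misclassified training samples heavily
svm_poly = SVC(kernel='poly', degree=3, C=1000)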

svm_poly.fit(X_train, y_train)

In [None]:
y_pred = svm_poly.predict(X_test)

print("Accuracy:", accuracy_score(y_test, y_pred))

#### `Q3`: How does increasing the value of epsilon affect the number of support vectors in SVR?


* In Support Vector Regression (SVR), epsilon is a hyperparameter that sets the width of the epsilon-insensitive tube around the regression function. Training samples whose prediction error is at most epsilon incur no loss; only samples that lie on or outside the tube are penalized.

* Increasing the value of epsilon in SVR typically results in a decrease in the number of support vectors. A wider tube contains more of the training samples, and samples strictly inside the tube do not become support vectors, so fewer samples are needed to define the regression function. The model becomes sparser but usually fits the training data less closely.
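
The effect of epsilon on the support-vector count can be observed with a small experiment. This is a minimal sketch on synthetic data (a noisy sine curve, which is an assumption and not part of the original exercise); the exact counts depend on the data and on the other hyperparameters.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem: noisy sine curve (illustrative data only).
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

sv_counts = {}
for eps in [0.01, 0.1, 0.5, 1.0]:
    model = SVR(kernel="rbf", C=1.0, epsilon=eps).fit(X, y)
    # support_ holds the indices of the training samples used as support vectors.
    sv_counts[eps] = len(model.support_)
    print(f"epsilon={eps}: {sv_counts[eps]} support vectors")
```

With a very small epsilon almost every sample falls outside the tube and becomes a support vector; as the tube widens, the count drops.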

#### `Q4`: How does the choice of kernel function, C parameter, epsilon parameter, and gamma parameter affect the performance of Support Vector Regression (SVR)? Can you explain how each parameter works and provide examples of when you might want to increase or decrease its value?



* The choice of kernel function, C parameter, epsilon parameter, and gamma parameter can all significantly affect the performance of Support Vector Regression (SVR).

  * **Kernel Function:** The kernel function implicitly maps the input data into a high-dimensional feature space in which a linear function can fit the data. The choice of kernel function depends on the problem at hand and the characteristics of the data. For example, the linear kernel is suitable when the target depends roughly linearly on the features, while the RBF kernel is more suitable for nonlinear relationships.

  * **C Parameter:** The C parameter controls the trade-off between the simplicity (flatness) of the model and the degree to which errors larger than epsilon are tolerated. A smaller value of C means stronger regularization: the model is simpler and tolerates more errors, but may underfit. A larger value of C penalizes errors more heavily: the model fits the training data more closely, but may overfit. Increase C when the model underfits; decrease it when the model overfits or the data are noisy. Increasing C often reduces the number of support vectors, because fewer training samples end up outside the epsilon tube.

  * **Epsilon Parameter:** The epsilon parameter sets the width of the tube within which prediction errors are ignored. A smaller value of epsilon narrows the tube, so the model tries to fit the training data more closely and uses more support vectors. A larger value of epsilon widens the tube, so the model is less sensitive to small noise in the targets and uses fewer support vectors, at the cost of a coarser fit.

  * **Gamma Parameter:** The gamma parameter applies to the RBF, polynomial, and sigmoid kernels. For the RBF kernel it controls how far the influence of a single training sample reaches. A smaller value of gamma results in a smoother, more nearly linear fitted function, while a larger value of gamma results in a more complex one. With a high value of gamma, each support vector influences only the points close to it, which can lead to overfitting; with a low value of gamma, its influence extends to points farther away, which can lead to underfitting.
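
The parameter effects described above can be explored with a short experiment. The sketch below uses synthetic data (a noisy sine curve, chosen only for illustration) and a hypothetical helper `fit_summary`; it reports the training R² and the number of support vectors. Training scores alone cannot reveal overfitting, so a real comparison would also score on held-out data.

```python
import numpy as np
from sklearn.svm import SVR

# Synthetic 1-D regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(0, 5, size=(80, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=80)


def fit_summary(C, gamma, epsilon=0.1):
    """Fit an RBF SVR and return (training R^2, number of support vectors)."""
    model = SVR(kernel="rbf", C=C, gamma=gamma, epsilon=epsilon).fit(X, y)
    return model.score(X, y), len(model.support_)


# Vary C with gamma fixed, then vary gamma with C fixed.
c_results = {C: fit_summary(C, gamma=1.0) for C in [0.01, 1, 100]}
g_results = {g: fit_summary(100, gamma=g) for g in [0.01, 1, 100]}

for C, (r2, n_sv) in c_results.items():
    print(f"C={C}, gamma=1.0: train R^2={r2:.3f}, support vectors={n_sv}")
for g, (r2, n_sv) in g_results.items():
    print(f"C=100, gamma={g}: train R^2={r2:.3f}, support vectors={n_sv}")
```

A very small C keeps the fitted function nearly flat and underfits the sine curve, while a larger C lets it follow the data.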

#### `Q5`: Build, tune, and save an SVC classifier using the following steps:
- Import the necessary libraries and load the dataset.
- Split the dataset into training and testing sets.
- Preprocess the data using any technique of your choice (e.g. scaling, normalization).
- Create an instance of the SVC classifier and train it on the training data.
- Use the trained classifier to predict the labels of the testing data.
- Evaluate the performance of the classifier using any metric of your choice (e.g. accuracy, precision, recall, F1-score).
- Tune the hyperparameters of the SVC classifier using GridSearchCV or RandomizedSearchCV to improve its performance.
- Train the tuned classifier on the entire dataset.
- Save the trained classifier to a file for future use.

In [None]:
import pandas as pd
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.metrics import accuracy_score

In [None]:
data = pd.read_csv('/content/collegePlace.csv')
data.head()

In [None]:
data=pd.get_dummies(data,columns=["Gender","Stream"])

In [None]:
data["PlacedOrNot"].value_counts()

In [None]:
X, y = data.drop(["PlacedOrNot"], axis=1), data["PlacedOrNot"]
x_train, x_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=47, stratify=y)

In [None]:
scl=StandardScaler()
x_train=scl.fit_transform(x_train)
x_test=scl.transform(x_test)

In [None]:
svc = SVC()
svc.fit(x_train, y_train)

In [None]:
y_pred = svc.predict(x_test)

In [None]:
print(f"Accuracy: {accuracy_score(y_test, y_pred)}")

In [None]:
param_grid = {'C': [0.1, 1, 10, 100],
              'gamma': [0.1, 1, 10], 
              'kernel': ['linear', 'rbf']}

In [None]:
grid = GridSearchCV(SVC(), param_grid)
grid.fit(x_train, y_train)

In [None]:
grid.best_params_

In [None]:
y_pred_n = grid.predict(x_test)
print(f"Accuracy after hyperparameter tuning: {accuracy_score(y_test, y_pred_n)}")

In [None]:
import pickle
# The context manager closes the file once the fitted grid search is written
with open('model.pkl', 'wb') as f:
    pickle.dump(grid, f)
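
The step list also asks for the tuned classifier to be trained on the entire dataset before saving. One possible way to do this, sketched below, is to wrap the scaler and the classifier in a scikit-learn `Pipeline` so that both are tuned, refit, and saved together. This is an illustrative variant, not the notebook's own method: the data come from `make_classification` as a stand-in for the real CSV, and the temporary file path and variable names are assumptions.

```python
import os
import pickle
import tempfile

from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the real dataset (assumption for this sketch).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=47, stratify=y
)

# Scaling happens inside the pipeline, so each CV fold is scaled on its own training part.
pipe = make_pipeline(StandardScaler(), SVC())
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.1, 1],
    "svc__kernel": ["linear", "rbf"],
}
search = GridSearchCV(pipe, param_grid, cv=5).fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)
print("Best parameters:", search.best_params_)
print("Held-out accuracy:", test_accuracy)

# Refit a fresh copy of the best configuration on the entire dataset.
final_model = clone(search.best_estimator_).fit(X, y)

# Save and reload the fitted pipeline (scaler and classifier together).
path = os.path.join(tempfile.mkdtemp(), "model.pkl")
with open(path, "wb") as f:
    pickle.dump(final_model, f)
with open(path, "rb") as f:
    restored = pickle.load(f)
```

Because the scaler is stored inside the pipeline, the reloaded model can be applied directly to unscaled inputs.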