In [None]:
from sklearn.model_selection import train_test_split
from sklearn import datasets

iris = datasets.load_iris()

X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.3)

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

cancer = load_breast_cancer()
X = cancer.data
y = cancer.target

# Add some noise to the data to make it harder to classify
random_state = np.random.RandomState(10)
n_samples, n_features = X.shape
X = X + (random_state.randn(n_samples, n_features) * 10)

# Train-Test-Split the data
X_train, X_test, y_train, y_test = train_test_split( X, y, test_size=0.5, random_state=1)

#Transform data
scaler = StandardScaler()
scaler.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)


Brain Station\Reference

## Logistic Regression

- **Intuition behind test**: Logistic regression is a statistical model that uses a logistic function to model a binary dependent variable. It's a way to predict the probability of a certain event happening, which makes it a very good fit for binary classification problems.

- **Use case for test**: Logistic regression is used when the dependent variable is binary. It's widely used for binary classification problems like spam detection, churn prediction, or health diagnosis.

- **Intuition for using it for classification**: Logistic regression outputs probabilities. If the probability is greater than 0.5, it assigns the instance to the positive class, otherwise it assigns it to the negative class.

- **Intuition for using it for regression**: Logistic regression is not typically used for regression tasks as it's designed for binary classification tasks.

- **How to code it**:



In [7]:
from sklearn.linear_model import LogisticRegression

model = LogisticRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)



- **The most important hyperparameters to iterate through**: `C` (Inverse of regularization strength), `penalty` (Specifies the norm used in the penalization), `solver` (Algorithm to use in the optimization problem)

- **Code for iterating through one example of a hyperparameter**:


In [8]:
from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000] }
clf = GridSearchCV(LogisticRegression(penalty='l2'), param_grid)
GridSearchCV(cv=None,
            estimator=LogisticRegression(C=1.0, intercept_scaling=1,   
              dual=False, fit_intercept=True, penalty='l2', tol=0.0001),
            param_grid={'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]})



- **Assumptions of the algorithm**: Logistic regression assumes that there is a linear relationship between the logit of the response and the predictors, requires the dependent variable to be binary and assumes no error in measurement.



## K-Nearest Neighbors

- **Intuition behind test**: K-Nearest Neighbors (KNN) is a type of instance-based learning where the function is only approximated locally and all computation is deferred until function evaluation.

- **Use case for test**: KNN can be used for both classification and regression predictive problems. However, it is more widely used in classification problems in the industry.

- **Intuition for using it for classification**: KNN works by finding the distances between a query and all the examples in the data, selecting the specified number examples (K) closest to the query, then votes for the most frequent label (in the case of classification) or averages the labels (in the case of regression).

- **Intuition for using it for regression**: For regression, KNN takes the average of the numerical target of the K nearest neighbors.

- **How to code it**:


In [11]:
from sklearn.neighbors import KNeighborsClassifier

model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
predictions = model.predict(X_test)


array([2, 1, 2, 2, 0, 1, 1, 2, 1, 2, 2, 1, 1, 2, 2, 1, 2, 0, 2, 1, 1, 1,
       2, 2, 1, 1, 0, 1, 0, 0, 0, 1, 2, 0, 0, 2, 1, 2, 0, 2, 0, 0, 2, 0,
       0])


- **The most important hyperparameters to iterate through**: `n_neighbors` (Number of neighbors to use), `weights` (Weight function used in prediction), `p` (Power parameter for the Minkowski metric)

- **Code for iterating through one example of a hyperparameter**:



In [10]:
from sklearn.model_selection import GridSearchCV

param_grid =

{'n_neighbors': [3, 5, 7, 9, 11]}
clf = GridSearchCV(KNeighborsClassifier(), param_grid)
clf.fit(X_train, y_train)


SyntaxError: invalid syntax (3830891156.py, line 3)


- **Assumptions of the algorithm**: KNN assumes that similar things exist in close proximity. In other words, similar things are near to each other.



## Decision Trees

- **Intuition behind test**: Decision Trees is a type of algorithm that makes decisions based on conditions. It's like playing a game of 20 questions to predict the class or value of the target variable.

- **Use case for test**: Decision Trees are used for both classification and regression tasks. They are widely used in customer segmentation, detection of fraudulent transactions, or prediction of diseases.

- **Intuition for using it for classification**: The tree is constructed in a way that the most important features appear at the top of the tree. It splits the data into subsets based on the feature that provides the most information gain. This process is repeated recursively until it makes a prediction for every subset.

- **Intuition for using it for regression**: Decision tree regression observes features of an object and trains a model in the structure of a tree to predict data in the future to produce meaningful continuous output.

- **How to code it**:



In [None]:
from sklearn.tree import DecisionTreeClassifier

model = DecisionTreeClassifier()
model.fit(X_train, y_train)
predictions = model.predict(X_test)



- **The most important hyperparameters to iterate through**: `max_depth` (The maximum depth of the tree), `min_samples_split` (The minimum number of samples required to split an internal node), `min_samples_leaf` (The minimum number of samples required to be at a leaf node)

- **Code for iterating through one example of a hyperparameter**:

```python


In [None]:

from sklearn.model_selection import GridSearchCV

param_grid = {'max_depth': [3, 5, 7, 9, 11]}
clf = GridSearchCV(DecisionTreeClassifier(), param_grid)
clf.fit(X_train, y_train)



- **Assumptions of the algorithm**: Decision tree algorithm assumes that the training data is noise-free, it assumes that missing values are at random, and the most crucial assumption is that the training set is a sample from the actual population.


## Linear Regression

- **Intuition behind test**: Linear regression is a statistical model that examines the linear relationship between two (Simple Linear Regression ) or more (Multiple Linear Regression) variables — a dependent variable and independent variable(s).

- **Use case for test**: Linear regression is used when we want to predict the value of a variable based on the value of another variable. The variable we want to predict is called the dependent variable (or sometimes, the outcome variable).

- **Intuition for using it for classification**: Linear regression is not typically used for classification tasks. It is more suited for estimating values.

- **Intuition for using it for regression**: Linear regression creates a model that predicts the dependent variable as a linear function of the independent variables. It finds the line of best fit that minimizes the sum of the residuals.

- **How to code it**:


In [None]:

from sklearn.linear_model import LinearRegression

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)



- **The most important hyperparameters to iterate through**: `fit_intercept` (Whether to calculate the intercept for this model), `normalize` (This parameter is ignored when fit_intercept is set to False)

- **Code for iterating through one example of a hyperparameter**: Linear regression does not typically require hyperparameter tuning.

- **Assumptions of the algorithm**: Linear regression assumes that there is a linear relationship between the dependent and independent variables, the residuals are normally distributed and have constant variance, and there is no multicollinearity among independent variables.



## Support Vector Machines (SVMs)

- **Intuition behind test**: SVM is a supervised machine learning algorithm which can be used for both classification or regression challenges. However, it is mostly used in classification problems. In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features you have) with the value of each feature being the value of a particular coordinate.

- **Use case for test**: SVMs are helpful in text and hypertext categorization, classification of images, and in the biological and other sciences.

- **Intuition for using it for classification**: SVMs are based on the idea of finding a hyperplane that best separates the features into different classes.

- **Intuition for using it for regression**: In the case of regression, SVMs find the hyperplane that deviates from the most of the data points by no more than a certain amount, and for the rest of the data points, tries to minimize the deviation.

- **How to code it**:


In [None]:

from sklearn import svm

model = svm.SVC()
model.fit(X_train, y_train)
predictions = model.predict(X_test)



- **The most important hyperparameters to iterate through**: `C` (Penalty parameter C of the error term), `kernel` (Specifies the kernel type to be used in the algorithm), `gamma` (Kernel coefficient for 'rbf', 'poly' and 'sigmoid')

- **Code for iterating through one example of a hyperparameter**:


In [None]:

from sklearn.model_selection import GridSearchCV

param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [1, 0.1, 0.01, 0.001], 'kernel': ['rbf']} 
clf = GridSearchCV(svm.SVC(), param_grid)
clf.fit(X_train, y_train)



- **Assumptions of the algorithm**: SVMs assume that the data it works with is in a specific format. Namely, that all of the input features



## Principal Component Analysis (PCA)

- **Intuition behind test**: PCA is a dimensionality reduction technique that is often used to reduce the dimensionality of large data sets, by transforming a large set of variables into a smaller one that still contains most of the information in the large set.

- **Use case for test**: PCA is used in exploratory data analysis and for making predictive models. It is commonly used for dimensionality reduction by projecting each data point onto only the first few principal components to obtain lower-dimensional data while preserving as much of the data's variation as possible.

- **Intuition for using it for classification**: PCA itself is an unsupervised method and doesn't use any class label information. However, the transformed features (principal components) from PCA can be used for classification tasks.

- **Intuition for using it for regression**: Similarly, PCA doesn't directly apply to regression tasks, but the principal components can be used as predictors in a regression model.

- **How to code it**:


In [None]:

from sklearn.decomposition import PCA

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)



- **The most important hyperparameters to iterate through**: `n_components` (Number of components to keep)

- **Code for iterating through one example of a hyperparameter**: PCA typically doesn't require hyperparameter tuning.

- **Assumptions of the algorithm**: PCA assumes that the principal components are a linear combination of the original features, the components are orthogonal, and the most important component is the one that explains the most variance.



## Naive Bayes

- **Intuition behind test**: Naive Bayes classifiers are a family of simple "probabilistic classifiers" based on applying Bayes' theorem with strong (naive) independence assumptions between the features.

- **Use case for test**: Naive Bayes classifiers are highly scalable, requiring a number of parameters linear in the number of variables (features/predictors) in a learning problem. They are often used for text classification, spam filtering, and recommendation systems.

- **Intuition for using it for classification**: Naive Bayes is a probabilistic classifier, meaning it predicts on the basis of the probability of an object. It uses Bayes' Theorem, which is based on the concept of conditional probability.

- **Intuition for using it for regression**: Naive Bayes is not typically used for regression tasks as it's a probabilistic classifier and works based on the assumption of independence among predictors.

- **How to code it**:


In [None]:

from sklearn.naive_bayes import GaussianNB

model = GaussianNB()
model.fit(X_train, y_train)
predictions = model.predict(X_test)



- **The most important hyperparameters to iterate through**: Naive Bayes typically doesn't have hyperparameters that need tuning, but some implementations like `BernoulliNB` and `MultinomialNB` have a `alpha` parameter which is a smoothing parameter.

- **Code for iterating through one example of a hyperparameter**: Naive Bayes typically doesn't require hyperparameter tuning.

- **Assumptions of the algorithm**: Naive Bayes assumes that all features are independent from each other and each one contributes independently to the probability of the outcome. This is a 'naive' assumption because it's rarely true in real-world scenarios.