## Logistic Regression

- **Intuition behind test**: Logistic regression models a binary dependent variable by passing a linear combination of the predictors through the logistic (sigmoid) function, which maps any real number to a value between 0 and 1. That value is read as the probability that the event occurs, which makes the model a natural fit for binary classification problems.

- **Use case for test**: Use logistic regression when the dependent variable is binary. Typical applications include spam detection, churn prediction, and medical diagnosis.

- **Intuition for using it for classification**: Logistic regression outputs a probability for each instance. By default, an instance is assigned to the positive class if that probability is greater than 0.5 and to the negative class otherwise. The threshold can be moved when false positives and false negatives carry different costs.

- **Intuition for using it for regression**: Despite its name, logistic regression is not typically used for regression tasks: it predicts class probabilities rather than a continuous target.

- **How to code it**:



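```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Iris has three classes; scikit-learn handles the multiclass case automatically.
iris = datasets.load_iris()

# Hold out 30% of the samples for testing; the fixed seed makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.3, random_state=42
)

# max_iter is raised above the default of 100 to give the solver room to converge.
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```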
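
The probability-and-threshold intuition described earlier can be sketched on a two-class problem. This is a minimal illustration, not part of the original notebook: the synthetic dataset from `make_classification`, the `sigmoid` helper, and the name `binary_model` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data (an assumption for illustration only).
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

binary_model = LogisticRegression()
binary_model.fit(X, y)


def sigmoid(z):
    """Logistic function: maps any real number into the range 0 to 1."""
    return 1.0 / (1.0 + np.exp(-z))


# decision_function returns the linear score w.x + b for each sample.
scores = binary_model.decision_function(X)

# Passing the scores through the sigmoid gives the positive-class probability.
manual_proba = sigmoid(scores)
sklearn_proba = binary_model.predict_proba(X)[:, 1]

# Thresholding the probability at 0.5 gives the predicted class.
manual_labels = (manual_proba > 0.5).astype(int)

print(np.allclose(manual_proba, sklearn_proba))
print(np.array_equal(manual_labels, binary_model.predict(X)))
```

Both checks compare the hand-computed values with what scikit-learn reports, so the two printed results indicate whether the manual sigmoid and the 0.5 cutoff reproduce `predict_proba` and `predict` on this data.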



- **The most important hyperparameters to iterate through**: `C` (Inverse of regularization strength), `penalty` (Specifies the norm used in the penalization), `solver` (Algorithm to use in the optimization problem)

- **Code for iterating through one example of a hyperparameter**:


```python
from sklearn.model_selection import GridSearchCV

# Candidate values for C; smaller values mean stronger regularization.
param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]}

# Cross-validated search over C with an L2 penalty.
# max_iter is raised so that weakly regularized fits have room to converge.
clf = GridSearchCV(LogisticRegression(penalty='l2', max_iter=1000), param_grid)
clf.fit(X_train, y_train)
```
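
To see what `C` controls, the sketch below fits the same model at a few values of `C` and prints the size of the learned coefficients. It is an illustration under assumptions: the synthetic dataset and the names `X_demo`, `y_demo`, and `coef_norms` are not from the original.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data (an assumption for illustration only).
X_demo, y_demo = make_classification(n_samples=300, n_features=5, random_state=0)

coef_norms = {}
for C in [0.01, 1, 100]:
    # Smaller C means a stronger L2 penalty on the coefficients.
    lr = LogisticRegression(C=C, max_iter=1000)
    lr.fit(X_demo, y_demo)
    coef_norms[C] = np.linalg.norm(lr.coef_)
    print(f"C={C}: coefficient norm = {coef_norms[C]:.3f}")
```

Because `C` is the inverse of the regularization strength, the coefficient norm should be smallest for the smallest `C`, which is why a grid over several orders of magnitude is a reasonable starting point.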



- **Assumptions of the algorithm**: Logistic regression assumes a linear relationship between the predictors and the logit (log-odds) of the response, a binary dependent variable, independent observations, and predictors that are measured without error and are not highly collinear.



## K-Nearest Neighbors

- **Intuition behind test**: K-Nearest Neighbors (KNN) is a type of instance-based (lazy) learning: the model stores the training data, approximates the target function only locally, and defers all computation until a prediction is requested.

- **Use case for test**: KNN can be used for both classification and regression problems, though in practice it is used more often for classification.

- **Intuition for using it for classification**: KNN computes the distance between a query and every example in the training data, selects the K examples closest to the query, and predicts the most frequent label among them.

- **Intuition for using it for regression**: For regression, KNN takes the average of the numerical target of the K nearest neighbors.

- **How to code it**:


```python
from sklearn.neighbors import KNeighborsClassifier

# Reuses X_train, X_test, and y_train from the logistic regression example.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
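
The distance, select, and vote (or average) steps described earlier can be written out by hand. This is a minimal NumPy sketch rather than how scikit-learn implements it: the function names `k_nearest_indices`, `knn_classify`, and `knn_regress` and the toy data are hypothetical, and Euclidean distance is assumed.

```python
from collections import Counter

import numpy as np


def k_nearest_indices(X, query, k):
    """Return the indices of the k training points closest to the query."""
    distances = np.linalg.norm(X - query, axis=1)  # Euclidean distance
    return np.argsort(distances)[:k]


def knn_classify(X, y, query, k=3):
    """Predict the most frequent label among the k nearest neighbors."""
    nearest = k_nearest_indices(X, query, k)
    votes = Counter(y[nearest].tolist())
    return votes.most_common(1)[0][0]


def knn_regress(X, y, query, k=3):
    """Predict the mean target of the k nearest neighbors."""
    nearest = k_nearest_indices(X, query, k)
    return float(np.mean(y[nearest]))


# Toy data (hypothetical): two well-separated clusters.
points = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
labels = np.array([0, 0, 0, 1, 1, 1])
targets = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])

# Query near the first cluster, classified by majority vote.
print(knn_classify(points, labels, np.array([0.2, 0.1])))
# Query near the second cluster, predicted by averaging neighbor targets.
print(knn_regress(points, targets, np.array([5.5, 5.5])))
```

Note that nothing is learned ahead of time: every prediction scans the full training set, which is the deferred computation that makes KNN a lazy learner.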



- **The most important hyperparameters to iterate through**: `n_neighbors` (Number of neighbors to use), `weights` (Weight function used in prediction), `p` (Power parameter for the Minkowski metric)

- **Code for iterating through one example of a hyperparameter**:



```python
from sklearn.model_selection import GridSearchCV

# Candidate neighborhood sizes to evaluate with cross-validation.
param_grid = {'n_neighbors': [3, 5, 7, 9, 11]}
clf = GridSearchCV(KNeighborsClassifier(), param_grid)
clf.fit(X_train, y_train)
```
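
Once a grid search has been fitted, its results can be read back from the search object. The sketch below is a self-contained illustration: it reloads iris, uses the hypothetical name `search`, and sets `cv=5` explicitly, none of which comes from the original cell.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

param_grid = {'n_neighbors': [3, 5, 7, 9, 11]}
search = GridSearchCV(KNeighborsClassifier(), param_grid, cv=5)
search.fit(X, y)

# Best value of K and its mean cross-validated accuracy.
print(search.best_params_, round(search.best_score_, 3))

# Mean cross-validated accuracy for every candidate K.
results = search.cv_results_
for params, score in zip(results['params'], results['mean_test_score']):
    print(f"n_neighbors={params['n_neighbors']}: {score:.3f}")
```

The same attributes (`best_params_`, `best_score_`, `cv_results_`) are available on any fitted `GridSearchCV`, so the pattern carries over to the other grid searches in this document.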



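- **Assumptions of the algorithm**: KNN assumes that similar instances lie close to each other in feature space. Because predictions depend entirely on distances, the distance metric must reflect meaningful similarity and the features should be on comparable scales.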



## Decision Trees

- **Intuition behind test**: A decision tree makes predictions by applying a sequence of conditions to the features. It's like playing a game of 20 questions to predict the class or value of the target variable.

- **Use case for test**: Decision trees are used for both classification and regression tasks. Common applications include customer segmentation, fraud detection, and disease prediction.

- **Intuition for using it for classification**: The tree is built so that the most informative features appear near the top. At each node, the data is split on the feature and threshold that provide the most information gain, and the process repeats recursively on each subset until a stopping condition is met. Each leaf then predicts the majority class of the training samples that reach it.

- **Intuition for using it for regression**: A regression tree is built the same way, but splits are chosen to reduce the variance of the target within each subset. Each leaf predicts a continuous value, typically the mean target of the training samples that reach it.

- **How to code it**:



```python
from sklearn.tree import DecisionTreeClassifier

# Reuses X_train, X_test, and y_train from the logistic regression example.
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
predictions = model.predict(X_test)
```
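
The "20 questions" structure of a fitted tree can be printed directly. The sketch below uses scikit-learn's `export_text` helper; the name `tree_model` and the choice of `max_depth=2` are assumptions made only to keep the output short.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A shallow tree keeps the printed rules short (max_depth=2 is an arbitrary choice).
tree_model = DecisionTreeClassifier(max_depth=2, random_state=0)
tree_model.fit(iris.data, iris.target)

# export_text renders the learned splits as nested, indented rules.
rules = export_text(tree_model, feature_names=iris.feature_names)
print(rules)
```

Each indented line is one question about a feature, and each `class:` line is the prediction made at a leaf, so reading from the top down follows the path a sample takes through the tree.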



- **The most important hyperparameters to iterate through**: `max_depth` (The maximum depth of the tree), `min_samples_split` (The minimum number of samples required to split an internal node), `min_samples_leaf` (The minimum number of samples required to be at a leaf node)

- **Code for iterating through one example of a hyperparameter**:

```python
from sklearn.model_selection import GridSearchCV

param_grid = {'max_depth': [3, 5, 7, 9, 11]}
clf = GridSearchCV(DecisionTreeClassifier(), param_grid)
clf.fit(X_train, y_train)
```



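- **Assumptions of the algorithm**: Decision trees make no assumptions about the distribution of the features and need no feature scaling. They do assume that the training set is a representative sample of the actual population. Because an unconstrained tree can fit noise in the training data, they also work best when the data is largely noise-free and missing values occur at random.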
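
To make the regression intuition for decision trees concrete, the sketch below fits a one-split `DecisionTreeRegressor` on a tiny dataset. The data and the names `X_toy`, `y_toy`, and `reg` are hypothetical, chosen so that the leaf averages are easy to verify by hand.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Toy data (hypothetical): the target jumps from about 1 to about 5.
X_toy = np.array([[1.0], [2.0], [3.0], [10.0], [11.0], [12.0]])
y_toy = np.array([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])

# max_depth=1 allows a single split, so the tree has exactly two leaves.
reg = DecisionTreeRegressor(max_depth=1, random_state=0)
reg.fit(X_toy, y_toy)

# Each prediction is the mean target of the training samples in the matching leaf.
print(reg.predict(np.array([[2.0], [11.0]])))
```

The three low targets average to 1.0 and the three high targets average to 5.0, so a query on either side of the split receives one of those two values. Deeper trees produce more leaves and therefore a finer piecewise-constant approximation.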