# Logistic Regression
Logistic regression is a classification model: it predicts which category an observation belongs to (here, pass or fail).

# Setup dependencies
I will use NumPy and pandas to generate and manage the data, and scikit-learn (`sklearn`) for the machine learning.
<details>
    <summary>pip install...</summary>

```shell
# Install a Python package
pip install package-name
# or install a specific version of a Python package
pip install package-name==version
```
</details>


!pip install pandas
!pip install scikit-learn

In [1]:
# Used to suppress warnings generated by your code:
def warn(*args, **kwargs):
    pass
import warnings
warnings.warn = warn
warnings.filterwarnings('ignore')

# Generating synthetic dataset
The dataset has one feature, study hours, drawn uniformly between 1 and 10, and a binary label: pass (1) or fail (0).

In [2]:
import numpy as np
import pandas as pd

np.random.seed(42)
# Dataset with 100 rows
hours = np.random.uniform(1, 10, 100) 

# Generating a categorical value (yes or no)
# Pass if hours + noise > 5, otherwise fail
pass_fail = (hours + np.random.normal(0, 1, 100) > 5).astype(int)

# Create a DataFrame
data = pd.DataFrame({'hours': hours, 'pass': pass_fail})
data.head(5)

|   | hours    | pass |
|---|----------|------|
| 0 | 4.370861 | 0    |
| 1 | 9.556429 | 1    |
| 2 | 7.587945 | 1    |
| 3 | 6.387926 | 0    |
| 4 | 2.404168 | 0    |


# Split into training and testing sets
20% of the dataset is held out as the test set; the remaining 80% is used for training.

In [3]:
from sklearn.model_selection import train_test_split

x = data[['hours']]
y = data['pass']
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
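
As a quick sanity check on the split, the sketch below regenerates the same synthetic data and prints the size of each subset. It assumes the same seed and `test_size=0.2` used above; with 100 rows, that should give 80 training rows and 20 test rows.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Rebuild the synthetic dataset with the same seed as above
np.random.seed(42)
hours = np.random.uniform(1, 10, 100)
pass_fail = (hours + np.random.normal(0, 1, 100) > 5).astype(int)
data = pd.DataFrame({'hours': hours, 'pass': pass_fail})

x_train, x_test, y_train, y_test = train_test_split(
    data[['hours']], data['pass'], test_size=0.2, random_state=42
)

# test_size=0.2 holds out 20% of the rows for testing
print(f"Training rows: {len(x_train)}")
print(f"Test rows: {len(x_test)}")
```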

# Train the model

In [4]:
from sklearn.linear_model import LogisticRegression

# Train a logistic regression model
model = LogisticRegression()
model.fit(x_train, y_train)
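
To see what the fitted model has learned, the following sketch reads the learned slope and intercept and derives the number of study hours at which the predicted pass probability is 0.5. It rebuilds the data and model so it can run on its own. The names `slope`, `intercept`, and `boundary` are introduced here for illustration. Since the labels were generated with a threshold of 5 hours plus noise, the boundary should land somewhere near 5, though the exact value depends on the sample and on scikit-learn's default regularization.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Rebuild the synthetic dataset and the train/test split used above
np.random.seed(42)
hours = np.random.uniform(1, 10, 100)
pass_fail = (hours + np.random.normal(0, 1, 100) > 5).astype(int)
data = pd.DataFrame({'hours': hours, 'pass': pass_fail})

x_train, x_test, y_train, y_test = train_test_split(
    data[['hours']], data['pass'], test_size=0.2, random_state=42
)
model = LogisticRegression()
model.fit(x_train, y_train)

# The model estimates P(pass) = sigmoid(slope * hours + intercept)
slope = model.coef_[0][0]
intercept = model.intercept_[0]

# P(pass) = 0.5 exactly where slope * hours + intercept = 0
boundary = -intercept / slope
print(f"slope={slope:.3f}, intercept={intercept:.3f}, boundary={boundary:.2f} hours")
```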

# Predict and evaluate the model
Here I evaluate how well the model performs on the test set.

## Confusion Matrix:

|                | Predicted Positive | Predicted Negative |
|----------------|--------------------|--------------------|
| **Actual Positive** | True Positive (TP)  | False Negative (FN) |
| **Actual Negative** | False Positive (FP) | True Negative (TN)  |

<details>
    <summary>Details</summary>
    <ul>
    <li>True Positive (TP): The model correctly predicted the positive class.</li>
    <li>True Negative (TN): The model correctly predicted the negative class.</li>
    <li>False Positive (FP): The model incorrectly predicted the positive class (Type I error).</li>
    <li>False Negative (FN): The model incorrectly predicted the negative class (Type II error).</li>
    </ul>

</details>

In [5]:
from sklearn.metrics import accuracy_score, classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_score
from sklearn.metrics import recall_score
from sklearn.metrics import f1_score

y_pred = model.predict(x_test)

conf_matrix = confusion_matrix(y_test, y_pred)
print("Confusion Matrix:")
print(conf_matrix)

Confusion Matrix:
[[10  0]
 [ 1  9]]
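
Note that scikit-learn orders the matrix by sorted label, so for labels 0 and 1 the rows are actual fail/pass and the columns are predicted fail/pass. That is the mirror image of the table above, which lists the positive class first. A minimal sketch of unpacking the four counts, using the matrix printed above as a hard-coded array:

```python
import numpy as np

# Matrix printed above: rows = actual class (0, 1), columns = predicted class (0, 1)
conf_matrix = np.array([[10, 0],
                        [1, 9]])

# For binary labels, ravel() flattens row by row into TN, FP, FN, TP
tn, fp, fn, tp = conf_matrix.ravel()
print(f"TN={tn}, FP={fp}, FN={fn}, TP={tp}")  # TN=10, FP=0, FN=1, TP=9
```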


## Key Metrics Derived from the Confusion Matrix

| Metric         | Formula | Explanation |
|----------------|--------------------|--------------------|
| **Accuracy** | $\frac{TP + TN}{TP + TN + FP + FN}$ | The proportion of correctly classified instances out of the total instances.  |
| **Precision** |  $\frac{TP}{TP + FP}$ | The proportion of positive predictions that are actually correct. |
| **Recall** |  $\frac{TP}{TP + FN}$ | The proportion of actual positives that are correctly predicted. |
| **Specificity** |  $\frac{TN}{TN + FP}$ | The proportion of actual negatives that are correctly predicted. |
| **F1-Score** | $2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$  | The harmonic mean of precision and recall. |
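
As a worked example, the formulas can be applied directly to the counts read from the confusion matrix above (TP = 9, TN = 10, FP = 0, FN = 1). This sketch uses plain arithmetic rather than scikit-learn, and it also covers specificity, which the cells that follow do not compute.

```python
# Counts read from the confusion matrix above
tp, tn, fp, fn = 9, 10, 0, 1

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy:    {accuracy:.2f}")     # 0.95
print(f"Precision:   {precision:.2f}")    # 1.00
print(f"Recall:      {recall:.2f}")       # 0.90
print(f"Specificity: {specificity:.2f}")  # 1.00
print(f"F1-Score:    {f1:.2f}")           # 0.95
```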

### Accuracy

In [6]:
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2f}")

Accuracy: 0.95


### Precision

In [7]:
precision = precision_score(y_test, y_pred)
print(f"Precision: {precision:.2f}")

Precision: 1.00


### Recall

In [12]:
recall = recall_score(y_test, y_pred)
print(f"Recall: {recall:.2f}")

Recall: 0.90


### F1 Score

In [13]:
f1 = f1_score(y_test, y_pred)
print(f"F1-Score: {f1:.2f}")

F1-Score: 0.95


In [10]:
report = classification_report(y_test, y_pred)
print("Classification Report:")
print(report)

Classification Report:
              precision    recall  f1-score   support

           0       0.91      1.00      0.95        10
           1       1.00      0.90      0.95        10

    accuracy                           0.95        20
   macro avg       0.95      0.95      0.95        20
weighted avg       0.95      0.95      0.95        20



# Prediction

In [11]:
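# Predict for two new students. Passing a DataFrame with the training
# column name avoids scikit-learn's feature-name warning.
print(model.predict(pd.DataFrame({'hours': [9.556429]})))
print(model.predict(pd.DataFrame({'hours': [3.556429]})))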

[1]
[0]
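
Beyond the hard 0/1 label, logistic regression also yields a probability for each class. A minimal sketch using `predict_proba`, rebuilding the data and model so it runs standalone; the variable names `new_students` and `proba` are introduced here for illustration:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Rebuild the synthetic dataset, split, and model used above
np.random.seed(42)
hours = np.random.uniform(1, 10, 100)
pass_fail = (hours + np.random.normal(0, 1, 100) > 5).astype(int)
data = pd.DataFrame({'hours': hours, 'pass': pass_fail})

x_train, x_test, y_train, y_test = train_test_split(
    data[['hours']], data['pass'], test_size=0.2, random_state=42
)
model = LogisticRegression()
model.fit(x_train, y_train)

# Same two inputs as the prediction cell above
new_students = pd.DataFrame({'hours': [9.556429, 3.556429]})

# Columns of the result follow model.classes_, i.e. [0 (fail), 1 (pass)]
proba = model.predict_proba(new_students)
for h, (p_fail, p_pass) in zip(new_students['hours'], proba):
    print(f"{h:.2f} hours -> P(fail)={p_fail:.3f}, P(pass)={p_pass:.3f}")
```

`predict` returns class 1 whenever the pass probability exceeds 0.5, so these probabilities show how confident the model is in each label.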


# Conclusion
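- On the 20-sample test set, the model reaches 95% accuracy: 19 predictions are correct, and one student who actually passed is predicted to fail (a false negative).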