
# Logistic Regression
You should build a machine learning pipeline using a logistic regression model. In particular, you should do the following:
- Load the `mnist` dataset using [Pandas](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). You can find this dataset in the datasets folder.
- Split the dataset into training and test sets using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.train_test_split.html). 
- Train and test a logistic regression model using [Scikit-Learn](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html).
- Check the documentation to identify the most important hyperparameters, attributes, and methods of the model. Use them in practice.

## Import libraries 

In [14]:
import pandas as pd
import sklearn.model_selection
import sklearn.linear_model
import sklearn.metrics


## Load data


In [4]:
df = pd.read_csv('mnist.csv')
df.head()

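|   | id | class | pixel1 | pixel2 | pixel3 | pixel4 | pixel5 | pixel6 | pixel7 | pixel8 | ... | pixel775 | pixel776 | pixel777 | pixel778 | pixel779 | pixel780 | pixel781 | pixel782 | pixel783 | pixel784 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 31953 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 34452 | 8 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 60897 | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 36953 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 1981 | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | ... | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |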
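Before modelling, it can help to check the class balance, the pixel value range, and missing values. The sketch below is only an illustration: `demo_df` is a hypothetical random stand-in with the same column layout as `mnist.csv` (`id`, `class`, `pixel1` to `pixel784`), not the real data. Pixel values are assumed to lie in 0 to 255.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mnist.csv: same column layout, random values.
rng = np.random.default_rng(0)
n_rows = 20
pixels = rng.integers(0, 256, size=(n_rows, 784))
demo_df = pd.DataFrame(pixels, columns=[f"pixel{i}" for i in range(1, 785)])
demo_df.insert(0, "class", rng.integers(0, 10, size=n_rows))
demo_df.insert(0, "id", np.arange(n_rows))

# Basic sanity checks.
print(demo_df.shape)                                  # (rows, columns)
print(demo_df["class"].value_counts().sort_index())   # samples per digit
pixel_cols = demo_df.columns.drop(["id", "class"])    # feature columns only
print(demo_df[pixel_cols].min().min(), demo_df[pixel_cols].max().max())
print(demo_df.isna().sum().sum())                     # total missing values
```

The same checks should apply to the real `df` by replacing `demo_df` with `df`.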


## Split the dataset into training and test sets

In [6]:
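# Features: every pixel column. "id" is an identifier, not a predictor.
x = df.drop(["class", "id"], axis=1)
# Target: the digit label.
y = df["class"]

# With default settings, 25% of the rows are held out for testing.
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y)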
print("df:" , df.shape)
print("x_train:" , x_train.shape)
print("x_test:" , x_test.shape)
print("y_train:" , y_train.shape)
print("y_test:" , y_test.shape)


df: (4000, 786)
x_train: (3000, 784)
x_test: (1000, 784)
y_train: (3000,)
y_test: (1000,)
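The split above is random, so the reported accuracy can change from run to run. A minimal sketch of a repeatable, stratified split follows. It uses scikit-learn's bundled 8x8 digits data as a stand-in for `mnist.csv`, and the seed value `42` is an arbitrary choice.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split

# Stand-in data: scikit-learn's bundled digits set, not the notebook's mnist.csv.
X_demo, y_demo = load_digits(return_X_y=True)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo,
    y_demo,
    test_size=0.25,    # the default fraction, written out explicitly
    random_state=42,   # arbitrary seed so the split is repeatable
    stratify=y_demo,   # keep class proportions similar in both parts
)
print(X_tr.shape, X_te.shape)
```

Passing `stratify` is optional, but it guards against a test set that happens to under-represent one digit.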


## Train the model


In [9]:
# Raise max_iter well above the default of 100 so the solver can converge on raw pixel values.
model = sklearn.linear_model.LogisticRegression(max_iter=10000)
model.fit(x_train, y_train)

LogisticRegression(max_iter=10000)
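The task also asks for the model's main hyperparameters, attributes, and methods. The sketch below touches a few of them: `C` (inverse regularization strength) and `max_iter` as hyperparameters, `classes_`, `coef_`, `intercept_`, and `n_iter_` as fitted attributes, and `score` as a method. It runs on scikit-learn's bundled digits data as a stand-in, and scaling the pixels to [0, 1] is an added assumption that usually helps the solver converge faster.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Stand-in data: 8x8 digit images with pixel values from 0 to 16.
X_demo, y_demo = load_digits(return_X_y=True)
X_demo = X_demo / 16.0  # scale features to [0, 1]
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, random_state=0, stratify=y_demo
)

# C=1.0 is the default; smaller values mean stronger regularization.
clf = LogisticRegression(C=1.0, max_iter=1000)
clf.fit(X_tr, y_tr)

print(clf.classes_)          # labels seen during fit
print(clf.coef_.shape)       # (n_classes, n_features)
print(clf.intercept_.shape)  # (n_classes,)
print(clf.n_iter_)           # iterations the solver actually used
print(clf.score(X_te, y_te)) # mean accuracy on the held-out data
```

Comparing `n_iter_` with `max_iter` is a quick way to see whether the iteration limit was reached.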

## Test the model


In [17]:
y_predicted = model.predict(x_test)
y_predicted

array([5, 6, 0, 9, 8, 6, 1, 3, 0, 0, 5, 3, 2, 8, 5, 1, 2, 3, 4, 7, 5, 2,
       5, 7, 2, 2, 2, 4, 6, 3, 6, 9, 1, 3, 5, 8, 6, 2, 6, 5, 4, 4, 4, 3,
       4, 4, 8, 0, 1, 1, 1, 4, 9, 5, 3, 9, 8, 9, 9, 1, 2, 8, 4, 2, 7, 6,
       0, 8, 9, 6, 7, 4, 1, 9, 3, 7, 4, 1, 5, 4, 7, 5, 1, 8, 6, 4, 6, 4,
       6, 6, 0, 7, 7, 7, 4, 1, 2, 9, 6, 8, 3, 1, 1, 1, 4, 3, 9, 1, 5, 1,
       8, 7, 2, 0, 0, 2, 7, 0, 5, 0, 6, 1, 1, 8, 9, 6, 7, 3, 5, 5, 9, 6,
       8, 7, 3, 8, 3, 9, 2, 1, 2, 7, 1, 3, 2, 4, 6, 8, 8, 5, 1, 2, 4, 8,
       9, 5, 1, 9, 5, 4, 1, 8, 9, 9, 0, 3, 2, 0, 4, 0, 8, 0, 2, 4, 7, 2,
       6, 4, 4, 5, 0, 9, 5, 1, 2, 5, 1, 7, 5, 6, 6, 1, 0, 7, 4, 1, 4, 6,
       4, 5, 1, 2, 0, 2, 1, 7, 7, 6, 1, 8, 1, 1, 2, 4, 1, 6, 7, 6, 2, 8,
       2, 4, 9, 8, 7, 5, 3, 4, 1, 3, 6, 7, 4, 0, 0, 3, 3, 6, 3, 5, 1, 1,
       6, 3, 6, 7, 1, 0, 0, 9, 8, 1, 1, 9, 3, 0, 3, 7, 1, 8, 7, 5, 1, 7,
       7, 6, 4, 0, 4, 0, 2, 8, 1, 4, 7, 6, 3, 2, 0, 1, 0, 1, 8, 4, 6, 6,
       7, 2, 7, 4, 7, 1, 4, 3, 7, 4, 4, 6, 2, 7, 5,
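`predict` returns only the most likely label. When the confidence behind each prediction matters, `predict_proba` gives one probability per class. The sketch below again uses the bundled digits data as a stand-in, and `clf` is a hypothetical model fitted just for this example.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

# Stand-in data and model, fitted on scaled pixels.
X_demo, y_demo = load_digits(return_X_y=True)
X_scaled = X_demo / 16.0
clf = LogisticRegression(max_iter=1000).fit(X_scaled, y_demo)

sample = X_scaled[:5]
proba = clf.predict_proba(sample)                # one row per sample, one column per class
labels = clf.classes_[np.argmax(proba, axis=1)]  # most probable class per row

print(proba.sum(axis=1))    # each row sums to 1
print(labels)
print(clf.predict(sample))  # matches the argmax of the probabilities
```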

## Accuracy Test



In [19]:
# sklearn.metrics is already imported at the top of the notebook.
sklearn.metrics.accuracy_score(y_test, y_predicted)

0.867
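A single accuracy figure hides which digits get confused with each other. A confusion matrix and a per-class report break the result down. The sketch below shows one way to produce both, using the bundled digits data and a freshly fitted stand-in model rather than the notebook's `model`.

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Stand-in data and model.
X_demo, y_demo = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo / 16.0, y_demo, random_state=0, stratify=y_demo
)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_te, y_hat)
print(cm)

# Precision, recall, and F1 score for each digit.
print(classification_report(y_te, y_hat))
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total equals the accuracy score.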