# Password Strength Checker

<img src='images/pass.jpg'>

Password Strength Checker is an application that estimates how strong a password is. Some password strength meters use machine learning to predict the strength of a password. If you want to learn how that works, this project is for you: I will take you through building a password strength checker in Python that trains a classifier on labeled passwords and predicts whether a new password is weak, medium, or strong.

## Importing Packages

In [1]:
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

import warnings
warnings.filterwarnings("ignore")

pd.set_option("display.max_columns",100)
pd.set_option("display.max_rows",100)


## Importing Data

In [2]:
df = pd.read_csv("data/data.csv", on_bad_lines="skip")

## EDA - Exploratory Data Analysis 

In [3]:
df.head()

|   | password    | strength |
|---|-------------|----------|
| 0 | kzde5577    | 1        |
| 1 | kino3434    | 1        |
| 2 | visi7k1yr   | 1        |
| 3 | megzy123    | 1        |
| 4 | lamborghin1 | 1        |


In [4]:
df.shape

(669640, 2)

In [5]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 669640 entries, 0 to 669639
Data columns (total 2 columns):
 #   Column    Non-Null Count   Dtype 
---  ------    --------------   ----- 
 0   password  669639 non-null  object
 1   strength  669640 non-null  int64 
dtypes: int64(1), object(1)
memory usage: 10.2+ MB


In [6]:
df.isnull().sum()

password    1
strength    0
dtype: int64

In [7]:
df = df.dropna()

In [8]:
# Map the numeric labels to readable class names
df["strength"] = df["strength"].map({0: "Weak",
                                     1: "Medium",
                                     2: "Strong"})

In [9]:
df.sample(5)    

|        | password   | strength |
|--------|------------|----------|
| 47237  | papito1    | Weak     |
| 450189 | 61870466y  | Medium   |
| 299496 | nujuluq738 | Medium   |
| 31093  | saatana77  | Medium   |
| 146408 | deus12     | Weak     |
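
Before training, it can help to look at how the labels are distributed, because a skewed distribution affects how the accuracy score should be read later. The following is a minimal sketch that rebuilds only the ten rows displayed above into a small DataFrame (the name `shown_rows` is hypothetical) and counts the classes. Ten rows are far too few to draw conclusions about the full dataset; on the real data the same call would be `df["strength"].value_counts()`.

```python
import pandas as pd

# Only the ten rows displayed above, with numeric labels already mapped
shown_rows = pd.DataFrame({
    "password": ["kzde5577", "kino3434", "visi7k1yr", "megzy123", "lamborghin1",
                 "papito1", "61870466y", "nujuluq738", "saatana77", "deus12"],
    "strength": ["Medium", "Medium", "Medium", "Medium", "Medium",
                 "Weak", "Medium", "Medium", "Medium", "Weak"],
})

# Absolute counts and relative shares per class
counts = shown_rows["strength"].value_counts()
shares = shown_rows["strength"].value_counts(normalize=True)

print(counts)
print(shares)
```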


In [10]:
x = df["password"]
y = df["strength"]

In [11]:
# Hold out half of the data for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.5, random_state=42)
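
If the classes turn out to be unevenly represented, one option is a stratified split, which keeps the class proportions the same in the training and test sets. This is a sketch on made-up labels (80 "Medium", 20 "Weak"), not the split used in this notebook; the toy passwords and the variable names are assumptions for illustration. `stratify` is a standard parameter of `train_test_split`.

```python
from collections import Counter

from sklearn.model_selection import train_test_split

# Made-up, deliberately imbalanced toy data (illustration only)
toy_passwords = [f"pw{i}" for i in range(100)]
toy_labels = ["Medium"] * 80 + ["Weak"] * 20

# stratify=toy_labels preserves the 80/20 ratio in both halves
x_tr, x_te, y_tr, y_te = train_test_split(
    toy_passwords, toy_labels, test_size=0.5, random_state=42, stratify=toy_labels
)

train_counts = Counter(y_tr)
test_counts = Counter(y_te)
print(train_counts)
print(test_counts)
```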

In [12]:
# TfidfVectorizer and RandomForestClassifier are already imported in cell 1
from sklearn.pipeline import Pipeline

In [13]:
def word(password):
    # Split a password into its individual characters so that
    # the vectorizer treats every character as one token
    return list(password)
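
To make the tokenizer's behavior concrete, here is a small sketch that applies an equivalent `word` function to the first password from the dataset preview. It uses `list(password)`, which produces the same character list as appending each character in a loop.

```python
def word(password):
    # Return the password as a list of single characters
    return list(password)

tokens = word("kzde5577")
print(tokens)  # ['k', 'z', 'd', 'e', '5', '5', '7', '7']
```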

In [14]:
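model_pipeline = Pipeline([
    # Character-level TF-IDF features built with the custom tokenizer
    ('tfidfvectorizer', TfidfVectorizer(tokenizer=word)),
    # Random forest classifier with default hyperparameters
    ('clf', RandomForestClassifier())
])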
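
One detail worth knowing about this pipeline: `TfidfVectorizer` lowercases its input by default (`lowercase=True`) before the tokenizer runs, so uppercase and lowercase letters end up as the same feature. The sketch below compares the learned vocabulary with and without lowercasing on two made-up passwords. It uses the built-in `list` as the tokenizer, which splits a string into characters just like `word`, and passes `token_pattern=None` only to silence the warning about the unused default pattern. Whether `lowercase=False` improves this model is an open question; the accuracy reported in this notebook was obtained with the default setting.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Two made-up passwords that differ only in letter case and digit
toy_passwords = ["Abc1", "abC2"]

# Default behavior: input is lowercased before tokenization
default_vec = TfidfVectorizer(tokenizer=list, token_pattern=None)
default_vec.fit(toy_passwords)
default_vocab = sorted(default_vec.vocabulary_)

# Case-preserving variant: 'A' and 'a' become separate features
cased_vec = TfidfVectorizer(tokenizer=list, token_pattern=None, lowercase=False)
cased_vec.fit(toy_passwords)
cased_vocab = sorted(cased_vec.vocabulary_)

print(default_vocab)
print(cased_vocab)
```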

In [15]:
model_pipeline.fit(x_train, y_train)

In [16]:
y_pred = model_pipeline.predict(x_test)

In [17]:
from sklearn.metrics import accuracy_score
accuracy_score(y_test, y_pred)

0.9505853891643271
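
A single accuracy number can hide weak performance on the less frequent classes, so per-class metrics are a useful complement. The sketch below uses made-up labels (not results from this model) in which a classifier that always answers "Medium" still reaches 80% accuracy. On the real data, the same functions could be called with `y_test` and `y_pred`.

```python
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

class_names = ["Weak", "Medium", "Strong"]

# Made-up ground truth and predictions (illustration only)
toy_true = ["Medium"] * 8 + ["Weak"] + ["Strong"]
toy_pred = ["Medium"] * 10

toy_accuracy = accuracy_score(toy_true, toy_pred)

# Rows are true classes, columns are predicted classes, in class_names order
toy_cm = confusion_matrix(toy_true, toy_pred, labels=class_names)

print(toy_accuracy)
print(toy_cm)
# zero_division=0 avoids warnings for classes that are never predicted
print(classification_report(toy_true, toy_pred, labels=class_names, zero_division=0))
```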

In [18]:
from joblib import dump
dump(model_pipeline, 'model.joblib')

['model.joblib']
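
The saved file can be restored later with `joblib.load`. Because joblib relies on pickle, a function used as the tokenizer is stored by reference, so a pipeline saved with the custom `word` function needs `word` to be defined in the same place when the file is loaded. The following sketch shows a save and load round trip on a tiny pipeline trained on six rows from the previews above. It swaps in the built-in `list` as the tokenizer to sidestep that requirement, and the file name `toy_model.joblib`, the reduced `n_estimators`, and the variable names are assumptions for illustration.

```python
import os
import tempfile

from joblib import dump, load
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline

# Six rows taken from the dataset previews (far too few for a real model)
toy_x = ["kzde5577", "kino3434", "megzy123", "lamborghin1", "papito1", "deus12"]
toy_y = ["Medium", "Medium", "Medium", "Medium", "Weak", "Weak"]

toy_pipeline = Pipeline([
    # Built-in list() splits a string into characters, like word()
    ("tfidfvectorizer", TfidfVectorizer(tokenizer=list, token_pattern=None)),
    ("clf", RandomForestClassifier(n_estimators=10, random_state=0)),
])
toy_pipeline.fit(toy_x, toy_y)

# Save to a temporary directory and load the pipeline back
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "toy_model.joblib")
    dump(toy_pipeline, model_path)
    restored_pipeline = load(model_path)

# The restored pipeline should reproduce the original predictions
before = list(toy_pipeline.predict(toy_x))
after = list(restored_pipeline.predict(toy_x))
print(before == after)
```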

In [19]:
model_pipeline.predict(["password"]) 

array(['Medium'], dtype=object)