# Password Strength Checker using Python

A password strength checker is an application that estimates how strong a password is. Some popular password strength meters use machine learning algorithms to predict the strength of a password. So, if you want to learn how to use machine learning to check a password’s strength, this article is for you. In this article, I will take you through how to create a password strength checker with machine learning using Python.

#### How to Create a Password Strength Checker?
A password strength checker works by looking at the combination of digits, letters, and special symbols you use in your password. It is created by training a machine learning model on a labelled dataset of passwords, each made up of a different combination of digits, letters, and special symbols. From this data, the model learns which combinations can be classified as a strong or a weak password.
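To make the idea of "combinations of digits, letters, and special symbols" concrete, here is a minimal sketch that counts each kind of character in a password. The function name `char_classes` is only illustrative and is not part of the model built below; the model learns from the characters themselves rather than from hand-written counts like these.

```python
def char_classes(password):
    """Count lowercase letters, uppercase letters, digits, and other characters."""
    counts = {"lower": 0, "upper": 0, "digit": 0, "symbol": 0}
    for ch in password:
        if ch.islower():
            counts["lower"] += 1
        elif ch.isupper():
            counts["upper"] += 1
        elif ch.isdigit():
            counts["digit"] += 1
        else:
            # Anything else (punctuation, spaces, uncased characters) lands here
            counts["symbol"] += 1
    return counts


print(char_classes("Govinda@321Pav"))
```

A password that mixes all four groups is usually harder to guess than one that uses a single group, which is the kind of pattern a trained model can pick up from labelled examples.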

So to create an application to check the strength of passwords, we need to have a labelled dataset about different combinations of letters and symbols. I found a dataset on Kaggle to train a machine learning model to predict the strength of a password. We can use that data for this task. You can download the dataset from here.

In the section below, I will take you through how to use Machine Learning to create a password strength checker using Python.

##### Let’s start by importing the necessary Python libraries and the dataset we need for creating a password strength checker:

In [1]:
import pandas as pd 
import numpy as np
import warnings
warnings.filterwarnings('ignore')
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

In [2]:
# error_bad_lines was removed in pandas 2.0; on_bad_lines skips malformed rows instead
df = pd.read_csv('data.csv', on_bad_lines='warn')
df.head()

Skipping line 2810: expected 2 fields, saw 5
Skipping line 4641: expected 2 fields, saw 5
Skipping line 7171: expected 2 fields, saw 5
Skipping line 11220: expected 2 fields, saw 5
Skipping line 13809: expected 2 fields, saw 5
Skipping line 14132: expected 2 fields, saw 5
Skipping line 14293: expected 2 fields, saw 5
Skipping line 14865: expected 2 fields, saw 5
Skipping line 17419: expected 2 fields, saw 5
Skipping line 22801: expected 2 fields, saw 5
Skipping line 25001: expected 2 fields, saw 5
Skipping line 26603: expected 2 fields, saw 5
Skipping line 26742: expected 2 fields, saw 5
Skipping line 29702: expected 2 fields, saw 5
Skipping line 32767: expected 2 fields, saw 5
Skipping line 32878: expected 2 fields, saw 5
Skipping line 35643: expected 2 fields, saw 5
Skipping line 36550: expected 2 fields, saw 5
Skipping line 38732: expected 2 fields, saw 5
Skipping line 40567: expected 2 fields, saw 5
Skipping line 40576: expected 2 fields, saw 5
Skipping line 41864: expected 2 field

      password  strength
0     kzde5577         1
1     kino3434         1
2    visi7k1yr         1
3     megzy123         1
4  lamborghin1         1


In [3]:
df.shape

(669640, 2)

In [4]:
df.isna().any()

password     True
strength    False
dtype: bool

##### The dataset has two columns: password and strength. In the strength column:

0 means: the password’s strength is weak;

1 means: the password’s strength is medium;

2 means: the password’s strength is strong;


Before moving forward, I will drop the rows with a missing password and convert the 0, 1, and 2 values in the strength column to Weak, Medium, and Strong:

In [5]:
df = df.dropna()

df['strength'] = df['strength'].map({0:'Weak',
                                     1:'Medium',
                                     2:'Strong'})
print(df.sample(5))

                password strength
98646           boiqpl88   Medium
233222    Govinda@321Pav   Strong
330883   EIJxPvDWFfKc0oW   Strong
504297  ZelenGozdicek-12   Strong
220617        bouchama06   Medium


In [6]:
# Number and percentage of missing values per column after dropping NaN rows
pd.DataFrame({'Count': df.isnull().sum(), 'Percentage': df.isnull().sum() / len(df) * 100})

          Count  Percentage
password      0         0.0
strength      0         0.0


#### Password Strength Prediction Model
Now let’s move on to training a machine learning model to predict the strength of a password. Before we train the model, we need to tokenize the passwords into individual characters, because the model has to learn from the combinations of digits, letters, and symbols to predict a password’s strength. So here’s how we can tokenize the passwords, convert them into TF-IDF features, and split the data into training and test sets:

In [7]:
def word(password):
    # Split a password into a list of its individual characters
    return list(password)

x = np.array(df["password"])
y = np.array(df["strength"])

# Build TF-IDF features over single characters instead of words
tdif = TfidfVectorizer(tokenizer = word)
x = tdif.fit_transform(x)

# Hold out 5% of the rows as a test set
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.05,random_state=42)
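To see what the character-level TF-IDF features look like, here is a small pure-Python sketch of the same computation. It assumes the vectorizer's default settings as I understand them: passwords are lowercased, the term frequency is the raw character count, the inverse document frequency is smoothed as ln((1 + n) / (1 + df)) + 1, and each row is scaled to unit length. The function name `char_tfidf` is hypothetical and is only meant for illustration, not as a replacement for `TfidfVectorizer`.

```python
import math
from collections import Counter


def char_tfidf(passwords):
    """Return one {character: weight} dict per password (illustrative only)."""
    # Term frequency: how often each character occurs in each password
    docs = [Counter(p.lower()) for p in passwords]
    n_docs = len(docs)

    # Document frequency: in how many passwords each character appears
    doc_freq = Counter(ch for doc in docs for ch in doc)

    # Smoothed inverse document frequency
    idf = {ch: math.log((1 + n_docs) / (1 + df)) + 1 for ch, df in doc_freq.items()}

    rows = []
    for doc in docs:
        weights = {ch: count * idf[ch] for ch, count in doc.items()}
        # L2-normalise each row; fall back to 1.0 for an empty password
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        rows.append({ch: w / norm for ch, w in weights.items()})
    return rows


for row in char_tfidf(["kzde5577", "kino3434", "megzy123"]):
    print({ch: round(w, 3) for ch, w in row.items()})
```

Characters that are repeated within a password, or that are rare across the whole set, receive larger weights. Note that lowercasing means upper and lower case letters share a feature under these assumed defaults.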
    

#### Now here’s how to train a classification model to predict the strength of the password:

In [None]:
model = RandomForestClassifier()
model.fit(x_train,y_train)
print(model.score(x_test,y_test))
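The score printed above is the overall accuracy on the test set. If one strength class turns out to be much more common than the others, a single accuracy number can hide weak performance on the rarer classes, so it may help to look at each class separately. The helper below is a minimal sketch; the name `per_class_recall` is hypothetical and not part of scikit-learn.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Tiny made-up example, only to show the shape of the result
true_labels = ["Weak", "Weak", "Medium", "Strong"]
predicted = ["Weak", "Medium", "Medium", "Strong"]
print(per_class_recall(true_labels, predicted))
```

With the trained model, this could be called as `per_class_recall(y_test, model.predict(x_test))`.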

#### Now here’s how we can check the strength of a password using the trained model:

In [None]:
import getpass
user = getpass.getpass('Enter PassWord:')
data = tdif.transform([user]).toarray()
output = model.predict(data)  # predict on the vectorized password, not on the DataFrame
output
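If you want to check several passwords, the transform and predict steps can be wrapped in a small helper. This is only a sketch: the function name `check_strength` is hypothetical, and it assumes you pass in an already fitted vectorizer and an already trained model, such as `tdif` and `model` from the cells above.

```python
def check_strength(password, vectorizer, model):
    """Return the predicted strength label for a single password."""
    # The vectorizer expects a list of documents, so wrap the password in a list
    features = vectorizer.transform([password])
    # predict returns one label per input row; take the only one
    return model.predict(features)[0]
```

For example, `check_strength(getpass.getpass('Enter Password:'), tdif, model)` would return one of the labels used during training: Weak, Medium, or Strong.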

#### Summary
So this is how you can use machine learning to create a password strength checker using the Python programming language. A password strength checker works by looking at the combination of digits, letters, and special symbols you use in your password. I hope you liked this article on creating a password strength checker with Machine Learning using Python.