Malicious URL Detection using Machine Learning

This repo provides a dataset of 388,448 URLs, each labelled 0 or 1, where 1 marks a malicious URL. The work was done in early 2016. For demonstration purposes, I trained a simple Logistic Regression model and built a simple web app with Flask. Please note that this implementation is by no means state of the art, and there are a number of ways to improve the model. First, you might get better results with deep neural networks (e.g. a Recurrent Neural Network). Second, feeding the raw URL string directly to the model is not a good idea; we need to perform feature engineering and find better features (e.g. web page content or IP/host details). The data was collected from many sources, then merged and preprocessed. One of the sources is this.
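To make the feature-engineering suggestion more concrete, here is a minimal sketch of what host and lexical features derived from a URL string could look like. It is not code from this repo: the function name `url_features` and the particular features are illustrative assumptions, and it uses only the Python standard library.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# Matches a leading scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*://")


def url_features(url: str) -> dict:
    """Extract a few hand-picked lexical and host features (hypothetical set)."""
    # urlsplit only recognises the host when a scheme is present,
    # so prepend one for bare strings such as "example.com/path".
    to_parse = url if _SCHEME_RE.match(url) else "http://" + url
    parts = urlsplit(to_parse)
    host = parts.hostname or ""

    # Check whether the host is a raw IP address rather than a domain name.
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    return {
        "url_length": len(url),  # length of the original, unmodified string
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_path_segments": len([s for s in parts.path.split("/") if s]),
        "has_at_symbol": "@" in url,
        "host_is_ip": host_is_ip,
        "uses_https": parts.scheme == "https",
    }


if __name__ == "__main__":
    print(url_features("http://192.168.0.1/login.php"))
    print(url_features("example.com/a/b"))
```

A dictionary like this could be turned into a numeric vector and passed to a classifier in place of, or alongside, the raw URL string. Whether any of these particular features helps on this dataset is untested here.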

Folders and files:

data
env
notebooks
pre-trained
scripts
templates
Procfile
README.md
app.py
requirements.txt

bhattsameer/Malicious-URL-Detection-using-Machine-Learning
