Machine learning based web application firewall

Machine learning based Web Application firewall for detection attacks such as SQL injections, XSS and shell script injections.

Course project for TDA602 Contributors: Filip Granqvist & Oskar Holmberg

Requirements:

Python 3.x
Jupyter notebook
Python libraries
- Pandas
- Numpy
- Sklearn
- matplotlib
- seaborn
- scipy
Node.js (to run demo server)

Follow along our development process in these notebooks:

0_Data_wrangling.ipynb (Formating other sources of payload datasets into a common format (don't step through this))
1_Data_cleaning.ipynb (Cleaning the data and output into a common .csv file)
2_Data_analysis.ipynb (All analysis, training, evaluation and saving models to pickles (not recommended to step through the training section, takes a long time))

To run notebooks (they can also be read from github):

Install jupyter notebook
type in cmd: jupyter notebook <notebookfile.ipynb>
step through each part of the notebook using Ctrl+Enter or from the toolbar

Plots_technical_background.ipynb contains junk used to create images for the report

Demo-server contains a Node.js server with our best classifier (in the form of a .pickle) implemented available for live testing
See README.md in demo-server for instructions on how to set up the server

images contains images used for the report

data folder contains our malicious and non-malicious data. Also contains trained classifiers in form of .pickle files
tfidf_2grams_randomforest.p contains our single best classifier
trained_classifiers.p contains all our classifiers along with performance metrics. But this is not the final version, it was too big for github. Download the final version from here: https://1drv.ms/f/s!Aj1zBHCOJiQFgbQwDswBYtpzB1Pulg and replace the old one

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data

data

demo-server

demo-server

images

images

.gitignore

.gitignore

0_Data_wrangling.ipynb

0_Data_wrangling.ipynb

1_Data_cleaning.ipynb

1_Data_cleaning.ipynb

2_Data_analysis.ipynb

2_Data_analysis.ipynb

LICENSE

LICENSE

Plots_technical_background.ipynb

Plots_technical_background.ipynb

README.md

README.md

report.pdf

report.pdf

Repository files navigation

Machine learning based web application firewall

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 59 Commits
data		data
demo-server		demo-server
images		images
.gitignore		.gitignore
0_Data_wrangling.ipynb		0_Data_wrangling.ipynb
1_Data_cleaning.ipynb		1_Data_cleaning.ipynb
2_Data_analysis.ipynb		2_Data_analysis.ipynb
LICENSE		LICENSE
Plots_technical_background.ipynb		Plots_technical_background.ipynb
README.md		README.md
report.pdf		report.pdf

License

grananqvist/Machine-Learning-Web-Application-Firewall-and-Dataset

Folders and files

Latest commit

History

Repository files navigation

Machine learning based web application firewall

About

Topics

Resources

License

Stars

Watchers

Forks

Languages