Tool to generate a machine learning model to detect port scans, or maybe other unwanted activity

le4onardo/pscan-classifier

pscan-classifier

Tool to generate machine learning models to detect port scans


Prerequisites

Open Argus 3.0.8.2 (argus and clients)

Nmap 7.91

Python 3.8.10

pandas 1.2.4

numpy 1.20.3

Matplotlib 3.4.2

scikit-learn (sklearn) 0.24.2

Other Python dependencies are not listed here; install them as they are requested when running the scripts.


Usage

This project needs several .argus files (network flow records) stored in the "./trainData/netflows" folder. Together, these files must contain both legitimate network flows and port scan flows. You can generate them by recording network activity with argus and the argus clients, or by converting existing .pcap files to netflow format (.argus). Refer to the argus documentation for details.
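As a rough illustration of the .pcap conversion route, the sketch below only builds the command line and does not run it. It assumes the argus `-r` (read a packet capture file) and `-w` (write flow records to a file) options; the helper name `pcap_to_argus_command` and the example capture path are hypothetical, so check the flags against the argus man page for your installed version.

```python
import shlex
from pathlib import Path


def pcap_to_argus_command(pcap_path, out_dir="./trainData/netflows"):
    """Build (but do not run) an argus command that converts one .pcap file.

    Assumption: `argus -r <pcap> -w <output>` reads a capture file and
    writes the resulting flow records to <output>.
    """
    pcap = Path(pcap_path)
    output = Path(out_dir) / (pcap.stem + ".argus")
    return ["argus", "-r", str(pcap), "-w", str(output)]


# Hypothetical capture file name, used only for illustration.
cmd = pcap_to_argus_command("captures/scan01.pcap")
print(shlex.join(cmd))
```

To actually perform the conversion, the list could be passed to `subprocess.run(cmd, check=True)` on a machine where argus is installed.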

When generating these files, keep track of which computers on the network are the attackers and which are the targets, because their IP addresses are needed. List those addresses in the scannerIps and targetIps properties of variables.json, respectively. The file also needs the password used for sudo privileges when running the trainer.

variables.json

{
	"argusConfig": "./netflowConfFiles",
	"trainingData": "./trainData/netflows",
	"demoData": "./demoData",
	"scannerIps": ["scanner ip here", "scanner ip here"],
	"targetIps": ["target ip here", "target ip here"],
	"password": "password here"
}
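A minimal sketch of how such a file could be read and used to label flows follows. The key names come from the example above; the `load_variables` and `label_flow` helpers and the labelling rule (scanner source and target destination means "scan") are assumptions for illustration, not necessarily what train.py does.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("argusConfig", "trainingData", "demoData",
                 "scannerIps", "targetIps", "password")


def load_variables(path="variables.json"):
    """Read variables.json and fail early if a required key is missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("variables.json is missing: " + ", ".join(missing))
    return config


def label_flow(src_ip, dst_ip, config):
    """Hypothetical rule: 1 = port scan flow, 0 = legitimate flow."""
    is_scan = src_ip in config["scannerIps"] and dst_ip in config["targetIps"]
    return int(is_scan)


# Example configuration using documentation-range addresses (not real hosts).
example_config = {
    "argusConfig": "./netflowConfFiles",
    "trainingData": "./trainData/netflows",
    "demoData": "./demoData",
    "scannerIps": ["192.0.2.10"],
    "targetIps": ["192.0.2.20"],
    "password": "password here",
}

scan_label = label_flow("192.0.2.10", "192.0.2.20", example_config)
normal_label = label_flow("192.0.2.30", "192.0.2.20", example_config)
print(scan_label, normal_label)
```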

Finally, running train.py generates a trained bagging model through the following steps:

After dimensionality reduction, the correlation matrix of the remaining columns is displayed.

Correlation matrix
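The README does not say which reduction technique train.py applies, so the sketch below assumes a very simple one (dropping columns that never vary) and then computes the correlation matrix with pandas. The column names and the synthetic data are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic stand-in for a netflow dataframe; column names are hypothetical.
flows = pd.DataFrame({
    "dur": rng.random(200),
    "pkts": rng.integers(1, 50, 200),
    "proto_const": np.ones(200),  # constant column, carries no information
})
flows["bytes"] = flows["pkts"] * 60 + rng.integers(0, 40, 200)


def drop_constant_columns(df):
    """Assumed reduction step: keep only columns with more than one value."""
    return df.loc[:, df.nunique() > 1]


reduced = drop_constant_columns(flows)
corr = reduced.corr()  # pairwise Pearson correlation of remaining columns
print(corr.round(2))
```

With Matplotlib installed, `plt.matshow(corr)` is one way to render the matrix as an image.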

At this point the dataframe is ready for training. Once training ends, two plots are displayed. The first shows the first decision tree of the bagged model:

[Figure: first decision tree of the bagged model]

The second shows the confusion matrix:

[Figure: confusion matrix]
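The training step can be sketched with scikit-learn as follows. This is an illustration on synthetic data, assuming decision trees as the base estimator (consistent with the tree plot mentioned above); the features, sample sizes, and hyperparameters are invented and are not taken from train.py.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 400

# Hypothetical features: [distinct destination ports, packets per flow].
normal = np.column_stack([rng.integers(1, 5, n), rng.integers(10, 100, n)])
scans = np.column_stack([rng.integers(50, 1000, n), rng.integers(1, 4, n)])
X = np.vstack([normal, scans])
y = np.array([0] * n + [1] * n)  # 0 = legitimate, 1 = port scan

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y)

# The base estimator is passed positionally so the call works across
# scikit-learn versions that name this parameter differently.
bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                        random_state=0)
bag.fit(X_train, y_train)

first_tree = bag.estimators_[0]  # the tree a plot like the first would show
cm = confusion_matrix(y_test, bag.predict(X_test))
print(cm)  # rows: true class, columns: predicted class
```

`sklearn.tree.plot_tree(first_tree)` can draw the tree when Matplotlib is available.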

Lastly, a column relevance plot is displayed.

[Figure: column relevance]
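One common way to obtain per-column relevance from a bagged ensemble of trees is to average each tree's `feature_importances_`. Whether train.py computes relevance this way is an assumption; the sketch below uses synthetic data and hypothetical column names.

```python
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

feature_names = ["informative", "noise_a", "noise_b"]  # hypothetical columns

rng = np.random.default_rng(1)
X = rng.random((300, 3))
y = (X[:, 0] > 0.5).astype(int)  # only the first column decides the label

bag = BaggingClassifier(DecisionTreeClassifier(), n_estimators=10,
                        random_state=1).fit(X, y)

# With default settings every tree sees all columns, so the importance
# vectors line up and can be averaged element-wise.
importances = np.mean(
    [tree.feature_importances_ for tree in bag.estimators_], axis=0)

ranked = sorted(zip(feature_names, importances), key=lambda pair: -pair[1])
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```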

The trained model is saved as bag.pkl.
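The .pkl extension suggests a pickled object, although the README does not say whether the project uses `pickle` or `joblib`. Assuming the standard-library `pickle` module, saving and reloading a bagged model could look like the sketch below; the tiny training set is invented for illustration.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Tiny invented dataset: [distinct destination ports, packets per flow].
X = np.array([[1, 80], [2, 60], [300, 2], [500, 1]] * 10)
y = np.array([0, 0, 1, 1] * 10)  # 0 = legitimate, 1 = port scan

model = BaggingClassifier(DecisionTreeClassifier(), n_estimators=5,
                          random_state=0).fit(X, y)

# A temporary directory keeps the example from touching a real bag.pkl.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "bag.pkl"
    with model_path.open("wb") as fh:
        pickle.dump(model, fh)
    with model_path.open("rb") as fh:
        restored = pickle.load(fh)

new_flows = np.array([[2, 70], [400, 1]])
predictions = restored.predict(new_flows)
print(predictions)
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.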

Demo

To see the model in action, run demo.py to view real-time netflow classification. It looks for a model named bag.pkl and runs argus in daemon mode to capture the network traffic on the machine.

[Figure: real-time netflow classification demo]
