This is a simple program that detects whether the given URLs are malicious or not. You can give the program any number of URLs.
Before running code.py, make sure you have installed all the necessary packages, such as pandas and numpy.
The dataset provided here (url_feature.csv) contains more than 10,000 URLs. If you want, you can reduce that number to cut the running time.
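One way to shrink the dataset is to load it with pandas and keep a random sample of rows. The snippet below is only a sketch: the helper name `load_subset` and its parameters `max_rows` and `seed` are made up for illustration and are not part of code.py.

```python
import pandas as pd


def load_subset(csv_path, max_rows=None, seed=0):
    """Load a CSV file and optionally keep a random sample of at most max_rows rows.

    Hypothetical helper: it is not part of code.py.
    """
    df = pd.read_csv(csv_path)
    if max_rows is not None and max_rows < len(df):
        # Sample without replacement; a fixed seed keeps the subset reproducible.
        df = df.sample(n=max_rows, random_state=seed).reset_index(drop=True)
    return df
```

For example, `load_subset("url_feature.csv", max_rows=2000)` would keep 2,000 randomly chosen rows, while leaving out `max_rows` loads the whole file.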
Algorithm used: All of the URLs in the dataset are labeled. We use 5-fold cross-validation to train and test our models. After selecting features, we used four machine learning algorithms. They are:
- Linear Regression
- Logistic Regression
- Random Forest
- Gaussian Naïve-Bayes
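The 5-fold evaluation of these four algorithms can be sketched with scikit-learn as shown here. This is an illustration under several assumptions, not the code in code.py: the data is a synthetic stand-in (the real column names in url_feature.csv are not shown here), the labels are assumed to be 0/1, the Linear Regression output is turned into a class by thresholding at 0.5, and all function names and hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import StratifiedKFold
from sklearn.naive_bayes import GaussianNB


def make_toy_data(n_samples=400, n_features=6, seed=0):
    """Synthetic stand-in for the URL feature matrix (NOT the real dataset)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, n_features))
    # Assumed labels: 1 = malicious, 0 = benign, driven by two features plus noise.
    noise = 0.3 * rng.normal(size=n_samples)
    y = (X[:, 0] + 0.5 * X[:, 1] + noise > 0).astype(int)
    return X, y


def predict_labels(model, X):
    """Return 0/1 labels; linear regression output is thresholded at 0.5 (assumption)."""
    preds = model.predict(X)
    if isinstance(model, LinearRegression):
        return (preds >= 0.5).astype(int)
    return preds


def cross_validate(models, X, y, n_splits=5, seed=0):
    """Mean accuracy per model over stratified k-fold splits."""
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = {name: [] for name in models}
    for train_idx, test_idx in folds.split(X, y):
        for name, make_model in models.items():
            model = make_model()  # fresh, unfitted model for every fold
            model.fit(X[train_idx], y[train_idx])
            y_pred = predict_labels(model, X[test_idx])
            scores[name].append(float(np.mean(y_pred == y[test_idx])))
    return {name: float(np.mean(vals)) for name, vals in scores.items()}


# Each entry is a zero-argument factory that builds a new model.
MODELS = {
    "Linear Regression": LinearRegression,
    "Logistic Regression": lambda: LogisticRegression(max_iter=1000),
    "Random Forest": lambda: RandomForestClassifier(n_estimators=50, random_state=0),
    "Gaussian Naive Bayes": GaussianNB,
}

X, y = make_toy_data()
results = cross_validate(MODELS, X, y)
for name, acc in results.items():
    print(f"{name}: {acc:.4f}")
```

The accuracies printed by this sketch come from the toy data only, so they say nothing about how the models do on the real URL features. To try it on the real dataset, replace `make_toy_data()` with your own feature matrix and label vector.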
RESULTS (ALGORITHM: ACCURACY in %):
- Linear Regression: 93.04
- Logistic Regression: 96.17
- Random Forest: 82.20
- Naïve Bayes: 96.00