
Angione-Lab/12-machine-learning-models-for-text-classification

Expert Systems

This repository contains the Python code and data to reproduce the results presented in the paper: A. Occhipinti*, L. Rogers*, C. Angione, "A pipeline and comparative study of 12 machine learning models for text classification", Expert Systems with Applications, 201 (2022): 117193

How to run

The following steps are required to run the code:

  1. Python 3.6.x is required; the code explicitly checks the version before it continues.
  2. A Jupyter notebook server is required.
  3. The paper uses the Enron spam corpus; the tar archives containing the spam emails are included.
    • Antivirus applications may flag some emails as malicious, as viruses, or as scams. This is expected; restore any quarantined files where necessary.
  4. Ensure all pip dependencies listed in requirements.txt are installed.
  5. Run through the steps laid out in the notebook.
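
As an illustration of the version requirement in step 1, a check of this kind could look like the sketch below. It is not the check used in the notebook; the function names `is_supported_python` and `require_python` are hypothetical and chosen only for this example.

```python
import sys


def is_supported_python(version_info=sys.version_info, required=(3, 6)):
    """Return True when the major.minor version matches `required`."""
    return tuple(version_info[:2]) == tuple(required)


def require_python(required=(3, 6)):
    """Raise RuntimeError if the running interpreter is not `required`.x."""
    if not is_supported_python(required=required):
        found = "{}.{}".format(*sys.version_info[:2])
        raise RuntimeError(
            "Python {}.{}.x is required, found {}".format(
                required[0], required[1], found
            )
        )
```

Calling `require_python()` at the top of a script or notebook cell would stop execution early on any interpreter other than 3.6.x, before the dependencies from requirements.txt are imported.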
