GitHub - AmMoPy/NLP_Enron_Emails: Natural Language Processing (NLP) and programmatic data extraction in large scale fraud investigations.

Background:

Enron Corporation was an American energy, commodities, and services company based in Houston, Texas. It was founded by Kenneth Lay in 1985 through a merger of Lay's Houston Natural Gas and InterNorth, both relatively small regional companies. It filed for bankruptcy on December 2, 2001.

Dataset:

The Enron Corpus is a database of over 500,000 real emails generated by about 150 Enron employees, mostly senior management. It was obtained by the Federal Energy Regulatory Commission during its investigation of Enron's collapse and was later made public.

The dataset does not include attachments, and some messages have been deleted.

Project Motivation:

Given the size of the available data, exploring it and identifying potentially useful pieces of evidence or clues can be overwhelming. This project demonstrates one way of applying Natural Language Processing (NLP) and programmatic data extraction in a large-scale fraud investigation, using real data.

Along the way, the project also uses some NLP and other methods that have general application, for example:

Comparing content of text files through hashing
Identifying unique and recurring words
Text summarization using deep learning models
Creating a word cloud :D
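
As a rough illustration of the first two items in the list above, here is a minimal sketch using only the Python standard library. It is not the notebook's actual code: the function names (`content_hash`, `group_duplicates`, `word_counts`), the whitespace normalization, the tokenization rule, and the sample messages are all assumptions made for this example.

```python
import hashlib
import re
from collections import Counter


def content_hash(text):
    """Return a SHA-256 digest of the text after collapsing whitespace."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(docs):
    """Group document names whose normalized content hashes are equal."""
    by_hash = {}
    for name, text in docs.items():
        by_hash.setdefault(content_hash(text), []).append(name)
    return [names for names in by_hash.values() if len(names) > 1]


def word_counts(text):
    """Count lowercase tokens made of letters and apostrophes."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Hypothetical sample messages, not taken from the Enron Corpus.
docs = {
    "a.txt": "Please review the attached  report.",
    "b.txt": "Please review the attached report.",
    "c.txt": "Meeting moved to Friday; please confirm.",
}

duplicates = group_duplicates(docs)  # files with identical normalized content
counts = word_counts(" ".join(docs.values()))
recurring = sorted(w for w, n in counts.items() if n > 1)  # seen more than once
unique = sorted(w for w, n in counts.items() if n == 1)    # seen exactly once

print(duplicates)
print(recurring)
print(unique)
```

Hashing only detects exact matches after normalization, so messages that differ by a single word are treated as distinct. Catching near-duplicates would need a different technique.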

Disclaimer:

This project is for demonstration purposes only and is not intended to draw any conclusions. The detailed content of the emails is not displayed, even though it is publicly available elsewhere.

Resources:

Check out my YouTube channel.
