Imagine we would like to know who is the best person to ask about a subject inside our company, a potential mentor. One way would be to infer each person's speciality from their main body of work: their emails.

If we lived in another world in which privacy were not an obvious concern (or if we worked at Google), reading other people's email would be totally kosher. In the normal, privacy-compliant world, this remains a purely academic exercise.

However, we do have access to a publicly released corpus of emails to work with: the Enron email dataset.

When I first approached this subject, my first idea was to use a named entity recognizer (NER), because if one were designing a recommender system for an energy company, one of the use cases would be to suggest whom to ask about a very specific technical issue. At the time, I found spaCy to have a nice NER for Python.
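To make the idea concrete, here is a minimal sketch of how extracted entities could be turned into a "whom to ask" lookup. It is an illustration under assumptions, not the code in this repository: the function names (`naive_entities`, `build_expert_index`, `who_to_ask`) and the sample messages are hypothetical, and the regex-based extractor is only a stand-in for a real NER such as spaCy's, which could be passed in through the `extract_entities` parameter.

```python
import re
from collections import Counter, defaultdict
from email import message_from_string


def naive_entities(text):
    """Stand-in for a real NER (assumption): return runs of capitalized words.

    This is crude and will also pick up capitalized sentence-initial words;
    a trained NER model would replace this function in practice.
    """
    return re.findall(r"\b[A-Z][a-z]+(?:\s[A-Z][a-z]+)*\b", text)


def build_expert_index(raw_emails, extract_entities=naive_entities):
    """Map each entity (lowercased) to a Counter of senders who mention it."""
    index = defaultdict(Counter)
    for raw in raw_emails:
        msg = message_from_string(raw)
        sender = msg.get("From", "").strip().lower()
        # Skip messages without a sender; multipart bodies are out of scope here.
        if not sender or msg.is_multipart():
            continue
        body = msg.get_payload()
        # Count each entity once per email so long messages do not dominate.
        for entity in set(extract_entities(body)):
            index[entity.lower()][sender] += 1
    return index


def who_to_ask(index, topic, top_n=3):
    """Return the senders who mention the topic in the most emails."""
    counts = index.get(topic.lower(), Counter())
    return [sender for sender, _ in counts.most_common(top_n)]


# Made-up messages for illustration only; not taken from the Enron corpus.
SAMPLE_EMAILS = [
    "From: alice@example.com\nSubject: report\n\nthe Gas Storage capacity report is attached.",
    "From: alice@example.com\nSubject: re: report\n\nmore notes on Gas Storage maintenance.",
    "From: bob@example.com\nSubject: lunch\n\nsee you at noon, also Gas Storage came up.",
]

if __name__ == "__main__":
    index = build_expert_index(SAMPLE_EMAILS)
    print(who_to_ask(index, "Gas Storage"))
```

Counting each entity once per email is a design choice made for this sketch; weighting by recency or by the number of recipients would be equally plausible alternatives.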

Read the post at https://danielpradilla.info/blog/recommender-system-for-finding-subject-matter-experts-using-the-enron-email-corpus

About

Experiments with the Enron email corpus
