A text mining project for Harvard Business Review articles from 1922 to 2012

fahd09/HarvardBusinessReviews_in_90_years

History of the United States through the lens of Harvard Business Review Articles (1922 - 2012)

HBR 90 years visualization

-> Open the high-definition picture

A text mining project covering 90 years of HBR articles.

In this project, I use a multivariate technique called Correspondence Analysis (CA). Given a term-year matrix that describes how many times term i was mentioned in year (or group of years) j, CA produces a set of orthogonal components (just like Principal Component Analysis, PCA) that capture the "driving forces" of variance in the dataset.
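
As a rough illustration of the idea, here is a minimal sketch of CA computed with NumPy through a singular value decomposition of the standardized residuals. It is not the code used in this project: the function name `correspondence_analysis`, the example words, the period labels, and the counts are all made up for the example.

```python
import numpy as np


def correspondence_analysis(counts):
    """Plain CA of a 2-D count matrix (rows = terms, columns = years).

    Returns (row_coords, col_coords, inertia): principal coordinates for
    rows and columns, and the inertia (variance) carried by each component.
    Assumes every row and every column has a non-zero total.
    """
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                      # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                    # row masses (terms)
    c = P.sum(axis=0)                    # column masses (years)

    # Standardized residuals: deviation from what independence would predict.
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)

    # The SVD yields orthogonal components, ordered by explained inertia.
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2
    return row_coords, col_coords, inertia


# Hypothetical term-by-period counts, for illustration only.
terms = ["railroad", "factory", "computer", "internet"]
periods = ["1930s", "1970s", "2000s"]
counts = [
    [30, 10, 2],
    [20, 15, 5],
    [3, 12, 25],
    [1, 8, 40],
]

row_coords, col_coords, inertia = correspondence_analysis(counts)
share = inertia / inertia.sum()          # fraction of total inertia per component
print("share of inertia per component:", np.round(share, 3))
for term, xy in zip(terms, row_coords[:, :2]):
    print(term, np.round(xy, 3))
```

The first two columns of `row_coords` and `col_coords` are what one would typically scatter-plot. The total inertia equals the chi-square statistic of the table divided by the grand total, which is the sense in which CA decomposes "variance" in count data.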

How to read the plot?

The plot shows a representative subset of words across all years. You can imagine a spring between each word and every year. The strength of each spring is weighted by the number of times the word was mentioned in that year. This way, words associated with the 1930s will pull those years, while words associated with recent years will pull in a different direction. The plot approximates this sort of image.
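
The spring picture corresponds to a standard property of CA: when years are drawn in standard coordinates and words in principal coordinates, each word sits at the weighted average of the year positions, with weights equal to the word's share of mentions in each year. The sketch below checks this numerically. It uses made-up counts and hypothetical variable names, and it assumes this particular scaling, which may differ from the one used for the plot in this project.

```python
import numpy as np

# Made-up counts: rows are words, columns are periods (illustration only).
words = ["railroad", "factory", "computer", "internet"]
periods = ["1930s", "1970s", "2000s"]
N = np.array(
    [[30, 10, 2],
     [20, 15, 5],
     [3, 12, 25],
     [1, 8, 40]],
    dtype=float,
)

# Plain CA via SVD of the standardized residuals.
P = N / N.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

k = 2  # keep the two leading components, as in a 2-D plot
word_xy = (U[:, :k] * sv[:k]) / np.sqrt(r)[:, None]    # words, principal coordinates
period_anchor = Vt.T[:, :k] / np.sqrt(c)[:, None]      # periods, standard coordinates

# "Spring strengths": each word's mentions per period, normalized to sum to 1.
spring_weights = N / N.sum(axis=1, keepdims=True)

# Where the springs would leave each word: the weighted average of the anchors.
spring_rest = spring_weights @ period_anchor

# The spring positions coincide with the CA word coordinates.
assert np.allclose(word_xy, spring_rest)
for word, xy in zip(words, spring_rest):
    print(word, np.round(xy, 3))
```

A word mentioned almost exclusively in one period ends up close to that period's anchor, while a word spread evenly over all periods is pulled toward the center of the plot.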

More technical descriptions can be found on Wikipedia.
