History of the United States through the lens of Harvard Business Review Articles (1922 - 2012)

A text-mining project covering 90 years of Harvard Business Review (HBR) articles.

In this project, I use a multivariate technique called Correspondence Analysis (CA). Given a term-year matrix that describes how many times term i was mentioned in year (or group of years) j, CA produces a set of orthogonal components, much like Principal Component Analysis (PCA), that capture the "driving forces" of variance in the dataset.
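To make the idea concrete, here is a minimal sketch of simple CA computed with a singular value decomposition in NumPy. It is not the code used in this project. The function name `correspondence_analysis`, the four terms, the four periods, and all counts are made up for illustration, and the sketch assumes every term and every period has at least one count.

```python
import numpy as np

def correspondence_analysis(counts):
    """Simple CA of a terms x years count matrix (illustrative sketch)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)              # row masses: overall share of each term
    c = P.sum(axis=0)              # column masses: overall share of each period
    expected = np.outer(r, c)      # what independence of terms and years predicts
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    inertia = sv ** 2              # variance ("inertia") carried by each component
    # Principal coordinates: where terms and periods land on each component.
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_coords, col_coords, inertia

# Hypothetical toy data: rows are terms, columns are groups of years.
terms = ["railroad", "union", "computer", "internet"]
periods = ["1930s", "1960s", "1990s", "2000s"]
counts = [
    [30, 10,  2,  1],
    [20, 25,  5,  3],
    [ 0,  8, 30, 25],
    [ 0,  0, 10, 40],
]

row_coords, col_coords, inertia = correspondence_analysis(counts)
share = inertia / inertia.sum()    # fraction of total inertia per component
for term, xy in zip(terms, row_coords[:, :2]):
    print(f"{term:10s} {xy[0]: .3f} {xy[1]: .3f}")
```

Plotting the first two columns of `row_coords` and `col_coords` on the same axes gives a term map of the kind described in this README. The sign of each component is arbitrary, so a map may appear mirrored between runs or implementations.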

How to read the plot?

The plot shows a representative subset of words across all years. You can imagine a spring between each word and every year. The strength of each spring is weighted by the number of times the word was mentioned in that year. This way, words associated with the 1930s pull on those years, while words associated with recent years pull in a different direction. The plot approximates this picture.
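The spring picture can be sketched in a few lines of plain Python. If each spring is idealized as having zero rest length and a stiffness equal to the word's count in that year, the word settles at the count-weighted average of the year positions. The function name `spring_equilibrium`, the year positions, and the counts below are all hypothetical and only illustrate the analogy, not the project's actual output.

```python
def spring_equilibrium(year_positions, counts):
    """Resting point of a word tied to each year by a spring whose
    stiffness equals the word's count in that year (illustrative sketch)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("the word never appears, so no spring pulls on it")
    dims = len(year_positions[0])
    # Equilibrium of ideal springs is the stiffness-weighted mean position.
    return tuple(
        sum(k * pos[d] for k, pos in zip(counts, year_positions)) / total
        for d in range(dims)
    )

# Hypothetical 2-D positions for four groups of years.
year_positions = [(-1.0, 0.2), (-0.3, -0.4), (0.5, -0.1), (1.0, 0.3)]

# Hypothetical counts per period for two words.
railroad = spring_equilibrium(year_positions, [30, 10, 2, 1])
internet = spring_equilibrium(year_positions, [0, 0, 10, 40])
print("railroad:", railroad)
print("internet:", internet)
```

With these made-up numbers, the word concentrated in early periods comes to rest near the early years and the word concentrated in recent periods near the recent ones, which is the pattern to look for when reading the plot.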

More technical descriptions can be found on Wikipedia:

https://en.wikipedia.org/wiki/Correspondence_analysis

