Language analysis platform for historical documents from the Reconstruction and Gilded Age eras of American history.

typpo/gilded-age

Gilded Age

Summary

A newspaper content analysis project tailored to archives that hold records from the Civil War, Reconstruction, and Gilded Age eras of American history. In particular, we focus on text and content analytics to identify local and national trends in corruption.

Proposal

Here

Goals

  • Create scrapers for multiple historical archives.
  • Create analyzers for semantic analysis and classification.
    • OpenCalais
  • Provide an API for creating queries and manipulating results as objects.
  • Provide graphing and other visualization capabilities:
    • Using NetworkX with graphviz:
      • Basic relational graphs
      • Histograms
    • Using Circos:
      • Complex relational graphs
    • Other:
      • Tag clouds for simple words, phrases, and extracted semantic concepts.
  • Use open semantic sources like Freebase to:
    • generalize groups of similar people, things, or concepts.
    • pinpoint related concepts with greater accuracy.
  • Use statistical techniques like clustering to:
    • reveal relationships between people, things, and concepts.
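
To make the tag cloud and relational graph goals above more concrete, here is a minimal sketch using only the Python standard library. It is not code from this project: the sample sentences, the entity list, the stopword set, and the function names `term_counts` and `cooccurrence_edges` are all hypothetical, and the entity list stands in for whatever a semantic analyzer such as OpenCalais would extract. The sketch counts word frequencies (usable as tag cloud weights) and counts how often two entities appear in the same document (usable as weighted edges in a basic relational graph).

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical sample text; real input would come from the archive scrapers.
DOCUMENTS = [
    "Boss Tweed and Tammany Hall faced new bribery charges.",
    "Tammany Hall denied the bribery charges against Boss Tweed.",
    "Credit Mobilier shares reached members of Congress.",
]

# Hypothetical entity list; assumed to be produced by a semantic analyzer.
ENTITIES = ["Boss Tweed", "Tammany Hall", "Credit Mobilier", "Congress"]

# Small illustrative stopword set, not a complete one.
STOPWORDS = {"the", "and", "of", "new", "against"}


def term_counts(documents):
    """Count lowercase word frequencies, skipping stopwords."""
    counts = Counter()
    for text in documents:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


def cooccurrence_edges(documents, entities):
    """Count how often each pair of entities appears in the same document."""
    edges = Counter()
    for text in documents:
        # Sort so each pair is always stored in the same order.
        present = sorted(e for e in entities if e in text)
        edges.update(combinations(present, 2))
    return edges


tag_weights = term_counts(DOCUMENTS)
edges = cooccurrence_edges(DOCUMENTS, ENTITIES)
```

The resulting `edges` counter maps entity pairs to counts, so it could be turned into `(source, target, weight)` triples and loaded into a NetworkX graph with `Graph.add_weighted_edges_from` for drawing. Matching entities by plain substring is a simplification chosen for brevity.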
