## Text Analysis of the Indian Claims Commission Decisions

The Indian Claims Commission was a legal body that adjudicated hundreds of claims that Indian Tribes had against the United States for past wrongs. It produced 43 volumes of decisions over more than 30 years of work. Though the ICC tried cases to legal standards, it was of its time and reflected changing attitudes towards Native Americans. This work attempts to examine its place in Federal-Indian policy and analyze how the Commission used historical knowledge to arrive at legal decisions. It is also a case study in using text mining to explore a large corpus (n=100%) of legal documents computationally.

This analysis collected the Decisions from Oklahoma State University, then performed OCR on the PDFs using tesseract and Lincoln Mullen's make recipe from Civil-Procedure-Codes.

Update: Oklahoma State (OKstate) has since changed its ICC file structure, which no longer allows downloading with wget. The "data" for the analysis is the raw txt files of the ICC decisions in "text." The "code" is run with topic.r only. Skip running load.r (which cleans previously OCR'd text) and table.r (an experiment in parsing HTML tables to provide decisions with metadata). Other files are for configuration (e.g. the topic modeling stoplist is icc.txt).

I used the Makefile to perform each of the tasks: download, collect PDFs, OCR, and collect tables/plaintiff tribes. I'd highly recommend running the OCR in parallel using `make ocr -j2`.
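For readers without the original Makefile at hand, here is a minimal shell sketch of what a per-file OCR step could look like. It is not the recipe from Civil-Procedure-Codes. The `pdf/` input directory, the function names, and the use of `pdftoppm` to rasterize pages before calling tesseract are all assumptions made for illustration; only the `text/` output directory comes from this README.

```shell
#!/bin/sh
# Hypothetical layout: PDFs live in pdf/, OCR output goes to text/.
# This is an illustrative sketch, not the project's actual make recipe.

# Map an input PDF path to its output text path,
# e.g. pdf/vol01.pdf -> text/vol01.txt
txt_path_for() {
  printf 'text/%s.txt\n' "$(basename "$1" .pdf)"
}

# OCR a single PDF: rasterize each page, run tesseract on every page
# image, and concatenate the page text into one file.
ocr_one() {
  pdf=$1
  out=$(txt_path_for "$pdf")

  # Skip files that already have non-empty OCR output.
  [ -s "$out" ] && return 0

  mkdir -p text || return 1
  tmp=$(mktemp -d) || return 1

  # Assumption: pdftoppm (poppler-utils) is installed; 300 dpi PNG pages.
  pdftoppm -r 300 -png "$pdf" "$tmp/page" || { rm -rf "$tmp"; return 1; }

  # Write to a temporary name first so an interrupted run
  # does not leave a partial file that later gets skipped.
  for img in "$tmp"/page*.png; do
    tesseract "$img" stdout
  done > "$out.part" && mv "$out.part" "$out"

  rm -rf "$tmp"
}

# Example use (sequential):
#   for f in pdf/*.pdf; do ocr_one "$f"; done
```

Because each PDF is processed independently, a make rule built on a step like this can be run by several jobs at once, which is why `-j2` speeds up the OCR target.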

The rest of the work is various R scripts that process and analyze the textual data. Use load.r and topic.r to perform the work. table.r is a script that collects the plaintiff tribe names for the stoplist. Best practice is to use the curated stoplist on GitHub, as manual changes have been made to it.

It was originally created as a final class project for CLIO3: Hist 698 at George Mason University.

Visualizations at Petercarrjones.com
