# Supreme Court Opinion Readability
## Explanatory Notebook
Nathan Losee and Kolby Bray

### Supreme Court Opinions Breakdown

The Supreme Court is the highest court in the United States. It has the final say on the appeals it agrees to hear and communicates its rulings in written Opinions. The Opinion of the Court states the Court's ruling and the reasoning behind it.

We will explore how Supreme Court Opinions are structured as we walk through the trends in the data we collected. First, some context about readability.


### Readability

Readability is how easy a text is to read. There are several ways to measure it; for this project, we used two formulas: the Flesch Reading Ease (FRE) formula and the Flesch-Kincaid Grade Level (F-K) formula. Let's break them down.

#### FRE Readability:
This formula estimates how easy a passage is to read. Higher scores indicate greater ease of reading; lower scores indicate more difficulty. Children's books typically score in the 90-100 range, while an incredibly dense and jargon-filled scientific journal would score around 20.

#### FRE Readability Formula:

![bd4916e193d2f96fa3b74ee258aaa6fe242e110e.svg](attachment:bd4916e193d2f96fa3b74ee258aaa6fe242e110e.svg)
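
In case the image above does not render, here is the standard published form of the Flesch Reading Ease formula in LaTeX, followed by a worked example. The passage in the example is hypothetical; its counts are chosen for easy arithmetic and are not drawn from our data.

$$
\text{FRE} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)
$$

For a passage of 100 words, 5 sentences, and 150 syllables (20 words per sentence, 1.5 syllables per word):

$$
\text{FRE} = 206.835 - 1.015(20) - 84.6(1.5) = 206.835 - 20.3 - 126.9 = 59.635
$$

Longer sentences and longer words both lower the score.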

#### F-K Readability:

This formula estimates the U.S. school grade level needed to read a passage. For example, a passage written for a 5th grader would receive a score of about 5.

#### F-K Readability Formula:

![8e68f5fc959d052d1123b85758065afecc4150c3.svg](attachment:8e68f5fc959d052d1123b85758065afecc4150c3.svg)
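
In case the image above does not render, here is the standard published form of the Flesch-Kincaid Grade Level formula in LaTeX, followed by a worked example. The passage in the example is hypothetical; its counts are chosen for easy arithmetic and are not drawn from our data.

$$
\text{F-K} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59
$$

For a passage of 100 words, 5 sentences, and 150 syllables (20 words per sentence, 1.5 syllables per word):

$$
\text{F-K} = 0.39(20) + 11.8(1.5) - 15.59 = 7.8 + 17.7 - 15.59 = 9.91
$$

Here longer sentences and longer words both raise the score, so this passage lands at roughly a 10th-grade level.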

### Data Description

Our data comes from the raw text of the Court's official opinions, taken from the official Supreme Court website, https://www.supremecourt.gov/

Each Opinion was run through the following code, and the resulting scores were added to a dataset.

In [1]:
# Code for getting FRE and F-K Scores

from pypdf import PdfReader
import textstat

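# Load the PDF (replace the placeholder with the path to an opinion PDF)
pdf_path = "filepath_to_pdf"

# Open the PDF with pypdf
pdf_reader = PdfReader(pdf_path)

# Calculate the average per-page readability scores for a page range
# (page numbers are 1-indexed and inclusive)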
def calculate_readability_scores(start_page, end_page):
    total_fre_score = 0
    total_fk_score = 0
    num_pages = end_page - start_page + 1
    
    for page_num in range(start_page, end_page + 1):
        page_text = pdf_reader.pages[page_num - 1].extract_text()
        fre_score = textstat.flesch_reading_ease(page_text)
        fk_score = textstat.flesch_kincaid_grade(page_text)
        total_fre_score += fre_score
        total_fk_score += fk_score
    
    avg_fre_score = total_fre_score / num_pages
    avg_fk_score = total_fk_score / num_pages
    
    return avg_fre_score, avg_fk_score

# Start and end page of the section (1-indexed, inclusive).
# Placeholder values: replace them with the page range of the opinion section.
start_page = 1
end_page = 1

fre_score, fk_score = calculate_readability_scores(start_page, end_page)
print("Average Flesch Reading Ease (FRE) score:", fre_score)
print("Average Flesch-Kincaid Grade Level (F-K) score:", fk_score)
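
The step of adding scores to a dataset is not shown in this notebook. As a minimal sketch of one way to do it, the function below appends one row per opinion to a CSV file. The column names in `FIELDNAMES` and the helper `append_scores` are assumptions made for illustration, not the schema we actually used.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the notebook does not specify the dataset's schema.
FIELDNAMES = ["case_name", "justice", "opinion_type", "year", "fre_score", "fk_score"]


def append_scores(csv_path, row):
    """Append one opinion's scores to a CSV file, writing the header if the file is new."""
    path = Path(csv_path)
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDNAMES)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Calling `append_scores` once per opinion, with a dictionary keyed by the column names, builds up the file incrementally, so a failed PDF does not lose the scores already recorded.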

### Data Exploration

First, we decided to get an overall view of readability over time.

In [None]:
# Code for first chart of Average Readability over Time
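
The charting code is not included in this cell. As a rough sketch of the aggregation behind an average-over-time chart, the example below groups scores by year and takes the mean. The rows, the column names, and the helper `average_by_year` are made up for illustration and are not our actual data.

```python
from collections import defaultdict
from statistics import mean

# Made-up example rows; the real dataset and its column names are not shown in this notebook.
rows = [
    {"year": 2019, "fre_score": 48.0, "fk_score": 12.0},
    {"year": 2019, "fre_score": 52.0, "fk_score": 11.0},
    {"year": 2020, "fre_score": 45.0, "fk_score": 13.0},
]


def average_by_year(rows, score_key):
    """Return {year: mean score} for the given score column, ordered by year."""
    scores_by_year = defaultdict(list)
    for row in rows:
        scores_by_year[row["year"]].append(row[score_key])
    return {year: mean(scores) for year, scores in sorted(scores_by_year.items())}


fre_by_year = average_by_year(rows, "fre_score")
fk_by_year = average_by_year(rows, "fk_score")
print(fre_by_year)
print(fk_by_year)
```

Each dictionary maps a year to one average, so the two can be drawn as lines against year with any plotting library.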

Remember: a high FRE score means a passage is easier to read, while a high F-K score means it is more difficult to read.

Supreme Court decisions are often complex, multifaceted documents, designed to express not only the opinion of the Court but also the varied opinions of all justices who wish to comment.  While the first opinion is the decision of the Court and is authored by one justice, other justices can write opinions to further explain their thoughts on a case.  The only limit to the number of opinions on a case is how many justices wish to write them.  Here’s a view of readability by justice.

In [None]:
# Code for Individual Justice over Time

After the opinion of the court, other justices can specify whether or not they agree with the ruling of the majority.  This is done by labeling their opinion as Concurring or Dissenting.  We thought it would be interesting to investigate what patterns might emerge if we separated a justice’s readability by their Concurring and Dissenting opinions.

In [None]:
# Code for violin plots, Concurring and Dissenting over Time
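
The plotting code is not included in this cell. A violin plot needs the full distribution of scores for each category, not just an average, so a sketch of the preparation step looks like the following. The rows, justice labels, column names, and the helper `scores_by_opinion_type` are placeholders for illustration.

```python
from collections import defaultdict
from statistics import median

# Made-up example rows; justice labels, column names, and scores are placeholders.
rows = [
    {"justice": "Justice A", "opinion_type": "Concurring", "fk_score": 11.0},
    {"justice": "Justice A", "opinion_type": "Dissenting", "fk_score": 13.0},
    {"justice": "Justice A", "opinion_type": "Dissenting", "fk_score": 14.0},
    {"justice": "Justice B", "opinion_type": "Concurring", "fk_score": 12.0},
]


def scores_by_opinion_type(rows, justice, score_key):
    """Return {opinion_type: [scores]} for one justice."""
    grouped = defaultdict(list)
    for row in rows:
        if row["justice"] == justice:
            grouped[row["opinion_type"]].append(row[score_key])
    return dict(grouped)


groups = scores_by_opinion_type(rows, "Justice A", "fk_score")
for opinion_type, scores in groups.items():
    print(opinion_type, "n =", len(scores), "median =", median(scores))
```

Each list in `groups` is one distribution. A sequence of such lists is the input shape that violin plot functions, such as matplotlib's `Axes.violinplot`, generally accept.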

Finally, we wondered if the controversy of a case would influence the complexity of writing in a justice’s opinion.  For an internal measure of controversy we decided on (insert here).  We then created a view of 

In [None]:
# Code for Controversial Flag, violin plots and bar chart

### Summary

The Supreme Court