# Supreme Court Opinion Readability
## Explanatory Notebook
Nathan Losee and Kolby Bray

### Supreme Court Opinions Breakdown

The Supreme Court is the highest court in the United States.  It has the final say on the appeals it hears and communicates its rulings in the form of Opinions.  These decisions affect all US citizens and the decisions made in every other court in the country.  As such, it's very important that these documents are clear, concise, and not open to misinterpretation.  In short, these decisions should be easy to read.

We decided to investigate the readability of Supreme Court Decisions.  While we do so, we'll explain the structure of the Opinions as well.  Let's explore the trends of the data we've collected, but before that: some context about readability.


### Readability

Readability is literally how easy something is to read.  There are different ways to measure it, but for this project, we measured readability using two formulas: the Flesch Reading Ease formula and the Flesch-Kincaid Grade formula.  Let's break them down.

#### FRE Readability:
This formula scores how easy a passage is to read.  Higher scores indicate easier reading; lower scores indicate more difficulty.  Children's books typically score in the 90-100 range, while an incredibly dense and jargon-filled scientific journal would score around 20.

#### FRE Readability Formula:

$$\text{FRE} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)$$

#### F-K Readability:

This formula determines the grade level necessary to read a certain passage.  So, a passage designed for a 5th grader would receive a score of 5.

#### F-K Readability Formula:

$$\text{F-K} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$
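
As a quick illustration, both formulas can be computed directly from three counts: words, sentences, and syllables.  The sketch below uses the standard published constants; the function names and the example counts are made up for illustration, and a library such as `textstat` counts syllables with its own heuristics, so its scores on real text may differ slightly.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts (higher = easier to read)."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level from raw counts (higher = harder to read)."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage: 100 words, 5 sentences, 150 syllables
fre = flesch_reading_ease(100, 5, 150)
fk = flesch_kincaid_grade(100, 5, 150)
print(f"FRE: {fre:.1f}, F-K: {fk:.1f}")
```

With 20 words per sentence and 1.5 syllables per word, this hypothetical passage lands near an FRE of 60 and an F-K grade of about 10.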

### Data Description

Our data was taken raw from the official opinions of the Court, as published on the official Supreme Court website, https://www.supremecourt.gov/

Each Opinion was run through the following code, and the resulting scores were added to a dataset.  We loaded in an example opinion from the Case: EVENWEL et al. v. ABBOTT, GOVERNOR OF TEXAS, et al. 

In [1]:
# Code for getting FRE and F-K scores
# Requires pypdf and textstat (pip install pypdf textstat)
# This cell only shows how the data was gathered; if you can't run it, just move on to the next cell
from pypdf import PdfReader
import textstat

pdf_path = "2015_2_done.pdf"
pdf_reader = PdfReader(pdf_path)

#FRE and F-K Function
def calculate_readability_scores(start_page, end_page):
    total_fre_score = 0
    total_fk_score = 0
    num_pages = end_page - start_page + 1
    
    for page_num in range(start_page, end_page + 1):
        page_text = pdf_reader.pages[page_num - 1].extract_text()
        fre_score = textstat.flesch_reading_ease(page_text)
        fk_score = textstat.flesch_kincaid_grade(page_text)
        total_fre_score += fre_score
        total_fk_score += fk_score
    
    avg_fre_score = total_fre_score / num_pages
    avg_fk_score = total_fk_score / num_pages
    
    return avg_fre_score, avg_fk_score

#Start and End page of the section
start_page = 123
end_page = 139

fre_score, fk_score = calculate_readability_scores(start_page, end_page)
print("Average Flesch Reading Ease (FRE) score:", fre_score)
print("Average Flesch-Kincaid Grade Level (F-K) score:", fk_score)

Average Flesch Reading Ease (FRE) score: 47.67
Average Flesch-Kincaid Grade Level (F-K) score: 10.982352941176472


In [2]:
#Data Transformation and preparation

import pandas as pd

# Read the Excel file into a DataFrame
df = pd.read_excel('wrangled_reading_score_data.xlsx')

# Identify all columns that contain "Opinion" but neither "Author" nor "Type" in their names
opinion_columns = [col for col in df.columns if 'Opinion' in col and 'Author' not in col and 'Type' not in col]
author_columns = [col for col in df.columns if 'Author' in col]

# Convert these columns to floats
for col in opinion_columns:
    df[col] = pd.to_numeric(df[col], errors='coerce')
# Convert author columns to strings
for col in author_columns:
    df[col] = df[col].astype(str)
    df[col] = df[col].str.replace('’', "'")
    df[col] = df[col].str.replace(r'\bGinsbur\b', 'Ginsburg', regex=True)

# Convert the remaining score columns to floats
df['Syllabus F-K Score'] = pd.to_numeric(df['Syllabus F-K Score'], errors='coerce')
df['Syllabus FRE Score'] = pd.to_numeric(df['Syllabus FRE Score'], errors='coerce')

# Fix misspellings in author columns and strip surrounding whitespace
for col in author_columns:
    df[col] = (
        df[col]
        .str.replace('Justuce', 'Justice')
        .str.replace('Justicw', 'Justice')
        .str.replace('Ginsberg', 'Ginsburg')  # standardize on the correct spelling
        .str.strip()
    )

In [3]:
# Identify all columns that contain "Opinion" and "Type" in their names
opinion_type_columns = [col for col in df.columns if 'Opinion' in col and 'Type' in col]
for col in opinion_type_columns:
    df[col] = df[col].astype(str)
df['Opinion 2 Type'].unique()

# Create the "Controversial" flag
df['Controversial'] = df[opinion_type_columns].apply(lambda row: (row == 'dissenting').sum() >= 2, axis=1).astype(int)

In [4]:
### Combine all opinions and authors into respective columns
# Concatenate data for all opinions
opinion_columns = [col for col in df.columns if 'Opinion' in col and 'Score' in col]
author_columns = [col for col in df.columns if 'Opinion' in col and 'Author(s)' in col]
# Ensure the Year column remains throughout this transformation
long_df = pd.DataFrame()
for i in range(1, 10):  # Assuming there are up to 9 opinions
    fre_col = f'Opinion {i} FRE Score'
    fk_col = f'Opinion {i} F-K Score'
    author_col = f'Opinion {i} Author(s)'
    if fre_col in df.columns and fk_col in df.columns and author_col in df.columns:
        temp_df = df[['Year', fre_col, fk_col, author_col, 'Controversial']].copy()
        temp_df.columns = ['Year', 'FRE Score', 'F-K Score', 'Author(s)', 'Controversial']
        temp_df['Opinion'] = f'Opinion {i}'
        long_df = pd.concat([long_df, temp_df], ignore_index=True)


# If multiple authors are attributed to an opinion, only keep the first author
long_df['Author(s)'] = long_df['Author(s)'].str.split(',').str[0]
long_df['Author(s)'] = long_df['Author(s)'].apply(lambda x: x.split(' and ')[0])
#Drop all null authors
long_df = long_df.dropna()

### Data Exploration

First, we decided to get an overall view of readability over time.  We used average readability and the year the decisions were made to create a simple initial overview of what readability looks like in general.

In [5]:
# Code for first chart of Average Readability over Time
# Compute the yearly mean F-K and FRE scores
import pandas as pd
import altair as alt

avg_scores = long_df.groupby('Year').agg({'F-K Score': 'mean', 'FRE Score': 'mean'}).reset_index()

# Create the nearest selection
nearest = alt.selection_point(on='mouseover', nearest=True, empty=False, fields=['Year'])

# Calculate the y-axis range for F-K scores
fk_min = avg_scores['F-K Score'].min()
fk_max = avg_scores['F-K Score'].max()
fk_range = [fk_min - (fk_max - fk_min) * 0.1, fk_max + (fk_max - fk_min) * 0.1]

# Create line chart for F-K scores
fk_line = alt.Chart(avg_scores).mark_line(strokeWidth=3).encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('F-K Score:Q', title='Average F-K Score', scale=alt.Scale(domain=fk_range))
)

# Create scatter plot for F-K scores with points
fk_points = alt.Chart(avg_scores).mark_point(size=100).encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('F-K Score:Q', title='Average F-K Score', scale=alt.Scale(domain=fk_range)),
    tooltip=[
        alt.Tooltip('Year:O', title='Year'),
        alt.Tooltip('F-K Score:Q', title='Average F-K Score')
    ],
    opacity=alt.condition(nearest, alt.value(1), alt.value(0.5))
).add_params(
    nearest
)

# Combine the line chart and points for F-K chart
fk_chart = alt.layer(fk_line, fk_points).properties(
    title='Average F-K Score per Year',
    width=700,
    height=400
)

# Calculate the y-axis range for FRE scores
fre_min = avg_scores['FRE Score'].min()
fre_max = avg_scores['FRE Score'].max()
fre_range = [fre_min - (fre_max - fre_min) * 0.1, fre_max + (fre_max - fre_min) * 0.1]

# Create line chart for FRE scores
fre_line = alt.Chart(avg_scores).mark_line(strokeWidth=3).encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('FRE Score:Q', title='Average FRE Score', scale=alt.Scale(domain=fre_range))
)

# Create scatter plot for FRE scores with points
fre_points = alt.Chart(avg_scores).mark_point(size=100).encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('FRE Score:Q', title='Average FRE Score', scale=alt.Scale(domain=fre_range)),
    tooltip=[
        alt.Tooltip('Year:O', title='Year'),
        alt.Tooltip('FRE Score:Q', title='Average FRE Score')
    ],
    opacity=alt.condition(nearest, alt.value(1), alt.value(0.5))
).add_params(
    nearest
)

# Combine
fre_chart = alt.layer(fre_line, fre_points).properties(
    title='Average FRE Score per Year',
    width=700,
    height=400
)

# Create an annotation for the year 2005
annotation_fk = alt.Chart(pd.DataFrame({'Year': [2005], 'F-K Score': [avg_scores.loc[avg_scores['Year'] == 2005, 'F-K Score'].values[0]]})).mark_text(
    align='left', dx=-300, dy=0, fontSize=12, fontWeight='bold', text='Chief Justice change'
)

annotation_fre = alt.Chart(pd.DataFrame({'Year': [2005], 'FRE Score': [avg_scores.loc[avg_scores['Year'] == 2005, 'FRE Score'].values[0]]})).mark_text(
    align='left', dx=-300, dy=-185, fontSize=12, fontWeight='bold', text='Chief Justice change'
)

# Add a red rule marking 2005 in the F-K chart
arrow_fk = alt.Chart(pd.DataFrame({'Year': [2005], 'F-K Score': [avg_scores.loc[avg_scores['Year'] == 2005, 'F-K Score'].values[0]]})).mark_rule(color='red').encode(
    x='Year:O',
    y='F-K Score:Q'
)

# Add a red rule marking 2005 in the FRE chart
arrow_fre = alt.Chart(pd.DataFrame({'Year': [2005], 'FRE Score': [avg_scores.loc[avg_scores['Year'] == 2005, 'FRE Score'].values[0]]})).mark_rule(color='red').encode(
    x='Year:O',
    y='FRE Score:Q'
)

# Combine the annotation and arrows with the existing charts
fre_chart = alt.layer(fre_line, fre_points, arrow_fre, annotation_fre).properties(
    title='Average FRE Score per Year',
    width=700,
    height=400
)

fk_chart = alt.layer(fk_line, fk_points, arrow_fk, annotation_fk).properties(
    title='Average F-K Score per Year',
    width=700,
    height=400
)

# Display
fre_chart & fk_chart



Interesting stuff.  FRE scores sat in difficult territory but were trending easier until Chief Justice Rehnquist died in 2005.  After Chief Justice Roberts took over, FRE scores swung sharply towards the more difficult end, but they have since been trending a little easier.

F-K scores have some heavy dips and spikes across the years but have generally trended towards more difficult.  But wait, shouldn't FRE score trends and F-K score trends match?  Why the difference?

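The most likely reason is sensitivity.  Both formulas use the same two inputs, words per sentence and syllables per word, but weight them differently.  Relative to sentence length, FRE leans more heavily on syllables per word (84.6 vs. 1.015), while F-K gives sentence length comparatively more weight (11.8 vs. 0.39).  In the case of F-K score variation, the spikes and dips are therefore more likely to reflect changes in sentence length than in word choice.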
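
One rough way to compare how the two formulas respond to their inputs is to look at the ratio of their coefficients.  The sketch below uses the standard published constants of the two formulas; the variable names are illustrative and not part of our pipeline.

```python
# Published coefficients of the two formulas
FRE_SENTENCE_COEF = 1.015   # FRE points lost per extra word per sentence
FRE_SYLLABLE_COEF = 84.6    # FRE points lost per extra syllable per word
FK_SENTENCE_COEF = 0.39     # grade levels gained per extra word per sentence
FK_SYLLABLE_COEF = 11.8     # grade levels gained per extra syllable per word

# How many extra words per sentence move the score as much as
# one extra syllable per word?
fre_ratio = FRE_SYLLABLE_COEF / FRE_SENTENCE_COEF
fk_ratio = FK_SYLLABLE_COEF / FK_SENTENCE_COEF

print(f"FRE: 1 syllable/word is worth about {fre_ratio:.0f} words/sentence")
print(f"F-K: 1 syllable/word is worth about {fk_ratio:.0f} words/sentence")
```

A larger ratio means syllables per word dominate the score; a smaller ratio means sentence length carries comparatively more weight.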

These trends are fun to look at, but how do they compare to reading we're actually familiar with?  We decided to add a few more scores for context.

In [8]:
# Read the Excel file into a DataFrame
additional_scores = pd.read_excel('Other Scores.xlsx')
# Remove the first row in the dataset
additional_scores = additional_scores.iloc[1:]
# Print out the data
print(additional_scores)

                          Source        FRE        F-K
1               King James Bible  73.975645  10.301877
2                   Harry Potter  83.314779   4.786345
3  Average Citizen Reading Level  70.000000   6.000000


In [9]:
# Update the y-axis ranges to include additional scores
fk_min = min(avg_scores['F-K Score'].min(), additional_scores['F-K'].min())
fk_max = max(avg_scores['F-K Score'].max(), additional_scores['F-K'].max())
fk_range = [fk_min - (fk_max - fk_min) * 0.1, fk_max + (fk_max - fk_min) * 0.1]

fre_min = min(avg_scores['FRE Score'].min(), additional_scores['FRE'].min())
fre_max = max(avg_scores['FRE Score'].max(), additional_scores['FRE'].max())
fre_range = [fre_min - (fre_max - fre_min) * 0.1, fre_max + (fre_max - fre_min) * 0.1]

# Create horizontal lines for F-K scores with labels
fk_lines = alt.Chart(additional_scores).mark_rule(color='green', strokeWidth=3).encode(
    y=alt.Y('F-K:Q', title='Average F-K Score', scale=alt.Scale(domain=fk_range)),
    tooltip=[
        alt.Tooltip('Source:N', title='Source'),
        alt.Tooltip('F-K:Q', title='F-K Score')
    ]
)

fk_labels = alt.Chart(additional_scores).mark_text(
    align='left', dx=5, dy=-5, color='green'
).encode(
    y=alt.Y('F-K:Q', title='Average F-K Score', scale=alt.Scale(domain=fk_range)),
    text='Source:N'
)

# Create horizontal lines for FRE scores with labels
fre_lines = alt.Chart(additional_scores).mark_rule(color='green', strokeWidth=3).encode(
    y=alt.Y('FRE:Q', title='Average FRE Score', scale=alt.Scale(domain=fre_range)),
    tooltip=[
        alt.Tooltip('Source:N', title='Source'),
        alt.Tooltip('FRE:Q', title='FRE Score')
    ]
)

fre_labels = alt.Chart(additional_scores).mark_text(
    align='left', dx=5, dy=-5, color='green'
).encode(
    y=alt.Y('FRE:Q', title='Average FRE Score', scale=alt.Scale(domain=fre_range)),
    text='Source:N'
)

# Combine the horizontal lines and labels with the existing FRE chart
fre_chart = alt.layer(fre_line, fre_points, fre_lines, fre_labels).properties(
    title='Average FRE Score per Year',
    width=700,
    height=400
)

# Combine the horizontal lines and labels with the existing F-K chart
fk_chart = alt.layer(fk_line, fk_points, fk_lines, fk_labels).properties(
    title='Average F-K Score per Year',
    width=700,
    height=400
)

# Display the combined charts
fre_chart & fk_chart



Remember, high FRE scores are easy to read while high F-K scores are more difficult to read.

This is kind of crazy!  An average US citizen reads at roughly a 6th to 7th grade level (the reference line above uses F-K score: 6, FRE score: 70).  According to both FRE score and F-K score, the yearly average Supreme Court opinion has not been easy enough for the average US citizen to read in any of the past 23 years.

We decided it was time to get a little more granular.  Supreme Court Decisions are not single-author documents that are all impossible to read.  They have multiple sections, allowing multiple justices to weigh in on the case at hand.  While the first opinion is the decision of the court and is authored by one justice, other justices can write opinions to further explain their thoughts on a case.  The only limit to the number of opinions on a case is how many justices wish to write them.  Here’s a view of readability by justice.

In [10]:
# Code for Individual Justice over Time

#Average FRE and F-K scores per year per author
single_avg = (
    long_df.groupby(['Year', 'Author(s)'])
    .agg(
        FRE_Score=('FRE Score', 'mean'),
        FK_Score=('F-K Score', 'mean')
    )
    .reset_index()
)

justice_selection = alt.selection_point(fields=['Author(s)'])

#Nearest tooltip
nearest = alt.selection_point(
    nearest=True,
    on='mouseover',
    fields=['Year', 'Author(s)'],
    empty=False
)

#Highlight
highlight_color = alt.condition(
    justice_selection,
    alt.Color('Author(s):N', legend=None, scale=alt.Scale(scheme='category10')),
    alt.value('lightgray')
)

#FRE Chart
line_chart_fre = alt.Chart(single_avg).mark_line().encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('FRE_Score:Q', title='FRE Score', scale=alt.Scale(domain=[35, 75])),
    color=highlight_color
).properties(
    title='Line Chart of FRE Scores by Author Over Time',
    width=600,
    height=400
).add_params(justice_selection)

#FRE tooltips
fre_tooltip = alt.Chart(single_avg).mark_point(size=50, opacity=0).encode(
    x='Year:O',
    y='FRE_Score:Q',
    tooltip=[
        'Year:O', 
        'FRE_Score:Q', 
        'Author(s):N'
    ]
).add_params(nearest)

fre_combined_chart = line_chart_fre + fre_tooltip

#FRE legend
fre_legend = alt.Chart(single_avg).mark_point().encode(
    y=alt.Y('Author(s):N', axis=alt.Axis(title='Author(s)', labelLimit=100)),
    color=highlight_color
).properties(
    width=50
).add_params(justice_selection)

#F-K Chart
line_chart_fk = alt.Chart(single_avg).mark_line().encode(
    x=alt.X('Year:O', title='Year'),
    y=alt.Y('FK_Score:Q', title='F-K Score', scale=alt.Scale(domain=[4, 12])),
    color=highlight_color
).properties(
    title='Line Chart of F-K Scores by Author Over Time',
    width=600,
    height=400
).add_params(justice_selection)

#F-K tooltip
fk_tooltip = alt.Chart(single_avg).mark_point(size=50, opacity=0).encode(
    x='Year:O',
    y='FK_Score:Q',
    tooltip=[
        'Year:O', 
        'FK_Score:Q', 
        'Author(s):N'
    ]
).add_params(nearest)

fk_combined_chart = line_chart_fk + fk_tooltip

#F-K legend
fk_legend = alt.Chart(single_avg).mark_point().encode(
    y=alt.Y('Author(s):N', axis=alt.Axis(title='Author(s)', labelLimit=100)),
    color=highlight_color
).properties(
    width=50
).add_params(justice_selection)

#Combine
fre_combined = alt.hconcat(fre_combined_chart, fre_legend).resolve_scale(color='independent')
fk_combined = alt.hconcat(fk_combined_chart, fk_legend).resolve_scale(color='independent')

line_charts = alt.vconcat(fre_combined, fk_combined).configure_view(
    stroke=None
)

#Display
line_charts



Even when broken down by justice, a justice's yearly average has very rarely fallen within the reading level of an average American.  According to FRE score, Justice Scalia was the only justice to do so, with an average score above 70 just once, in 2005.  According to F-K score, more opinions have been readable, but the likelihood of a Supreme Court decision being accessible to the average American is still very low.

We wondered what might affect the complexity of opinion writing.  In a Supreme Court Decision, the opinions that follow the opinion of the court let the other justices specify whether or not they agree with the ruling of the majority.  This is done by labeling their opinion as Concurring or Dissenting.  We wondered if Dissenting or Concurring opinions might be more or less complex.  Here's a view of justice readability by their Concurring and Dissenting opinions.

In [11]:
### Combine all opinions and authors into respective columns
# Concatenate data for all opinions
opinion_columns = [col for col in df.columns if 'Opinion' in col and 'Score' in col]
author_columns = [col for col in df.columns if 'Opinion' in col and 'Author(s)' in col]
type_columns = [col for col in df.columns if 'Opinion' in col and 'Type' in col]
# Ensure the Year column remains throughout this transformation
long_df_2 = pd.DataFrame()
for i in range(1, 10):  # Assuming there are up to 9 opinions
    fre_col = f'Opinion {i} FRE Score'
    fk_col = f'Opinion {i} F-K Score'
    author_col = f'Opinion {i} Author(s)'
    type_col = f'Opinion {i} Type'
    if fre_col in df.columns and fk_col in df.columns and author_col in df.columns and type_col in df.columns:
        temp_df = df[['Year', fre_col, fk_col, author_col, type_col, 'Controversial']].copy()
        temp_df.columns = ['Year', 'FRE Score', 'F-K Score', 'Author(s)', 'Type', 'Controversial']
        temp_df['Opinion'] = f'Opinion {i}'
        long_df_2 = pd.concat([long_df_2, temp_df], ignore_index=True)

long_df_2['Author(s)'] = long_df_2['Author(s)'].str.split(',').str[0]
long_df_2['Author(s)'] = long_df_2['Author(s)'].apply(lambda x: x.split(' and ')[0])

#Drop all null authors
long_df_2 = long_df_2.dropna()
# Update messy Types to 'concurring' or 'dissenting'
long_df_2['Type'] = long_df_2['Type'].replace({
    '_x000D_\nconcurring': 'concurring',
    '_x000D_\ndissenting': 'dissenting'
})

# Keep only opinions labeled exactly 'dissenting' or 'concurring'
# (this drops 'concurring and dissenting' and any unrecognized labels)
long_df_2 = long_df_2[long_df_2['Type'].isin(['dissenting', 'concurring'])]

# Verify the update
print(long_df_2['Type'].value_counts())


Type
dissenting    1074
concurring     966
Name: count, dtype: int64


In [12]:
import altair as alt

# Create the heatmap for FRE scores
heatmap_fre = alt.Chart(long_df_2).mark_rect().encode(
    x=alt.X('Author(s):N', title='Author', sort='ascending'),
    y=alt.Y('Type:N', title='Type'),
    color=alt.Color('mean(FRE Score):Q', title='Average FRE Score', scale=alt.Scale(range=['brown', 'lime'])),
    tooltip=[
        alt.Tooltip('Author(s):N', title='Author'),
        alt.Tooltip('Type:N', title='Type'),
        alt.Tooltip('mean(FRE Score):Q', title='Average FRE Score')
    ]
).properties(
    title='Heatmap of Average FRE Scores by Opinion Type and Author',
    width=800,
    height=400
).configure_axis(
    labelAngle=-45
)

# Create the heatmap for F-K scores
heatmap_fk = alt.Chart(long_df_2).mark_rect().encode(
    x=alt.X('Author(s):N', title='Author', sort='ascending'),
    y=alt.Y('Type:N', title='Type'),
    color=alt.Color('mean(F-K Score):Q', title='Average F-K Score', scale=alt.Scale(range=['lime', 'brown'])),
    tooltip=[
        alt.Tooltip('Author(s):N', title='Author'),
        alt.Tooltip('Type:N', title='Type'),
        alt.Tooltip('mean(F-K Score):Q', title='Average F-K Score')
    ]
).properties(
    title='Heatmap of Average F-K Scores by Opinion Type and Author',
    width=800,
    height=400
).configure_axis(
    labelAngle=-45
)

# Display the heatmaps
heatmap_fre.display()
heatmap_fk.display()



Interestingly, these visualizations show that, with a few exceptions, most justices don't have any notable differences in the complexity of their Concurring and Dissenting opinions.  Even the justices with demonstrable differences in opinion complexity barely stray into readable territory for the average American.

Finally, we wondered if the controversy of a case would influence the complexity of writing in a justice’s opinion.  For an internal measure of controversy we decided on cases that had at least two Dissenting opinions.  Here's a view of readability and controversy.
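
To make the rule concrete, here is a minimal sketch of the "at least two Dissenting opinions" flag on a tiny, made-up table.  The case names and values are hypothetical; only the column naming pattern mirrors our dataset.

```python
import pandas as pd

# Hypothetical toy data: one row per case, one column per opinion type
toy = pd.DataFrame({
    'Case': ['A', 'B', 'C'],
    'Opinion 2 Type': ['concurring', 'dissenting', 'dissenting'],
    'Opinion 3 Type': ['dissenting', 'dissenting', 'nan'],
})

type_cols = [c for c in toy.columns if 'Type' in c]

# Count the dissenting opinions in each row, then flag cases with two or more
dissent_counts = (toy[type_cols] == 'dissenting').sum(axis=1)
toy['Controversial'] = dissent_counts.ge(2).astype(int)

print(toy[['Case', 'Controversial']])
```

Only case B, with two dissents, is flagged as controversial; cases A and C each have a single dissent and are not.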

In [13]:
aggregated_fre = long_df.groupby(['Controversial'])['FRE Score'].agg(
    total_count='size',
    min_score='min',
    max_score='max',
    avg_score='mean'
).reset_index()

# Calculate summary statistics for Controversial and Non-Controversial cases (F-K)
aggregated_fk = long_df.groupby(['Controversial'])['F-K Score'].agg(
    total_count='size',
    min_score='min',
    max_score='max',
    avg_score='mean'
).reset_index()

# Create a violin plot for FRE scores by Controversial flag
violin_plot_controversial_fre = alt.Chart(long_df).transform_filter(
    alt.FieldOneOfPredicate(field='Controversial', oneOf=[0, 1])
).transform_density(
    density='FRE Score',
    as_=['FRE Score', 'density'],
    extent=[long_df['FRE Score'].min(), long_df['FRE Score'].max()],
    groupby=['Controversial']
).transform_lookup(
    lookup='Controversial',
    from_=alt.LookupData(aggregated_fre, 'Controversial', ['total_count', 'min_score', 'max_score', 'avg_score'])
).mark_area(orient='horizontal').encode(
    alt.X('density:Q')
        .stack('center')
        .impute(None)
        .title(None)
        .axis(labels=False, values=[0], grid=False, ticks=True),
    alt.Y('FRE Score:Q'),
    alt.Color('Controversial:N',
        scale=alt.Scale(domain=[0, 1], range=['#1f77b4', '#d62728']),  # Blue for 0, Red for 1
        legend=alt.Legend(
            title="Controversy",
            values=[0, 1],
            labelExpr="datum.value == 0 ? 'Not Controversial' : 'Controversial'"
        )
    ),
    alt.Column('Controversial:N')
        .spacing(10)
        .header(titleOrient='bottom', labelOrient='bottom', labelPadding=0),
    tooltip=[
        alt.Tooltip('total_count:Q', title='Total Count'),
        alt.Tooltip('min_score:Q', title='Min Score'),
        alt.Tooltip('max_score:Q', title='Max Score'),
        alt.Tooltip('avg_score:Q', title='Avg Score')
    ]
).properties(
    title='Violin Plot of FRE Scores by Controversial Flag',
    width=100,
    height=400
)

# Create a violin plot for F-K scores by Controversial flag
violin_plot_controversial_fk = alt.Chart(long_df).transform_filter(
    alt.FieldOneOfPredicate(field='Controversial', oneOf=[0, 1])
).transform_density(
    density='F-K Score',
    as_=['F-K Score', 'density'],
    extent=[long_df['F-K Score'].min(), long_df['F-K Score'].max()],
    groupby=['Controversial']
).transform_lookup(
    lookup='Controversial',
    from_=alt.LookupData(aggregated_fk, 'Controversial', ['total_count', 'min_score', 'max_score', 'avg_score'])
).mark_area(orient='horizontal').encode(
    alt.X('density:Q')
        .stack('center')
        .impute(None)
        .title(None)
        .axis(labels=False, values=[0], grid=False, ticks=True),
    alt.Y('F-K Score:Q'),
    alt.Color('Controversial:N',
        scale=alt.Scale(domain=[0, 1], range=['#1f77b4', '#d62728']),  # Blue for 0, Red for 1
        legend=alt.Legend(
            title="Controversy",
            values=[0, 1],
            labelExpr="datum.value == 0 ? 'Not Controversial' : 'Controversial'"
        )
    ),
    alt.Column('Controversial:N')
        .spacing(10)
        .header(titleOrient='bottom', labelOrient='bottom', labelPadding=0),
    tooltip=[
        alt.Tooltip('total_count:Q', title='Total Count'),
        alt.Tooltip('min_score:Q', title='Min Score'),
        alt.Tooltip('max_score:Q', title='Max Score'),
        alt.Tooltip('avg_score:Q', title='Avg Score')
    ]
).properties(
    title='Violin Plot of F-K Scores by Controversial Flag',
    width=100,
    height=400
)

# Concatenate the violin plots for controversial flag horizontally and configure the view at the top level
violin_plots_controversial = alt.hconcat(
    violin_plot_controversial_fre, 
    violin_plot_controversial_fk
).configure_view(
    stroke=None
)

# Display both visualizations side by side
violin_plots_controversial.display()



In [14]:
# Aggregate the data to get the total case count for each author, split by controversial and non-controversial cases
author_case_stats = long_df.groupby(['Author(s)', 'Controversial']).agg(
    case_count=('F-K Score', 'size'),
    mean_fk_score=('F-K Score', 'mean'),
    min_fk_score=('F-K Score', 'min'),
    max_fk_score=('F-K Score', 'max'),
    mean_fre_score=('FRE Score', 'mean'),
    min_fre_score=('FRE Score', 'min'),
    max_fre_score=('FRE Score', 'max')
).reset_index()

# Create the stacked bar chart
stacked_bar_chart = alt.Chart(author_case_stats).mark_bar().encode(
    x=alt.X('Author(s):N', title='Author', sort=alt.EncodingSortField(field='case_count', op='sum', order='descending')),
    y=alt.Y('case_count:Q', title='Total Case Count'),
    color=alt.Color('Controversial:N', title='Controversial', scale=alt.Scale(domain=[0, 1], range=['#1f77b4', '#d62728']),
        legend=alt.Legend(
            title="Controversy",
            values=[0, 1],
            labelExpr="datum.value == 0 ? 'Not Controversial' : 'Controversial'"
        )
    ),
    tooltip=[
        alt.Tooltip('Author(s):N', title='Author'),
        alt.Tooltip('case_count:Q', title='Case Count'),
        alt.Tooltip('Controversial:N', title='Controversial'),
        alt.Tooltip('mean_fk_score:Q', title='Mean F-K Score'),
        alt.Tooltip('min_fk_score:Q', title='Min F-K Score'),
        alt.Tooltip('max_fk_score:Q', title='Max F-K Score'),
        alt.Tooltip('mean_fre_score:Q', title='Mean FRE Score'),
        alt.Tooltip('min_fre_score:Q', title='Min FRE Score'),
        alt.Tooltip('max_fre_score:Q', title='Max FRE Score')
    ]
).properties(
    title='Total Case Count by Author (Stacked by Controversial and Non-Controversial Cases)',
    width=800,
    height=400
).configure_axis(
    labelAngle=-45
)

# Display the chart
stacked_bar_chart.display()



We found that readability isn't really affected by controversy.  Justices don't typically lean towards complexity or simplicity when writing concurring or dissenting opinions, and that trend holds even in the more contested cases.

### Summary

We've explored a lot about the readability of Supreme Court Decisions, but ultimately, why does it matter?

As the final appellate court, the Supreme Court holds a lot of power.  It has the jurisdiction to interpret the Constitution, the responsibility to protect the rights of the American people, and the duty to mediate controversy.  Ultimately, the Court should serve the American people.  As such, it doesn't make a ton of sense that the American people should be locked out of understanding the decisions of the Court because of inaccessible writing.

The public should be able to understand controversies surrounding their rights and should be able to understand when and why those rights have been protected by the Supreme Court.  All rulings by the Court should be accessible to the general public.

This project is a very small review of a small problem within the judicial branch of the American government, but a small change towards simpler writing could have a massive impact on empowering the American people and pushing them towards more active participation in government.