# Summary
This notebook provides a basic analysis of the comments, including example comments, the number of comments that come from templates, and the number of unique comments.

In [19]:
import pandas
import numpy as np
import folium
from folium import plugins

The comments are loaded in from the Data Cleanup notebook.

In [20]:
# dtype=False (the boolean, not the string 'false') disables dtype inference
data = pandas.read_json('./data/comments_cleaned.json', orient='records', dtype=False)

Below is a printout of the pandas DataFrame schema and the first comment.

In [39]:
print("Loaded %d comments." % len(data))
display(data[:1])

Loaded 12225 comments.


Unnamed: 0,doc.attachment_download,doc.attachment_download -href,doc.attachment_name,doc.category,doc.city,doc.comment_body,doc.country,doc.name,doc.state,doc.zip
0,,,,,United States,"Dear Assistant General Counsel Hilary Malawer,...",Parent/Relative,Heather Hirsch,MN,55016


A sample of the comment_body in the dataset.

In [22]:
with pandas.option_context('display.max_colwidth', 5000):
    display(data[['doc.comment_body']].sample(5))

Unnamed: 0,doc.comment_body
5791,"Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education civil rights regulations and guidance documents are important and necessary. Far from being burdensome, current civil rights rules and regulations benefit schools and students by providing a clear framework that, when followed, allow all students an equal opportunity to learn in a safe and welcoming environment regardless of sex, race, color, national origin, disability status, English proficiency, sexual orientation, or gender identity.\nI urge the Department to keep in its current form 34 C.F.R. pts. 1 thru 1299 , which include regulations governing the Secretary and the offices for Civil Rights; Elementary and Secondary Education; Special Education and Rehabilitative Services; Career, Technical, and Adult Education; Post-Secondary Education; Educational Research and Improvement; and the National Council on Disability. \n\nI also urge the Department to preserve all current significant guidance documents, including guidance on sexual, racial, and disability-based harassment (including guidance on sexual violence); access to athletic opportunities; gender equity in career and technical education; single-sex schools; equal access to educational resources; nondiscriminatory school discipline; racial diversity programs; the rights of students with disabilities in charter schools; restraint and seclusion of students with disabilities; and the rights of English language learners. I urge you to keep current regulations and guidance in place, and to continue enforcing these critical civil rights laws so that all students have an equal opportunity to learn and thrive.\n\nSincerely,\nMatthew Faulkner\n Shawnee, KS 66203"
6461,"Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education civil rights regulations and guidance documents are important and necessary. Far from being burdensome, current civil rights rules and regulations benefit schools and students by providing a clear framework that, when followed, allow all students an equal opportunity to learn in a safe and welcoming environment regardless of sex, race, color, national origin, disability status, English proficiency, sexual orientation, or gender identity.\nI urge the Department to keep in its current form 34 C.F.R. pts. 1 thru 1299 , which include regulations governing the Secretary and the offices for Civil Rights; Elementary and Secondary Education; Special Education and Rehabilitative Services; Career, Technical, and Adult Education; Post-Secondary Education; Educational Research and Improvement; and the National Council on Disability. \n\nI also urge the Department to preserve all current significant guidance documents, including guidance on sexual, racial, and disability-based harassment (including guidance on sexual violence); access to athletic opportunities; gender equity in career and technical education; single-sex schools; equal access to educational resources; nondiscriminatory school discipline; racial diversity programs; the rights of students with disabilities in charter schools; restraint and seclusion of students with disabilities; and the rights of English language learners. I urge you to keep current regulations and guidance in place, and to continue enforcing these critical civil rights laws so that all students have an equal opportunity to learn and thrive.\n\nSincerely,\nKathie Noga\n Minneapolis, MN 55404"
6358,"Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education civil rights regulations and guidance documents are important and necessary. Far from being burdensome, current civil rights rules and regulations benefit schools and students by providing a clear framework that, when followed, allow all students an equal opportunity to learn in a safe and welcoming environment regardless of sex, race, color, national origin, disability status, English proficiency, sexual orientation, or gender identity.\nI urge the Department to keep in its current form 34 C.F.R. pts. 1 thru 1299 , which include regulations governing the Secretary and the offices for Civil Rights; Elementary and Secondary Education; Special Education and Rehabilitative Services; Career, Technical, and Adult Education; Post-Secondary Education; Educational Research and Improvement; and the National Council on Disability. \n\nI also urge the Department to preserve all current significant guidance documents, including guidance on sexual, racial, and disability-based harassment (including guidance on sexual violence); access to athletic opportunities; gender equity in career and technical education; single-sex schools; equal access to educational resources; nondiscriminatory school discipline; racial diversity programs; the rights of students with disabilities in charter schools; restraint and seclusion of students with disabilities; and the rights of English language learners. I urge you to keep current regulations and guidance in place, and to continue enforcing these critical civil rights laws so that all students have an equal opportunity to learn and thrive.\n\nSincerely,\nSuzanne Goodenberger\n Council Bluffs, IA 51501"
11263,"We are requesting this modification to the list of Related Services so as to clarify the needs of children who are deafblind to have the services of a qualified intervener. \n \n- Including intervener services in the related services list will significantly increase the opportunities for children who are deafblind to have these services considered in their IEP process. \n \n- The use of interveners in educational settings continues to increase across the country. \n \n- Intervener services have been recognized by OSEP as a credible service delivery option for children who are deafblind under the IEP process.\n \n- States are hesitant to recognize the need for these services until they are recognized at a national level. \n \n- Due to the low incidence nature of deafblindness, the needs of these children to have a qualified intervener as part of FAPE are poorly understood.\n \n- Many local education agencies are not aware of what intervener services are and are hesitant to provide them.\n \n- Parents struggle to have these services considered in the IEP process because of the lack of awareness and understanding of their child's needs for these services.\n \nI have been an intervener for many years and it is a separate and necessary service."
8138,"I remember before Title IX - I was told I couldn't do things for no other reason than I was a girl. Sport opportunities were fewer, and less resourced, than the boys had. Vocational options in school were divided by gender. My daughters now have opportunities that I didn't have because of Title IX. Retain, and enforce Title IX as it makes a difference for everyone, not just girls, since it also requires protection against sexual harassment and gender related abuse. \n\nTitle IX has done great things for our society but inequities persist for our girls. The current rules and regulations of Title IX, paired with strong enforcement, are crucial to ensuring our country is able to fulfill its promise of an equitable educational experience for all students. Sports provide its participants with lifelong health, education and leadership benefits equally important to our daughters and our sons. I urge you to keep Title IX strong by preserving the current rules and enforcing this crucial legislation."


Below is a list of common strings used for pattern matching to see how many similar comments exist. The list is handpicked, and the matching is based on the approach used [here](https://github.com/j2kao/fcc_nn_research/blob/master/proc_17_108_analysis_01_level_0_manual_tagging.ipynb) for FCC comments.

In [23]:
common_strings = "Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education civil rights regulations and guidance documents are important and necessary"
common_string2 = "Dear Assistant General Counsel Hilary Malawer,\n\nCurrent federal regulations and guidance help all studentsregardless of sex, race, color, sexual orientation,"

In [35]:
dup_removed = data.copy()

# Find rows containing the common strings.
# regex=False matches the strings literally, so '.' is not treated as a regex wildcard.
invalid_indexes = []
invalid_indexes.extend(data[data['doc.comment_body'].str.contains(common_strings, case=False, na=False, regex=False)].index.values)
invalid_indexes.extend(data[data['doc.comment_body'].str.contains(common_string2, case=False, na=False, regex=False)].index.values)

dup_removed.drop(invalid_indexes, inplace=True)
dup_removed = dup_removed.reset_index(drop=True)

# Saving the dataset with duplicates removed for later use
dup_removed.to_json('./data/comments_duplicates_removed.json')

print("Total comments after removing duplicates: ", len(dup_removed))

Total comments after removing duplicates:  2868
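
The cell above reports only the combined total. As a minimal sketch, a per-template count could be computed along the following lines. The helper `count_template_matches`, the toy series, and the shortened template text are hypothetical and are not part of the original notebook.

```python
import pandas as pd

def count_template_matches(bodies, templates):
    """Return {template_name: number of bodies containing that template text}."""
    counts = {}
    for name, text in templates.items():
        # regex=False treats the template as a literal string, so '.' and
        # other punctuation are not interpreted as regex metacharacters.
        mask = bodies.str.contains(text, case=False, na=False, regex=False)
        counts[name] = int(mask.sum())
    return counts

# Toy data standing in for data['doc.comment_body'] (assumption, not real comments)
toy = pd.Series([
    "Dear Counsel,\n\nAll regulations are important and necessary. Sincerely, A",
    "Dear Counsel,\n\nAll regulations are important and necessary. Sincerely, B",
    "I remember before Title IX.",
    None,
])
toy_templates = {"template_1": "All regulations are important and necessary"}

template_counts = count_template_matches(toy, toy_templates)
print(template_counts)
```

With the real data, the same helper could be called with `data['doc.comment_body']` and a dictionary holding `common_strings` and `common_string2`.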


Another look at the dataset, now with the most common comments removed.

In [25]:
with pandas.option_context('display.max_colwidth', 5000):
    display(dup_removed[['doc.comment_body']].sample(50))

Unnamed: 0,doc.comment_body
7808,"Department of Education ED/OII, Completed Actions, Charter Schools Grants to SEAs, 1855-AA12""Gainful Employment"" rule to prepare students for employment in a recognized occupation Income-driven ""pay as you earn"" program, Race to the Top, Every Student Succeeds Act (ESSA)"
11022,"Subject; <WIOA> In accordance with Executive Order 13777, Enforcing the Regulatory Reform Agenda, the dept of Education is seeking input on regulations THAT MAY BE APPROPRIATE FOR REPEAL, REPLACEMENT, OR MODIFICATION.\n\n We request that the homemaker rules must be repealed because a person that has LOST their eye sight must have the chance to be trained to be Independent and be able to live on their own. The cost of this training is much less than having to put this person in a facility because they cant take care of themselves. The cost of this training would be a fraction to the Department of Rehabilitation compared to the cost to Medicare, Medi-cade, social security at an early age, or just their care being charged to one of the disability programs.\n\n This training is for maybe 40 weeks and then the person would be able to live independently for the REST of their LIFE. Wouldnt you agree that would be a better quality of life and much more of a better way to spend TAX PAYERS money than just putting then in a facility that they couldnt pay the fee to stay the for the rest of the life with no way to HAVE A CHANCE TO BE A PRODUCTIVE PERSON AGAIN. Just because they go thru the training as a HOMEMAKER is NOT the reason or mean things could change after they get this training for INDEPENDENCE \n\n We really need you to repeal order 13777 for the reasons I have stated plus many more reasons. Maybe they lost their sight serving our Country, State, or Community, or they lost it doing the job that took care of the family.\n\n Thank you Leonard A. Blottin President of the Board of the SAN DIEGO CENTER FOR THE BLIND. \n\n A non profit 501C-3 Organization.\n"
9116,protect victims of college campus sexual assault rather than pretend a serious problem doesn't exist!
4646,"Dear Assistant General Counsel Hilary Malawer,\n\nLet's stop selling everything to greedy people.\n\nAll Department of Education civil rights regulations and guidance documents are important and necessary. Far from being burdensome, current civil rights rules and regulations benefit schools and students by providing a clear framework that, when followed, allow all students an equal opportunity to learn in a safe and welcoming environment regardless of sex, race, color, national origin, disability status, English proficiency, sexual orientation, or gender identity.\nI urge the Department to keep in its current form 34 C.F.R. pts. 1 thru 1299 , which include regulations governing the Secretary and the offices for Civil Rights; Elementary and Secondary Education; Special Education and Rehabilitative Services; Career, Technical, and Adult Education; Post-Secondary Education; Educational Research and Improvement; and the National Council on Disability. \n\nI also urge the Department to preserve all current significant guidance documents, including guidance on sexual, racial, and disability-based harassment (including guidance on sexual violence); access to athletic opportunities; gender equity in career and technical education; single-sex schools; equal access to educational resources; nondiscriminatory school discipline; racial diversity programs; the rights of students with disabilities in charter schools; restraint and seclusion of students with disabilities; and the rights of English language learners. I urge you to keep current regulations and guidance in place, and to continue enforcing these critical civil rights laws so that all students have an equal opportunity to learn and thrive.\n\nSincerely,\nJohn Wilson\n Magdalena, NM 87825"
9725,"The mission of the Department of education is to educate the people of our land. Choosing to educate only some of the people does not fulfill the mission. Allowing people to get their own education does not fulfill the mission because some people are not able to get their own education and without help would not become educated.\n Therefore do not weaken, repeal, ignore or fail to enforce existing legislation which protects the civil rights of all Americans and makes it illegal to discriminate against those who otherwise would not receive the blessing of the education to which they are entitled."
10344,"Dear Assistant General Counsel Hilary Malawer,\nAll Department of Education civil rights regulations and guidance documents need to be remain. Current civil rights rules and regulations benefit schools, students, and families by providing a clear framework that allows all students an equal opportunity to learn in a safe and welcoming environment regardless of sex, race, color, national origin, disability status, English proficiency, sexual orientation, or gender identity.\nI urge the Department to keep in its current form 34 C.F.R. pts. 1 thru 1299 , which include regulations governing the Secretary and the offices for Civil Rights; Elementary and Secondary Education; Special Education and Rehabilitative Services; Career, Technical, and Adult Education; Post-Secondary Education; Educational Research and Improvement; and the National Council on Disability. \nI also urge the Department to preserve all current significant guidance documents, including guidance on sexual, racial, and disability-based harassment (including guidance on sexual violence); access to athletic opportunities; gender equity in career and technical education; single-sex schools; equal access to educational resources; nondiscriminatory school discipline; racial diversity programs; the rights of students with disabilities in charter schools; restraint and seclusion of students with disabilities; and the rights of English language learners. I urge you to keep current regulations and guidance in place, and to continue enforcing these critical civil rights laws so that all students have an equal opportunity to learn and thrive.\nAlso please include the following suggestions for keeping, modifying, or rescinding regulations in this document https://docs.google.com/.../1bLJ2uu9UePJJkjEdcM7H.../edit..."
8601,"I am commenting on Regulation ID: ED-2015-OSERS-001-1167, State Vocational Rehabilitation Services Program; State Supported Employment Services Program; Limitations on Use of Sub-Minimum Wage.\n\nCongress did not intend WIOA to take away a full of array of vocational services from individuals with disabilities, nor the opportunities that have been denied to people age 24 and under. This has been promulgated by a group pushing their narrow agenda at the expense of people with disabilities and their families. Please return the freedom to people with disabilities to select the vocational service that best serves them.\n\nOne size does not fit all!\n\nMargaret Winn"
8348,"Sexual violence is the most underreported, unconvicted violent crime in America. Title IX provides critical protections for students to continue to pursue their education while overcoming the trauma of sexual abuse. The Dear Colleague letter and other activities taken to promote Title IX under the Obama Administration encouraged schools to take prompt action to increase protections for survivors of sexual violence. \nPER NSVRC: The majority of sexual assaults, an estimated 63 percent, are never reported to the police (Rennison, 2002). The prevalence of false reporting cases of sexual violence is low (Lisak, Gardinier, Nicksa, & Cote, 2010), yet when survivors come forward, many face scrutiny or encounter barriers. For example, when an assault is reported, survivors may feel that their victimization has been redefined and even distorted by those who investigate, process, and categorize cases. Sexual assault victims commonly struggle with a range of emotions that make it difficult for them to report or disclose abuse. Often, victims who do report will delay doing so (Archambault & Lonsway, 2006) for a variety of reasons that are connected to neurobiological and psychological responses to their assault (D'Anniballe, 2010). For example, victims may struggle to remember precise details of the assault or experience negative feelings when doing so (D'Anniballe, 2010). Victims may worry about how reporting will affect their family or friends (Campbell, 1998). Further, they may be fearful of family fracture if the person sexually assaulting them is a family member (Campbell & Raja, 1999).In addition, completing the forensic exam or ""rape kit,"" can be a struggle for victims. 
For example, answering personal questions, enduring an intensive physical exam and evidence collection prevents some victims from pursuing a criminal justice resolution.To date, much of the research conducted on the prevalence of false allegations of sexual assaults is unreliable because of inconsistencies with definitions and methods employed to evaluate data (Archambault, n.d.). A review of research finds that the prevalence of false reporting is between 2 percent and 10 percent. The following studies support these findings: \n- A multi-site study of eight U.S. communities including 2,059 cases of sexual assault found a 7.1 percent rate of false reports (Lonsway,Archambault, & Lisak, 2009).\n- A study of 136 sexual assault cases in Boston from 1998-2007 found a 5.9 percent rate of false reports (Lisak et al., 2010).\n- Using qualitative and quantitative analysis, researchers studied 812 reports of sexual assault from 2000-2003 and found a 2.1 percent rate of false reports (Heenan & Murray 2006). \nResearch shows that rates of false reporting are frequently inflated, in part because of inconsistent definitions and protocols, or a weak understanding of sexual assault. Misconceptions about false reporting rates have direct, negative consequences and can contribute to why many victims don't report sexual assaults (Lisak et al., 2010). To improve the response to victims\nof sexual violence, law enforcement and service providers need a thorough understanding of sexual violence and consistency in their definitions, policies and procedures.(https://www.nsvrc.org/sites/default/files/Publications_NSVRC_Overview_False-Reporting.pdf)"
11245,"The rules for cost of attendance for less-than-half-time students are burdensome and inconsistent. Room and Board can sometimes, but not always be used on the student's cost of attendance. Regardless of enrollment status, the student will continue to have room and board costs.\n\nFrom the Federal Student Aid Handbook Volume 3: \nFor less-than-half-time students, room and board for a limited\nduration\n Schools have the option to include in the COA for a less-\nthan-half-time student an allowance for room and board for up \nto three semesters (or equivalent), with no more than two of the \nsemesters being consecutive at any one school. You are not required \nto monitor COA components from other schools attended by the \nstudent."
10282,"CFR Title 34, Subtitle III, Chapter III, Part 300, Subpart A, 300.34 Related Services\n\nI request a modification of the regulations to include Intervener Services on the list of Related Services.\n\nDeafblindness is a disablity of access. An intervener is the bridge between the individual with deafblindness and his/her environment. Interveners have a specific, unique skillset to accomplish just that, and an increasing number of states is using interveners in educational settings with great success. Intervener services have been recognized by OSEP as a credible service delivery option for children who are deafblind under the IEP process; however, states are hesitant to recognize the need for these services until they are recognized at a national level. For these reasons alone intervener services need to be on the list of related services to be considered when determining the LRE for a student with deafblindness."
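
Some comments in the sample above are still template text with a personal sentence inserted after the greeting (for example, comment 4646), so they no longer match a pattern that includes the greeting. A rough sketch of one way to flag them is to match a distinctive sentence from the template body instead. The `marker` string and the toy series below are illustrative assumptions, and this approach would still miss comments that reword the sentence itself.

```python
import pandas as pd

# Hypothetical marker: a sentence fragment from the first template, without the greeting
marker = "civil rights regulations and guidance documents are important and necessary"

# Pattern that includes the greeting, similar in spirit to common_strings
prefix = "Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education"

# Toy data (shortened for illustration)
toy_bodies = pd.Series([
    "Dear Assistant General Counsel Hilary Malawer,\n\nAll Department of Education "
    "civil rights regulations and guidance documents are important and necessary.",
    "Dear Assistant General Counsel Hilary Malawer,\n\nLet's stop selling everything "
    "to greedy people.\n\nAll Department of Education civil rights regulations and "
    "guidance documents are important and necessary.",
    "protect victims of college campus sexual assault",
])

matches_prefix = toy_bodies.str.contains(prefix, case=False, na=False, regex=False)
matches_marker = toy_bodies.str.contains(marker, case=False, na=False, regex=False)

# Template comments that were edited enough to slip past the greeting-based pattern
edited_templates = toy_bodies[matches_marker & ~matches_prefix]
print(edited_templates.index.tolist())
```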


Creating a DataFrame of states and zip codes from the cleaned comments dataset (duplicate comments intact) to use in a map.

In [26]:
# Select the location columns and drop rows with null values.
# dropna() returns a new DataFrame, which avoids pandas' SettingWithCopyWarning
# raised by calling dropna(inplace=True) on a slice of `data`.
map_data = data[['doc.state', 'doc.zip']].dropna()


The cells below list the states and zip codes from which the most comments were sent, then generate a choropleth map of comment counts by state.

In [27]:
state_data = map_data.groupby(['doc.state']).size().reset_index(name='duplicate_count')
state_data = state_data.sort_values('duplicate_count', ascending=False).reset_index(drop=True)

zip_data = map_data.groupby(['doc.zip']).size().reset_index(name='duplicate_count')
zip_data = zip_data.sort_values('duplicate_count', ascending=False).reset_index(drop=True)

display(state_data[:5], zip_data[:5])

Unnamed: 0,doc.state,duplicate_count
0,CA,1464
1,NY,727
2,WA,441
3,FL,423
4,PA,383


Unnamed: 0,doc.zip,duplicate_count
0,10025,18
1,11215,18
2,10011,15
3,10023,13
4,95060,12
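
The `groupby(...).size()` followed by `sort_values` pattern above can also be written with `value_counts`, which counts and sorts in descending order in one step. The sketch below uses toy data and a hypothetical column name `comment_count`; it is an alternative formulation, not the notebook's own code.

```python
import pandas as pd

# Toy stand-in for map_data (assumption, not real comment locations)
toy_map_data = pd.DataFrame({
    "doc.state": ["CA", "NY", "CA", "WA", "CA", "NY"],
    "doc.zip": ["95060", "10025", "95060", "98101", "94110", "10025"],
})

# value_counts() returns counts sorted from most to least frequent
state_counts = (
    toy_map_data["doc.state"]
    .value_counts()
    .rename_axis("doc.state")
    .reset_index(name="comment_count")
)
print(state_counts)
```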


In [28]:
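state_geo = './resources/us-states.json'

# Named state_map rather than map to avoid shadowing the Python builtin
state_map = folium.Map(location=[48, -102], zoom_start=3)

# folium.Choropleth replaces the deprecated Map.choropleth method
folium.Choropleth(
    geo_data=state_geo,
    name='map',
    data=state_data,
    columns=['doc.state', 'duplicate_count'],
    key_on='feature.id',
    fill_color='BuPu',
    fill_opacity=0.9,
    line_opacity=0.1
).add_to(state_map)
folium.LayerControl().add_to(state_map)

state_map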

Resources:
https://github.com/j2kao/fcc_nn_research/blob/master/proc_17_108_analysis_01_level_0_manual_tagging.ipynb