# Use of Solitary Confinement at Northwest ICE Processing Center\*
## UW Center for Human Rights preliminary report, November 2019

Data analyzed:

1. Dataset of solitary confinement placements at NWDC released to UWCHR via FOIA on February 22, 2019
2. Dataset of national solitary confinement placements in ICE detention analyzed by International Consortium of Investigative Journalists (Spencer Woodman, Karrie Kehoe, Maryam Saleh, and Hannah Rappleye, ["Thousands of Immigrants Suffer In US Solitary Confinement"](https://www.icij.org/investigations/solitary-voices/thousands-of-immigrants-suffer-in-us-solitary-confinement/), ICIJ, May 21 2019)
3. Dataset of national solitary confinement placements in ICE detention analyzed by Project on Government Oversight (POGO, ["ISOLATED: ICE Confines Some Detainees with Mental Illness in Solitary for Months"](https://www.pogo.org/investigation/2019/08/isolated-ice-confines-some-detainees-with-mental-illness-in-solitary-for-months/), August 14 2019)

\* This report uses the term "solitary confinement" to describe practices named by ICE and GEO Group as "administrative segregation" or "disciplinary segregation". This report interchangeably uses the titles "Northwest ICE Processing Center" (the title currently employed by ICE) and "Northwest Detention Center (NWDC)" (the former title used during the time period covered by this report) to refer to the immigration prison in Tacoma, Washington, privately owned and operated by GEO Group on behalf of ICE.



\[Description of NWDC and ICE segregation practices here?\]



## NWDC Solitary Confinement Placements Dataset Released to UWCHR

### FOIA Request Timeline

On September 18, 2017, UWCHR filed the following Freedom of Information Request with Immigration and Customs Enforcement (ICE):

"We are seeking all records (including written documents, files, electronic communications, records or reports of any sort) describing or reviewing the placement of detainees in segregation (including both administrative and disciplinary segregation) at the Northwest Detention Center in Tacoma, WA, from September 2013 to the present. This includes both notifications of initial placement in segregation as well as regular reviews performed in cases of extended segregation. We ask that documents be redacted to protect inmate privacy."

The request was acknowledged by ICE on October 4, 2017 and assigned Case Number 2018-ICFO-00515. During the following year UWCHR made various requests for status updates. On February 26, 2018, an ICE FOIA officer acknowledged that the request was active.

On September 21, 2018, the University of Washington filed suit against the Department of Homeland Security (DHS), ICE, and Customs and Border Protection (CBP) for violations of FOIA, including failure to respond to this request in a timely manner (The University of Washington, et al. v. Department of Homeland Security, et al., U.S. District Court, Western District of Washington Case No. 2:18-cv-01396-BJR).

On February 22, 2019, as part of document production pursuant to the UW lawsuit, ICE released a spreadsheet with filename `2018-ICFO-00515 Highlighted_EDIT.xlsx` in response to this request. This document is described and analyzed in the following sections.

In response to this production as part of the ongoing litigation, UWCHR pointed out that this release does not fully satisfy its original request:

"_First_, the original request sought all records, not just summaries of those records in a spreadsheet. Plaintiffs are seeking the underlying records which ICE used to create this spreadsheet, such as Administrative Detention Orders. _Second_, Plaintiffs have an Administrative Detention Order which shows that one particular detainee was sent to solitary confinement on 8/8/2017. This entry is not included in the spreadsheet ICE produced, which makes Plaintiffs question the thoroughness and accuracy of the spreadsheet as a representation of the underlying data." (March 12, 2019 letter by UW counsel Thomas R. Burke to Assistant United States Attorney Michelle R. Lambert)

Litigation in this case is ongoing.

\[Include ADO referenced above as appendix?\]

## Dataset Description and Analysis

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.pyplot import cm
import seaborn as sns
import datetime as dt
import yaml
from pandas.tseries import offsets
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter,
                               AutoMinorLocator, NullLocator)
import matplotlib.dates as mdates

The following code section loads the dataset, prints a description of the dataset, and displays the first five records:

In [None]:
csv_opts = {'sep': '|',
            'quotechar': '"',
            'compression': 'gzip',
            'encoding': 'utf-8'}

df = pd.read_csv('../input/uwchr.csv.gz',
                 parse_dates=['placement_date', 'release_date'],
                 **csv_opts)
print(df.info())
print()
df.head()

The following code section checks that the "Tracking Number" value for each record is unique; checks that no solitary record release date precedes placement date; and prints some basic characteristics of the dataset:

In [None]:
assert len(df[df['release_date'] < df['placement_date']]) == 0
assert len(df) == len(set(df['tracking_number']))
print(f'{len(df)} total unique records.')
print()
print(f'Earliest placement date: {df.placement_date.min()}')
print(f'Latest placement date: {df.placement_date.max()}')
print()
print(f'Earliest release date: {df.release_date.min()}')
print(f'Latest release date: {df.release_date.max()}')
print()
print('Solitary pop. countries of citizenship (Top 10):')
print(df['country_of_citizenship'].value_counts(dropna=False).head(10))
print()
print('Solitary population by gender:')
print(df['gender'].value_counts(dropna=False).head(10))

In [None]:
top_10 = pd.DataFrame(df['country_of_citizenship'].value_counts(dropna=False).head(10))

In [None]:
all_others = df[~df['country_of_citizenship'].isin(list(top_10.index))]

In [None]:
top_10.loc['ALL OTHERS', 'country_of_citizenship'] = len(all_others)

In [None]:
top_10.reset_index()

Total solitary placements per calendar year:

In [None]:
g_annual = df.set_index('placement_date').groupby(pd.Grouper(freq='AS'))
g_annual['tracking_number'].nunique()

Total solitary placements per fiscal year:

In [None]:
g_fy = df.set_index('placement_date').groupby(pd.Grouper(freq='AS-OCT'))
g_fy['tracking_number'].nunique()
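
The `AS-OCT` grouping reflects the US federal fiscal year, which begins on October 1 and is named for the calendar year in which it ends. As a minimal sketch using made-up dates (not drawn from the dataset), a fiscal-year label can also be derived directly from each placement date; the `fiscal_year` column is introduced here only for illustration:

```python
import pandas as pd

# Toy placement dates (illustrative only, not taken from the UWCHR dataset)
toy = pd.DataFrame({'placement_date': pd.to_datetime(
    ['2016-09-30', '2016-10-01', '2017-03-15', '2017-10-02'])})

# Placements from October onward belong to the next fiscal year
month = toy['placement_date'].dt.month
toy['fiscal_year'] = toy['placement_date'].dt.year + (month >= 10).astype(int)

fy_counts = toy.groupby('fiscal_year').size()
print(fy_counts)
```

Note that the grouped output above labels each bin by its start date (for example, October 1, 2016 for fiscal year 2017), while this sketch labels by the year in which the fiscal year ends.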

Average monthly solitary placements per year:

In [None]:
g_monthly = df.set_index('placement_date').groupby(pd.Grouper(freq='MS'))
g_monthly['tracking_number'].nunique().resample('AS').mean()

In [None]:
g_monthly['tracking_number'].nunique()

The following code section generates a visualization of the number of solitary confinement placements per month for the time period covered by the dataset:

In [None]:
data = g_monthly['tracking_number'].nunique()

years_loc = mdates.YearLocator()
months_loc = mdates.MonthLocator()
years_fmt = mdates.DateFormatter('%Y')

fig, ax = plt.subplots(figsize=(10,6))
ax.set_facecolor('#DDDDDD')
ax.set_axisbelow(True)
ax.yaxis.grid(color='#FFFFFF')
ax.bar(data.index, data, width=20)
ax.xaxis.set_minor_locator(months_loc)
ax.xaxis.set_major_locator(years_loc)
ax.xaxis.set_major_formatter(years_fmt)
ax.axhline(data.mean(), color='r')
ax.yaxis.set_minor_locator(MultipleLocator(1))
ax.yaxis.set_major_locator(MultipleLocator(2))
plt.title('NWDC Solitary Confinement Placements per month', fontsize=16)
plt.ylabel('Placements', fontsize=12)
plt.xlabel('Calendar year', fontsize=12)
plt.legend(('Average monthly placements', 'Monthly solitary placements'), loc='upper right')

plt.show()

print(f'Minimum monthly solitary placements: {data.min()}')
print(f'Maximum monthly solitary placements: {data.max()} in {data.idxmax().month_name()} {data.idxmax().year}')
print(f'Overall average monthly solitary placements: {data.mean()}')

### Calculating time in solitary confinement

We see in the following code section that several records do not have a "Release Date" specified. All of these records are segregation placements beginning during the latter portion of the dataset, which suggests that they refer to individuals who remained in solitary confinement at the time this dataset was produced.

In [None]:
null_start = df['placement_date'].isnull()
assert sum(null_start) == 0
null_end = df['release_date'].isnull()
print('No missing placement dates.')
print(f'{sum(null_end)} records with missing release dates.')
print()
print('Description of "Placement Date" for records with no "Release Date":')
print(df[null_end]['placement_date'].describe())
print()

For purposes of analysis, we set the release date for individuals still in solitary confinement to that of the latest date represented in the dataset, allowing us to calculate minimum total solitary confinement length for all records:

In [None]:
df_pre_fill = df.copy()
df['release_date'] = df['release_date'].fillna(df['release_date'].max())
df['solitary_length'] = df['release_date'] - df['placement_date']
df['solitary_length'].describe()
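
As a minimal sketch of the fill step on toy records (made-up dates, not taken from the dataset), the example below also keeps a flag marking which lengths are only lower bounds. The `minimum_only` column is a hypothetical addition for illustration and does not appear in the notebook:

```python
import pandas as pd

# Toy records (illustrative only); NaT marks a placement with no release date
toy = pd.DataFrame({
    'placement_date': pd.to_datetime(['2018-01-01', '2018-03-01', '2018-04-20']),
    'release_date': pd.to_datetime(['2018-01-11', '2018-05-01', None]),
})

# Treat the latest observed release date as the end of the observation window
window_end = toy['release_date'].max()
still_open = toy['release_date'].isnull()

toy['release_date'] = toy['release_date'].fillna(window_end)
toy['days_solitary'] = (toy['release_date'] - toy['placement_date']).dt.days
toy['minimum_only'] = still_open  # True where the length is a lower bound
print(toy)
```

Keeping such a flag makes it possible to separate completed stays from ongoing ones in later summaries without re-deriving the mask.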

After setting missing release dates to the latest date represented in the dataset, we examine the 14 records in question and see that for all measures except maximum time, they represent individuals with longer solitary stays:

In [None]:
df[null_end]['solitary_length'].describe()

By comparison, we note that excluding records without a specified release date results in a decreased average length of solitary confinement:

In [None]:
df_drop = df_pre_fill.dropna(subset=['release_date'], axis=0).copy()
df_drop.loc[:, 'solitary_length'] = df_drop['release_date'] - df_drop['placement_date']
df_drop['solitary_length'].describe()

In [None]:
# Using dataset with minimum total solitary length for all records
df['days_solitary'] = df['solitary_length'] / np.timedelta64(1, 'D')
df.loc[:, 'log_days_solitary'] = np.log(df['days_solitary'])

In [None]:
# Testing that no solitary placement has negative stay length
assert sum(df['days_solitary'] < 0) == 0

In [None]:
df.sort_values(by='days_solitary', ascending=False)

A histogram of solitary placement lengths shows that the vast majority of placements are shorter than 50 days, but the distribution has a long tail extending to a maximum of 781 days.

In [None]:
df['days_solitary'].mode()

In [None]:
df['days_solitary'].median()

In [None]:
# Bin edges every 30 days, extending past the 781-day maximum so no record is dropped
num_bins = np.arange(0, 811, 30)
data = df['days_solitary']
# the histogram of the data
n, bins, patches = plt.hist(data, num_bins, facecolor='blue', alpha=0.5)

plt.xlabel('Days')
plt.xticks(np.arange(0, 811, step=60), rotation=45)
plt.ylabel('Placement count')
plt.yticks(np.arange(0, 300, step=50))
plt.ylim(-5, 275)
plt.title('Solitary placement length')
 
# Tweak spacing to prevent clipping of ylabel
plt.subplots_adjust(left=0.15) 
# plt.savefig('output/nwdc_solitary_length_hist.png', bbox_inches='tight')
plt.show()

82 solitary placements were for stays longer than 75 days:

In [None]:
sum(df['days_solitary'] > 75)

In [None]:
sum(df['days_solitary'] > 15)

In [None]:
sum(df['days_solitary'] > 15) / len(df['days_solitary'])

In [None]:
sum(df['days_solitary'] > 30)

In [None]:
sum(df['days_solitary'] > 30) / len(df)

In [None]:
round(sum(df['days_solitary'] <= 30) / len(df) * 100, 2)
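
The threshold cells above repeat one pattern: build a boolean mask, then take its sum for a count or its mean for a share. A minimal sketch on toy values follows; the helper name `share_over` is hypothetical and the stay lengths are made up for illustration:

```python
import pandas as pd

def share_over(days, threshold):
    """Return (count, percent) of stays strictly longer than `threshold` days."""
    over = days > threshold
    return int(over.sum()), round(float(over.mean()) * 100, 2)

# Toy stay lengths in days (illustrative only)
toy_days = pd.Series([1, 5, 14, 15, 16, 31, 80, 200])

summary = {t: share_over(toy_days, t) for t in (15, 30, 75)}
for threshold, (count, pct) in summary.items():
    print(f'> {threshold} days: {count} placements ({pct}%)')
```

The comparison is strict, so a stay of exactly 15 days is not counted as longer than 15 days, matching the `>` used in the cells above.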

We can plot length of solitary stay by date of placement. Note that as we might expect, the longest placement falls early in the timeline; as noted above some placements were apparently ongoing at the time of release of this data, and we represent the minimum amount of solitary time (orange points):

In [None]:
# years = mdates.YearLocator()   # every year
# months = mdates.MonthLocator()  # every month
# years_fmt = mdates.DateFormatter('%Y')

x=df[~null_end].loc[:, 'placement_date'].astype(int)
y=df[~null_end].loc[:,'days_solitary']

fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(x, y)

x=df[null_end].loc[:, 'placement_date'].astype(int)
y=df[null_end].loc[:,'days_solitary']
plt.scatter(x, y)

# ax.xaxis.set_major_locator(years)
# ax.xaxis.set_major_formatter(years_fmt)
# ax.xaxis.set_minor_locator(months)

# datemin = np.datetime64(df['placement_date'][0], 'Y')
# datemax = np.datetime64(df['placement_date'][-1], 'Y') + np.timedelta64(1, 'Y')
# ax.set_xlim(datemin, datemax)

xticks = ax.get_xticks()
xticks_dates = [str(f'{pd.to_datetime(x).year}-Q{pd.to_datetime(x).quarter}') for x in xticks]
ax.set_xticklabels(xticks_dates,  rotation=45)
plt.show()
plt.close(fig=fig)
del fig, ax

Values vary greatly, and the average solitary placement length is much lower than the maximum, so we can use a log transformation to get a better idea of distribution. These rough visualizations of trend in solitary stay length over time suggest a slight apparent increase. More statistically rigorous work would be needed to explore whether this trend is significant.
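
One caveat worth illustrating: the natural log of a zero-day stay (placement and release on the same date) is negative infinity, which plotting and summary functions may handle inconsistently. The sketch below uses toy values to show the effect, along with `np.log1p` as one possible alternative. Using `log1p` is an assumption for illustration, not the approach taken in this notebook:

```python
import numpy as np
import pandas as pd

# Toy stay lengths in days (illustrative only), including a same-day release
toy_days = pd.Series([0.0, 1.0, 10.0, 100.0, 781.0])

with np.errstate(divide='ignore'):   # silence the log(0) warning
    log_days = np.log(toy_days)      # natural log, as in the notebook
log1p_days = np.log1p(toy_days)      # log(1 + x); defined at zero

print(pd.DataFrame({'days': toy_days, 'log': log_days, 'log1p': log1p_days}))
```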

In [None]:
df['placement_year'] = df['placement_date'].map(lambda x: x.year)
ax = sns.boxplot(data=df, x="placement_year", y="log_days_solitary")

In [None]:
years = mdates.YearLocator()   # every year
months = mdates.MonthLocator()  # every month
years_fmt = mdates.DateFormatter('%Y')

In [None]:
x=df.loc[:, 'placement_date'].astype(int)
y=df.loc[:,'log_days_solitary']

fig, ax = plt.subplots()
# Pass x and y as keywords; recent seaborn versions no longer accept them positionally
sns.regplot(x=x, y=y, ax=ax)

# ax.xaxis.set_major_locator(years)
# ax.xaxis.set_major_formatter(years_fmt)

# datemin = np.datetime64(df['placement_date'].min(), 'Y')
# datemax = np.datetime64(df['placement_date'].max(), 'Y')
# ax.set_xlim(datemin, datemax)


xticks = ax.get_xticks()
xticks_dates = [str(f'{pd.to_datetime(x).year}-Q{pd.to_datetime(x).quarter}') for x in xticks]
ax.set_xticklabels(xticks_dates, rotation=45);

In [None]:
np.datetime64(df['placement_date'].min(), 'Y')

In [None]:
df.set_index('placement_date').resample('AS')['days_solitary'].mean()

In [None]:
df.set_index('placement_date').resample('AS')['days_solitary'].median()

In [None]:
df.set_index('placement_date').resample('AS-OCT')['days_solitary'].median()

Solitary placement length by gender:

In [None]:
ax = sns.boxplot(data=df, x="gender", y="log_days_solitary")

Solitary length for nationalities with > 5 placements:

In [None]:
coc_count = df['country_of_citizenship'].value_counts()
coc_list = list(coc_count[coc_count > 5].index)
mask = df['country_of_citizenship'].isin(coc_list)

In [None]:
order = df[mask].groupby(by=["country_of_citizenship"])["log_days_solitary"].median().sort_values(ascending=False).index

In [None]:
ax = sns.boxplot(data=df[mask], x="country_of_citizenship", y="log_days_solitary", order=order)
ax.set_xticklabels(ax.get_xticklabels(),rotation=90);

Solitary placements coded as detainee-requested tend to be longer than facility-initiated placements:

In [None]:
ax = sns.boxplot(data=df, x="detainee_request", y="log_days_solitary")
ax = ax.set_xticklabels(ax.get_xticklabels(),rotation=0);

In [None]:
sns.boxplot(x='placement_year',y='log_days_solitary',data=df,hue='detainee_request')

In [None]:
len(df[pd.Series(df['placement_reason'] == 'Disciplinary') & pd.Series(df['days_solitary'] > 30)])

In [None]:
data = df.set_index('placement_date').groupby([pd.Grouper(freq='Q')])['placement_reason_type'].value_counts().unstack()

In [None]:
data

In [None]:
years_loc_1 = mdates.YearLocator()
months_loc_1 = mdates.MonthLocator()
years_fmt_1 = mdates.DateFormatter('%Y')
months_fmt_1 = mdates.DateFormatter('%m')  # '%m' is the month number; '%M' would format minutes

fig, ax = plt.subplots(figsize=(10,6))

fig = data.plot(kind='bar', ax=ax, stacked=True)

ax.xaxis.set_minor_locator(months_loc_1)
ax.xaxis.set_minor_formatter(months_fmt_1)
ax.xaxis.set_major_locator(years_loc_1)
ax.xaxis.set_major_formatter(years_fmt_1)
ax.yaxis.set_minor_locator(MultipleLocator(5))
ax.yaxis.set_major_locator(MultipleLocator(10))

plt.title('NWDC Solitary Confinement Placements per quarter', fontsize=16)
plt.ylabel('Placements', fontsize=12)
plt.xlabel('Calendar year', fontsize=12)
plt.legend(loc='upper right')



### Calculating total population in solitary placements at any given time

For the period of time covered by the dataset, we count the number of people currently in solitary confinement placements on each day to present a timeline of total solitary confinement population. Note that some individuals may have already been in solitary confinement at the beginning of this timeline, so the earlier period may represent an under-count of the total population.

We can note that since early 2017, the total solitary confinement population has only briefly fallen below the overall mean population.

In [None]:
min_date = df['placement_date'].min()
max_date = df['release_date'].max()
timeline = pd.date_range(min_date, max_date, freq='D')
years = timeline.year.unique()

counts = pd.Series(index=timeline, dtype=float)  # explicit dtype for the empty Series
for day in timeline:
    in_range = df[(df['placement_date'] <= day) & (df['release_date'] >= day)]
    counts[day] = len(in_range)

# Set the figure size at creation; assigning fig.figsize afterwards has no effect
fig, ax = plt.subplots(figsize=(10,8))

ax = counts.plot();
ax.axhline(counts.mean(), color='r')
ax.set_facecolor('#DDDDDD')
ax.set_axisbelow(True)
ax.yaxis.grid(color='#FFFFFF')
plt.title('NWDC Daily Solitary Confinement Population', fontsize=16)
plt.xlabel('Calendar year', fontsize=12)
plt.ylabel('Total solitary pop.', fontsize=12)
leg = ax.legend(('Population', 'Mean'), loc='upper right', bbox_to_anchor=(1.35, 1))
plt.show()
# fig.savefig('samplefigure', bbox_inches='tight')
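
The day-by-day loop above is easy to read but rescans the whole table for each date. As a sketch of an equivalent approach, the daily population can be computed from running totals of placements and releases. The records below are toy data, and the method assumes dates carry no time-of-day component:

```python
import pandas as pd

# Toy stays (illustrative only); both endpoints count as days in solitary
stays = pd.DataFrame({
    'placement_date': pd.to_datetime(['2018-01-01', '2018-01-02', '2018-01-03']),
    'release_date': pd.to_datetime(['2018-01-03', '2018-01-02', '2018-01-05']),
})
timeline = pd.date_range(stays['placement_date'].min(),
                         stays['release_date'].max(), freq='D')

# Running total of placements begun on or before each day
started = (stays['placement_date'].value_counts()
           .reindex(timeline, fill_value=0).cumsum())
# Running total of releases strictly before each day (hence the one-day shift)
ended = (stays['release_date'].value_counts()
         .reindex(timeline, fill_value=0).cumsum().shift(1, fill_value=0))

daily_pop = started - ended
print(daily_pop)
```

Because a person released on a given day is still counted for that day, releases are subtracted only from the following day onward, which matches the inclusive comparison used in the loop.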

In [None]:
counts.resample('AS-OCT').mean()

In [None]:
counts.mean()

We can also visualize this data at higher resolution by plotting each year separately, and superimposing lines for yearly average population:

In [None]:
fig = plt.figure(figsize=(10,8))
i = 0
for year in years:
    ax=plt.subplot(3,2,i+1)

    ax.set_title(f'{year}')
    counts[f'{year}'].plot(ax=ax)
    datemin = pd.Timestamp(f'{year}-01-01')
    datemax = pd.Timestamp(f'{year}-12-31')
    ax.set_facecolor('#EEEEEE')
    ax.set_xlim([datemin, datemax])
    ax.set_ylim([-2,30])
    ax.yaxis.set_minor_locator(MultipleLocator(5))
    ax.xaxis.set_minor_locator(NullLocator())
    ax.axhline(counts[f'{year}'].mean(), color='r')
    i = i + 1
    plt.grid(axis='y', which='minor', color='#FFFFFF')
plt.subplots_adjust(wspace=.2, hspace=.4)
fig.suptitle("NWDC Daily Solitary Confinement Population", fontsize=16)
fig.legend(('Population', 'Annual Mean'), loc='upper right')
plt.show()

While limitations of the dataset may cause an underestimate of the total population during the early period of the dataset, we see clearly that the average population in solitary is between 9 and 12 during 2013-2016, rising to an average of nearly 17 during 2017 and an average of more than 18 people in solitary during the first five months of 2018.

## National solitary confinement datasets

### ICIJ

See: ["Solitary Voices: Thousands of Immigrants Suffer In US Solitary Confinement"](https://www.icij.org/investigations/solitary-voices/thousands-of-immigrants-suffer-in-us-solitary-confinement/) by Spencer Woodman, Karrie Kehoe, Maryam Saleh, and Hannah Rappleye, International Consortium of Investigative Journalists, May 21, 2019

See notebook `icij.ipynb` for writeup on apparent issues with `placement_date` and `release_date` values in the published version of this dataset. Correcting for this issue shows that ICIJ and UWCHR's respective datasets for NWDC solitary confinement placements are very similar, though some discrepancies remain.

### POGO

See: POGO, ["ISOLATED: ICE Confines Some Detainees with Mental Illness in Solitary for Months"](https://www.pogo.org/investigation/2019/08/isolated-ice-confines-some-detainees-with-mental-illness-in-solitary-for-months/), August 14 2019

See notebook `pogo.ipynb` for descriptive analysis of the POGO dataset, including comparison with UWCHR records. Records published by POGO match records released to UWCHR.

In [None]:
icij = pd.read_csv('../input/icij.csv.gz',
                   parse_dates=['placement_date', 'release_date'],
                   **csv_opts)

# icij = pd.read_csv('../frozen/icij-date-fix-temp.csv',
#                    parse_dates=['placement_date',
#                                 'release_date',
#                                 'placement_date_fixed',
#                                 'release_date_fixed'],
#                    **csv_opts)

pogo = pd.read_csv('../input/pogo.csv.gz',
                   parse_dates=['placement_date', 'release_date'],
                   **csv_opts)

In [None]:
# pogo = pogo[pd.notnull(pogo['days_solitary'])]
# pogo = pogo[pogo['days_solitary'] > 0]

In [None]:
print(icij.info())
print()
icij.head()

In [None]:
icij['facility'] = icij['facility'].str.strip()

In [None]:
icij_nwdc_str = icij[icij['state'] == 'WA']['facility'].unique()[0]

In [None]:
icij_nwdc_str

In [None]:
icij_nwdc = icij[icij['facility'] == icij_nwdc_str]

In [None]:
icij_nwdc_count = len(icij_nwdc)
icij_nwdc_max_date = icij_nwdc['placement_date'].max()
icij_nwdc_min_date = icij_nwdc['placement_date'].min()
print(icij_nwdc_min_date, icij_nwdc_max_date)

In [None]:
print(pogo.info())
print()
pogo.head()

In [None]:
pogo['record_id'] = range(len(pogo))
pogo = pogo.rename({'length_of_solitary_confinement_(pogo_calculation)': 'days_solitary'}, axis=1)

In [None]:
pogo_nwdc_str = 'TACOMA ICE PROCESSING CENTER (NORTHWEST DET CTR) (WA)'
pogo_nwdc = pogo[pogo['facility'] == pogo_nwdc_str]

In [None]:
pogo_nwdc_count = len(pogo_nwdc)
pogo_nwdc_max_date = pogo_nwdc['placement_date'].max()
pogo_nwdc_min_date = pogo_nwdc['placement_date'].min()
print(pogo_nwdc_min_date, pogo_nwdc_max_date, pogo_nwdc_count)

### ICIJ: NWDC use of solitary confinement in comparison with other ICE detention facilities

The dataset published by ICIJ includes 272 solitary confinement placements at NWDC between March 2013 and October 2017, placing NWDC at #9 among 111 ICE facilities ranked by number of solitary placements. The average length of solitary stay is almost 52 days (rank #10); the median solitary stay is 27 days (rank #24).

In [None]:
avg_days = icij.groupby('facility')['days_solitary'].mean().sort_values(ascending=False)
avg_days.name = 'mean_days_solitary'
avg_days = avg_days.reset_index()
avg_days['rank'] = avg_days.index + 1
avg_days.set_index('rank', inplace=True)
avg_days.head(10)

In [None]:
median_days = icij.groupby('facility')['days_solitary'].median().sort_values(ascending=False)
median_days.name = 'median_days_solitary'
median_days = median_days.reset_index()
median_days['rank'] = median_days.index + 1
median_days.set_index('rank', inplace=True)
median_days[median_days['facility'] == icij_nwdc_str]

In [None]:
placement_count = icij.groupby('facility')['record_id'].count().sort_values(ascending=False)
placement_count.name = 'solitary_placements'
placement_count = placement_count.reset_index()
placement_count['rank'] = placement_count.index + 1
placement_count.set_index('rank', inplace=True)
placement_count.head(10)

### POGO: NWDC use of solitary confinement in comparison with other ICE detention facilities

The dataset published by POGO includes 149 solitary confinement placements at NWDC from January 1, 2016 to May 3, 2018, placing NWDC at #13 among 99 ICE facilities ranked by number of solitary placements. The average length of solitary stay is almost 70 days (rank #2), median solitary stay is 42 days (rank #6). For both measures of length of stay, NWDC has the longest stays among federal detention centers.

In [None]:
pogo['days_solitary'].describe()

In [None]:
sum(pogo['days_solitary'] > 15)

In [None]:
sum(pogo['days_solitary'] > 15) / len(pogo)

In [None]:
pogo.groupby('facility')['record_id'].count().rank(ascending=False)[pogo_nwdc_str]

In [None]:
pogo.groupby('facility')['days_solitary'].median().rank(ascending=False, method='min')[pogo_nwdc_str]
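
The rankings in this notebook are computed in two ways: by position after sorting (`index + 1`) and with `Series.rank`. The two agree unless facilities are tied, which is plausible for medians. A minimal sketch with placeholder facility names and made-up values shows the difference:

```python
import pandas as pd

# Toy records (illustrative only); facility names are placeholders
toy = pd.DataFrame({
    'facility': ['A', 'A', 'B', 'B', 'C', 'D'],
    'days_solitary': [10, 30, 5, 35, 20, 50],
})
# Medians: A = 20, B = 20, C = 20, D = 50
medians = toy.groupby('facility')['days_solitary'].median()

# Position after sorting: tied facilities receive different ranks
positional = medians.sort_values(ascending=False).reset_index()
positional['rank'] = positional.index + 1

# method='min': tied facilities share the best available rank
tied = medians.rank(ascending=False, method='min').astype(int)

print(positional)
print(tied)
```

With `method='min'`, the three tied facilities all rank #2, whereas positional ranking assigns them #2 through #4 in an order that depends on the sort.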

In [None]:
pogo.groupby('facility')['days_solitary'].describe().sort_values(by='mean', ascending=False).head(10)

In [None]:
pogo.groupby('facility')['days_solitary'].describe().sort_values(by='count', ascending=False).head(15)

In [None]:
avg_days = pogo.groupby('facility')['days_solitary'].mean().sort_values(ascending=False)
avg_days.name = 'mean_days_solitary'
avg_days = avg_days.reset_index()
avg_days['rank'] = avg_days.index + 1
avg_days.set_index('rank', inplace=True)
avg_days.head(10)

In [None]:
median_days = pogo.groupby('facility')['days_solitary'].median().sort_values(ascending=False)
median_days.name = 'median_days_solitary'
median_days = median_days.reset_index()
median_days['rank'] = median_days.index + 1
median_days.set_index('rank', inplace=True)
median_days[median_days['facility'] == pogo_nwdc_str]
median_days.head(10)

In [None]:
placement_count = pogo.groupby('facility')['record_id'].count().sort_values(ascending=False)
placement_count.name = 'solitary_placements'
placement_count = placement_count.reset_index()
placement_count['rank'] = placement_count.index + 1
placement_count.set_index('rank', inplace=True)
placement_count[placement_count['facility'] == pogo_nwdc_str]

In [None]:
pogo_nwdc_mask = pogo.set_index('placement_date')['facility'] == pogo_nwdc_str
pogo_nwdc = pogo.set_index('placement_date').loc[pogo_nwdc_mask, 'days_solitary'].dropna()
pogo_not_nwdc = pogo.set_index('placement_date').loc[~pogo_nwdc_mask, 'days_solitary'].dropna()

x=pogo_not_nwdc.index.astype(int)
y=np.log(pogo_not_nwdc.values)

fig = plt.figure()
ax = fig.add_subplot(111)
plt.scatter(x, y)

x=pogo_nwdc.index.astype(int)
y=np.log(pogo_nwdc.values)
plt.scatter(x, y)

xticks = ax.get_xticks()
xticks_dates = [pd.to_datetime(x).year for x in xticks]
ax.set_xticklabels(xticks_dates)
plt.show()
plt.close(fig=fig)
del fig, ax

In [None]:
import scipy.stats as scipystats

In [None]:
nwdc = pogo['facility'] == pogo_nwdc_str

In [None]:
# This statistical analysis section is speculative

In [None]:
ttest = scipystats.ttest_ind(pogo[nwdc]['days_solitary'], pogo[~nwdc]['days_solitary'], nan_policy='omit')

In [None]:
ttest
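
Because solitary lengths are strongly right-skewed, a standard t-test on raw days may be sensitive to a few very long stays. As a hedged sketch only, two commonly used alternatives are Welch's t-test on log-transformed values and the rank-based Mann-Whitney U test. The data below are synthetic, and neither test is presented as this report's method:

```python
import numpy as np
import scipy.stats as scipystats

# Synthetic, right-skewed stay lengths (illustrative only, not real records)
rng = np.random.default_rng(0)
group_a = rng.lognormal(mean=3.5, sigma=1.0, size=150)
group_b = rng.lognormal(mean=3.0, sigma=1.0, size=2000)

# Welch's t-test on log days: does not assume equal variances
welch = scipystats.ttest_ind(np.log(group_a), np.log(group_b), equal_var=False)

# Mann-Whitney U: rank-based, makes no normality assumption
mwu = scipystats.mannwhitneyu(group_a, group_b, alternative='two-sided')

print(f'Welch t = {welch.statistic:.2f}, p = {welch.pvalue:.4f}')
print(f'Mann-Whitney U = {mwu.statistic:.0f}, p = {mwu.pvalue:.4f}')
```

Either test still treats placements as independent observations, which may not hold if the same individuals appear in multiple records.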

### Placement reasons

### Solitary length by Placement Reason

Placement reasons are uniform categories used by all three datasets (UWCHR, ICIJ, POGO).

In [None]:
set(df['placement_reason']).union(set(pogo['placement_reason'])).union(set(icij['placement_reason']))

In [None]:
placement_counts = df['placement_reason'].value_counts(dropna=False)
placement_mean_days = df.groupby(['placement_reason'])['days_solitary'].mean()
placements = pd.concat([placement_counts, placement_mean_days], axis=1, sort=False)
placements = placements.rename({'placement_reason': 'Total placements',
    'days_solitary': 'Avg. solitary length'}, axis=1)
placements.index.name = 'Solitary Placement Reason'
placements

Here we simplify placement reasons into broader groups for easier comparative analysis between datasets. We note that no "Suicide Risk Placement" records are associated with NWDC in any of the datasets, but all the other categories are present.

In [None]:
# Simplify placement reasons into more general categories:
# safe_load parses plain YAML data; yaml.load() without a Loader fails in recent PyYAML
with open('../hand/placement-types.yaml', 'r') as yamlfile:
    placement_reason_type = yaml.safe_load(yamlfile)

In [None]:
df['placement_reason_type'] = df['placement_reason'].replace(placement_reason_type)
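
The contents of `placement-types.yaml` are not shown here, so the sketch below uses a hypothetical mapping and made-up reason labels to illustrate the step. It also shows one behavior to watch for: `replace` passes unmapped values through unchanged, so a detailed reason missing from the mapping would silently become its own group.

```python
import pandas as pd

# Hypothetical mapping; the real groups live in ../hand/placement-types.yaml
reason_to_type = {
    'Disciplinary': 'Disciplinary',
    'Protective Custody: Gang Status': 'Protective custody',
    'Protective Custody: Other': 'Protective custody',
}

# Toy records (illustrative only)
reasons = pd.Series(['Disciplinary', 'Protective Custody: Other',
                     'Medical: Observation'])

reason_types = reasons.replace(reason_to_type)

# replace() leaves unmapped values as-is, so list them explicitly
unmapped = sorted(set(reasons) - set(reason_to_type))
print(reason_types.tolist())
print('Unmapped reasons:', unmapped)
```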

In [None]:
order = df.groupby(by=["placement_reason_type"])["log_days_solitary"].median().sort_values(ascending=False).index

In [None]:
ax = sns.boxplot(data=df, x="placement_reason_type", y="log_days_solitary", order=order)
ax.set_xticklabels(ax.get_xticklabels(),rotation=90);

In [None]:
df['placement_reason_type'] = df['placement_reason_type'].astype('category')

In [None]:
df['placement_date_int'] = df['placement_date'].astype(int)

In [None]:
# x=df.loc[:, 'placement_date'].astype(int)
# y=df.loc[:,'log_days_solitary']

lm = sns.lmplot(data=df, x='placement_date_int', y='log_days_solitary', hue='placement_reason_type', height=5,
               aspect=2, fit_reg=False)

ax = lm.axes

xticks = ax[0,0].get_xticks()
xticks_dates = [str(f'{pd.to_datetime(x).year}-Q{pd.to_datetime(x).quarter}') for x in xticks]
ax = ax[0,0].set_xticklabels(xticks_dates, rotation=45)

In [None]:
# x=df.loc[:, 'placement_date'].astype(int)
# y=df.loc[:,'log_days_solitary']

lm = sns.lmplot(data=df, x='placement_date_int', y='log_days_solitary', hue='detainee_request', height=5,
               aspect=2, fit_reg=False)

ax = lm.axes

xticks = ax[0,0].get_xticks()
xticks_dates = [str(f'{pd.to_datetime(x).year}-Q{pd.to_datetime(x).quarter}') for x in xticks]
ax[0,0].set_xticklabels(xticks_dates, rotation=45);

In [None]:
icij['placement_reason_type'] = icij['placement_reason'].replace(placement_reason_type)
pogo['placement_reason_type'] = pogo['placement_reason'].replace(placement_reason_type)

In [None]:
icij['nwdc'] = icij['facility'] == icij_nwdc_str
pogo['nwdc'] = pogo['facility'] == pogo_nwdc_str

In [None]:
icij.groupby(by=['nwdc',"placement_reason_type"])["days_solitary"].describe()

In [None]:
pogo.groupby(by=["nwdc","detainee_request"])["days_solitary"].describe()

In [None]:
pogo['log_days_solitary'] = np.log(pogo['days_solitary'])

In [None]:
sns.boxplot(x='nwdc',y='log_days_solitary',data=pogo,hue='detainee_request')

In [None]:
pogo.groupby(by=["nwdc"])["days_solitary"].describe()

In [None]:
pogo.groupby(by=["nwdc","placement_reason_type"])["days_solitary"].describe()

In [None]:
df.groupby(by=["placement_reason"])["days_solitary"].describe().sort_values(by='count', ascending=False)

In [None]:
df.groupby(by=["placement_reason_type", 'detainee_request'])["days_solitary"].describe()

In [None]:
pogo['detainee_request'].value_counts()

In [None]:
facility_init = pogo['detainee_request'] == 'Facility-Initiated'

In [None]:
# Only facility-initiated segregation: NWDC versus all other facilities
pogo[facility_init].groupby(by=["nwdc"])["days_solitary"].describe()

In [None]:
pogo.groupby(by=["nwdc", 'detainee_request', 'placement_reason_type'])["days_solitary"].describe()

In [None]:
pogo.set_index('placement_date').groupby('facility')['days_solitary'].mean().rank(ascending=False)[pogo_nwdc_str]

In [None]:
pogo.set_index('placement_date').groupby([pd.Grouper(freq='Q'),'facility'])['days_solitary'].mean().rank(pct=False, ascending=False).unstack()[pogo_nwdc_str]

In [None]:
pogo.set_index('placement_date').groupby([pd.Grouper(freq='Q'),'facility'])['days_solitary'].mean().unstack()[pogo_nwdc_str]

In [None]:
pogo[facility_init].set_index('placement_date').groupby('facility')['days_solitary'].mean().rank(ascending=False)[pogo_nwdc_str]

### Facility population

Bringing in standardized `DETLOC` codes from ICE Facilities List

In [122]:
facil_df = pd.read_csv('../../../ice-facilities/import/output/ICEFacilityListReport.csv.gz',
                      **csv_opts,
                      header=8)

In [123]:
facil_df.head()

| | DETLOC | Name | Address | City | County | State | Zip | Circuit | AOR | Docket | ... | DSM Assigned? | DSM Assignment Type | FY18 Calendar Days in Use | FY18 Possible Days | FY18 % of Days in Use | FY18 Total Mandays | FY17 Calendar Days in Use | FY17 % of Days in Use | FY17 Total Mandays | FY17 Max Pop Count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ABRDNWA | ABERDEEN CITY JAIL | 210 EAST MARKET ST | ABERDEEN | GRAYS HARBOR | WA | 98520 | 9 | SEA | SEA | ... | No | | 0 | 400 | 0% | 0 | 0 | 0% | 0 | 0 |
| 1 | ABTHOLD | ABILENE HOLD ROOM | 12071 FM 3522 | ABILENE | ABILENE | TX | 79601 | 5 | DAL | ABT | ... | No | | 0 | 400 | 0% | 0 | 0 | 0% | 0 | 0 |
| 2 | ABRXSPA | ABRAXAS ACADEMY DETENTION CENTER | 1000 ACADEMY DRIVE | MORGANTOWN | BERKS | PA | 19543 | 3 | PHI | BRK | ... | No | | 35 | 400 | 9% | 800 | 372 | 102% | 1207 | 5 |
| 3 | RICRANS | ACI (CRANSTON, RHODE ISLAND) | 39 HOWARD AVE | CRANSTON | PROVIDENCE | RI | 2920 | 1 | BOS | BOS | ... | No | | 0 | 400 | 0% | 0 | 0 | 0% | 0 | 0 |
| 4 | ADACOID | ADA COUNTY JAIL | 7210 BARRISTER DRIVE | BOISE | ADA | ID | 83704 | 9 | SLC | HEL | ... | No | | 0 | 400 | 0% | 0 | 0 | 0% | 0 | 0 |


In [None]:
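# Quick and dirty cleaning of facilities data: strip thousands separators,
# dollar signs, and percent signs, then cast to int where every value allows it
for col in facil_df.columns:
    try:
        facil_df.loc[:, col] = facil_df.loc[:, col].astype(str)
        facil_df.loc[:, col] = facil_df.loc[:, col].str.replace(',', '', regex=False)
        # regex=False so '$' is a literal dollar sign, not an end-of-string anchor
        facil_df.loc[:, col] = facil_df.loc[:, col].str.replace('$', '', regex=False)
        facil_df.loc[:, col] = facil_df.loc[:, col].str.replace('%', '', regex=False)
        facil_df.loc[:, col] = facil_df.loc[:, col].astype(int)
    except ValueError:
        pass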

In [None]:
facil_name = facil_df.set_index('DETLOC')['Name']

In [None]:
facil_detloc_df = pd.read_csv('../hand/icij-pogo-facilities.csv')

In [None]:
facil_detloc_df.set_index('detloc').join(facil_name)

In [None]:
facil_detloc_df.to_csv('../output/icij-pogo-facilities.csv')

In [None]:
facil_detloc = dict(zip(facil_detloc_df['facility'], facil_detloc_df['detloc']))

In [None]:
icij['detloc'] = icij['facility'].replace(facil_detloc)

In [None]:
pogo['detloc'] = pogo['facility'].replace(facil_detloc)

### Stats by facility operator

In [None]:
detloc_operator = facil_df[['DETLOC', 'Facility Operator']].set_index('DETLOC')

In [None]:
pogo = pogo.join(detloc_operator, on='detloc')

In [None]:
pogo['Facility Operator'] = pogo['Facility Operator'].fillna('UNKNOWN')
pogo.groupby('Facility Operator')['days_solitary'].mean().sort_values()

In [None]:
facil_df.set_index('DETLOC')['FY17 ADP'].sort_values(ascending=False).head(15)

In [None]:
facil_capacity = facil_df.set_index('DETLOC')['Capacity']
facil_capacity = facil_capacity[facil_capacity != 'AS NEEDED']
facil_capacity = facil_capacity.astype(int)
facil_capacity.sort_values(ascending=False).head(15)

## Analysis of ADP, ALOS

In [None]:
# safe_load avoids the missing-Loader error raised by yaml.load in recent PyYAML
with open('../hand/adp_dict.yaml', 'r') as yamlfile:
    adp_dict = yaml.safe_load(yamlfile)

with open('../hand/alos_dict.yaml', 'r') as yamlfile:
    alos_dict = yaml.safe_load(yamlfile)

In [None]:
adp = pd.DataFrame(adp_dict).astype('float')

In [None]:
alos = pd.DataFrame(alos_dict).astype('float')

In [None]:
alos = alos.drop('Redacted', axis=1)

In [None]:
alos.dropna(how='all', axis=1)

In [None]:
alos.mean().dropna().rank(pct=True, ascending=True)['CSCNWWA']

In [None]:
# NWDC ALOS percentile rank
alos.T.rank(pct=True, ascending=True).loc['CSCNWWA']

In [None]:
alos.T.loc['CSCNWWA']

In [None]:
alos.index

In [None]:
alos.T.rank(pct=True, ascending=True).loc['CSCNWWA'].mean()

In [None]:
# NWDC ALOS absolute rank
alos.T.rank(ascending=False).loc['CSCNWWA']

In [None]:
alos.loc[:,'CSCNWWA']

In [None]:
nwdc_adp = adp.loc[:,'CSCNWWA']
nwdc_alos = alos.loc[:,'CSCNWWA']

In [None]:
nwdc_alos

In [None]:
nwdc_adp.index = ['2008-10-01',
                  '2009-10-01',
                  '2010-10-01',
                  '2011-10-01',
                  '2012-10-01',
                  '2013-10-01',
                  '2014-10-01',
                  '2015-10-01',
                  '2016-10-01',
                  '2017-10-01',
                 ]

nwdc_alos.index = ['2014-10-01',
                   '2015-10-01',
                   '2016-10-01',
                   '2017-10-01',
                  ]

In [None]:
nwdc_adp.index = pd.to_datetime(nwdc_adp.index)
nwdc_alos.index = pd.to_datetime(nwdc_alos.index)

In [None]:
fy14_fy17_adp = nwdc_adp.loc['2013':'2016']
fy15_fy17_alos = nwdc_alos.loc['2014':'2016']

In [None]:
# FY14 begins on October 1, 2013, so include placements made on that date
df_full_fy = df[(df['placement_date'] >= '2013-10-01') & (df['placement_date'] < '2017-10-01')]

In [None]:
fy14_fy17_adp

In [None]:
fy15_fy17_alos

In [None]:
min_date = df_full_fy['placement_date'].min()
max_date = df_full_fy['placement_date'].max()
timeline = pd.date_range(min_date, max_date, freq='D')
years = timeline.year.unique()

# Explicit dtype keeps the empty Series numeric across pandas versions
counts = pd.Series(index=timeline, dtype=float)
for day in timeline:
    in_range = df_full_fy[(df_full_fy['placement_date'] <= day) & (df_full_fy['release_date'] >= day)]
    counts[day] = len(in_range)

In [None]:
fy14_fy17_solitary = counts.resample('AS-OCT').mean()

In [None]:
fy14_fy17_solitary.name = 'solitary_ADP'

In [None]:
fy14_fy17_solitary_length = df_full_fy.set_index('placement_date').resample('AS-OCT')['days_solitary'].mean()
fy14_fy17_solitary_length.name = 'avg_solitary_length'

In [None]:
nwdc_per_capita = pd.concat([fy14_fy17_solitary, fy14_fy17_adp], axis=1)

In [None]:
nwdc_per_capita = nwdc_per_capita.rename({'CSCNWWA': 'ADP'}, axis=1)

In [None]:
nwdc_per_capita

In [None]:
nwdc_per_capita['per_capita'] = nwdc_per_capita['solitary_ADP'] / nwdc_per_capita['ADP']

In [None]:
nwdc_per_capita['per_capita'] * 100

In [None]:
# pogo_fy_17 = pogo[(pogo['placement_date'] > '2016-10-01') & (pogo['placement_date'] < '2017-10-01')]

In [None]:
min_date = pogo['placement_date'].min()
max_date = pogo['release_date'].max()
timeline = pd.date_range(min_date, max_date, freq='D')
years = timeline.year.unique()

counts = dict()

for facility in pogo['facility'].unique():
    facil_count = pd.Series(index=timeline, dtype=float)
    facil_temp = pogo[pogo['facility'] == facility]
    for day in timeline:
        in_range = facil_temp[(facil_temp['placement_date'] <= day) & (facil_temp['release_date'] >= day)]
        facil_count[day] = len(in_range)
    # Resample once per facility, after all daily counts are filled in
    facil_solitary_adp = facil_count.resample('AS-OCT').mean()
    # Fiscal years are labelled by start date, so '2016' selects FY17
    counts[facility] = facil_solitary_adp.loc['2016']

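The average daily solitary population for NWDC calculated from the POGO dataset is close to, but not exactly the same as, the figure calculated from the UWCHR dataset. Why not? One possible explanation is that the UWCHR dataset includes people placed in solitary before the start of the POGO dataset, who would still be counted on later days.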
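A minimal sketch of how placements that begin before a dataset's start date could shift the average daily solitary population. The records, dates, and the helper `avg_daily_population` are hypothetical and not drawn from the UWCHR or POGO data; the sketch only illustrates the mechanism, it does not confirm that this is the cause of the discrepancy.

```python
import pandas as pd

def avg_daily_population(stays, start, end):
    """Mean number of people held per day between start and end (inclusive)."""
    days = pd.date_range(start, end, freq='D')
    counts = [
        ((stays['placement_date'] <= day) & (stays['release_date'] >= day)).sum()
        for day in days
    ]
    return sum(counts) / len(days)

# Hypothetical records: the first stay begins before the window opens
stays = pd.DataFrame({
    'placement_date': pd.to_datetime(['2016-09-20', '2016-10-03', '2016-10-05']),
    'release_date': pd.to_datetime(['2016-10-04', '2016-10-06', '2016-10-10']),
})

window_start, window_end = pd.Timestamp('2016-10-01'), pd.Timestamp('2016-10-10')

# All stays, including the one that began before the window
with_earlier = avg_daily_population(stays, window_start, window_end)      # 1.4

# Only stays placed on or after the window start, as in a dataset that begins later
later_only = stays[stays['placement_date'] >= window_start]
without_earlier = avg_daily_population(later_only, window_start, window_end)  # 1.0
```

In this toy case the earlier placement adds four person-days to a ten-day window, so the two averages differ even though every stay within the window is identical in both versions.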

In [None]:
counts[pogo_nwdc_str]

In [None]:
detloc_count = pogo.groupby('detloc')['record_id'].count()
top_15_detloc = set(detloc_count.sort_values(ascending=False).head(15).index)

In [None]:
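# FY17 runs from October 1, 2016 through September 30, 2017; .loc['2016'] would select calendar year 2016 instead
detloc_count_fy17 = pogo.set_index('placement_date').sort_index().loc['2016-10-01':'2017-09-30'].groupby('detloc')['record_id'].count()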


In [None]:
detloc_count_fy17.head()

In [None]:
facil_solitary_adp = pd.DataFrame.from_dict(counts).T
facil_solitary_adp = facil_solitary_adp.rename({0: 'solitary_ADP'}, axis=1)

In [None]:
facil_solitary_adp.columns = ['FY17_solitary_ADP']

In [None]:
facil_solitary_adp.sort_values(by='FY17_solitary_ADP',ascending=False).head(10)

In [None]:
facil_solitary_adp = facil_solitary_adp.join(pogo.set_index('facility')['detloc'].drop_duplicates())

In [None]:
facil_solitary_adp.set_index('detloc', inplace=True)

In [None]:
pogo_fy17_adps = facil_solitary_adp.join(facil_df.set_index('DETLOC')['FY17 ADP'])
pogo_fy17_adps['FY17 Solitary per capita'] = pogo_fy17_adps['FY17_solitary_ADP'] / pogo_fy17_adps['FY17 ADP'] * 100
pogo_fy17_adps.sort_values(by='FY17_solitary_ADP', ascending=False).head(15)
# pogo_fy17_adps.sort_values(by='FY17 Solitary per capita', ascending=False).head(15)

In [None]:
pogo_fy17_adps.sort_values(by='FY17 Solitary per capita', ascending=False).head(15)

In [None]:
pogo_avg_solitary_length = pogo.set_index('placement_date').groupby(['detloc',pd.Grouper(freq='AS-OCT')])['days_solitary'].mean().unstack()

In [None]:
pogo_med_solitary_length = pogo.set_index('placement_date').groupby(['detloc',pd.Grouper(freq='AS-OCT')])['days_solitary'].median().unstack()

In [None]:
pogo_med_solitary_length.head(10)

In [None]:
pogo_fy17_avg_solitary_length = pogo_avg_solitary_length.T.loc['2016'].T

In [None]:
pogo_fy17_med_solitary_length = pogo_med_solitary_length.T.loc['2016'].T

In [None]:
pogo_fy17_avg_solitary_length.columns = ['FY17 avg solitary length']

In [None]:
pogo_fy17_med_solitary_length.columns = ['FY17 med solitary length']

In [None]:
alos_fy17 = pd.DataFrame(alos.T['FY17 ALOS'])

In [None]:
solitary_v_alos_fy17 = pogo_fy17_avg_solitary_length.join(alos_fy17)

In [None]:
solitary_v_alos_fy17.columns = ['FY17 avg solitary length', 'FY17 ALOS']

In [None]:
solitary_v_alos_fy17['Solitary / ALOS ratio'] = solitary_v_alos_fy17['FY17 avg solitary length'] / solitary_v_alos_fy17['FY17 ALOS']

In [None]:
solitary_v_alos_fy17 = solitary_v_alos_fy17.join(pd.DataFrame(facil_capacity))

In [None]:
# solitary_v_alos_fy17 = solitary_v_alos_fy17.loc[solitary_v_alos_fy17.index.isin(top_15_detloc)]

In [None]:
# solitary_v_alos_fy17 = solitary_v_alos_fy17.sort_values(by='Solitary / ALOS ratio', ascending=False)

In [None]:
# solitary_v_alos_fy17 = solitary_v_alos_fy17.reset_index().reset_index().rename({'index': 'Solitary / ALOS ratio rank'}, axis=1)

In [None]:
solitary_v_alos_fy17.head()

In [None]:
# solitary_v_alos_fy17['Solitary / ALOS ratio rank'] = solitary_v_alos_fy17['Solitary / ALOS ratio rank'] + 1

In [None]:
# solitary_v_alos_fy17.set_index('detloc').loc['CSCNWWA']

In [None]:
# solitary_v_alos_fy17.set_index('detloc', inplace=True)

In [None]:
solitary_v_alos_fy17['solitary_placements_fy17'] = detloc_count_fy17

In [None]:
temp = solitary_v_alos_fy17.join(pogo_fy17_adps)

In [None]:
temp = temp.join(pogo_fy17_med_solitary_length)

In [None]:
# temp = temp.drop(['Capacity', 'solitary_placements'], axis=1)

In [None]:
temp_2 = facil_detloc_df[['detloc', 'facility']].drop_duplicates(subset='detloc').set_index('detloc')

In [None]:
temp = temp.join(temp_2)

In [None]:
# temp.to_csv('../output/FY17-solitary-stats-draft.csv')

In [None]:
temp.sort_values(by='FY17_solitary_ADP', ascending=False).head(15)

## Visualizing/calculating rank over time

In [None]:
# pogo['placement_year'] = pogo['placement_date'].map(lambda x: x.year)

In [None]:
facil_count = pogo.groupby('facility')['record_id'].count()

In [None]:
facil_count = facil_count.sort_values(ascending=False)

In [None]:
top_15_facil = facil_count.sort_values(ascending=False).head(15).index

In [None]:
edit_labels = pd.Series(facil_count.index + ' - ' + facil_count.values.astype(str)).head(15)

In [None]:
edit_labels.values

In [None]:
avg_days = pogo.set_index('placement_date').groupby([pd.Grouper(freq='Q'),'facility'])['days_solitary'].mean()

In [None]:
avg_days

In [None]:
avg_days = pogo.set_index('placement_date').groupby([pd.Grouper(freq='Q'),'facility'])['days_solitary'].mean()
avg_days.name = 'mean_days_solitary'
avg_days = avg_days.reset_index().set_index('facility')
avg_days = avg_days.sort_values(by=['placement_date', 'mean_days_solitary'],ascending=[True, False])

In [None]:
# avg_days.set_index('facility').loc[pogo_nwdc_str]

In [None]:
avg_days = avg_days.reset_index().set_index('placement_date')

In [None]:
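# Rank facilities within each quarter (1 = longest mean placement);
# a grouped rank avoids assigning through duplicate index labels
avg_days['rank'] = avg_days.groupby(level='placement_date')['mean_days_solitary'].rank(ascending=False)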

In [None]:
data = avg_days.reset_index().set_index('facility')

In [None]:
data = data.reset_index().set_index('placement_date').dropna()

In [None]:
temp = data[data['facility'] == pogo_nwdc_str]

In [None]:
temp

In [None]:
temp.reset_index().set_index('placement_date')['mean_days_solitary'].plot()

Starting to become legible: NWDC is consistently among the top large facilities in terms of average length of solitary placement.
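One way to make "consistently among the top" checkable is to rank facilities within each quarter and read across a single facility's row of ranks. The sketch below uses made-up facility names and values; only the column name `mean_days_solitary` echoes the notebook.

```python
import pandas as pd

# Hypothetical quarterly means for three made-up facilities
means = pd.DataFrame({
    'quarter': ['2016Q1'] * 3 + ['2016Q2'] * 3,
    'facility': ['A', 'B', 'C'] * 2,
    'mean_days_solitary': [30.0, 12.0, 20.0, 10.0, 25.0, 18.0],
})

# Rank 1 = longest average placement within each quarter
means['rank'] = (means.groupby('quarter')['mean_days_solitary']
                      .rank(ascending=False))

# One row per quarter, one column per facility
rank_table = means.pivot(index='quarter', columns='facility', values='rank')
print(rank_table)
```

A facility whose column stays near 1 across quarters would support the observation; a column that swings widely, like facility A here, would not.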

In [None]:
len(set(data['facility']))

In [None]:
fig, ax = plt.subplots(figsize=(10,6))

color=iter(cm.viridis_r(np.linspace(0,1,15)))
grey=iter(cm.Greys(np.linspace(0,1,100)))

for facil in top_15_facil:
# for facil in data['facility'].unique():
    data_sub = data[data['facility'] == facil]
    is_nwdc = facil == pogo_nwdc_str
    # Emphasize NWDC with a heavier, fully opaque line
    ax.plot(data_sub['mean_days_solitary'], color=next(color),
            alpha=1 if is_nwdc else .9,
            linewidth=3 if is_nwdc else None)

ax.yaxis.grid(color='#DDDDDD')
plt.suptitle('Avg. length of solitary confinement in immigration detention (quarterly)', fontsize=20)
plt.title('Top 15 ICE facilities by total solitary placements, Jan. 2016 - May 2018', fontsize=14)
handles, labels = ax.get_legend_handles_labels()
leg = ax.legend(loc='upper right', bbox_to_anchor=(1.9, 1), fontsize=12,
                title='Detention Facility - Total Solitary Placements',
                title_fontsize=14, labels=edit_labels.values)
plt.ylabel('Length of stay (days)', fontsize=14)
txt='Source: ICE data obtained via FOIA by Project on Government Oversight\nAnalysis and figure by UW Center for Human Rights'
plt.figtext(1.58, .05, txt, wrap=True, horizontalalignment='right', fontsize=12)
fig.savefig('../output/avg_solitary_length.png', dpi=300, bbox_inches='tight')
plt.show()
    

In [None]:
data = pogo[pogo['facility'].isin(top_15_facil)]

In [None]:
data

In [None]:
order = data.groupby('detloc')['record_id'].count().sort_values(ascending=False).index

In [None]:
# Top 15 facilities by solitary placements distribution of solitary length (in order of # of placements)
ax = sns.boxplot(x='detloc',y='log_days_solitary',data=data, order=order)
ax.set_xticklabels(ax.get_xticklabels(),rotation=90);

In [None]:
# Top 15 facilities by solitary placements distribution of solitary length
ax = sns.boxplot(x='detloc',y='log_days_solitary',data=data)
ax.set_xticklabels(ax.get_xticklabels(),rotation=90);

## ADP/ALOS

In [None]:
# Relabel fiscal years by their start date first
adp.index = ['2008-10-01',
             '2009-10-01',
             '2010-10-01',
             '2011-10-01',
             '2012-10-01',
             '2013-10-01',
             '2014-10-01',
             '2015-10-01',
             '2016-10-01',
             '2017-10-01',
            ]
alos.index = ['2014-10-01',
              '2015-10-01',
              '2016-10-01',
              '2017-10-01',
             ]

In [None]:
# Then convert the date strings to a DatetimeIndex so they can be sliced by year
adp.index = pd.to_datetime(adp.index)
alos.index = pd.to_datetime(alos.index)

In [None]:
adp = adp['2015':].T
alos = alos['2015':].T

In [None]:
fy_sol_alos = pogo.set_index('placement_date').groupby([pd.Grouper(freq='AS-OCT'),'detloc'])['days_solitary'].mean()
fy_sol_count = pogo.set_index('placement_date').groupby([pd.Grouper(freq='AS-OCT'),'detloc'])['record_id'].count()

In [None]:
fy_sol_alos = fy_sol_alos.unstack().T
fy_sol_count = fy_sol_count.unstack().T

In [None]:
adp.columns = ['FY16 ADP', 'FY17 ADP', 'FY18 ADP']
alos.columns = ['FY16 ALOS', 'FY17 ALOS', 'FY18 ALOS']
fy_sol_alos.columns = ['FY16 solitary ALOS', 'FY17 solitary ALOS', 'FY18 solitary ALOS']
fy_sol_count.columns = ['FY16 solitary count', 'FY17 solitary count', 'FY18 solitary count']

In [None]:
facil_solitary_adp.head()

In [None]:
data = adp.join([alos, fy_sol_alos, fy_sol_count, facil_solitary_adp])

In [None]:
data.head()

In [None]:
data = data[data.loc[:,'FY16 ALOS'] < 500]
data = data[data.loc[:,'FY17 ALOS'] < 500]
data = data[data.loc[:,'FY18 ALOS'] < 500]

In [None]:
fys = ['FY16',
       'FY17',
       'FY18']

In [None]:
# This currently excludes all but one redacted ORR facility

target_facil = 'CSCNWWA'

fig = plt.figure(figsize=(10,8))
for i, fy in enumerate(fys, start=1):
    adp_col = f'{fy} ADP'
    alos_col = f'{fy} ALOS'
    ax = plt.subplot(2, 2, i)
    
    plt.scatter(x=alos_col, y=adp_col, data=data)
    plt.scatter(x=alos_col, y=adp_col, c='r', data=data.loc[target_facil])
    
    plt.xlabel('ALOS')
    plt.xticks(np.arange(0, 450, step=50), rotation=45)
    plt.ylabel('ADP')
    plt.yticks(np.arange(0, 2001, step=200))
    plt.ylim(-100, 2000)
    plt.title(f'{fy}')

plt.suptitle(f'ICE facilities by ADP, ALOS; {target_facil} highlighted', fontsize=14)
plt.subplots_adjust(wspace=.4, hspace=.4)
plt.show()

In [None]:
data[['FY16 solitary ALOS', 'FY17 solitary ALOS',
       'FY18 solitary ALOS', 'FY16 solitary count', 'FY17 solitary count',
       'FY18 solitary count', 'FY17_solitary_ADP']].to_csv('../output/solitary_stats.csv', sep='|')