[US_TN] Summary Sheet + CAF Form Updates (Recidiviz/recidiviz-data#29346)

## Description of the change

This PR makes a series of updates to the opportunity record query for TN
Annual Reclass + Custody Level Downgrade, as detailed in Recidiviz/recidiviz-data#29105

Some specifics to call out:
- Many views referenced CellBedAssignment raw data directly while we work
out ingest issues, so I've created a new analyst_data view that pulls
current facility and unit information and updated the various analyst
data views + resident record to reference it
- There are some new case notes added and some metadata removed. I've
tried to comment on those specifics since they may change or break
things on the front end
- Updates
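
As a rough illustration of what the new facility/unit view computes, here is a minimal sketch in plain Python rather than BigQuery SQL. It keeps only open assignments, takes the latest one per person, and falls back from Requested to Actual to Assigned. The helper names `first_non_null` and `current_location` and the sample rows are hypothetical and contain made-up IDs only.

```python
from datetime import datetime
from typing import Dict, List, Optional


def first_non_null(*values: Optional[str]) -> Optional[str]:
    """Mimic SQL COALESCE: return the first value that is not None."""
    for value in values:
        if value is not None:
            return value
    return None


def current_location(rows: List[dict]) -> Dict[str, dict]:
    """Pick the latest open assignment per person and derive facility/unit."""
    latest: Dict[str, dict] = {}
    for row in rows:
        if row["EndDate"] is not None:  # same filter as WHERE EndDate IS NULL
            continue
        best = latest.get(row["OffenderID"])
        if best is None or row["AssignmentDateTime"] > best["AssignmentDateTime"]:
            latest[row["OffenderID"]] = row
    return {
        offender_id: {
            "facility_id": first_non_null(
                r["RequestedSiteID"], r["ActualSiteID"], r["AssignedSiteID"]
            ),
            "unit_id": first_non_null(
                r["RequestedUnitID"], r["ActualUnitID"], r["AssignedUnitID"]
            ),
        }
        for offender_id, r in latest.items()
    }


def make_row(offender_id, assigned_at, end_date, requested, actual, assigned):
    """Build one hypothetical CellBedAssignment-like row (made-up values)."""
    return {
        "OffenderID": offender_id,
        "AssignmentDateTime": assigned_at,
        "EndDate": end_date,
        "RequestedSiteID": requested[0], "RequestedUnitID": requested[1],
        "ActualSiteID": actual[0], "ActualUnitID": actual[1],
        "AssignedSiteID": assigned[0], "AssignedUnitID": assigned[1],
    }


rows = [
    # Closed assignment: ignored.
    make_row("A1", datetime(2024, 1, 1), datetime(2024, 2, 1),
             ("SITE_A", "U0"), ("SITE_A", "U0"), ("SITE_A", "U0")),
    # Two open assignments for A1: the later one wins.
    make_row("A1", datetime(2024, 2, 1), None,
             ("SITE_B", "U1"), ("SITE_B", "U1"), ("SITE_B", "U1")),
    make_row("A1", datetime(2024, 3, 1), None,
             (None, None), ("SITE_C", "U9"), ("SITE_B", "U1")),
    make_row("B2", datetime(2024, 1, 15), None,
             ("SITE_D", "U2"), (None, None), (None, None)),
]

result = current_location(rows)
```

In this sample, person `A1` has two open assignments, so the later one is used, and its missing Requested values fall back to the Actual ones.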

Validation notes:

1. The following query returns 0 rows, as expected: no updated admission
date should be earlier than the original one, since the original dates are
from before the person entered a TDOC facility
```
WITH sandbox AS (
  select person_external_id, admission_date
  from `recidiviz-123.dsharm20240425_workflows_views.resident_record_materialized`,
  unnest(all_eligible_opportunities) as all_eligible_opportunities
  WHERE state_code = 'US_TN'
),  main AS (
  select person_external_id, admission_date
  from `recidiviz-123.workflows_views.resident_record_materialized`,
  unnest(all_eligible_opportunities) as all_eligible_opportunities
  WHERE state_code = 'US_TN'
)
select *
from sandbox join main using(person_external_id)
where sandbox.admission_date < main.admission_date
```
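
The same invariant can be expressed as a small Python check, shown here only as a sketch of the logic. The function name `earlier_than_main` and the person IDs are hypothetical, and the inputs are assumed to be simple mappings from person ID to admission date.

```python
from datetime import date
from typing import Dict, List, Optional


def earlier_than_main(
    sandbox: Dict[str, Optional[date]], main: Dict[str, Optional[date]]
) -> List[str]:
    """Return person IDs whose sandbox admission date precedes the main one.

    People missing from either side, or with a null date on either side,
    are skipped, mirroring how the inner join and `<` comparison behave.
    """
    violations = []
    for person_id in sandbox.keys() & main.keys():
        sandbox_date, main_date = sandbox[person_id], main[person_id]
        if sandbox_date is None or main_date is None:
            continue
        if sandbox_date < main_date:
            violations.append(person_id)
    return sorted(violations)


# Made-up example values.
sandbox_dates = {"P1": date(2020, 5, 1), "P2": date(2019, 1, 1), "P3": None}
main_dates = {"P1": date(2020, 1, 1), "P2": date(2019, 6, 1), "P3": date(2018, 1, 1)}
violations = earlier_than_main(sandbox_dates, main_dates)
```

An empty list corresponds to the query returning 0 rows.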

2. Eligibility counts don't change for any opportunity in resident record

```
WITH sandbox AS (
  select all_eligible_opportunities, count(*) as sandbox_c
  from `recidiviz-123.dsharm20240425_workflows_views.resident_record_materialized`,
  unnest(all_eligible_opportunities) as all_eligible_opportunities
  group by 1
),  main AS (
  select all_eligible_opportunities, count(*) as main_c
  from `recidiviz-123.dsharm20240425_main_workflows_views.resident_record_materialized`,
  unnest(all_eligible_opportunities) as all_eligible_opportunities
  group by 1
)
select *
from sandbox join main using(all_eligible_opportunities)
where sandbox.sandbox_c != main.main_c
```
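
One caveat worth keeping in mind: because the query uses an inner join, an opportunity that appears in only one of the two tables would not show up as a mismatch. The sketch below, with the hypothetical function name `count_differences` and made-up opportunity names, compares counts over the union of opportunities instead. That is an extension for illustration, not what the query above does.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_differences(
    sandbox_opps: Iterable[str], main_opps: Iterable[str]
) -> Dict[str, Tuple[int, int]]:
    """Map each opportunity whose counts differ to (sandbox_count, main_count)."""
    sandbox_counts = Counter(sandbox_opps)
    main_counts = Counter(main_opps)
    all_opps = sandbox_counts.keys() | main_counts.keys()
    # Counter returns 0 for missing keys, so one-sided opportunities are caught.
    return {
        opp: (sandbox_counts[opp], main_counts[opp])
        for opp in sorted(all_opps)
        if sandbox_counts[opp] != main_counts[opp]
    }


# One entry per (person, eligible opportunity) pair; values are made up.
sandbox = ["oppA", "oppA", "oppB"]
main = ["oppA", "oppA", "oppB", "oppC"]
differences = count_differences(sandbox, main)
```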

3. Other checks: Q6 metadata should now only contain arrays of length 1 or
2 (or null), because we now keep the 2 most recent violations rather than
all of them. Comparing sandbox output to main confirms this

```
select ARRAY_LENGTH(JSON_QUERY_ARRAY(form_information_q6_notes)), count(*)
from `recidiviz-123.dsharm20240425_main_workflows_views.us_tn_annual_reclassification_review_record_materialized`
group by 1
```
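
For readers less familiar with the BigQuery JSON functions, a rough Python equivalent of this length distribution is sketched below. The function name `q6_length_distribution` and the shape of the note objects are assumptions made for illustration. Only the array length matters here.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Optional


def q6_length_distribution(
    notes_json: Iterable[Optional[str]],
) -> Dict[Optional[int], int]:
    """Count records by the length of their Q6 notes array (None for null)."""
    lengths: Counter = Counter()
    for raw in notes_json:
        length = None if raw is None else len(json.loads(raw))
        lengths[length] += 1
    return dict(lengths)


# Made-up values; the real note objects may have different fields.
sample = [
    '[{"note": "first"}, {"note": "second"}]',
    '[{"note": "only"}]',
    None,
]
distribution = q6_length_distribution(sample)
# The expectation stated above: every length is 1, 2, or null.
only_expected_lengths = set(distribution) <= {1, 2, None}
```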

## Type of change

> All pull requests must have at least one of the following labels
applied (otherwise the PR will fail):

| Label | Description |
|-------|-------------|
| Type: Bug | non-breaking change that fixes an issue |
| Type: Feature | non-breaking change that adds functionality |
| Type: Breaking Change | fix or feature that would cause existing functionality to not work as expected |
| Type: Non-breaking refactor | change addresses some tech debt item or prepares for a later change, but does not change functionality |
| Type: Configuration Change | adjusts configuration to achieve some end related to functionality, development, performance, or security |
| Type: Dependency Upgrade | upgrades a project dependency - these changes are not included in release notes |
## Related issues

Closes Recidiviz/recidiviz-data#29105

## Checklists

### Development

**This box MUST be checked by the submitter prior to merging**:
- [x] **Double- and triple-checked that there is no Personally
Identifiable Information (PII) being mistakenly added in this pull
request**

These boxes should be checked by the submitter prior to merging:
- [x] Tests have been written to cover the code changed/added as part of
this pull request

### Code review

These boxes should be checked by reviewers prior to merging:

- [ ] This pull request has a descriptive title and information useful
to a reviewer
- [ ] Potential security implications or infrastructural changes have
been considered, if relevant

GitOrigin-RevId: 6636f1847328ac5b7f1816eb4304bb5d8fcaaa55
DSharm authored and Helper Bot committed May 17, 2024
1 parent fda13ab commit e094b04
Showing 11 changed files with 263 additions and 172 deletions.
Original file line number Diff line number Diff line change
@@ -180,6 +180,9 @@
from recidiviz.calculator.query.state.views.analyst_data.us_tn.us_tn_caf_q8 import (
    US_TN_CAF_Q8_VIEW_BUILDER,
)
from recidiviz.calculator.query.state.views.analyst_data.us_tn.us_tn_cellbed_assignment_raw import (
    US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_BUILDER,
)
from recidiviz.calculator.query.state.views.analyst_data.us_tn.us_tn_classification_raw import (
    US_TN_CLASSIFICATION_RAW_VIEW_BUILDER,
)
@@ -332,4 +335,5 @@
    US_TN_SEGREGATION_STAYS_VIEW_BUILDER,
    US_TN_CLASSIFICATION_RAW_VIEW_BUILDER,
    WORKFLOWS_PERSON_IMPACT_FUNNEL_STATUS_SESSIONS_VIEW_BUILDER,
    US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_BUILDER,
]
Original file line number Diff line number Diff line change
@@ -0,0 +1,69 @@
# Recidiviz - a data platform for criminal justice reform
# Copyright (C) 2024 Recidiviz, Inc.
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <https://www.gnu.org/licenses/>.
# =============================================================================
"""Computes current facility / unit from raw TN data"""

from recidiviz.big_query.big_query_view import SimpleBigQueryViewBuilder
from recidiviz.calculator.query.state.dataset_config import ANALYST_VIEWS_DATASET
from recidiviz.utils.environment import GCP_PROJECT_STAGING
from recidiviz.utils.metadata import local_project_id_override

US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_NAME = "us_tn_cellbed_assignment_raw"

US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_DESCRIPTION = (
    """Computes current facility / unit from raw TN data"""
)

US_TN_CELLBED_ASSIGNMENT_RAW_QUERY_TEMPLATE = """
-- TODO(#24959): Deprecate usage when re-run is over
-- TODO(#27428): Once source of facility ID in TN is reconciled, this can be removed
SELECT
state_code,
person_id,
/* From TN: "The ‘Requested’ location columns contain the new housing assignment
(where the person is being moved to). The ‘Assigned’ columns show the person’s assigned bed
when the new cell bed assignment was entered. Same for the ‘Actual’ columns – this is where he
‘actually’ was when the request was made (this is only different from Assigned when the person
has been moved to another institution temporarily)." Our interpretation of this is that Requested and
Actual work sometimes together and sometimes on a "lag"; Actual should be updated to show the same
thing as Requested, but sometimes it isnt, so we rely on Requested most, then Actual.
For current population, Requested is always hydrated, so the COALESCE is not strictly
needed but is a catch all if Requested is ever missing */
COALESCE(RequestedSiteID, ActualSiteID, AssignedSiteID) AS facility_id,
COALESCE(RequestedUnitID, ActualUnitID, AssignedUnitID) AS unit_id,
FROM `{project_id}.us_tn_raw_data_up_to_date_views.CellBedAssignment_latest` c
INNER JOIN `{project_id}.normalized_state.state_person_external_id` pei
ON c.OffenderID = pei.external_id
AND pei.state_code = "US_TN"
-- The latest assignment is not always the one with an open assignment. This can occur when someone is assigned to a facility
-- but temporarily sent to another facility (e.g. Special needs facility). Most people only have 1 open assignment, unless
-- they are currently temporarily sent elsewhere (~11 people out of 18k) so we further deduplicate by choosing the latest assignment
WHERE EndDate IS NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY OffenderID ORDER BY CAST(AssignmentDateTime AS DATETIME) DESC) = 1
"""

US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_BUILDER = SimpleBigQueryViewBuilder(
    dataset_id=ANALYST_VIEWS_DATASET,
    view_id=US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_NAME,
    description=US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_DESCRIPTION,
    view_query_template=US_TN_CELLBED_ASSIGNMENT_RAW_QUERY_TEMPLATE,
    should_materialize=True,
)

if __name__ == "__main__":
    with local_project_id_override(GCP_PROJECT_STAGING):
        US_TN_CELLBED_ASSIGNMENT_RAW_VIEW_BUILDER.build_and_print()
Original file line number Diff line number Diff line change
@@ -45,17 +45,8 @@
INNER JOIN `{project_id}.normalized_state.state_person`
USING(person_id, state_code)
-- TODO(#27428): Remove this join when custody level information aligns with location information
INNER JOIN (
SELECT
OffenderID AS external_id,
COALESCE(RequestedSiteID, ActualSiteID, AssignedSiteID) AS facility_id,
COALESCE(RequestedUnitID, ActualUnitID, AssignedUnitID) AS unit_id,
FROM `{project_id}.us_tn_raw_data_up_to_date_views.CellBedAssignment_latest`
-- Ensures that someone is actively assigned to a TDOC bed at the moment
WHERE EndDate IS NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY OffenderID ORDER BY CAST(AssignmentDateTime AS DATETIME) DESC) = 1
)
USING(external_id)
INNER JOIN `{{project_id}}.analyst_data.us_tn_cellbed_assignment_raw_materialized`
USING(person_id, state_code)
WHERE c.state_code = 'US_TN'
AND c.custody_level = 'MAXIMUM'
AND c.end_date_exclusive is null
Original file line number Diff line number Diff line change
@@ -47,7 +47,7 @@
pei.person_id,
pei.state_code,
ofs.description,
DispositionDate AS disposition_date,
CAST(CAST(DispositionDate AS DATETIME) AS DATE) AS disposition_date,
CAST(CAST(ArrestDate AS DATETIME) AS DATE) AS sentence_effective_date,
CAST(CAST(OffenseDate AS DATETIME) AS DATE) AS offense_date,
CourtName AS conviction_county,
Original file line number Diff line number Diff line change
@@ -54,32 +54,15 @@
s.SegragationReason AS segregation_reason,
s.SegregationStatus AS segregation_status,
s.SegregationType AS segregation_type,
c.current_facility_id,
c.current_unit_id
c.facility_id AS current_facility_id,
c.unit_id AS current_unit_id,
FROM `{{project_id}}.us_tn_raw_data_up_to_date_views.Segregation_latest` s
INNER JOIN `{{project_id}}.normalized_state.state_person_external_id` pei
ON s.OffenderID = pei.external_id
AND pei.state_code = 'US_TN'
-- TODO(#27428): Remove this join when custody level information aligns with location information
LEFT JOIN (
SELECT
OffenderID AS external_id,
/* From TN: "The ‘Requested’ location columns contain the new housing assignment
(where the person is being moved to). The ‘Assigned’ columns show the person’s assigned bed
when the new cell bed assignment was entered. Same for the ‘Actual’ columns – this is where he
‘actually’ was when the request was made (this is only different from Assigned when the person
has been moved to another institution temporarily)." Our interpretation of this is that Requested and
Actual work sometimes together and sometimes on a "lag"; Actual should be updated to show the same
thing as Requested, but sometimes it isnt, so we rely on Requested most, then Actual.
For current population, Requested is always hydrated, so the COALESCE is not strictly
needed but is a catch all if Requested is ever missing */
COALESCE(RequestedSiteID, ActualSiteID, AssignedSiteID) AS current_facility_id,
COALESCE(RequestedUnitID, ActualUnitID, AssignedUnitID) AS current_unit_id,
FROM `{{project_id}}.us_tn_raw_data_up_to_date_views.CellBedAssignment_latest`
WHERE EndDate IS NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY OffenderID ORDER BY CAST(AssignmentDateTime AS DATETIME) DESC) = 1
) c
USING(external_id)
LEFT JOIN `{{project_id}}.analyst_data.us_tn_cellbed_assignment_raw_materialized` c
USING(person_id, state_code)
-- There are a very small number of duplicates on person id and start date in the segregation table
QUALIFY ROW_NUMBER() OVER(PARTITION BY pei.person_id, start_date ORDER BY end_date DESC) = 1
),
Original file line number Diff line number Diff line change
@@ -84,9 +84,6 @@
us_me_raw_data_up_to_date_dataset=raw_latest_views_dataset_for_region(
state_code=StateCode.US_ME, instance=DirectIngestInstance.PRIMARY
),
us_tn_raw_data_up_to_date_dataset=raw_latest_views_dataset_for_region(
state_code=StateCode.US_TN, instance=DirectIngestInstance.PRIMARY
),
us_me_task_eligibility_criteria_dataset=task_eligibility_criteria_state_specific_dataset(
StateCode.US_ME
),
Original file line number Diff line number Diff line change
@@ -57,12 +57,46 @@

_RESIDENT_RECORD_INCARCERATION_DATES_CTE = f"""
incarceration_dates AS (
-- Adding a TN specific admission date that is admission to TDOC facility, in addition to overall incarceration
-- admission date
SELECT
ic.*,
MAX(t.projected_completion_date_max)
OVER(w) AS release_date,
MAX(c.start_date)
OVER(w) AS us_tn_facility_admission_date,
FROM
incarceration_cases ic
LEFT JOIN (
SELECT cs.person_id, cs.state_code, cs.start_date
FROM `{{project_id}}.{{sessions_dataset}}.compartment_sub_sessions_materialized` cs
INNER JOIN `{{project_id}}.reference_views.location_metadata_materialized`
ON facility = location_external_id
WHERE cs.state_code = 'US_TN'
AND compartment_level_1 = 'INCARCERATION'
AND compartment_level_2 = 'GENERAL'
AND location_type = 'STATE_PRISON'
AND CURRENT_DATE('US/Eastern') BETWEEN cs.start_date
AND {nonnull_end_date_clause('cs.end_date_exclusive')}
) c
USING(person_id, state_code)
LEFT JOIN `{{project_id}}.{{sessions_dataset}}.incarceration_projected_completion_date_spans_materialized` t
ON ic.person_id = t.person_id
AND ic.state_code = t.state_code
AND CURRENT_DATE('US/Eastern')
BETWEEN t.start_date AND {nonnull_end_date_clause('t.end_date_exclusive')}
WHERE ic.state_code IN ("US_TN")
WINDOW w as (PARTITION BY ic.person_id)
UNION ALL
SELECT
ic.* EXCEPT(admission_date),
MAX(t.start_date)
OVER(w) AS admission_date,
MAX({nonnull_end_date_clause('t.end_date')})
OVER(w) AS release_date
OVER(w) AS release_date,
CAST(NULL AS DATE) AS us_tn_facility_admission_date,
--TODO(#16175) ingest intake and release dates
FROM
incarceration_cases ic
@@ -83,7 +117,8 @@
SELECT
ic.* EXCEPT(admission_date),
NULL AS admission_date,
NULL AS release_date
NULL AS release_date,
CAST(NULL AS DATE) AS us_tn_facility_admission_date,
FROM incarceration_cases ic
WHERE state_code="US_MO"
@@ -92,15 +127,16 @@
SELECT
ic.*,
MAX(t.projected_completion_date_max)
OVER(w) AS release_date
OVER(w) AS release_date,
CAST(NULL AS DATE) AS us_tn_facility_admission_date,
FROM
incarceration_cases ic
LEFT JOIN `{{project_id}}.{{sessions_dataset}}.incarceration_projected_completion_date_spans_materialized` t
ON ic.person_id = t.person_id
AND ic.state_code = t.state_code
AND CURRENT_DATE('US/Eastern')
BETWEEN t.start_date AND {nonnull_end_date_clause('t.end_date_exclusive')}
WHERE ic.state_code NOT IN ("US_ME", "US_MO")
WHERE ic.state_code NOT IN ("US_ME", "US_MO", "US_TN")
WINDOW w as (PARTITION BY ic.person_id)
),
"""
@@ -112,7 +148,7 @@
{revert_nonnull_start_date_clause('admission_date')} AS admission_date,
{revert_nonnull_end_date_clause('release_date')} AS release_date
FROM incarceration_dates
GROUP BY 1,2,3,4,5,6,7,8
GROUP BY 1,2,3,4,5,6,7,8,9
),
"""

@@ -167,7 +203,6 @@
housing_unit AS (
SELECT
person_id,
-- TODO(#27428): Once source of facility ID in TN is reconciled, this can be removed
CAST(NULL AS STRING) AS facility_id,
housing_unit AS unit_id
FROM `{{project_id}}.{{normalized_state_dataset}}.state_incarceration_period`
@@ -180,28 +215,19 @@
SELECT
person_id,
-- TODO(#27428): Once source of facility ID in TN is reconciled, this can be removed
CAST(NULL AS STRING) AS facility_id,
IF(complex_number=building_number, complex_number, complex_number || " " || building_number) AS unit_id
FROM current_bed_stay
UNION ALL
--TODO(#24959): Deprecate usage when re-run is over
-- TODO(#27428): Once source of facility ID in TN is reconciled, this can be removed
SELECT
person_id,
-- TODO(#27428): Once source of facility ID in TN is reconciled, this can be removed
COALESCE(RequestedSiteID, ActualSiteID, AssignedSiteID) AS facility_id,
COALESCE(RequestedUnitID, ActualUnitID, AssignedUnitID) AS unit_id,
FROM `{{project_id}}.{{us_tn_raw_data_up_to_date_dataset}}.CellBedAssignment_latest` c
INNER JOIN `{{project_id}}.{{normalized_state_dataset}}.state_person_external_id` pei
ON c.OffenderID = pei.external_id
AND pei.state_code = "US_TN"
-- The latest assignment is not always the one with an open assignment. This can occur when someone is assigned to a facility
-- but temporarily sent to another facility (e.g. Special needs facility). Most people only have 1 open assignment, unless
-- they are currently temporarily sent elsewhere (~11 people out of 18k) so we further deduplicate by choosing the latest assignment
WHERE EndDate IS NULL
QUALIFY ROW_NUMBER() OVER(PARTITION BY OffenderID ORDER BY CAST(AssignmentDateTime AS DATETIME) DESC) = 1
facility_id,
unit_id
FROM `{{project_id}}.analyst_data.us_tn_cellbed_assignment_raw_materialized`
),
"""

@@ -279,6 +305,7 @@
custody_level.custody_level,
ic.admission_date,
ic.release_date,
ic.us_tn_facility_admission_date,
FROM
incarceration_cases_wdates ic
LEFT JOIN custody_level
@@ -306,6 +333,7 @@
custody_level,
admission_date,
release_date,
us_tn_facility_admission_date,
opportunities_aggregated.all_eligible_opportunities,
portion_served_needed,
GREATEST(portion_needed_eligible_date, months_remaining_eligible_date) AS sccp_eligibility_date,
