# COGS 108 - Project Proposal

## Authors


- Zilong Feng – Analysis, Data curation, Visualization
- Alice Zhang – Background research, Writing – original draft
- Bob Li – Methodology, Data curation, Writing – review & editing

## Research Question

Our research question is: What is the relationship between the passage of Oregon's Ballot Measure 110 and the changes in housing prices and opioid-related events in Portland from 2020 to 2021? Specifically, we will investigate the correlations between opioid overdoses, hospitalizations, and housing market indicators such as median selling prices and the number of days a home is listed on Zillow before sale.

## Background and Prior Work

While prior research has examined the relationship between drug policy changes and public health outcomes, as well as housing instability among populations affected by substance use, few studies have explored how drug decriminalization policies may coincide with shifts in housing market dynamics at the city level. Our project builds on this work by examining population-level trends in Portland before and after the passage of Measure 110.

Existing literature has largely focused on either public health or housing in isolation. For instance, research on housing instability and opioid-related harms suggests that these issues often co-occur, particularly in contexts of rapid policy change². Another strand of literature has evaluated the impact of Measure 110 on overdose mortality, noting that decriminalization itself was not associated with increased fatal overdoses once the spread of fentanyl was accounted for¹. However, there remains a gap in understanding how such policy shifts intersect with urban housing metrics—such as median selling prices and market velocity—especially in cities like Portland where housing affordability and drug policy have both been salient public issues³.

To address our research question, we will use a longitudinal analysis comparing pre- and post-Measure 110 data, while controlling for relevant confounders such as broader economic trends and the influx of fentanyl into the drug supply⁴.

This study contributes to the literature by bridging housing economics with drug policy evaluation, offering a more integrated perspective on how decriminalization may correlate with changes in urban living conditions. If associations are found, they could inform future policy design that simultaneously considers public health and housing stability.

¹ Joukov, Artem M. “Can You Take Me Higher? No. 110, Drug Decriminalization, and Residential Real Estate Prices.” 28 Oct. 2024.

² Staudt, Sarah. “Oregon Shouldn’t Go Backwards on Drug Decriminalization.” www.prisonpolicy.org, 15 Feb. 2024, www.prisonpolicy.org/blog/2024/02/15/oregon-110/.

³ Swensen, Isaac D., and Cody Tuttle. "The Effects of the Opioid Crisis on Housing Markets." Journal of Urban Economics, vol. 132, 2022, p. 103517. https://doi.org/10.1016/j.jue.2022.103517.

⁴ Oregon Health Authority. "Oregon’s Statewide Drug Overdose Surveillance Dashboard." OHA Public Health Division, Updated Feb. 2024. https://www.oregon.gov/oha/ph/diseasesconditions/opioids/pages/data-dashboard.aspx.

## Hypothesis


Based on the findings of prior research and the observed trends surrounding Measure 110, we hypothesize that the decriminalization policy will be negatively associated with housing market indicators in Portland. Specifically, we expect that the period following the passage of Measure 110 will correlate with a decrease in median housing prices and an increase in the average number of days properties remain listed on Zillow, concurrent with an increase in opioid-related overdose and hospitalization rates.
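As a rough illustration of how this hypothesis could be checked, the sketch below compares pre- and post-Measure 110 means and computes a Pearson correlation on a monthly table. The column names (`month`, `median_price`, `days_on_market`, `overdose_deaths`), the helper `pre_post_change`, and the choice of November 2020 (the month the measure passed) as the cutoff are all assumptions for illustration. The numbers are made-up placeholders, not values from our datasets.

```python
import pandas as pd


def pre_post_change(monthly, cutoff, cols):
    """Return mean(post) - mean(pre) for each column in cols.

    `monthly` needs a 'month' column of 'YYYY-MM' strings; months at or
    after `cutoff` are labeled 'post' (an assumed convention).
    """
    labeled = monthly.assign(
        period=["post" if m >= cutoff else "pre" for m in monthly["month"]]
    )
    means = labeled.groupby("period")[cols].mean()
    return means.loc["post"] - means.loc["pre"]


# Placeholder values only; they carry no evidential weight.
monthly = pd.DataFrame({
    "month": ["2020-09", "2020-10", "2020-11", "2020-12"],
    "median_price": [100, 110, 120, 130],
    "days_on_market": [10, 12, 9, 7],
    "overdose_deaths": [40, 44, 50, 54],
})

cols = ["median_price", "days_on_market", "overdose_deaths"]
change = pre_post_change(monthly, cutoff="2020-11", cols=cols)

# Pearson correlation between two monthly series (pandas default method).
r = monthly["median_price"].corr(monthly["overdose_deaths"])

print(change)
print(round(r, 3))
```

A difference in means or a correlation on its own would not separate the effect of the measure from broader trends, so a comparison like this is only a first descriptive step.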

## Data

To address our research question, we will use multiple real-world datasets that contain information on housing prices and opioid-related incidents in the Portland, Oregon area.

1. Portland Housing Prices Data: We will use the dataset “Portland Housing Prices with Time on Market” from Kaggle, which provides information on median sale prices and “days on Zillow” for houses in Portland before and after the implementation of Measure 110. This dataset includes variables such as city, sale date, median price, and average days on market, which we will use to measure changes in housing affordability and market velocity over time.
Dataset URL:
https://www.kaggle.com/datasets/threnjen/portland-housing-prices-sales-jul-2020-jul-2021

2. Opioid Overdose and Drug-Related Mortality Data: To measure opioid-related outcomes, we will use provisional drug overdose death data from the Centers for Disease Control and Prevention (CDC). This dataset provides monthly overdose death counts by state and drug category, including opioids. We will subset the data to focus on Oregon and use overdose death counts as a proxy for opioid-related harm during the study period.
Dataset URL:
https://data.cdc.gov/NCHS/Provisional-Drug-Overdose-Death-Counts/xkb8-kh2a

3. Supplemental Public Health Data: If available, we will supplement with county-level health statistics from public sources such as the CDC WONDER database or the Oregon Health Authority to capture opioid overdose mortality and hospitalization rates. These data can provide additional validation of the patterns seen in the CDC overdose death counts.

All datasets we plan to use are publicly accessible and do not contain personally identifiable information (PII). Before analysis, we will clean and aggregate the data as needed to align geographic units (e.g., ZIP code or county) and time ranges, ensuring consistency across sources. We will then merge housing and opioid data by common temporal and geographic identifiers to explore correlations and trends.
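The cleaning and merging steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the column names (`sale_date`, `price`, `days_on_market`, `state`, `report_date`, `deaths`) and the helper names are hypothetical and would need to be mapped to the actual Kaggle and CDC schemas, and the inline rows are toy values rather than real data.

```python
import pandas as pd


def add_month(df, date_col):
    """Return a copy of df with a 'month' column formatted as 'YYYY-MM'."""
    out = df.copy()
    out["month"] = pd.to_datetime(out[date_col]).dt.strftime("%Y-%m")
    return out


def merge_monthly(housing, overdoses, state="OR"):
    """Aggregate both sources to monthly rows and join them on month."""
    # One row per month: median sale price and mean days on market.
    housing_m = (
        add_month(housing, "sale_date")
        .groupby("month", as_index=False)
        .agg(
            median_price=("price", "median"),
            mean_days_on_market=("days_on_market", "mean"),
        )
    )
    # Keep only the state of interest, then total deaths per month.
    subset = overdoses[overdoses["state"] == state]
    overdoses_m = (
        add_month(subset, "report_date")
        .groupby("month", as_index=False)
        .agg(overdose_deaths=("deaths", "sum"))
    )
    # Inner join keeps only months present in both sources.
    merged = housing_m.merge(overdoses_m, on="month", how="inner")
    return merged.sort_values("month").reset_index(drop=True)


# Toy rows for illustration only.
housing = pd.DataFrame({
    "sale_date": ["2020-07-03", "2020-07-20", "2020-08-11", "2020-09-02"],
    "price": [450000, 470000, 480000, 500000],
    "days_on_market": [10, 14, 9, 12],
})
overdoses = pd.DataFrame({
    "state": ["OR", "WA", "OR"],
    "report_date": ["2020-07-01", "2020-07-01", "2020-08-01"],
    "deaths": [40, 90, 45],
})

merged = merge_monthly(housing, overdoses)
print(merged)
```

An inner join silently drops months that are missing from either source (September in the toy rows), so checking the merged row count against the expected number of months is a useful sanity check.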

## Ethics 

Instructions: Keep the contents of this cell. For each item on the checklist
-  put an X there if you've considered the item
-  IF THE ITEM IS RELEVANT place a short paragraph after the checklist item discussing the issue.
  
Items on this checklist are meant to provoke discussion among good-faith actors who take their ethical responsibilities seriously. Your teams will document these discussions and decisions for posterity using this section.  You don't have to solve these problems, you just have to acknowledge any potential harm no matter how unlikely.

Here is a [list of real world examples](https://deon.drivendata.org/examples/) for each item in the checklist that you can refer to.

[![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

### A. Data Collection
 - [X] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?


 - [X] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
>Drug incidence data is often a proxy for policing intensity rather than actual drug usage rates. Low-income neighborhoods or areas with more public transit may show higher drug incidence in police records because of increased surveillance, not necessarily because drug use is higher than in high-income areas.
 - [X] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
>We will ensure that we use aggregated data rather than individual-level data.
 - [X] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
>Our project involves "protected group status" implicitly through geographic proxies (zip codes). There is a risk that findings could be used to further stigmatize specific Portland neighborhoods or justify "redlining-like" behavior in real estate. We should attempt to discuss the socioeconomic confounding variables (like poverty rates or historical disinvestment) that impact both housing prices and public health outcomes.

### B. Data Storage
 - [X] **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?
>While the datasets are public, we will maintain a secure workflow to ensure data security. All data will be processed and stored in a private GitHub repository. Only team members and course staff will have access.
 - [X] **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed?
 - [X] **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed?
>Our team will retain the data only for the duration of the Winter 2026 quarter. Once we have received our final grade, we will delete all local copies of the dataset and archive the GitHub repository.


### C. Analysis
 - [X] **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?
 - [X] **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?
 - [X] **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?
 - [X] **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?
 - [X] **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?

### D. Modeling
 - [X] **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory?
 - [X] **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?
>We are aware of potential confounds related to race, financial background, and mental illness, all of which are well documented in Measure 110's history. Because our models will not focus on any of these attributes, we expect the results to generalize fairly across groups. We may revisit group-level differences in our conclusion if only a weak correlation is found.
 - [X] **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?
>We plan to choose any additional metrics based on our background research. We have also found papers with data models that we want to replicate. Replicating these models will let us wrangle the data to work with our other datasets and then explore and analyze it further. Our choice of metrics to optimize will draw on these outside sources, and we are prepared to cite them appropriately.
 - [X] **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed?
 - [X] **D.5 Communicate limitations**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?

### E. Deployment
 - [X] **E.1 Monitoring and evaluation**: Do we have a clear plan to monitor the model and its impacts after it is deployed (e.g., performance monitoring, regular audit of sample predictions, human review of high-stakes decisions, reviewing downstream impacts of errors or low-confidence decisions, testing for concept drift)?
 - [X] **E.2 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?
>As with D.2, if we find that our model accentuates more complex relationships, for instance with race, we will include a disclaimer stating what our data examines, the relationship it aims to establish, and what it does not suggest. We want our findings communicated in a way that reflects the correlation we set out to examine, and we do not want our data used to draw correlations it was not intended to support. If users are excessively harmed, we will restrict public access to our data accordingly. This also serves as our answer to E.4.
 - [X] **E.3 Roll back**: Is there a way to turn off or roll back the model in production if necessary?
 - [X] **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed?


## Team Expectations 

- Communicate regularly and respond to messages in a timely manner
- Respect differing opinions and resolve conflicts constructively
- Share workload fairly and meet agreed-upon deadlines

## Project Timeline Proposal


| Meeting Date  | Meeting Time| Completed Before Meeting  | Discuss at Meeting |
|---|---|---|---|
| 1/20  |  1 PM | Read and review COGS 108 project expectations; brainstorm potential research topics related to housing and drug policy  | Decide on communication methods; finalize project topic; begin background research | 
| 1/26  |  10 AM | Conduct background research on Measure 110, housing prices, and opioid-related outcomes | Discuss relevant literature, ethics considerations, and potential datasets; outline project proposal |
| 2/1  | 10 AM  | Finalize and submit project proposal; identify and evaluate datasets | Discuss data wrangling plan; assign group members to specific project components |
| 2/14  | 6 PM  | Import and clean housing and overdose datasets; begin exploratory data analysis (EDA) | Review data cleaning steps; refine analysis plan; discuss preliminary findings |
| 2/23  | 12 PM  | Complete EDA and visualizations; begin formal analysis | Discuss results; refine hypotheses if needed; prepare project check-in |
| 3/13  | 12 PM  | Complete analysis; draft results, discussion, and conclusion sections | Review full project; edit for clarity and completeness |
| 3/20  | Before 11:59 PM  | NA | Submit final project and complete group project surveys |