I learned at a webinar on Shopify's internship application process that applicants would need to submit a personal coding project and walk through it. I decided to build something from scratch by the deadline, and I wanted my project to stand apart from everyone else's.
I chose to focus on the opioid overdose crisis because of my interest in social-justice issues. It's personal, too: I lost my cousin Vince, my uncle Kevin, and my friend Xephiral. The theme is relevant to my application as well: Shopify empowers merchants and consumers, but you can't become empowered if you're dead.
The opioid overdose crisis sits at the nexus of two intractable social forces: (1) rampant overprescribing of addictive medications, and (2) a toxic, unregulated street drug supply. No one deserves to die because of either. People who overdose have a chance to get better: once revived, they can survive. Naloxone, a medication that can temporarily reverse an opioid overdose, is available free at pharmacies in Toronto and across Ontario.
This project has three parts. First, I use data visualization tools and techniques to tell the story of the overdose epidemic. Second, I've built a simple tool based on a geospatial dataset that shows people the locations of nearby recovery resources, including pharmacies where they can get naloxone. Anyone who carries naloxone can save a life. Third, I try to build a predictive model based on time-series data about overdose deaths in the province of Ontario.
NOTE: There are three modules to the project: Module 1: Data Visualization, Module 2: Overdose Resources Locator and Module 3: Predictive Model.
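As a rough illustration of the idea behind Module 2, the sketch below ranks resources by great-circle (haversine) distance from a user's coordinates. It is a minimal example and not the project's actual code: the function names, the `lat`/`lon` dictionary keys, and the sample entries are all hypothetical placeholders.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_resources(user_lat, user_lon, resources, limit=3):
    """Return up to `limit` resources sorted by distance from the user, nearest first."""
    ranked = sorted(
        resources,
        key=lambda r: haversine_km(user_lat, user_lon, r["lat"], r["lon"]),
    )
    return ranked[:limit]


# Hypothetical placeholder entries, NOT rows from the project's dataset.
SAMPLE_RESOURCES = [
    {"name": "Placeholder A", "lat": 43.70, "lon": -79.40},
    {"name": "Placeholder B", "lat": 43.66, "lon": -79.38},
    {"name": "Placeholder C", "lat": 45.42, "lon": -75.70},
]

if __name__ == "__main__":
    for resource in nearest_resources(43.65, -79.38, SAMPLE_RESOURCES, limit=2):
        print(resource["name"])
```

The ranking is only as good as the user coordinates passed in, so a coarse location estimate will produce a coarse ordering.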
Module 2 was set up using an Anaconda virtual environment:
conda create --name shopify flask numpy pandas requests
Activate the environment (conda activate shopify), then run the following from a command prompt:
conda install -c conda-forge folium
Similarly, Module 3 can be run in a virtual environment:
conda install matplotlib numpy pandas scikit-learn statsmodels
Run the following from a command prompt whilst inside the environment:
conda install -c conda-forge pmdarima
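To confirm that an environment has everything the two modules need, a quick check along these lines may help. It is only a sketch: the helper name and the two lists are my own, built from the packages named in the commands above. Note that scikit-learn is imported under the name `sklearn`.

```python
from importlib.util import find_spec

# Import names for the packages listed in the install commands above.
# scikit-learn is imported as `sklearn`; the others match their package names.
MODULE_2_IMPORTS = ["flask", "numpy", "pandas", "requests", "folium"]
MODULE_3_IMPORTS = ["matplotlib", "numpy", "pandas", "sklearn", "statsmodels", "pmdarima"]


def missing_packages(import_names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    for label, names in [("Module 2", MODULE_2_IMPORTS), ("Module 3", MODULE_3_IMPORTS)]:
        missing = missing_packages(names)
        print(label, "OK" if not missing else "missing: " + ", ".join(missing))
```

Run it from inside the activated environment; any name reported as missing can then be installed with conda.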
Otherwise, the directory structure of the repository should be fairly self-explanatory. For instance, CSV files are in the csv directory.
- Data visualization: I believe I've done a reasonably good job of bringing the data to life using visualization best practices while attempting to conform to Shopify's Polaris design system.
- Overdose resource locator: There are a number of improvements to come. The tool currently relies on IP-based geolocation to identify the user's location, which is not sufficiently accurate. As a next step, it will be rewritten in a combination of JavaScript and Python and deployed for online access using Heroku or a similar platform. This will improve ease of use and drastically increase location accuracy.
- Predictive model: This needs a lot more work. I tried many different approaches, ran into various roadblocks, and learned a ton. But there is more to explore before I can build the best model possible.
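One way to judge whether any future version of the predictive model is an improvement is to compare it against a simple baseline on held-out data. The sketch below is a seasonal-naive baseline (repeat the last observed season) scored by mean absolute error. It is not the approach used in Module 3, and the series is synthetic, generated by a formula purely for illustration, not overdose data.

```python
def seasonal_naive_forecast(history, horizon, season_length=12):
    """Forecast `horizon` steps by repeating the last observed season."""
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    last_season = history[-season_length:]
    return [last_season[i % season_length] for i in range(horizon)]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between paired actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def holdout_mae(series, horizon, season_length=12):
    """Hold out the last `horizon` points and score the baseline on them."""
    train, test = series[:-horizon], series[-horizon:]
    forecast = seasonal_naive_forecast(train, horizon, season_length)
    return mean_absolute_error(test, forecast)


# Synthetic series (NOT real data): a 4-step repeating pattern that rises
# by one unit each season.
SYNTHETIC = [10 + (i % 4) + i // 4 for i in range(16)]

if __name__ == "__main__":
    print(holdout_mae(SYNTHETIC, horizon=4, season_length=4))
```

A candidate model that cannot beat this baseline's error on the same holdout window is probably not yet capturing more than seasonality.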
- Toronto Overdose Information System
- Drug Testing Results, Ottawa
- Opioid-related Harms in Canada
- Canadian Institute for Health Information
- Ontario Public Drug Programs Narcotics Monitoring System tracked opioids dataset
- Interactive Opioid Tool | Public Health Ontario
I reviewed the following references whilst working on Revival = Survival. Many of them include details on code, packages, and tactics that I did not end up using; however, they are all worth visiting.
- Open Street Map
- Shopify Polaris
- A tour of the top 5 sorting algorithms with Python code
- ArcGIS City of Toronto basemap
- City of Toronto Open Data Project: Neighbourhoods
- Using Folium to Visualize Distribution of Public Services in 140 Toronto Neighbourhoods
- Geocoding a Location Using Python and Flask
- Geographic Data with Basemap
- Mapping in Python – QuantEcon DataScience
- Geo-sorting: Using Device Geolocation to Sort by Distance
- Map Stack | Stamen Design
- How to find the distance between two lat-long coordinates in Python
- OpenStreetMap data analysis: how to parse the data with Python?
- Mapping with Matplotlib, Pandas, Geopandas and Basemap in Python
- GeoPandas 101
- Creating a GeoDataFrame from a DataFrame with coordinates
- Recognize and temporarily reverse an opioid overdose
- GitHub for Isle of Insanity and Grief: Overcoming my son's overdose and death
- Time Series Analysis and Forecasting with ARIMA
- Step by Step Time Series Analysis
- Time Series Prediction using SARIMAX
- SARIMAX: Introduction — statsmodels
- Vector Autoregression
- pmdarima: ARIMA estimators for Python
There is not as much out there as you would think when it comes to opioid overdose and data science, but here are a few influential articles that inspired me. Note that in some cases, their approaches were only possible because of access to large amounts of privacy-protected data.
- Epidemiological and geospatial profile of the prescription opioid crisis in Ohio, United States
- Relapse trigger: Predicting stress with A.I.
- Patterns in Accidental Drug overdose fatalities
- White paper: Data and Analytics to Combat the Opioid Epidemic
- Towards automating location-specific opioid toxicosurveillance from Twitter via data science methods