# Clustering Neighborhoods in Pittsburgh for Pharmacy Location Intelligence

This is a project for my IBM Data Science Capstone on Coursera.

# Introduction

The importance of pharmacies in neighborhoods cannot be overemphasized. About 70% of Americans take at least one prescription drug, and over 50% take at least two. In some communities, pharmacists may be the sole source of clinical advice.

Some experts argue that, given the widening scope of services many pharmacies provide, including physicals, immunizations, drug counseling, sexually transmitted infection screening, other laboratory testing, and even access to naloxone (the medication used to reverse opioid overdose), pharmacies are an increasingly important part of the national conversation around health care.

This project focuses on clustering Pittsburgh neighborhoods for pharmacy location intelligence. I will use the Foursquare API for location analysis and perform data mining and data preparation with libraries such as pandas and NumPy. I will segment and cluster neighborhoods using the k-means algorithm, and use the folium library to visualize the neighborhoods and their respective clusters.
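
As a rough illustration of the clustering step, the sketch below runs scikit-learn's `KMeans` on a tiny, made-up feature table. The feature columns (pharmacy count and fatal overdose count per ZIP code), the values, and the choice of two clusters are assumptions for illustration only, not results from this project.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical per-ZIP features: [pharmacy count, fatal overdose count]
features = np.array([
    [4, 1], [5, 2], [4, 2],    # many pharmacies, few overdoses
    [0, 9], [1, 10], [0, 11],  # few pharmacies, many overdoses
], dtype=float)

# Two clusters is an arbitrary choice for this toy example
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
labels = kmeans.labels_

print(labels)                   # cluster id assigned to each row
print(kmeans.cluster_centers_)  # mean feature vector of each cluster
```

Because k-means relies on Euclidean distance, features on very different scales would usually be standardized before clustering.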

# Business Problem

Pittsburgh is a city in the state of Pennsylvania with a population of about 302,407 (2018) and is the county seat of Allegheny County. Allegheny County has historically had one of the oldest populations of any county in the US, which underscores the importance of pharmacies in the county's neighborhoods. Seniors with healthcare needs such as prescription refills may not be able to drive miles to a pharmacy, which can keep them from taking their medications consistently.

In areas with a demonstrable need for access to naloxone, pharmacy closures or a lack of pharmacies can frustrate treatment. Research conducted in Cook County, Illinois shows that community areas where opioid-related deaths are higher than the Chicago average are areas that, as of 2017, had one or fewer active pharmacies. “A lot of public attention focuses on insurance, but that’s not enough, even if medications are affordable, if the pharmacy isn’t accessible, they're not accessible.” (Dima Qato)

This project is targeted at independent pharmacies, chain pharmacies, the City of Pittsburgh, and other healthcare stakeholders.


# Data

I will use Foursquare location data for the clustering analysis. In addition to the Foursquare data, I will use web scraping to collect the list and addresses of pharmacies in the Pittsburgh area for mapping. Pittsburgh census data by ZIP code from Data.gov and a list of overdoses in Pittsburgh (2017) by ZIP code will also be used in this project.

In [2]:
import pandas as pd #library to handle data analysis

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import csv

import numpy as np # library to handle data in a vectorized manner

import json # library to handle JSON files

from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from bs4 import BeautifulSoup

import requests

from pandas import json_normalize # transform JSON into a pandas dataframe (top-level import, pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors
import matplotlib.pyplot as plt
%matplotlib inline

#import folium # map rendering library

# import k-means from clustering stage
from sklearn.cluster import KMeans

print('libraries imported')

libraries imported


##### Loading Pittsburgh ZIP codes with their latitude and longitude

In [3]:
path=r'C:\Users\Olawale\Desktop\IBM Data Science\PittZip_geocode.csv'
pitt_geodata=pd.read_csv(path, engine='python')

#Creating a cleaner dataframe pitts_neborhood
pitts_neborhood=pitt_geodata[['postcode', 'latitude', 'longitude']]
pitts_neborhood.head()

Unnamed: 0,postcode,latitude,longitude
0,15122,40.3651,-79.8973
1,15201,40.4722,-79.9529
2,15202,40.5015,-80.0682
3,15203,40.4256,-79.9771
4,15204,40.4509,-80.0552


##### Using the Foursquare API to explore Pittsburgh neighborhood data

In [4]:
# The code was removed by Watson Studio for sharing.

CLIENT_ID, CLIENT_SECRET & VERSION Processed


In [7]:
neighborhood_latitude = pitts_neborhood.loc[9, 'latitude'] # neighborhood latitude value
neighborhood_longitude = pitts_neborhood.loc[9, 'longitude'] # neighborhood longitude value
neighborhood_zip = pitts_neborhood.loc[9, 'postcode'] # neighborhood ZIP code

print('Latitude and longitude values of {} are {}, {}.'.format(neighborhood_zip, 
                                                               neighborhood_latitude, 
                                                               neighborhood_longitude))

Latitude and longitude values of 15209 are 40.4976, -79.9781.


In [8]:
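# Call the Foursquare venues/explore endpoint around the sample ZIP code
search_query='15209' # not used in the request URL below
radius=500 # search radius in meters
LIMIT=100 # maximum number of venues to return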


url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
    CLIENT_ID, CLIENT_SECRET, neighborhood_latitude, neighborhood_longitude, VERSION, radius, LIMIT)


results = requests.get(url).json()
results

{'meta': {'code': 200, 'requestId': '5e548c771a4b0a001b39d2a0'},
 'response': {'headerLocation': 'Shaler Township',
  'headerFullLocation': 'Shaler Township',
  'headerLocationGranularity': 'city',
  'totalResults': 8,
  'suggestedBounds': {'ne': {'lat': 40.5021000045, 'lng': -79.97219336328281},
   'sw': {'lat': 40.493099995499996, 'lng': -79.98400663671718}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bb3d58114cfd13aa89e16ab',
       'name': "Rita's Italian Ice & Frozen Custard",
       'location': {'address': '1320 Babcock Blvd',
        'lat': 40.49596561608586,
        'lng': -79.97646061587949,
        'labeledLatLngs': [{'label': 'display',
          'lat': 40.49596561608586,
          'lng': -79.97646061587949}],
        'distance': 228,
        'postalCode
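
The nested response above is easier to work with once it is flattened into a dataframe. The sketch below shows one way this might be done with `pandas.json_normalize`. The `sample_results` dictionary is a hand-made stand-in for `results` that copies only fields visible in the output above, and the name `nearby_venues` is an assumption.

```python
import pandas as pd

# Hand-made stand-in for `results`, copying only fields visible in the output above
sample_results = {
    'response': {
        'groups': [{
            'items': [{
                'venue': {
                    'id': '4bb3d58114cfd13aa89e16ab',
                    'name': "Rita's Italian Ice & Frozen Custard",
                    'location': {
                        'address': '1320 Babcock Blvd',
                        'lat': 40.49596561608586,
                        'lng': -79.97646061587949,
                        'distance': 228,
                    },
                },
            }],
        }],
    },
}

# Each item holds one venue; nested keys become dotted column names
items = sample_results['response']['groups'][0]['items']
venues = pd.json_normalize(items)

# Keep a few columns and drop the 'venue.' / 'location.' prefixes
keep = ['venue.name', 'venue.location.lat',
        'venue.location.lng', 'venue.location.distance']
nearby_venues = venues[keep].copy()
nearby_venues.columns = [col.split('.')[-1] for col in keep]

print(nearby_venues)
```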

##### Web scraping the list of Pittsburgh pharmacies

In [9]:
html=requests.get('https://www.rxlist.com/pharmacy/pittsburgh-pa_pharmacies.htm').text # page HTML as text
soup=BeautifulSoup(html, 'lxml')

In [10]:
tags = soup('p') # all <p> elements on the page
pharmalist = []

for tag in tags:
    if len(tag) < 1: continue # skip empty tags
    words = tag.text
    if "Pharmacy" not in words: continue # keep only paragraphs that mention a pharmacy
    for line in words.splitlines():
        entry = line.strip()
        if entry.startswith('Find'): continue # skip lines that start with "Find"
        pharmalist.append(entry)
        print(entry)
#print(pharmalist[0:9])

Rite Aid Pharmacy #10937 623 Smithfield StPittsburgh,PA 15222 (412) 471-8882
CVS Pharmacy #5100 429 Smithfield StPittsburgh,PA 15222 (412) 261-4846
CVS Pharmacy #4008 242 5th AvePittsburgh,PA 15222 (412) 566-2619
AHN Pharmacy #5 120 5th Ave FL 3Pittsburgh,PA 15222 (412) 471-5901
CVS Pharmacy #4133 520 Penn AveWilkinsburg,PA 15221 (412) 243-6048
Rite Aid Pharmacy #10936 519 Penn AvePittsburgh,PA 15222 (412) 391-0969
Mercy Health Center Pharmacy 1515 Locust St Ste 1Pittsburgh,PA 15219 (412) 232-7672
Giant Eagle Pharmacy #477 318 Cedar AvePittsburgh,PA 15212 (412) 321-3553
Rite Aid Pharmacy #10928 623 E Ohio StPittsburgh,PA 15212 (412) 322-1566
Medicine Shoppe Pharmacy #1846 330 S 9th St Ste 180Pittsburgh,PA 15203 (412) 697-4880
Rite Aid Pharmacy #4965 201 Grace StPittsburgh,PA 15211 (412) 381-1464
AGH Apothecary Pharmacy 320 E North Ave Ste 111Pittsburgh,PA 15212 (412) 359-8677
Giant Eagle Pharmacy #61 2021 Wharton StPittsburgh,PA 15203 (412) 488-1802
Rite Aid Pharmacy #10921 1915 E Cars
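
Since the rest of the data is keyed by ZIP code, each scraped line needs its ZIP code pulled out before it can be joined with anything else. The sketch below is one possible approach using a regular expression. It assumes every entry follows the `...,PA <5-digit ZIP> (<phone>)` layout seen in the output above; `extract_zip` and `pharmacies_per_zip` are hypothetical names.

```python
import re
from collections import Counter

# First five lines of the scraped output above, used as sample input
sample_lines = [
    "Rite Aid Pharmacy #10937 623 Smithfield StPittsburgh,PA 15222 (412) 471-8882",
    "CVS Pharmacy #5100 429 Smithfield StPittsburgh,PA 15222 (412) 261-4846",
    "CVS Pharmacy #4008 242 5th AvePittsburgh,PA 15222 (412) 566-2619",
    "AHN Pharmacy #5 120 5th Ave FL 3Pittsburgh,PA 15222 (412) 471-5901",
    "CVS Pharmacy #4133 520 Penn AveWilkinsburg,PA 15221 (412) 243-6048",
]

# Assumed layout: the ZIP code is the five digits that follow ",PA "
ZIP_PATTERN = re.compile(r",PA (\d{5})\b")

def extract_zip(line):
    """Return the ZIP code found in a scraped line, or None if there is none."""
    match = ZIP_PATTERN.search(line)
    return match.group(1) if match else None

zips = [extract_zip(line) for line in sample_lines]

# Count pharmacies per ZIP code, ignoring lines without a match
pharmacies_per_zip = Counter(z for z in zips if z is not None)
print(pharmacies_per_zip)
```

Returning `None` for unmatched lines, rather than raising, keeps truncated or malformed entries from stopping the loop.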

##### Loading the Pittsburgh 2017 fatal overdose incident file by ZIP code

In [12]:
path=r"C:\Users\Olawale\Documents\crimelabaccidentaldrugdeathsextract2017.csv"
pitt_opioid=pd.read_csv(path, engine='python')
pitt_opioid.head()

Unnamed: 0,Death Date,Death Time,Manner of Death,Age,Sex,Race,Case Dispo,Combined OD1,Combined OD2,Combined OD3,Combined OD4,Combined OD5,Combined OD6,Combined OD7,Incident Zip,Decedent Zip,Case Year
0,1/1/2017,5:55 AM,Accidents,57,Male,Black or African American,MO,Alcohol,Cocaine,Fentanyl,,,,,15219,15202.0,2017
1,1/1/2017,8:12 AM,Accident,39,Male,White,MO,Amitriptyline,Heroin,Nortriptyline,Oxymorphone,,,,15216,15216.0,2017
2,1/1/2017,9:13 AM,Accident,20,Male,White,MO,Diphenhydramine,Fentanyl,Mirtazapine,,,,,15101,15101.0,2017
3,1/1/2017,2:11 PM,Accident,28,Male,White,MO,Fentanyl,Furanyl Fentanyl,Heroin,U-47700 Synthetic Opioid,,,,15210,15210.0,2017
4,1/1/2017,5:21 PM,Accident,29,Male,White,MO,Fentanyl,,,,,,,15206,15206.0,2017
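
To relate overdoses to neighborhoods, the incident rows can be counted per ZIP code and joined to the ZIP-code table. The following is a minimal sketch under stated assumptions: the rows in `incidents` and `zip_geo` are made up, and only the column names `Incident Zip` and `postcode` come from the dataframes above. The name `overdose_count` is hypothetical.

```python
import pandas as pd

# Made-up incident rows; only the 'Incident Zip' column name comes from the file above
incidents = pd.DataFrame({'Incident Zip': [15219, 15216, 15219, 15210, 15219]})

# Made-up ZIP-code table with the same key column as pitts_neborhood
zip_geo = pd.DataFrame({'postcode': [15219, 15216, 15201]})

# Count incidents per ZIP code and rename the key to match zip_geo
overdose_counts = (
    incidents.groupby('Incident Zip')
    .size()
    .reset_index(name='overdose_count')
    .rename(columns={'Incident Zip': 'postcode'})
)

# Left join keeps every ZIP code; ZIP codes with no incidents get a count of 0
zip_summary = zip_geo.merge(overdose_counts, on='postcode', how='left')
zip_summary['overdose_count'] = zip_summary['overdose_count'].fillna(0).astype(int)

print(zip_summary)
```

A left join is used so that ZIP codes with no recorded incidents stay in the table instead of being dropped.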
