# PharmaFeed

### The Battle of Neighborhoods Project

#### Business Problem

In recent days, demand for pharmaceutical drugs in Italy has increased sharply because of the Covid-19 pandemic. The purpose of this study is therefore to find the closest drug maker, depot or wholesaler that can satisfy a pharmacy's needs.

This notebook is therefore intended for pharmacies in the Milan area that need to find the fastest way to obtain a drug.


#### Data Section

In this study we use the data available for the city of Milan. Milan is the largest city in Lombardy, the region where the majority of cases appeared, so it is likely to benefit from this study more than other areas.

The dataset containing data on drug makers, depots and wholesalers is freely available at the municipality's website (https://dati.comune.milano.it/dataset/ds684-sanita-distributori-di-farmaci) in both GeoJSON and CSV format.
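The notebook works with the CSV file, but the GeoJSON variant could be read along the following lines. This is only a sketch on a small inline sample: GeoJSON stores point coordinates as `[longitude, latitude]`, and the assumption that the property names mirror the CSV column names has not been checked against the actual file.

```python
import json

# Inline sample in standard GeoJSON layout. Assumption: the real file uses
# property names that mirror the CSV columns; this has not been verified.
sample_geojson = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": {"type": "Point", "coordinates": [9.16597016441, 45.4440112757]},
      "properties": {"DENOMINAZIONESITOLOGISTICO": "Lofarma S.p.A.", "CAP": 20143}
    }
  ]
}
"""

def features_to_records(geojson_text):
    """Flatten point features into dicts with explicit Latitude/Longitude keys."""
    collection = json.loads(geojson_text)
    records = []
    for feature in collection["features"]:
        lon, lat = feature["geometry"]["coordinates"]  # GeoJSON order is lon, lat
        record = dict(feature["properties"])
        record["Latitude"] = lat
        record["Longitude"] = lon
        records.append(record)
    return records

records = features_to_records(sample_geojson)
print(records[0]["Latitude"], records[0]["Longitude"])
```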

Here are some useful terminology tips to better understand our dataset:

1. **CAP** is the postal code of a specific region, city or part of a city. Since Milan is the second-biggest city in Italy (after Rome), it has multiple postal codes, which divide it into zones.
2. **NIL** is the name of the neighborhood in which the distributor is located. NILs are somewhat smaller than the zones defined by the CAP, so multiple NILs can fall into the same CAP zone.
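One way to inspect the relationship between the two fields is to group the NIL names by CAP. The sketch below uses four (CAP, NIL) pairs from the first rows of the dataset; the helper name `nils_per_cap` is ours. In this small sample each CAP has a single NIL, whereas on the full dataset a CAP may list several.

```python
import pandas as pd

# Four (CAP, NIL) pairs taken from the first rows of the dataset.
sample = pd.DataFrame({
    "CAP": [20148, 20143, 20141, 20146],
    "NIL": ["SELINUNTE", "NAVIGLI", "RIPAMONTI", "BANDE NERE"],
})

def nils_per_cap(frame):
    """Return a dict mapping each CAP to the sorted list of NILs found in it."""
    grouped = frame.dropna(subset=["NIL"]).groupby("CAP")["NIL"]
    return {cap: sorted(set(nils)) for cap, nils in grouped}

cap_to_nils = nils_per_cap(sample)
print(cap_to_nils)
```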


In [1]:
import numpy as np
import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)

import requests
from bs4 import BeautifulSoup
from tabulate import tabulate

import json # library to handle JSON files
from geopy.geocoders import Nominatim # convert an address into latitude and longitude values

from pandas import json_normalize # transform a JSON file into a pandas dataframe (pandas >= 1.0)

# Matplotlib and associated plotting modules
import matplotlib.cm as cm
import matplotlib.colors as colors

# import k-means from clustering stage
from sklearn.cluster import KMeans
import warnings
import folium # map rendering library
warnings.filterwarnings('ignore') # suppress warnings; use 'once' to display each warning only once

In [2]:
#code to get Foursquare credentials from a txt file
with open('Foursquare_Credentials.txt','r') as cred:
    credentials=cred.read()
    credentials=credentials.split(',')

client_id=credentials[1]
client_secret=credentials[3]

The dataset below shows the location of **pharma distributors** in the city of **Milano, IT**

Please note that the file uses a semicolon as separator, so we have to specify this delimiter in our *read_csv()* call.

In [3]:
pharma=pd.read_csv('distributori-di-farmaci_v3_geo.csv',sep=';')
pharma.drop(['DATAFINEVALIDITA'],axis=1,inplace=True)
pharma.head()

Unnamed: 0,CODICEIDENTIFICATIVOSITO,DENOMINAZIONESITOLOGISTICO,INDIRIZZO,PARTITAIVA,CAP,CODICECOMUNEISTAT,DESCRIZIONECOMUNE,CODICEPROVINCIAISTAT,SIGLAPROVINCIA,DESCRIZIONEPROVINCIA,CODICEREGIONE,DESCRIZIONEREGIONE,DATAINIZIOVALIDITA,TIPOLOGIADITRIBUTORE,DESCRIZIONEDISTRIBUTORE,LOCALIZE,NUMEROCOMPLETO,CODICE_VIA,MUNICIPIO,ID_NIL,NIL,LONG_WGS84,LAT_WGS84,Location
0,18,Stabilimento di Milano,via matteo civitali 1,748210150,20148,15146,Milano,15,MI,Milano,30,Lombardia,2005-05-01,P,Produttore,1,1,6569.0,7.0,57.0,SELINUNTE,9.13659,45.468945,"45.4689449631, 9.13658985457"
1,27,Lofarma S.p.A.,"viale cassala, 40",713510154,20143,15146,Milano,15,MI,Milano,30,Lombardia,2005-05-01,P,Produttore,1,40,5275.0,6.0,44.0,NAVIGLI,9.16597,45.444011,"45.4440112757, 9.16597016441"
2,28,Mipharm S.p.A.,"via b. quaranta, 12",12304990158,20141,15146,Milano,15,MI,Milano,30,Lombardia,2005-05-01,P,Produttore,1,,,5.0,38.0,RIPAMONTI,9.204009,45.436331,"45.4363314, 9.2040092"
3,30,SCHWARZ PHARMA S.p.A.,"via gadames, snc",7254500155,20151,15146,Milano,15,MI,Milano,30,Lombardia,2005-05-01,P,Produttore,1,,,,,,,,
4,59,LA COMMERCIALE FARMACEUTICA srl,via desenzano 6/a,55560775,20146,15146,Milano,15,MI,Milano,30,Lombardia,2005-05-01,G,Grossista,1,6A,6610.0,7.0,52.0,BANDE NERE,9.140103,45.462419,"45.4624185522, 9.14010264607"
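To see why the separator matters, the following sketch parses one semicolon-separated row (two columns taken from the dataset) with and without `sep=';'`. With the default comma separator the whole line ends up in a single column.

```python
import io
import pandas as pd

# Two columns and one row from the dataset, written with ';' as the separator.
raw = "DENOMINAZIONESITOLOGISTICO;INDIRIZZO\nStabilimento di Milano;via matteo civitali 1\n"

default_sep = pd.read_csv(io.StringIO(raw))            # comma is assumed
semicolon_sep = pd.read_csv(io.StringIO(raw), sep=";")  # explicit delimiter

print(default_sep.shape)    # one wide column
print(semicolon_sep.shape)  # two proper columns
```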


In [4]:
pharma[['Latitude','Longitude']]= pharma['Location'].str.split(',',expand=True)

#Here we are going to change in English the labels we will use later on
pharma.rename(columns={'DENOMINAZIONESITOLOGISTICO':'Name','INDIRIZZO':'Address','DESCRIZIONEDISTRIBUTORE':'Description'},inplace=True)
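Note that the split leaves both coordinates as strings, and the longitude keeps a leading space. The notebook converts them with `float()` later on; as an alternative, the sketch below converts whole columns at once with `pd.to_numeric`, using two `Location` values from the preview and one missing value.

```python
import pandas as pd

# 'Location' values copied from the preview; None stands for a missing one.
locations = pd.Series(["45.4689449631, 9.13658985457", "45.4363314, 9.2040092", None])

parts = locations.str.split(",", expand=True)
coords = pd.DataFrame({
    # strip stray whitespace, then convert; unparseable values become NaN
    "Latitude": pd.to_numeric(parts[0].str.strip(), errors="coerce"),
    "Longitude": pd.to_numeric(parts[1].str.strip(), errors="coerce"),
})
print(coords.dtypes)
```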

In [5]:
df=pharma[['CAP','Name','Address','Description','NIL','Latitude','Longitude']].copy() # work on a copy, not on a slice of pharma
df['Description']=df['Description'].replace({'Produttore':'Maker','Grossista':'Wholesaler','Depositario':'Depot'})
df.head()

Unnamed: 0,CAP,Name,Address,Description,NIL,Latitude,Longitude
0,20148,Stabilimento di Milano,via matteo civitali 1,Maker,SELINUNTE,45.4689449631,9.13658985457
1,20143,Lofarma S.p.A.,"viale cassala, 40",Maker,NAVIGLI,45.4440112757,9.16597016441
2,20141,Mipharm S.p.A.,"via b. quaranta, 12",Maker,RIPAMONTI,45.4363314,9.2040092
3,20151,SCHWARZ PHARMA S.p.A.,"via gadames, snc",Maker,,,
4,20146,LA COMMERCIALE FARMACEUTICA srl,via desenzano 6/a,Wholesaler,BANDE NERE,45.4624185522,9.14010264607


We now assume we are Farmacia Giambellino, a small but historic pharmacy in the south-west of Milan, located at **Via Giambellino 64, 20146 Milano**.

In [6]:
address = 'Via Giambellino 64, Milano, IT'

geolocator = Nominatim(user_agent="foursquare_agent")
location = geolocator.geocode(address)
latitude = location.latitude
longitude = location.longitude
print(latitude, longitude)

45.4495142 9.1446303
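geopy's `geocode` returns `None` when an address cannot be resolved, in which case `location.latitude` would fail. A minimal sketch of a guard is shown below. The helper `geocode_or_raise`, the `StubGeocoder` class and the `FakeLocation` tuple are hypothetical names introduced here so that the example runs offline; in the notebook the `geolocator` object would be passed instead of the stub.

```python
from collections import namedtuple

# Hypothetical offline stand-ins for a geopy geocoder and its result.
FakeLocation = namedtuple("FakeLocation", ["latitude", "longitude"])

class StubGeocoder:
    """Returns a preset result, mimicking the .geocode(address) interface."""
    def __init__(self, result):
        self.result = result

    def geocode(self, address):
        return self.result

def geocode_or_raise(geocoder, address):
    """Return (latitude, longitude), or raise ValueError if nothing is found."""
    location = geocoder.geocode(address)
    if location is None:
        raise ValueError(f"No geocoding result for: {address!r}")
    return location.latitude, location.longitude

address = 'Via Giambellino 64, Milano, IT'
found = geocode_or_raise(StubGeocoder(FakeLocation(45.4495142, 9.1446303)), address)
print(found)

missing_raised = False
try:
    geocode_or_raise(StubGeocoder(None), address)
except ValueError as err:
    missing_raised = True
    print(err)
```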


In [7]:
float(df['Latitude'][0])

45.4689449631

In [8]:
# Formula to calculate the distance between two points on Earth, given their coordinates
from math import radians, cos, sin, asin, sqrt
def haversine(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)
    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    # haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    c = 2 * asin(sqrt(a))
    # Radius of earth in kilometers is 6371
    km = 6371 * c
    return km
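For reference, the function above follows the standard haversine formula. Here $\varphi_1, \varphi_2$ are the two latitudes and $\lambda_1, \lambda_2$ the two longitudes, all in radians, and $R = 6371$ km is the Earth radius used in the code:

```latex
\begin{aligned}
a &= \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right)
   + \cos\varphi_1 \,\cos\varphi_2 \,\sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right) \\
d &= 2R \,\arcsin\!\left(\sqrt{a}\right)
\end{aligned}
```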

Now we apply the formula to each distributor in __df__

In [9]:
df['Description'][3]
df['Description'][4]

'Wholesaler'

In [10]:
lat2=latitude
lon2=longitude
types=[]
names=[]
results=[]
for a in range(0,len(df['Latitude'])):
    types.append(df['Description'][a])
    names.append(df['Name'][a])
    lat1=float(df['Latitude'][a])
    lon1=float(df['Longitude'][a])
    results.append(haversine(lon1,lat1,lon2,lat2))
print('Appending completed')

Appending completed
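The loop above could also be written as a single vectorized computation. The following is a sketch with NumPy rather than the notebook's own method: `haversine_np` is a name introduced here, and the coordinates are those of the first five rows of the preview (the fourth has no coordinates) together with the pharmacy position printed by the geocoder.

```python
import numpy as np

def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km; accepts scalars or NumPy arrays of degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, (lon1, lat1, lon2, lat2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371 * 2 * np.arcsin(np.sqrt(a))

# Coordinates of the first five distributors; NaN marks the row without a location.
lats = np.array([45.4689449631, 45.4440112757, 45.4363314, np.nan, 45.4624185522])
lons = np.array([9.13658985457, 9.16597016441, 9.2040092, np.nan, 9.14010264607])

# Pharmacy coordinates as printed by the geocoder.
distances = haversine_np(lons, lats, 9.1446303, 45.4495142)
print(distances.round(3))
```

Rows without coordinates simply yield `NaN`, and the last value should agree with the distance the notebook reports for LA COMMERCIALE FARMACEUTICA srl.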


In [11]:
dict1={'Type':types,'Names':names,'Distance':results}
distance_df=pd.DataFrame(dict1)

Print the distance dataframe to see which are the closest *n* distributors

In [12]:
distance_df.sort_values(by='Distance',ascending=True, inplace=True)
distance_df.head()

Unnamed: 0,Type,Names,Distance
21,Maker,MEDIOLANUM FARMACEUTICI SPA,0.82514
22,Maker,MEDIOLANUM FARMACEUTICI SPA,0.82514
68,Maker,NEOPHARMED GENTILI SRL,0.82514
42,Wholesaler,Robecchi srl,1.217115
4,Wholesaler,LA COMMERCIALE FARMACEUTICA srl,1.477717
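If only the closest *n* rows are needed, `DataFrame.nsmallest` is an alternative to sorting the whole frame. The sketch below uses four rows copied from the output above, deliberately left unsorted; `distance_sample` is a name introduced for this example.

```python
import pandas as pd

# Rows copied from the sorted output above, in shuffled order.
distance_sample = pd.DataFrame({
    "Type": ["Wholesaler", "Maker", "Wholesaler", "Maker"],
    "Names": ["LA COMMERCIALE FARMACEUTICA srl", "MEDIOLANUM FARMACEUTICI SPA",
              "Robecchi srl", "NEOPHARMED GENTILI SRL"],
    "Distance": [1.477717, 0.82514, 1.217115, 0.82514],
})

n = 3
closest_n = distance_sample.nsmallest(n, "Distance")  # the n rows with the smallest distance
print(closest_n)
```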


In [13]:
min_dis=min(distance_df.Distance)
closest=distance_df.loc[distance_df['Distance']==min_dis,['Names']]
tipo=distance_df.loc[distance_df['Distance']==min_dis,['Type']]

closest_list=list(closest.Names)
type_list=list(tipo.Type)
print('The closest drug {} to Farmacia Giambellino is {}'.format(type_list[0],closest_list[0]))
print('It is just {} km away from this pharmacy'.format(round(min_dis,2)))

The closest drug Maker to Farmacia Giambellino is MEDIOLANUM FARMACEUTICI SPA
It is just 0.83 km away from this pharmacy
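The cell above reports only the first match, although the sorted table shows several rows at the same minimum distance. A sketch of listing every tied distributor once is given below. It uses the distances as displayed in the table (that is, rounded), and `distance_sample` is again a name introduced for the example.

```python
import pandas as pd

# Rows copied from the sorted table; distances are the displayed (rounded) values.
distance_sample = pd.DataFrame({
    "Type": ["Maker", "Maker", "Maker", "Wholesaler"],
    "Names": ["MEDIOLANUM FARMACEUTICI SPA", "MEDIOLANUM FARMACEUTICI SPA",
              "NEOPHARMED GENTILI SRL", "Robecchi srl"],
    "Distance": [0.82514, 0.82514, 0.82514, 1.217115],
})

min_distance = distance_sample["Distance"].min()  # NaN distances are skipped
tied = distance_sample[distance_sample["Distance"] == min_distance]
unique_names = tied["Names"].drop_duplicates().tolist()
print(unique_names)
```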


**Creating a Map**

In [14]:
df=df.dropna() # drop distributors without coordinates
venues_map = folium.Map(location=[latitude, longitude], zoom_start=13,width='100%', height='100%') # generate map centred on Farmacia Giambellino


# add Farmacia Giambellino as a red circle marker
folium.features.CircleMarker(
    [float(latitude), float(longitude)],
    radius=10,
    popup=folium.Popup('Farmacia Giambellino'),
    fill=True,
    color='red',
    fill_color='red',fill_opacity=0.6).add_to(venues_map)


# add the distributors to the map as blue circle markers
for lat, lng, label in zip(df.Latitude, df.Longitude, df.Name):
    folium.features.CircleMarker(
        [float(lat), float(lng)],
        radius=5,
        popup=folium.Popup(label),
        fill=True,
        color='blue',
        fill_color='blue',fill_opacity=0.6).add_to(venues_map)

# display map
venues_map
