# Capstone Project - The Battle of the Neighborhoods
### Applied Data Science Capstone by IBM/Coursera

## Table of contents
* [Introduction: Business Problem](#introduction)
* [Data](#data)
* [Methodology](#methodology)
* [Analysis](#analysis)
* [Results and Discussion](#results)
* [Conclusion](#conclusion)

## Business problem

Belém is a poor Amazonian city and the capital of the state of Pará, Brazil. It has about 1.5 million residents. Over the last decades the city has grown very fast, but the growth was not organized. While some areas developed well, with many new businesses large and small, others were simply forgotten or occupied illegally.

The idea here is to explore how a particular kind of business is distributed across the city and to find gaps, which may help entrepreneurs choose where to start a new business.

We will demonstrate the approach with pharmacies, but with small changes it should work for any kind of business.

## Data

Based on the definition of our problem, the factors that will influence our decision are:

- number of existing pharmacies in the neighborhood
- the size (population) of the district/neighborhood

We will get the district names from a Wikipedia page. Their coordinates will then be obtained with the geocoder Python library.

Another source of data is the list of pharmacies from the Foursquare API.

The API returns no district information, so we must apply a function to link each venue to a neighborhood.

The following data sources will be needed to extract/generate the required information:

- Wikipedia
- Foursquare API
- geocoder Python library

### Initial definitions

Imports, the main class used to query Foursquare, and some auxiliary functions.

In [2]:
import pandas as pd
import folium
import requests
import json
from bs4 import BeautifulSoup
from sklearn.neighbors import DistanceMetric


The credentials CLIENT_ID, CLIENT_SECRET and ACCESS_TOKEN were previously obtained from Foursquare. The config file is not uploaded because its content is sensitive.

The main class SearchFS has some useful methods to communicate with the Foursquare API.

In [3]:
# read the credentials and close the file handle afterwards
with open('config.json', 'r') as config_file:
    data = json.load(config_file)

CLIENT_ID = data['CLIENT_ID'] # your Foursquare ID
CLIENT_SECRET = data['CLIENT_SECRET'] # your Foursquare Secret
ACCESS_TOKEN = data['ACCESS_TOKEN'] # your FourSquare Access Token
VERSION = '20180604'
LIMIT = 30

belem = {
    'latitude' : -1.4301318, 
    'longitude': -48.4725925,
    'name': 'Belém'
}

city = belem

class SearchFS(object):

    def __init__(self, CLIENT_ID, CLIENT_SECRET, ACCESS_TOKEN):

        self.CLIENT_ID = CLIENT_ID
        self.CLIENT_SECRET = CLIENT_SECRET
        self.ACCESS_TOKEN = ACCESS_TOKEN
        self.VERSION = '20180604'
        self.LIMIT = 50
        self.RADIUS = 500

    def venues_search(self, search_query, latitude, longitude, radius=0):
        'Search for query at a given area'

        if not radius:
            radius = self.RADIUS

        url = """https://api.foursquare.com/v2/venues/search?client_id={}&client_secret={}&ll={},{}&oauth_token={}&v={}&query={}&radius={}&limit={}""".format(
            self.CLIENT_ID,
            self.CLIENT_SECRET, 
            latitude,
            longitude,
            self.ACCESS_TOKEN,
            self.VERSION,
            search_query,
            radius,
            self.LIMIT
            )
        result = requests.get(url).json()
        if 'venues' in result['response']:
            if(len(result['response']['venues'])):
                return self.__venues_clean(result['response']['venues'])
            else:
                return []
        else:
            print('venues_search error. result:')
            print(result)
            return []  # keep the return value usable with len() in callers
    
    def venue_explore(self, venue_id):
        'Explore venues'

        url = 'https://api.foursquare.com/v2/venues/{}?client_id={}&client_secret={}&oauth_token={}&v={}'.format(
            venue_id,
            self.CLIENT_ID,
            self.CLIENT_SECRET,
            self.ACCESS_TOKEN,
            self.VERSION)

        result = requests.get(url).json()

        if 'venue' in result['response']:
            return result['response']['venue']

    def venue_tips(self, venue_id, limit=15):
        'Venue tips'

        url = 'https://api.foursquare.com/v2/venues/{}/tips?client_id={}&client_secret={}&oauth_token={}&v={}&limit={}'.format(
            venue_id,
            self.CLIENT_ID,
            self.CLIENT_SECRET,
            self.ACCESS_TOKEN,
            self.VERSION,
            limit)
        result = requests.get(url).json()
        return result['response']['tips']['items']

    def user(self, user_id):
        'User details'

        url = 'https://api.foursquare.com/v2/users/{}/?client_id={}&client_secret={}&oauth_token={}&v={}'.format(
            user_id,
            self.CLIENT_ID,
            self.CLIENT_SECRET,
            self.ACCESS_TOKEN,
            self.VERSION)
        results = requests.get(url).json()
        if 'response' in results and 'user' in results['response']:
            return results['response']['user']

    def user_tips(self, user_id, limit=15):
        'User tips'

        url = 'https://api.foursquare.com/v2/users/{}/tips?client_id={}&client_secret={}&oauth_token={}&v={}&limit={}'.format(
            user_id,
            self.CLIENT_ID,
            self.CLIENT_SECRET,
            self.ACCESS_TOKEN,
            self.VERSION,
            limit)
        results = requests.get(url).json()
        return results['response']['tips']['items']

    def location_explore(self, latitude, longitude, radius=0):
        'Explore location'

        if not radius:
            radius = self.RADIUS

        url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&ll={},{}&v={}&radius={}&limit={}'.format(
            self.CLIENT_ID,
            self.CLIENT_SECRET,
            latitude,
            longitude,
            self.VERSION,
            radius,
            self.LIMIT)
        results = requests.get(url).json()
        return results['response']['groups'][0]['items']

    def venues_search_entire_city(self, query, df):
        'search venues for entire city and return a dict'
        venues = pd.DataFrame({
            'id': [],
            'name': [],
            'latitude': [],
            'longitude': [],
            'city': []
        })

        for index, row in df.iterrows():
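            # use the instance itself rather than the global `searchfs` object
            svenues = self.venues_search(query, row['latitude'], row['longitude'], 5000)
            print('{} venues for {}'.format(len(svenues), row['bairro']))

            if len(svenues):
                if(len(venues) == 0):
                    venues = svenues
                else:
                    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
                    venues = pd.concat([venues, svenues[~svenues.index.isin(venues.index)]])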

        return venues[venues.city.isin(['Belém',None])]

        

    def __venues_clean(self, venues):
        """Transform venues list into a clean DataFrame"""

        for v in venues:
            v['latitude'] = v['location']['lat']
            v['longitude'] = v['location']['lng']
            v['city'] = None
            # some venues come back with an empty category list
            v['category'] = v['categories'][0]['name'] if v.get('categories') else None
            if 'city' in v['location']:
                v['city'] = v['location']['city']

        if len(venues):

            df_venues = pd.DataFrame(venues)
            df_venues.set_index('id', inplace=True)
            # drop raw API fields that are not needed; ignore any that are absent
            df_venues.drop(columns=['location', 'categories', 'referralId', 'hasPerk'],
                           errors='ignore', inplace=True)
            
            return df_venues


searchfs = SearchFS(CLIENT_ID, CLIENT_SECRET, ACCESS_TOKEN)
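
The request URLs in `SearchFS` are assembled with `str.format`, which leaves the query text unescaped. As a minimal sketch (not the notebook's actual code), the same search URL could be built with the standard library's `urlencode`, which percent-encodes accented terms such as 'farmácia'. The helper name `build_search_url` and the placeholder credentials are assumptions made for illustration; the endpoint and parameter names are the ones used in `venues_search`.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.foursquare.com/v2/venues/search'

def build_search_url(client_id, client_secret, access_token, query,
                     latitude, longitude, radius=500, limit=50,
                     version='20180604'):
    """Hypothetical helper: build the venues/search URL with escaped parameters."""
    params = {
        'client_id': client_id,
        'client_secret': client_secret,
        'll': '{},{}'.format(latitude, longitude),
        'oauth_token': access_token,
        'v': version,
        'query': query,
        'radius': radius,
        'limit': limit,
    }
    # urlencode percent-encodes every value, including accents and the comma in `ll`
    return BASE_URL + '?' + urlencode(params)

# Placeholder credentials, for illustration only
url = build_search_url('ID', 'SECRET', 'TOKEN', 'farmácia', -1.4301318, -48.4725925)
print(url)
```

No request is sent here; the sketch only shows how the query string would be escaped.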

Here are some helper functions.

With plotvenuesmap(), you can plot a set of points using Folium. Its parameters are the center point and the list of venues.

We use venue_neighbor() to link each place to a district. It measures the distance from a given point to every district and chooses the nearest one. It is important to notice that each district is represented by a point, not an area, so the result may not describe the situation accurately, but it is a good approximation.

In [4]:
# auxiliary functions

def plotvenuesmap(city, df):
    """Plot a map with many points"""

    citymap = folium.Map(location=[city['latitude'], city['longitude']], zoom_start=11)

    # add a red circle marker to represent the city
    folium.CircleMarker(
        [city['latitude'], city['longitude']],
        radius=8,
        color='red',
        popup='city',
        fill = True,
        fill_color = 'red',
        fill_opacity = 0.6
    ).add_to(citymap)

    # add a blue circle for every place
    for index, row in df.iterrows():
        folium.CircleMarker(
            [row['latitude'], row['longitude']],
            radius=5,
            color='blue',
            popup=row['name'].replace("'", '"'),
            fill = True,
            fill_color='blue',
            fill_opacity=0.6
        ).add_to(citymap)

    return citymap

def venue_neighbor(row, df):
    """Return the name of the neighborhood whose reference point is nearest to the venue in `row`."""
    point = [row['latitude'], row['longitude']]
    df2 = df.copy()
    distances = []
    dist = DistanceMetric.get_metric('euclidean')
    for i, r in df2.iterrows():
        df2point = [r['latitude'], r['longitude']]
        distances.append(dist.pairwise([df2point, point])[0][1])
    df2['distance'] = distances
    bairro = df2.loc[df2['distance']==df2['distance'].min(), 'bairro'].iloc[0]
    return bairro # df2['bairro'].iloc[df2['distance'].idxmin()]
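
`venue_neighbor` compares raw latitude/longitude degrees with a Euclidean metric. As a hedged alternative, the sketch below does the same nearest-point assignment with the haversine (great-circle) distance, using only the standard library. This is a swapped-in technique, not the notebook's method; so close to the equator both should give almost the same ranking. The function names `haversine_km` and `nearest_neighborhood` are illustrative assumptions, and the coordinates are copied from tables shown in this notebook.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_neighborhood(point, neighborhoods):
    """Return the name of the neighborhood whose reference point is closest to `point`."""
    lat, lon = point
    closest = min(
        neighborhoods,
        key=lambda n: haversine_km(lat, lon, n['latitude'], n['longitude']),
    )
    return closest['bairro']

# Three reference points taken from the neighborhood table in this notebook
neighborhoods = [
    {'bairro': 'Batista Campos', 'latitude': -1.45962, 'longitude': -48.49187},
    {'bairro': 'Campina', 'latitude': -1.45358, 'longitude': -48.49768},
    {'bairro': 'Nazaré', 'latitude': -1.45221, 'longitude': -48.48308},
]

# Coordinates of 'Drogaria Big Ben' from the drugstore listing in this notebook
print(nearest_neighborhood((-1.457005, -48.502004), neighborhoods))  # → Campina
```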



## Let's get the neighborhoods/districts in Belém, Brazil

We scrape the list from Wikipedia, then format and clean it.
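
To make the cleaning step concrete, here is a minimal sketch that parses one list item of the form `Name (N moradores): ...` with a regular expression. It is an illustration, not the notebook's parser: the names `ITEM_RE` and `parse_item` and the sample descriptions are assumptions. It also strips footnote markers such as `[8][9]`, which show up in one neighborhood name further down.

```python
import re

# "Name (N moradores): description"; digits may be grouped by spaces or dots
ITEM_RE = re.compile(r'^(?P<bairro>[^(]+?)\s*\((?P<moradores>[\d\s.]+)\s*moradores\)')

def parse_item(text):
    """Return {'bairro', 'habitants'} for a matching list item, else None."""
    m = ITEM_RE.match(text)
    if m is None:
        return None
    # remove Wikipedia footnote markers like [8] and surrounding whitespace
    name = re.sub(r'\[\d+\]', '', m.group('bairro')).strip()
    # keep only the digits of the population figure
    digits = re.sub(r'\D', '', m.group('moradores'))
    return {'bairro': name, 'habitants': int(digits)}

print(parse_item('Paracuri (9 934 moradores): sample description'))
print(parse_item('Itaiteua[8][9] (1 939 moradores): sample description'))
print(parse_item('Not a neighborhood entry'))
```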

In [5]:
# scrape neighbourhoods
url = "https://pt.wikipedia.org/wiki/Lista_de_bairros_de_Bel%C3%A9m_(Par%C3%A1)"
response = requests.get(url)
soup = BeautifulSoup(response.content, 'html.parser')  # name the parser explicitly
lis = soup.find_all('li')

In [6]:
bairros = []
for li in lis:
    if li.text.find('moradores):') > 0:
        bairro_moradores = li.text.split(':')[0] # Ex Paracuri (9 934 moradores) 
        bm_list = bairro_moradores.split('(')
        bairro = bm_list[0].strip()  # drop the trailing space left before '('
        moradores = int(bm_list[-1].replace('moradores)', '').replace(' ', '').replace('.', ''))
        bairros.append(
            {
                'bairro': bairro,
                'habitants': moradores
            }
        )
df = pd.DataFrame(bairros)

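# find coordinates for every neighborhood
import geocoder

def get_latlng(neighborhood, max_tries=5):
    """Return [lat, lng] for a neighborhood, or [None, None] after max_tries failed lookups."""
    for _ in range(max_tries):
        g = geocoder.arcgis('{}, Belém, Brazil'.format(neighborhood))
        if g.latlng is not None:
            return g.latlng
    return [None, None]  # avoid looping forever when the geocoder finds nothing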

coords = [ get_latlng(neighborhood) for neighborhood in df["bairro"].tolist() ]
df_coords = pd.DataFrame(coords, columns=['lat', 'lng'])
df['latitude']  = df_coords['lat']
df['longitude'] = df_coords['lng']
print(df.shape)
df.head()

(61, 4)


| | bairro | habitants | latitude | longitude |
|---|---|---|---|---|
| 0 | Batista Campos | 19136 | -1.45962 | -48.49187 |
| 1 | Campina | 6156 | -1.45358 | -48.49768 |
| 2 | Cidade Velha | 12128 | -1.46015 | -48.50161 |
| 3 | Fátima | 12385 | -1.444 | -48.47219 |
| 4 | Nazaré | 20504 | -1.45221 | -48.48308 |


The neighborhoods 'Parque Guajará' and 'Canudos' were not found by the geocoder within the city.

Instead, they were matched to places in other cities, so they must be removed manually.

In [7]:

print(df.sort_values(['latitude']).head())
df.drop(18, inplace=True, axis=0)
df.drop(31, inplace=True, axis=0)
print('-')
print(df.sort_values(['latitude']).head())

             bairro  habitants   latitude  longitude
31  Parque Guarajá       34778 -23.981671 -46.222625
18         Canudos       13804  -9.583330 -36.483330
19          Condor       42758  -1.473210 -48.474040
22         Jurunas       64478  -1.469440 -48.494660
21          Guamá        94610  -1.463510 -48.468740
-
            bairro  habitants  latitude  longitude
19         Condor       42758  -1.47321  -48.47404
22        Jurunas       64478  -1.46944  -48.49466
21         Guamá        94610  -1.46351  -48.46874
24  Universitário        2557  -1.46331  -48.44504
20       Cremação       31264  -1.46044  -48.47837
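
The two rows were dropped by index after inspecting the sorted output. As a hedged alternative, mis-geocoded rows could be filtered by their distance from the city centre. The sketch below uses a toy frame with rows copied from the output above; the helper `within_city` and the 0.5 degree tolerance are assumptions chosen for illustration, not values from the notebook.

```python
import pandas as pd

# Toy frame with rows copied from the printed output above
sample = pd.DataFrame({
    'bairro': ['Parque Guarajá', 'Canudos', 'Condor', 'Jurunas'],
    'latitude': [-23.981671, -9.583330, -1.473210, -1.469440],
    'longitude': [-46.222625, -36.483330, -48.474040, -48.494660],
})

CENTER = (-1.4301318, -48.4725925)  # city centre used in this notebook
MAX_OFFSET_DEG = 0.5                # assumed tolerance, roughly 55 km

def within_city(frame, center=CENTER, max_offset=MAX_OFFSET_DEG):
    """Keep rows whose coordinates lie within max_offset degrees of the centre."""
    lat_ok = (frame['latitude'] - center[0]).abs() <= max_offset
    lng_ok = (frame['longitude'] - center[1]).abs() <= max_offset
    return frame[lat_ok & lng_ok]

print(within_city(sample)['bairro'].tolist())  # → ['Condor', 'Jurunas']
```

A tolerance this wide still keeps the distant Mosqueiro districts, whose latitudes are around -1.1.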


### Show the map

Notice that the city lies on a peninsula. It borders Ananindeua and includes some islands (43, to be exact). Ilha (island) do Mosqueiro is a distant district within the city's jurisdiction; its only land access is a road that runs outside the city.

The map below shows the neighborhoods, with markers sized in proportion to their population.

In [79]:
belemmap = folium.Map(location=[city['latitude'], city['longitude']], zoom_start=11)

# add a red circle marker to represent the city
folium.CircleMarker(
    [city['latitude'], city['longitude']],
    radius=8,
    color='red',
    popup='city',
    fill = True,
    fill_color = 'red',
    fill_opacity = 0.6
).add_to(belemmap)

for index, row in df.iterrows():
    folium.CircleMarker(
        [row['latitude'], row['longitude']],
        radius=row['habitants']/5000,
        color='green',
        popup="{} {}".format(row['bairro'], row['habitants']),
        fill = True,
        fill_color='green',
        fill_opacity=0.6
    ).add_to(belemmap)

# display map
belemmap.save('bairros.html')
belemmap

### Search for pharmacies

Search for the terms 'farmacia' and 'drogaria' in every neighborhood. In Brazil, both names are used for these places.
Searching by term was preferred over a general search filtered later, but only because of the small number of free searches the API allows.

The searches run for every neighborhood and the results are joined into a pandas DataFrame.

In [9]:
pharmacies = searchfs.venues_search_entire_city('farmacia', df)


50 venues for Batista Campos 
50 venues for Campina 
50 venues for Cidade Velha 
50 venues for Fátima  
50 venues for Nazaré  
50 venues for Reduto 
50 venues for São Brás  
50 venues for Umarizal 
50 venues for Marco 
50 venues for Curió-Utinga 
48 venues for Águas Lindas 
16 venues for Aurá 
50 venues for Castanheira 
50 venues for Guanabara 
50 venues for Mangueirão 
50 venues for Marambaia 
50 venues for Souza 
50 venues for Val-de-Cans 
50 venues for Condor 
50 venues for Cremação 
50 venues for Guamá  
50 venues for Jurunas 
50 venues for Montese 
50 venues for Universitário 
34 venues for Águas Negras 
34 venues for Agulha 
25 venues for Campina de Icoaraci 
25 venues for Cruzeiro 
27 venues for Maracacueira 
32 venues for Paracuri 
27 venues for Ponta Grossa 
5 venues for Aeroporto 
1 venues for Ariramba 
1 venues for Bonfim 
0 venues for Caruará 
5 venues for Mangueiras 
5 venues for Maracajá 
5 venues for Murubira 
5 venues for Natal do Murubira 
0 venues for Sucurijuquara 
4

In [80]:
drugstores = searchfs.venues_search_entire_city('drogaria', df)
drugstores

22 venues for Batista Campos 
22 venues for Campina 
22 venues for Cidade Velha 
28 venues for Fátima  
24 venues for Nazaré  
24 venues for Reduto 
25 venues for São Brás  
28 venues for Umarizal 
32 venues for Marco 
30 venues for Curió-Utinga 
8 venues for Águas Lindas 
5 venues for Aurá 
23 venues for Castanheira 
9 venues for Guanabara 
20 venues for Mangueirão 
22 venues for Marambaia 
29 venues for Souza 
17 venues for Val-de-Cans 
22 venues for Condor 
22 venues for Cremação 
22 venues for Guamá  
21 venues for Jurunas 
25 venues for Montese 
21 venues for Universitário 
10 venues for Águas Negras 
10 venues for Agulha 
9 venues for Campina de Icoaraci 
9 venues for Cruzeiro 
9 venues for Maracacueira 
10 venues for Paracuri 
10 venues for Ponta Grossa 
2 venues for Aeroporto 
0 venues for Ariramba 
0 venues for Bonfim 
0 venues for Caruará 
2 venues for Mangueiras 
2 venues for Maracajá 
2 venues for Murubira 
2 venues for Natal do Murubira 
0 venues for Sucurijuquara 
3 venue

| id | name | latitude | longitude | city |
|---|---|---|---|---|
| 5bb600acd552c7002cc7ed00 | Drogarias Globo | -1.449356 | -48.48835 | Belém |
| 4e4eb38a62e14b77e39180eb | Drogaria Big Ben | -1.457005 | -48.502004 | Belém |
| 5b68b944029a550039714583 | Drogaria Maxi Popular | -1.455226 | -48.481081 | Belém |
| 558c45b9498e22f38a0775b8 | Drogaria novo Rio | -1.472924 | -48.487236 | |
| 4f0df725e4b0e45bde375f9a | Drogaria Farmagell | -1.451639 | -48.472511 | |
| 50477de8e4b0dd9df732e060 | Drogaria Big Bem | -1.466766 | -48.469051 | |
| 5265698311d2e3044883793d | drogaria pague menos | -1.466556 | -48.468704 | |
| 525d5b84498e947cb913e1f2 | Drogaria BigBen Castelo 2 | -1.443975 | -48.471091 | Belém |
| 4dfd2940aeb7594e8622c4df | Big Ben - drogaria | -1.447115 | -48.488506 | Belém |
| 4f0ca24fe4b0c67b451bd8d3 | drogarias big ben | -1.461979 | -48.487251 | |


In [91]:
pharmacies = pd.concat([pharmacies, drugstores])
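
The notebook does not show whether the two result sets overlap. If a venue matched both search terms, the concatenation would list it twice and inflate the per-neighborhood counts. Since both frames are indexed by venue id, duplicates could be dropped on the index. The sketch below uses toy frames with made-up ids and names, purely for illustration.

```python
import pandas as pd

# Toy frames indexed by venue id; id 'b' matches both search terms
farmacia = pd.DataFrame({'name': ['Farmácia A', 'Drogaria B']}, index=['a', 'b'])
drogaria = pd.DataFrame({'name': ['Drogaria B', 'Drogaria C']}, index=['b', 'c'])

combined = pd.concat([farmacia, drogaria])

# keep the first occurrence of every venue id
deduplicated = combined[~combined.index.duplicated(keep='first')]

print(len(combined), len(deduplicated))  # → 4 3
```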

### Do some cleaning

As noted before, the venue address does not include the neighborhood. So we apply a method that measures the distance from each venue to every neighborhood point and assigns the venue to the nearest one, which may or may not be accurate. This is done by the function venue_neighbor above.

Additionally, some returned places are located outside the city. We had to manually remove them.

In [92]:
pharmacies['bairro'] = pharmacies.apply(venue_neighbor, df=df, axis=1)

In [93]:
# we must remove manually the pharmacies returned by search, but not belonging to Belém

exclude = [
    'Farmácia Popular De Belém Ananindeua',
    'Farmácia Deus Proverá',
    'Farmácia hora certa',
    'Farmácia Viva Bem',
    'Farmacia do Povo Paraense',
    'Farmacia do Naldo',
    'farmacia do trabalhador brasil Paar2',
    'Farmácia Da Cilene',
    'Farmácia Bom de Preço. sua saúde em boas mãos',
    'Farmacia SAE',
    'Personale - Farmácia Arsenal',
    'Farmacia Universo'
    ]

for index, f in pharmacies.iterrows():
    if f['name'] in exclude:
        print(f['name'])
pharmacies2 = pharmacies[~pharmacies.name.isin(exclude)]


Farmacia do Povo Paraense
Farmácia Popular De Belém Ananindeua
Farmácia Deus Proverá
Farmácia hora certa
Farmacia SAE
Farmácia Viva Bem
Farmácia Da Cilene
Personale - Farmácia Arsenal
Farmacia do Naldo
farmacia do trabalhador brasil Paar2
Farmacia Universo
Farmácia Bom de Preço. sua saúde em boas mãos


In [94]:
pharmacies2.to_csv('pharmacies.csv')

### Ahhh, the map

It is always a beautiful view.

We may notice the large number of venues in the lower left of the city and the lack of pharmacies around the airport. The island of Mosqueiro has just one pharmacy, and the island of Caratateua just two, side by side.

In [95]:
#pharmacies2 = pharmacies.copy()
pharmap = plotvenuesmap(city, pharmacies2)
pharmap.save('pharmacies.htm')
pharmap

## Methodology

Now that we have the main data and the views, we can see the gaps on the map.

We can now combine the number of pharmacies in each district with its population to find the places where an entrepreneur could best use the data. With luck, we may find a good way to represent that.

## Analysis

Let's get the number of pharmacies per neighborhood.


In [96]:
pharmacies2_total = pharmacies2[['bairro', 'name']].groupby(['bairro']).count()
pharmacies2_total.rename(columns={'name': 'pharmacies'}, inplace=True)
bairros2=pd.merge(df, pharmacies2_total, on=['bairro'], how='left')
bairros2.fillna(0,inplace=True)
bairros2.head()

| | bairro | habitants | latitude | longitude | pharmacies |
|---|---|---|---|---|---|
| 0 | Batista Campos | 19136 | -1.45962 | -48.49187 | 17.0 |
| 1 | Campina | 6156 | -1.45358 | -48.49768 | 12.0 |
| 2 | Cidade Velha | 12128 | -1.46015 | -48.50161 | 1.0 |
| 3 | Fátima | 12385 | -1.444 | -48.47219 | 8.0 |
| 4 | Nazaré | 20504 | -1.45221 | -48.48308 | 13.0 |


And now the number of pharmacies per 1000 inhabitants in each neighborhood. Notice that some neighborhoods have zero pharmacies.

In [112]:
# vectorized form of the row-wise computation
bairros2['pharmacies_per_1000'] = 1000 * bairros2['pharmacies'] / bairros2['habitants']
bairros2.sort_values('pharmacies_per_1000')

| | bairro | habitants | latitude | longitude | pharmacies | pharmacies_per_1000 |
|---|---|---|---|---|---|---|
| 39 | Sucurijuquara | 1074 | -1.08462 | -48.36956 | 0.0 | 0.0 |
| 16 | Souza | 13190 | -1.41361 | -48.45962 | 0.0 | 0.0 |
| 37 | Murubira | 1519 | -1.12581 | -48.43917 | 0.0 | 0.0 |
| 14 | Mangueirão | 36224 | -1.38334 | -48.44901 | 0.0 | 0.0 |
| 38 | Natal do Murubira | 1098 | -1.13327 | -48.44604 | 0.0 | 0.0 |
| 24 | Águas Negras | 6555 | -1.30385 | -48.46181 | 0.0 | 0.0 |
| 40 | Água Boa | 8553 | -1.25208 | -48.45574 | 0.0 | 0.0 |
| 42 | Itaiteua[8][9] | 1939 | -1.273158 | -48.447653 | 0.0 | 0.0 |
| 36 | Maracajá | 3345 | -1.16707 | -48.46052 | 0.0 | 0.0 |
| 43 | São João do Outeiro | 12134 | -1.26234 | -48.46829 | 0.0 | 0.0 |


### The map

On this map, we combine the size of each neighborhood (green) with its number of pharmacies per 1000 inhabitants (blue). Red circles mark neighborhoods with no pharmacies.

We can see that some big districts have few pharmacies, which may be a good opportunity. In general, the bigger the green ring around the inner circle, the better the opportunity.

In [116]:
pha1000 = folium.Map(location=[city['latitude'], city['longitude']], zoom_start=11)


for index, row in bairros2.iterrows():
    bairro = row['bairro']
    people = row['habitants']
    fnumber = row['pharmacies']
    f1000 = row['pharmacies_per_1000']
    folium.CircleMarker(
        [row['latitude'], row['longitude']],
        radius=row['habitants']/4000,
        color='green',
        popup="<b>{}</b><br>{} inhabitants<br>{} pharmacies<br>{}".format(bairro, people, int(fnumber), round(f1000,2)),
        fill = True,
        fill_color='green',
        fill_opacity=0.9
    ).add_to(pha1000)

for index, row in bairros2.iterrows():
    if row['pharmacies_per_1000'] == 0:
        color = 'red'
    else:
        color = 'blue'
    bairro = row['bairro']
    people = row['habitants']
    fnumber = row['pharmacies']
    f1000 = row['pharmacies_per_1000']
    folium.CircleMarker(
        [row['latitude'], row['longitude']],
        radius=3+row['pharmacies_per_1000']*8,
        color=color,
        popup="<b>{}</b><br>{} inhabitants<br>{} pharmacies<br>{}".format(bairro, people, int(fnumber), round(f1000,2)),
        fill = True,
        fill_color=color,
        fill_opacity=0.6
    ).add_to(pha1000)


# display map
pha1000

### Opportunity index

We can see on the map that some big neighborhoods have few pharmacies, so we can create an opportunity index based on that observation. The index uses the size of the neighborhood and the number of pharmacies per capita.



In [130]:
# green-circle radius minus blue-circle radius, computed column-wise
bairros2['opportunity'] = bairros2['habitants'] / 4000 - (3 + bairros2['pharmacies_per_1000'] * 8)
bairros2.sort_values('opportunity', ascending=False)

| | bairro | habitants | latitude | longitude | pharmacies | pharmacies_per_1000 | opportunity |
|---|---|---|---|---|---|---|---|
| 20 | Guamá | 94610 | -1.46351 | -48.46874 | 12.0 | 0.126836 | 19.637808 |
| 56 | Pedreira | 69608 | -1.42202 | -48.46762 | 14.0 | 0.201126 | 12.79299 |
| 50 | Tapanã | 66669 | -1.34297 | -48.46784 | 8.0 | 0.119996 | 12.707284 |
| 21 | Jurunas | 64478 | -1.46944 | -48.49466 | 8.0 | 0.124073 | 12.126913 |
| 15 | Marambaia | 66708 | -1.40154 | -48.45344 | 15.0 | 0.224861 | 11.878115 |
| 22 | Montese | 61439 | -1.45666 | -48.45379 | 7.0 | 0.113934 | 11.448277 |
| 8 | Marco | 65844 | -1.4327 | -48.46357 | 17.0 | 0.258186 | 11.395512 |
| 46 | Coqueiro | 51776 | -1.33812 | -48.44285 | 5.0 | 0.09657 | 9.171441 |
| 57 | Sacramenta | 44413 | -1.41357 | -48.47516 | 3.0 | 0.067548 | 7.562868 |
| 58 | Telégrafo | 42953 | -1.42717 | -48.48921 | 2.0 | 0.046563 | 7.36575 |

## Results and discussion

The data shows that Belém is short of pharmacies. Some districts have none at all. Based on location alone, any of those districts could be eligible for a new business. Visually, four areas stand out: the island of Mosqueiro, the island of Caratateua, the Mangueirão district and the area around the Pratinha neighborhood.

Another interesting view is the last map, which shows the relationship between the size of each district and its number of pharmacies. We even made a formula to express this relationship numerically. By that measure, Guamá, Pedreira and Tapanã are very populous districts with a relatively low number of pharmacies.

Finally, we must say that this analysis worked in only two dimensions: the number of existing businesses and the population of the districts. We know from experience that this is a violent city, and the low-density areas may be too risky for opening a business. Some neighborhoods may lack basic services, such as paved streets or a nearby bus line.

For further work, we could use spatial security data, such as police station locations, or real estate prices. It is also interesting to play with the Opportunity Index formula. For example, with another combination of parameters, Mangueirão was the second best because it has no pharmacies for its roughly 36K inhabitants in a small area.

## Conclusion

The main goal of this work is to find opportunities for a pharmacy business in the city of Belém. Because the density of competitors is low, it is not hard to find a good place, although, as said before, location and population are not the only dimensions to evaluate.