<br>
<img align="center" width="400" src="lito-raf.png">
<br>



# Expansion analysis

<img align="left" width="80" height="200" src="https://img.shields.io/badge/python-v3.6-blue.svg">
<br>


## Contents

1. [Introduction](#introduction)
2. [Data](#data)

## Introduction
[[go back to the top]](#contents)

I first describe the data that seem necessary and suggest how they can be obtained using the `ArcGIS` platform and other sources. I also include definitions of some new variables derived from the basic ones.

## Data
[[go back to the top]](#contents)

### 1) Target audience data

With data from platforms such as Google Analytics (GA), Facebook Analytics and similar tools, we can obtain a proxy for the profile of typical customers. In particular, the [Audience section of GA](https://support.google.com/analytics/answer/1012034?hl=en) provides demographic information and customer location data.

<br>
<img align="left" width="1000" src="ga_audience.png">

### 2) Variables derived from spatial data

When buying a house, buyers look for proximity to facilities such as grocery stores, pharmacies, emergency services and parks. These make up the location properties of a house.

The geocoding module of the `ArcGIS` API for `Python` can be used to search for facilities (hospitals, restaurants, pharmacies, parks, etc.) within a specified distance of a chosen point on the map.
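To make the idea of "facilities within a given distance of a point" concrete, here is a minimal sketch that does not call `ArcGIS` at all. It filters a list of points by great-circle (haversine) distance. The function names, the sample facilities and the centre coordinates are all hypothetical and only serve as an illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def facilities_within(center, facilities, radius_km):
    """Return the names of the facilities lying within radius_km of center."""
    lat0, lon0 = center
    return [name for name, lat, lon in facilities
            if haversine_km(lat0, lon0, lat, lon) <= radius_km]


# Hypothetical sample data: (name, latitude, longitude)
facilities = [
    ("Pharmacy A", 45.5200, -122.6800),
    ("Grocery B", 45.5300, -122.6700),
    ("Hospital C", 45.6500, -122.4000),
]
center = (45.5231, -122.6765)  # hypothetical point chosen on the map

nearby = facilities_within(center, facilities, radius_km=5.0)
print(nearby)
```

A geocoding service adds what this sketch lacks: the facility database itself and a search by category or keyword.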

Another important aspect is the commute time to and from work. `ArcGIS` also provides tools to compute travel time for chosen days and times. We can add stops, for example a gym that the customer visits along the way.
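The routing calls further down pass the chosen day and time as `start_time=644511600000`. Read as milliseconds since the Unix epoch, that value is Monday, 4 June 1990 at 15:00 UTC, which is 8:00 AM at UTC-7. The sketch below shows how such a value can be built and decoded with the standard library. The helper names are hypothetical, and the fixed UTC-7 offset is an assumption standing in for Pacific Daylight Time.

```python
from datetime import datetime, timedelta, timezone


def to_epoch_ms(dt):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    if dt.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return int(dt.timestamp() * 1000)


def from_epoch_ms(ms):
    """Convert milliseconds since the Unix epoch to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Assumption: a fixed UTC-7 offset stands in for Pacific Daylight Time
PDT = timezone(timedelta(hours=-7))

start = datetime(1990, 6, 4, 8, 0, tzinfo=PDT)  # a Monday, 8:00 AM local time
start_time = to_epoch_ms(start)
print(start_time)  # 644511600000

decoded = from_epoch_ms(644511600000)
print(decoded.isoformat(), decoded.weekday())  # weekday() == 0 means Monday
```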

The code below performs a similar calculation using a dataset of houses in the USA.

<img align="left" width="800" src="saopaulo_esri.png">

import os
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from arcgis.gis import GIS
from arcgis.geocoding import geocode, batch_geocode, reverse_geocode
from arcgis.features import Feature, FeatureLayer, FeatureSet, GeoAccessor, GeoSeriesAccessor, SpatialDataFrame
from arcgis.geometry import Geometry, Point
from arcgis.geometry.functions import buffer
from arcgis.network import RouteLayer

pd.set_option('display.max_columns', None)
warnings.filterwarnings('ignore')

# Read the ArcGIS credentials from the environment instead of hardcoding them.
# The variable names ARCGIS_USERNAME and ARCGIS_PASSWORD are placeholders.
gis = GIS(username=os.environ['ARCGIS_USERNAME'], password=os.environ['ARCGIS_PASSWORD'])

PATH = 'resources/houses_oregon_lito.csv'
df = pd.read_csv(PATH)
df.drop(columns=['Unnamed: 0'], inplace=True)
df = pd.DataFrame.spatial.from_xy(df, 'LONGITUDE','LATITUDE')
map_df = gis.map('Sao Paulo, Brazil')
map_df.basemap = 'streets-navigation-vector'

(beds, baths, hoa_per_month, year_built, square_feet, price) = (3, 2, 50, 
                                                                2000, 2800, 660000)

sl_df = df[(df['BEDS']>=beds) & (df['BATHS']>baths) & 
           (df['HOA PER MONTH']<=hoa_per_month) & 
           (df['YEAR BUILT']>=year_built) & 
           (df['SQUARE FEET'] > square_feet) & 
           (df['PRICE']<=price)]


def rev_geo(row, df):
    # Use positional indexing: the filtered frame keeps the original row labels
    return reverse_geocode(Point(df['SHAPE'].iloc[row]))

# Reverse geocoding is skipped by default; the listings with addresses
# are loaded from 'listings_with_address.csv' below.
run_reverse_geocoding = False

if run_reverse_geocoding:
    sl_df['ADDRESS'] = [rev_geo(i, sl_df)['address']['LongLabel'] for i in range(sl_df.shape[0])]
    cols = ['SALE TYPE', 'PROPERTY TYPE',  'ADDRESS', 'CITY', 'STATE', 'ZIP', 
            'PRICE', 'BEDS', 'BATHS', 'LOCATION', 'SQUARE FEET', 
            'LOT SIZE', 'YEAR BUILT', 'DAYS ON MARKET', 'PRICE PER SQFT', 'HOA PER MONTH', 
            'STATUS', 'SOURCE', 'MLS', 'LATITUDE', 'LONGITUDE', 'SHAPE']
    sl_df = sl_df[cols]

sl_df = pd.read_csv('listings_with_address.csv', index_col=0)

prop1 = sl_df[sl_df['MLS']==18389440]
paddress = prop1.ADDRESS + ", " + prop1.CITY + ", " + prop1.STATE
prop_geom_fset = geocode(paddress.values[0], 
                         as_featureset=True)
prop_geom = prop_geom_fset.features[0]
prop_buffer = buffer([prop_geom.geometry], 
                     in_sr = 102100, buffer_sr=102100,
                     distances=0.05, unit=9001)[0]
prop_buffer_f = Feature(geometry=prop_buffer)
prop_buffer_fset = FeatureSet([prop_buffer_f])

neighborhood_data_dict = {}

groceries = geocode('groceries', search_extent=prop_buffer.extent, 
                    max_locations=20, as_featureset=True)

neighborhood_data_dict['groceries'] = []

for place in groceries:
    popup={"title" : place.attributes['PlaceName'], 
    "content" : place.attributes['Place_addr']}
    neighborhood_data_dict['groceries'].append(place.attributes['PlaceName'])

restaurants = geocode('restaurant', search_extent=prop_buffer.extent, max_locations=200)
neighborhood_data_dict['restauruants'] = []

for place in restaurants:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['restauruants'].append(place['attributes']['PlaceName'])

hospitals = geocode('hospital', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['hospitals'] = []

for place in hospitals:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['hospitals'].append(place['attributes']['PlaceName'])

coffees = geocode('coffee', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['coffees'] = []

for place in coffees:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['coffees'].append(place['attributes']['PlaceName'])

bars = geocode('bar', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['bars'] = []

for place in bars:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['bars'].append(place['attributes']['PlaceName'])

gas = geocode('gas station', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['gas'] = []

for place in gas:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['gas'].append(place['attributes']['PlaceName'])


shops_service = geocode("",category='shops and service', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['shops'] = []

for place in shops_service:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['shops'].append(place['attributes']['PlaceName'])

transport = geocode("",category='travel and transport', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['transport'] = []

for place in transport:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['transport'].append(place['attributes']['PlaceName'])

parks = geocode("",category='parks and outdoors', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['parks'] = []

for place in parks:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['parks'].append(place['attributes']['PlaceName'])

education = geocode("",category='education', search_extent=prop_buffer.extent, max_locations=50)
neighborhood_data_dict['education'] = []

for place in education:
    popup={"title" : place['attributes']['PlaceName'], 
    "content" : place['attributes']['Place_addr']}
    neighborhood_data_dict['education'].append(place['attributes']['PlaceName'])
    
    
neighborhood_df = pd.DataFrame.from_dict(neighborhood_data_dict, orient='index')
neighborhood_df = neighborhood_df.transpose()
neighborhood_df.shape
neighborhood_df.head()
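
The facility searches above repeat the same loop once per category. One possible way to fold them into a single helper is sketched below. It assumes a `search` callable that returns dictionaries shaped like the results used above (`place['attributes']['PlaceName']`). `collect_place_names` and `fake_search` are hypothetical names, and `fake_search` is only a stand-in with made-up lookup data so that the sketch runs without an `ArcGIS` account.

```python
def collect_place_names(search, queries):
    """Run one search per label and collect the 'PlaceName' of each result."""
    names = {}
    for label, kwargs in queries.items():
        results = search(**kwargs)
        names[label] = [place['attributes']['PlaceName'] for place in results]
    return names


# Hypothetical stand-in for the geocoding call, returning canned results
def fake_search(address, category=None, max_locations=50):
    sample = {
        'coffee': ['Starbucks', 'Tazza Cafe'],
        'hospital': ['Providence St Vincent Medical Center'],
    }
    hits = sample.get(address, [])[:max_locations]
    return [{'attributes': {'PlaceName': name}} for name in hits]


queries = {
    'coffees': {'address': 'coffee', 'max_locations': 50},
    'hospitals': {'address': 'hospital', 'max_locations': 50},
    'bars': {'address': 'bar', 'max_locations': 50},
}

neighborhood = collect_place_names(fake_search, queries)
print(neighborhood)
```

The resulting dictionary has lists of unequal length, which is why the code above builds the data frame with `orient='index'` and then transposes it.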

## Commute to work duration!

# - Set start time to `8:00 AM` on Mondays
# - `ArcGIS` routing service uses historic averages.

route_service_url = gis.properties.helperServices.route.url
route_service = RouteLayer(route_service_url, gis=gis)

stops = [paddress.values[0], 
         '309 SW 6th Ave #600, Portland, OR 97204']

stops_geocoded = batch_geocode(stops)

stops_geocoded = [item['location'] for item in stops_geocoded]
stops_geocoded2 = '{},{};{},{}'.format(stops_geocoded[0]['x'],stops_geocoded[0]['y'],
                                       stops_geocoded[1]['x'],stops_geocoded[1]['y'])

modes = route_service.retrieve_travel_modes()['supportedTravelModes']

route_service.properties.impedance;

route_result = route_service.solve(stops_geocoded2, return_routes=True, 
                             return_stops=True, return_directions=True,
                             impedance_attribute_name='TravelTime',
                             start_time=644511600000,
                             return_barriers=False, return_polygon_barriers=False,
                             return_polyline_barriers=False)

route_length = route_result['directions'][0]['summary']['totalLength']
route_duration = route_result['directions'][0]['summary']['totalTime']
route_duration_str = "{}m, {}s".format(int(route_duration), 
                                       round((route_duration %1)*60,2))
print("route length: {} miles, route duration: {}".format(round(route_length,3),
                                                         route_duration_str))

route_features = route_result['routes']['features']
route_fset = FeatureSet(route_features)
stop_features = route_result['stops']['features']
stop_fset = FeatureSet(stop_features)

route_pop_up = {'title':'Name',
               'content':'Total_Miles'}


prop_list_df = sl_df.copy()

destination_address = '309 SW 6th Ave #600, Portland, OR 97204'

# Restrict the analysis to the first five listings
prop_list_df = prop_list_df.iloc[:5,:]
prop_list_df

## Loop through each property and build the neighborhood facility table

groceries_count = []
restaurants_count = []
hospitals_count = []
coffee_count = []
bars_count = []
gas_count = []
shops_service_count = []
travel_transport_count = []
parks_count = []
education_count = []
route_length = []
route_duration = []

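for index, prop in prop_list_df.iterrows():
    # NOTE: the second assignment overrides the full address with "CITY, STATE",
    # so every property is geocoded to the same city-level point. This is why all
    # rows in the output below share identical counts and commute values.
    # Remove the override to evaluate each property at its own address.
    paddress = prop['ADDRESS'] + ", " + prop['CITY'] + ", " + prop['STATE']
    paddress = prop['CITY'] + ", " + prop['STATE']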
    prop_geom_fset = geocode(paddress, as_featureset=True)
    prop_geom = prop_geom_fset.features[0]
    
    # create buffer of 5 miles
    prop_buffer = buffer([prop_geom.geometry], 
                     in_sr = 102100, buffer_sr=102100,
                     distances=0.05, unit=9001)[0]

    prop_buffer_f = Feature(geometry=prop_buffer)
    prop_buffer_fset = FeatureSet([prop_buffer_f])
    
   
    groceries = geocode('groceries', search_extent=prop_buffer.extent, 
                    max_locations=20, as_featureset=True)
    groceries_count.append(len(groceries.features))
    
    restaurants = geocode('restaurant', search_extent=prop_buffer.extent, max_locations=200)
    restaurants_count.append(len(restaurants))
    

    hospitals = geocode('hospital', search_extent=prop_buffer.extent, max_locations=50)
    hospitals_count.append(len(hospitals))
    
    
    
    coffees = geocode('coffee', search_extent=prop_buffer.extent, max_locations=50)
    coffee_count.append(len(coffees))
    
    
    
    bars = geocode('bar', search_extent=prop_buffer.extent, max_locations=50)
    bars_count.append(len(bars))
    
    

    gas = geocode('gas station', search_extent=prop_buffer.extent, max_locations=50)
    gas_count.append(len(gas))
    
    
    
    shops_service = geocode("",category='shops and service', 
                            search_extent=prop_buffer.extent, max_locations=50)
    shops_service_count.append(len(shops_service))
    
    
    parks = geocode("",category='parks and outdoors', 
                    search_extent=prop_buffer.extent, max_locations=50)
    parks_count.append(len(parks))
    
    education = geocode("",category='education', search_extent=prop_buffer.extent, 
                        max_locations=50)
    education_count.append(len(education))
    
    stops = [paddress, destination_address]
    stops_geocoded = batch_geocode(stops)

    stops_geocoded = [item['location'] for item in stops_geocoded]
    stops_geocoded2 = '{},{};{},{}'.format(stops_geocoded[0]['x'],stops_geocoded[0]['y'],
                                           stops_geocoded[1]['x'],stops_geocoded[1]['y'])

    route_result = route_service.solve(stops_geocoded2, return_routes=True, 
                             return_stops=False, return_directions=True,
                             impedance_attribute_name='TravelTime',
                             start_time=644511600000,
                             return_barriers=False, return_polygon_barriers=False,
                             return_polyline_barriers=False)
    route_length.append(route_result['directions'][0]['summary']['totalLength'])
    route_duration.append(route_result['directions'][0]['summary']['totalTime'])
    print("Route")

prop_list_df['grocery_count'] = groceries_count

prop_list_df['restaurant_count']= restaurants_count

prop_list_df['hospitals_count']= hospitals_count

prop_list_df['coffee_count']= coffee_count

prop_list_df['bars_count']=bars_count

prop_list_df['gas_count']=gas_count

prop_list_df['shops_count']=shops_service_count

prop_list_df['parks_count']=parks_count

prop_list_df['edu_count']=education_count

prop_list_df['commute_length']=route_length

prop_list_df['commute_duration']=route_duration

facility_list = ['grocery_count', 'restaurant_count', 'hospitals_count', 'coffee_count',
       'bars_count', 'gas_count', 'shops_count', 'parks_count',
       'edu_count', 'commute_length', 'commute_duration']


def set_scores(row):
    score = ((row['PRICE']*-1.5) + 
             (row['BEDS']*1)+
             (row['BATHS']*1)+
             (row['SQUARE FEET']*1)+
             (row['LOT SIZE']*1)+
             (row['YEAR BUILT']*1)+
             (row['HOA PER MONTH']*-1)+  
             (row['grocery_count']*1)+
             (row['restaurant_count']*1)+
             (row['hospitals_count']*1.5)+ 
             (row['coffee_count']*1)+
             (row['bars_count']*1)+
             (row['shops_count']*1)+
             (row['parks_count']*1)+
             (row['edu_count']*1)+
             (row['commute_length']*-1)+  
             (row['commute_duration']*-2) 
            )
    return score

prop_list_df['scores'] = prop_list_df.apply(set_scores, axis=1)

prop_list_df.head()
print('Ok!')

(50, 10)

| | groceries | restauruants | hospitals | coffees | bars | gas | shops | transport | parks | education |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Bales Market Place | Coffee. Cup | Providence St Vincent Medical Center-ER | Coffee. Cup | | Shell | Powell Paint Center | MAX-Elmonica & SW 170th Ave | Jqay House Park | Oregon College of Art & Craft |
| 1 | Safeway | Papa Murphy's | Providence St Vincent Medical Center | Starbucks | | ARCO | Retied | Powder Lodging | Mitchell Park | Cedar Mill Elementary School |
| 2 | QFC | Tilly's Gelato | | Starbucks | | 76 | Chrisman's Picture Frame & Gallery | Homestead Studio Suites-Beaverton | Jackie Husen Park | French American School |
| 3 | QFC | Oak Hills Brew Pub | | Poppa's Haven | | Costco | Team Uniforms | MAX-Sunset TC | The Bluffs | Goddard School |
| 4 | Dinihanian's Farm Market | Starbucks | | Tazza Cafe | | 76 | T-Mobile | Rodeway Inn & Suites-Portland | Bonny Slope Park | St Pius X Elementary School |


'TravelTime'

route length: 10.693 miles, route duration: 29m, 17.75s


| | SALE TYPE | PROPERTY TYPE | ADDRESS | CITY | STATE | ZIP | PRICE | BEDS | BATHS | LOCATION | SQUARE FEET | LOT SIZE | YEAR BUILT | DAYS ON MARKET | PRICE PER SQFT | HOA PER MONTH | STATUS | SOURCE | MLS | LATITUDE | LONGITUDE | SHAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1414 | MLS Listing | Single Family Residential | 6917 SE 155th Ave, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 454900.0 | 4.0 | 3.0 | Portland Southeast | 3126.0 | 6098.0 | 2003.0 | 109.0 | 146.0 | 0.0 | Active | RMLS | 18352241 | 45.472435 | -122.504185 | {'x': -122.50418476054743, 'y': 45.47243463945... |
| 1427 | MLS Listing | Multi-Family (2-4 Unit) | 14719 NE Couch St, Portland, OR, 97230, USA | Portland | OR | 97236.0 | 456789.0 | 6.0 | 6.0 | POWELLHURST-GILBERT | 2812.0 | 6969.0 | 2006.0 | 28.0 | 162.0 | 0.0 | Active | RMLS | 18415529 | 45.523889 | -122.511421 | {'x': -122.5114205262322, 'y': 45.523888573767... |
| 1644 | MLS Listing | Single Family Residential | 17104 SE Kelly St, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 499000.0 | 3.0 | 2.5 | Portland Southeast | 3350.0 | 8276.0 | 2000.0 | 62.0 | 149.0 | 10.0 | Active | RMLS | 18613304 | 45.499326 | -122.487188 | {'x': -122.48718775214779, 'y': 45.49932634785... |
| 1650 | MLS Listing | Single Family Residential | 15701-15999 NE Glisan St, Portland, OR, 97230,... | Portland | OR | 97233.0 | 499000.0 | 4.0 | 2.5 | Portland Southeast | 2843.0 | 6969.0 | 2014.0 | 1.0 | 176.0 | 0.0 | Active | RMLS | 18035240 | 45.52642 | -122.500293 | {'x': -122.50029325523987, 'y': 45.52642034476... |
| 1669 | MLS Listing | Single Family Residential | Lents, Portland, OR, USA | Portland | OR | 97266.0 | 499700.0 | 5.0 | 3.5 | PLEASANT VALLEY | 3662.0 | 21344.0 | 2004.0 | 48.0 | 136.0 | 0.0 | Active | RMLS | 18134679 | 45.463353 | -122.54936 | {'x': -122.54935952325859, 'y': 45.46335337674... |



Route
Route
Route
Route
Route


| | SALE TYPE | PROPERTY TYPE | ADDRESS | CITY | STATE | ZIP | PRICE | BEDS | BATHS | LOCATION | SQUARE FEET | LOT SIZE | YEAR BUILT | DAYS ON MARKET | PRICE PER SQFT | HOA PER MONTH | STATUS | SOURCE | MLS | LATITUDE | LONGITUDE | SHAPE | grocery_count | restaurant_count | hospitals_count | coffee_count | bars_count | gas_count | shops_count | parks_count | edu_count | commute_length | commute_duration | scores |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1414 | MLS Listing | Single Family Residential | 6917 SE 155th Ave, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 454900.0 | 4.0 | 3.0 | Portland Southeast | 3126.0 | 6098.0 | 2003.0 | 109.0 | 146.0 | 0.0 | Active | RMLS | 18352241 | 45.472435 | -122.504185 | {'x': -122.50418476054743, 'y': 45.47243463945... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -670851.053875 |
| 1427 | MLS Listing | Multi-Family (2-4 Unit) | 14719 NE Couch St, Portland, OR, 97230, USA | Portland | OR | 97236.0 | 456789.0 | 6.0 | 6.0 | POWELLHURST-GILBERT | 2812.0 | 6969.0 | 2006.0 | 28.0 | 162.0 | 0.0 | Active | RMLS | 18415529 | 45.523889 | -122.511421 | {'x': -122.5114205262322, 'y': 45.523888573767... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -673119.553875 |
| 1644 | MLS Listing | Single Family Residential | 17104 SE Kelly St, Portland, OR, 97236, USA | Portland | OR | 97236.0 | 499000.0 | 3.0 | 2.5 | Portland Southeast | 3350.0 | 8276.0 | 2000.0 | 62.0 | 149.0 | 10.0 | Active | RMLS | 18613304 | 45.499326 | -122.487188 | {'x': -122.48718775214779, 'y': 45.49932634785... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -734613.553875 |
| 1650 | MLS Listing | Single Family Residential | 15701-15999 NE Glisan St, Portland, OR, 97230,... | Portland | OR | 97233.0 | 499000.0 | 4.0 | 2.5 | Portland Southeast | 2843.0 | 6969.0 | 2014.0 | 1.0 | 176.0 | 0.0 | Active | RMLS | 18035240 | 45.52642 | -122.500293 | {'x': -122.50029325523987, 'y': 45.52642034476... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -736402.553875 |
| 1669 | MLS Listing | Single Family Residential | Lents, Portland, OR, USA | Portland | OR | 97266.0 | 499700.0 | 5.0 | 3.5 | PLEASANT VALLEY | 3662.0 | 21344.0 | 2004.0 | 48.0 | 136.0 | 0.0 | Active | RMLS | 18134679 | 45.463353 | -122.54936 | {'x': -122.54935952325859, 'y': 45.46335337674... | 20 | 50 | 10 | 50 | 1 | 37 | 42 | 50 | 50 | 0.917188 | 6.068344 | -722266.553875 |


Ok!
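
Because `set_scores` adds the variables on their raw scales, the `PRICE` term (for example `-1.5 * 454900`) outweighs every other term, and the scores in the table end up close to a rescaled price. One common alternative, which is not what the code above does, is to rescale each variable to the range [0, 1] before applying the weights. The sketch below illustrates this with the `PRICE`, `SQUARE FEET` and `HOA PER MONTH` values of the five listings and the corresponding weights from `set_scores`. The helper names are hypothetical.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def weighted_scores(columns, weights):
    """Weighted sum of the min-max scaled columns, one score per row."""
    scaled = {name: min_max_scale(vals) for name, vals in columns.items()}
    n_rows = len(next(iter(columns.values())))
    return [sum(weights[name] * scaled[name][i] for name in columns)
            for i in range(n_rows)]


# Values of the five listings shown in the table above
columns = {
    'PRICE': [454900.0, 456789.0, 499000.0, 499000.0, 499700.0],
    'SQUARE FEET': [3126.0, 2812.0, 3350.0, 2843.0, 3662.0],
    'HOA PER MONTH': [0.0, 0.0, 10.0, 0.0, 0.0],
}
# Same signs and magnitudes as in set_scores
weights = {'PRICE': -1.5, 'SQUARE FEET': 1.0, 'HOA PER MONTH': -1.0}

scores = weighted_scores(columns, weights)
best = scores.index(max(scores))
print(scores, best)
```

With scaled inputs each weight expresses the relative importance of its variable directly, instead of being mixed with the variable's units.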
