# Use Location Data Services

## Introduction

CARTOframes lets you use the [CARTO Data Services API](https://carto.com/developers/data-services-api/). This API consists of a set of location-based functions that can be applied to your data to perform geospatial analyses without leaving the context of your notebook.

For instance, you can **geocode** a pandas DataFrame with addresses on the fly, and then perform trade area analysis by computing **isodistances** or **isochrones** programmatically.

This guide walks through a use case: given a set of ten Starbucks store addresses, find good candidate locations for opening another store.

> Based on your account plan, some of these location data services are subject to different [quota limitations](https://carto.com/developers/data-services-api/support/quota-information/).

## Data

We will be using the same dataset of fake locations used throughout these guides: [starbucks_brooklyn.csv]()

## Authentication

Using Location Data Services requires authentication. For more information about how to authenticate, please read the [Login to CARTO Platform guide](/developers/cartoframes/guides/Login-to-CARTO-Platform/).

In [None]:
from cartoframes.auth import set_default_credentials

set_default_credentials('creds.json')

## Geocoding

The first step is to read and understand the data we have. Once we have the Starbucks store data in a DataFrame, we can see that it has two columns that can be used in the **geocoding** service: `name` and `address`. There's also a third column that reflects the annual revenue of the store.

In [None]:
import pandas as pd

df = pd.read_csv('../files/starbucks_brooklyn.csv')
df

### Quota consumption

Each time you run a Location Data Service, you're consuming quota. For this reason, we provide the ability to check in advance the **amount of credits** an operation will consume by using the `dry_run` parameter when running the service function.

In addition, to prevent having to geocode records that have been **previously geocoded**, and thus spend quota **unnecessarily**, you should always preserve the ``the_geom`` and ``carto_geocode_hash`` columns generated by the geocoding process.

This will happen **automatically** in these cases:

1. Your input is a **table** from CARTO processed in place (without a ``table_name`` parameter)
2. You save your results in a CARTO table using the ``table_name`` parameter, and only use the resulting table for any further geocoding.

Because of this, we'll first check the `required_quota` that the `geocode` function returns when run with `dry_run=True`, and then save the results in a CARTO table using the `table_name` and `cached` parameters.

You can also check the available quota by calling the `available_quota` method.

In [None]:
from cartoframes.data.services import Geocoding

geo_service = Geocoding()

_, geo_dry_metadata = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    dry_run=True
)

In [None]:
geo_dry_metadata

In [None]:
geo_service.available_quota()
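
The check-before-you-run pattern described above can be wrapped in a small guard. The following is a minimal sketch, not part of CARTOframes: `has_enough_quota` is a hypothetical helper and the sample numbers are made up. It only assumes what this guide shows, namely that the dry-run metadata exposes `required_quota` and that `available_quota()` returns a number.

```python
def has_enough_quota(dry_run_metadata, available_quota):
    """Return True when the dry-run estimate fits in the available quota.

    Hypothetical helper for illustration (not a CARTOframes function).
    """
    required = dry_run_metadata.get('required_quota', 0)
    return required <= available_quota


# Made-up values standing in for geo_dry_metadata and
# geo_service.available_quota()
sample_metadata = {'required_quota': 10}
print(has_enough_quota(sample_metadata, available_quota=100))
print(has_enough_quota(sample_metadata, available_quota=5))
```

In a notebook, the same comparison could be made with the real `geo_dry_metadata` and the value returned by `geo_service.available_quota()` before launching the actual geocoding.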

In [None]:
geo_gdf, geo_metadata = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'}
)

If the CSV file should ever change, cached results will only be applied to unmodified
records, and new geocoding will be performed only on new or changed records.

To use cached results, we have to save the results in a CARTO table using the `table_name` and `cached=True` parameters.

In [None]:
geo_gdf, geo_metadata = geo_service.geocode(
    df,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    table_name='starbucks_cache',
    cached=True
)

Let's compare `geo_dry_metadata` and `geo_metadata` to see how the information differs with and without the `dry_run` option. As we can see, the metadata shows that all the locations have been geocoded successfully and that the operation consumed 10 credits of quota.

In [None]:
geo_metadata

The resulting data is a `GeoDataFrame` that contains three new columns:

* `geometry`: The resulting geometry
* `gc_status_rel`: The percentage of accuracy of each location
* `carto_geocode_hash`: Geocode information

In [None]:
geo_gdf.head()

If we now try to geocode this DataFrame, which contains both ``the_geom`` and ``carto_geocode_hash``, we can see that the required quota is 0 because it has already been geocoded.

In [None]:
_, repeat_geo_metadata = geo_service.geocode(
    geo_gdf,
    street='address',
    city={'value': 'New York'},
    country={'value': 'USA'},
    dry_run=True
)

In [None]:
repeat_geo_metadata.get('required_quota')
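
To build intuition for why a repeated run requires no quota, here is a minimal sketch of hash-based change detection. It is an illustration under assumptions, not the CARTOframes implementation: `address_hash`, `rows_needing_geocoding`, the choice of SHA-256, and the sample rows are all hypothetical. The idea is that a row only needs geocoding again when a hash of its address fields no longer matches the stored one.

```python
import hashlib


def address_hash(street, city, country):
    """Hash the fields that identify an address (hypothetical scheme)."""
    key = '|'.join([street, city, country])
    return hashlib.sha256(key.encode('utf-8')).hexdigest()


def rows_needing_geocoding(rows, city='New York', country='USA'):
    """Return the rows whose stored hash is missing or out of date."""
    pending = []
    for row in rows:
        current = address_hash(row['address'], city, country)
        if row.get('carto_geocode_hash') != current:
            pending.append(row)
    return pending


# Made-up sample rows: the first was geocoded already, the second is new
rows = [
    {'address': '1 Example St',
     'carto_geocode_hash': address_hash('1 Example St', 'New York', 'USA')},
    {'address': '2 Example Ave'},
]
pending = rows_needing_geocoding(rows)
print(len(pending))
```

Only the row without a matching hash ends up in `pending`, which mirrors the behavior described above: unchanged records are skipped and only new or modified ones would consume quota.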

### Precision

The `address` column is more complete than the `name` column, so the coordinates calculated by the service from it will be more accurate. If we check this, the accuracy values using the `name` column (`0.95, 0.93, 0.96, 0.83, 0.78, 0.9`) are lower than the ones we get by using the `address` column for geocoding (`0.97, 0.99, 0.98`).

In [None]:
geo_name_gdf, geo_name_metadata = geo_service.geocode(
    df,
    street='name',
    city={'value': 'New York'},
    country={'value': 'USA'}
)

In [None]:
geo_name_gdf.head()

In [None]:
geo_name_gdf.gc_status_rel.unique()

In [None]:
geo_gdf.head()
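
As a quick sanity check of the comparison in this section, the accuracy values quoted in the text can be compared directly. This sketch just copies those numbers into plain Python lists (the variable names are illustrative) and does not call the geocoding service.

```python
# Accuracy values quoted in the text for each geocoding input column
name_accuracy = [0.95, 0.93, 0.96, 0.83, 0.78, 0.9]
address_accuracy = [0.97, 0.99, 0.98]

# Compare the best name-based value with the worst address-based value
best_name = max(name_accuracy)
worst_address = min(address_accuracy)
print(best_name, worst_address)
print(worst_address > best_name)
```

Even the lowest address-based value is above the highest name-based one, which supports using `address` as the geocoding input.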

## Visualize the results

Finally, we can use the CARTOframes helpers to visualize the geocoding results by precision.

In [None]:
from cartoframes.viz.helpers import color_bins_layer
from cartoframes.viz import popup_element

color_bins_layer(
    geo_gdf,
    'gc_status_rel',
    method='equal',
    bins=geo_gdf.gc_status_rel.unique().size,
    title='Geocoding Precision',
    hover_popup=[
        popup_element('address', 'Address'),
        popup_element('gc_status_rel', 'Precision')
    ]
)

## Isolines

The Isolines service generates contoured lines that display equally calculated levels over a given surface area. Isoline functions are calculated as the intersection areas from the origin point, measured by:

* **Time**, named **Isochrones**
* **Distance**, named **Isodistances**


In this guide we use `Isochrones` to find the area within a given walking time of each Starbucks store, and `Isodistances` to find the area within a given walking distance.

### Isochrones

We're going to use ranges of 5, 15 and 30 minutes. The ranges are expressed in seconds, so the values will be **300**, **900**, and **1800** respectively.
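
The conversion from minutes to seconds is a multiplication by 60. A minimal sketch, where `minutes` and `ranges_in_seconds` are illustrative names rather than anything defined by CARTOframes:

```python
# Ranges in minutes, converted to seconds
minutes = [5, 15, 30]
ranges_in_seconds = [m * 60 for m in minutes]
print(ranges_in_seconds)  # [300, 900, 1800]
```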

In [None]:
from cartoframes.data.services import Isolines

iso_service = Isolines()

_, isochrones_dry_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk', dry_run=True)

Remember to always **check the quota** using the `dry_run` parameter and the `available_quota` method before running the service!

In [None]:
print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isochrones_dry_metadata.get('required_quota'))
)

In [None]:
isochrones_gdf, isochrones_metadata = iso_service.isochrones(geo_gdf, [300, 900, 1800], mode='walk')

In [None]:
isochrones_gdf.head()

### The isolines helper

The most straightforward way of visualizing the resulting geometries is by using the `isolines_layer` helper. It will use the `range_label` column added automatically by the service to classify each polygon by category.

In [None]:
from cartoframes.viz.helpers import isolines_layer

isolines_layer(isochrones_gdf)

### Isodistances

The isolines service accepts several options to manually change the `resolution` or the `quality` of the polygons. There's more information about these settings in the [Isolines Reference](/developers/cartoframes/reference/#heading-Isolines).

In [None]:
_, isodistances_dry_metadata = iso_service.isodistances(
    geo_gdf,
    [900, 1800, 3600],
    mode='walk',
    resolution=16.0,
    quality=1,
    dry_run=True
)

In [None]:
print('available {0}, required {1}'.format(
    iso_service.available_quota(),
    isodistances_dry_metadata.get('required_quota'))
)

In [None]:
isodistances_gdf, isodistances_metadata = iso_service.isodistances(
    geo_gdf,
    [900, 1800, 3600],
    mode='walk',
    mode_traffic='enabled',
    resolution=16.0,
    quality=2
)

In [None]:
isodistances_gdf.head()

In [None]:
from cartoframes.viz.helpers import isolines_layer

isolines_layer(isodistances_gdf)

## All together

In [None]:
from cartoframes.viz import Map
from cartoframes.viz.helpers import size_continuous_layer

Map([
    isolines_layer(
        isochrones_gdf,
        title='Walking Time'
    ),
    size_continuous_layer(
        geo_gdf,
        'revenue',
        title='Revenue $',
        color='white',
        opacity='0.2',
        stroke_color='blue',
        size=[20, 80],
        hover_popup=[
            popup_element('address', 'Address'),
            popup_element('gc_status_rel', 'Precision'),
            popup_element('revenue', 'Revenue')
        ]
    )
])

We observe that the store at 228 Duffield St, Brooklyn, NY 11201 is really close to another store with higher revenue, which means we could even think about closing that one in favor of another one with a better location.

We could try to find a place for a new store between existing stores that have lower revenue than the rest and that are far apart from each other.

Now, let's calculate the **centroid** of three different stores that we've identified previously and use it as a possible location for a new spot:

In [None]:
from shapely import geometry

new_store_location = [
    # .geometry returns the active geometry column, whatever its name
    geo_gdf.geometry.iloc[6],
    geo_gdf.geometry.iloc[9],
    geo_gdf.geometry.iloc[1]
]

# Create a polygon using three points from the geo_gdf
polygon = geometry.Polygon([[p.x, p.y] for p in new_store_location])
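
For a triangle, the centroid is simply the average of its three vertices, so the result can be cross-checked without shapely. The sketch below uses made-up coordinates (not the actual store locations), and `triangle_centroid` is a hypothetical helper.

```python
def triangle_centroid(points):
    """Average the (x, y) vertices of a triangle."""
    if len(points) != 3:
        raise ValueError('expected exactly three points')
    x = sum(p[0] for p in points) / 3
    y = sum(p[1] for p in points) / 3
    return x, y


# Made-up (longitude, latitude) pairs for illustration only
sample_points = [(-73.99, 40.69), (-73.96, 40.69), (-73.96, 40.66)]
centroid = triangle_centroid(sample_points)
print(centroid)
```

Like the polygon approach above, this treats longitude and latitude as planar coordinates, which is a reasonable approximation over an area as small as a few city blocks.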

In [None]:
from geopandas import GeoDataFrame
from cartoframes.viz import Layer

# Build a one-row GeoDataFrame with the polygon centroid as the new store location
new_store_gdf = GeoDataFrame(
    {'name': ['New Store']},
    geometry=[polygon.centroid])

isochrones_new_gdf, isochrones_new_metadata = iso_service.isochrones(new_store_gdf, [300, 900, 1800], mode='walk')

In [None]:
Map([
    isolines_layer(
        isochrones_gdf,
        title='Walking Time - Current',
        opacity='0.2'
    ),
    isolines_layer(
        isochrones_new_gdf,
        title='Walking Time - New',
    ),
    size_continuous_layer(
        geo_gdf,
        'revenue',
        title='Revenue $',
        color='white',
        opacity='0.2',
        stroke_color='blue',
        size=[20, 80],
        hover_popup=[
            popup_element('address', 'Address'),
            popup_element('gc_status_rel', 'Precision'),
            popup_element('revenue', 'Revenue')
        ]
    ),
    Layer(new_store_gdf)
])

## Conclusion

In this example we've explained how to use the Location Data Services to perform trade area analysis easily using CARTOframes built-in functionality without leaving the notebook.

As a result, we've calculated a possible new location for our store, and we can check how the isoline areas of interest influence our decision.

Take into account that finding optimal spots for new stores is not an easy task and requires more analysis, but this is a great first step!