# Getting a web user's location

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Overview" data-toc-modified-id="Overview-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Overview</a></span><ul class="toc-item"><li><span><a href="#User-location" data-toc-modified-id="User-location-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>User location</a></span></li><li><span><a href="#Weather-station-location" data-toc-modified-id="Weather-station-location-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Weather station location</a></span></li><li><span><a href="#Weather-forecast-location" data-toc-modified-id="Weather-forecast-location-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Weather forecast location</a></span></li></ul></li><li><span><a href="#Bringing-the-data-together" data-toc-modified-id="Bringing-the-data-together-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Bringing the data together</a></span></li><li><span><a href="#The-Challenges" data-toc-modified-id="The-Challenges-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>The Challenges</a></span></li><li><span><a href="#Data-sources" data-toc-modified-id="Data-sources-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Data sources</a></span><ul class="toc-item"><li><span><a href="#GeoLocation-API" data-toc-modified-id="GeoLocation-API-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>GeoLocation API</a></span></li><li><span><a href="#GeoNames:-Postcodes" data-toc-modified-id="GeoNames:-Postcodes-4.2"><span class="toc-item-num">4.2&nbsp;&nbsp;</span>GeoNames: Postcodes</a></span></li><li><span><a href="#Australian-Bureau-of-Meteorology-(BOM)" data-toc-modified-id="Australian-Bureau-of-Meteorology-(BOM)-4.3"><span class="toc-item-num">4.3&nbsp;&nbsp;</span>Australian Bureau of Meteorology (BOM)</a></span></li><li><span><a href="#Azure-Maps" data-toc-modified-id="Azure-Maps-4.4"><span class="toc-item-num">4.4&nbsp;&nbsp;</span>Azure Maps</a></span></li></ul></li><li><span><a href="#Outcomes" data-toc-modified-id="Outcomes-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Outcomes</a></span></li><li><span><a href="#Citations" data-toc-modified-id="Citations-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Citations</a></span></li></ul></div>

## Overview

For the web UI I wanted to have a static site calling into APIs.

The user requirement can be described in the following user story:

**As a user I want to locate my nearest weather station so that I can access
observations and forecasts relevant to me**

The core aspects of the work are:

1. Determine the user's location
2. Find the nearest weather station
3. Find the relevant forecast district/region

### User location

We're likely to be able to determine the user's location via either :

- Use the [HTML 5 Geolocation API](https://developer.mozilla.org/en-US/docs/Web/API/Geolocation_API)
- Ask the user to enter their postcode or town
- Ask the user to click on a map (e.g. to select a station or area)

### Weather station location

Various weather agencies list their station details - e.g.:

* [Australian BoM](http://www.bom.gov.au/climate/data/stations/)
* [UK Met Office](https://www.metoffice.gov.uk/public/weather/climate-network/#?tab=climateNetwork)

These will have the GPS co-ordinate of the station.

### Weather forecast location

These appear to mainly utilise a town/area name (e.g. "Brisbane") but can also
have an associated [shape file](https://en.wikipedia.org/wiki/Shapefile) that
designates the forecast area.

The Australian BoM describes the forecast spatial data in [a brief
guide](http://reg.bom.gov.au/catalogue/spatialdata.pdf). The shape files are
available via the [BOM's FTP site](ftp://ftp.bom.gov.au/anon/home/adfd/spatial/) under the `IDM00001.*` files.

## Bringing the data together

Once I know where the user is, I could use the `Haversine
formula <https://en.wikipedia.org/wiki/Haversine_formula>`_ to determine their
nearest weather station. The trouble with the Haversine approach is that I'll
need to limit the stations in some manner (e.g. using a lat/long box based on
some calculation). I could assume that if there's nothing within 100km of the
user then the system will need to let the user know that there's no available
data. Alternatively, I could use a map/GIS service to do the heavy lifting for
me.

For an Azure Solution, Azure Search supports the [Edm.GeographyPoint](https://docs.microsoft.com/en-us/azure/search/search-what-is-an-index) field type.

For forecasts, I'd like to try and work out the user's forecast area by placing
them within the shapefile. I don't know how to do this so that'll be a big
challenge.

As an aside, [SpatiaLite](https://www.gaia-gis.it/fossil/libspatialite/index) is an
extension to SQLite and could be used for local or even Web App dev.

## The Challenges

This workbench aims to explore the following:

* How do I get the user's current location via the HTML 5 Geolocation API?
* Can I use the GeoNames data to locate a user?
* Can I Azure Maps data to locate a user?
* Can I use a postcode listing to locate a user?
* Once I have the location, can I work out their closest weather station?
* Once I have the location, can I work out their forecast region?

## Data sources

### GeoLocation API

The API worked as advertised (hardly surprising). 

The very basic code is in the [Workbench repository](https://dev.azure.com/weatherballoon/Weather%20Balloon/_git/Workbench?path=%2Fweb-user-location)

### GeoNames: Postcodes

In [21]:
from urllib.request import urlopen, urlretrieve
from zipfile import ZipFile
from itertools import islice
import pandas as pd
import numpy as np
import pycountry

In [26]:
postcode_url = 'http://download.geonames.org/export/zip/AU.zip'
postcode_download_file = 'data/postcodes.zip'
postcode_out_file = 'data/postcodes.txt'

In [27]:
urlretrieve(postcode_url, postcode_download_file)

('data/postcodes.zip', <http.client.HTTPMessage at 0x110ba9c88>)

In [28]:
with ZipFile(postcode_download_file) as zip_file:
    with open(postcode_out_file, 'wb') as out_file:
        out_file.write(zip_file.read('AU.txt'))

In [29]:
with open(postcode_out_file) as myfile:
    head = list(islice(myfile, 10))
    
head

['AU\t0200\tAustralian National University\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.2777\t149.1189\t1\n',
 'AU\t0221\tBarton\tAustralian Capital Territory\tACT\t\t\t\t\t-35.3049\t149.1412\t4\n',
 'AU\t2540\tWreck Bay\tAustralian Capital Territory\tACT\t\t\t\t\t-35.1627\t150.6907\t4\n',
 'AU\t2540\tJervis Bay\tAustralian Capital Territory\tACT\tNEW CNTRY WEST\t\t\t\t-35.1333\t150.7\t4\n',
 'AU\t2540\tHmas Creswell\tAustralian Capital Territory\tACT\tNEW CNTRY WEST\t\t\t\t-35.028\t150.5501\t3\n',
 'AU\t2600\tCanberra\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.2835\t149.1281\t4\n',
 'AU\t2600\tYarralumla\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.2998\t149.1058\t4\n',
 'AU\t2600\tRussell\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.2977\t149.151\t4\n',
 'AU\t2600\tBarton\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.3049\t149.1412\t4\n',
 'AU\t2600\tHarman\tAustralian Capital Territory\tACT\tCANBERRA\t\t\t\t-35.35\t149.2\t4\n']

In [32]:
columns = {
    'country_code': np.str,
    'postal_code': np.str,
    'place_name': np.str,
    'state': np.str,
    'admin_code1': np.str,
    'admin_name2': np.str,
    'admin_code2': np.str,
    'admin_name3': np.str,
    'admin_code3': np.str,
    'lat': np.float64,
    'long': np.float64,
    'accuracy': np.str
}

df = pd.read_csv(postcode_out_file,
                 sep='\t',
                 header=0,
                 names=columns.keys(),
                 dtype=columns,
                 low_memory=False,
                 usecols=['country_code', 'place_name',
                          'state', 'postal_code', 'lat', 'long', 'accuracy'])

df['country'] = df.apply(
    lambda row: pycountry.countries.get(alpha_2=row['country_code']).name, axis=1)

#with open("data/postcodes.json", 'w') as f:
#    f.write(df.to_json(orient='records'))

print('-' * 79)
print(df.info())
print('-' * 79)
print(df.describe())
print('-' * 79)
print(df.head(5))
print('-' * 79)

-------------------------------------------------------------------------------
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16874 entries, 0 to 16873
Data columns (total 8 columns):
country_code    16874 non-null object
postal_code     16874 non-null object
place_name      16874 non-null object
state           16874 non-null object
lat             16874 non-null float64
long            16874 non-null float64
accuracy        16870 non-null object
country         16874 non-null object
dtypes: float64(2), object(6)
memory usage: 1.0+ MB
None
-------------------------------------------------------------------------------
                lat          long
count  16874.000000  16874.000000
mean     -32.108868    143.717291
std        5.952857     10.718973
min      -43.558300     96.862800
25%      -36.033300    141.783300
50%      -33.647200    146.966700
75%      -28.633300    151.118250
max      -10.120100    159.076800
---------------------------------------------------------------

### Australian Bureau of Meteorology (BOM)

[The AU BoM Data notebook](AU%20BoM%20Data.ipynb#Station-list) describes the BoM station list data.

### Azure Maps

The [Azure Maps service](https://azure.microsoft.com/en-au/services/azure-maps/) provides a number of useful APIs.

The following query for the suburb of "Burpengary": https://atlas.microsoft.com/search/address/json?api-version=1.0&query=burpengary&countrySet=AU&subscription-key=KEY

Yielded the response below (snippet):

````json
  {
    "summary": {
      "query": "burpengary",
      "queryType": "NON_NEAR",
      "queryTime": 22,
      "numResults": 6,
      "offset": 0,
      "totalResults": 6,
      "fuzzyLevel": 1
    },
    "results": [
      {
        "type": "Geography",
        "id": "AU/GEO/p0/9961",
        "score": 4.5,
        "info": "search:ta:036043075000418-AU",
        "entityType": "MunicipalitySubdivision",
        "address": {
          "municipalitySubdivision": "Burpengary",
          "municipality": "Brisbane",
          "countrySecondarySubdivision": "Brisbane",
          "countrySubdivision": "Queensland",
          "countryCode": "AU",
          "country": "Australia",
          "countryCodeISO3": "AUS",
          "freeformAddress": "Brisbane Burpengary, Queensland"
        },
        "position": {
          "lat": -27.15282,
          "lon": 152.97663
        },
        "viewport": {
          "topLeftPoint": {
            "lat": -27.12433,
            "lon": 152.91752
          },
          "btmRightPoint": {
            "lat": -27.18634,
            "lon": 152.98447
          }
        },
        "boundingBox": {
          "topLeftPoint": {
            "lat": -27.12433,
            "lon": 152.91752
          },
          "btmRightPoint": {
            "lat": -27.18634,
            "lon": 152.98447
          }
        },
        "dataSources": {
          "geometry": {
            "id": "00005831-3200-1200-0000-00007d320280"
          }
        }
      }
    ]
  }
````

Entering just a postcode gives a reasonable result: https://atlas.microsoft.com/search/address/json?api-version=1.0&query=4000&countrySet=AU&subscription-key=KEY

with the properties listing the suburbs covered by the postcode:

    "municipalitySubdivision": "Spring Hill, Petrie Terrace, Brisbane CBD"
    
The [GeoLocation](https://docs.microsoft.com/en-gb/rest/api/maps/geolocation) feature is in preview and returns the ISO country code for an IP address. This isn't enough information to be useful for this work.

## Outcomes

The work undertaken here helped me determine a direction for fulfilling the user story being explored:

**As a user I want to locate my nearest weather station so that I can access observations and forecasts relevant to me**

The GeoNames data was easily wrangled and gave a list postcodes for towns/suburbs.
These could be put into a service such as Azure Search for
easy lookups. There's a [webcast regarding Geo-spatial search with Azure Search]
(https://azure.microsoft.com/en-us/resources/videos/azure-search-and-geospatial-data/)
with the [EDM.GeographyPoint data type]
(https://docs.microsoft.com/en-gb/rest/api/searchservice/Supported-data-types)
in the index.

The following approach will be taken:

- Store the BOM weather station data in Azure Search.
- Store the GeoNames postcode in Azure Search.
- Provide the user with various UI inputs to help them find their closest
  weather station:
  - Pre-set weather stations for capital cities
  - HTML 5 GeoLocation API
  - Postcode lookup
  - Suburb/town lookup (we don't need their full address)
- Use Azure Search's Geo Spatial functionality to find nearby weather stations.

The Azure Maps API appears to be a viable service for this work as well. I'll place the web lookups behind an API Gateway solution (Azure API Management), allowing me to replace the Azure Search at a latter date (if required). The rationale for preferencing the Azure Search approach is that I'll need to do it anyway for the weather station locations.

## Citations

- The GeoNames data is licensed under a [Creative Commons Attribution 3.0
  License](http://creativecommons.org/licenses/by/3.0/)

- The Australian Bureau of Meteorology data is [Copyright Commonwealth of Australia 2018, Bureau of Meteorology](http://www.bom.gov.au/other/copyright.shtml?ref=ftr)