# Quick Start - Point Query
## Your First Query
The most basic Geospatial Analytics query is the *point query*. Now that you have made a point query with the user interface, we will get you started with the Geospatial Analytics API by using it to run a point query:

```python
import os
import pandas as pd
import ibmpairs.authentication as authentication
import ibmpairs.client as client
import ibmpairs.query as query

# Best practice is not to include secrets in source code so we read
# a user name and password from operating system environment variables.
# You could set the user name and password in-line here but we don't
# recommend it for security reasons.
EIS_USERNAME=os.environ.get('EIS_USERNAME')
EIS_APIKEY=os.environ.get('EIS_APIKEY')

# Create an authentication object with credentials.
credentials  = authentication.OAuth2(username = EIS_USERNAME,
                                     api_key  = EIS_APIKEY)

# Add the credentials object to a client object.
eis_client = client.Client(authentication = credentials)

# The Geospatial Analytics query expressed as a JSON structure
query_json = {
    "layers" : [
      {"type" : "raster", "id" : "16100"}
    ],
    "spatial" : {"type" : "point", "coordinates" : ["35.7", "139.7"]},
    "temporal" : {"intervals" : [
      {"start" : "2015-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}
    ]}
}

# Submit the query
query_result = query.submit(query_json)

# Convert the results to a dataframe
point_df = query_result.point_data_as_dataframe()
# Convert the timestamp to a human readable format
point_df['datetime'] = pd.to_datetime(point_df['timestamp'] * 1e6, errors = 'coerce')
point_df
```

2021-12-06 11:34:52 - paw - INFO - TASK: submit STARTING.
2021-12-06 11:34:57 - paw - INFO - TASK: submit COMPLETED.


| | layer_id | layer_name | dataset | timestamp | longitude | latitude | value | datetime |
|---|---|---|---|---|---|---|---|---|
| 0 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1421528400000 | 139.7 | 35.7 | 273.918212890625 | 2015-01-17 21:00:00 |
| 1 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1421539200000 | 139.7 | 35.7 | 276.215087890625 | 2015-01-18 00:00:00 |
| 2 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1421550000000 | 139.7 | 35.7 | 279.27020263671875 | 2015-01-18 03:00:00 |
| 3 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1421560800000 | 139.7 | 35.7 | 280.5292053222656 | 2015-01-18 06:00:00 |
| 4 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1421571600000 | 139.7 | 35.7 | 278.1947021484375 | 2015-01-18 09:00:00 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 11029 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1540900800000 | 139.7 | 35.7 | 292.9964904785156 | 2018-10-30 12:00:00 |
| 11030 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1540911600000 | 139.7 | 35.7 | 290.8324279785156 | 2018-10-30 15:00:00 |
| 11031 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1540922400000 | 139.7 | 35.7 | 289.5807189941406 | 2018-10-30 18:00:00 |
| 11032 | 16100 | Ground temperature | 16 day weather forecast (GFS) (latest predicti... | 1540933200000 | 139.7 | 35.7 | 288.4697265625 | 2018-10-30 21:00:00 |


The above query requests almost four years of ground temperature forecasts from Geospatial Analytics layer 16100, which belongs to the *Global Forecast System* (GFS) dataset, for a location in Tokyo: the coordinates 35.7/139.7 (latitude/longitude).

Geospatial Analytics returns about 11,000 rows of data, which are now stored in the ``point_df`` dataframe.
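
The ``timestamp`` column in the output above holds milliseconds since the Unix epoch, and the ``datetime`` column interprets them as UTC. As a quick sanity check that does not depend on pandas, the first timestamp can be converted with the standard library. The helper name ``epoch_ms_to_utc`` is hypothetical and not part of ``ibmpairs``.

```python
from datetime import datetime, timezone

def epoch_ms_to_utc(timestamp_ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

# First timestamp value from the query output above.
first = epoch_ms_to_utc(1421528400000)
print(first.strftime("%Y-%m-%d %H:%M:%S"))  # 2015-01-17 21:00:00
```

The printed value matches the ``datetime`` entry of row 0, which confirms the unit of the ``timestamp`` column.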

<div class="alert alert-info">
Point queries such as the above are unique in that they return a response immediately. This makes them particularly suited to testing as well as exploration and experimentation. If you are unsure about the data you are interested in (its spatial coverage, frequency, or temporal extent), start with a point query. Note, however, that some advanced features, most notably [user defined functions](), are not available for point queries.
</div>

<div class="alert alert-info">
Time intervals such as:

```python
{"start" : "2015-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}
```

are defined as follows: The start time is excluded, the end time is included. In other words, the interval is open at the beginning and closed at the end: ``2015-01-01T00:00:00Z < t <= 2018-10-31T00:00:00Z``.
</div>
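
The interval rule can be checked locally with a few lines of Python. This is only a sketch of the semantics described above, not of how the service evaluates intervals; ``parse_utc`` and ``in_interval`` are hypothetical helper names.

```python
from datetime import datetime

def parse_utc(text):
    """Parse an ISO 8601 timestamp with a trailing 'Z' into an aware datetime."""
    return datetime.fromisoformat(text.replace("Z", "+00:00"))

def in_interval(t, start, end):
    """Return True if start < t <= end (start excluded, end included)."""
    return parse_utc(start) < parse_utc(t) <= parse_utc(end)

interval = ("2015-01-01T00:00:00Z", "2018-10-31T00:00:00Z")
print(in_interval("2015-01-01T00:00:00Z", *interval))  # False: the start is excluded
print(in_interval("2018-10-31T00:00:00Z", *interval))  # True: the end is included
```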


## Understanding the Example
We start with various import statements:
```python
import os                                        # used to read environment variables
import ibmpairs.authentication as authentication # deals with PAIRS authentication using our credentials 
import ibmpairs.client as client                 # represents an authenticated HTTP client
import ibmpairs.query as query                   # manages the submission of queries and retrieval of results
```
After the imports we create an OAuth2 credentials object and use this to create an authenticated HTTP client. 
```python
credentials  = authentication.OAuth2(username = EIS_USERNAME,
                                     api_key  = EIS_APIKEY)
eis_client = client.Client(authentication = credentials)
```
This step is required before you run any queries, but you only need to do it once.
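
Because ``os.environ.get`` returns ``None`` for an unset variable, a missing credential may only surface later as an authentication failure. One way to fail early is sketched below. The function ``read_credentials`` is a hypothetical helper, not part of ``ibmpairs``; only the variable names ``EIS_USERNAME`` and ``EIS_APIKEY`` come from the example above.

```python
import os

def read_credentials(env=os.environ):
    """Return (username, api_key) or raise if either variable is unset or empty."""
    missing = [name for name in ("EIS_USERNAME", "EIS_APIKEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["EIS_USERNAME"], env["EIS_APIKEY"]

# Demonstration with a plain dict standing in for os.environ (placeholder values).
username, api_key = read_credentials(
    {"EIS_USERNAME": "user@example.com", "EIS_APIKEY": "placeholder"}
)
print(username)
```

The returned pair can then be passed to ``authentication.OAuth2`` exactly as in the example above.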

The most interesting part of the above example is the definition of the actual query JSON that we send to Geospatial Analytics.
```python
query_json = {
    "layers" : [                                                           
      {"type" : "raster", "id" : "16100"}                                  # What - the data layer
    ],                                                                     
    "spatial" : {"type" : "point", "coordinates" : ["35.7", "139.7"]},     # Where - the spatial location
    "temporal" : {"intervals" : [
      {"start" : "2015-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}   # When - the temporal range
    ]}
  }
```
In general, the ``query_json`` object answers three questions: *what?*, *where?* and *when?*. What we are requesting is specified by the value associated with ``layers``. Here, we are requesting a single raster layer with ID 16100. This is the *ground temperature* layer in the *16 day weather forecast (GFS) (latest predictions)* dataset. Next, we define the spatial coverage of the query with the ``spatial`` key. In the above, we request data for a single point in the format ``[latitude, longitude]``. Note that longitudes in PAIRS range from -180 to +180 degrees; values larger than +180 lead to error messages. Similarly, latitudes range from -90 to +90 degrees. Finally, we define a single time range via the ``temporal`` field.
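
The what/where/when structure, together with the coordinate ranges, can be captured in a small builder that rejects out-of-range values before anything is sent. This is a minimal sketch; ``build_point_query`` is a hypothetical helper and not part of ``ibmpairs``.

```python
def build_point_query(layer_id, latitude, longitude, start, end):
    """Assemble a single-layer, single-point query dict after checking coordinate ranges."""
    if not -90 <= latitude <= 90:
        raise ValueError(f"latitude {latitude} is outside [-90, 90]")
    if not -180 <= longitude <= 180:
        raise ValueError(f"longitude {longitude} is outside [-180, 180]")
    return {
        "layers": [{"type": "raster", "id": str(layer_id)}],                           # what
        "spatial": {"type": "point", "coordinates": [str(latitude), str(longitude)]},  # where
        "temporal": {"intervals": [{"start": start, "end": end}]},                     # when
    }

built_query = build_point_query(
    16100, 35.7, 139.7, "2015-01-01T00:00:00Z", "2018-10-31T00:00:00Z"
)
print(built_query["spatial"]["coordinates"])  # ['35.7', '139.7']
```

The resulting dictionary has the same content as the ``query_json`` shown above.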

Subsequently we submit the query to Geospatial Analytics. As this is a point query, the result is returned directly from the submit method call:

```python
query_result = query.submit(query_json)
```

Note that we don't explicitly need to tell the query object to use the authenticated client we created previously as it finds it automatically.

Geospatial Analytics returns the result of a point query as JSON data. We use a helper method to turn this data into a local data frame:

```python
point_df = query_result.point_data_as_dataframe()
```
From this point on all the data is in a local data frame and we can operate on it as we would any other data frame.

## A Not So Minimal Working Example

The largest part of this documentation will be concerned with extensions to the ``query_json`` object. Once again let's just jump into a working example:

<div class="alert alert-info">
The layer IDs used here can be found in the *Data Explorer* in the Geospatial Analytics GUI. See :ref:`gui-tutorial-data_explorer`.
</div>


```python
query_json = {
    "layers" : [
        {
            "type" : "raster", "id" : "91",
            "temporal" : {"intervals" : [
                {"start" : "2000-01-01T00:00:00Z", "end" : "2017-10-31T00:00:00Z"}
            ]},
            "aggregation" : "Mean"
        },
        {
            "type" : "raster", "id" : "48873",
            "temporal" : {"intervals" : [
                {"start" : "2005-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}
            ]},
            "aggregation" : "Max"
        },
        {
            "type" : "raster", "id" : "51",
            "temporal" : {"intervals" : [
                {"start" : "2010-01-01T00:00:00Z", "end" : "2010-06-30T00:00:00Z"}
            ]}
        }
    ],
    "spatial" : {"type" : "point",  "coordinates" : ["40.7128", "-74.006", "37.7749", "-122.4194"]},
    "temporal" : {"intervals" : [
        {"start" : "2005-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}
    ]}
}

query_result = query.submit(query_json)
point_df = query_result.point_data_as_dataframe()
# Convert the timestamp to a human readable format
point_df['datetime'] = pd.to_datetime(point_df['timestamp'] * 1e6, errors = 'coerce')
point_df
```

2021-12-03 15:29:39 - paw - INFO - TASK: submit STARTING.
2021-12-03 15:29:42 - paw - INFO - TASK: submit COMPLETED.


| | layer_id | layer_name | dataset | longitude | latitude | value | aggregation | timestamp | datetime |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 48873 | Maximum temperature | 16 day weather forecast (GFS) (latest predicti... | -122.4194 | 37.7749 | 306.8578796386719 | Max | | NaT |
| 1 | 48873 | Maximum temperature | 16 day weather forecast (GFS) (latest predicti... | -74.006 | 40.7128 | 310.2509155273437 | Max | | NaT |
| 2 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -122.4194 | 37.7749 | 0.0859 | | 1262995000000.0 | 2010-01-09 |
| 3 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -74.006 | 40.7128 | 0.1232 | | 1262995000000.0 | 2010-01-09 |
| 4 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -122.4194 | 37.7749 | 0.0836 | | 1264378000000.0 | 2010-01-25 |
| 5 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -74.006 | 40.7128 | 0.1204 | | 1264378000000.0 | 2010-01-25 |
| 6 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -122.4194 | 37.7749 | 0.0846 | | 1265760000000.0 | 2010-02-10 |
| 7 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -74.006 | 40.7128 | 0.0277 | | 1265760000000.0 | 2010-02-10 |
| 8 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -122.4194 | 37.7749 | 0.1218 | | 1267142000000.0 | 2010-02-26 |
| 9 | 51 | Normalized difference vegetation index (NDVI) | 16 day 250 m res imagery (NASA MODIS Aqua) | -74.006 | 40.7128 | 0.1282 | | 1267142000000.0 | 2010-02-26 |


There is quite a lot going on in the above example. To begin, we are requesting data from three different layers: 

91 - Daily precipitation from the *Daily US weather (PRISM)* dataset
```python
        {
            "type" : "raster", "id" : "91",
            "temporal" : {"intervals" : [
                {"start" : "2000-01-01T00:00:00Z", "end" : "2017-10-31T00:00:00Z"}
            ]},
            "aggregation" : "Mean"
        }
```
48873 - Maximum temperature from the GFS weather forecast we encountered in the previous example
```python
        {
            "type" : "raster", "id" : "48873",
            "temporal" : {"intervals" : [
                {"start" : "2005-01-01T00:00:00Z", "end" : "2018-10-31T00:00:00Z"}
            ]},
            "aggregation" : "Max"
        }
```
51 - *Normalized Difference Vegetation Index* (NDVI) calculated from satellite imagery from the MODIS Aqua satellite 
```python
        {
            "type" : "raster", "id" : "51",
            "temporal" : {"intervals" : [
                {"start" : "2010-01-01T00:00:00Z", "end" : "2010-06-30T00:00:00Z"}
            ]}
        }
```
Each layer uses a different temporal range, and the first two are aggregated over their respective time ranges: ``Mean`` in the case of 91 and ``Max`` in the case of 48873. A layer can appear multiple times, for example once with ``Mean`` aggregation, once with ``Sum`` aggregation and once without; the results will reflect the three different requests. The temporal aggregation functions supported at this stage are ``Mean``, ``Max``, ``Min`` and ``Sum``.
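
Requesting the same layer several times with different aggregations quickly becomes repetitive to write by hand. The following sketch generates one layer entry per aggregation; ``layer_entries`` and ``VALID_AGGREGATIONS`` are hypothetical names introduced for illustration, and ``None`` stands for "no aggregation".

```python
VALID_AGGREGATIONS = {"Mean", "Max", "Min", "Sum"}  # the names listed in the text above

def layer_entries(layer_id, start, end, aggregations):
    """Build one layer entry per requested aggregation; None means no aggregation."""
    entries = []
    for aggregation in aggregations:
        entry = {
            "type": "raster",
            "id": str(layer_id),
            "temporal": {"intervals": [{"start": start, "end": end}]},
        }
        if aggregation is not None:
            if aggregation not in VALID_AGGREGATIONS:
                raise ValueError(f"unsupported aggregation: {aggregation}")
            entry["aggregation"] = aggregation
        entries.append(entry)
    return entries

layers = layer_entries(
    "91", "2000-01-01T00:00:00Z", "2017-10-31T00:00:00Z", ["Mean", "Sum", None]
)
print([entry.get("aggregation") for entry in layers])  # ['Mean', 'Sum', None]
```

The returned list can be used directly as the value of the ``layers`` key.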

The ``spatial`` specification describes two points using an array:
```python
    "spatial" : {"type" : "point",  "coordinates" : ["40.7128", "-74.006", "37.7749", "-122.4194"]},
```
The format is ``[lat-point1, long-point1, lat-point2, long-point2]``. You will see in the results that data is returned for each layer, for each timestamp (or once for an aggregation) and for each point.
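
Since the flat array is easy to get wrong by hand, it can help to keep points as ``(latitude, longitude)`` pairs and convert them when building the query. The two helpers below are a hypothetical sketch, not part of ``ibmpairs``.

```python
def flatten_points(points):
    """Turn [(lat, lon), ...] into the flat ['lat1', 'lon1', 'lat2', 'lon2', ...] form."""
    flat = []
    for latitude, longitude in points:
        flat.extend([str(latitude), str(longitude)])
    return flat

def unflatten_points(coordinates):
    """Inverse of flatten_points: pair up consecutive entries as (lat, lon) floats."""
    if len(coordinates) % 2 != 0:
        raise ValueError("coordinates must contain an even number of entries")
    values = [float(value) for value in coordinates]
    return list(zip(values[0::2], values[1::2]))

points = [(40.7128, -74.006), (37.7749, -122.4194)]
print(flatten_points(points))
# ['40.7128', '-74.006', '37.7749', '-122.4194']
```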

<div class="alert alert-info">
The ``temporal`` section at the end of the above query, outside the ``layers`` block, gives a *default* time range that is used if an element of the ``layers`` block comes without a time range. In the above example it is redundant. However, the current implementation requires its presence even if the information is not used.
</div>
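
To see which time range applies to which layer, the default rule can be reproduced locally. This is a sketch of the behavior as described in this section, not of the service implementation; ``effective_intervals`` is a hypothetical helper name.

```python
def effective_intervals(query_json):
    """Return (layer_id, intervals) pairs, falling back to the top-level default."""
    default = query_json["temporal"]["intervals"]
    return [
        (layer["id"], layer.get("temporal", {}).get("intervals", default))
        for layer in query_json["layers"]
    ]

example_query = {
    "layers": [
        {"type": "raster", "id": "91",
         "temporal": {"intervals": [
             {"start": "2000-01-01T00:00:00Z", "end": "2017-10-31T00:00:00Z"}
         ]}},
        {"type": "raster", "id": "51"},  # no time range of its own
    ],
    "temporal": {"intervals": [
        {"start": "2005-01-01T00:00:00Z", "end": "2018-10-31T00:00:00Z"}
    ]},
}

for layer_id, intervals in effective_intervals(example_query):
    print(layer_id, intervals[0]["start"])
# 91 2000-01-01T00:00:00Z
# 51 2005-01-01T00:00:00Z
```

A list of pairs is returned rather than a dictionary because the same layer ID may appear more than once in ``layers``.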
    
