# Running a query
In the previous tutorial ([Getting started with FlowClient](01-getting-started-with-flowclient.ipynb)), you learned how to use the FlowClient library to connect to a FlowKit server.

In this tutorial you will learn how to use FlowClient to run a FlowKit query and get the result as a pandas dataframe.

- What is a query?
- Limitations: results are aggregates only and redacted only, and limited to what the token allows (which is more granular than just query kind)
- Create a connection (with a link to the previous tutorial for more details)
- Create a (simple) query object, and explain what this is: not the result, and not yet communicated to the server in any way. Explain the parameters, but point to the geography tutorial for more details of the aggregation unit parameter
- Get the result. Explain what this does: sends the query to the server to run, waits until the result is available, then gets it
- Explain the progress bar (with details about FlowKit splitting a query into sub-queries, and why: caching and re-use. For this simple query there is probably only one component, but more complex ones come later)
- Show that the result is a dataframe
- Long-running queries: run and status
    - Some queries can take a long time, and we don't want to wait for them to finish before doing something else
    - Can use a run / check status / get result workflow
    - Create a new query
    - Status: not running
    - Run
    - Status: queued or executing
    - Create and run another query
    - Check the status of both
    - Once a query has finished, its status is "completed"
    - At any time, can use `get_result` to wait for and get the result. If the query has already finished, it returns the result immediately
    - This way, many queries can run at once, rather than waiting for one to finish before starting the next
- Any other query object methods/attributes (if there are any; result format is covered in the geography tutorial)

In [1]:
import flowclient as fc

## Connect to FlowKit
Follow the steps in the [previous tutorial](01-getting-started-with-flowclient.ipynb) to connect to the FlowKit API.

In [2]:
token = "<your-api-token>"  # placeholder: replace with your own API token

conn = fc.connect(
    url="https://api.flowcloud-ghana.flowminder.org",
    token=token,
)
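
A token pasted into a notebook is easy to share by accident. As a minimal sketch of one alternative, the token could be read from an environment variable instead. The helper `load_token` and the variable name `FLOWCLIENT_TOKEN` are assumptions made for illustration; neither is part of FlowClient.

```python
import os


def load_token(env_var="FLOWCLIENT_TOKEN"):
    """Return the API token stored in the given environment variable.

    Both this helper and the default variable name are hypothetical;
    choose whatever name suits your own setup.
    """
    # Strip stray whitespace or newlines that often come with copy-pasting.
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your API token."
        )
    return token
```

With a helper like this, the cell above could use `token = load_token()` in place of a pasted value.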

## Define a query
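We will run a 'unique subscriber counts' query, which counts the number of distinct subscribers seen in each location during a given period. First, define the query.  
The `connection` parameter is the connection we created above, `start_date` and `end_date` set the period to count over, and `aggregation_unit` sets the spatial unit that results are aggregated to (here `"admin2"`, the second administrative level). Aggregation units are explained in more detail in the next tutorial.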

In [8]:
query = fc.unique_subscriber_counts(
    connection=conn,
    start_date="2016-01-01",
    end_date="2016-01-02",
    aggregation_unit="admin2",
)

We have created an APIQuery object, which represents a query that we can run in FlowKit.  
**Note:** We don't yet have the result of the query; in fact, we haven't yet asked the FlowKit server to run it.

In [4]:
type(query)

flowclient.api_query.APIQuery

## Get the query result
Now that we have defined our 'unique subscriber counts' query, we can ask the FlowKit server to give us the result of the query.

In [9]:
result = query.get_result()

Calling `get_result` sends the query to the FlowKit server to run, waits until the result is available, and then retrieves it.

While waiting, FlowClient displays a progress bar. FlowKit splits a query into sub-queries so that their results can be cached and re-used by other queries, and the progress bar reports progress through these components. This simple query probably has only one component; more complex queries appear in later tutorials.
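
The idea behind splitting a query into cached sub-queries can be pictured with a toy memoisation example. This is only an illustration of caching and re-use in general, not how FlowKit implements it; every name below (`run_subquery`, `daily_events`, `two_day_summary`) is hypothetical.

```python
cache = {}     # maps a sub-query description to its stored result
executed = []  # records which sub-queries actually had to be computed


def run_subquery(name, params, compute):
    """Return the cached result for (name, params), computing it only once."""
    key = (name, tuple(sorted(params.items())))
    if key not in cache:
        executed.append(key)
        cache[key] = compute()
    return cache[key]


def daily_events(date):
    # Hypothetical sub-query: pretend this is expensive to compute.
    return run_subquery(
        "daily_events", {"date": date}, lambda: f"events for {date}"
    )


def two_day_summary(first_day, second_day):
    # A larger query built from two sub-queries.
    return [daily_events(first_day), daily_events(second_day)]


first = two_day_summary("2016-01-01", "2016-01-02")
# The second query shares the 2016-01-02 component, so only the
# 2016-01-03 component needs to be computed.
second = two_day_summary("2016-01-02", "2016-01-03")
```

Across the two summaries, four sub-query results are requested but only three are computed, because the shared component is served from the cache.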

`result` is the result of our query, as a pandas dataframe:

In [10]:
result

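|     | pcod        | value |
|-----|-------------|-------|
| 0   | GHA.10.10_1 | 64922 |
| 1   | GHA.10.1_1  | 13965 |
| 2   | GHA.10.11_1 | 9010  |
| 3   | GHA.10.12_1 | 14396 |
| 4   | GHA.10.13_1 | 22824 |
| ... | ...         | ...   |
| 132 | GHA.9.5_1   | 12638 |
| 133 | GHA.9.6_1   | 5254  |
| 134 | GHA.9.7_1   | 13289 |
| 135 | GHA.9.8_1   | 20700 |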
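
Because the result is an ordinary pandas dataframe, the usual pandas tools apply to it. The sketch below rebuilds only the nine rows displayed above, not the full result, and assumes that `pcod` identifies an area and `value` holds its subscriber count. It then picks the three largest counts and filters areas by a code prefix.

```python
import pandas as pd

# Only the rows shown in the output above; the real result has more rows.
sample = pd.DataFrame(
    {
        "pcod": [
            "GHA.10.10_1", "GHA.10.1_1", "GHA.10.11_1", "GHA.10.12_1",
            "GHA.10.13_1", "GHA.9.5_1", "GHA.9.6_1", "GHA.9.7_1", "GHA.9.8_1",
        ],
        "value": [64922, 13965, 9010, 14396, 22824, 12638, 5254, 13289, 20700],
    }
)

# The three areas with the largest counts in this sample.
top_three = sample.nlargest(3, "value")

# All sample areas whose code starts with "GHA.9.".
region_9 = sample[sample["pcod"].str.startswith("GHA.9.")]
```

The same calls would work unchanged on the full `result` dataframe returned by `get_result`.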


## Long-running queries
Some queries can take a long time to run. If we use `get_result` straight away, we will have to wait until the query finishes running before we can do anything else.

Instead, we can use the `run` method to start running a query without waiting for the result, and the `status` property to check whether it's ready.

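Start by defining two new queries: a 'location event counts' query, which counts the number of events recorded in each location in each interval (here, each day), and a 'total network objects' query, which counts the number of mobile network objects, such as cells, active in each location in each period (here, each day).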

In [None]:
events_query = fc.location_event_counts(
    connection=conn,
    start_date="2016-01-01",
    end_date="2016-01-02",
    aggregation_unit="admin2",
    count_interval="day",
)

network_objects_query = fc.total_network_objects(
    connection=conn,
    start_date="2016-01-01",
    end_date="2016-01-02",
    aggregation_unit="admin2",
    total_by="day",
)

Query objects have a `status` property, which tells us the status of the query (e.g. 'executing' or 'completed'). The status of the two new queries shows that they are not running, because we haven't asked FlowKit to run them yet:

In [None]:
events_query.status

In [None]:
network_objects_query.status

If we now called `events_query.get_result()`, we would have to wait until `events_query` finished before we could start running `network_objects_query`. Instead, we can use the `run` method to set both queries running without waiting for the results:

In [None]:
events_query.run()
network_objects_query.run()

If we check the status again, we should find that both queries are now either 'queued', 'executing' or 'completed'.  
**Note:** If a query's result is already cached on the server (for example, because someone else has already worked through this tutorial), its status will be 'completed' straight away.

In [None]:
events_query.status

In [None]:
network_objects_query.status
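
When several queries are running, checking each status by hand becomes tedious. A minimal sketch of a helper that polls a group of queries until all of them report 'completed' is shown below. `wait_for_all` is a hypothetical function, not part of FlowClient, and `StubQuery` is a stand-in that only mimics a `status` property so that the example runs without a server.

```python
import time


class StubQuery:
    """Hypothetical stand-in exposing a `status` property like a query object."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    @property
    def status(self):
        # Step through the scripted statuses, then keep reporting the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


def wait_for_all(queries, poll_interval=0.01, timeout=5.0):
    """Poll until every query reports 'completed'; return the final statuses.

    `queries` maps a label to any object with a `status` property.
    Raises TimeoutError if some query is still unfinished after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        statuses = {name: query.status for name, query in queries.items()}
        if all(status == "completed" for status in statuses.values()):
            return statuses
        if time.monotonic() > deadline:
            raise TimeoutError(f"Still waiting after {timeout}s: {statuses}")
        time.sleep(poll_interval)


queries = {
    "events": StubQuery(["queued", "executing", "completed"]),
    "network_objects": StubQuery(["executing", "completed"]),
}
final_statuses = wait_for_all(queries)
```

Against a real server a longer `poll_interval` would be more considerate, and the loop would also need to handle any failure statuses the server can report, which this sketch ignores.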

At any time, we can use `get_result` to wait for and get the result, as we did previously. If the query has already finished, `get_result` returns the result immediately:

In [None]:
events_result = events_query.get_result()
network_objects_result = network_objects_query.get_result()

In [None]:
events_result

In [None]:
network_objects_result

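## Closing remarks

In this tutorial you learned how to define a query with FlowClient, get its result as a pandas dataframe using `get_result`, and use `run` and `status` to start several queries without waiting for each one to finish.

You are now ready to take a more in-depth look at what's possible with FlowKit. In the next tutorial ([Geography](03-geography.ipynb)), you will learn how to aggregate to different spatial units, and how to get the associated geometry of those spatial aggregation units.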