# Using Treasure Data with Python and Pandas

Treasure Data has a [Python client](https://github.com/treasure-data/td-client-python), so Python and pandas users can connect to it directly from their IPython Notebooks.

All you need is a Treasure Data account, which you can [sign up for here](https://console.treasuredata.com/users/sign_up).

In [0]:
import tdclient
import pandas as pd
import numpy as np
%matplotlib inline

## Getting Treasure Data's apikey

You need your Treasure Data API key to run queries. There are two ways to retrieve it after you sign up for Treasure Data.

1. **From the web console**: Open [this URL](https://console.treasuredata.com/users/current). The API keys are listed in the rightmost column. To run queries, use the Normal API key, not the Write-Only one.
2. **From the CLI**: If you use the `td` command, the following command prints your API key.
    ```
    td apikey:show
    ```

In [0]:
apikey = 'Your API key here' # Setting your API key
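
Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key has been exported in an environment variable beforehand (the variable name `TD_API_KEY` and the helper `load_apikey` are assumptions for illustration, not part of the original notebook):

```python
import os


def load_apikey(env_var="TD_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name and this helper are illustrative assumptions.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            "Set the %s environment variable to your Treasure Data API key" % env_var
        )
    return key
```

With that in place, `apikey = load_apikey()` could replace the literal string in the cell above.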

In [0]:
client = tdclient.Client(apikey) # instantiating the client

## Running a query against the sample dataset

Running a query is easy, as shown below. Use the `query` method, which takes the following arguments.

1. The first argument is the name of the database.
2. The second argument is the query string. (With the Presto engine, make sure string literals inside the query use single quotes!)
3. Optional keyword arguments. This example passes `type='presto'` to use Presto rather than Hive.
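
The quoting rule in the second item matters because Presto follows standard SQL, where single quotes delimit string literals and double quotes delimit identifiers. A small sketch of one way to keep this straight when the stock symbol varies; `build_volume_query` is a hypothetical helper, not part of `tdclient`:

```python
def build_volume_query(symbol):
    """Build the yearly trade-volume query for one stock symbol.

    Hypothetical helper for illustration. The symbol becomes a
    single-quoted SQL string literal; any embedded single quote is
    doubled, which is the standard SQL escape.
    """
    literal = "'" + symbol.replace("'", "''") + "'"
    template = (
        "SELECT TD_TIME_FORMAT(time, 'yyyy') AS t, SUM(volume) "
        "FROM nasdaq "
        "WHERE symbol = {symbol} "
        "GROUP BY TD_TIME_FORMAT(time, 'yyyy') "
        "ORDER BY t"
    )
    return template.format(symbol=literal)
```

The returned string can be passed as the second argument to `client.query`.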

In [0]:
job = client.query('sample_datasets',
                   "SELECT TD_TIME_FORMAT(time, 'yyyy') AS t, SUM(volume) "
                   "FROM nasdaq "
                   "WHERE symbol='AMZN' "
                   "GROUP BY TD_TIME_FORMAT(time, 'yyyy') "
                   "ORDER BY t", type='presto')

### Asynchronous execution

The query runs as an asynchronous job. Before reading the results, check that the job has

1. finished (`job.finished()` should return `True`)
2. succeeded (`job.status()` should return `'success'`)

In [0]:
[job.status(), job.finished()]
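
Rather than re-running the cell above by hand until the job finishes, the check can be wrapped in a polling loop. The following is a sketch, not the client's own mechanism: `wait_for_job` is a hypothetical helper that only assumes the job object exposes the `finished()` and `status()` methods used above, and that `finished()` reports the job's current state each time it is called.

```python
import time


def wait_for_job(job, poll_interval=2.0, timeout=300.0):
    """Poll until the job finishes, then return its final status.

    Hypothetical helper: works with any object that has finished()
    and status() methods. Raises TimeoutError if the job is still
    running after `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while not job.finished():
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish within %s seconds" % timeout)
        time.sleep(poll_interval)
    return job.status()
```

For example, `wait_for_job(job)` should return `'success'` before `job.result()` is read. The client's README also shows a blocking `job.wait()` call, which may be simpler if your installed version provides it.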

In [0]:
results = list(job.result()) # materializing the result rows as a list

In [0]:
results_df = pd.DataFrame.from_records(results, columns=('year', 'AMZN trade volume'))

In [0]:
results_df.plot(x='year')
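
Because `TD_TIME_FORMAT` returns the year as a string, the `year` column arrives as text. A sketch of converting it to integers before plotting, using placeholder rows in place of real query output (the numbers below are made up purely to show the shape of the data, and `to_volume_frame` is a hypothetical helper):

```python
import pandas as pd


def to_volume_frame(rows):
    """Turn (year, volume) rows into a DataFrame indexed by integer year."""
    frame = pd.DataFrame.from_records(rows, columns=["year", "AMZN trade volume"])
    # The year comes back as a string such as "2010"; cast it to int.
    frame["year"] = frame["year"].astype(int)
    return frame.set_index("year").sort_index()


# Placeholder rows, NOT real query results.
sample_rows = [["2011", 200], ["2010", 100]]
sample_df = to_volume_frame(sample_rows)
```

Calling `sample_df.plot()` would then draw the volume against a numeric year axis.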