# Visualizing Traffic Accidents Around LA with IBM Cloud SQL Query and PixieDust

This notebook demonstrates how to use IBM Cloud SQL Query (SQL Query) in a Jupyter Notebook with PixieDust to visualize traffic accidents throughout LA on a map. We'll use SQL Query to search for traffic accidents that occurred between 5pm and 8pm and involved victims between the ages of 20 and 35. We've already uploaded the data set to IBM Cloud Object Storage (COS), so you don't need to download it and upload it to your own COS bucket.

To use this notebook, you'll need (in this order):

- IBM Cloud Object Storage - lite plan
- IBM Cloud SQL Query

Once you've provisioned the SQL Query service, click the **Open UI** button. This will generate a COS bucket for you that will store the results of your SQL queries. We'll need this bucket in order to tell SQL Query where to store and retrieve the result sets.

To get started, run the next cell to download the latest versions of ibmcloudsql and PixieDust.

In [None]:
!pip install -q --upgrade ibmcloudsql
!pip install --upgrade pixiedust

Import both into the notebook:

In [None]:
import ibmcloudsql
import pixiedust

To keep your credentials safe, import `getpass`, which prompts for your IBM Cloud API Key without echoing it, so the key is not visible to people viewing the notebook. You can get the API Key from [Manage > Security > Platform API Keys](https://console.bluemix.net/iam/#/apikeys) at the top of your IBM Cloud account.

With `getpass`, you can pass any prompt you'd like as a string. Once you run the cell, an input box will appear. Paste your IBM Cloud API Key into the box and hit return. From then on, the key is available through the variable `cloud_api_key`.

In [None]:
import getpass
cloud_api_key = getpass.getpass('Enter your IBM Cloud API Key')

We'll need two more pieces of information before we can start querying the data.

**1) IBM Cloud SQL Query CRN (Cloud Resource Name)**

To get the CRN, go to your SQL Query service page. Click on the **Manage** tab and under _REST API_ there is a button **Instance CRN**. Click that to copy the CRN for the service. Then add that to the variable `sql_crn`.

**2) SQL Query generated COS bucket URL**

We suggest you use the COS bucket generated by SQL Query to keep things simple, but you can use any bucket in COS to store your results. To get the COS bucket URL, go to COS and look for the generated bucket. Then click the kebab menu button at the end of that bucket name. You'll have the option to view the **Bucket SQL URL**. Once you click it, a pop-up window shows the URL. Assign that URL to the variable `sql_cos_endpoint`.

We've added the suffix `/accidents` at the end of this URL. This will be the prefix of all the SQL query results that will be saved as CSV files in that bucket.

In [None]:
sql_crn = '<your_SQL_Query_CRN>' 
sql_cos_endpoint = 'cos://us-south/<your_COS_sql_bucket>/accidents'
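
The result location above has the shape `cos://<endpoint>/<bucket>/<prefix>`. As a minimal sketch of that structure, the helper below splits such a URL into its three parts. The function name `split_cos_url` and the bucket name `my-sql-bucket` are hypothetical and used only for illustration; the helper is not part of `ibmcloudsql`.

```python
def split_cos_url(url):
    """Split a cos://<endpoint>/<bucket>/<prefix> URL into its three parts.

    Hypothetical helper for illustration only; not part of ibmcloudsql.
    """
    scheme = "cos://"
    if not url.startswith(scheme):
        raise ValueError("expected a URL starting with cos://")
    # Split at most twice so that any further slashes stay in the prefix.
    parts = url[len(scheme):].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("expected cos://<endpoint>/<bucket>[/<prefix>]")
    endpoint, bucket = parts[0], parts[1]
    prefix = parts[2] if len(parts) == 3 else ""
    return endpoint, bucket, prefix


# 'my-sql-bucket' is a made-up bucket name.
endpoint, bucket, prefix = split_cos_url("cos://us-south/my-sql-bucket/accidents")
print(endpoint, bucket, prefix)
```

Here `accidents` ends up as the prefix, which is the part that SQL Query prepends to the result objects it writes into the bucket.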

To access the SQL Query API functions, create an `ibmcloudsql.SQLQuery` client with your API Key, CRN, and COS endpoint. You can then call its `run_sql` method to run your SQL queries on CSV data.

The following query gets the time, area, age (between 20-35), victim sex, and location of accidents between 5pm and 8pm. 

**Note:** The URL in `data_source` points to the traffic collision data. We use the same URL for subsequent queries.

In [None]:
sqlClient = ibmcloudsql.SQLQuery(cloud_api_key, sql_crn, sql_cos_endpoint)

data_source = "cos://us-geo/sqldata-032018/Traffic_Collision_Data_from_2010_to_Present.csv"

query = """
SELECT 
    `Time Occurred` AS time, 
    `Area Name` AS area, 
    `Victim Age` AS age, 
    `Victim Sex` AS sex, 
    `Location` AS location 
FROM  {}
WHERE 
    `Time Occurred` >= 1700 AND `Time Occurred` <= 2000 AND 
    `Victim Age` >= 20 AND `Victim Age` <= 35
""".format(data_source)

traffic_collisions = sqlClient.run_sql(query)

After the query runs, we can look at a sample of the results.

In [None]:
traffic_collisions.head()
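
To make the `WHERE` clause concrete, here is a small sketch of the same filter in plain Python. It assumes `Time Occurred` is a number in 24-hour `HHMM` form, as the bounds `1700` and `2000` in the query suggest. The sample rows are made up for illustration and do not come from the data set.

```python
# Made-up sample rows; not taken from the collision data set.
sample_rows = [
    {"time": 1650, "age": 25},  # too early
    {"time": 1700, "age": 20},  # on both lower bounds, kept
    {"time": 1830, "age": 36},  # victim too old
    {"time": 2000, "age": 35},  # on both upper bounds, kept
    {"time": 2001, "age": 30},  # too late
]


def matches_filter(row):
    """Mirror the query's WHERE clause: both ranges are inclusive."""
    return 1700 <= row["time"] <= 2000 and 20 <= row["age"] <= 35


selected = [row for row in sample_rows if matches_filter(row)]
print(selected)
```

Because the query uses `>=` and `<=`, accidents recorded at exactly 5pm or 8pm, and victims aged exactly 20 or 35, are included.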

SQL Query can also handle more advanced queries like CTEs (common table expressions). In the following example, the CTE strips the parentheses from the _Location_ column of the data set and splits the coordinates into separate latitude and longitude columns. The main query then joins those columns back to the accident records by `Dr Number`.

In [None]:
sqlClient = ibmcloudsql.SQLQuery(cloud_api_key, sql_crn, sql_cos_endpoint)

data_source = "cos://us-geo/sqldata-032018/Traffic_Collision_Data_from_2010_to_Present.csv"

query = """
WITH location AS ( 
    SELECT 
        id, 
        cast(split(coordinates, ',')[0] as float) as latitude, 
        cast(split(coordinates, ',')[1] as float) as longitude 
    FROM (SELECT 
            `Dr Number` as id, 
            regexp_replace(Location, '[()]', '') as coordinates 
        FROM {0}
    ) 
) 
SELECT  
    d.`Dr Number` as id, 
    d.`Date Occurred` as date, 
    d.`Time Occurred` AS time, 
    d.`Area Name` AS area, 
    d.`Victim Age` AS age, 
    d.`Victim Sex` AS sex, 
    l.latitude, 
    l.longitude 
FROM {0} AS d 
    JOIN 
    location AS l 
    ON l.id = d.`Dr Number` 
WHERE 
    d.`Time Occurred` >= 1700 AND 
    d.`Time Occurred` <= 2000 AND 
    d.`Victim Age` >= 20 AND 
    d.`Victim Age` <= 35 AND 
    l.latitude != 0.0000 AND 
    l.longitude != 0.0000
""".format(data_source)

traffic_location = sqlClient.run_sql(query)

In [None]:
traffic_location.head()
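
The string handling inside the CTE can be sketched in plain Python as follows. This assumes a `Location` value looks like `(34.0395, -118.2656)`, which is inferred from the `regexp_replace` and `split` calls in the query rather than confirmed against the data. The function name `parse_location` is hypothetical.

```python
import re


def parse_location(location):
    """Mimic the CTE: strip parentheses, split on the comma, cast to float.

    Hypothetical helper; the input format is an assumption.
    """
    coordinates = re.sub(r"[()]", "", location)  # like regexp_replace(Location, '[()]', '')
    parts = coordinates.split(",")               # like split(coordinates, ',')
    # float() tolerates the space left after the comma.
    return float(parts[0]), float(parts[1])


latitude, longitude = parse_location("(34.0395, -118.2656)")
print(latitude, longitude)
```

A value such as `(0.0, 0.0)` parses to zero coordinates, which is the kind of placeholder position the final conditions of the query are there to exclude.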

Using PixieDust's `display` feature, we can view the locations of these traffic accidents on a map.

Select **Map** as the chart type and **mapbox** as the renderer, then configure the map view options as follows:
* **Keys**: `latitude`,  `longitude`
* **Values**: `id`, `age`, `sex`, `date`
    
Hover over a marker without a number to display the victim's age and sex and the accident date. Zoom in to explore the map in more detail.

In [None]:
display(traffic_location)