<h1>Spatial Queries in PySpark</h1>


This notebook shows you how to use spatial queries in Spark environments. It uses the spatio-temporal library that is pre-installed on all Spark environments in Watson Studio.

The types of spatial queries you will learn to use are:
- In a set of points, find all the points that are within a certain distance to a particular point. For example, find all the hospitals that are within a certain distance to a given location.
- In a set of polygons, find all the polygons that contain a particular point. For example, find all the risk areas for fires, floods, or hurricanes that contain a particular location.
- In a set of points, find all the points that are contained within a particular polygon. For example, find all the retail outlets in a particular region.

Often, a spatial function has one parameter that refers to a spatial column in one table and a second parameter that refers to a spatial constant or to a spatial column in another table. This notebook shows you how to use functions to access and combine data of different types to perform spatial queries.

This notebook runs on Python and Spark.


## Table of Contents


1.	[Register the Spark SQL spatial functions](#register)
2.	[Get sample data](#getData)
3.	[Create a geometry column](#createColumn)
4.	[Register the data frames](#registerDataframe)
5.  [Run spatial queries](#runQueries)  
6.	[Summary](#summary)




<a id="register"></a>
## 1. Register the Spark SQL spatial functions

Register the Spark SQL spatial functions:

In [1]:
spark._jvm.org.apache.spark.sql.types.SqlGeometry.registerAll(spark._jsparkSession)

Waiting for a Spark session to start...
Spark Initialization Done! ApplicationId = app-20191106224935-0001
KERNEL_ID = 5db42561-439c-461b-9e7e-be2d192c5454


<a id="getData"></a>
## 2. Get sample data

This notebook uses a sample data set that is available in the IBM Watson Studio Gallery. Direct links are used by default to make sure this notebook is publicly runnable.

For your own data, load it into a Spark data frame in whichever way suits the location of your data source.

Here are some hints if you are using IBM Cloud Object Storage:
- If your data is uploaded directly into the current project, click the `Find and Add Data` button in the top right corner of the menu bar, then click `Insert SparkSession DataFrame` for the data you want. Code that returns a Spark data frame is generated automatically.
- If your data is hosted in a designated bucket, you can use `ibmos2spark` to read the data into a Spark data frame.


Read the hospital data where each hospital's location is a latitude-longitude point:

In [2]:
import pandas as pd
from pyspark.sql.types import StructType, StructField, IntegerType, StringType, DoubleType

In [3]:
hospital_pdf = pd.read_csv('https://api.dataplatform.cloud.ibm.com/v2/gallery-assets/entries/5562ced564e776edc5f91e13d48d8309/data?accessKey=4d77701840fcb2f21587e39fdb063eeb')

If the above code fails because the link cannot be found, download the `hospitals.csv` data set from the Watson Studio Gallery and insert it manually using the method described above.

In [4]:
hospital_schema = StructType([StructField('id', IntegerType()),
                              StructField('name', StringType()),
                              StructField('city', StringType()),
                              StructField('state', StringType()),
                              StructField('lon', DoubleType()),
                              StructField('lat', DoubleType())])

In [5]:
hospital_df = spark.createDataFrame(hospital_pdf, hospital_schema)

In [6]:
hospital_df.show(3)

+---+--------------------+------------+-----+----------+------------------+
| id|                name|        city|state|       lon|               lat|
+---+--------------------+------------+-----+----------+------------------+
|  1|Southern Hills Me...|  BERRY HILL|   TN|-86.721939|         36.077843|
|  2|Sycamore Shoals H...|ELIZABETHTON|   TN|-82.247635|         36.346218|
|  3|     Tokona Hospital| GREENEVILLE|   TN|-82.845711|36.151771999999994|
+---+--------------------+------------+-----+----------+------------------+
only showing top 3 rows



Read the county data where each county is a polygon/multipolygon:

In [7]:
counties_pdf = pd.read_csv('https://api.dataplatform.cloud.ibm.com/v2/gallery-assets/entries/c8cc28f4c30dc4d8c0b13f18c50c3244/data?accessKey=c8cc28f4c30dc4d8c0b13f18c50fa2d5'
                          )[['NAME', 'STATE_NAME', 'POP2000', 'shape_WKT']]

In [8]:
counties_schema = StructType([StructField('NAME', StringType()),
                              StructField('STATE_NAME', StringType()),
                              StructField('POP2000', IntegerType()),
                              StructField('shape_WKT', StringType())])

In [9]:
counties_df = spark.createDataFrame(counties_pdf, counties_schema)

In [10]:
counties_df.show(3)

+-----------------+----------+-------+--------------------+
|             NAME|STATE_NAME|POP2000|           shape_WKT|
+-----------------+----------+-------+--------------------+
|Lake of the Woods| Minnesota|   4522|MULTIPOLYGON (((-...|
|            Ferry|Washington|   7260|MULTIPOLYGON (((-...|
|          Stevens|Washington|  40066|MULTIPOLYGON (((-...|
+-----------------+----------+-------+--------------------+
only showing top 3 rows



<a id="createColumn"></a>
## 3. Create a geometry column for hospital and county data

The raw spatial data in a data frame can take various forms, for example **columns that hold latitude and longitude** or **a column that holds the WKT string of a geometry**.

Therefore, the first step is to use a spatial function to generate a new spatial column from the data in these columns.  
For example, use the function:
- `ST_Point(lon_col, lat_col)` if the raw spatial data is in a latitude column and a longitude column  
- `ST_WKTToSQL(wkt_col)` if the raw spatial data is in a column containing the well-known text (WKT) string form of the geometry  

For the full list of possible query functions, see [Geospatial Toolkit functions](https://www.ibm.com/support/knowledgecenter/en/SSCJDQ/com.ibm.swg.im.dashdb.analytics.doc/doc/geo_functions.html).

Create a geometry column for the hospital data using `ST_Point(lon, lat)`:

In [11]:
hospital_df.createOrReplaceTempView("hospitals")
hospital_df = spark.sql("SELECT *, ST_Point(lon, lat) as location from hospitals")
hospital_df.show(3, False)

+---+-----------------------------+------------+-----+----------+------------------+------------------------------------------------+
|id |name                         |city        |state|lon       |lat               |location                                        |
+---+-----------------------------+------------+-----+----------+------------------+------------------------------------------------+
|1  |Southern Hills Medical Center|BERRY HILL  |TN   |-86.721939|36.077843         |PointEG: lat=36.077843, long=-86.721939         |
|2  |Sycamore Shoals Hospital     |ELIZABETHTON|TN   |-82.247635|36.346218         |PointEG: lat=36.346218, long=-82.247635         |
|3  |Tokona Hospital              |GREENEVILLE |TN   |-82.845711|36.151771999999994|PointEG: lat=36.151771999999994, long=-82.845711|
+---+-----------------------------+------------+-----+----------+------------------+------------------------------------------------+
only showing top 3 rows



Create a geometry column for the county data using `ST_WKTToSQL(wkt_string)`:

In [12]:
counties_df.createOrReplaceTempView('counties')
counties_df = spark.sql("SELECT NAME, STATE_NAME, POP2000, ST_WKTToSQL(shape_WKT) as shape from counties")
counties_df.show(3)

+-----------------+----------+-------+--------------------+
|             NAME|STATE_NAME|POP2000|               shape|
+-----------------+----------+-------+--------------------+
|Lake of the Woods| Minnesota|   4522|AcceleratedMultiP...|
|            Ferry|Washington|   7260|AcceleratedMultiP...|
|          Stevens|Washington|  40066|AcceleratedMultiP...|
+-----------------+----------+-------+--------------------+
only showing top 3 rows



<a id="registerDataframe"></a>
## 4. Register the hospital and county data frames as a temporary view

Registering a data frame as a temporary view allows you to run SQL queries over its data. The views created in the previous step still refer to the original data frames, which lack the geometry columns, so register the updated hospital and county data frames again as temporary views:

In [13]:
hospital_df.createOrReplaceTempView('hospitals')
counties_df.createOrReplaceTempView('counties')

<a id="runQueries"></a>
## 5. Run spatial queries

1. [Example 1: Query to determine points closest to another point](#ex1)
1. [Example 2: Queries to determine which polygon contains a point](#ex2)
1. [Example 3: Queries to determine the points in a polygon](#ex3)
1. [Example 4: Spatial join queries to determine points in a polygon](#ex4)
1. [Example 5: Spatial join queries with additional predicates and aggregation](#ex5)
1. [Example 6: Window queries](#ex6)
1. [Example 7: Distance queries](#ex7)

<a id = "ex1"></a>
### Example 1: Query to determine points closest to another point

This sample query shows you how to find the hospitals that are within a certain distance (here, 10,000 meters) of a given location, which is constructed using the `ST_Point` constructor.

In [14]:
spark.sql("""
SELECT name, city, state
FROM hospitals
WHERE ST_Distance(location, ST_Point(-77.574722, 43.146732)) < 10000.0
""").show()

+--------------------+-----------+-----+
|                name|       city|state|
+--------------------+-----------+-----+
|Park Avenue Hospital|   BRIGHTON|   NY|
|County Home and I...|   BRIGHTON|   NY|
|    Genesee Hospital|   BRIGHTON|   NY|
|   Highland Hospital|   BRIGHTON|   NY|
|Rochester General...|IRONDEQUOIT|   NY|
|Saint Marys Hospital|   BRIGHTON|   NY|
|Strong Memorial H...|   BRIGHTON|   NY|
+--------------------+-----------+-----+
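
To build intuition for what a distance filter like the one above computes, here is a minimal plain-Python sketch of a great-circle (haversine) distance in meters. It is an assumption that a spherical Earth is an adequate stand-in: the library may use a different Earth model, so its values can differ slightly. The function name `haversine_m` and the candidate location are hypothetical and not part of the spatio-temporal library.

```python
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (spherical assumption)

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

center = (-77.574722, 43.146732)  # query point from the SQL above
candidate = (-77.60, 43.16)       # made-up location, for illustration only

# Same shape as the WHERE clause: keep the candidate if it is closer than 10,000 m.
print(haversine_m(*center, *candidate) < 10_000.0)
```

Note that longitude comes first in every call, matching the argument order of `ST_Point(lon, lat)`.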



<a id = "ex2"></a>
### Example 2: Queries to determine which polygon contains a point

The following sample queries show you how to use spatial functions to determine which polygon contains a given point. The examples use the following functions:

1. `ST_Contains(geom1, geom2)`: returns TRUE if the `geom2` values are completely contained by the polygons identified by `geom1`.
2. `ST_Within(geom1, geom2)`: returns TRUE if the `geom1` values are within the polygons identified by `geom2`.
3. `ST_Intersects(geom1, geom2)`: returns TRUE if `geom1` and `geom2` intersect spatially in any way, that is, if they touch, cross, or contain one another.

In [15]:
spark.sql("""
SELECT NAME 
FROM counties 
WHERE ST_Contains(shape, ST_Point(-74.237, 42.037))
""").show()

+------+
|  NAME|
+------+
|Ulster|
+------+



In [16]:
spark.sql("""
SELECT NAME
FROM counties
WHERE
ST_Within(ST_Point(-74.237, 42.037), shape)
""").show()

+------+
|  NAME|
+------+
|Ulster|
+------+



In [17]:
spark.sql("""
SELECT NAME
FROM counties
WHERE
ST_Intersects(shape, ST_Point(-74.237, 42.037))
""").show()

+------+
|  NAME|
+------+
|Ulster|
+------+
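
The relationship between `ST_Contains` and `ST_Within` is only a matter of argument order, which the following plain-Python sketch illustrates with a simple ray-casting test. This is not how the library is implemented: the helpers `point_in_ring`, `contains`, and `within` are hypothetical, the geometry is treated as planar, points exactly on the boundary are not handled specially, and the square is a made-up stand-in for a county polygon.

```python
def point_in_ring(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the closed ring of (x, y) vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Only edges that straddle the horizontal line through the point can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # each crossing to the right flips the state
    return inside

def contains(polygon, point):
    """Polygon-first form, analogous to ST_Contains(polygon, point)."""
    return point_in_ring(point[0], point[1], polygon)

def within(point, polygon):
    """Point-first form, analogous to ST_Within(point, polygon)."""
    return contains(polygon, point)

# Made-up square around the query point used in the SQL above.
square = [(-75.0, 41.0), (-74.0, 41.0), (-74.0, 43.0), (-75.0, 43.0), (-75.0, 41.0)]
pt = (-74.237, 42.037)
print(contains(square, pt), within(pt, square))
```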



<a id = "ex3"></a>
### Example 3: Queries to determine the points in a polygon

Each of the following queries determines which hospitals are located within the specified polygon, which is defined as a constant using the well-known text (WKT) representation. The polygon definition consists of the character string POLYGON followed by a list of vertices. Each vertex is a pair of $x$ and $y$ coordinates separated by a space, and the vertices are separated by commas. The list of coordinate pairs is enclosed in parentheses, the polygon as a whole is enclosed in a second pair of parentheses, and the first vertex is repeated at the end to close the ring.

In [18]:
spark.sql("""
SELECT name
FROM hospitals
WHERE
ST_Contains(ST_WKTToSQL('POLYGON ((-74.0 42.0, -73.0 42.0, -73.0 43.0, -74.0 43.0, -74.0 42.0))'), location)
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows
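
The WKT polygon syntax described above can be sketched with a small plain-Python helper that builds the string from a list of corner coordinates. The helper name `polygon_wkt` is hypothetical and not part of the library; it handles only a single outer ring without holes.

```python
def polygon_wkt(vertices):
    """Build WKT text for a simple polygon from (x, y) vertices, closing the ring if needed."""
    ring = list(vertices)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # a WKT ring repeats its first vertex at the end
    coords = ", ".join(f"{x} {y}" for x, y in ring)
    return f"POLYGON (({coords}))"

# Corners of the rectangle used in the query above, x (longitude) before y (latitude).
corners = [(-74.0, 42.0), (-73.0, 42.0), (-73.0, 43.0), (-74.0, 43.0)]
print(polygon_wkt(corners))
```

The resulting string can be passed to `ST_WKTToSQL` inside a query, for example through Python string formatting.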



In [19]:
spark.sql("""
SELECT name 
FROM hospitals 
WHERE ST_Within(location, ST_WKTToSQL('POLYGON ((-74.0 42.0, -73.0 42.0, -73.0 43.0, -74.0 43.0, -74.0 42.0))'))
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows



In [20]:
spark.sql("""
SELECT name 
FROM hospitals 
WHERE ST_Intersects(location, ST_WKTToSQL('POLYGON ((-74.0 42.0, -73.0 42.0, -73.0 43.0, -74.0 43.0, -74.0 42.0))'))
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows



<a id = "ex4"></a>
### Example 4: Spatial join queries to determine points in a polygon

Just as a regular join function can join two tables based on the values in columns that contain character or numeric data, spatial join functions can be used to join tables based on the values in the columns that contain spatial data. The following examples use the **counties** and **hospitals** tables.

You can use a spatial join to find the hospitals located within a specific county. For example, the following query returns a list of all the hospitals in Dutchess County:

In [21]:
spark.sql("""
SELECT c.NAME, h.name 
FROM counties AS c, hospitals AS h 
WHERE c.NAME = 'Dutchess' 
AND ST_Intersects(c.shape, h.location)
""").show()

+--------+--------------------+
|    NAME|                name|
+--------+--------------------+
|Dutchess|Hudson River Stat...|
|Dutchess|Vassar Brothers H...|
|Dutchess|      Bowne Hospital|
|Dutchess|Harlem Valley Sta...|
|Dutchess|Matteawan State H...|
|Dutchess|New York State Ho...|
|Dutchess|Saint Francis Hos...|
|Dutchess|United States Vet...|
+--------+--------------------+



Alternatively, you can use the SQL `JOIN ... ON ...` notation, which is equivalent to a spatial predicate in the `WHERE` clause. For example, the following query produces the same result set as the previous query:

In [22]:
spark.sql("""
SELECT h.name, c.NAME
FROM counties AS c
JOIN hospitals AS h
ON c.NAME = 'Dutchess'
AND ST_Intersects(h.location, c.shape)
""").show()

+--------------------+--------+
|                name|    NAME|
+--------------------+--------+
|Hudson River Stat...|Dutchess|
|Vassar Brothers H...|Dutchess|
|      Bowne Hospital|Dutchess|
|Harlem Valley Sta...|Dutchess|
|Matteawan State H...|Dutchess|
|New York State Ho...|Dutchess|
|Saint Francis Hos...|Dutchess|
|United States Vet...|Dutchess|
+--------------------+--------+



The following query returns the name of the county in which a particular hospital is located:

In [23]:
spark.sql("""
SELECT c.NAME, h.name
FROM hospitals AS h, counties AS c
WHERE ST_Intersects(h.location, c.shape)
AND h.name = 'Vassar Brothers Hospital'
""").show()

+--------+--------------------+
|    NAME|                name|
+--------+--------------------+
|Dutchess|Vassar Brothers H...|
+--------+--------------------+
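
Conceptually, a spatial join pairs rows from two tables whenever a spatial predicate holds for the pair, in the same way that an equality join pairs rows with matching keys. The following plain-Python sketch shows that idea as a nested loop. It says nothing about how Spark actually executes the join; the names `spatial_join` and `box_contains` are hypothetical, and the rectangular regions and points are made up.

```python
def spatial_join(regions, points, predicate):
    """Pair every region with every point for which predicate(region, point) is true."""
    return [(r["name"], p["name"]) for r in regions for p in points if predicate(r, p)]

def box_contains(region, point):
    """Stand-in spatial predicate: is the point inside the region's rectangle?"""
    min_x, min_y, max_x, max_y = region["box"]
    return min_x <= point["lon"] <= max_x and min_y <= point["lat"] <= max_y

# Made-up regions and points, for illustration only.
regions = [
    {"name": "West", "box": (-75.0, 41.0, -74.0, 43.0)},
    {"name": "East", "box": (-74.0, 41.0, -73.0, 43.0)},
]
points = [
    {"name": "A", "lon": -74.5, "lat": 42.0},
    {"name": "B", "lon": -73.5, "lat": 42.5},
    {"name": "C", "lon": -70.0, "lat": 42.0},  # falls in no region, so it is dropped
]
print(spatial_join(regions, points, box_contains))
```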



<a id = "ex5"></a>
### Example 5: Spatial join queries with additional predicates and aggregation

This example shows you how to use spatial joins in conjunction with additional predicates and aggregation, which can address business problems. These examples continue to use the hospitals and counties tables, but the same principles could be applied to any other type of data.

The following example queries the hospitals within each county in New York state, qualifying by the state name in the counties table.

In [24]:
spark.sql("""
SELECT c.NAME, h.name
FROM counties AS c, hospitals AS h
WHERE ST_Intersects(h.location, c.shape)
AND c.STATE_NAME='New York'
ORDER BY c.NAME, h.name
""").show(3)

+------+--------------------+
|  NAME|                name|
+------+--------------------+
|Albany|     Albany Hospital|
|Albany|Albany Hospital F...|
|Albany|Albany Hospital S...|
+------+--------------------+
only showing top 3 rows



The same results can be obtained by rewriting the above query and using the fields from the hospitals table:

In [25]:
spark.sql("""
SELECT c.NAME, h.name
FROM hospitals AS h, counties AS c
WHERE ST_Intersects(h.location, c.shape)
AND h.state='NY'
ORDER BY c.NAME, h.name
""").show(3)

+------+--------------------+
|  NAME|                name|
+------+--------------------+
|Albany|     Albany Hospital|
|Albany|Albany Hospital F...|
|Albany|Albany Hospital S...|
+------+--------------------+
only showing top 3 rows



The following example lists the number of hospitals per county in New York:

In [26]:
spark.sql("""
SELECT c.NAME, COUNT(h.name) AS hospital_count
FROM counties AS c, hospitals AS h
WHERE ST_Intersects(h.location, c.shape)
AND c.STATE_NAME='New York'
GROUP BY c.NAME
""").show(3)

+------+--------------+
|  NAME|hospital_count|
+------+--------------+
|Cayuga|             1|
| Kings|            26|
|Monroe|             9|
+------+--------------+
only showing top 3 rows



To identify counties where the population is underserved by hospitals, an interesting metric might be the number of people per hospital in each county. Using the population of each county in the year 2000, you can calculate this number.

In [27]:
spark.sql("""
SELECT c.NAME, 
COUNT(h.name) AS hospital_count, 
c.POP2000 AS Population, 
c.POP2000/COUNT(h.name) AS people_per_hospital
FROM counties AS c, hospitals AS h
WHERE c.STATE_NAME='New York'
AND ST_Intersects(h.location, c.shape)
GROUP BY c.NAME, c.POP2000
ORDER BY people_per_hospital DESC
""").show(3)

+----------+--------------+----------+-------------------+
|      NAME|hospital_count|Population|people_per_hospital|
+----------+--------------+----------+-------------------+
|     Bronx|             9|   1332650| 148072.22222222222|
|Chautauqua|             1|    139750|           139750.0|
|    Oswego|             1|    122377|           122377.0|
+----------+--------------+----------+-------------------+
only showing top 3 rows
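
One caveat is worth illustrating: the query above is an inner join, so a county that contains no hospital produces no rows and is missing from the result, even though it is arguably the most underserved. The following plain-Python sketch, with made-up county names and numbers and a hypothetical helper `people_per_hospital`, keeps such counties and marks their ratio as `None`.

```python
from collections import Counter

def people_per_hospital(matches, population):
    """Map each county to population / hospital count, or None if it has no hospital."""
    counts = Counter(county for county, _ in matches)
    return {
        county: (pop / counts[county] if counts[county] else None)
        for county, pop in population.items()
    }

# (county, hospital) pairs as a spatial join would return them; all values are made up.
matches = [("A", "h1"), ("A", "h2"), ("B", "h3")]
population = {"A": 1000, "B": 2000, "C": 300}  # county C has no matching hospital

print(people_per_hospital(matches, population))
```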



With additional detail, such as the number of beds or the number of doctors per hospital, you could determine a better measure of health care coverage per state and population.

<a id = "ex6"></a>
### Example 6: Window queries

A common use case for mapping applications, and in particular for web mapping, is to select objects that fall within a specific rectangular region. This can be done by creating a polygon to represent the rectangle and using the `ST_Intersects` spatial predicate.

In [28]:
spark.sql("""
SELECT name
FROM hospitals
WHERE ST_Intersects(location, ST_WKTToSQL(
 'POLYGON ((-74.0 42.0, -73.0 42.0, -73.0 43.0, -74.0 43.0, -74.0 42.0))'))
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows



Another spatial predicate that does the same is `EnvelopesIntersect`, which selects objects whose envelope (bounding rectangle) intersects a rectangular region called a window. `EnvelopesIntersect` takes the name of the spatial column as its first parameter, followed by four Double values that represent the lower-left and upper-right corners of the rectangle. This spatial predicate is simpler to use and is more efficient than `ST_Intersects` for rectangular windows.

In [29]:
spark.sql("""
SELECT name
FROM hospitals
WHERE EnvelopesIntersect(location, -74.0, 42.0, -73.0, 43.0)
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows



Because this predicate is true if any portion of a line or polygon geometry falls within the specified window, parts of the line or polygon might lie outside of the window. This is generally not a problem with mapping applications, which will discard geometries that lie outside the display window.
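
The envelope test can be sketched in plain Python as a comparison of two bounding boxes. This is a conceptual illustration only: the helpers `envelope` and `envelopes_intersect` are hypothetical, and the geometries are made up. It shows why a line that starts outside the window still matches.

```python
def envelope(coords):
    """Bounding box (min_x, min_y, max_x, max_y) of a list of (x, y) coordinates."""
    xs, ys = zip(*coords)
    return min(xs), min(ys), max(xs), max(ys)

def envelopes_intersect(coords, min_x, min_y, max_x, max_y):
    """True if the geometry's bounding box overlaps the query window."""
    gx1, gy1, gx2, gy2 = envelope(coords)
    return gx1 <= max_x and gx2 >= min_x and gy1 <= max_y and gy2 >= min_y

window = (-74.0, 42.0, -73.0, 43.0)     # lower-left and upper-right corners
point = [(-73.5, 42.5)]                 # made-up point inside the window
line = [(-74.5, 42.5), (-73.5, 42.5)]   # made-up line that starts outside the window
far = [(-70.0, 40.0)]                   # made-up point well outside the window

print(envelopes_intersect(point, *window))
print(envelopes_intersect(line, *window))
print(envelopes_intersect(far, *window))
```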

When building web-mapping applications with widely used web-mapping APIs from providers such as Google Maps, Yahoo! Maps, Bing Maps, and others, it is necessary to provide the longitude and latitude values to be used in placing custom markers on the map. You can get this information by using a query such as the following:

In [30]:
spark.sql("""
SELECT name, ST_X(location) AS longitude, ST_Y(location) AS latitude
FROM hospitals
WHERE EnvelopesIntersect(location, -74.0, 42.0, -73.0, 43.0)
""").show(3)

+--------------------+----------+---------+
|                name| longitude| latitude|
+--------------------+----------+---------+
|   Marshall Hospital|-73.678177|42.718971|
|Southwestern Medi...|-73.206772|42.874249|
|  Hillcrest Hospital|-73.280113|42.457863|
+--------------------+----------+---------+
only showing top 3 rows
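
Many web-mapping libraries accept markers as GeoJSON, so one possible next step is to convert the query result into a GeoJSON `FeatureCollection`. The sketch below assumes that the rows have already been brought to the driver as `(name, longitude, latitude)` tuples, for example with `collect()` on a small result; the helper name `to_feature_collection` is hypothetical.

```python
import json

def to_feature_collection(rows):
    """Convert (name, longitude, latitude) rows to a GeoJSON FeatureCollection dict."""
    features = [
        {
            "type": "Feature",
            # GeoJSON lists longitude before latitude.
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"name": name},
        }
        for name, lon, lat in rows
    ]
    return {"type": "FeatureCollection", "features": features}

# Two rows copied from the output above.
rows = [
    ("Marshall Hospital", -73.678177, 42.718971),
    ("Hillcrest Hospital", -73.280113, 42.457863),
]
print(json.dumps(to_feature_collection(rows), indent=2))
```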



<a id = "ex7"></a>
### Example 7: Distance queries

Another common spatial query is to find things within a specified distance of a particular location. You have probably used web-mapping applications to get this kind of information. You can issue SQL queries from your application for questions like:

- Find customers within 10 miles of a store
- Find ATMs within 500 meters of the current location
- Find competitive stores within 10 kilometers of a proposed store location

The spatial function used for these queries is `ST_Distance`, which computes the distance between the spatial values and returns a result in meters. 
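
Because `ST_Distance` returns meters, thresholds that are stated in miles or kilometers have to be converted before they are used in a query. A minimal sketch with hypothetical helper names, using the international mile of exactly 1609.344 meters:

```python
METERS_PER_MILE = 1609.344  # international mile
METERS_PER_KM = 1000.0

def miles_to_meters(miles):
    """Convert a distance in miles to meters."""
    return miles * METERS_PER_MILE

def km_to_meters(km):
    """Convert a distance in kilometers to meters."""
    return km * METERS_PER_KM

# Thresholds for the three example questions above, expressed in meters.
for label, meters in [
    ("customers within 10 miles", miles_to_meters(10)),
    ("ATMs within 500 meters", 500.0),
    ("stores within 10 kilometers", km_to_meters(10)),
]:
    print(f"{label}: {meters:.2f} m")
```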

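The following query, which selects hospitals inside a rectangular region by using `ST_Intersects`, generates eight results; `show(3)` displays only the first three: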

In [31]:
spark.sql("""
SELECT name
FROM hospitals
WHERE ST_Intersects(location, ST_WKTToSQL(
 'POLYGON ((-74.0 42.0, -73.0 42.0, -73.0 43.0, -74.0 43.0, -74.0 42.0))'))
""").show(3)

+--------------------+
|                name|
+--------------------+
|   Marshall Hospital|
|Southwestern Medi...|
|  Hillcrest Hospital|
+--------------------+
only showing top 3 rows



Another way to run a distance query is to use the `ST_Buffer` function, which creates a buffer around a given geometry (a circle when the geometry is a point) so that you can select the geometries that fall within it. `ST_Buffer` takes a spatial geometry and a buffer distance in meters as parameters. The following query combines `ST_Buffer` with `ST_Intersects` to find the hospitals within 46,800 meters of a point. Because the search area differs from the rectangle in the previous query, the result set differs as well.

In [32]:
spark.sql("""
SELECT name
FROM hospitals
WHERE
ST_Intersects(location,
  ST_Buffer(ST_Point(-74.237, 42.037), 46800.0))
ORDER BY name
""").show(3)

+--------------------+
|                name|
+--------------------+
|      Adventist Home|
|      Bowne Hospital|
|Columbia Memorial...|
+--------------------+
only showing top 3 rows



The following query returns the distance from a specified point to each object within a radius of 46,800 meters (approximately 29 miles):

In [33]:
spark.sql("""
SELECT name, ST_Distance(location, ST_Point(-74.237, 42.037)) AS distance
FROM hospitals
WHERE ST_Distance(location, ST_Point(-74.237, 42.037)) < 46800.0
ORDER BY distance
""").show(3)

+--------------------+------------------+
|                name|          distance|
+--------------------+------------------+
|Greene County Mem...| 36634.88406671428|
|      Adventist Home|39014.090792071736|
|Hudson River Stat...| 43362.54508885234|
+--------------------+------------------+
only showing top 3 rows



You could also use `ST_Buffer` to compute the spatial relation and then determine the distance, as shown in the following query:

In [34]:
spark.sql("""
SELECT name, ST_Distance(location, ST_Point(-74.237, 42.037)) AS distance
FROM hospitals
WHERE
  ST_Intersects(location,
  ST_Buffer(ST_Point(-74.237, 42.037), 46800.0))
ORDER BY distance
""").show(3)

+--------------------+------------------+
|                name|          distance|
+--------------------+------------------+
|Greene County Mem...| 36634.88406671428|
|      Adventist Home|39014.090792071736|
|Hudson River Stat...| 43362.54508885234|
+--------------------+------------------+
only showing top 3 rows



Note that `ST_Buffer` in this package supports buffering of arbitrary geometries, not only points, so you can run the same kind of query around a line or a polygon, as the query at the end of this section does with a line string. Keep in mind that:
- An `ST_Buffer` query on large geometries can be expensive.
- For a large number of geometries, calculate the buffers separately, store them in a column, and operate on the stored buffers.

In [35]:
spark.sql("""
SELECT name, ST_Distance(location, ST_WKTToSQL(
 'LINESTRING (-74.0 42.0, -73.0 42.0)'))
FROM hospitals
WHERE ST_Intersects(location, ST_Buffer(ST_WKTToSQL(
 'LINESTRING (-74.0 42.0, -73.0 42.0)'), 46800.0))
""").show(3)

+--------------------+-------------------------------------------------------------------------------+
|                name|UDF:ST_Distance(location, UDF:ST_WKTToSQL(LINESTRING (-74.0 42.0, -73.0 42.0)))|
+--------------------+-------------------------------------------------------------------------------+
|      Avery Hospital|                                                              38777.96495714722|
|   Hartford Hospital|                                                               38204.0148377887|
|Springfield Munic...|                                                              39761.18673983218|
+--------------------+-------------------------------------------------------------------------------+
only showing top 3 rows
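
The query above relies on the fact that a point intersects the buffer of a line exactly when its distance to the line is at most the buffer radius. The following plain-Python sketch illustrates that equivalence for a single segment. It assumes planar coordinates in one linear unit, whereas the library works on longitude and latitude and returns meters, so it is a conceptual illustration only; the helper names are hypothetical.

```python
import math

def point_segment_distance(p, a, b):
    """Planar distance from point p to segment a-b; all coordinates share one linear unit."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(px - ax, py - ay)  # degenerate segment: a and b coincide
    # Project p onto the segment and clamp the projection to the endpoints.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def in_buffer(p, a, b, radius):
    """True if p lies inside the buffer of the given radius around segment a-b."""
    return point_segment_distance(p, a, b) <= radius

a, b = (0.0, 0.0), (10.0, 0.0)  # made-up segment
print(point_segment_distance((5.0, 3.0), a, b))
print(in_buffer((5.0, 3.0), a, b, 4.0), in_buffer((5.0, 3.0), a, b, 2.0))
```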



<a id="summary"></a>
## 6. Summary

In this notebook, you learned how to query spatial data you downloaded from the IBM Watson Studio Gallery. You registered each data frame (one with data on hospitals and another with county information) as a table to run your queries on. The sample queries showed you how to determine the hospitals within a certain distance or in a polygon, to find the name of the county in which a hospital is located, or to identify the counties where the population is underserved by hospitals. The sample queries showed you how to use and combine the most common Spark SQL spatial functions in queries. 

### Author

**Linsong Chu**, Research Engineer at IBM Research

Copyright © 2019 IBM. This notebook and its source code are released under the terms of the MIT License.
