<header style="padding:1px;background:#f9f9f9;border-top:3px solid #00b2b1"><img id="Teradata-logo" src="https://www.teradata.com/Teradata/Images/Rebrand/Teradata_logo-two_color.png" alt="Teradata" width="220" align="right" />

<b style = 'font-size:28px;font-family:Arial;color:#E37C4D'>4D Analytics using the New York City Taxi dataset --Geospatial</b>
</header>

<p style = 'font-size:18px;font-family:Arial;color:#E37C4D'><b>Introduction</b></p>

<p style = 'font-size:16px;font-family:Arial'>
This is a demonstration of Vantage capabilities for:
</p>
<ul>
    <li style = 'font-size:16px;font-family:Arial'>
Geospatial analysis using the ST_GEOMETRY data type and the ST_SphericalDistance method
    </li>
</ul>
        

<p style = 'font-size:18px;font-family:Arial;color:#E37C4D'> <b> Accessing the Data </b> </p>
<p style = 'font-size:16px;font-family:Arial'>These demos work either with foreign tables accessed from cloud storage via NOS or with tables imported to your machine. If you import data for multiple demos, you may need to use the Data Dictionary "Manage Your Space" routine to clean up tables you no longer need.</p>
    
<p style = 'font-size:16px;font-family:Arial'>Use the links below to access the two options for getting the data from the Data Dictionary notebook:</p>

[Click Here to get data for this notebook](../Data_Dictionary/Data_Dictionary.ipynb#TRNG_NYCTaxi)

[Click Here to Manage Your Space](../Data_Dictionary/Data_Dictionary.ipynb#Manage_Your_Space)

<p style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Connect to Vantage and explore the dataset</b></p>
The command below connects to the Vantage environment.


In [None]:
%connect local

<p style = 'font-size:18px;font-family:Arial;color:#E37C4D'> <b> Access data in Vantage  </b> </p>
<p style = 'font-size:16px;font-family:Arial'>For this demo, data is already resident in Object Storage which we are accessing via ReadNOS.  Create a reference to the table, and sample the contents.  Data could just as easily reside in permanent tables, another RDBMS, or another Vantage system.<br>
This demonstration will use two tables: the taxi trip details and the fares for each trip. The queries below will sample each table and then show the range of the time period covered by the data. <br>
<i>*You can skip to Geospatial Analysis if you have already seen the trip and fare tables in another notebook in the series.</i></p>

In [None]:
SELECT top 10 * from TRNG_NYCTaxi.trip;

In [None]:
SELECT top 10 * from TRNG_NYCTaxi.trip_fare;

In [None]:
sel min(pickup_datetime), max(dropoff_datetime) from TRNG_NYCTaxi.trip;

<p style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Geospatial Analysis </b></p>
<p style = 'font-size:16px;font-family:Arial'> Now that we have seen the trip and fare details, let's define a few landmarks. </p>


In [None]:
CREATE VOLATILE TABLE dim_geo_locations
     (
      location VARCHAR(100),
      Lat FLOAT,
      Lon FLOAT,
      geo_point SYSUDTLIB.ST_GEOMETRY(16776192) INLINE LENGTH 9920)
PRIMARY INDEX ( location )
ON COMMIT PRESERVE ROWS;

In [None]:
insert into dim_geo_locations values('Columbia University',40.81,-73.96,'POINT(40.81 -73.96)');
insert into dim_geo_locations values('Empire State Building',40.75,-73.99,'POINT(40.75 -73.99)');
insert into dim_geo_locations values('Grand Central Station',40.75,-73.98,'POINT(40.75 -73.98)');
insert into dim_geo_locations values('JFK Airport',40.64,-73.79,'POINT(40.64 -73.79)');
insert into dim_geo_locations values('Madison Square Garden',40.75,-73.99,'POINT(40.75 -73.99)');
insert into dim_geo_locations values('New York Stock Exchange',40.71,-74.01,'POINT(40.71 -74.01)');
insert into dim_geo_locations values('Times Square',40.76,-73.99,'POINT(40.76 -73.99)');
insert into dim_geo_locations values('United Nations HQ',40.75,-73.97,'POINT(40.75 -73.97)');
insert into dim_geo_locations values('Yankee Stadium',40.83,-73.93,'POINT(40.83 -73.93)');

<p style = 'font-size:16px;font-family:Arial'> We cast the coordinates to the ST_GEOMETRY type. For example, the coordinates of Yankee Stadium become a point:

In [None]:
sel cast('POINT(40.83 -73.93)' as ST_GEOMETRY);

<p style = 'font-size:16px;font-family:Arial'> Let's filter for rides that start within 1 km of 'Yankee Stadium'. ST_SphericalDistance returns a distance in meters, so the threshold in the join condition is 1000.

In [None]:
sel
l.location
,cast('POINT('||trim(r.pickup_latitude (Decimal(15,6)))||' '||trim(r.pickup_longitude (Decimal(15,6)))||')' as ST_GEOMETRY) pickup_point
,r.*
from TRNG_NYCTaxi.trip r
join dim_geo_locations l
on pickup_point.ST_SphericalDistance(l.geo_point)<1000
where (r.pickup_datetime (date)) = '2013-11-10'
    and l.location='Yankee Stadium'
;
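
<p style = 'font-size:16px;font-family:Arial'> To build intuition for the distance filter above, here is a minimal sketch in plain Python (run outside the SQL kernel) of a great-circle distance check using the haversine formula. It is an illustration only, not Vantage's implementation: the Earth radius constant and the helper names <code>spherical_distance_m</code> and <code>within_radius</code> are assumptions made for this example, and the sample pickup coordinates are invented. The function takes latitude and longitude as separately named arguments, so it does not depend on the coordinate order inside a WKT string.</p>

```python
import math

# Mean Earth radius in meters. This value is an assumption for the sketch;
# the radius used internally by Vantage may differ slightly.
EARTH_RADIUS_M = 6_371_000.0


def spherical_distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    # Haversine formula
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(lat, lon, center_lat, center_lon, radius_m=1000.0):
    """True when the point lies strictly inside radius_m of the center."""
    return spherical_distance_m(lat, lon, center_lat, center_lon) < radius_m


# Landmark coordinates (latitude, longitude) taken from dim_geo_locations.
YANKEE_STADIUM = (40.83, -73.93)
TIMES_SQUARE = (40.76, -73.99)

# Hypothetical pickup location a few hundred meters from the stadium.
nearby_pickup = (40.829, -73.926)

print(within_radius(*nearby_pickup, *YANKEE_STADIUM))
print(within_radius(*TIMES_SQUARE, *YANKEE_STADIUM))
```

<p style = 'font-size:16px;font-family:Arial'> The first check passes the 1000 m threshold and the second does not, which mirrors how the join condition keeps only trips that start close to the chosen landmark.</p>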

<p style = 'font-size:16px;font-family:Arial'> How many pickups occur near 'Yankee Stadium' in each 15-minute interval throughout the month?

In [None]:
sel
$TD_TIMECODE_RANGE time_bucket_per
,l.location
,count(1) pickup_cnt
from TRNG_NYCTaxi.trip r
join dim_geo_locations l
	on cast('POINT('||trim(r.pickup_latitude (Decimal(15,6)))||' '||trim(r.pickup_longitude (Decimal(15,6)))||')' as ST_GEOMETRY).ST_SphericalDistance(l.geo_point)<1000
group by time(minutes(15) and l.location)
USING TIMECODE(pickup_datetime)
where extract(month from pickup_datetime)=11
        and l.location='Yankee Stadium'
order by 2,1;
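
<p style = 'font-size:16px;font-family:Arial'> The 15-minute bucketing in the query above can be sketched in plain Python (run outside the SQL kernel). This is an illustration of the idea, not of how Vantage evaluates GROUP BY TIME: the helper names <code>bucket_start</code> and <code>pickups_per_bucket</code> are hypothetical, the sample timestamps are invented, and the sketch assumes buckets aligned to quarter hours starting at midnight.</p>

```python
from collections import Counter
from datetime import datetime


def bucket_start(ts, minutes=15):
    """Floor a timestamp to the start of its bucket (aligned to midnight)."""
    since_midnight = ts.hour * 60 + ts.minute
    floored = since_midnight - since_midnight % minutes
    return ts.replace(hour=floored // 60, minute=floored % 60,
                      second=0, microsecond=0)


def pickups_per_bucket(pickup_times, minutes=15):
    """Count pickups per bucket; buckets with no pickups are simply absent."""
    return Counter(bucket_start(ts, minutes) for ts in pickup_times)


# Invented pickup timestamps, for illustration only.
sample = [
    datetime(2013, 11, 10, 18, 2, 11),
    datetime(2013, 11, 10, 18, 14, 59),
    datetime(2013, 11, 10, 18, 15, 0),
    datetime(2013, 11, 10, 23, 59, 59),
]

counts = pickups_per_bucket(sample)
for start in sorted(counts):
    print(start, counts[start])
```

<p style = 'font-size:16px;font-family:Arial'> Each pickup is assigned to exactly one interval, and a timestamp that falls exactly on a boundary (18:15:00 here) belongs to the interval that starts at that boundary.</p>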

<p style = 'font-size:16px;font-family:Arial'> What is the average demand at 'Yankee Stadium' throughout the day, based on November data?

In [None]:
sel *
from 
(
	sel
	begin($TD_TIMECODE_RANGE) (time) timeOfDay
	,l.location
	,count(1) pickup_cnt
	from TRNG_NYCTaxi.trip r
	join dim_geo_locations l
		on cast('POINT('||trim(r.pickup_latitude (Decimal(15,6)))||' '||trim(r.pickup_longitude (Decimal(15,6)))||')' as ST_GEOMETRY).ST_SphericalDistance(l.geo_point)<1000
	group by time(minutes(15) and l.location)
	USING TIMECODE(pickup_datetime)
	where extract(month from pickup_datetime)=11
) AS dt 
PIVOT(
    avg(pickup_cnt) FOR location IN (sel distinct location from dim_geo_locations where location='Yankee Stadium')
) dt
order by 1;
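
<p style = 'font-size:16px;font-family:Arial'> The two-step logic of the query above (count pickups per 15-minute interval, then average those counts by time of day) can be sketched in plain Python, run outside the SQL kernel. The helper name <code>average_by_time_of_day</code> and the sample timestamps are assumptions made for this illustration. The sketch also assumes that intervals with no pickups produce no row, so they do not pull the average down.</p>

```python
from collections import Counter, defaultdict
from datetime import datetime, time


def average_by_time_of_day(pickup_times, minutes=15):
    """Average pickups per interval, keyed by the interval's time of day."""
    # Step 1: count pickups per (calendar day, interval start time).
    per_bucket = Counter()
    for ts in pickup_times:
        m = ts.hour * 60 + ts.minute
        m -= m % minutes
        per_bucket[(ts.date(), time(m // 60, m % 60))] += 1

    # Step 2: collect the counts for each time of day across all days.
    by_time_of_day = defaultdict(list)
    for (_day, tod), cnt in per_bucket.items():
        by_time_of_day[tod].append(cnt)

    # Step 3: average only over intervals that had at least one pickup.
    return {tod: sum(c) / len(c) for tod, c in sorted(by_time_of_day.items())}


# Invented pickup timestamps spanning two days, for illustration only.
sample = [
    datetime(2013, 11, 10, 18, 1),
    datetime(2013, 11, 10, 18, 5),
    datetime(2013, 11, 10, 18, 10),
    datetime(2013, 11, 11, 18, 3),
    datetime(2013, 11, 11, 19, 20),
]

averages = average_by_time_of_day(sample)
for tod, avg in averages.items():
    print(tod, avg)
```

<p style = 'font-size:16px;font-family:Arial'> In this sample the 19:15 interval has pickups on only one of the two days, and its average is computed over that single day. Keep this in mind when reading averages for quiet times of day.</p>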

<p style = 'font-size:28px;font-family:Arial;color:#E37C4D'><b>Conclusion</b></p>
<p style = 'font-size:16px;font-family:Arial'>
In this demonstration we have seen that Vantage can store common geometry types such as points and linestrings in the ST_GEOMETRY data type, and that its built-in geospatial functions are simple to use. For more information on the geometry data type and functions, please refer to the link below.</p>

<ul style = 'font-size:16px;font-family:Arial'> 
       
  <li>Teradata® Geospatial Utilities User Guide: <a href = 'https://docs.teradata.com/r/Teradata-Geospatial-Utilities-User-Guide/June-2022/Teradata-Geospatial-Utilities-Overview/Welcome-to-Teradata-Tools-and-Utilities-Teradata-Geospatial-Utilities-User-Guide'>https://docs.teradata.com/r/Teradata-Geospatial-Utilities-User-Guide/June-2022/Teradata-Geospatial-Utilities-Overview/Welcome-to-Teradata-Tools-and-Utilities-Teradata-Geospatial-Utilities-User-Guide</a></li>
  
</ul>

<footer style="padding:10px;background:#f9f9f9;border-bottom:3px solid #394851">©2023 Teradata. All Rights Reserved</footer>