<header>
   <p  style='font-size:36px;font-family:Arial; color:#F0F0F0; background-color: #00233c; padding-left: 20pt; padding-top: 20pt;padding-bottom: 10pt; padding-right: 20pt;'>
       TD GeoDataFrame to ESRI conversion
  <br>
       <img id="teradata-logo" src="https://storage.googleapis.com/clearscape_analytics_demo_data/DEMO_Logo/teradata.svg" alt="Teradata" style="width: 125px; height: auto; margin-top: 20pt;">
    </p>
</header>

<p style = 'font-size:20px;font-family:Arial'><b>Introduction</b></p>
<p style = 'font-size:16px;font-family:Arial'>
Geospatial information identifies the geographic location of features and boundaries on the planet. 
Vantage provides geospatial types that represent geometries with up to three dimensions: the ST_Geometry, MBB, and MBR data types for creating and manipulating geometric shapes in the database. ST_Geometry is implemented as a user-defined type (UDT). Users can run complex computations on geospatial data in Vantage with the built-in geospatial functions. This notebook shows how to extract Teradata geospatial data and export it to widely used formats such as ESRI shapefiles, which can then be used for visualization.</p>
    

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>1. Connect to Vantage</b></p>

<p style = 'font-size:16px;font-family:Arial'>We start by installing the required libraries and, if required, setting environment variables and paths.</p>

In [None]:
!pip install --upgrade geopandas

<div class="alert alert-block alert-info">
<p style = 'font-size:16px;font-family:Arial'><b>Note: </b><i>The above library needs to be upgraded for some of the functions used in this demonstration. Please be sure to restart the kernel after installing or upgrading the library. The simplest way to restart the kernel is to press the zero key twice: <b> 0 0</b></i></p>
</div>

<p style = 'font-size:16px;font-family:Arial'>In this section, we import the required libraries.</p> 

In [None]:
# Standard libraries
import getpass
import os
import warnings

# Third-party libraries
import geopandas as gpd
from shapely import wkt

# Teradata libraries
from teradataml import *
display.max_rows = 5

# Suppress warnings
warnings.filterwarnings('ignore')
warnings.simplefilter(action='ignore', category=DeprecationWarning)
warnings.simplefilter(action='ignore', category=RuntimeWarning)
warnings.simplefilter(action='ignore', category=FutureWarning)

<p style = 'font-size:16px;font-family:Arial'>We will be prompted to provide the password. We will enter the password, press the Enter key, and then use the down arrow to go to the next cell. Begin running steps with Shift + Enter keys.</p>

In [None]:
%run -i ~/JupyterLabRoot/UseCases/startup.ipynb
eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)
print(eng)

In [None]:
%%capture
execute_sql('''SET query_band='DEMO=TD_Geo_Esri.ipynb;' UPDATE FOR SESSION; ''')

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>2. Getting Data for This Demo </b></p>
<p style = 'font-size:16px;font-family:Arial'>We have provided data for this demo on cloud storage. We are downloading the data to local storage.</p>

In [None]:
%run -i ~/JupyterLabRoot/UseCases/run_procedure.py "call get_data('DEMO_TelcoNetwork_local');" 

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>3. Geospatial data in a teradataml GeoDataFrame</b></p>

<p style = 'font-size:16px;font-family:Arial'>For this demo, we load the Cell_Towers table from the DEMO_TelcoNetwork database into a teradataml GeoDataFrame. We use a GeoDataFrame when a table contains a geometry column, such as Point or LineString; otherwise, we use a teradataml DataFrame.</p>

In [None]:
res1 = GeoDataFrame(in_schema("DEMO_TelcoNetwork", "Cell_Towers"))
res1

<p style = 'font-size:16px;font-family:Arial'>We check the shape and the Teradata data types of the teradataml GeoDataFrame.</p>

In [None]:
res1.shape

In [None]:
res1.tdtypes

<p style = 'font-size:16px;font-family:Arial'>From the output above, we can see that the table has 303 records and that cell_geom has the Geometry data type.<br>Next, we check which pandas data types these columns map to.</p>

In [None]:
res1.dtypes

<p style = 'font-size:16px;font-family:Arial'>From the output above, we can see that the Teradata Geometry data type is mapped to str in pandas. After converting the teradataml GeoDataFrame to a pandas DataFrame, we therefore have to convert the cell_geom column to a geometry data type.</p>

<p style = 'font-size:18px;font-family:Arial'><b>3.1 Converting a pandas DataFrame to a GeoPandas GeoDataFrame</b></p>

In [None]:
df = res1.to_pandas()
type(df)

In [None]:
# List the columns whose Teradata type is GEOMETRY(); skip lines without a type token
tdtype_rows = [line.split() for line in str(res1.tdtypes).split('\n')]
geo_cols = [row[0] for row in tdtype_rows if len(row) > 1 and row[1] == 'GEOMETRY()']
geo_cols
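
<p style = 'font-size:16px;font-family:Arial'>The step above depends on the text layout of <code>tdtypes</code>, where each line holds a column name followed by its type. As a minimal sketch, the same parsing can be written as a standalone helper that tolerates blank lines. The helper name <code>geometry_columns</code>, the sample text, and the columns <code>cell_id</code> and <code>site_name</code> are hypothetical and only illustrate the assumed layout.</p>

```python
def geometry_columns(tdtypes_text, type_name="GEOMETRY()"):
    """Return the column names whose type token equals type_name.

    ASSUMPTION: each non-empty line is '<column name> <type>'.
    """
    columns = []
    for line in tdtypes_text.splitlines():
        # Split once so types that contain spaces stay in one piece
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[1].strip() == type_name:
            columns.append(parts[0])
    return columns


# Hypothetical tdtypes-style text, including a trailing blank line
sample = "\n".join([
    "cell_id      INTEGER()",
    "site_name    VARCHAR(length=50, charset='LATIN')",
    "cell_geom    GEOMETRY()",
    "",
])

print(geometry_columns(sample))  # ['cell_geom']
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, such a helper would be called as <code>geometry_columns(str(res1.tdtypes))</code>.</p>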

<p style = 'font-size:16px;font-family:Arial'>We first convert the str (WKT) values to geometries in the pandas DataFrame and then convert the DataFrame to a GeoPandas GeoDataFrame.</p>

In [None]:
# Parse the WKT strings in each geometry column into shapely geometries
for col in geo_cols:
    df[col] = gpd.GeoSeries.from_wkt(df[col])

In [None]:
df.dtypes

In [None]:
type(df)

In [None]:
# A GeoDataFrame has one active geometry column; use the last geometry column found
gdf = gpd.GeoDataFrame(df, geometry=geo_cols[-1])

In [None]:
type(gdf)
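
<p style = 'font-size:16px;font-family:Arial'>A GeoDataFrame built from WKT strings carries no coordinate reference system (CRS) unless one is declared, and a file written from it will then lack projection information. The sketch below repeats the conversion on two hypothetical rows and declares a CRS with <code>set_crs</code>. It assumes the coordinates are longitude/latitude in WGS 84 (EPSG:4326), which the source table does not state; check the actual CRS of the data before applying this.</p>

```python
import geopandas as gpd
import pandas as pd

# Hypothetical rows: WKT strings as they would arrive in a pandas DataFrame
df_demo = pd.DataFrame({
    "cell_id": [1, 2],
    "cell_geom": ["POINT (-122.42 37.77)", "POINT (-73.99 40.73)"],
})

# Parse the WKT strings and mark the column as the active geometry
df_demo["cell_geom"] = gpd.GeoSeries.from_wkt(df_demo["cell_geom"])
gdf_demo = gpd.GeoDataFrame(df_demo, geometry="cell_geom")
crs_before = gdf_demo.crs  # no CRS has been declared at this point

# ASSUMPTION: coordinates are lon/lat in WGS 84; set_crs only declares
# the CRS and does not change any coordinate values
gdf_demo = gdf_demo.set_crs("EPSG:4326")

print(crs_before)
print(gdf_demo.crs)
```

<p style = 'font-size:16px;font-family:Arial'><code>set_crs</code> labels the existing coordinates, whereas <code>to_crs</code> reprojects them; only the former is appropriate when the data simply lacks a declared CRS.</p>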

<p style = 'font-size:18px;font-family:Arial'><b>3.2 Shapefile and TAB files</b></p>
<p style = 'font-size:16px;font-family:Arial'>Once we have converted our DataFrame to a GeoPandas GeoDataFrame, we can store it in shapefile (.shp) or MapInfo TAB (.tab) format as needed.</p>

In [None]:
shp_output_path = "output_shapefile.shp"
tab_output_path = "output_tabfile.tab"

In [None]:
# Save to Shapefile
gdf.to_file(shp_output_path, driver="ESRI Shapefile")
print(f"Data saved to Shapefile: {shp_output_path}")

In [None]:
# Save to MapInfo TAB
gdf.to_file(tab_output_path, driver="MapInfo File")
print(f"Data saved to MapInfo TAB: {tab_output_path}")
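
<p style = 'font-size:16px;font-family:Arial'>Shapefile attributes are stored in a dBASE (.dbf) file, whose field names are limited to 10 characters, so longer column names are shortened on export. The sketch below flags which columns would be affected and which would collide after shortening. The helper name <code>find_truncation_issues</code> and the <code>signal_strength_*</code> column names are hypothetical; the exact replacement names are chosen by the driver and are not reproduced here.</p>

```python
from collections import defaultdict

DBF_FIELD_NAME_LIMIT = 10  # dBASE field names hold at most 10 characters


def find_truncation_issues(columns, geometry_column):
    """Return (renamed, collisions) for attribute columns written to a .dbf."""
    renamed = {}
    by_short = defaultdict(list)
    for name in columns:
        if name == geometry_column:
            continue  # geometry is stored in the .shp file, not in the .dbf
        short = name[:DBF_FIELD_NAME_LIMIT]
        by_short[short].append(name)
        if short != name:
            renamed[name] = short
    # Two or more columns sharing a shortened name need manual renaming
    collisions = {s: names for s, names in by_short.items() if len(names) > 1}
    return renamed, collisions


# Hypothetical column names, not taken from the Cell_Towers table
cols = ["cell_id", "cell_geom", "signal_strength_dbm", "signal_strength_pct"]
renamed, collisions = find_truncation_issues(cols, geometry_column="cell_geom")
print(renamed)
print(collisions)
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, the check could be run as <code>find_truncation_issues(gdf.columns, gdf.geometry.name)</code> before calling <code>to_file</code>, and any colliding columns renamed explicitly.</p>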

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>4. Validate the shapefile created</b></p>
<p style = 'font-size:16px;font-family:Arial'>We can read back the shapefile we created and validate it.</p>

In [None]:
# Load the shapefile
shapefile_path = "output_shapefile.shp"
test_gdf = gpd.read_file(shapefile_path)

# Display the first few rows
print(test_gdf.head())

# Check the geometry type
print("Geometry Type:", test_gdf.geom_type.unique())


In [None]:
from shapely.validation import explain_validity

# Check for invalid geometries
invalid_geometries = test_gdf[~test_gdf.is_valid]

if not invalid_geometries.empty:
    print("Invalid geometries found:")
    for idx, row in invalid_geometries.iterrows():
        print(f"Index: {idx}, Issue: {explain_validity(row.geometry)}")
else:
    print("All geometries are valid.")
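
<p style = 'font-size:16px;font-family:Arial'>A shapefile is a set of files rather than a single file: the .shp, .shx, and .dbf components are all mandatory. As a complement to the geometry checks above, the sketch below reports which mandatory components are missing next to a given .shp path. The helper name <code>missing_components</code> is hypothetical, and the demonstration runs in a temporary directory with empty placeholder files.</p>

```python
import tempfile
from pathlib import Path

REQUIRED_SUFFIXES = (".shp", ".shx", ".dbf")  # mandatory shapefile components


def missing_components(shp_path):
    """Return the mandatory suffixes that have no file next to shp_path."""
    base = Path(shp_path)
    return [s for s in REQUIRED_SUFFIXES if not base.with_suffix(s).exists()]


# Demonstration with empty placeholder files; the .shx file is left out on purpose
with tempfile.TemporaryDirectory() as tmp:
    shp = Path(tmp) / "output_shapefile.shp"
    shp.touch()
    shp.with_suffix(".dbf").touch()
    demo_missing = missing_components(shp)

print(demo_missing)  # ['.shx']
```

<p style = 'font-size:16px;font-family:Arial'>In the notebook, <code>missing_components(shapefile_path)</code> should return an empty list after a successful export.</p>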

<p style = 'font-size:20px;font-family:Arial'><b>Conclusion</b></p>
<p style = 'font-size:16px;font-family:Arial'>In this functional demo, we have seen how to extract geospatial data from Teradata and create .shp or .tab files from it.</p>

<hr style="height:2px;border:none;">
<p style = 'font-size:20px;font-family:Arial'><b>5. Cleanup</b></p>

<p style = 'font-size:16px;font-family:Arial'>We will use the following code to clean up tables and databases created for this demonstration.</p>

In [None]:
%run -i ~/JupyterLabRoot/UseCases/run_procedure.py "call remove_data('DEMO_TelcoNetwork');" 
#Takes 10 seconds

In [None]:
remove_context()
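
<p style = 'font-size:16px;font-family:Arial'>The export steps also leave files on local disk, including the companion files that sit next to the .shp and .tab outputs. As an optional sketch, the helper below deletes every file whose name starts with one of the output stems used in this notebook. The helper name <code>remove_outputs</code> is hypothetical, and the demonstration runs in a temporary directory so that no real files are touched.</p>

```python
import tempfile
from pathlib import Path


def remove_outputs(directory, stems):
    """Delete files named '<stem>.<anything>' in directory; return their names."""
    removed = []
    for stem in stems:
        for path in sorted(Path(directory).glob(f"{stem}.*")):
            path.unlink()
            removed.append(path.name)
    return removed


# Demonstration with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for name in ("output_shapefile.shp", "output_shapefile.dbf",
                 "output_tabfile.tab", "keep.txt"):
        (Path(tmp) / name).touch()
    removed = remove_outputs(tmp, ["output_shapefile", "output_tabfile"])
    survivors = sorted(p.name for p in Path(tmp).iterdir())

print(removed)
print(survivors)
```

<p style = 'font-size:16px;font-family:Arial'>Because the pattern matches any extension, it would also remove unrelated files that share a stem; calling <code>remove_outputs(".", ["output_shapefile", "output_tabfile"])</code> is only safe when those names are used exclusively by this demo.</p>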

<footer style="padding-bottom:35px; border-bottom:3px solid #91A0Ab">
    <div style="float:left;margin-top:14px">ClearScape Analytics™</div>
    <div style="float:right;">
        <div style="float:left; margin-top:14px">
            Copyright © Teradata Corporation - 2024. All Rights Reserved
        </div>
    </div>
</footer>