# Reduce Precision Node
1 -> 1

Reduces the precision of every coordinate to a given number of decimal places. For example, EIFFEL Arnhem: `(5.9812899999999445, 51.9775966)` $\rightarrow$ `(5.98129, 51.9776)` with 5 decimals (values are rounded and trailing zeros are trimmed).

*Note:* don't forget to adjust `decimals` in the code below!
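To make the effect on a WKT string concrete, here is a minimal pure-Python sketch. It is not how this node does the work (the node relies on shapely's WKT writer); `round_wkt` is a hypothetical helper that rounds every decimal number in a WKT string with a regular expression and trims trailing zeros.

```python
import re


def round_wkt(wkt, decimals):
    """Round every decimal number in a WKT string and trim trailing zeros.

    Hypothetical helper for illustration only; the node itself uses shapely.
    """
    def _round(match):
        # Format with a fixed number of decimals, then drop trailing zeros.
        text = f"{float(match.group(0)):.{decimals}f}"
        if "." in text:
            text = text.rstrip("0").rstrip(".")
        return text

    # Match signed decimal numbers such as 6.2875123 or -0.5
    return re.sub(r"-?\d+\.\d+", _round, wkt)


print(round_wkt("POINT (6.2875123 53.3430987)", 3))  # POINT (6.288 53.343)
```

Fewer decimals give shorter WKT strings, at the cost of positional accuracy.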

In [18]:
###
### USER DEFINED
###

# Number of decimal places. Expressed in the units of the projection.
#   * RD New: metres
#   * WGS84: degrees. In the Netherlands 1 m ~= 1e-5 degrees
decimals = 5
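As a rough check on the rule of thumb in the comment above, the sketch below estimates how many metres one rounding step represents in WGS84. It assumes a spherical approximation in which one degree of latitude is about 111.32 km; `approx_resolution_m` and `METERS_PER_DEGREE` are hypothetical names introduced here for illustration.

```python
import math

# Rough length of one degree of latitude in metres (spherical approximation, an assumption).
METERS_PER_DEGREE = 111_320


def approx_resolution_m(decimals, lat_deg):
    """Return the approximate (north-south, east-west) size in metres of 10**-decimals degrees."""
    step_deg = 10 ** -decimals
    north_south = step_deg * METERS_PER_DEGREE
    # A degree of longitude shrinks with the cosine of the latitude.
    east_west = north_south * math.cos(math.radians(lat_deg))
    return north_south, east_west


ns, ew = approx_resolution_m(5, 52.0)
print(f"{ns:.2f} m north-south, {ew:.2f} m east-west")
```

At a latitude of about 52 degrees, 5 decimals therefore corresponds to roughly one metre, which matches the comment.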

In [8]:
###
### HEADER
###
import geopandas as gpd
import pandas as pd
from shapely import wkt as WKT
import numpy as np
import re

# geopandas (geometry) to pandas (wkt)
# ADAPTED TO SUPPORT ROUNDING PRECISION
def gdfToDf(gdf, **kwargs):
    # rounding_precision=-1 keeps the full precision
    precision = kwargs.get('rounding_precision', -1)
    df = pd.DataFrame(gdf, copy=True)
    # serialise each geometry to WKT, trimming trailing zeros
    df['wkt'] = gdf.geometry.apply(
        lambda geom: WKT.dumps(geom, trim=True, rounding_precision=precision))
    df.drop(columns='geometry', inplace=True)
    return df

# pandas (wkt) to geopandas (geometry)
def dfToGdf(df):
    gdf = gpd.GeoDataFrame(df, copy=True)
    gdf['geometry'] = df.wkt.apply(WKT.loads)
    gdf.drop(columns='wkt', inplace=True)
    return gdf


# compute some extra info (which I find interesting)
def extractInfo(wkt, *args):
    # calculate specs; 'precision' is the mean number of digits after the decimal point
    info = {'points': len(wkt.split(',')),
            'chars': len(wkt),
            'precision': np.mean([len(frac) for frac in re.findall(r'\.([0-9]*)', wkt)])}
    # return the whole dictionary, or the value of one spec if its name is given
    if not args:
        return info
    else:
        return info.get(args[0])
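The statistics computed by `extractInfo` can be tried on a single WKT string without geopandas or numpy. The sketch below is a hypothetical stand-alone variant (`wkt_stats`) that uses only the standard library. Unlike the original it requires at least one digit after the decimal point and returns `0.0` when no decimals are found.

```python
import re
from statistics import mean


def wkt_stats(wkt):
    """Stand-alone variant of extractInfo (hypothetical helper, standard library only)."""
    # Digits after each decimal point, e.g. '166' for 5.166
    fractions = re.findall(r"\.([0-9]+)", wkt)
    return {
        # Number of comma-separated coordinate pairs
        "points": len(wkt.split(",")),
        "chars": len(wkt),
        "precision": mean(len(f) for f in fractions) if fractions else 0.0,
    }


stats = wkt_stats("LINESTRING (5.166 53.001, 5.1663 53.0012)")
print(stats["points"], stats["precision"])  # 2 3.5
```

Because `points` simply counts comma-separated pairs, a closed polygon ring counts its repeated closing point as well.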

In [20]:
###
### REDUCE PRECISION
###

# input
gdf = dfToGdf(input_table)
print('Shape input_table:', input_table.shape)
print('Average precision: %.1f decimals' % np.mean(input_table.wkt.apply(extractInfo, args=('precision',))))

# output and set the rounding precision
output_table = gdfToDf(gdf, rounding_precision=decimals)
print('Shape output_table:', output_table.shape)
print('Average precision: %.1f decimals' % np.mean(output_table.wkt.apply(extractInfo, args=('precision',))))

Shape input_table: (12, 3)
Average precision: 14.4 decimals
Shape output_table: (12, 3)
Average precision: 3.4 decimals


In [21]:
###
### FOOTER
###
print('Preview output_table (first 5 rows):')
try:
    # try pretty print in Jupyter Notebook
    display(output_table.head())
except NameError:
    print(output_table.head())

Preview output_table (first 5 rows):


|   | id | provincien | wkt |
|---|----|------------|-----|
| 0 | 1 | Noord-Holland | MULTIPOLYGON (((5.166 53.001, 5.1663 53.001, 5... |
| 1 | 2 | Groningen | MULTIPOLYGON (((6.2875 53.343, 6.2847 53.343, ... |
| 2 | 3 | Overijssel | MULTIPOLYGON (((6.1101 52.442, 6.1096 52.442, ... |
| 3 | 4 | Zeeland | MULTIPOLYGON (((3.8392 51.759, 3.8419 51.758, ... |
| 4 | 5 | Friesland | MULTIPOLYGON (((6.192 53.412, 6.1917 53.412, 6... |


## Testing and showing results
Don't add this cell to KNIME.

In [None]:
%matplotlib inline
gdf.plot()

## Load some input data to test the cells above
Run `Header` node first!

In [11]:
###
### SOURCE - ONLY FOR PREPARATION
###
from os.path import join
folder = '/home/ab/i/Open-data/shapefiles/shp-provincie'
filename = 'provincie-grenzen.shp'

# read the file
gdf = gpd.read_file( join(folder, filename) )
gdf = gdf.to_crs(epsg=4326)    # WGS84

# output
output_table = gdfToDf(gdf)

# copy output to input
input_table = output_table.copy()