In this chapter, we will cover the following topics:

<ul>
    <li>Clipping LineStrings to an area of interest</li>
    <li>Splitting polygons with lines</li>
    <li>Finding the location of a point on a line using linear referencing</li>
    <li>Snapping a point to the nearest line</li>
    <li>Calculating 3D ground distance and total elevation gain</li>
    </ul>

## Introduction

Vector data analysis is used in many application areas, from measuring the distance between point A and point B all the way through to complex routing algorithms. The first GIS were built on vector data and vector analysis and only later expanded into the raster domain. In this chapter, we will start with simple vector operations and then work our way into a more complex model, chaining the various vector methods together to deliver new data that answers our spatial questions.

This process of data analysis is broken down into a few steps: we start with an input dataset, perform a spatial operation on the data such as a buffer analysis, and finally produce some output in the form of a new dataset. The following diagram shows the flow of analysis in the simplest model form:

<img src="./B03543_05_01.jpg" height=400 width=400>

Converting simple questions into spatial operations and models takes experience and is not a trivial task. For example, you may come across a question such as, "Identify and locate how many residential land parcels were affected by the flood." This would translate into the following:

<ul>
    <li>Firstly, an input dataset in the form of a flood polygon that defines the affected flood areas</li>
    <li>Secondly, an input dataset of cadastral polygons that represent the land parcels</li>
    <li>Our spatial operation is an intersection function</li>
    <li>All of this results in a new polygon dataset</li>
</ul>

This would result in a spatial model that could look like this:

<img src="./B03543_05_02.jpg" height=400 width=400>

To tackle more complex questions, spatial modeling simply starts chaining more inputs along with more operations that output new data feeding into other new operations. This then leads us to a final set or sets of data.
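
The flood example can be read as a small pipeline: two inputs, a chain of operations, and one output. The following is only a minimal sketch of that shape in plain Python. It assumes axis-aligned rectangles `(min_x, min_y, max_x, max_y)` in place of real polygons, and the `rect_intersection` helper and the sample parcels are hypothetical stand-ins for a true polygon intersection.

```python
def rect_intersection(a, b):
    """Return the overlapping rectangle of a and b, or None if they do not overlap.

    Rectangles are (min_x, min_y, max_x, max_y) tuples, a simplified
    stand-in for real polygon geometry.
    """
    min_x, min_y = max(a[0], b[0]), max(a[1], b[1])
    max_x, max_y = min(a[2], b[2]), min(a[3], b[3])
    if min_x >= max_x or min_y >= max_y:
        return None
    return (min_x, min_y, max_x, max_y)


# input 1: flood extent, input 2: land parcels (made-up sample data)
flood_area = (0, 0, 10, 10)
parcels = [
    {"id": 1, "use": "residential", "bounds": (2, 2, 4, 4)},
    {"id": 2, "use": "residential", "bounds": (9, 9, 12, 12)},
    {"id": 3, "use": "commercial", "bounds": (1, 6, 3, 8)},
    {"id": 4, "use": "residential", "bounds": (20, 20, 22, 22)},
]

# operation 1: keep residential parcels only
residential = [p for p in parcels if p["use"] == "residential"]

# operation 2: intersect each remaining parcel with the flood extent
affected = []
for parcel in residential:
    overlap = rect_intersection(parcel["bounds"], flood_area)
    if overlap is not None:
        affected.append({"id": parcel["id"], "flooded_bounds": overlap})

# output: a new dataset describing only the affected parcels
print(len(affected), affected)
```

Each step produces a dataset that feeds the next one, which is the chaining idea described above; a real model would swap the rectangle helper for a proper polygon intersection.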

## 5.1. Clipping LineStrings to an Area of Interest

A project involving spatial data is typically geographically limited to within a specified boundary area, the so-called project area. The input data can come from multiple sources and usually extends outside the project area. Removing this excess data is sometimes critical to speed up spatial processes, and at the same time, it reduces data volume. Reductions in data volumes can also result in secondary speed-ups, for example, less time to transfer or copy the data.

In this recipe, we will take a boundary polygon represented by a circle Shapefile, and then remove all excess LineStrings that are outside this circle.

This process of clipping will remove all lines outside the clip area—that is, our project area of interest.

<pre>
<strong>Note </strong>

A standard function called clip performs an intersection spatial operation, but it differs slightly from a normal intersection function: a clip does not, or at least should not, retain the attributes attached to the clip area. Clipping involves two input datasets; the first defines the boundary that we want to clip our data to, and the second defines the data that will be clipped. Both datasets contain attributes, but the attributes of the clipping boundary are usually not included in the result of a clip operation.

The new clipped data will only have the attributes from the original input dataset, excluding all the attributes from the clip polygon.

An intersection function will find geometries that overlap and output only lines within a circle. To demonstrate this concept better, the following graphical representation shows what we are going to achieve.</pre>

To demonstrate a simple clip operation, we will take a single LineString and polygon defining a clip boundary and perform a quick intersect. The result will look like what's represented in the following screenshot and can be viewed as a live web map in your browser. Refer to the HTML file located at /code/html/ch05-01-clipping.html to check out the result.

<img src="./B03543_05_03.jpg" height=400 width=400>

When running the simple intersection function, the line is cut into two new LineStrings as in the preceding screenshot.
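
To make the idea of clipping concrete without any library, here is a minimal sketch of the classic Liang-Barsky algorithm, which clips a single segment to an axis-aligned rectangle. This is a deliberately simplified, swapped-in technique and not what Shapely does internally; the recipe itself relies on Shapely's intersection, which handles arbitrary polygons such as our circle. The function name `clip_segment_to_rect` is hypothetical.

```python
def clip_segment_to_rect(p0, p1, rect):
    """Clip the segment p0-p1 to rect = (min_x, min_y, max_x, max_y).

    Returns the clipped segment as ((x0, y0), (x1, y1)), or None when the
    segment lies completely outside the rectangle (Liang-Barsky algorithm).
    """
    (x0, y0), (x1, y1) = p0, p1
    min_x, min_y, max_x, max_y = rect
    dx, dy = x1 - x0, y1 - y0
    t_enter, t_exit = 0.0, 1.0

    # each pair (p, q) describes one rectangle edge: left, right, bottom, top
    for p, q in ((-dx, x0 - min_x), (dx, max_x - x0),
                 (-dy, y0 - min_y), (dy, max_y - y0)):
        if p == 0:
            # segment is parallel to this edge; reject if it lies outside
            if q < 0:
                return None
            continue
        t = q / float(p)
        if p < 0:
            t_enter = max(t_enter, t)   # entering the rectangle
        else:
            t_exit = min(t_exit, t)     # leaving the rectangle
        if t_enter > t_exit:
            return None

    return ((x0 + t_enter * dx, y0 + t_enter * dy),
            (x0 + t_exit * dx, y0 + t_exit * dy))


area_of_interest = (0.0, 0.0, 10.0, 10.0)
# a segment crossing the area is shortened to the part inside it
print(clip_segment_to_rect((-5.0, 5.0), (15.0, 5.0), area_of_interest))
# a segment that misses the area entirely is dropped
print(clip_segment_to_rect((-5.0, 20.0), (15.0, 20.0), area_of_interest))
```

A multi-vertex LineString could be clipped by running each of its segments through such a function, which is also why one input line can come back as several separate pieces.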

Our second result will use two Shapefiles that represent our inputs. This is real data from OpenStreetMap, converted to the Shapefile format for our input and output. The circle defines our polygon area of interest, while the road LineStrings are what we want to clip. Our result will be in the form of a new Shapefile that only shows the roads that are inside the circle.

### Getting ready

This recipe is in two parts. The first part is a simple clip demonstration using two GeoJSON files consisting of a single LineString and polygon. The second part uses data from OSM and can be found in your /ch05/geodata folder containing the circle polygon that represents our area of interest named clip_area_3857.shp. The roads_london_3857.shp file represents our input Shapefile of lines that we will clip to the circle polygon.

To visualize the first part, we use the Leaflet JavaScript library in a very basic HTML page. Our second resulting Shapefile can then be opened with QGIS to see the resulting clipped set of roads.

### How to do it...

We have two sets of code examples ahead of us. The first is a simple self-made set of GeoJSON inputs that are clipped and outputted as a GeoJSON representation. This is then visualized using a web page with the help of Leaflet JS.

The second code example takes in two Shapefiles and returns a clipped Shapefile that you can view using QGIS. Both examples use the same method and demonstrate how a clipping function works.

1. Now, let's take a look at our first code example:

In [None]:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import os
import json
from shapely.geometry import asShape

# define output GeoJSON file
res_line_intersect = os.path.realpath("../geodata/ch05-01-geojson.js")

# input GeoJSON features
simple_line = {"type":"FeatureCollection","features":[{"type":"Feature","properties":{"name":"line to clip"},"geometry":{"type":"LineString","coordinates":[[5.767822265625,50.14874640066278],[11.901806640625,50.13466432216696],[4.493408203125,48.821332549646634]]}}]}
clip_boundary = {"type":"FeatureCollection","features":[{"type":"Feature","properties":{"name":"Clipping boundary circle"},"geometry":{"type":"Polygon","coordinates":[[[6.943359374999999,50.45750402042058],[7.734374999999999,51.12421275782688],[8.96484375,51.316880504045876],[10.1513671875,51.34433866059924],[10.8544921875,51.04139389812637],[11.25,50.56928286558243],[11.25,49.89463439573421],[10.810546875,49.296471602658094],[9.6240234375,49.03786794532644],[8.1298828125,49.06666839558117],[7.5146484375,49.38237278700955],[6.8994140625,49.95121990866206],[6.943359374999999,50.45750402042058]]]}}]}

# create shapely geometry from FeatureCollection
# access only the geometry part of GeoJSON
shape_line = asShape(simple_line['features'][0]['geometry'])
shape_circle = asShape(clip_boundary['features'][0]['geometry'])

# run the intersection
shape_intersect = shape_line.intersection(shape_circle)

# define output GeoJSON dictionary
out_geojson = dict(type='FeatureCollection', features=[])

# generate GeoJSON features
# the intersection is a multi-part geometry, so loop over its parts
for (index_num, line) in enumerate(shape_intersect.geoms):
    feature = dict(type='Feature', properties=dict(id=index_num))
    feature['geometry'] = line.__geo_interface__
    out_geojson['features'].append(feature)

# write out GeoJSON to JavaScript file
# this file is read in our HTML and
# displayed as GeoJSON on the leaflet map
# called /html/ch05-01-clipping.html
with open(res_line_intersect, 'w') as js_file:
    js_file.write('var big_circle = {0}'.format(json.dumps(clip_boundary)))
    js_file.write("\n")
    js_file.write('var big_linestring = {0}'.format(json.dumps(simple_line)))
    js_file.write("\n")
    js_file.write('var simple_intersect = {0}'.format(json.dumps(out_geojson)))

This ends our first code demonstration using a simple self-made GeoJSON LineString that's clipped to a simple polygon. This quick recipe is found in the /code/ch05-01-1_clipping_simple.py file. After you run this file, you can go ahead and open the /code/html/ch05-01-clipping.html file in your local web browser to see the results.

The script works by first defining an output JavaScript file that is used to visualize our clipped results. This is followed by our input clipping area and the LineString to be clipped, both as GeoJSON. We then convert the GeoJSON to Shapely geometry objects with the asShape() function so that we can run the intersection. The resulting intersection geometry is converted from a Shapely geometry back into GeoJSON and written to our output JavaScript file, which is used inside the .html file for visualization with Leaflet.
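
The last step, wrapping geometries in a FeatureCollection and writing them out as a JavaScript variable, can be sketched on its own with nothing but the json module. The helper names `to_feature_collection` and `as_js_variable` and the two sample line parts are hypothetical; the geometries are plain GeoJSON-like dictionaries of the kind `__geo_interface__` returns.

```python
import json


def to_feature_collection(geometries):
    """Wrap GeoJSON-like geometry dicts in a FeatureCollection with numeric ids."""
    features = []
    for index_num, geometry in enumerate(geometries):
        features.append({
            "type": "Feature",
            "properties": {"id": index_num},
            "geometry": geometry,
        })
    return {"type": "FeatureCollection", "features": features}


def as_js_variable(name, geojson_obj):
    """Serialize a GeoJSON dict as a JavaScript variable assignment."""
    return "var {0} = {1}".format(name, json.dumps(geojson_obj))


# two made-up line parts, as a clip might return them
clipped_parts = [
    {"type": "LineString", "coordinates": [[7.0, 50.1], [11.0, 50.1]]},
    {"type": "LineString", "coordinates": [[10.5, 49.9], [7.5, 49.4]]},
]

collection = to_feature_collection(clipped_parts)
js_line = as_js_variable("simple_intersect", collection)
print(js_line[:60])
```

Writing `var name = {...}` into a .js file lets the HTML page load the data with a plain script tag, which is why the recipe uses this format instead of a bare .geojson file.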

2. To begin our second code example located in the /code/ch05-01-2_clipping.py file, we will input two Shapefiles, create a new set of roads that are clipped to our circle polygon, and export them as Shapefiles:

In [None]:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import shapefile
import os
# used to import dictionary data to shapely
from shapely.geometry import asShape
from shapely.geometry import mapping

# open roads Shapefile that we want to clip with pyshp
roads_london = shapefile.Reader(r"../geodata/roads_london_3857.shp")

# open circle polygon with pyshp
clip_area = shapefile.Reader(r"../geodata/clip_area_3857.shp")

# access the geometry of the clip area circle
clip_feature = clip_area.shape()

# convert pyshp object to shapely
clip_shply = asShape(clip_feature)

# create a list of all roads features and attributes
roads_features = roads_london.shapeRecords()

# variables to hold new geometry
roads_clip_list = []
roads_shply = []

# run through each geometry, convert to shapely geom and intersect
for feature in roads_features:
    roads_london_shply = asShape(feature.shape.__geo_interface__)
    roads_shply.append(roads_london_shply)
    roads_intersect = roads_london_shply.intersection(clip_shply)

    # only export linestrings, shapely also created points
    if roads_intersect.geom_type == "LineString":
        roads_clip_list.append(roads_intersect)

# open writer to write our new shapefile to
pyshp_writer = shapefile.Writer()

# create new field
pyshp_writer.field("name")

# convert our shapely geometry back to pyshp, record for record
for feature in roads_clip_list:
    geojson = mapping(feature)

    # create empty pyshp shape
    record = shapefile._Shape()

    # shapeType 3 is linestring
    record.shapeType = 3
    record.points = geojson["coordinates"]
    record.parts = [0]

    pyshp_writer._shapes.append(record)
    # add a list of attributes to go along with the shape
    pyshp_writer.record(["empty record"])

# save to disk
pyshp_writer.save(r"../geodata/roads_clipped2.shp")

### How it works...

For this recipe, we'll use Shapely for our spatial operation and pyshp to read in and out of our Shapefiles.

We'll begin with the import of the road LineStrings and our circle polygon for the demo project area. We'll use the pyshp module to handle the Shapefile input/output. Pyshp allows us to access the Shapefile bounds, feature geometry, feature attributes, and more.

Our first task is to convert the pyshp geometry object into something that Shapely can understand. We'll use the shape() function to get the pyshp geometry, followed by the Shapely asShape() function. Next, we want all the road records, so we call shapeRecords() to return them for us.

Now, we'll get ourselves ready to perform the actual clipping by setting up two list variables to store our new data. The for loop runs over each record, that is, each line in the road dataset, and converts it to a Shapely geometry object using the __geo_interface__ property that is built into pyshp. This is followed by the actual Shapely intersection function, which returns only the geometry that intersects our circle. Finally, we check whether the intersection geometry is a LineString. If it is, we append it to our output list.

<pre>
<strong>Note</strong>

During an intersection operation, Shapely will return points and LineStrings in a geometry collection. The reason for this is that, if two LineStrings touch at the ends, for example, or overlap each other, it will generate a point intersection location plus any overlapping segments.</pre>
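
Because the recipe keeps only results whose type is exactly LineString, a road that comes back as a multi-part result would be skipped. The following sketch shows one possible way to pull every line part out of a mixed result. It works on plain GeoJSON-like dictionaries, such as the ones the Shapely mapping function returns, and the `extract_linestrings` helper and the sample data are hypothetical.

```python
def extract_linestrings(geometry):
    """Return all LineString parts found in a GeoJSON-like geometry dict.

    Points and other geometry types are skipped; MultiLineString and
    GeometryCollection results are unpacked into individual LineStrings.
    """
    geom_type = geometry["type"]
    if geom_type == "LineString":
        return [geometry]
    if geom_type == "MultiLineString":
        return [{"type": "LineString", "coordinates": part}
                for part in geometry["coordinates"]]
    if geom_type == "GeometryCollection":
        lines = []
        for member in geometry["geometries"]:
            lines.extend(extract_linestrings(member))
        return lines
    return []


# made-up intersection result containing a point and several line parts
mixed_result = {
    "type": "GeometryCollection",
    "geometries": [
        {"type": "Point", "coordinates": [11.25, 50.0]},
        {"type": "LineString", "coordinates": [[7.0, 50.1], [11.0, 50.1]]},
        {"type": "MultiLineString", "coordinates": [
            [[8.0, 49.5], [9.0, 49.6]],
            [[9.5, 49.7], [10.5, 49.9]],
        ]},
    ],
}

lines = extract_linestrings(mixed_result)
print(len(lines))
```

Each returned dictionary has the same `coordinates` layout that the recipe copies into a pyshp shape, so the parts could be written out one record at a time.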

At last, we can write out our new dataset to a new Shapefile. Using the pyshp Writer class, we create a new writer object and give it a single field called name. Looping through each feature, we create a GeoJSON-like dictionary using the Shapely mapping function and an empty pyshp shape that we fill in right away. We set its shape type to LineString, copy the point coordinates over from the GeoJSON, and append the shape together with one attribute record to the writer.

Exiting the loop, we save our new Shapefile, roads_clipped2.shp, to disk.

## 5.2. Splitting Polygons with Lines

Typically, in GIS, we work with data that influences other data in some form due to their inherent spatial relationship. This means that we need to work with one dataset to edit, update, and even delete another dataset. A typical example of this is an administrative boundary, a polygon that you cannot see on the physical surface but that affects the features it crosses, such as a lake. If we have a lake polygon and an administrative boundary, we might want to know how many square meters of the lake belong to each administrative area.

Another example could be a forest polygon that contains one species of trees that crosses a river. We might want to know the area on either side of the river. In the first scenario, we need to transform our administrative boundaries into LineStrings and then perform the cut.

To see what this looks like, take a look at this spoiler on how the results will look up front since we all like a good visual.

<img src="./B03543_05_04.jpg" height=400 width=400>

### Getting ready

For this recipe, we will once again use our GeoJSON LineString and polygon from the previous recipe. These homemade geometries will cut up our polygon into three new polygons. Be sure to fire up your virtual environment with the workon pygeoan_cb command.

### How to do it...

In [None]:
    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from shapely.geometry import asShape
    from shapely.ops import polygonize
    import json
    import os

    # define output GeoJSON file
    output_result = os.path.realpath("../geodata/ch05-02-geojson.js")

    # input GeoJSON features
    line_geojs = {"type":"FeatureCollection","features":[{"type":"Feature","properties":{"name":"line to clip"},"geometry":{"type":"LineString","coordinates":[[5.767822265625,50.14874640066278],[11.901806640625,50.13466432216696],[4.493408203125,48.821332549646634]]}}]}
    poly_geojs = {"type":"FeatureCollection","features":[{"type":"Feature","properties":{"name":"Clipping boundary circle"},"geometry":{"type":"Polygon","coordinates":[[[6.943359374999999,50.45750402042058],[7.734374999999999,51.12421275782688],[8.96484375,51.316880504045876],[10.1513671875,51.34433866059924],[10.8544921875,51.04139389812637],[11.25,50.56928286558243],[11.25,49.89463439573421],[10.810546875,49.296471602658094],[9.6240234375,49.03786794532644],[8.1298828125,49.06666839558117],[7.5146484375,49.38237278700955],[6.8994140625,49.95121990866206],[6.943359374999999,50.45750402042058]]]}}]}

    # create shapely geometry from FeatureCollection
    # access only the geometry part of GeoJSON
    cutting_line = asShape(line_geojs['features'][0]['geometry'])
    poly_to_split = asShape(poly_geojs['features'][0]['geometry'])

    # convert circle polygon to linestring of circle boundary
    bndry_as_line = poly_to_split.boundary

    # combine new boundary lines with the input set of lines
    result_union_lines = bndry_as_line.union(cutting_line)

    # re-create polygons from unioned lines
    new_polygons = polygonize(result_union_lines)

    # stores the final split up polygons
    new_cut_ply = []

    # identify which new polygon we want to keep
    for poly in new_polygons:
        # check if new poly is inside original otherwise ignore it
        if poly.centroid.within(poly_to_split):
            print ("creating polygon")
            # add only polygons that overlap original for export
            new_cut_ply.append(poly)
        else:
            print ("This polygon is outside of the input features")

    # define output GeoJSON dictionary
    out_geojson = dict(type='FeatureCollection', features=[])

    # generate GeoJSON features
    for (index_num, geom) in enumerate(new_cut_ply):
        feature = dict(type='Feature', properties=dict(id=index_num))
        feature['geometry'] = geom.__geo_interface__
        out_geojson['features'].append(feature)

    # write out GeoJSON to JavaScript file
    # this file is read in our HTML and
    # displayed as GeoJSON on the leaflet map
    # called /html/ch05-02.html
    with open(output_result, 'w') as js_file:
        js_file.write('var cut_poly_result = {0}'.format(json.dumps(out_geojson)))

### How it works...

Now the actual splitting of the polygons takes place in our /ch05/code/ch05-02_split_poly_with_line.py script.

The basic methodology to split a polygon with a LineString follows a simple algorithm. First, we take our input polygon and convert its boundary into a new LineString. Next, we combine (union) the LineString we want to cut with and the newly generated boundary LineString. Finally, we use the polygonize method to rebuild polygons from the new unioned set of LineStrings.

This rebuilding of polygons results in extra polygons that are created outside the original polygon. To identify these polygons, we'll use a simple trick. We can simply generate a centroid point inside each newly created polygon and then check to see if this point is inside the original polygon using the within predicate. If the point is not inside the original polygon, the predicate returns False and we do not need to include this polygon in our output.
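
The centroid trick can be sketched without Shapely to show what the within predicate is deciding. This is a minimal illustration, assuming simple polygons given as closed rings of `(x, y)` tuples; the helper names `polygon_centroid` and `point_in_polygon` and the sample pieces are hypothetical.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon ring (first point == last point)."""
    area2 = 0.0
    cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring[:-1], ring[1:]):
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def point_in_polygon(point, ring):
    """Ray-casting test: True if point lies inside the polygon ring."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(ring[:-1], ring[1:]):
        # does a horizontal ray from the point cross this edge?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / float(y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# original polygon and three rebuilt pieces; the last one lies outside it
original = [(0, 0), (4, 0), (4, 4), (0, 4), (0, 0)]
pieces = [
    [(0, 0), (2, 0), (2, 4), (0, 4), (0, 0)],
    [(2, 0), (4, 0), (4, 4), (2, 4), (2, 0)],
    [(4, 0), (6, 0), (6, 4), (4, 4), (4, 0)],
]

# keep only the pieces whose centroid falls inside the original polygon
kept = [ring for ring in pieces
        if point_in_polygon(polygon_centroid(ring), original)]
print(len(kept))
```

One caveat worth keeping in mind: the centroid of a strongly concave piece can fall outside the piece itself, in which case a point guaranteed to lie inside the geometry, such as the one Shapely's representative_point() returns, may be a safer choice.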

## 5.3. Finding the location of a point on a line using linear referencing

The use of linear referencing is widespread, ranging from storing bus routes to oil and gas pipelines. Linear referencing locates a position along a line from a distance value measured from the start of the line, and finding that position is done by interpolation. To determine the position, we'll use simple mathematics to calculate the coordinate along the line that lies at the given distance from the starting coordinate.

For our calculation, we'll measure the length of the line and find a coordinate located at a specified length from the start of the line. However, the question of where the start of the line is will soon arise. The starting point of the line is the first coordinate in the LineString's array of vertices, because a LineString is nothing more than a collection of points chained together.
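
The mathematics behind this can be sketched in a few lines of plain Python: walk the segments from the first vertex, subtract each segment length from the requested distance, and interpolate linearly inside the segment where the distance runs out. The function name `interpolate_along` is hypothetical, and clamping out-of-range distances to the end vertices is a simplifying assumption of this sketch rather than a description of Shapely's behavior.

```python
import math


def interpolate_along(coords, distance):
    """Return the (x, y) point located `distance` units from the first vertex.

    Distances past the end of the line return the last vertex and
    negative distances return the first one (a simplifying assumption).
    """
    if distance <= 0:
        return tuple(coords[0])
    remaining = distance
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len > 0 and remaining <= seg_len:
            # fraction of this segment that still has to be travelled
            ratio = remaining / seg_len
            return (x0 + ratio * (x1 - x0), y0 + ratio * (y1 - y0))
        remaining -= seg_len
    return tuple(coords[-1])


line = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
# 8 units from the start: 5 along the first segment, 3 along the second
print(interpolate_along(line, 8.0))
# the same distance measured from the other end of the line
print(interpolate_along(line[::-1], 8.0))
```

Reversing the vertex order changes which end counts as the start, so the same distance value can land on a completely different location.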

This will lead nicely to our next recipe, which is a little more complex.

### How to do it...

In [None]:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from shapely.geometry import asShape
import json
import os
from pyproj import Proj, transform

# define the pyproj CRS
# our output CRS (geographic, for the web map)
wgs84 = Proj("+init=EPSG:4326")
# our input CRS (projected, units in meters)
pseudo_mercator = Proj("+init=EPSG:3857")


def transform_point(in_point, in_crs, out_crs):
    """
    export a Shapely geom to GeoJSON and
    transform to a new coordinate system with pyproj
    :param in_point: shapely geometry as point
    :param in_crs: pyproj crs definition
    :param out_crs:  pyproj output crs definition
    :return: GeoJSON transformed to out_crs
    """
    geojs_geom = in_point.__geo_interface__

    x1 = geojs_geom['coordinates'][0]
    y1 = geojs_geom['coordinates'][1]

    # transform the coordinate
    x, y = transform(in_crs, out_crs, x1, y1)

    # create output new point
    new_point = dict(type='Feature', properties=dict(id=1))
    new_point['geometry'] = geojs_geom
    new_coord = (x, y)
    # add newly transformed coordinate
    new_point['geometry']['coordinates'] = new_coord

    return new_point


def transform_linestring(orig_geojs, in_crs, out_crs):
    """
    transform a GeoJSON linestring to
      a new coordinate system
    :param orig_geojs: input GeoJSON
    :param in_crs: original input crs
    :param out_crs: destination crs
    :return: a new GeoJSON
    """
    wgs84_coords = []
    # transform each coordinate; the input GeoJSON is left unchanged
    for x, y in orig_geojs['geometry']['coordinates']:
        x1, y1 = transform(in_crs, out_crs, x, y)
        wgs84_coords.append([x1, y1])

    # create new GeoJSON
    new_wgs_geojs = dict(type='Feature', properties={})
    new_wgs_geojs['geometry'] = dict(type='LineString')
    new_wgs_geojs['geometry']['coordinates'] = wgs84_coords

    return new_wgs_geojs


# define output GeoJSON file
output_result = os.path.realpath("../geodata/ch05-03-geojson.js")

line_geojs = {"type": "Feature", "properties": {}, "geometry": {"type": "LineString", "coordinates": [[-13643703.800790818,5694252.85913249],[-13717083.34794459,6325316.964654908]]}}

# create shapely geometry from the GeoJSON Feature geometry
shply_line = asShape(line_geojs['geometry'])

# get the coordinates of each vertex in our line
line_original = list(shply_line.coords)
print(line_original)

# showing how to reverse a linestring
line_reversed = list(shply_line.coords)[::-1]
print(line_reversed)

# the same slice reverses any sequence, for example a string
hello = 'hello world'
reverse_hello = hello[::-1]
print(reverse_hello)

# locating the point on a line based on distance from line start
# distance in meters: 360 km from the line start
point_on_line = shply_line.interpolate(360000)

# transform input linestring and new point
# to wgs84 for visualization on web map
wgs_line = transform_linestring(line_geojs, pseudo_mercator, wgs84)
wgs_point = transform_point(point_on_line, pseudo_mercator, wgs84)

# write to disk the results
with open(output_result, 'w') as js_file:
    js_file.write('var point_on_line = {0}'.format(json.dumps(wgs_point)))
    js_file.write('\n')
    js_file.write('var in_linestring = {0}'.format(json.dumps(wgs_line)))

After executing the /code/ch05-03_point_on_line.py file, you should see the following screenshot when you open the /code/html/ch05-03.html file in your web browser:

<img src="./B03543_05_05.jpg" height=400 width=400>

If you would like to reverse the LineString starting and ending points, you can use the list(shply_line.coords)[::-1] code to reverse the coordinate order as shown in the preceding code.

### How it works...

It all boils down to executing one single line of code to locate a point on a line at a specified distance. The Shapely interpolate function does this for us. All you need is the Shapely LineString geometry and a distance value. The distance is measured along the line from its start, that is, from the first coordinate of the LineString, in the units of the line's coordinate system (EPSG:3857 here).

Be careful in case the LineString direction is not the direction in which you want to measure. If it is not, you need to switch the LineString direction. Take a look at the line_reversed variable that holds the original LineString coordinates in reversed order. To do the reverse, we use Python's slice notation, [::-1], which works on lists as well as strings.

You can see this in action with our print statement reversing the LineString order on screen as follows:

<code>
[(-13643703.800790818, 5694252.85913249), (-13717083.34794459, 6325316.964654908)]
[(-13717083.34794459, 6325316.964654908), (-13643703.800790818, 5694252.85913249)]
</code>

### See also

If you are interested in more information regarding linear referencing, ESRI has a great reference of use cases and examples at http://resources.arcgis.com/en/help/main/10.1/0039/003900000001000000.htm, and Wikipedia has an overview at http://en.wikipedia.org/wiki/Linear_referencin.

## 5.4. Snapping a point to the nearest line

Building on our newly gained wisdom from the last recipe, we will now attack another common spatial problem. This super common spatial task is for all the GPS junkies who want their GPS coordinates to snap to an existing road. Imagine that you have some GPS tracks and you want to have these coordinates snap to your base road dataset. To accomplish such a task, we need to snap a point (GPS coordinates) to a line (roads).

The GEOS library, which Shapely is built on, can handle this problem with ease. We will combine the Shapely interpolate and project methods to snap our point to the true nearest point on the line using linear referencing.

As you can see in the following diagram, our input point is located on the sun icon. The green line is what we want to snap our point to at the nearest location. The gray icon with a point on it is our result, which represents the nearest point on the line to our original input point.

<img src="./B03543_05_06.jpg" height=400 width=450>

### How to do it...

1. Shapely is well suited for snapping a point to the nearest line, so let's get started:

In [None]:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from shapely.geometry import asShape
import json
import os
from pyproj import Proj, transform

# define the pyproj CRS
# geographic CRS of our input and output GeoJSON
wgs84 = Proj("+init=EPSG:4326")
# projected CRS used for the planar calculations
pseudo_mercator = Proj("+init=EPSG:3857")


def transform_point(in_point, in_crs, out_crs):
    """
    export a Shapely geom to GeoJSON Feature and
    transform to a new coordinate system with pyproj
    :param in_point: shapely geometry as point
    :param in_crs: pyproj crs definition
    :param out_crs: pyproj output crs definition
    :return: GeoJSON transformed to out_crs
    """
    geojs_geom = in_point.__geo_interface__

    x1 = geojs_geom['coordinates'][0]
    y1 = geojs_geom['coordinates'][1]

    # transform the coordinate
    x, y = transform(in_crs, out_crs, x1, y1)

    # create output new point
    out_pt = dict(type='Feature', properties=dict(id=1))
    out_pt['geometry'] = geojs_geom
    new_coord = (x, y)
    # add newly transformed coordinate
    out_pt['geometry']['coordinates'] = new_coord

    return out_pt

def transform_geom(orig_geojs, in_crs, out_crs):
    """
    transform a GeoJSON linestring or Point to
      a new coordinate system
    :param orig_geojs: input GeoJSON
    :param in_crs: original input crs
    :param out_crs: destination crs
    :return: a new GeoJSON
    """

    wgs84_coords = []
    geom_type = orig_geojs['geometry']['type']

    if geom_type == "LineString":
        # transform each vertex; the input GeoJSON is left unchanged
        for x, y in orig_geojs['geometry']['coordinates']:
            x1, y1 = transform(in_crs, out_crs, x, y)
            wgs84_coords.append([x1, y1])
        # create new GeoJSON
        new_wgs_geojs = dict(type='Feature', properties={})
        new_wgs_geojs['geometry'] = dict(type='LineString')
        new_wgs_geojs['geometry']['coordinates'] = wgs84_coords

        return new_wgs_geojs

    elif geom_type == "Point":
        x = orig_geojs['geometry']['coordinates'][0]
        y = orig_geojs['geometry']['coordinates'][1]
        x1, y1 = transform(in_crs, out_crs, x, y)

        # a GeoJSON Point stores a single [x, y] pair
        new_wgs_geojs = dict(type='Feature', properties={})
        new_wgs_geojs['geometry'] = dict(type='Point')
        new_wgs_geojs['geometry']['coordinates'] = [x1, y1]

        return new_wgs_geojs
    else:
        print("sorry this geometry type is not supported")

# define output GeoJSON file
output_result = os.path.realpath("../geodata/ch05-04-geojson.js")

line = {"type":"Feature","properties":{},"geometry":{"type":"LineString","coordinates":[[-49.21875,19.145168196205297],[-38.49609375,32.24997445586331],[-27.0703125,22.105998799750576]]}}
point = {"type":"Feature","properties":{},"geometry":{"type":"Point","coordinates":[-33.57421875,32.54681317351514]}}

new_line = transform_geom(line, wgs84, pseudo_mercator)
new_point = transform_geom(point, wgs84, pseudo_mercator)


shply_line = asShape(new_line['geometry'])
shply_point = asShape(new_point['geometry'])

# perform interpolation and project point to line
pt_interpolate = shply_line.interpolate(shply_line.project(shply_point))

# print coordinates and distance to console
print("origin point coordinate")
print(point)

print("interpolated point location")
print(pt_interpolate)

print("distance from origin to interpolated point")
print(shply_point.distance(pt_interpolate))

# convert new point to wgs84 GeoJSON
snapped_pt = transform_point(pt_interpolate, pseudo_mercator, wgs84)

# the original line and point in WGS84 coordinates,
# defined again here so that the map always plots
# the untransformed inputs
line_orig = {"type":"Feature","properties":{},"geometry":{"type":"LineString","coordinates":[[-49.21875,19.145168196205297],[-38.49609375,32.24997445586331],[-27.0703125,22.105998799750576]]}}
point_orig = {"type":"Feature","properties":{},"geometry":{"type":"Point","coordinates":[-33.57421875,32.54681317351514]}}

# write to disk the results
with open(output_result, 'w') as js_file:
    js_file.write('var input_pt = {0}'.format(json.dumps(snapped_pt)))
    js_file.write('\n')
    js_file.write('var orig_pt = {0}'.format(json.dumps(point_orig)))
    js_file.write('\n')
    js_file.write('var line = {0}'.format(json.dumps(line_orig)))

### How it works...

We'll use a tried and tested methodology called linear referencing to do the work. Let's kick it off with the imports needed to do this, including asShape from shapely.geometry, json, and pyproj. Pyproj is used to quickly transform our coordinates to and from EPSG:4326 and EPSG:3857. Shapely treats all coordinates as planar, so lengths and distances computed directly on lat/lon values would come out in degrees rather than meters.

Extending our functions from the last recipe, we have the transform_point() function alongside the transform_geom() function. The transform_point() function converts a Shapely geometry to GeoJSON and transforms the point coordinate, while the transform_geom() function takes GeoJSON in and transforms it to the new coordinate system. Both functions use pyproj to execute the transformations.
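
For this particular pair of coordinate systems, what pyproj does can be approximated by hand. The sketch below assumes the standard spherical Mercator formulas with a sphere radius of 6378137 m; it is meant only to show the mathematics, the function names are hypothetical, and the recipe itself keeps using pyproj.

```python
import math

EARTH_RADIUS = 6378137.0  # sphere radius in meters used by EPSG:3857


def lonlat_to_web_mercator(lon, lat):
    """Project a WGS84 longitude/latitude in degrees to EPSG:3857 meters."""
    x = EARTH_RADIUS * math.radians(lon)
    y = EARTH_RADIUS * math.log(math.tan(math.pi / 4.0 + math.radians(lat) / 2.0))
    return x, y


def web_mercator_to_lonlat(x, y):
    """Inverse projection: EPSG:3857 meters back to degrees."""
    lon = math.degrees(x / EARTH_RADIUS)
    lat = math.degrees(2.0 * math.atan(math.exp(y / EARTH_RADIUS)) - math.pi / 2.0)
    return lon, lat


# the input point of this recipe, projected and transformed back again
lon, lat = -33.57421875, 32.54681317351514
x, y = lonlat_to_web_mercator(lon, lat)
print(x, y)
print(web_mercator_to_lonlat(x, y))
```

The round trip returns the original longitude and latitude up to floating-point error, which mirrors what the recipe does: project to meters, calculate, and transform the result back for the web map.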

Next, we'll define our output GeoJSON file and the input line and point features. Then, we'll execute our two new transform functions followed closely with the conversion into a Shapely geometry object. This new Shapely geometry is then run through the interpolate function.

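Interpolate alone does not answer our question. We need to combine it with the Shapely project function, which takes in the original point and returns the distance along the line to the location nearest to that point. Passing this distance to interpolate gives us the snapped point on the line.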
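
To see why the two calls belong together, here is a minimal pure-Python sketch of the same idea on planar coordinates. `project_point` returns a distance along the line and `interpolate` turns a distance back into a coordinate; both are hypothetical helpers written for this illustration, not Shapely's implementation.

```python
import math


def project_point(coords, point):
    """Distance along the line (from its first vertex) to the spot nearest `point`."""
    px, py = point
    best_dist_sq = None
    best_measure = 0.0
    travelled = 0.0
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        dx, dy = x1 - x0, y1 - y0
        seg_len_sq = dx * dx + dy * dy
        if seg_len_sq == 0:
            continue
        # fraction along the segment of the perpendicular foot, clamped to 0..1
        t = ((px - x0) * dx + (py - y0) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
        nx, ny = x0 + t * dx, y0 + t * dy
        dist_sq = (px - nx) ** 2 + (py - ny) ** 2
        seg_len = math.sqrt(seg_len_sq)
        if best_dist_sq is None or dist_sq < best_dist_sq:
            best_dist_sq = dist_sq
            best_measure = travelled + t * seg_len
        travelled += seg_len
    return best_measure


def interpolate(coords, measure):
    """Point located `measure` units along the line from its first vertex."""
    remaining = measure
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len > 0 and remaining <= seg_len:
            ratio = max(0.0, remaining / seg_len)
            return (x0 + ratio * (x1 - x0), y0 + ratio * (y1 - y0))
        remaining -= seg_len
    return tuple(coords[-1])


road = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
gps_point = (4.0, 3.0)

measure = project_point(road, gps_point)   # how far along the road
snapped = interpolate(road, measure)       # where that is on the road
print(measure, snapped)
```

The intermediate distance value is useful in its own right, for example to order several GPS fixes along the same road.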

We then print out our results to screen and create a new JavaScript file called /geodata/ch05-04-geojson.js, used in our /code/html/ch05-04.html for viewing. Go ahead and open the HTML file in your browser to see the results.

Take a look at your console to see the output of the print statements, which show the point coordinates and the distance from the original point, after running the script as follows:

<code>>>> python python-geospatial-analysis-cookbook/ch05/code/ch05-04_snap_point2line.py </code>



## 5.5. Calculating 3D ground distance and total elevation gain

We've finished finding points on lines and returning points on a line, so now it is time to calculate the true 3D ground distance that we actually ran or biked along a real 3D road. It is also possible to calculate the elevation profile, and we will see this in Chapter 7, Raster Analysis.

Calculating the ground distance sounds easy, but 3D calculations are more involved than 2D ones. Our 3D LineString has a z-coordinate for each vertex that makes up the LineString. Therefore, we need to calculate the 3D distance between each pair of consecutive coordinates, that is, from vertex to vertex in our input LineString, and sum the results.

The mathematics to calculate the distance between two 3D Cartesian coordinates is relatively simple and uses the 3D form of the Pythagorean theorem:

$$d_{3D} = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2}$$

Here it is in Python:

In [None]:
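import math
x1, y1, z1 = 0.0, 0.0, 0.0    # example start vertex (illustrative values)
x2, y2, z2 = 3.0, 4.0, 12.0   # example end vertex (illustrative values)
dist_3d = math.sqrt((x2 - x1)**2 + (y2 - y1)**2 + (z2 - z1)**2)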


### Getting ready

First up, we need some 3D data to play with, and what better to analyze than Stage 16 (Carcassonne to Bagnères-de-Luchon) of the 2014 Tour de France, a real killer of a mountain stage. The stats from www.letour.com include the 237.5 km length, Michael Rogers' winning time of 6:07:10, the average speed of 38.811 km/h, and the highest point of 1753 m. You will find the data in your folder at /ch05/geodata/velowire_stage_16_27563_utf8.geojson.

The original KML was generously provided by Thomas Vergouwen (www.velowire.com) and it is free for us to use with his permission; thanks, Thomas. The original data is located at /ch05/geodata/velowire_stage_16-Carcassonne-Bagneres-de-Luchon.kml. The conversion to GeoJSON and the transformation to EPSG:27563 was done using the QGIS save as function.

Now, the LA Times web page (http://www.latimes.com/la-sp-g-tour-de-france-stage-elevation-profile-20140722-htmlstory.html) quotes a 3895 m elevation gain, whereas team Strava (http://blog.strava.com/tour-de-france-2014/) states a 4715 m elevation gain. So, who is correct, and is the 237.5 km ground distance measured in 3D? Let's find out!

This is the official profile of Stage 16 for your visual pleasure:

<img src="./B03543_05_07.jpg" height=400 width=400>

To give you an idea of what accurate and simplified data look like, the following screenshot compares the velowire site's (www.velowire.com) KML, marked in purple (accurate), with the bikemap site's progression, highlighted by the yellow line (simplified). Summed over the whole route, both the length and the elevation differ significantly between the two. For a race that's 237.5 km long, every meter counts when you're planning and attacking on the course.

<img src="./B03543_05_08.jpg" height=400 width=400>
<em>Data source: http://www.mapcycle.com.au/LeTour2014/#</em>

### How to do it...

We'll start with looping through each vertex and calculating the 3D distance from one vertex to another in our LineString. Each vertex is nothing more than a point with x, y, and z (3D Cartesian) values.

Here is the code to calculate each vertex:

In [None]:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import math
import os
from shapely.geometry import shape, Point
import json

def pairs(lst):
    """
    yield iterator of two coordinates of linestring
    :param lst: list object
    :return: yield iterator of two coordinates
    """
    for i in range(1, len(lst)):
        yield lst[i - 1], lst[i]


def calc_3d_distance_2pts(x1, y1, z1, x2, y2, z2):
    """
    Calculate the 3D distance between two points (x1, y1, z1) and (x2, y2, z2)
    :param x1: x coordinate of the first point
    :param y1: y coordinate of the first point
    :param z1: z height value of the first point
    :param x2: x coordinate of the second point
    :param y2: y coordinate of the second point
    :param z2: z height value of the second point
    :return: 3D distance between the two input 3D coordinates
    """
    d = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2 + (z2 - z1) ** 2)
    return d


def readin_json(jsonfile):
    """
    input: geojson or json file
    """
    with open(jsonfile) as json_data:
        d = json.load(json_data)
        return d


geoj_27563_file = os.path.realpath("../geodata/velowire_stage_16_27563_utf8.geojson")
print (geoj_27563_file)
# create python dict type from geojson file object
json_load = readin_json(geoj_27563_file)

# set start lengths
length_3d = 0.0
length_2d = 0.0

# go through each geometry in our linestring
for f in json_load['features']:
    # create shapely shape from geojson
    s = shape(f['geometry'])

    # calculate 2D total length
    length_2d = s.length

    # set start elevation
    elevation_gain = 0

    # go through each coordinate pair
    for vert_start, vert_end in pairs(s.coords):
        line_start = Point(vert_start)
        line_end = Point(vert_end)

        # create input coordinates
        x1 = line_start.coords[0][0]
        y1 = line_start.coords[0][1]
        z1 = line_start.coords[0][2]
        x2 = line_end.coords[0][0]
        y2 = line_end.coords[0][1]
        z2 = line_end.coords[0][2]

        # calculate 3d distance
        distance = calc_3d_distance_2pts(x1, y1, z1, x2, y2, z2)

        # sum distances from vertex to vertex
        length_3d += distance

        # calculate total elevation gain (vertices assumed ordered start to finish)
        if z2 > z1:
            # climbing segment: add the height difference
            elevation_gain += z2 - z1


print("total elevation gain is: {gain} meters".format(gain=str(elevation_gain)))

# convert the lengths from meters to kilometers for reporting
distance_3d = str(length_3d / 1000)
distance_2d = str(length_2d / 1000)
dist_diff = str(length_3d - length_2d)

print("3D line distance is: {dist3d} km".format(dist3d=distance_3d))
print("2D line distance is: {dist2d} km".format(dist2d=distance_2d))
print("3D-2D length difference: {diff} meters".format(diff=dist_diff))

### How it works...

Our original KML file is stored in EPSG:4326, whose coordinates are in degrees, so we need a planar coordinate system with meter units to make our length calculations meaningful (refer to the upcoming table). This is why the data was transformed into EPSG:27563, NTF (Paris) / Lambert Sud France. For further information on this, refer to http://epsg.io/27563.

To begin with, we'll define three functions for our calculations, starting with the pairs() function. It takes a list and uses the Python yield statement to produce consecutive pairs of values: the first item holds the starting x, y, and z coordinates, and the second holds the ending x, y, and z coordinates of the segment that we want to measure.
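
As a quick illustration of what pairs() yields, the following sketch runs it over three made-up vertices (the coordinate values are assumptions chosen for readability, not data from the stage). It also shows that `zip()` over the list and its one-step offset produces the same pairs, which is a common alternative idiom.

```python
def pairs(lst):
    """Yield consecutive (previous, current) items from a sequence."""
    for i in range(1, len(lst)):
        yield lst[i - 1], lst[i]


# three made-up 3D vertices (x, y, z), for illustration only
vertices = [(0.0, 0.0, 100.0), (3.0, 4.0, 112.0), (6.0, 8.0, 105.0)]

segments = list(pairs(vertices))
for start, end in segments:
    print(start, "->", end)

# zip() over the list and its one-step offset gives the same pairs
same_as_zip = segments == list(zip(vertices, vertices[1:]))
print(same_as_zip)
```

Three vertices produce two segments, so a LineString with n vertices always yields n - 1 pairs.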

The calc_3d_distance_2pts() function takes the two coordinates, including the important z value, and calculates the distance between the two points in 3D space using the Pythagorean theorem.

Our readin_json() function inputs a path to a file, and in this case, we can point it to our GeoJSON file stored in the /ch05/geodata folder. This will return a Python dictionary object for us to work with within the next few steps.
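
To show the shape of the dictionary we get back, here is a small sketch that parses a tiny GeoJSON document from a string. The document content and the "sample route" name are made up for illustration; the real file holds the full stage geometry.

```python
import json

# a tiny, made-up GeoJSON document with one 3D LineString feature
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "sample route"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[0.0, 0.0, 100.0], [3.0, 4.0, 112.0]]
      }
    }
  ]
}
"""

# json.load(file_object) returns the same structure when reading from a file
data = json.loads(geojson_text)
geometry = data["features"][0]["geometry"]
print(geometry["type"])            # LineString
print(geometry["coordinates"][1])  # [3.0, 4.0, 112.0]
```

Each vertex is a list of x, y, and z values, which is what the distance loop reads one pair at a time.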

Now, let's define the variables to hold our GeoJSON file, load this file, and set the starting 3D/2D lengths to zero for initialization.

Next up, let's iterate through the GeoJSON LineString features and convert each one into a Shapely object, so that Shapely can give us the 2D length, stored in our length_2d variable, and let us read the coordinates. This is followed by our inner for loop, where all the action occurs.

Looping over the pairs produced by our pairs() function, we visit each segment of the LineString. We define the line_start and line_end variables to hold the start and end vertex of each line segment within the single LineString feature. We then define the input parameters for the 3D distance calculation by reading the x, y, and z values with standard Python indexing. Finally, we call the calc_3d_distance_2pts() function to give us the segment's distance in 3D.

We need to iteratively sum the distances together from one segment to the next. We can do this by adding the distance to our length_3d variable with the += operator. Now, our length_3d variable is updated for each segment in the loop, giving us our desired 3D length.
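
The running sum can be checked by hand on a small example. This sketch uses three made-up vertices in meters and the standard library's `math.dist` (available from Python 3.8) in place of the recipe's own distance function, so the 2D and 3D totals can be compared side by side.

```python
import math

# made-up 3D vertices (x, y, z) in meters, for illustration only
vertices = [(0.0, 0.0, 0.0), (3.0, 4.0, 12.0), (6.0, 8.0, 12.0)]

length_2d = 0.0
length_3d = 0.0
for start, end in zip(vertices, vertices[1:]):
    length_2d += math.dist(start[:2], end[:2])  # ignore z
    length_3d += math.dist(start, end)          # include z

print(length_2d)  # 10.0
print(length_3d)  # 18.0
```

The first segment climbs 12 m over a 5 m horizontal run, so its 3D length is 13 m, while the flat second segment measures 5 m in both cases.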

The remaining part of the loop accumulates our elevation gain. For every segment, the z1 and z2 elevation values are compared, and the height difference is added to elevation_gain only when the segment climbs, that is, when the end vertex is higher than the start vertex. Flat and descending segments leave elevation_gain unchanged.
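
The accumulation rule is easy to verify on a short list of elevations. The following sketch assumes the vertices are ordered from start to finish and uses made-up elevation values; it also tracks the total descent, which the recipe does not report, as a sanity check.

```python
# made-up elevations (meters) at consecutive vertices, for illustration only
elevations = [100.0, 130.0, 120.0, 150.0, 90.0]

gain = 0.0
loss = 0.0
for z_start, z_end in zip(elevations, elevations[1:]):
    delta = z_end - z_start
    if delta > 0:
        gain += delta   # climbing segment
    else:
        loss += -delta  # descending (or flat) segment

print(gain)  # 60.0
print(loss)  # 70.0

# the net change must equal the last elevation minus the first
net_matches = (gain - loss) == (elevations[-1] - elevations[0])
print(net_matches)
```

The check that gain minus loss equals the net elevation change is a cheap way to catch a comparison written in the wrong direction.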

At last, we'll print out our results to the screen; they should look like this:

<code>
total elevation gain is: 4322.0 meters
3D line distance is: 244.119162551 km
2D line distance is: 243.55802081 km
3D-2D length difference: 561.141741137 meters
</code>

With our data transformed and converted to GeoJSON, the 2D length according to our script is 243.558 km from the velowire KML as compared to 237.5 km from the official race page, which is a difference of 6.058 km. The original KML in EPSG:4326 was 302.805 km long, a difference of over 65 km, hence the necessary transformation. For a better comparison, take a look at this table:

<table>
    <thead>
        <tr><th>Source + EPSG</th>
            <th>2D length</th>
            <th>3D length</th>
            <th>Difference (3D - 2D)</th>
        </tr>
    </thead>
    <tbody>
        <tr><td>Velowire EPSG:4326</td>
            <td>302.805 km</td>
            <td>Not calculated</td>
            <td></td>
        </tr>
        <tr><td>Velowire EPSG:27563</td>
            <td>243.558 km</td>
            <td>244.119 km</td>
            <td>561.14 m</td>
        </tr>
        <tr><td>Mapcycle EPSG:4326</td>
            <td>293.473 km</td>
            <td>Not available</td>
            <td></td>
        </tr>
        <tr><td>Mapcycle EPSG:27563</td>
            <td>236.216 km</td>
            <td>Not available</td>
            <td></td>
        </tr>
        <tr><td>Letour official</td>
            <td>237.500 km (approximate)</td>
            <td>237.500 km (approximate)</td>
            <td></td>
        </tr>
    </tbody>
</table>
    
The elevation gain is also very different between different sources.

<table style="border:1px solid black">
    <thead>
        <tr><th>Source</th>
            <th>Elevation gain</th>
        </tr>
    </thead>
    <tbody>
        <tr><td>Strava (http://blog.strava.com/)</td>
            <td>4715 m</td>
        </tr>
        <tr><td>Los Angeles Times</td>
            <td>3895 m</td>
        </tr>
        <tr><td>TrainingPeaks (www.trainingpeaks.com)</td>
            <td>3243 m</td>
        </tr>
        <tr><td>The Velowire KML data analysis</td>
            <td>4322 m</td>
        </tr>
    </tbody>
</table>

### There's more...

The accuracy of all these calculations depends on the original KML data source. Each data source was derived by different people and, possibly, with different methods. The more you know about your data source, the more you know about its accuracy. In this case, I assume that the Velowire data source was digitized by hand using Google Earth. Thus, the result is only as accurate as the underlying Google Earth imagery and its coordinate system; KML stores coordinates as WGS 84 longitude and latitude, which is EPSG:4326.