### MicroPath Reconstruction From AIS Broadcast Points

Micropathing is the construction of a target's path from a small, consecutive sequence of target points. Typically, the sequence is ordered by time and limited to 2 or 3 target points. The following illustration shows 2 micropaths derived from 3 target points:

![](media/Micropath0.png)

Micropathing differs from path reconstruction in that the latter produces a single polyline for the whole path of a target. Path reconstruction loses insightful in-path behavior, because a large number of attributes cannot be associated with the individual parts of the path. One can argue that the points along the path can be enriched with these attributes. However, with the current implementations of Point objects, we are limited to the extra `M` and `Z` values on top of the necessary `X` and `Y`. You can also join the `PathID` and `M` to a lookup table to regain that insight, but that join is typically expensive, and it is difficult to derive the "expression" of the path from it using traditional mapping. A micropath works around these limitations and expresses the path insight better using today's traditional means.

So, a micropath is a line typically composed of only 2 points, and it is associated with a set of attributes that describe that line. These attributes are typically enrichment metrics derived from its two endpoints. An attribute can be, for example, the traveled distance, time, or speed.
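
To make the idea concrete, here is a minimal plain-Python sketch of a single micropath and its derived attributes. The `MicroPath` container and the `make_micropath` helper are hypothetical names used only for illustration; the notebook itself derives the same metrics in SparkSQL.

```python
import math
from typing import NamedTuple


class MicroPath(NamedTuple):
    """Hypothetical container for one two-point segment and its metrics."""
    x1: float
    y1: float
    x2: float
    y2: float
    dd: float   # traveled distance, in the units of x and y
    dt: float   # elapsed time in seconds
    mps: float  # speed, in distance units per second


def make_micropath(p1, p2):
    """Build a micropath from two (x, y, t) tuples, with p1 earlier than p2."""
    x1, y1, t1 = p1
    x2, y2, t2 = p2
    dd = math.hypot(x2 - x1, y2 - y1)  # straight-line displacement
    dt = t2 - t1
    return MicroPath(x1, y1, x2, y2, dd, dt, dd / dt)


# Made-up endpoints: 30 m east and 40 m north, 10 seconds apart
mp = make_micropath((0.0, 0.0, 0), (30.0, 40.0, 10))
print(mp.dd, mp.dt, mp.mps)  # 50.0 10 5.0
```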

In this notebook, we will construct "clean" micropaths using SparkSQL. What do I mean by clean? Emitted target points are notoriously affected by noise, so we will use SparkSQL to eliminate that noise during the micropath construction.

### Import required modules.

In [None]:
import os
import arcpy
from spark_esri import spark_start, spark_stop

### Start Spark instance.

Note the `config` argument to [configure the Spark instance](https://spark.apache.org/docs/latest/configuration.html).

In [None]:
config = {"spark.driver.memory":"2G"}
spark = spark_start(config=config)

### Define the output spatial reference and fields.

Here we emit the `x` and `y` values in Web Mercator meters (WKID 3857) for easier displacement calculations later on.

In [None]:
sp_ref = arcpy.SpatialReference(3857)
fields = ["MMSI","SHAPE@X","SHAPE@Y","BaseDateTime"]

### Read the selected `Broadcast` features and create a Spark dataframe that is mapped to a SparkSQL view.

Note that we are extracting the hour value from the timestamp, and converting the timestamp to an epoch in seconds since 1970.

In [None]:
with arcpy.da.SearchCursor("Broadcast", fields, spatial_reference=sp_ref) as rows:
    spark\
        .createDataFrame(rows, "mmsi string,x double,y double,t timestamp")\
        .selectExpr("mmsi","x","y","hour(t) h","unix_timestamp(t) t")\
        .createOrReplaceTempView("v0")
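
As a plain-Python illustration of the two derived columns, the sketch below computes the hour and the epoch seconds for one made-up timestamp. It assumes the timestamp is in UTC; Spark interprets timestamps relative to its own time zone settings, so the values produced there may differ.

```python
from datetime import datetime, timezone

# Made-up broadcast time, assumed to be UTC for this illustration
ts = datetime(2017, 1, 1, 13, 30, 0, tzinfo=timezone.utc)

h = ts.hour               # analogous to hour(t)
t = int(ts.timestamp())   # analogous to unix_timestamp(t)

print(h, t)  # 13 1483277400
```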

### Start the micropath construction by using a SparkSQL window function to find the record that follows the current record.

In [None]:
spark\
    .sql("""
select mmsi,h,
x x1,
y y1,
t t1,
lead(x,1,0.0) over (partition by mmsi order by t) x2,
lead(y,1,0.0) over (partition by mmsi order by t) y2,
lead(t,1,0) over (partition by mmsi order by t) t2
from v0
""")\
    .createOrReplaceTempView("v1")
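
For readers less familiar with window functions, the following plain-Python sketch mimics what `lead(...) over (partition by mmsi order by t)` does on a handful of made-up records. The `lead_pairs` helper is hypothetical and exists only to illustrate the pairing; it is not part of the notebook.

```python
from itertools import groupby
from operator import itemgetter

# Made-up sample records: (mmsi, x, y, t)
points = [
    ("A", 0.0, 0.0, 100),
    ("B", 5.0, 5.0, 100),
    ("A", 3.0, 4.0, 110),
    ("A", 6.0, 8.0, 125),
]


def lead_pairs(records):
    """Pair each record with its successor within the same mmsi, ordered by t."""
    out = []
    ordered = sorted(records, key=itemgetter(0, 3))  # partition key, then time
    for mmsi, group in groupby(ordered, key=itemgetter(0)):
        group = list(group)
        # The last record has no successor, so pad with the defaults (0.0, 0.0, 0),
        # matching the third argument of lead() in the SQL
        successors = group[1:] + [(mmsi, 0.0, 0.0, 0)]
        for (_, x1, y1, t1), (_, x2, y2, t2) in zip(group, successors):
            out.append((mmsi, x1, y1, t1, x2, y2, t2))
    return out


pairs = lead_pairs(points)
# Rows padded with the default have t2 = 0, so a t1 < t2 check drops them
segments = [p for p in pairs if p[3] < p[6]]
print(len(pairs), len(segments))  # 4 2
```

Vessel `B` has a single point, so it produces no segment at all, and the last point of vessel `A` is discarded for the same reason.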

### Enrich the micropath with horizontal and vertical displacements in meters, and add the endpoint time difference in seconds.

In [None]:
spark.sql("select *,(x2-x1) dx,(y2-y1) dy,(t2-t1) dt from v1 where t1 < t2").createOrReplaceTempView("v2")

### Calculate the travel distance in meters.

```
dd = sqrt(dx*dx+dy*dy)
```

In [None]:
spark.sql("select mmsi,h,x1,y1,x2,y2,sqrt(dx*dx+dy*dy) dd,dt from v2").createOrReplaceTempView("v3")

### Calculate the travel speed in meters per second.

In [None]:
spark.sql("select *,dd/dt mps from v3").cache().createOrReplaceTempView("v4")

### Noise Elimination.

Here we approximate the 99th percentile of the distance, time, and speed values, and use the resulting values as the basis for the noise reduction thresholds.

In [None]:
spark.sql("""
select
percentile_approx(mps,0.99) mps,
percentile_approx(dt,0.99) dt,
percentile_approx(dd,0.99) dd
from v4
""").show()

### Filter the data to create "clean" micropaths.

Here, we roughly doubled the 99th-percentile values of `mps` (12.3) and `dd` (778.6) to get "tolerant" thresholds.

In [None]:
spark.sql("""
select mmsi,h,dd,dt,mps,x1,y1,x2,y2
from v4
where dd between 1 and 1500
and mps < 25
and dt < 130
""")\
    .cache()\
    .createOrReplaceTempView("v5")
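
The thresholding idea can be sketched outside Spark as follows. `percentile_approx` is approximate, whereas the hypothetical `percentile_nearest_rank` helper below is an exact nearest-rank stand-in, and all sample values are made up. The `is_clean` predicate mirrors the `where` clause of the query, using the same hand-picked limits.

```python
import math


def percentile_nearest_rank(values, p):
    """Exact nearest-rank percentile (a stand-in for Spark's percentile_approx)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def is_clean(dd, dt, mps):
    """Mirror of the SQL predicate: dd between 1 and 1500, mps < 25, dt < 130."""
    return 1 <= dd <= 1500 and mps < 25 and dt < 130


# Made-up speeds: 1.0, 2.0, ..., 100.0
speeds = [float(v) for v in range(1, 101)]
p99 = percentile_nearest_rank(speeds, 0.99)
tolerant = 2 * p99  # a doubled, "tolerant" threshold
print(p99, tolerant)  # 99.0 198.0

# Made-up segments as (dd in meters, dt in seconds, mps)
samples = [
    (0.2, 60, 0.2 / 60),   # nearly stationary: dropped (dd < 1)
    (600.0, 60, 10.0),     # plausible movement: kept
    (1400.0, 40, 35.0),    # implausibly fast: dropped (mps >= 25)
    (900.0, 3600, 0.25),   # long reporting gap: dropped (dt >= 130)
]
kept = [s for s in samples if is_clean(*s)]
print(kept)  # [(600.0, 60, 10.0)]
```

Note that `between` is inclusive on both ends, which is why the predicate uses `<=` for `dd` and strict `<` for the other two limits.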

### Collect the micropaths as features where the shape is in WKT format.

The WKT format becomes relevant in the subsequent cell, during the insert process.

In [None]:
rows = spark.sql("""
select mmsi,h,dd,dt,mps,concat('LINESTRING(',x1,' ',y1,',',x2,' ',y2,')') wkt
from v5
""")\
    .collect()
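
For reference, the string that the `concat` expression builds has the following shape. This is a plain-Python sketch with a hypothetical `linestring_wkt` helper and made-up coordinates, not code used by the notebook.

```python
def linestring_wkt(x1, y1, x2, y2):
    """Format a two-point line as WKT: 'LINESTRING(x1 y1,x2 y2)'."""
    return f"LINESTRING({x1} {y1},{x2} {y2})"


wkt = linestring_wkt(0.0, 0.0, 30.0, 40.0)
print(wkt)  # LINESTRING(0.0 0.0,30.0 40.0)
```

Within a point, the coordinates are separated by a space; the two points are separated by a comma.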

### Create an in-memory line feature class in the TOC.

In [None]:
ws = "memory"
nm = "MicroPaths"

fc = os.path.join(ws,nm)

arcpy.management.Delete(fc)

sp_ref = arcpy.SpatialReference(3857)
arcpy.management.CreateFeatureclass(ws,nm,"POLYLINE",spatial_reference=sp_ref)
arcpy.management.AddField(fc, "MMSI", "TEXT")
arcpy.management.AddField(fc, "HH", "LONG")
arcpy.management.AddField(fc, "DD", "DOUBLE")
arcpy.management.AddField(fc, "DT", "DOUBLE")
arcpy.management.AddField(fc, "MPS", "DOUBLE")

# Note shape is expected to be in WKT
with arcpy.da.InsertCursor(fc, ["MMSI","HH","DD","DT","MPS","SHAPE@WKT"]) as cursor:
    for row in rows:
        cursor.insertRow(row)

### Stop the Spark instance.

In [None]:
spark_stop()