# Parse and Analyze GPS Data

Brett Deaton - Summer 2021

This notebook translates the raw data stream from a GPS receiver into times, positions, altitudes, and satellite counts that can be plotted. The raw stream, in NMEA-0183 format, is read from the file `nmea-data_stream.csv`.

Here's a reference on the NMEA-0183 standard from [navspark](https://navspark.mybigcommerce.com/content/NMEA_Format_v0.1.pdf). We'll use only the Global Positioning System Fix Data line denoted `GNGGA`.

The GPS receiver is the [u-blox SAM-M8Q](https://www.u-blox.com/en/product/sam-m8q-module). You can buy one already attached to a PCB by [sparkfun](https://www.digikey.com/en/products/detail/sparkfun-electronics/GPS-15210/10064422), optimized with a copper plane serving as an antenna.

### Setup

In [None]:
import csv # to quickly read in comma-separated values
import matplotlib.pyplot as plt # to visually inspect the data

In [None]:
# read fix data from csv and convert to 2D list of strings
points = []
with open('nmea-data_stream.csv') as f:
    reader = csv.reader(f)
    for row in reader:
        if len(row) > 6 and "GGA" in row[0] and row[6] != "0": # skip short rows; only use GGA rows with fixes
            points.append(row[1:])

In [None]:
# look at the first point, for a sanity check
for x in points[0]:
    print(x, end=" ")

### Extract and Convert Data

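Some fields we might be interested in, summarizing the NMEA-0183 format. The index refers to a row of `points`, which drops the leading sentence identifier, so index 0 is the first field after `$GNGGA`: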

| index | meaning             | string format
|-------|---------------------|---------------
| 0     | UTC time            | hhmmss.s
| 1     | latitude            | ddmm.m
| 3     | longitude           | dddmm.m
| 6     | number of sats      | n
| 8     | altitude (m)        | x.x
| 10    | geoid-ellipsoid (m) | x.x
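
The latitude and longitude formats above pack degrees and minutes into one string, so the conversion to decimal degrees can be sketched as a single helper. This is a minimal sketch, not the notebook's own method: the function name `nmea_to_degrees` is hypothetical, and the sample field values are made up for illustration.

```python
def nmea_to_degrees(value, hemisphere):
    """Convert an NMEA ddmm.m or dddmm.m string to signed decimal degrees."""
    dot = value.find(".")
    if dot == -1:                     # no fractional minutes present
        dot = len(value)
    degrees = int(value[:dot - 2])    # everything before the two minute digits
    minutes = float(value[dot - 2:])  # mm.m
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60)

# made-up field values, for illustration only
print(nmea_to_degrees("4623.52", "N"))
print(nmea_to_degrees("11658.38", "W"))
```

Locating the decimal point, rather than slicing at a fixed offset, lets the same helper handle both the two-digit latitude degrees and the three-digit longitude degrees.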

In [None]:
# make a list of timestamps, in hours since midnight UTC
times = []
for x in points:
    times.append(float(x[0][:2]) +     # hr
                 float(x[0][2:4])/60 + # min
                 float(x[0][4:])/3600) # sec

In [None]:
# make a list of altitudes, in meters above mean sea level
alts = []
for x in points:
    alts.append(float(x[8])) # altitude
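
The table above also lists the geoid-ellipsoid separation at index 10. In the GGA sentence the altitude is given relative to mean sea level, so adding the separation gives the height above the WGS-84 ellipsoid. A minimal sketch, assuming both fields are populated in every row; the function name `ellipsoid_heights` and the sample row are hypothetical.

```python
def ellipsoid_heights(rows):
    """Return height above the WGS-84 ellipsoid (m) for each parsed GGA row."""
    heights = []
    for row in rows:
        altitude_msl = float(row[8])       # altitude above mean sea level
        geoid_separation = float(row[10])  # geoid height above the ellipsoid
        heights.append(altitude_msl + geoid_separation)
    return heights

# made-up row with only the two relevant fields filled in
example_row = [""] * 11
example_row[8] = "545.4"
example_row[10] = "46.9"
print(ellipsoid_heights([example_row]))
```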

In [None]:
# make a list of latitudes, in degrees north
lats = []
for x in points:
    sign = 1 if x[2]=="N" else -1
    lats.append(sign*int(x[1][:2]) + # deg
                sign*float(x[1][2:])/60)  # min

In [None]:
# make a list of longitudes, in degrees east
longs = []
for x in points:
    sign = 1 if x[4]=="E" else -1
    longs.append(sign*int(x[3][:3]) + # deg
                 sign*float(x[3][3:])/60)  # min

In [None]:
# make a list of number of satellites used for fix
fix_num = []
for x in points:
    fix_num.append(int(x[6]))

### Visualize Data

In [None]:
# look for gaps in recorded times
plt.plot(times)
plt.xlabel("index")
plt.ylabel("time (hrs)")
plt.show()
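
Gaps that are hard to see in the plot can also be listed numerically by comparing consecutive timestamps. The sketch below is one way to do that; the function name `find_gaps` and the threshold are assumptions, since the receiver's output rate is not stated here. A rollover past midnight shows up as a negative step and is not flagged.

```python
def find_gaps(times, max_step_hrs):
    """Return (index, step) pairs where a timestamp follows the
    previous one by more than max_step_hrs."""
    gaps = []
    for i in range(1, len(times)):
        step = times[i] - times[i - 1]
        if step > max_step_hrs:
            gaps.append((i, step))
    return gaps

# made-up timestamps in hours, for illustration only
example_times = [1.0, 1.25, 1.5, 3.0, 3.25]
print(find_gaps(example_times, max_step_hrs=0.5))
```

With the notebook's data this could be called as `find_gaps(times, 2/3600)` to flag steps longer than two seconds, if fixes are expected about once per second.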

In [None]:
# observe jitter in altitudes
plt.plot(times, alts)
plt.xlabel("time (hrs)")
plt.ylabel("altitude (m)")
plt.show()
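
If the receiver was stationary during the recording, the jitter seen in the plot can be summarized with a few numbers. A minimal sketch using the standard library; the function name `summarize_jitter` is hypothetical, and the sample standard deviation is only one possible measure of spread.

```python
import statistics

def summarize_jitter(values):
    """Return (mean, sample standard deviation, peak-to-peak range)."""
    mean = statistics.mean(values)
    spread = statistics.stdev(values)  # needs at least two values
    return mean, spread, max(values) - min(values)

# made-up altitudes in meters, for illustration only
print(summarize_jitter([1.0, 2.0, 3.0, 4.0]))
```

With the notebook's data this could be called as `summarize_jitter(alts)`.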

In [None]:
# observe jitter in residual latitudes and longitudes
residue_lat = 46.392
residue_long = -116.973
lat_res = [lat - residue_lat for lat in lats]
long_res = [lon - residue_long for lon in longs]

plt.plot(times, lat_res,
         times, long_res)
plt.legend(["latitude - "+str(residue_lat),
            "longitude - "+str(residue_long)])
plt.ylabel("residual (deg)")
plt.xlabel("time (hrs)")
plt.show()
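
Residuals in degrees are hard to judge by eye, and a degree of longitude covers less ground than a degree of latitude away from the equator. The sketch below converts small offsets to meters with a local flat-Earth (equirectangular) approximation. It assumes a spherical Earth with mean radius 6371 km, which is adequate for jitter on the scale of meters but is not a proper map projection; the function name `residuals_to_meters` is hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation (assumption)

def residuals_to_meters(lat_res_deg, long_res_deg, ref_lat_deg):
    """Convert small lat/long offsets in degrees to (north, east) offsets in meters."""
    m_per_deg = math.pi / 180 * EARTH_RADIUS_M  # meters per degree of latitude
    north = lat_res_deg * m_per_deg
    # meridians converge toward the poles, so scale longitude by cos(latitude)
    east = long_res_deg * m_per_deg * math.cos(math.radians(ref_lat_deg))
    return north, east

# offsets of 0.001 degrees at the reference latitude used above
print(residuals_to_meters(0.001, 0.001, 46.392))
```

With the notebook's data, a single point could be converted with `residuals_to_meters(lats[0] - residue_lat, longs[0] - residue_long, residue_lat)`.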

In [None]:
# find number of satellites used for fixes
plt.plot(times, fix_num, "rd")
plt.xlabel("time (hrs)")
plt.ylabel("number of satellites used for fix")
plt.show()

### Todo

Tasks left to complete:
* project to (x,y) position in meters
* look for patterns in GPS quality indicator, i.e., why some fixes were unavailable
* find correlation between number of satellites used for fixes and jitter in position
* compute checksums of each row and compare to recorded checksum
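
As a starting point for the checksum task: an NMEA-0183 checksum is the XOR of every character between `$` and `*`, written as two hexadecimal digits after the `*`. The sketch below works on one raw sentence string. The function names are hypothetical, and it assumes the raw line is available; rows already split by `csv.reader` would first need to be re-joined with commas.

```python
def nmea_checksum(sentence):
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    payload = sentence.strip()
    if payload.startswith("$"):
        payload = payload[1:]
    payload = payload.split("*", 1)[0]  # drop the recorded checksum, if any
    checksum = 0
    for ch in payload:
        checksum ^= ord(ch)
    return format(checksum, "02X")

def checksum_ok(sentence):
    """True if the two hex digits after '*' match the computed checksum."""
    _, star, recorded = sentence.strip().partition("*")
    return star == "*" and recorded[:2].upper() == nmea_checksum(sentence)

# made-up sentence fragment, for illustration only
fragment = "$GNGGA,1,2"
complete = fragment + "*" + nmea_checksum(fragment)
print(complete, checksum_ok(complete))
```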