GNSS log files resemble several CSV files interleaved in a single file. The header lists the datasets the file contains and the columns of each dataset. In the data section, the first field of each row names the dataset that the row belongs to.
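
To make that layout concrete, here is a minimal sketch that splits a tiny, made-up log into its datasets. The contents of `sample_log` (dataset names, column names and values) are illustrative assumptions, not an excerpt from a real file, and the rule "a header line with more than one field declares a dataset" is a simplification that may not hold for real logs.

```python
from collections import defaultdict

# Made-up miniature log; names and values are illustrative only.
sample_log = """\
# Version: hypothetical
# Raw,utcTimeMillis,Svid
# Fix,Provider,Latitude
Raw,1000,5
Raw,1000,12
Fix,gps,37.4
"""

columns = {}               # dataset name -> column names declared in the header
rows = defaultdict(list)   # dataset name -> data rows, in file order

for line in sample_log.splitlines():
    fields = line.lstrip('#').strip().split(',')
    if line.startswith('#'):
        # simplification: a header line with several fields declares a dataset
        if len(fields) > 1:
            columns[fields[0]] = fields[1:]
    else:
        # the first field of a data row names the dataset it belongs to
        rows[fields[0]].append(fields[1:])

print(columns)
print(dict(rows))
```

The two `Raw` rows and the single `Fix` row end up in separate lists even though they are interleaved in the text.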

You can use the function below to load a GNSS log file into pandas dataframes without having to learn anything else about the GNSS file format. It returns a dictionary with one dataframe per dataset, keyed by dataset name.

In [None]:
import pandas as pd

In [None]:
def gnss_log_to_dataframes(path):
    """Parse a GNSS log file into a dict of dataframes, one per dataset."""
    print('Loading ' + path, flush=True)
    gnss_section_names = {'Raw', 'UncalAccel', 'UncalGyro', 'UncalMag', 'Fix', 'Status', 'OrientationDeg'}
    with open(path) as f_open:
        datalines = f_open.readlines()

    datas = {k: [] for k in gnss_section_names}
    gnss_map = {k: [] for k in gnss_section_names}
    for dataline in datalines:
        is_header = dataline.startswith('#')
        dataline = dataline.strip('#').strip().split(',')
        # header rows that name a known dataset define its columns;
        # other header rows (notes, version numbers, etc.) are skipped
        if is_header and dataline[0] in gnss_section_names:
            gnss_map[dataline[0]] = dataline[1:]
        # data rows are routed by their first field; blank lines and
        # unrecognized datasets are ignored instead of raising a KeyError
        elif not is_header and dataline[0] in datas:
            datas[dataline[0]].append(dataline[1:])

    results = dict()
    for k, v in datas.items():
        results[k] = pd.DataFrame(v, columns=gnss_map[k])
    # every field was read as a string, so convert the columns to numbers;
    # CodeType holds text and is left as-is
    for df in results.values():
        for col in df.columns:
            if col == 'CodeType':
                continue
            df[col] = pd.to_numeric(df[col])

    return results

In [None]:
dfs = gnss_log_to_dataframes('../input/google-smartphone-decimeter-challenge/train/2020-05-14-US-MTV-1/Pixel4/Pixel4_GnssLog.txt')

In [None]:
dfs['Raw'].head(3)

In [None]:
dfs['UncalAccel'].head(3)

In [None]:
dfs['UncalGyro'].head(3)

In [None]:
dfs['UncalMag'].head(3)

Not all of the possible fields will actually be populated.

In [None]:
dfs['Fix'].head(3)
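
One way to see which fields are unpopulated is to list the columns that hold no values at all. The sketch below assumes that unpopulated fields show up as NaN after loading; the helper `empty_columns` and the `example` dataframe are hypothetical and only stand in for a real dataset.

```python
import numpy as np
import pandas as pd

def empty_columns(df):
    """Return the names of columns in which every value is missing."""
    return [col for col in df.columns if df[col].isna().all()]

# Hypothetical stand-in for a loaded dataset; column names are made up.
example = pd.DataFrame({
    'Latitude': [37.4, 37.5],
    'Speed': [np.nan, np.nan],    # never populated
    'Bearing': [np.nan, 12.0],    # partially populated, so not reported
})

print(empty_columns(example))
```

With the dataframes loaded above, the same helper could be called as `empty_columns(dfs['Fix'])`.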

In [None]:
dfs['Status'].head(3)

Not all log files have every possible dataset in their header. A dataset that is missing from the file comes back as an empty dataframe.

In [None]:
dfs['OrientationDeg']
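
When working with many log files it can help to check up front which datasets a file actually provides. A minimal sketch, assuming a dictionary of dataframes shaped like the one returned by the loader; `missing_datasets` and `example_dfs` are hypothetical names introduced here for illustration.

```python
import pandas as pd

def missing_datasets(dfs):
    """Return the sorted names of datasets that contain no rows."""
    return sorted(name for name, df in dfs.items() if df.empty)

# Hypothetical stand-in for the loader's output.
example_dfs = {
    'Raw': pd.DataFrame({'Svid': [5, 12]}),
    'OrientationDeg': pd.DataFrame(),   # absent from this made-up file
}

print(missing_datasets(example_dfs))
```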

Expect to see a different number of rows in each dataset. Each instrument may have a different logging frequency, and some require multiple rows to log all of the data recorded at a single timestamp.

In [None]:
for field_name, df in dfs.items():
    print('Number of rows in {0}: {1}'.format(field_name, len(df)))
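
To see how many rows share a timestamp, the rows can be grouped by their time column. The sketch below uses a made-up `raw` dataframe and assumes the dataset has a `utcTimeMillis` column; check the column names of your own dataframe before reusing it.

```python
import pandas as pd

# Hypothetical Raw-like rows: several rows are logged per timestamp.
raw = pd.DataFrame({
    'utcTimeMillis': [1000, 1000, 1000, 2000, 2000],
    'Svid': [5, 12, 25, 5, 12],
})

# number of rows recorded at each timestamp
rows_per_epoch = raw.groupby('utcTimeMillis').size()
print(rows_per_epoch)
```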