**Update 7/11/22:**

When I originally released this notebook a couple of months ago, the resulting score was 3.135 which was good enough to tie for first place.  Since then the scores have improved significantly and this score is no longer low enough to be competitive.  I have made a couple of updates below to improve the resulting score to 2.67 which as of today is good enough to get you into 27th place out of 461 entries.

The two changes I made were to update the base station locations to account for tectonic plate movement, and to handle rides with hardware clock discontinuities by replacing the affected points individually rather than replacing the entire data set.  I also added a few sort() statements to correct an issue that appeared when the code was run on systems other than Windows.

--------------------------------------------------------------------------------------------------------



**INTRODUCTION**

RTKLIB is an open source software library used for (among other things) calculating GNSS solutions from raw observation data.  It was originally written by Tomoji Takasu of the Tokyo University of Marine Science and Technology, but there are now multiple forks available, including the demo5 fork which I maintain. When used to generate PPK (post-processing kinematic) solutions it has two advantages over the baseline solutions provided by Google.  First, it uses the carrier phase observations (ADR) as well as the pseudorange observations.  The carrier phase observations are more difficult to use but also have smaller errors than the pseudorange observations.  Second, the PPK solutions are differential, relative to a nearby known base location, rather than absolute like the Google solutions.  The differential solution allows us to difference raw observations between the rover and base, which effectively cancels most of the satellite orbital, clock, and atmospheric errors, resulting in more accurate solutions.
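
To make the cancellation concrete, here is a toy sketch of differencing. All of the numbers are made up for illustration, the error model is deliberately simplified (satellite errors are assumed identical at rover and base), and this is not how RTKLIB structures its code. It only shows that differencing between receivers and then between satellites removes the satellite-side and receiver clock terms.

```python
# Toy illustration of differencing; every number here is invented.
def simulate_obs(geom_range, sat_clock, atmos, rcv_clock):
    """Pseudorange (m) = geometry + satellite clock + atmosphere + receiver clock."""
    return geom_range + sat_clock + atmos + rcv_clock

# Errors tied to each satellite, assumed common to rover and base (short baseline)
sat_err = {"G01": {"clock": 12.5, "atmos": 4.25},
           "G07": {"clock": -8.0, "atmos": 6.5}}

# Geometric ranges (m) from each receiver to each satellite
geom = {"rover": {"G01": 21_000_100.0, "G07": 22_500_040.0},
        "base":  {"G01": 21_000_000.0, "G07": 22_500_000.0}}

rcv_clock = {"rover": 150.0, "base": -30.0}  # receiver clock errors (m)

obs = {rcv: {sat: simulate_obs(geom[rcv][sat], sat_err[sat]["clock"],
                               sat_err[sat]["atmos"], rcv_clock[rcv])
             for sat in sat_err}
       for rcv in geom}

def single_diff(meas, sat):
    """Rover minus base: removes errors common to both receivers."""
    return meas["rover"][sat] - meas["base"][sat]

def double_diff(meas, sat, ref_sat):
    """Difference of single differences: also removes receiver clock errors."""
    return single_diff(meas, sat) - single_diff(meas, ref_sat)

dd_obs = double_diff(obs, "G07", "G01")    # from the noisy observations
dd_geom = double_diff(geom, "G07", "G01")  # from pure geometry
print(dd_obs, dd_geom)
```

In this toy case the double difference of the observations equals the double difference of the geometric ranges, which is the quantity that carries the rover position relative to the base.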

I describe this process in more detail in [this blog post](https://rtklibexplorer.wordpress.com/2022/01/10/google-smartphone-decimeter-challenge/) in which I share my experience working with last year's GSDC data after the competition was over.  It includes a link to download the code I used to generate RTKLIB solutions for last year's data.  Submitting these solutions to Kaggle (after the competition was over) resulted in a score of 2.15 meters which put it into fifth place on the final Private Leaderboard.

I would like to encourage use of RTKLIB in this year's competition and so I am sharing an updated version of the previous code to duplicate my results with this year's data and the most recent RTKLIB code. This code gives a set of solutions that score 3.135 on this year's Public Leaderboard.  The new code is only slightly modified from the previous version, so anyone interested in using RTKLIB for this challenge can start by becoming familiar with the previous code on the previous data.  I would strongly recommend starting with the older code and data because it comes as a complete package and is easier to get results with than what I am presenting here.

My hope is to provide a platform which will allow competitors to jump right into extending the existing GNSS theory rather than having to build a solution from scratch. In addition to the C version of RTKLIB I described in the above post, I have recently created an all-Python subset of RTKLIB for PPK solutions. This runs somewhat slower than the C code but makes for an easier development platform. In this notebook I will provide code and guidelines for working with the C code version of RTKLIB.  I will also describe working with the Python version in a separate notebook after I complete this one.

In the interests of getting this out sooner rather than later, this will not be the "push the button and you get an answer" kind of notebook.  It will be more like a set of handwritten notes scribbled down quickly while actually running the experiment.  In its current form you will need to download the pieces of code to your own system, put them together, and run locally since it relies on open source compiled code.  In the future I hope to make this more user friendly, but for now it is assumed that you are fairly familiar with Python and can debug simple issues that may crop up when trying to follow these instructions.

I ran this exercise on a Windows PC but you should be able to run it on Linux as well.  In general, the top of each file will have a set of input parameters.  Unless your folder names and paths are identical to mine, you will often need to update these before running.

The folder structure I use in this solution is:

GSDC_2022

    config
    data
      test
      train
    python
      android_rinex
      rtklib-py
    rtklib

It will be easier to follow these instructions if you use the same folder structure.
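
If you want to create this layout in one step, a minimal sketch follows. The function name `make_project_tree` is my own and not part of any library; the sub-folder names are taken from the structure listed above.

```python
import os

# Sub-folders from the layout above, relative to the project root
SUBFOLDERS = [
    "config",
    "data/test",
    "data/train",
    "python/android_rinex",
    "python/rtklib-py",
    "rtklib",
]

def make_project_tree(root):
    """Create the folder layout under root and return the created paths."""
    created = []
    for sub in SUBFOLDERS:
        path = os.path.join(root, *sub.split("/"))
        os.makedirs(path, exist_ok=True)  # no error if it already exists
        created.append(path)
    return created
```

Calling `make_project_tree("GSDC_2022")` from the parent folder creates any folders that are missing and leaves existing ones untouched.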

**Step 1: Retrieve base observation and satellite navigation files**

Since these are differential solutions, we will need raw observation measurements from a nearby base station for each data set.  Fortunately, these are available from the National Geodetic Survey (NGS) website.  We will also need satellite orbital data for each data set for the GPS, GLONASS, and Galileo constellations.  These are available from multiple websites.  I chose to retrieve them from the UNAVCO site, in part because these files include Galileo navigation data as well as GPS and GLONASS.  Some of the other sources include only GPS and GLONASS.
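
The file naming for both sources can be sketched as follows. This mirrors the URL construction in the download script in this notebook (base URLs, the `AC0300USA_R_` prefix and the day-of-year logic are copied from it); the helper names are my own, and I have not checked here whether the servers still serve these exact paths.

```python
from datetime import date

OBS_URL_BASE = "https://geodesy.noaa.gov/corsdata/rinex"
NAV_URL_BASE = "https://data.unavco.org/archive/gnss/rinex3/nav"
NAV_FILE_BASE = "AC0300USA_R_"

def day_fields(dataset):
    """Return (year, zero-padded day of year) from a name like 2021-04-29-US-MTV-1."""
    y, m, d = dataset.split("-")[:3]
    doy = date(int(y), int(m), int(d)).timetuple().tm_yday
    return y, f"{doy:03d}"

def obs_url(dataset, station):
    """URL of the compressed daily RINEX 2 observation file for a CORS station."""
    year, doy = day_fields(dataset)
    fname = f"{station}{doy}0.{year[2:]}d.gz"
    return "/".join([OBS_URL_BASE, year, doy, station, fname])

def nav_url(dataset):
    """URL of the compressed daily multi-constellation navigation file."""
    year, doy = day_fields(dataset)
    fname = f"{NAV_FILE_BASE}{year}{doy}0000_01D_MN.rnx.gz"
    return "/".join([NAV_URL_BASE, year, doy, fname])

print(obs_url("2021-04-29-US-MTV-1", "slac"))
print(nav_url("2021-04-29-US-MTV-1"))
```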

Last year it was sufficient to use a single base station for every data set since they were all located in a small geographic area.  This year there are data sets from the Bay Area and from the LA area, so we will need to select the appropriate base station for each data set and also use the correct location for that base station in the solution.
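
The base locations appear in this notebook in two forms: latitude/longitude/height in bases.sta (used by RTKLIB) and XYZ in the BASE_POS table (used by rtklib-py). A rough sanity check that the two agree is sketched below using the standard WGS84 geodetic to ECEF conversion. The coordinates are copied from those two places; the helper names are my own, and pymap3d's `geodetic2ecef` could be used in place of the hand-written formula.

```python
import math

WGS84_A = 6378137.0              # semi-major axis (m)
WGS84_F = 1 / 298.257223563      # flattening

def llh_to_ecef(lat_deg, lon_deg, h):
    """Convert geodetic latitude/longitude (deg) and height (m) to ECEF XYZ (m)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    e2 = WGS84_F * (2 - WGS84_F)                         # eccentricity squared
    n = WGS84_A / math.sqrt(1 - e2 * math.sin(lat) ** 2)  # prime vertical radius
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - e2) + h) * math.sin(lat)
    return x, y, z

# Copied from bases.sta in this notebook
BASES_LLH = {"slac": (37.41652004, -122.20426728, 63.678),
             "vdcy": (34.17856759, -118.22000401, 318.030),
             "p222": (37.53924280, -122.08326960, 53.400)}

# Copied from BASE_POS in this notebook
BASES_XYZ = {"slac": (-2703115.9184, -4291767.2037, 3854247.9027),
             "vdcy": (-2497836.5139, -4654543.2609, 3563028.9379),
             "p222": (-2689640.2891, -4290437.3671, 3865050.9313)}

def offset_m(name):
    """Distance (m) between the two representations of one base station."""
    return math.dist(llh_to_ecef(*BASES_LLH[name]), BASES_XYZ[name])

for name in BASES_LLH:
    print(name, round(offset_m(name), 3))
```

A large offset for any station would suggest that a coordinate was mistyped or that the wrong station was selected for a data set.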

The code below simply retrieves the base and navigation data for the full day corresponding to the starting time of each data set.  This works fine for the test data set since all data sets start and end on the same UTC day, but in the training set there are multiple data sets that start in one (UTC) day and finish in the next day.  I will be demonstrating this exercise on the test data so I will not worry about this issue, but if you are trying to retrieve this data for the training data you will either need to eliminate the multi-day sets or manually download a file with the correct starting and stopping times.  This is most easily done from the "User Friendly CORS" page on the NGS site.
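
One way to find the multi-day sets is sketched here. It assumes you already have the first and last Unix millisecond timestamps of a data set (for example from the ground truth file); the function names are my own.

```python
from datetime import datetime, timezone

def utc_day(unix_ms):
    """UTC calendar date for a Unix timestamp in milliseconds."""
    return datetime.fromtimestamp(unix_ms / 1000, tz=timezone.utc).date()

def spans_multiple_utc_days(start_ms, end_ms):
    """True if a data set starts on one UTC day and ends on another."""
    return utc_day(start_ms) != utc_day(end_ms)

# Example: a drive that starts 18 seconds before UTC midnight on 2021-04-24
print(spans_multiple_utc_days(1619308782000, 1619308800000))
```

Data sets flagged this way need either a manually downloaded base file covering the full time span or to be excluded from the run.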

The observation files are doubly compressed.  They first need to be decompressed with gzip and then with crx2rnx.  This second step translates from compressed rinex to uncompressed and requires an executable file from RTKLIB.

You can download the RTKLIB executables for Windows at https://github.com/rtklibexplorer/RTKLIB/releases.  Put them in the GSDC_2022/rtklib folder.  Note that these are for the demo5 fork of RTKLIB which I maintain.  You can use any fork of RTKLIB for this step, but later when we are calculating the solutions, you will need to use the demo5 fork.  If you are running in Linux, you will need to build your own executables from the source code.  This is described at https://rtklibexplorer.wordpress.com/2020/12/18/building-rtklib-code-in-linux/



In [3]:
!pip install pymap3d

Collecting pymap3d
  Downloading pymap3d-2.9.1-py3-none-any.whl (53 kB)
Installing collected packages: pymap3d
Successfully installed pymap3d-2.9.1


In [None]:
%%writefile D:\kaggle\GSDC2022\config\bases.sta
%  LATITUDE(DEG) LONGITUDE(DEG)    HEIGHT(M)   NAME
37.41652004  -122.20426728  63.678  SLAC
34.17856759  -118.22000401  318.030  VDCY
37.53924280  -122.08326960  53.400  P222

In [None]:
%%writefile D:\kaggle\GSDC2022\config\ppk_phone_0510.conf
# ppk_phone_0510.conf - config file for RTKLIB PPK solution
#2.9549
pos1-posmode       =kinematic  #
pos1-frequency     =l1+l2+l5   #
pos1-soltype       =combined-nophasereset #
pos1-elmask        =15         #
pos1-snrmask_r     =on         #
pos1-snrmask_b     =on         #
pos1-snrmask_L1    =24,28,28,32,28,28,24,24,24
pos1-snrmask_L2    =34,34,34,34,34,34,34,34,34
pos1-snrmask_L5    =24,20,20,24,20,28,28,20,20 #pos1-snrmask_L5    =24,20,24,24,24,28,28,20,20
pos1-dynamics      =on         #
pos1-tidecorr      =on        #
pos1-ionoopt       =brdc       #
pos1-tropopt       =saas       #
pos1-sateph        =brdc       #
pos1-posopt1       =off        #
pos1-posopt2       =off        #
pos1-posopt3       =off        # 
pos1-posopt4       =off        # 
pos1-posopt5       =off        #
pos1-posopt6       =off        #
pos1-exclsats      =           # 
pos1-navsys        =13         # 
pos2-armode        =off        # 
pos2-gloarmode     =off        # 
pos2-bdsarmode     =on         # 
pos2-arfilter      =on         #
pos2-arthres       =3
pos2-arthresmin    =1.5
pos2-arthresmax    =10

pos2-arthres1      =0.25
pos2-arthres2      =0
pos2-arthres3      =1e-09
pos2-arthres4      =1e-05
pos2-varholdamb    =0.7        # (cyc^2)
pos2-gainholdamb   =0.01

pos2-arlockcnt     =0
pos2-minfixsats    =4
pos2-minholdsats   =5
pos2-mindropsats   =10
pos2-arelmask      =0         # (deg)
pos2-arminfix      =50
pos2-armaxiter     =1
pos2-elmaskhold    =15         # (deg)
pos2-aroutcnt      =4
pos2-maxage        =30         # (s)
pos2-syncsol       =off        #
pos2-slipthres     =0.1        #
pos2-dopthres      =5          #
pos2-rejionno      =1          # (m)
pos2-rejgdop       =30
pos2-niter         =1
pos2-baselen       =0          # (m)
pos2-basesig       =0          # (m)
out-solformat      =llh        # 
out-outhead        =on         #
out-outopt         =on         # 
out-outvel         =off        # 
out-timesys        =gpst       # 
out-timeform       =tow        # 
out-timendec       =3
out-degform        =deg        # 
out-fieldsep       =
out-outsingle      =off        # 
out-maxsolstd      =0          # (m)
out-height         =ellipsoidal # 
out-geoid          =internal   # 
out-solstatic      =all        # 
out-nmeaintv1      =0          # (s)
out-nmeaintv2      =0          # (s)
out-outstat        =residual   # 
stats-eratio1      =400        #
stats-eratio2      =300
stats-eratio5      =100
stats-errphase     =0.006      # (m) 
stats-errphaseel   =0.003      # (m)
stats-errphasebl   =0          # (m/10km)
stats-errdoppler   =1          # (Hz)
stats-snrmax       =50         # (dB.Hz)
stats-errsnr       =0          # (m)
stats-errrcv       =0          # ( )
stats-stdbias      =30         # (m)
stats-stdiono      =0.03       # (m)
stats-stdtrop      =0.3        # (m)

stats-prnaccelh    =1          # (m/s^2)
stats-prnaccelv    =0.1          # (m/s^2)

stats-prnbias      =0.032       # 

stats-prniono      =0.001      # (m)
stats-prntrop      =0.0001     # (m)
stats-prnpos       =0          # (m)
stats-clkstab      =5e-12      # (s/s)
ant1-postype       =llh        # 
ant1-pos1          =0          # 
ant1-pos2          =0          
ant1-pos3          =0          
ant1-anttype       =
ant1-antdele       =0          
ant1-antdeln       =0          
ant1-antdelu       =0          
ant2-postype       =posfile    
ant2-pos1          =0          
ant2-pos2          =0          
ant2-pos3          =0          
ant2-anttype       =
ant2-antdele       =0          
ant2-antdeln       =0          
ant2-antdelu       =0          
ant2-maxaveep      =1
ant2-initrst       =off         
misc-timeinterp    =off         
misc-sbasatsel     =0          
misc-rnxopt1       =
misc-rnxopt2       =
misc-pppopt        =
misc-svrcycle      =5         # (ms)
misc-timeout       =10000      # (ms)
misc-reconnect     =10000      # (ms)
misc-nmeacycle     =5000       # (ms)
misc-buffsize      =32768      # (bytes)
misc-navmsgsel     =all        # (0:all,1:rover,2:base,3:corr)
misc-proxyaddr     =
misc-fswapmargin   =30         # (s)
file-satantfile    =
file-rcvantfile    =
file-staposfile    =D:\kaggle\GSDC2022\config\bases.sta
file-geoidfile     =
file-ionofile      =
file-dcbfile       =
file-eopfile       =
file-blqfile       =
file-tempdir       =
file-geexefile     =
file-solstatfile   =
file-tracefile     =

In [1]:
""" 
get_base_data.py - retrieve base observation and navigation data for the
    2022 GSDC competition 
"""

import os
from datetime import datetime
import numpy as np
import requests
import gzip
from glob import glob


# Input parameters
#D:\kaggle\GSDC2022\smartphone-decimeter-2022\test
datadir = r'D:\kaggle\GSDC2022\smartphone-decimeter-2022\train'
stas = ['slac', 'vdcy', 'p222']  # Bay Area, LA. backup for Bay Area
obs_url_base = 'https://geodesy.noaa.gov/corsdata/rinex'
nav_url_base = 'https://data.unavco.org/archive/gnss/rinex3/nav' 
nav_file_base = 'AC0300USA_R_'  # 20210060000_01D_MN.rnx.gz

# Make sure you have downloaded this executable before running this code
crx2rnx_bin = r'D:\kaggle\GSDC2022\rtklib\crx2rnx.exe'


os.chdir(datadir)

In [2]:
for dataset in np.sort(os.listdir()):
    if not os.path.isdir(dataset):
        continue
    print(dataset)
    ymd = dataset.split('-')
    doy = datetime(int(ymd[0]), int(ymd[1]), int(ymd[2])).timetuple().tm_yday # get day of year
    doy = str(doy).zfill(3)
    
    if len(glob(os.path.join(dataset,'*.*o'))) == 0:
        # get obs data
        i = 1 if '-LAX-' in dataset else 0  # use different base for LA
        fname = stas[i] + doy + '0.' + ymd[0][2:4] + 'd.gz'
        url = '/'.join([obs_url_base, ymd[0], doy, stas[i], fname])
        try:
            obs = gzip.decompress(requests.get(url).content) # get obs and decompress
            # write obs data
            with open(os.path.join(dataset, fname[:-3]), "wb") as f:
                f.write(obs)
        except Exception:
            # try backup CORS station (stas only lists a backup for the Bay Area)
            i += 2
            if i >= len(stas):
                print('Fail obs: %s' % dataset)
            else:
                fname = stas[i] + doy + '0.' + ymd[0][2:4] + 'd.gz'
                url = '/'.join([obs_url_base, ymd[0], doy, stas[i], fname])
                try:
                    obs = gzip.decompress(requests.get(url).content) # get obs and decompress
                    # write obs data
                    with open(os.path.join(dataset, fname[:-3]), "wb") as f:
                        f.write(obs)
                except Exception:
                    print('Fail obs: %s' % dataset)
            
        # convert crx to rnx
        crx_files = glob(os.path.join(dataset,'*.*d'))
        if len(crx_files) > 0:
            os.system(crx2rnx_bin + ' ' + crx_files[0])
    
    # get nav data
    if len(glob(os.path.join(dataset,'*.rnx'))) > 0:
        continue  # file already exists
    fname = nav_file_base + ymd[0] + doy + '0000_01D_MN' + '.rnx.gz'
    url = '/'.join([nav_url_base, ymd[0], doy, fname])
    try:
        nav = gzip.decompress(requests.get(url).content) # get nav and decompress
        # write nav data
        with open(os.path.join(dataset, fname[:-3]), "wb") as f:
            f.write(nav)
    except Exception:
        print('Fail nav: %s' % dataset)

2020-05-15-US-MTV-1
2020-05-21-US-MTV-1
2020-05-21-US-MTV-2
2020-05-28-US-MTV-2
2020-05-29-US-MTV-1
2020-05-29-US-MTV-2
2020-06-04-US-MTV-1
2020-06-04-US-MTV-2
2020-06-05-US-MTV-1
2020-06-05-US-MTV-2
2020-06-10-US-MTV-1
2020-06-10-US-MTV-2
2020-06-11-US-MTV-1
2020-06-18-US-MTV-1
2020-06-24-US-MTV-1
2020-06-24-US-MTV-2
2020-07-08-US-MTV-1
2020-07-08-US-MTV-2
2020-07-17-US-MTV-2
2020-07-24-US-MTV-1
2020-07-24-US-MTV-2
2020-08-03-US-MTV-1
2020-08-03-US-MTV-2
2020-08-06-US-MTV-1
2020-08-06-US-MTV-2
2020-08-11-US-MTV-1
2020-08-11-US-MTV-2
2020-08-13-US-MTV-1
2020-09-04-US-MTV-1
2020-09-04-US-MTV-2
2020-11-23-US-MTV-1
2020-12-10-US-SJC-1
2020-12-10-US-SJC-2
2021-01-04-US-SFO-1
2021-01-04-US-SFO-2
2021-01-05-US-MTV-1
2021-01-05-US-MTV-2
2021-03-10-US-MTV-1
2021-03-16-US-MTV-1
2021-03-16-US-MTV-2
2021-03-16-US-MTV-3
2021-04-02-US-SJC-1
2021-04-08-US-MTV-1
2021-04-21-US-MTV-2
2021-04-26-US-SVL-2
2021-04-28-US-MTV-1
2021-04-29-US-MTV-1
2021-04-29-US-MTV-2
2021-07-01-US-MTV-1
2021-07-14-US-MTV-1


**Step 2: Download and update android_rinex library and create RTKLIB config file**

You will need the android_rinex library for converting the raw Android observation files to rinex format.  RTKLIB post processing solutions require that the input files be in the rinex format.  You will need to use my fork of this library, which is available at the address shown in the code below. Make sure you have the most recent version, which was updated on 7/13/22.


Put this in the GSDC_2022/python/android_rinex folder.  (Temporary workaround: There is currently a path issue when running in multi-processing mode.  For now, you will need to copy the files from the android_rinex/src folder into the GSDC_2022/python folder)



You will also need a configuration file for the solution. Copy the config file shown in the ppk_phone_0510.conf cell earlier in this notebook to the GSDC_2022/config folder with a file name of ppk_phone_0510.conf
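
When experimenting with different settings it can help to read a config file back and compare options between runs. A minimal sketch of a parser for the `key =value  # comment` layout used in ppk_phone_0510.conf follows. The function name is my own and this is not RTKLIB's own option loader; it simply treats everything after a `#` as a comment.

```python
def parse_rtklib_conf(text):
    """Parse RTKLIB-style 'key =value # comment' lines into a dict of strings."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line or "=" not in line:
            continue                            # skip blank or malformed lines
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts

sample = """# ppk_phone_0510.conf - config file for RTKLIB PPK solution
pos1-posmode       =kinematic  #
pos1-snrmask_L1    =24,28,28,32,28,28,24,24,24
pos2-varholdamb    =0.7        # (cyc^2)
ant1-anttype       =
"""
opts = parse_rtklib_conf(sample)
print(opts["pos1-posmode"], opts["pos2-varholdamb"])
```

To use it on a real file, pass the result of `open(path).read()`; comparing two parsed dicts shows which options changed between experiments.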

In [None]:

#git clone https://github.com/rtklibexplorer/android_rinex.git

In [3]:
"""
run_ppk_multi.py - convert raw android files to rinex and run PPK solutions for GSDC_2022
data set with RTKLIB and/or rtklib-py.   
"""

import sys
import numpy as np
#if 'rtklib-py/src' not in sys.path:
sys.path.append(r'D:\kaggle\GSDC2022\python\rtklib-py')
#if 'android_rinex/src' not in sys.path:
sys.path.append(r'D:\kaggle\GSDC2022\python\android_rinex\src')

import os, shutil
from os.path import join, isdir, isfile
from glob import glob
from multiprocessing import Pool
import subprocess
import gnsslogger_to_rnx as rnx
from time import time

# set run parameters
maxepoch = None # max number of epochs, used for debug, None = no limit

# Set solution choices
ENABLE_PY = False        # Use RTKLIB-PY to generate solutions 
ENABLE_RTKLIB = True     # Use RTKLIB to generate solutions
OVERWRITE_RINEX = True  # overwrite existing rinex files
OVERWRITE_SOL = True    # overwrite existing solution files

# specify location of input folder and files
datadir = r'D:\kaggle\GSDC2022\smartphone-decimeter-2022\train'
basefiles = '../*0.2*o' # rinex2, use this for rtklib only
#basefiles = '../base.obs' # rinex3, use this for python only
navfiles = '../*MN.rnx' # navigation files with wild cards

# Setup for RTKLIB 
binpath_rtklib  = r'D:\kaggle\GSDC2022\rtklib\rnx2rtkp.exe'
cfgfile_rtklib = r'D:\kaggle\GSDC2022\config\ppk_phone_0510.conf'
soltag_rtklib = '_rtklib' # postfix for solution file names

# Setup for rtklib-py
cfgfile = r'D:\kaggle\GSDC2022\config\ppk_phone_0510.py' # cfgfile must be absolute path
soltag_py = '_py0510'  # postfix for solution file names

PHONES = ['GooglePixel4', 'GooglePixel4XL', 'Pixel4Modded', 'GooglePixel5', 'GooglePixel6Pro', 'XiaomiMi8', 'SamsungGalaxyS20Ultra']
BASE_POS = {'slac' : [-2703115.9184, -4291767.2037, 3854247.9027],  # WGS84 XYZ coordinates
            'vdcy' : [-2497836.5139, -4654543.2609, 3563028.9379],
            'p222' : [-2689640.2891, -4290437.3671, 3865050.9313]}


# input structure for rinex conversion
class args:
    def __init__(self):
        # Input parameters for conversion to rinex
        self.slip_mask = 0 # overwritten below
        self.fix_bias = True
        self.timeadj = 1e-7
        self.pseudorange_bias = 0
        self.filter_mode = 'sync'
        # Optional header values for rinex files
        self.marker_name = ''
        self.observer = ''
        self.agency = ''
        self.receiver_number = ''
        self.receiver_type = ''
        self.receiver_version = ''
        self.antenna_number = ''
        self.antenna_type = ''

# Copy and read config file
if ENABLE_PY:
    shutil.copyfile(cfgfile, '__ppk_config.py')
    import __ppk_config as cfg
    import rinex as rn
    import rtkcmn as gn
    from rtkpos import rtkinit
    from postpos import procpos, savesol

# function to convert single rinex file
def convert_rnx(folder, rawFile, rovFile, slipMask):
    os.chdir(folder)
    argsIn = args()
    argsIn.input_log = rawFile
    argsIn.output = os.path.basename(rovFile)
    argsIn.slip_mask = slipMask
    rnx.convert2rnx(argsIn)

# function to run single RTKLIB-Py solution
def run_ppk(folder, rovfile, basefile, navfile, solfile):
    # init solution
    os.chdir(folder)
    gn.tracelevel(0)
    nav = rtkinit(cfg)
    nav.maxepoch = maxepoch
    print(folder)

    # load rover obs
    rov = rn.rnx_decode(cfg)
    print('    Reading rover obs...')
    if nav.filtertype == 'backward':
        maxobs = None   # load all obs for backwards
    else:
        maxobs = maxepoch
    rov.decode_obsfile(nav, rovfile, maxobs)

    # load base obs and location
    base = rn.rnx_decode(cfg)
    print('   Reading base obs...')
    base.decode_obsfile(nav, basefile, None)
    
    # determine base location from original base obs file name
    if len(BASE_POS) > 1:
        baseName = glob('../*.2*o')[0][-12:-8]
        nav.rb[0:3]  = BASE_POS[baseName]
    elif nav.rb[0] == 0:
        nav.rb = base.pos # from obs file
        
    # load nav data from rover obs
    print('   Reading nav data...')
    rov.decode_nav(navfile, nav)

    # calculate solution
    print('    Calculating solution...')
    sol = procpos(nav, rov, base)

    # save solution to file
    savesol(sol, solfile)
    return rovfile

# function to run single RTKLIB solution
def run_rtklib(folder, rovfile, basefile, navfile, solfile):
    # create command to run solution
    rtkcmd='%s -x 0 -y 2 -k %s -o %s %s %s %s' % \
        (binpath_rtklib, cfgfile_rtklib, solfile, rovfile, basefile, navfile)
    
    # run command
    os.chdir(folder)
    subprocess.run(rtkcmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)   

####### Start of main code ##########################

def main():

    # get list of data sets in data path
    datasets = np.sort(os.listdir(datadir))

    # loop through data set folders
    rinexIn = []
    ppkIn = []
    rtklibIn = []
    for dataset in datasets:
        if dataset in ['2020-08-06-US-MTV-1',
                '2020-08-06-US-MTV-2',
                '2020-11-23-US-MTV-1',
                '2021-03-16-US-MTV-2',
                '2021-04-21-US-MTV-2',
                '2021-04-29-US-MTV-1',
                '2021-04-29-US-MTV-2',
                '2021-12-15-US-MTV-1',
                '2021-12-28-US-MTV-1'
                ]:
            continue
        for phone in PHONES:
            # skip if no folder for this phone
            folder = join(datadir, dataset, phone)
            if not isdir(folder):  
                continue
            os.chdir(folder)
            rawFile = join('supplemental', 'gnss_log.txt')
            rovFile = join('supplemental', 'gnss_log.obs')

            rinex = False
            # check if need rinex conversion
            if OVERWRITE_RINEX or not isfile(rovFile):
                # generate list of input parameters for each rinex conversion
                if phone == 'SamsungGalaxyS20Ultra': # Use cycle slip flags for Samsung phones
                    slipMask = 0 # 1 to unmask receiver cycle slips
                else:
                    slipMask = 0 
                rinexIn.append((folder, rawFile, rovFile, slipMask))
                print(rawFile, '->', rovFile) 
                rinex = True
            
            # check if need to create PPK solution
            try:
                baseFile = glob(basefiles)[0]
                navFile = glob(navfiles)[0]
                solFile = rovFile[:-4] + soltag_py + '.pos'
                solFile_rtklib = rovFile[:-4] + soltag_rtklib + '.pos'
            except:
                print(folder,'  Error: Missing file')
                continue
            if ENABLE_PY and (OVERWRITE_SOL == True or len(glob(solFile)) == 0 
                              or rinex == True):
                # generate list of input/output files for each python ppk solution
                print('PY: ', join(dataset, phone))
                ppkIn.append((folder, rovFile, baseFile, navFile, solFile))
            if ENABLE_RTKLIB and (OVERWRITE_SOL == True or 
                        len(glob(solFile_rtklib)) == 0 or rinex == True):
                # generate list of input/output files for each rtklib ppk solution
                print('RTKLIB: ', join(dataset, phone))
                rtklibIn.append((folder, rovFile, baseFile, navFile, solFile_rtklib))

    if len(rinexIn) > 0:
        print('\nConvert rinex files...')
        # generate rinex obs files in parallel, does not give error messages
        #with Pool() as pool: # defaults to using cpu_count for number of processes
        #    res = pool.starmap(convert_rnx, rinexIn)
        # run sequentially, use for debug
        for input in rinexIn:
            convert_rnx(input[0],input[1],input[2],input[3])

    if ENABLE_PY and len(ppkIn) > 0:
        print('Calculate PPK solutions...')
        # run PPK solutions in parallel, does not give error messages
        # with Pool() as pool: # defaults to using cpu_count for number of processes
        #     res = pool.starmap(run_ppk, ppkIn)
        # run sequentially, use for debug
        for input in ppkIn:
            run_ppk(input[0],input[1],input[2],input[3],input[4])

    if ENABLE_RTKLIB and len(rtklibIn) > 0:
        print('Calculate RTKLIB solutions...')
        # run PPK solutions in parallel, does not give error messages
        # with Pool() as pool: # defaults to using cpu_count for number of processes
        #     res = pool.starmap(run_rtklib, rtklibIn)
        # run sequentially, use for debug
        for input in rtklibIn:
            run_rtklib(input[0],input[1],input[2],input[3],input[4])

if __name__ == '__main__':
    t0 = time()
    main()
    print('Runtime=%.1f' % (time() - t0))

""" create_baseline_csv_from_pos.py -  Create csv file PPK solution files using timestamps in reference file
"""

import os
from os.path import join, isfile
import numpy as np
from datetime import date

########### Input parameters ###############################

DATA_SET = 'train'
SOL_TAG = '_rtklib'
datapath = r'D:\kaggle\GSDC2022\smartphone-decimeter-2022'
hdrlen = 25    # 25 for RTKLIB, 1 for rtklib-py

# Also make sure the appropriate reference file is in the datapath
#  test: sample_submission.csv - provided in Google data
# train: ground_truths_train.csv - created with crete_ground_truths.py

############################################################

GPS_TO_UTC = 315964782  # second


# get timestamps from existing baseline file
os.chdir(datapath)
if DATA_SET == 'train':
    baseline_file = 'ground_truths_train.csv'
else: # 'test'
    baseline_file = 'sample_submission.csv'
base_txt = np.genfromtxt(baseline_file, delimiter=',',invalid_raise=False, 
                         skip_header=1, dtype=str)
msecs_base = base_txt[:,1].astype(np.int64)
phones_base = base_txt[:,0]

# open output file
fout =open('baseline_locations_' + DATA_SET + '_' + 'CV' + '.csv','w')
fout.write('tripId,UnixTimeMillis,LatitudeDegrees,LongitudeDegrees\n')

# get list of data sets in data path
os.chdir(join(datapath, DATA_SET))
trips = np.sort(os.listdir())

# loop through data set folders
ix_b = 0
for trip in trips:
    if isfile(trip) or trip in ['2020-08-06-US-MTV-1',
                                '2020-08-06-US-MTV-2',
                                '2020-11-23-US-MTV-1',
                                '2021-03-16-US-MTV-2',
                                '2021-04-21-US-MTV-2',
                                '2021-04-29-US-MTV-1',
                                '2021-04-29-US-MTV-2',
                                '2021-12-15-US-MTV-1',
                                '2021-12-28-US-MTV-1'
                                ]:
        continue
    phones = os.listdir(trip)
    # loop through phone folders
    for phone in phones:
        # check for valid folder and file
        folder = join(trip, phone)
        if isfile(folder):
            continue
        trip_phone = trip + '/' + phone
        #if trip_phone in ['2020-06-24-US-MTV-1/GooglePixel4', 
        #                  '2020-06-24-US-MTV-1/GooglePixel4XL',
        #                 '2020-06-24-US-MTV-2/GooglePixel4',
        #                 '2020-06-24-US-MTV-2/GooglePixel4XL',
        #                 '2020-08-03-US-MTV-2/GooglePixel4',
        #                 '2020-08-03-US-MTV-2/GooglePixel4XL',
        #                 '2020-08-03-US-MTV-2/GooglePixel5',
        #                 '2020-11-23-US-MTV-1/XiaomiMi8',
        #                 '2020-06-04-US-MTV-2/GooglePixel4']:
        #    continue
        print(trip_phone)

        ix_b = np.where(phones_base == trip_phone)[0]
        sol_path = join(folder, 'supplemental', 'gnss_log' + SOL_TAG + '.pos')
        if isfile(sol_path):
            # parse solution file
            fields = np.genfromtxt(sol_path, invalid_raise=False, skip_header=hdrlen)
            #print(fields)
            try:
                if int(fields[0,1]) > int(fields[-1,1]): # invert if backwards solution
                    fields = fields[::-1]
                pos = fields[:,2:5]
                qs = fields[:,5].astype(int)
                nss = fields[:,6].astype(int)
                acc = fields[:,7:13]
                msecs = (1000 * (fields[:,0] * 7 * 24 * 3600 + fields[:,1])).astype(np.int64)
                msecs += GPS_TO_UTC * 1000
            except:
                continue
        else:
            print('File not found: ', sol_path)
            msecs = msecs_base.copy()
            pos = acc = np.zeros((len(msecs), 3))
            qs = nss = np.zeros(len(msecs))
            
           
        # interpolate to baseline timestamps to fill in missing samples
        llhs = []; stds = []
        for j in range(3):
            llhs.append(np.interp(msecs_base[ix_b], msecs, pos[:,j]))
            stds.append(np.interp(msecs_base[ix_b], msecs, acc[:,j],
                                 left=1000, right=1000))
        qsi = np.interp(msecs_base[ix_b], msecs, qs)
        nssi = np.interp(msecs_base[ix_b], msecs, nss)
            
            

        # write results to combined file
        for i in range(len(ix_b)):
                fout.write('%s,%d,%.12f,%.12f,%.2f,%.0f,%.0f,%.3f,%.3f,%.3f\n' % 
                        (trip_phone, msecs_base[ix_b[i]], llhs[0][i], llhs[1][i],
                         llhs[2][i], qsi[i], nssi[i], stds[0][i], stds[1][i], 
                         stds[2][i]))
fout.close()
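
The time conversion in the script above (GPS week and time of week to Unix milliseconds) can be checked in isolation. This is a minimal sketch of the same arithmetic; the function name is my own, and it rounds to the nearest millisecond where the script truncates. The constant 315964782 used in the script is the Unix time of the GPS epoch (315964800) minus the 18 leap seconds between GPS time and UTC that apply to these dates.

```python
SECONDS_PER_WEEK = 7 * 24 * 3600
GPS_EPOCH_UNIX = 315964800   # 1980-01-06 00:00:00 UTC in Unix seconds
LEAP_SECONDS = 18            # GPS minus UTC, in effect since 2017

def gps_to_unix_ms(week, tow):
    """Convert GPS week and time of week (s) to Unix time in milliseconds."""
    gps_seconds = week * SECONDS_PER_WEEK + tow
    return int(round(1000 * (gps_seconds + GPS_EPOCH_UNIX - LEAP_SECONDS)))

# GPS week 2155 begins 18 s before 2021-04-25 00:00:00 UTC
print(gps_to_unix_ms(2155, 0.0))
```

For data recorded before 2017 the leap second count would be different, so `LEAP_SECONDS` should be treated as specific to this data.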

import numpy as np
import pandas as pd
import pymap3d as pm
import pymap3d.vincenty as pmv

# Compute distance by Vincenty's formulae
def vincenty_distance(llh1, llh2):
    """
    Args:
        llh1 : [latitude,longitude] (deg)
        llh2 : [latitude,longitude] (deg)
    Returns:
        d : distance between llh1 and llh2 (m)
    """
    d, az = np.array(pmv.vdist(llh1[:, 0], llh1[:, 1], llh2[:, 0], llh2[:, 1]))

    return d


# Compute score
def calc_score(llh, llh_gt):
    """
    Args:
        llh : [latitude,longitude] (deg)
        llh_gt : [latitude,longitude] (deg)
    Returns:
        score : (m)
    """
    d = vincenty_distance(llh, llh_gt)
    score = np.mean([np.quantile(d, 0.50), np.quantile(d, 0.95)])

    return score

gnss_df = pd.read_csv(r'D:\kaggle\GSDC2022\smartphone-decimeter-2022\baseline_locations_train_CV.csv')  # GNSS data
gt_df = pd.read_csv(r'D:\kaggle\GSDC2022\smartphone-decimeter-2022\ground_truths_train.csv')  # ground truth

gnss_df = gnss_df.drop(columns=['tripId', 'UnixTimeMillis', 'LatitudeDegrees', 'LongitudeDegrees']).reset_index().drop(columns=['level_4', 'level_5'])
gnss_df.columns = gt_df.columns
gt_df = gt_df[gt_df['tripId'].isin(gnss_df['tripId'])].reset_index(drop=True)
llh_gt = gt_df[['LatitudeDegrees', 'LongitudeDegrees']].to_numpy()
plh_gt = gnss_df[['LatitudeDegrees', 'LongitudeDegrees']].to_numpy()
score = calc_score(plh_gt, llh_gt)
score

#kinematic: 3.656
#single: 9.4005
#dgps: 12707488.455379382

#l1+l2+l5+l6: 528.3245
#l1+l2: 512.05267
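
The score computed above pools every point from every trip into one number, which can hide a single bad ride. A per-trip breakdown is sketched below using only the standard library. Two things here are substitutions of my own: the distance uses the haversine formula on a spherical Earth rather than Vincenty's formulae, and the quantile is a hand-written linear interpolation intended to match the default behaviour of np.quantile. The function names are hypothetical.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2, radius=6371000.0):
    """Great-circle distance (m) on a sphere; a simpler stand-in for Vincenty."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))

def quantile(values, q):
    """Linearly interpolated quantile of a non-empty list, 0 <= q <= 1."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def score(errors):
    """Mean of the 50th and 95th percentile of the distance errors (m)."""
    return 0.5 * (quantile(errors, 0.50) + quantile(errors, 0.95))

def score_by_trip(rows):
    """rows: iterable of (trip_id, error_m); returns {trip_id: score}."""
    groups = {}
    for trip, err in rows:
        groups.setdefault(trip, []).append(err)
    return {trip: score(errs) for trip, errs in groups.items()}

rows = [("trip_a", e) for e in [1.0, 2.0, 3.0, 4.0, 5.0]] + [("trip_b", 10.0)]
print(score_by_trip(rows))
```

Sorting the resulting dictionary by score is a quick way to find the rides that contribute most to the overall error.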



