## Notebook to process GNSS data for multiple receivers

### Step 1: Load libraries

This cell loads the required packages and enables autoreload. The working directory is the folder that contains the JupyterLab notebook file.

In [1]:
%load_ext autoreload
%autoreload 2
import gnssvod as gv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pdb
import importlib
import zipfile
import os
import xarray as xr
import glob
import shutil
from gnssvod.hemistats.hemistats import hemibuild
#import georinex as gr
#import qgrid as interactive table 
from matplotlib.collections import PatchCollection
from matplotlib import rcParams
from datetime import datetime, timedelta

### Step 2: Time intervals

Set up the time intervals for the study period. The **periods** argument is the number of consecutive daily intervals, counted from the start day.

In [None]:
# Define the start day
startday = pd.to_datetime('21-03-2025', format='%d-%m-%Y')
# Generate a range of datetime values
timeintervals=pd.interval_range(start=startday, periods=10, freq='D', closed='left')
timeintervals
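To make the effect of `closed='left'` concrete, here is a small sketch that rebuilds the same intervals and checks which one a given timestamp falls into. The timestamp `stamp` is an arbitrary value chosen for illustration.

```python
import pandas as pd

# Same definition as in the cell above
startday = pd.to_datetime('21-03-2025', format='%d-%m-%Y')
timeintervals = pd.interval_range(start=startday, periods=10, freq='D', closed='left')

# Each interval covers one day: the start is included, the end is excluded
first, last = timeintervals[0], timeintervals[-1]
print(first.left, first.right)
print(last.left, last.right)

# Arbitrary illustrative timestamp: exactly midnight on the second day
stamp = pd.Timestamp('2025-03-22 00:00:00')
mask = timeintervals.contains(stamp)   # boolean array, one entry per interval
position = int(mask.argmax())          # position of the interval holding the timestamp
print(position)
```

Because the intervals are closed on the left, midnight on 22 March belongs to the second interval (position 1), not the first.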

### Step 3: Read RINEX file

Next, we read the RINEX file. We first run the code below to pre-process the file in Python and inspect the resulting data frame. The **interval** argument resamples the data to reduce its size, here from one observation per second to one every 15 s.

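This step sometimes fails with the following error:

ValueError: Missing an approximate antenna position. Provide the argument 'approx_position' to preprocess()

In that case, uncomment the **approx_position** lines in the cell below and pass the receiver coordinates to gv.preprocess().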

In [None]:
pattern = {'YoungPine-2':'/Users/ger/Library/CloudStorage/Box-Box/Project_MetoliusGNSS/VOD/Data/GNSS/extracted/youngpine/youngpine_pheno/Reach_raw_20250311193931.25O'}
# If the approximate position is missing, uncomment the next two lines and use them instead of the call below
#approx_position=[-4705.036,43.000,23011766.990]
#result = gv.preprocess(pattern,interval='15s',outputresult=True,approx_position=approx_position)
result = gv.preprocess(pattern,interval='15s',outputresult=True) # preprocess the data and return the result in memory
obs = result['YoungPine-2'][0] # first observation object for this station
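If the error above appears, the position has to be supplied by hand. The sketch below assumes that `approx_position` expects Earth-centered, Earth-fixed (ECEF) X, Y, Z coordinates in meters, which is the form in which RINEX headers store the approximate position; this is worth checking against the gnssvod documentation. Under that assumption, it converts a latitude, longitude and ellipsoidal height to ECEF with the standard WGS84 formulas. The function name `geodetic_to_ecef` and the coordinates are hypothetical placeholders, not the real site location.

```python
import math

# WGS84 ellipsoid constants
WGS84_A = 6378137.0                    # semi-major axis, in meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, height_m):
    """Convert WGS84 latitude/longitude (degrees) and ellipsoidal height (m) to ECEF X, Y, Z (m)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + height_m) * math.cos(lat) * math.cos(lon)
    y = (n + height_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + height_m) * math.sin(lat)
    return [x, y, z]

# Placeholder coordinates for illustration only (NOT the real receiver location)
approx_position = geodetic_to_ecef(lat_deg=45.0, lon_deg=-120.0, height_m=1000.0)
print(approx_position)
```

The resulting list could then be passed as `approx_position=approx_position` to `gv.preprocess()`, as in the commented lines of the cell above.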

**Observation objects** contain the following properties

- obs.filename = the name of the source file
- obs.epoch = a datetime indicating the day at the start of the record
- obs.observation = a pandas data frame containing all measurements
- obs.approx_position = the approximate receiver position as provided in the RINEX file [X,Y,Z]
- obs.receiver_type = the receiver type if provided in the RINEX file
- obs.antenna_type = the antenna type if provided in the RINEX file
- obs.interval = the measurement interval in seconds
- obs.receiver_clock = the receiver clock if provided in the RINEX file
- obs.version = the version of the RINEX file
- obs.observation_types = the observation types reported as columns in obs.observation

We can check the day on which the record started:

In [None]:
obs = result['YoungPine-2'][0]
obs.epoch

Let's now look at the data:

The pandas data frame has a MultiIndex that contains both Epoch and SV as indices. The Epoch is the local time of the measurement and the SV is a satellite identification number (also called PRN).

The columns correspond to:
- C# = Pseudorange from the receiver to the satellite, in meters
- L# = Carrier phase, in cycles
- D# = Doppler, in Hz
- S# = Carrier-to-noise density ratio C/N_0, in dB-Hz (receiver-dependent)

The number in each column name (S1, S2, etc.) indicates the corresponding **GNSS frequency**.

The azimuth and elevation of the satellite with respect to the receiver are expressed in degrees. How long they take to compute depends on your hardware, and most of that time is spent interpolating the orbit parameters to the time stamp of each measurement. This is why it is often useful to resample high-frequency data (here one measurement per second) to, for instance, one measurement every 15 seconds.
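As a rough illustration of what resampling does to the table size, the sketch below averages a synthetic 1 Hz record into 15 s bins with pandas. The data are made up, and averaging within bins is an assumption for illustration; gnssvod may apply a different rule internally.

```python
import numpy as np
import pandas as pd

# Synthetic 1 Hz record: 60 seconds, two satellites (values are made up)
epochs = pd.date_range('2025-03-21 00:00:00', periods=60, freq='1s')
svs = ['G01', 'G02']
index = pd.MultiIndex.from_product([epochs, svs], names=['Epoch', 'SV'])
raw = pd.DataFrame({'S1': np.linspace(35.0, 45.0, len(index))}, index=index)

# Average each satellite's observations within 15 s bins
resampled = raw.groupby(
    [pd.Grouper(level='Epoch', freq='15s'), pd.Grouper(level='SV')]
).mean()

print(len(raw), 'rows before,', len(resampled), 'rows after')
```

One minute of data for two satellites shrinks from 120 rows to 8, so the orbit interpolation has far fewer time stamps to process.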

In [None]:
obs.observation
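To show how a table indexed by (Epoch, SV) can be queried, here is a sketch on a small synthetic data frame. The values are random, and the satellite labels and the column names `S1`, `Azimuth` and `Elevation` are assumed stand-ins for the contents of `obs.observation`.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for obs.observation (random values, for illustration only)
epochs = pd.date_range('2025-03-21 00:00:00', periods=4, freq='15s')
svs = ['G01', 'G02', 'E05']   # assumed satellite labels
index = pd.MultiIndex.from_product([epochs, svs], names=['Epoch', 'SV'])
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    'S1': rng.uniform(30.0, 50.0, len(index)),
    'Azimuth': rng.uniform(0.0, 360.0, len(index)),
    'Elevation': rng.uniform(0.0, 90.0, len(index)),
}, index=index)

# Time series of a single satellite
g01 = toy.xs('G01', level='SV')

# All satellites observed at one epoch
first_epoch = toy.xs(epochs[0], level='Epoch')

# Mean S1 per satellite
mean_s1 = toy.groupby(level='SV')['S1'].mean()

# Keep only observations above 30 degrees elevation
high = toy[toy['Elevation'] > 30.0]

print(g01.shape, first_epoch.shape, mean_s1.shape, high.shape)
```

The same calls should apply to `obs.observation` as long as its index levels are named Epoch and SV, as described above.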

### Step 4: Saving processed RINEX file

Repeat **Step 3**, but instead of creating an object, save the result as a NetCDF file in the Box folder. This requires the location of the input file (**pattern**) and of the output directory (**outputdir**).

In [None]:
pattern = {'YoungPine-2':'/Users/ger/Library/CloudStorage/Box-Box/Project_MetoliusGNSS/VOD/Data/GNSS/extracted/youngpine/youngpine_pheno/Reach_raw_20250311193931.25O'}
outputdir={'YoungPine-2':'/Users/ger/Library/CloudStorage/Box-Box/Project_MetoliusGNSS/VOD/Data/GNSS/extracted/youngpine/youngpine_pheno/youngpine_pheno_nc/'}
# If the approximate position is missing, uncomment the next two lines and use them instead of the call below
#approx_position=[-4705.036,43.000,23011766.990]
#gv.preprocess(pattern,interval='15s',outputdir=outputdir,approx_position=approx_position)
gv.preprocess(pattern,interval='15s',outputdir=outputdir,outputresult=True) # preprocess the data and save the result as a netcdf file
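Since **pattern** and **outputdir** are both dictionaries keyed by station name, it can help to check that their keys match and that the output folder exists before calling `gv.preprocess()`. The sketch below does this with the standard library only. The temporary folder and the file name are hypothetical placeholders, and no assumption is made about whether gnssvod creates a missing output folder by itself.

```python
import tempfile
from pathlib import Path

# Hypothetical project folder (a temporary directory stands in for the real one)
base = Path(tempfile.mkdtemp())
station = 'YoungPine-2'

pattern = {station: str(base / 'raw' / 'example.25O')}   # placeholder RINEX file name
outputdir = {station: str(base / 'nc')}                   # placeholder output folder

# Both dictionaries should describe the same stations
missing = set(pattern) ^ set(outputdir)
if missing:
    raise KeyError(f'Stations missing from pattern or outputdir: {sorted(missing)}')

# Create each output folder if it does not exist yet
for folder in outputdir.values():
    Path(folder).mkdir(parents=True, exist_ok=True)

print({name: Path(folder).is_dir() for name, folder in outputdir.items()})
```

With several receivers, adding one entry per station to both dictionaries keeps the inputs and outputs paired.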