ENH: new GNSS TEC instrument #11

Closed
Changes from 36 commits
Commits
45 commits
42d487f  ENH: rename import nickname (aburrell, Aug 11, 2020)
5cdeffe  ENH: added GNSS TEC instrument (aburrell, Aug 11, 2020)
e35d5ce  BUG: code review feedback (aburrell, Aug 12, 2020)
aeab2f7  MAINT: new standards (aburrell, Aug 12, 2020)
1fb3fc4  BUG: list from keys and logging (aburrell, Aug 12, 2020)
895b0e4  ENH: smarter xarray creation (aburrell, Aug 13, 2020)
d599f62  MAINT: removed 'los' and updated 'vtec' load (aburrell, Aug 13, 2020)
12afe51  BUG: tuple with length attribute (aburrell, Aug 13, 2020)
0e0f347  BUG: final xarray check (aburrell, Aug 13, 2020)
3cfa630  BUG: merge not done in place (aburrell, Aug 14, 2020)
d32b71c  ENH: Squeeze out the madrigal keywords (aburrell, Aug 14, 2020)
e8a714a  DOC: VTEC example figure for docs (aburrell, Aug 14, 2020)
5684bb0  ENH: fixed metadata units (aburrell, Aug 14, 2020)
d7f42f1  STY: flake8 changes (aburrell, Aug 14, 2020)
c3f3965  DOC: docstring typo (aburrell, Aug 14, 2020)
b331197  STY: meta generalization (aburrell, Aug 14, 2020)
626c87d  DOC: docstring grammar check (aburrell, Aug 14, 2020)
7d43205  STY: more meta units label (aburrell, Aug 17, 2020)
9782e4d  BUG: dropped duplicates too early (aburrell, Aug 18, 2020)
318ca68  Merge branch 'tec_inst' of https://github.com/pysat/pysatMadrigal int… (aburrell, Aug 18, 2020)
ab600b0  DOC: updated reference figure (aburrell, Aug 18, 2020)
6da4347  BUG: jro debug (aburrell, Sep 10, 2020)
1c7fb88  BUG: data column number (aburrell, Sep 10, 2020)
ebd3525  BUG: new loading method (aburrell, Sep 10, 2020)
06a8c27  MAINT: install_requires update (aburrell, Oct 9, 2020)
a529e75  ENH: general and TEC netCDF support (aburrell, Oct 9, 2020)
f5b451a  BUG: netCDF4-Python pip name (aburrell, Oct 9, 2020)
bdbaee6  Merge branch 'develop' into tec_inst (aburrell, Oct 9, 2020)
6dd3e57  Merge pull request #19 from pysat/develop (aburrell, Oct 9, 2020)
1380341  TST: Travis uses develop-3 (aburrell, Oct 9, 2020)
59c5983  BUG: update module nickname (aburrell, Oct 9, 2020)
6908d09  BUG: lingering sat_id (aburrell, Oct 9, 2020)
5df53b1  STY: removed print statement (aburrell, Oct 9, 2020)
ecf0403  STY: file_format flag (aburrell, Oct 9, 2020)
dcf32ef  ENH: HDF5 coordinate flexibility (aburrell, Oct 9, 2020)
d9f3247  STY: flake8 (aburrell, Oct 9, 2020)
e69fd43  BUG: madrigal docs and file format handling (aburrell, Oct 13, 2020)
e94aec6  BUG: allow multiple TEC file formats (aburrell, Oct 13, 2020)
3029d4a  BUG: time index (aburrell, Oct 13, 2020)
15dd2b3  BUG: file_type can only be set once (aburrell, Oct 13, 2020)
54ab698  STY: dmsp_ivm file_type cleanup (aburrell, Oct 14, 2020)
d7e64a3  STY: jro_isr file_type update (aburrell, Oct 14, 2020)
9f25389  BUG: index (aburrell, Oct 14, 2020)
f316ad9  STY: Apply suggestions from code review (aburrell, Oct 15, 2020)
fc50297  Merge branch 'inst_kwarg_updates' into tec_inst (aburrell, Oct 15, 2020)
2 changes: 1 addition & 1 deletion .travis.yml
@@ -37,7 +37,7 @@ install:
- pip install pysatCDF >/dev/null
# Custom pysat install
- cd ..
- git clone https://github.com/pysat/pysat.git
- git clone --single-branch --branch develop-3 https://github.com/pysat/pysat.git
- cd pysat
- git checkout develop-3
- python setup.py install
Binary file added docs/figures/gnss_tec_vtec_example.png
8 changes: 6 additions & 2 deletions pysatMadrigal/instruments/__init__.py
@@ -1,4 +1,8 @@
from pysatMadrigal.instruments import dmsp_ivm, jro_isr
# Import Madrigal instruments
from pysatMadrigal.instruments import dmsp_ivm, gnss_tec, jro_isr

# Import Madrigal methods
from pysatMadrigal.instruments import methods # noqa F401

__all__ = ['dmsp_ivm', 'jro_isr']
# Define variable name with all available instruments
__all__ = ['dmsp_ivm', 'gnss_tec', 'jro_isr']
5 changes: 3 additions & 2 deletions pysatMadrigal/instruments/dmsp_ivm.py
@@ -64,8 +64,9 @@
import numpy as np
import pandas as pds

from pysat.instruments.methods import general as ps_gen

from pysatMadrigal.instruments.methods import madrigal as mad_meth
from pysat.instruments.methods import general as mm_gen

logger = logging.getLogger(__name__)

@@ -89,7 +90,7 @@
dmsp_fname2 = {'utd': '.{version:03d}.hdf5', '': 's?.{version:03d}.hdf5'}
supported_tags = {ss: {kk: dmsp_fname1[kk] + ss[1:] + dmsp_fname2[kk]
for kk in inst_ids[ss]} for ss in inst_ids.keys()}
list_files = functools.partial(mm_gen.list_files,
list_files = functools.partial(ps_gen.list_files,
supported_tags=supported_tags)

# madrigal tags
233 changes: 233 additions & 0 deletions pysatMadrigal/instruments/gnss_tec.py
@@ -0,0 +1,233 @@
# -*- coding: utf-8 -*-.
"""Supports the MIT Haystack GNSS TEC data products

The Global Navigation Satellite System (GNSS) is used in conjunction with a
world-wide receiver network to produce total electron content (TEC) data
products, including vertical and line-of-sight TEC.

Downloads data from the MIT Haystack Madrigal Database.

Properties
----------
platform
'gnss'
name
'tec'
tag
'vtec'

Examples
--------
::

import datetime as dt
import pysat
import pysatMadrigal as pymad

vtec = pysat.Instrument(inst_module=pymad.instruments.gnss_tec, tag='vtec')
vtec.download(dt.datetime(2017, 11, 19), dt.datetime(2017, 11, 20),
user='Firstname+Lastname', password='email@address.com')
vtec.load(date=dt.datetime(2017, 11, 19))


Note
----
Please provide name and email when downloading data with this routine.

"""

import datetime as dt
import functools
import numpy as np

from pysat.instruments.methods import general as ps_gen

from pysatMadrigal.instruments.methods import madrigal as mad_meth

import logging
logger = logging.getLogger(__name__)


platform = 'gnss'
name = 'tec'
tags = {'vtec': 'vertical TEC'}
inst_ids = {'': [tag for tag in tags.keys()]}
_test_dates = {'': {'vtec': dt.datetime(2017, 11, 19)}}
pandas_format = False

# support list files routine
# use the default pysat method
dname = '{year:02d}{month:02d}{day:02d}'
vname = '.{version:03d}'
supported_tags = {ss: {'vtec': "gps{:s}g{:s}.hdf5".format(dname, vname)}
Collaborator

This is hardcoded to .hdf5.

for ss in inst_ids.keys()}
list_files = functools.partial(ps_gen.list_files,
supported_tags=supported_tags,
two_digit_year_break=99)

# madrigal tags
madrigal_inst_code = 8000
madrigal_tag = {'': {'vtec': 3500}} # , 'los': 3505}}

# support listing files currently available on remote server (Madrigal)
list_remote_files = functools.partial(mad_meth.list_remote_files,
supported_tags=supported_tags,
inst_code=madrigal_inst_code)
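
The filename template above is built in two formatting stages, which is easy to misread in a diff. A minimal sketch of that expansion follows. The `build_template` helper and its `ext` argument are hypothetical and not part of this PR; they only illustrate one way the hardcoded `.hdf5` suffix noted in review could be parameterized.

```python
# Date and version pieces, copied from gnss_tec.py in this diff
dname = '{year:02d}{month:02d}{day:02d}'
vname = '.{version:03d}'


def build_template(ext='hdf5'):
    """Hypothetical helper: build the filename template for one extension."""
    # First stage: insert the (still unformatted) date and version fields
    return "gps{:s}g{:s}.{:s}".format(dname, vname, ext)


template = build_template()
# Second stage: fill in a two-digit year, month, day, and file version
filename = template.format(year=17, month=11, day=19, version=1)
print(template)
print(filename)
```

With the default extension this reproduces the template string used in `supported_tags`; passing another extension changes only the suffix.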


def init(self):
"""Initializes the Instrument object with values specific to GNSS TEC

Runs once upon instantiation.

"""

ackn_str = ''.join(["GPS TEC data products and access through the ",
"Madrigal distributed data system are provided to ",
"the community by the Massachusetts Institute of ",
"Technology under support from U.S. National Science",
" Foundation grant AGS-1242204. Data for the TEC ",
"processing is provided by the following ",
"organizations: UNAVCO, Scripps Orbit and Permanent",
" Array Center, Institut Geographique National, ",
"France, International GNSS Service, The Crustal ",
"Dynamics Data Information System (CDDIS), National ",
"Geodetic Survey, Instituto Brasileiro de Geografia ",
"e Estatística, RAMSAC CORS of Instituto Geográfico",
" Nacional de la República Argentina, Arecibo ",
"Observatory, Low-Latitude Ionospheric Sensor ",
"Network (LISN), Topcon Positioning Systems, Inc., ",
"Canadian High Arctic Ionospheric Network, ",
"Institute of Geology and Geophysics, Chinese ",
"Academy of Sciences, China Meteorology ",
"Administration, Centro di Ricerche Sismologiche, ",
"Système d’Observation du Niveau des Eaux Littorales",
" (SONEL), RENAG : REseau NAtional GPS permanent, ",
"and GeoNet—the official source of geological ",
"hazard information for New Zealand.\n",
mad_meth.cedar_rules()])

logger.info(ackn_str)
self.acknowledgements = ackn_str
self.references = "Rideout and Coster (2006) doi:10.1007/s10291-006-0029-5"

return


def download(date_array, tag='', inst_id='', data_path=None, user=None,
Collaborator

I downloaded a few days for testing but the files I got are labeled stuff.hdf. The load time is really quick though, suggesting they are actually netcdf files. Poking at the code the 'netcdf' path is indeed being run.

Member Author

yeah, regardless of the format you ask for and get, the remote files are listed as .hdf5. Not sure if I fixed the rename here or in the other PR.

password=None, url='http://cedar.openmadrigal.org',
file_format='netCDF4'):
"""Downloads data from Madrigal.

Parameters
----------
date_array : array-like
list of datetimes to download data for. The sequence of dates need not
be contiguous.
tag : string
Tag identifier used for particular dataset. This input is provided by
pysat. (default='')
inst_id : string
Satellite ID string identifier used for particular dataset. This input
is provided by pysat. (default='')
data_path : string
Path to directory to download data to. (default=None)
user : string
User string input used for download. Provided by user and passed via
pysat. If an account is required for downloads this routine here must
error if user not supplied. (default=None)
password : string
Password for data download. (default=None)
url : string
URL for Madrigal site (default='http://cedar.openmadrigal.org')
file_format : string
Collaborator

The file_format keyword conflicts with the built-in keyword at the Instrument level. I remember you said something about this during the meeting today but unfortunately I don't remember what. I'm having trouble in this area.

I tried checking out 'hdf5' support. Using the branch over in pysat (without the keyword) I get,

gps = pysat.Instrument('gnss', 'tec', tag='vtec')
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-c7e28318904d> in <module>
----> 1 gps = pysat.Instrument('gnss', 'tec', tag='vtec')

~/Code/pysat/pysat/_instrument.py in _get_supported_keywords(local_func)
   2950         for pop in pop_list[::-1]:
   2951             args.pop(pop)
-> 2952             defaults.pop(pop)
   2953 
   2954     out_dict = {}

IndexError: pop index out of range

With the keyword,

In [9]: gps = pysat.Instrument('gnss', 'tec', tag='vtec', file_format='hdf5')
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-06a989677482> in <module>
----> 1 gps = pysat.Instrument('gnss', 'tec', tag='vtec', file_format='hdf5')

~/Code/pysat/pysat/_instrument.py in __init__(self, platform, name, tag, inst_id, clean_level, update_files, pad, orbit_info, inst_module, multi_file_day, manual_org, directory_format, file_format, temporary_file_list, strict_time_flag, ignore_empty_files, units_label, name_label, notes_label, desc_label, plot_label, axis_label, scale_label, min_label, max_label, fill_label, **kwargs)
    252                 estr = 'file format set to default, supplied string must be '
    253                 estr = '{:s}iteratable [{:}]'.format(estr, self.file_format)
--> 254                 raise ValueError(estr)
    255 
    256         # set up empty data and metadata

ValueError: file format set to default, supplied string must be iteratable [hdf5]

Collaborator

I was using the wrong pysat branch. You had the correct link above. My bad. Trying again...

Collaborator

My test setup was good the first time. Clearly I got myself turned around a bit.

The test for 'hdf5' was done with the `delimiter_bug` branch. I confirmed that I get either the pop issue or the ValueError.

Member Author

Yeah, I apparently changed this bug in the following PR. Will update it here, too. Problems of finding too many related problems...
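
To make the keyword clash in this thread concrete, here is a minimal sketch that checks a routine's signature against Instrument-level keyword names. It is not how pysat resolves keywords; `conflicting_keywords` is a hypothetical helper, and `INSTRUMENT_KWARGS` is only a subset of the names shown in the `__init__` signature in the traceback above.

```python
import inspect

# Subset of the pysat.Instrument keywords listed in the traceback above
INSTRUMENT_KWARGS = {'clean_level', 'update_files', 'pad', 'orbit_info',
                     'multi_file_day', 'manual_org', 'directory_format',
                     'file_format', 'temporary_file_list'}


def conflicting_keywords(func, reserved=INSTRUMENT_KWARGS):
    """Return the parameter names of `func` that are also in `reserved`."""
    params = inspect.signature(func).parameters
    return sorted(name for name in params if name in reserved)


def load(fnames, tag=None, inst_id=None, file_format='netCDF4'):
    """Stand-in with the same signature as the load routine in this diff."""
    return fnames, tag, inst_id, file_format


def load_renamed(fnames, tag=None, inst_id=None, file_type='netCDF4'):
    """Stand-in using a different keyword name (an assumption here)."""
    return fnames, tag, inst_id, file_type


print(conflicting_keywords(load))
print(conflicting_keywords(load_renamed))
```

The first call reports `file_format` as a clash and the second reports none. The later commit messages in this PR mention `file_type`, which suggests a rename along these lines, though the exact change is not shown in this view.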

File format for Madrigal data. The load routines currently only accept
'hdf5' and 'netCDF4', but any of the Madrigal options may be used
here. (default='netCDF4')

Note
----
The user's name should be provided in the user field. For example,
Anthea Coster should be entered as Anthea+Coster.

The password field should be the user's email address. These parameters
are passed to Madrigal when downloading.

The affiliation field is set to pysat to enable tracking of pysat
downloads.

"""
mad_meth.download(date_array, inst_code=str(madrigal_inst_code),
kindat=str(madrigal_tag[inst_id][tag]),
data_path=data_path, user=user, password=password,
file_format=file_format, url=url)
return
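
The Note in the docstring above describes how the name and email are expected to be formatted. A minimal sketch of assembling those arguments follows; `madrigal_user` and `download_kwargs` are hypothetical helpers written for illustration, and only the instrument code and tag lookup are copied from this diff.

```python
# Instrument code and tag lookup copied from gnss_tec.py in this diff
madrigal_inst_code = 8000
madrigal_tag = {'': {'vtec': 3500}}


def madrigal_user(full_name):
    """Hypothetical helper: join the parts of a name with '+'."""
    return '+'.join(full_name.split())


def download_kwargs(tag, inst_id, full_name, email):
    """Collect the string arguments forwarded to mad_meth.download."""
    return {'inst_code': str(madrigal_inst_code),
            'kindat': str(madrigal_tag[inst_id][tag]),
            'user': madrigal_user(full_name),
            'password': email}


kwargs = download_kwargs('vtec', '', 'Anthea Coster', 'email@address.com')
print(kwargs)
```

Both `inst_code` and `kindat` are converted to strings, mirroring the `str(...)` calls in the download routine.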


def load(fnames, tag=None, inst_id=None, file_format='netCDF4'):
""" Routine to load the GNSS TEC data

Parameters
-----------
fnames : list
List of filenames
tag : string or NoneType
tag name used to identify particular data set to be loaded.
This input is nominally provided by pysat itself. (default=None)
inst_id : string or NoneType
Satellite ID used to identify particular data set to be loaded.
This input is nominally provided by pysat itself. (default=None)
file_format : string
File format for Madrigal data. Currently only accepts 'hdf5' and
'netCDF4', but any of the Madrigal options may be used here.
(default='netCDF4')

Returns
--------
data : xarray.Dataset
Object containing satellite data
meta : pysat.Meta
Object containing metadata such as column names and units

"""
# Define the xarray coordinate dimensions (apart from time)
# Not needed for netCDF
xcoords = {'vtec': {('time', 'gdlat', 'glon', 'kindat', 'kinst'):
['gdalt', 'tec', 'dtec'],
('time', ): ['year', 'month', 'day', 'hour', 'min',
'sec', 'ut1_unix', 'ut2_unix', 'recno']}}

# Load the specified data
data, meta = mad_meth.load(fnames, tag, inst_id,
xarray_coords=xcoords[tag],
file_format=file_format)

# Squeeze the kindat and kinst 'coordinates', but keep them as floats
squeeze_dims = np.array(['kindat', 'kinst'])
squeeze_mask = [sdim in data.coords for sdim in squeeze_dims]
if np.any(squeeze_mask):
data = data.squeeze(dim=squeeze_dims[squeeze_mask])

# Fix the units for tec and dtec
if tag == 'vtec':
meta['tec'] = {meta.units_label: 'TECU'}
meta['dtec'] = {meta.units_label: 'TECU'}

return data, meta
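
The `xcoords` structure maps a tuple of dimension names to the variables that share those dimensions. The sketch below inverts that map and mimics the squeeze step with plain Python, so it runs without xarray or numpy; it illustrates the data structure only and is not the loading code in `mad_meth.load`.

```python
# Coordinate map copied from the load routine in this diff
xcoords = {'vtec': {('time', 'gdlat', 'glon', 'kindat', 'kinst'):
                    ['gdalt', 'tec', 'dtec'],
                    ('time', ): ['year', 'month', 'day', 'hour', 'min',
                                 'sec', 'ut1_unix', 'ut2_unix', 'recno']}}

# Invert the map: variable name -> tuple of dimension names
dims_for = {var: dims
            for dims, variables in xcoords['vtec'].items()
            for var in variables}

# Mirror the squeeze step with a plain list in place of the numpy mask.
# In xarray, squeezing only succeeds when these dimensions have length one.
squeeze_dims = ['kindat', 'kinst']
remaining = tuple(dim for dim in dims_for['tec'] if dim not in squeeze_dims)

print(dims_for['tec'])
print(dims_for['recno'])
print(remaining)
```

After the squeeze, `tec` is expected to depend only on time, latitude, and longitude, while bookkeeping variables such as `recno` depend on time alone.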


def clean(self):
"""Routine to return GNSS TEC data at a specific level

Note
----
Supports 'clean', 'dusty', 'dirty', or 'None'.
Routine is called by pysat, and not by the end user directly.

"""
if self.tag == "vtec":
logger.info("".join(["Data provided at a clean level, further ",
"cleaning may be performed using the ",
"measurement error 'dtec'"]))

return
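
The clean routine leaves error-based filtering to the user. One possible approach, sketched below with plain lists, masks TEC values whose reported error is too large. The `mask_by_error` helper and the 2.0 TECU threshold are assumptions chosen for illustration, not values recommended by this PR.

```python
import math


def mask_by_error(tec, dtec, max_err):
    """Replace TEC values whose error exceeds max_err with NaN."""
    # A NaN error fails the comparison, so those values are masked as well
    return [t if e <= max_err else math.nan for t, e in zip(tec, dtec)]


tec = [12.0, 30.5, 8.2]    # example TEC values in TECU
dtec = [0.5, 4.0, 1.0]     # matching measurement errors in TECU
cleaned = mask_by_error(tec, dtec, max_err=2.0)
print(cleaned)
```

On the `xarray.Dataset` returned by `load`, the same idea could be expressed as `data['tec'].where(data['dtec'] <= max_err)`.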
88 changes: 84 additions & 4 deletions pysatMadrigal/instruments/jro_isr.py
@@ -86,9 +86,6 @@
supported_tags=supported_tags,
inst_code=madrigal_inst_code)

# support load routine
load = functools.partial(mad_meth.load, xarray_coords=['gdalt'])

# Madrigal will sometimes include multiple days within a file
# labeled with a single date.
# Filter out this extra data using the pysat nanokernel processing queue.
@@ -124,7 +121,7 @@ def init(self):


def download(date_array, tag='', inst_id='', data_path=None, user=None,
password=None):
password=None, file_format='hdf5'):
"""Downloads data from Madrigal.

Parameters
@@ -146,6 +143,9 @@ def download(date_array, tag='', inst_id='', data_path=None, user=None,
error if user not supplied. (default=None)
password : string
Password for data download. (default=None)
file_format : string
File format for Madrigal data. Currently only accepts 'netcdf4' and
'hdf5'. (default='hdf5')

Notes
-----
@@ -164,6 +164,86 @@ def download(date_array, tag='', inst_id='', data_path=None, user=None,
data_path=data_path, user=user, password=password)


def load(fnames, tag=None, inst_id=None, file_format='hdf5'):
""" Routine to load the JRO ISR data

Parameters
-----------
fnames : list
List of filenames
tag : string or NoneType
tag name used to identify particular data set to be loaded.
This input is nominally provided by pysat itself. (default=None)
inst_id : string or NoneType
Satellite ID used to identify particular data set to be loaded.
This input is nominally provided by pysat itself. (default=None)
file_format : string
File format for Madrigal data. Currently only accepts 'netcdf4' and
'hdf5'. (default='hdf5')

Returns
--------
data : xarray.Dataset
Object containing satellite data
meta : pysat.Meta
Object containing metadata such as column names and units

"""
# Define the xarray coordinate dimensions (apart from time)
xcoords = {'drifts': {('time', 'gdalt', 'gdlatr', 'gdlonr', 'kindat',
'kinst'): ['nwlos', 'range', 'vipn2', 'dvipn2',
'vipe1', 'dvipe1', 'vi72', 'dvi72',
'vi82', 'dvi82', 'paiwl', 'pacwl',
'pbiwl', 'pbcwl', 'pciel', 'pccel',
'pdiel', 'pdcel', 'jro10', 'jro11'],
('time', ): ['year', 'month', 'day', 'hour', 'min',
'sec', 'spcst', 'pl', 'cbadn', 'inttms',
'azdir7', 'eldir7', 'azdir8', 'eldir8',
'jro14', 'jro15', 'jro16', 'ut1_unix',
'ut2_unix', 'recno']},
'drifts_ave': {('time', 'gdalt', 'gdlatr', 'gdlonr', 'kindat',
'kinst'): ['altav', 'range', 'vipn2', 'dvipn2',
'vipe1', 'dvipe1'],
('time', ): ['year', 'month', 'day', 'hour',
'min', 'sec', 'spcst', 'pl',
'cbadn', 'inttms', 'ut1_unix',
'ut2_unix', 'recno']},
'oblique_stan': {('time', 'gdalt', 'gdlatr', 'gdlonr', 'kindat',
'kinst'): ['rgate', 'ne', 'dne', 'te', 'dte',
'ti', 'dti', 'ph+', 'dph+', 'phe+',
'dphe+'],
('time', ): ['year', 'month', 'day', 'hour',
'min', 'sec', 'azm', 'elm',
'pl', 'inttms', 'tfreq',
'ut1_unix', 'ut2_unix', 'recno']},
'oblique_rand': {('time', 'gdalt', 'gdlatr', 'gdlonr', 'kindat',
'kinst'): ['rgate', 'pop', 'dpop', 'te', 'dte',
'ti', 'dti', 'ph+', 'dph+', 'phe+',
'dphe+'],
('time', ): ['year', 'month', 'day', 'hour',
'min', 'sec', 'azm', 'elm',
'pl', 'inttms', 'tfreq',
'ut1_unix', 'ut2_unix', 'recno']},
'oblique_long': {('time', 'gdalt', 'gdlatr', 'gdlonr', 'kindat',
'kinst'): ['rgate', 'pop', 'dpop', 'te', 'dte',
'ti', 'dti', 'ph+', 'dph+', 'phe+',
'dphe+'],
('time', ): ['year', 'month', 'day', 'hour',
'min', 'sec', 'azm', 'elm',
'pl', 'inttms', 'tfreq',
'ut1_unix', 'ut2_unix', 'recno']}}

# Load the specified data
data, meta = mad_meth.load(fnames, tag, inst_id,
xarray_coords=xcoords[tag],
file_format=file_format)

# Squeeze the kindat, kinst, gdlatr, and gdlonr 'coordinates', but keep
# them as floats
data = data.squeeze(dim=['kindat', 'kinst', 'gdlatr', 'gdlonr'])

return data, meta


def clean(self):
"""Routine to return JRO ISR data cleaned to the specified level
