Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: new GNSS TEC instrument #11

Closed
wants to merge 45 commits into from
Closed

ENH: new GNSS TEC instrument #11

wants to merge 45 commits into from

Conversation

aburrell
Copy link
Member

Adding a new instrument for vertical and slant TEC from MIT Haystack.

Renamed import nickname so that the source of two routines would not be confused.
Added a GNSS TEC instrument.
@aburrell aburrell added the enhancement New feature or request label Aug 11, 2020
@aburrell aburrell self-assigned this Aug 11, 2020
@aburrell aburrell marked this pull request as draft August 11, 2020 17:13
@aburrell
Copy link
Member Author

aburrell commented Aug 11, 2020

There's a problem listing files, which I don't know how to resolve. Files download as expected, but the file list is not working. Additionally, specifying a file to load does not work. This needs to be fixed before I can check to see that the Datasets for the two data sets are formatted correctly.

This may be related to #5

Copy link
Collaborator

@rstoneback rstoneback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm currently waiting for the load command on 'vtec' to finish.

pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
@rstoneback
Copy link
Collaborator

I've started a pull on pysat with the bug fix for finding files.
pysat/#510
pysat/pysat#510

@rstoneback
Copy link
Collaborator

Loads data just fine.

In [8]: vtec.load(date=dt.datetime(2017, 11, 18))                                                                                                                                                                       

In [9]: vtec.data                                                                                                                                                                                                       
Out[9]: 
<xarray.Dataset>
Dimensions:   (gdalt: 1, gdlat: 180, glon: 360, time: 288)
Coordinates:
  * time      (time) datetime64[ns] 2017-11-18T00:02:30 ... 2017-11-18T23:57:30
  * gdlat     (gdlat) float64 -90.0 -89.0 -88.0 -87.0 ... 86.0 87.0 88.0 89.0
  * glon      (glon) float64 -180.0 -179.0 -178.0 -177.0 ... 177.0 178.0 179.0
  * gdalt     (gdalt) float64 350.0
Data variables:
    year      (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    month     (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    day       (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    hour      (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    min       (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    sec       (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    recno     (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    kindat    (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    kinst     (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    ut1_unix  (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    ut2_unix  (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    tec       (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan
    dtec      (time, gdlat, glon, gdalt) float64 nan nan nan nan ... nan nan nan

gnss_example

@aburrell
Copy link
Member Author

@rstoneback can you provide your system and version details? Because it won't load on mine.

Applied code review feedback and new style standards, which include:
- spelling,
- fixing filename to use 2-digit year, split between 1900s and 2000s,
- init and clean docstrings,
- correct join string for acknowledgements, and
- attaching acknowledgements and references.
Updated to:
- remove python 2.7 support,
- improve docstrings,
- remove unnecessary comments, and
- apply new standards for acknowledgements and references.
Fixed logging call and extraction of list of keys.
@aburrell
Copy link
Member Author

aburrell commented Aug 12, 2020

Applied suggested changes and now the load works with the bugfix branch on pysat. Currently working on:

  • defining xarray with all variables except tec and dtec only depending on the time coordinate
  • determining best way to load los data, and how to best use sat_id to define method.

For help with the second point, refer to: http://cedar.openmadrigal.org/static/siteSpecific/programming_tips.pdf

Adapted the general load routine for loading data from an HDF file to allow
different data variables to depend on different coordinates.  This will make the loaded data set smaller, since less padding will be needed and non-unique
values will not need to be maintained.
Removed the initial support for 'los' data and updated 'vtec' load to use new
xarray loading capabilities.
To have a tuple with a length attribute, a comma needs to be included after the variable.
Updated example in docstring and fixed bug where variable name retrieval
from xarray wasn't correctly obtaining a list of keys.
Fixed bug introduced by assuming merge was done in place.
Remove the madrigal keywords as cooridinates for data variables, but keep
them as float coordinates.
Created docs directory and added VTEC figure for future examples.
@aburrell
Copy link
Member Author

aburrell commented Aug 14, 2020

Moved the LOS task to issue #12. Added a figure that begins to address issue #14. Still need to fix one thing:

  • Update metadata 'units' after loading, as they are incorrectly specified in the HDF file.

@aburrell
Copy link
Member Author

@rstoneback tried to figure out how to fix metadata, but failed miserably. Advice?

@rstoneback
Copy link
Collaborator

I'd change the units in the gnss_tec.py load method, after you get everything back from the general methods.

In [50]:  vtec.meta.data                                                                                                                                                                                                
Out[50]: 
              units long_name                                             desc     label      axis   scale notes value_min value_max fill
year              y      YEAR                            Year (universal time)      year      year  linear             NaN       NaN  NaN

In [51]: vtec.meta['year'] = {'units': 'new_units'}                                                                                                                                                                     

In [52]: vtec.meta.data                                                                                                                                                                                                 
Out[52]: 
              units long_name                                             desc     label      axis   scale notes value_min value_max fill
year      new_units      YEAR                            Year (universal time)      year      year  linear             NaN       NaN  NaN

I thought we could do something like,

vtec.meta['year', 'units'] = 'new_units'

but I was wrong and I need to go fix some recent code.

@rstoneback
Copy link
Collaborator

Correction for generality

In [51]: vtec.meta['year'] = {vtec.units_label: 'new_units'}                                                                                                                                                                     

Fixed incorrectly specified units in the VTEC meta data.
@aburrell
Copy link
Member Author

aburrell commented Aug 14, 2020

OK, this PR is ready to convert and review pending pysat/pysat#510

@aburrell aburrell marked this pull request as ready for review August 14, 2020 15:05
aburrell and others added 4 commits October 9, 2020 08:35
Updated the PiPy distribution name for netCDF4-Python.
Have Travis use develop-3, since enough breaking changes have been made.
@aburrell
Copy link
Member Author

aburrell commented Oct 9, 2020

@jklenzing any idea what's going on with Travis?? I am very confused.

@jklenzing
Copy link
Member

mm_gen needs to be updated to ps_gen on line 93 of the dmsp file.

Module nickname was not carried through entire file, fixed this.
Removed hopefully the last instance of sat_id.
Replaced print statements with logger warnings.
Added a file_format flag option to JRO ISR load and download routine.
Added ability to remove expected coordinates from
HDF5 xarray loading if they aren't present.  Now
only errors if all variables are absent.
Flake8 correction for hanging indent.
@aburrell
Copy link
Member Author

I did it! NetCDF solved the problem! And I managed not to break the other instruments!!!

Updated general madrigal routines by:
- fixing incorrect use of `self` in `filter_data_single_date`,
- changing `Notes` to `Note`,
- ensuring local files have the right extension,
- changing variable `file_format` to `file_type`
  to remove any confusion with the Instrument
  attribute,
- changing `netcdf4` to `netCDF4` everywhere, and
- specifying accepted file types.
Allow pysat to handlee multiple TEC file formats, though instruments will only load one type at a time.
Fixed bug assigning time index, now is an index of datetime values.
Fixed bug that tried to set file_type multiple times.
Adapted code to allow flexible file types, removed
unused default function call, and updated the style
and docstrings in clean.
Updated jro_isr to use flexible file type throughout
instrument, and changed old variable name.
More efficient transfer from unix to timestamp.  Fixed bug where index does not need to be assigned.
Copy link
Collaborator

@rstoneback rstoneback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aburrell! Comments posted. I ran into some issues so I wasn't able to test the hdf support.

I was able to load multiple days (netcdf files) using the appropriate pysat branch for multiple days. Thanks for the additional improvements!

pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
pysatMadrigal/instruments/jro_isr.py Outdated Show resolved Hide resolved
# use the default pysat method
dname = '{year:02d}{month:02d}{day:02d}'
vname = '.{version:03d}'
supported_tags = {ss: {'vtec': "gps{:s}g{:s}.hdf5".format(dname, vname)}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hardcoded to .hdf5.

pysatMadrigal/instruments/methods/madrigal.py Show resolved Hide resolved
Password for data download. (default=None)
url : string
URL for Madrigal site (default='http://cedar.openmadrigal.org')
file_format : string
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file_format keyword conflicts with the built-in keyword at the Instrument level. I remember you said something about this during the meeting today but unfortunately I don't remember what. I'm having trouble in this area.

I tried checking out 'hdf5' support. Using the branch over in pysat (without the keyword) I get,

gps = pysat.Instrument('gnss', 'tec', tag='vtec')                                                                                                                                                                   
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-8-c7e28318904d> in <module>
----> 1 gps = pysat.Instrument('gnss', 'tec', tag='vtec')

~/Code/pysat/pysat/_instrument.py in _get_supported_keywords(local_func)
   2950         for pop in pop_list[::-1]:
   2951             args.pop(pop)
-> 2952             defaults.pop(pop)
   2953 
   2954     out_dict = {}

IndexError: pop index out of range

With the keyword,

In [9]: gps = pysat.Instrument('gnss', 'tec', tag='vtec', file_format='hdf5')                                                                                                                                               
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-9-06a989677482> in <module>
----> 1 gps = pysat.Instrument('gnss', 'tec', tag='vtec', file_format='hdf5')

~/Code/pysat/pysat/_instrument.py in __init__(self, platform, name, tag, inst_id, clean_level, update_files, pad, orbit_info, inst_module, multi_file_day, manual_org, directory_format, file_format, temporary_file_list, strict_time_flag, ignore_empty_files, units_label, name_label, notes_label, desc_label, plot_label, axis_label, scale_label, min_label, max_label, fill_label, **kwargs)
    252                 estr = 'file format set to default, supplied string must be '
    253                 estr = '{:s}iteratable [{:}]'.format(estr, self.file_format)
--> 254                 raise ValueError(estr)
    255 
    256         # set up empty data and metadata

ValueError: file format set to default, supplied string must be iteratable [hdf5]

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was using the wrong pysat branch. You had the correct link above. My bad. Trying again...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My test setup was good the first time. Clearly I got myself turned around a bit.

The test for 'hdf5' was done with the 'delimiter_bug` branch. I confirmed that I get the pop issue or the ValueError.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I apparently changed this bug in the following PR. Will update it here, too. Problems of finding too many related problems...

return


def download(date_array, tag='', inst_id='', data_path=None, user=None,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I downloaded a few days for testing but the files I got are labeled stuff.hdf. The load time is really quick though, suggesting they are actually netcdf files. Poking at the code the 'netcdf' path is indeed being run.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, regardless of the format you ask for and get, the remote files are listed as .hdf5. Not sure if I fixed the rename here or in the other PR.

pysatMadrigal/instruments/gnss_tec.py Outdated Show resolved Hide resolved
Copy link
Collaborator

@rstoneback rstoneback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forgot a couple consistency suggestions to go along with one of my inputs.

aburrell and others added 2 commits October 15, 2020 09:48
Several suggestions from code review, including code style and docstring typos.

Co-authored-by: Russell Stoneback <rstoneba@utdallas.edu>
@aburrell
Copy link
Member Author

This instrument will now be added with the merge of #20

@aburrell aburrell closed this Oct 15, 2020
@aburrell aburrell deleted the tec_inst branch December 18, 2020 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants