ENH: General Madrigal instrument #67

aburrell · 2022-02-28T19:09:49Z

Description

Partially addresses #1 by adding general support for pandas-supportable madrigal data sets.

Type of change

New feature (non-breaking change which adds functionality or documentation)
This change requires a documentation update

How Has This Been Tested?

Added new unit tests and examples:

import datetime as dt

import pysat
import pysatMadrigal as py_mad

# Download DMSP data from Madrigal
dmsp_abi = pysat.Instrument(inst_module=py_mad.instruments.madrigal_pandas,
                            tag='180', kindat='17110')
dmsp_abi.download(dt.datetime(2015, 12, 30), dt.datetime(2015, 12, 31),
                  user='Firstname+Lastname', password='email@address.com')
dmsp_abi.load(date=dt.datetime(2015, 12, 30))

Test Configuration

Operating system: OS X Big Sur
Version number: Python 3.8 and 3.9
Any details about your local setup that are relevant: develop branch of pysat

Checklist:

Updated the JRO docstring and file header comments.

Removed the old Madrigal template, which was actually a broken implementation of a general instrument.

Updated the JRO ISR download docstring, removing the unnecessary "+" formatting suggestion.

Added a function to specify Madrigal instrument codes, grouping them by pandas-compatible or xarray-compatible.

Updated the general unit tests by:
- adding a new unit test for the new known Madrigal instrument code function, and
- updating tests for `_check_madrigal_params` to capture new, better evaluation.

Added a general instrument sub-module for time-series Madrigal data.

Updated the instrument init by:
- breaking out the imports to different lines, and
- adding the new instrument.

Updated the changelog with a summary of the enhancements so far.

Added a function to return the known madrigal file formats and test these formats for pysat parsing compatibility. Also updated docstrings and local comments.

Updated the GNSS TEC, JRO ISR, and pandas general instruments to use the new general file format function. Also updated the pandas general list methods to handle kindat and year formatting. Improved the pandas general instrument by only including Madrigal instrument codes with parsable file formats and adding download test support for one of the possible instruments.

Added unit tests for `known_madrigal_inst_codes` and updated the version evaluations to use the packaging module.

Added the packaging module as a dependency. It was already required by other modules, so is not adding additional overhead.

Added a description of the general pandas instrument to the docs.

Updated the changelog to include the new improvements and cycled the year for the next release.

Fixed test method names that weren't updated when copying from an older test.

rstoneback

Thanks @aburrell! When following your example the software downloads data for Jan 1, 2015. I tried to sort out error location but so far it looks like it is coming from Madrigal (?) 🤷

Long list of filename formats you added here!

rstoneback · 2022-03-01T15:22:54Z

pysatMadrigal/instruments/madrigal_pandas.py

+    # If the kindat (madrigal tag) is not known, advise user
+    self.kindat = kindat
+    if self.kindat == '':
+        logger.warning('`inst_id` did not supply KINDAT, all will be returned.')


When I registered pysatMadrigal modules I got warnings from this line.

In [4]: pysat.utils.registry.register_by_module(pysatMadrigal.instruments)
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.
pysat WARNING: `inst_id` did not supply KINDAT, all will be returned.

I haven't traced the source.

That warning is unavoidable for the general instrument, but important to raise. Basically, it means you could potentially be mixing data sets that should be in separate Instruments.
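To make the "mixing data sets" concern concrete, here is a minimal sketch of how records with different kindat values could be separated after loading. The record values and the `split_by_kindat` helper are made up for illustration and are not part of pysatMadrigal.

```python
from collections import defaultdict

# Illustrative (time string, kindat) pairs; the values are invented
records = [
    ("2015-01-01 00:00:53", 17110),
    ("2015-01-01 00:13:05", 17110),
    ("2015-01-01 00:14:41", 17111),
]


def split_by_kindat(recs):
    """Group record times by their kindat value (hypothetical helper)."""
    groups = defaultdict(list)
    for time_str, kindat in recs:
        groups[kindat].append(time_str)
    return dict(groups)


groups = split_by_kindat(records)

# More than one key means data of different kinds were loaded together
if len(groups) > 1:
    print("mixed kindat values:", sorted(groups))
```

Supplying a specific kindat when creating the Instrument, as in the example in the PR description, avoids the need for this kind of after-the-fact separation.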

rstoneback · 2022-03-01T15:32:27Z

pysatMadrigal/instruments/methods/general.py

+    # Warn if file format has multiple '*' wildcards
+    num_wc = len(fstr.split("*"))
+    if num_wc >= 3:
+        msg = "".join(["file format string has multiple '*' ",


Also got this,

In [1]: import pysat

In [2]: import pysatMadrigal as py_mad
pysat WARNING: file format string has multiple '*' wildcards, may not be parsable by pysat
pysat WARNING: file format string has '*' between formatting constraints, may not be parsable by pysat

Now that one is confusing. No idea why that one appears on module import.

So, this comes from the Jicamarca init. I think I can fix this by adding a "verbose" flag to the function.

This should be fixed now.
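For readers following this thread, a rough sketch of how a "verbose" flag can silence format-string warnings at import time. The function name, signature, and the regular expression for the second check are assumptions for illustration; only the `split("*")` test is taken from the diff above.

```python
import logging
import re

logger = logging.getLogger("madrigal_format_sketch")


def check_file_format(fstr, verbose=True):
    """Return warning messages for a file format string (hypothetical helper)."""
    msgs = []

    # Splitting on '*' yields one more piece than there are wildcards,
    # so three or more pieces means two or more wildcards
    if len(fstr.split("*")) >= 3:
        msgs.append("file format string has multiple '*' wildcards")

    # Assumed check: a '*' that sits between two '{...}' fields
    if re.search(r"\}[^{}]*\*[^{}]*\{", fstr):
        msgs.append(
            "file format string has '*' between formatting constraints")

    # Only log when asked to, so a module-level call can stay quiet
    if verbose:
        for msg in msgs:
            logger.warning(msg)

    return msgs
```

A module that builds its format string at import time could call this with `verbose=False`, while user-facing calls keep the default and still see the warnings.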

rstoneback · 2022-03-01T15:51:36Z

Data does load for Jan 1,

In [5]: dmsp_abi.load(2015, 1)
pysat WARNING: The generalized Madrigal data Instrument can't support instrument-specific cleaning.
<ipython-input-5-065c33fa990e>:1: DeprecationWarning: Meta now contains a class for global metadata (MetaHeader). Default attachment of global attributes to Instrument will be Deprecated in pysat 3.2.0+. Set `use_header=True` to remove this warning.
  dmsp_abi.load(2015, 1)

In [6]: dmsp_abi.data
Out[6]: 
                     year  month  day  hour  min  sec  recno  kindat  kinst      ut1_unix      ut2_unix  epowf        mlt  gdlat   glon  cgm_lat  cgm_long  eqb_qc_fl  eqb_prv_kp  eqb_prb_kp  sat_id   epowq
2015-01-01 00:00:53  2015      1    1     0    0   53      0   17110    180  1.420070e+09  1.420070e+09   62.9   4.070833  -50.7   80.4    -62.9     135.0        2.0        2.24       0.907      17  2171.0
2015-01-01 00:13:05  2015      1    1     0   13    5      1   17110    180  1.420071e+09  1.420071e+09   64.1  20.991389  -80.2  -27.0    -67.9      26.2        1.0        1.58       0.972      17  2171.0
2015-01-01 00:14:41  2015      1    1     0   14   41      2   17110    180  1.420071e+09  1.420071e+09   63.6   6.001944  -51.2  100.3    -65.8     161.0        2.0        1.87       0.928      18  2181.0
2015-01-01 00:27:21  2015      1    1     0   27   21      3   17110    180  1.420072e+09  1.420072e+09   64.6  21.463333  -79.2  -18.2    -67.6      30.3        1.0        1.28       0.992      18  2181.0
2015-01-01 00:35:49  2015      1    1     0   35   49      4   17110    180  1.420073e+09  1.420073e+09   63.0   4.343611  -51.3   78.2    -63.0     131.6        2.0        2.21       0.907      19  2191.0
.

aburrell · 2022-03-01T18:05:45Z

> Long list of filename formats you added here!

Was super not happy to need to do that, but couldn't find another solution.

rstoneback · 2022-03-01T19:11:16Z

> > Long list of filename formats you added here!
>
> Was super not happy to need to do that, but couldn't find another solution.

Makes sense. Thanks for the bummer repetitive work!

pysatMadrigal/instruments/madrigal_pandas.py

Updated the general madrigal instrument example.

pysatMadrigal/instruments/madrigal_pandas.py

Updated the test day to be different, and hopefully work.

Added a verbosity flag to the general Madrigal file format function, implemented this flag in the instruments that use the function, and added unit tests for its success.

Set the general pandas instrument download tests to True for all tags.

Expanded ranges of values that indicate all kindat options should be considered.

Allows Madrigal to store HDF5 files using either the 'h5' or 'hdf5' extensions. Also changed the date for the testing to comply with the file available for tag 7800.

Use presence of '*' to choose the correct file parsing method.

Updated test dates for current known potential instruments.
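As an illustration of the 'h5' or 'hdf5' extension handling mentioned above, a minimal sketch follows. The helper name is hypothetical, and the case-insensitive comparison is an assumption rather than something stated in the commit.

```python
import os


def is_hdf5_file(fname):
    """Return True if the filename ends in '.h5' or '.hdf5' (hypothetical helper)."""
    # splitext keeps only the final extension, so names with several
    # dots, such as 'name.001.hdf5', still resolve to '.hdf5'
    ext = os.path.splitext(fname)[1].lower()
    return ext in (".h5", ".hdf5")
```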

aburrell · 2022-04-13T12:54:35Z

These tests are passing locally, but require the file format parsing changes available on the 'develop' branch of pysat. Because of this, I am adding a version cap for pysat and putting this merge on hold until the next pysat release.

Fixed the logic for evaluating the presence or absence of a wildcard.

Examine file format to provide delimiter, if needed.

Add a pysat version cap for this pull request.

Removed try/except catch that is now handled in the general list_remote_files function.

aburrell · 2022-07-22T21:44:38Z

@rstoneback pinging for a re-review. Looks like the tests are probably going to pass given how long they've taken.

rstoneback · 2022-07-24T14:30:23Z

xarray released a new version on Friday (https://twitter.com/xarray_dev/status/1550522574672523266), which breaks pysat tests and functionality.

Bumped the lowest supported pysat version due to an xarray issue.

rstoneback · 2022-07-29T14:26:28Z

The madrigal_pandas download isn't working. Getting an internal server error. Rerunning the tests didn't help.

aburrell · 2022-07-29T19:57:13Z

Hmm... was able to do a download in ipython. I'm going to try re-running just one job and see if it's perhaps the parallel requests that's an issue.

aburrell · 2022-07-29T20:34:21Z

Well, that didn't work. @rstoneback , what happens if you run the test suite locally?

rstoneback · 2022-07-29T22:38:55Z

Getting at least 1 failure locally so far. inst_dict26. When tests finish I'll decode that.

rstoneback · 2022-08-01T18:12:57Z

In total there were 29 test failures locally. madrigal_pandas was the Instrument with the download error. Some 500 Internal Server Errors and a variety of error message check issues.

rstoneback · 2022-08-01T18:55:16Z

> In total there were 29 test failures locally. madrigal_pandas was the Instrument with the download error. Some 500 Internal Server Errors and a variety of error message check issues.

I unfortunately did not run these tests in my normal Python environment. Rerunning in the correct one.

rstoneback · 2022-08-01T22:40:06Z

In the correct environment I get only 2 errors, 1 for download and 1 for the remote_file_list, both for madrigal_pandas or inst_dict26.

aburrell · 2022-08-02T12:15:36Z

Can you see what the tag is?

rstoneback · 2022-08-02T15:33:54Z

> Can you see what the tag is?

8105

rstoneback · 2022-08-02T16:35:37Z

pysatMadrigal/instruments/madrigal_pandas.py

+tag_dates = {'120': dt.datetime(1963, 11, 27), '170': dt.datetime(1998, 7, 1),
+             '180': dt.datetime(2000, 1, 1), '210': dt.datetime(1950, 1, 1),
+             '211': dt.datetime(1978, 1, 1), '212': dt.datetime(1957, 1, 1),
+             '7800': dt.datetime(2009, 11, 10), '8105': dt.datetime(2017, 9, 1)}


Suggested change

-             '7800': dt.datetime(2009, 11, 10), '8105': dt.datetime(2017, 9, 1)}
+             '7800': dt.datetime(2009, 11, 10), '8105': dt.datetime(2017, 9, 7)}

There is data for Sept 7-11 on Madrigal. I can download data for Sept 7th, but to then load that data I have to load for Sept 1. The filename is van_allen_2017_09.001.hdf5 which looks like it only has year and month in the filename.....

In [19]: inst.download(
    ...:     dt.datetime(2017, 9, 7), user=Redacted, password=Redacted
    ...: )

In [20]: inst.load(date=dt.datetime(2017, 9, 7))
<ipython-input-20-18f7eb017f3c>:1: DeprecationWarning: Meta now contains a class for global metadata (MetaHeader). Default attachment of global attributes to Instrument will be Deprecated in pysat 3.2.0+. Set `use_header=True` in this load call or on Instrument instantiation to remove this warning.
  inst.load(date=dt.datetime(2017, 9, 7))

In [21]: inst.data
Out[21]:
Empty DataFrame
Columns: []
Index: []

In [22]: inst.files.files
Out[22]:
2017-09-01    van_allen_2017_09.001.hdf5
dtype: object

In [23]: inst.load(date=dt.datetime(2017, 9, 1))
pysat WARNING: The generalized Madrigal data Instrument can't support instrument-specific cleaning.
<ipython-input-23-146bd78005dd>:1: DeprecationWarning: Meta now contains a class for global metadata (MetaHeader). Default attachment of global attributes to Instrument will be Deprecated in pysat 3.2.0+. Set `use_header=True` in this load call or on Instrument instantiation to remove this warning.
  inst.load(date=dt.datetime(2017, 9, 1))

In [24]: inst.data
Out[24]:
                     year  month  day  hour  min  sec  recno  kindat  kinst      ut1_unix      ut2_unix  ch_energy  lpeak_orb_el_flux  mid_grad_el_flux  plasmapause
2017-09-07 00:02:00  2017      9    7     0    2    0      0   10305   8105  1.504743e+09  1.504743e+09  2600000.0          11.563559               NaN          NaN
2017-09-07 04:54:00  2017      9    7     4   54    0      1   10305   8105  1.504760e+09  1.504760e+09  2600000.0          11.241525               NaN          NaN
2017-09-07 05:54:00  2017      9    7     5   54    0      2   10305   8105  1.504764e+09  1.504764e+09  2600000.0                NaN               NaN     4.690049
2017-09-07 09:05:00  2017      9    7     9    5    0      3   10305   8105  1.504775e+09  1.504775e+09  2600000.0          10.631356               NaN          NaN
2017-09-07 09:44:00  2017      9    7     9   44    0      4   10305   8105  1.504777e+09  1.504777e+09  2600000.0                NaN          4.037392          NaN
...                   ...    ...  ...   ...  ...  ...    ...     ...    ...           ...           ...        ...                ...               ...          ...
2017-09-11 15:40:00  2017      9   11    15   40    0     70   10305   8105  1.505144e+09  1.505144e+09  2600000.0                NaN               NaN     3.500927
2017-09-11 16:33:00  2017      9   11    16   33    0     71   10305   8105  1.505148e+09  1.505148e+09  2600000.0          12.190678               NaN          NaN
2017-09-11 18:05:00  2017      9   11    18    5    0     72   10305   8105  1.505153e+09  1.505153e+09  2600000.0                NaN               NaN     3.775340
2017-09-11 19:30:00  2017      9   11    19   30    0     73   10305   8105  1.505158e+09  1.505158e+09  2600000.0                NaN          2.992274          NaN
2017-09-11 20:02:00  2017      9   11    20    2    0     74   10305   8105  1.505160e+09  1.505160e+09  2600000.0          12.258475               NaN          NaN

[75 rows x 15 columns]

This looks like it'd need a special download routine to adjust the filename. I guess we should remove this from the general instrument 🐕‍🦺

Yeah.... there are also only 6 days total of data on the site.

Not our highest priority 🐦
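The mismatch discussed in this thread can be sketched with the standard library alone. The regular expression and the `file_index_date` helper are assumptions for illustration; they only mimic the idea that a filename carrying just a year and month can be indexed by the first of that month and nothing finer.

```python
import datetime as dt
import re

# Assumed pattern for names like 'van_allen_2017_09.001.hdf5'
MONTHLY_RE = re.compile(r"van_allen_(\d{4})_(\d{2})\.(\d{3})\.hdf5$")


def file_index_date(fname):
    """Return the date a year-and-month filename can be indexed by."""
    match = MONTHLY_RE.search(fname)
    if match is None:
        raise ValueError("unexpected filename: {:}".format(fname))
    year, month = int(match.group(1)), int(match.group(2))
    # No day in the name, so the first of the month is the only choice
    return dt.datetime(year, month, 1)


index_date = file_index_date("van_allen_2017_09.001.hdf5")
requested = dt.datetime(2017, 9, 7)

# The file is listed under Sept 1, so a request for Sept 7 finds no file
print(index_date, requested, index_date == requested)
```

This is why loading Sept 7 returned an empty DataFrame while loading Sept 1 returned the Sept 7-11 data: the file is listed under the first of the month regardless of which days it contains.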

Removed the Van Allen probe data because the file handling requires additional support.

rstoneback

🚀

aburrell and others added 16 commits January 6, 2022 10:44

DOC: improved jro docstring

2a88e05

Updated the JRO docstring and file header comments.

DEL: removed old template

3326d33

Removed the old Madrigal template, which was actually a broken implementation of a general instrument.

DOC: updated download docstring

60197dd

Updated the JRO ISR download docstring, removing the unnecessary "+" formatting suggestion.

ENH: added Madrigal inst_code function

e4961fc

Added a function to specify Madrigal instrument codes, grouping them by pandas-compatible or xarray-compatible.

TST: updated general tests

e30d0e0

Updated the general unit tests by:
- adding a new unit test for the new known Madrigal instrument code function, and
- updating tests for `_check_madrigal_params` to capture new, better evaluation.

ENH: added a general madrigal inst

717713b

Added a general instrument sub-module for time-series Madrigal data.

ENH: updated instrument init

ae8f0b2

Updated the instrument init by:
- breaking out the imports to different lines, and
- adding the new instrument.

DOC: updated changelog

7c0e163

Updated the changelog with a summary of the enhancements so far.

ENH: added madrigal file format function

975c2e1

Added a function to return the known madrigal file formats and test these formats for pysat parsing compatibility. Also updated docstrings and local comments.

TST: added known_madrigal_inst_codes tests

0566cba

Added unit tests for `known_madrigal_inst_codes` and updated the version evaluations to use the packaging module.

MAINT: added packaging

c47d9d9

Added the packaging module as a dependency. It was already required by other modules, so is not adding additional overhead.

DOC: updated supported instruments

2d1bbc4

Added a description of the general pandas instrument to the docs.

DOC: updated changelog

601ab15

Updated the changelog to include the new improvements and cycled the year for the next release.

BUG: fixed test method names

51986d7

Fixed test method names that weren't updated when copying from an older test.

Merge branch 'develop' into general_madrigal_inst

7985813

aburrell mentioned this pull request Feb 28, 2022

Generalised Madrigal Instrument #1

Open

aburrell requested a review from rstoneback February 28, 2022 19:13

rstoneback reviewed Mar 1, 2022

STY: Updated feedback.

c780112

aburrell commented Mar 1, 2022

DOC: updated example

d87d035

Updated the general madrigal instrument example.

aburrell commented Mar 2, 2022

BUG: update test day

4e892af

Updated the test day to be different, and hopefully work.

aburrell added this to the 0.1.0 Release milestone Mar 17, 2022

aburrell added the enhancement New feature or request label Mar 17, 2022

ENH: added verbose flag

820a8a0

Added a verbosity flag to the general Madrigal file format function, implemented this flag in the instruments that use the function, and added unit tests for its success.

aburrell added 5 commits April 6, 2022 10:58

TST: set general download True

d77ec0c

Set the general pandas instrument download tests to True for all tags.

BUG: fixed kindat handling

3af6f71

Expanded ranges of values that indicate all kindat options should be considered.

BUG: added handling for 'h5' extension

4336552

Allows Madrigal to store HDF5 files using either the 'h5' or 'hdf5' extensions. Also changed the date for the testing to comply with the file available for tag 7800.

BUG: fixed file parsing

757c7cc

Use presence of '*' to choose the correct file parsing method.

BUG: updated test dates

7e5746d

Updated test dates for current known potential instruments.

aburrell and others added 6 commits April 13, 2022 11:26

BUG: fixed logic

8db4725

Fixed the logic for evaluating the presence or absence of a wildcard.

BUG: fixed delimiter bug in list_files

9fc9e09

Examine file format to provide delimiter, if needed.

MAINT: add pysat version cap

2d25046

Add a pysat version cap for this pull request.

Merge branch 'develop' into general_madrigal_inst

6ce1e74

STY: removed try/except statement

9e87e8a

Removed try/except catch that is now handled in the general list_remote_files function.

Merge branch 'develop' into general_madrigal_inst

f412a3b

MAINT: bumped pysat version

e7f5272

Bumped the lowest supported pysat version due to an xarray issue.

rstoneback reviewed Aug 2, 2022

BUG: removed '8105' tag

9ce2721

Removed the Van Allen probe data because the file handling requires additional support.

aburrell requested a review from rstoneback August 2, 2022 20:58

rstoneback approved these changes Aug 3, 2022

aburrell merged commit 25001ad into develop Aug 3, 2022

aburrell deleted the general_madrigal_inst branch August 3, 2022 12:19
