Add I/O for masked Traces #49

malcolmw · 2018-05-19T21:08:59Z

This pull request is for a branch that I use to work with masked Traces. Masked values are filled with a fill value automatically determined based on the data array's dtype at write-time, which is stored in the Dataset's attributes and later used to reconstruct the mask at read-time.
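For readers who want the gist of the round trip described above, here is a minimal sketch. It is not the code in this branch: the rule in `choose_fill_value` (smallest representable value of the dtype) is an assumption, the function names are hypothetical, and a plain dict stands in for the Dataset's attributes.

```python
import numpy as np


def choose_fill_value(dtype):
    """Hypothetical rule: pick the smallest representable value of the dtype."""
    dtype = np.dtype(dtype)
    if np.issubdtype(dtype, np.integer):
        return np.iinfo(dtype).min
    if np.issubdtype(dtype, np.floating):
        return np.finfo(dtype).min
    raise TypeError("unsupported dtype: {}".format(dtype))


def write_masked(data):
    """Return the filled array plus the attributes needed to rebuild the mask."""
    fill = choose_fill_value(data.dtype)
    # The dict is a stand-in for the Dataset's attributes.
    return data.filled(fill), {"fill_value": fill}


def read_masked(stored, attrs):
    """Rebuild the mask by comparing every sample against the stored fill value."""
    return np.ma.masked_equal(stored, attrs["fill_value"])


trace_data = np.ma.masked_array(
    np.array([1.0, 2.0, 3.0, 4.0]), mask=[False, True, True, False]
)
stored, attrs = write_masked(trace_data)
restored = read_masked(stored, attrs)
```

Only the fill value travels with the data, so the on-disk layout stays a single array plus one attribute.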

Make backwards compatible with older ASDF versions I/O of masked arrays of float/integer data. Ignore some test files.

Make backwards compatible with older ASDF versions I/O of masked arrays of float/integer data. Ignore some test files. Fix broken identity test

coveralls · 2018-05-19T23:30:11Z

Coverage increased (+0.1%) to 89.162% when pulling c6db4d2 on malcolmw:dev/masked_trace_IO into fb88b35 on SeismicData:master.

krischer

Heyhey,

first of all I'm really sorry for taking so long!

I like this and the implementation looks solid to me.

There is one point I'd like to discuss a bit: Instead of relying on a "magic" mask value, how about storing the actual mask as an attribute?

With some trickery it could be made so it takes 1 bit per array item. It would be a bit slow but as we are talking about masked arrays I'm not too worried here.
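One way the 1-bit-per-item idea could look, as a rough sketch only: `np.packbits` and `np.unpackbits` are real NumPy functions, while `pack_mask` and `unpack_mask` are hypothetical helper names. The original length has to be stored alongside the packed bytes because packing pads to a multiple of 8.

```python
import numpy as np


def pack_mask(mask):
    """Pack a 1-D boolean mask into bytes (8 samples per byte)."""
    mask = np.asarray(mask, dtype=bool)
    return np.packbits(mask), mask.size


def unpack_mask(packed, n_samples):
    """Undo pack_mask; trailing padding bits are cut off."""
    return np.unpackbits(packed)[:n_samples].astype(bool)


data = np.ma.masked_array(np.arange(10), mask=[0, 0, 1, 1, 1, 0, 0, 0, 0, 1])
# getmaskarray always returns a full boolean array, even if nothing is masked.
packed, n_samples = pack_mask(np.ma.getmaskarray(data))
recovered = unpack_mask(packed, n_samples)
```

Ten samples fit in two bytes here, versus ten bytes for a plain boolean array.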

Aside from that, the currently chosen mask value is machine dependent - probably not relevant for the platforms we really care about, but it's a bit of a sore spot.

For unsigned integer arrays the currently chosen mask value would be zero which does not really work. Also for lowish precision integers the risk for false positives might be a bit too large - for floating points I agree that this would not be an issue.
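A small illustration of the unsigned case, assuming the fill value is the dtype minimum (that rule is an assumption for this sketch, chosen because it yields zero for unsigned types as described above):

```python
import numpy as np

# Two genuine zero samples and one masked sample.
data = np.ma.masked_array(
    np.array([0, 5, 7, 0], dtype=np.uint16),
    mask=[False, False, True, False],
)

fill = np.iinfo(data.dtype).min  # 0 for every unsigned integer dtype
stored = data.filled(fill)

# Reading back cannot tell real zeros from filled samples.
restored = np.ma.masked_equal(stored, fill)
n_false_positives = int(restored.mask.sum() - data.mask.sum())
```

The two real zeros come back masked, so the round trip silently loses valid samples.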

All in all, I think that using an actual mask array would solve these issues at the expense of storage size and performance.

krischer · 2019-09-24T12:34:48Z

Hi @malcolmw

Is there still interest in following up on this?

malcolmw · 2019-09-24T15:07:52Z

Hi, @krischer,

This is not a priority for me anymore, so feel free to close this PR if it isn't a significant value-adding feature for pyASDF. However, if you think it is a useful feature you would like to merge to facilitate work with long segments of potentially-gappy continuous data, I am happy to help out.

krischer · 2019-09-24T15:24:18Z

I do actually think that this would be a very nice addition to the data format. As pointed out in a comment above, I'd prefer the data model of actually carrying along a second mask data set (it could be named the same as the actual dataset, just prefixed with __MASK__ or so).

This would mirror, to a certain extent, how numpy's masked arrays work, and it could also be properly integrated into the format. I could take care of adding it to the format definition and the validator if there is still interest in implementing this.
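A rough sketch of what the companion-dataset layout might look like. Everything here is illustrative: `write_trace` and `read_trace` are hypothetical names, the `__MASK__` prefix is only the suggestion from this comment and not part of the ASDF definition, the dataset name is made up, and a plain dict stands in for the HDF5 group that would hold the waveforms.

```python
import numpy as np

MASK_PREFIX = "__MASK__"  # suggested prefix, not part of the format definition


def write_trace(group, name, data):
    """Store the samples and, only if needed, a boolean mask next to them."""
    data = np.ma.asarray(data)
    # Masked samples get an arbitrary placeholder; the mask dataset is authoritative.
    group[name] = data.filled(0)
    if np.ma.is_masked(data):
        group[MASK_PREFIX + name] = np.ma.getmaskarray(data)


def read_trace(group, name):
    """Return a masked array if a companion mask exists, else a plain array."""
    values = np.asarray(group[name])
    mask_name = MASK_PREFIX + name
    if mask_name in group:
        return np.ma.masked_array(values, mask=np.asarray(group[mask_name]))
    return values


group = {}  # stand-in for an HDF5 group
gappy = np.ma.masked_array(
    np.array([0, 5, 7, 0], dtype=np.uint16), mask=[False, False, True, False]
)
write_trace(group, "XX.STA..BHZ", gappy)
write_trace(group, "XX.STA..BHN", np.array([1, 2, 3], dtype=np.uint16))

restored = read_trace(group, "XX.STA..BHZ")
plain = read_trace(group, "XX.STA..BHN")
```

Because the mask is explicit, genuine zeros in unsigned data survive the round trip, and unmasked traces carry no extra dataset.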

malcolmw · 2019-09-30T15:02:51Z

Sounds good. I'm happy to implement, though it will be a while before I get to this.

Malcolm White added 2 commits May 19, 2018 13:26

Add I/O for masked Traces

bed89ce

Make backwards compatible with older ASDF versions I/O of masked arrays of float/integer data. Ignore some test files.

Add I/O for masked Traces

6d590f1

Make backwards compatible with older ASDF versions I/O of masked arrays of float/integer data. Ignore some test files. Fix broken identity test

malcolmw force-pushed the dev/masked_trace_IO branch from bed89ce to 6d590f1 · May 19, 2018 22:54

Fix broken test

b0751e5

Add test for masked Trace I/O

c6db4d2

krischer reviewed Sep 11, 2018

malcolmw mentioned this pull request Sep 12, 2018

Add HDF5 RegionReferences for efficiently windowing waveform segments from continuous data #48

Closed

krischer mentioned this pull request Oct 28, 2019

Slow performance when adding gappy data to ASDF #57

Open
