Conflicting data in GOES dataset #25
The GOES dataset contains several entries where multiple different values are given for the same timestamp. For example, files [...] I did not do a thorough investigation, but I assume this is a result of adding the missing data. Which files should we assume to be true?

Edit: After investigating this slightly more, it appears that roughly 500k entries are affected. My guess is that the new data filling the data gap (#17) is somehow shifted, or comes from a different source than the old data. As each file covers ~2 months of data, and there were two gaps in the original data, the conflicting data occurs around the edges of the old gaps. However, this may also indicate that all of the data within the gaps is problematic. Depending on how the source dataset is created, you might need to re-create the entire dataset in one go to avoid this issue.
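For anyone who wants to check their own copy, here is a minimal sketch of how such conflicts could be counted. It assumes the data has already been loaded as `(timestamp, value)` pairs; the function name `find_conflicts` and the sample rows are hypothetical and not taken from the dataset or its tooling.

```python
from collections import defaultdict


def find_conflicts(rows):
    """Map each timestamp that has more than one distinct value to those values."""
    seen = defaultdict(set)
    for timestamp, value in rows:
        seen[timestamp].add(value)
    # Exact duplicates collapse in the set, so only true conflicts remain.
    return {ts: sorted(vals) for ts, vals in seen.items() if len(vals) > 1}


# Hypothetical sample rows, not real GOES measurements.
rows = [
    ("2010-01-01T00:00", 1.0),
    ("2010-01-01T00:01", 2.0),
    ("2010-01-01T00:01", 2.5),  # same timestamp, different value: a conflict
    ("2010-01-01T00:02", 3.0),
    ("2010-01-01T00:02", 3.0),  # exact duplicate: not a conflict
]

conflicts = find_conflicts(rows)
print(conflicts)
```

Exact duplicates are deliberately not reported here, since they can be dropped safely; only timestamps with differing values need a decision about which one to keep.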
Replies: 2 comments
Hi David -- Thank you for bringing this to our attention. We are working on cleaning up the affected GOES files, but in the meantime, it shouldn't matter which files you assume to be true. The different values are the result of the multiple GOES satellites from which this data is collected. The data originating from one satellite is not any more correct than data originating from another, even though their values may differ slightly for the same time frame.

If it would be helpful, we can provide more information about how these competing datasets were merged after we've resolved any duplicate timestamps within the affected files.
Update: The GOES data for the period between 2007 and 2013 was adjusted over the weekend to improve information density and establish a convention for handling competing data values from multiple GOES satellites. Each row and column is now populated independently to maximize the data available for each time step. This means that each row and column may contain values reported by one or multiple GOES payloads. For elements with multiple available values, higher-numbered satellites take precedence (e.g. data from g13 supersedes data from g10 if both are available for the same timestamp and data type).
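The precedence convention can be illustrated with a rough sketch. This is not the script used to adjust the files; it only shows the rule as described, assuming each record is a `(satellite, timestamp, values)` tuple in which missing measurements are `None`. The names `merge_goes`, `col_a` and `col_b` are hypothetical.

```python
def satellite_number(name):
    """Turn a satellite label such as 'g13' into the integer 13."""
    return int(name[1:])


def merge_goes(records):
    """Merge per-satellite records cell by cell.

    Each (timestamp, column) cell is filled independently. When several
    satellites report a value for the same cell, the higher-numbered one wins.
    """
    merged = {}
    # Process lower-numbered satellites first so that higher-numbered
    # satellites overwrite them wherever both have a value.
    for satellite, timestamp, values in sorted(
        records, key=lambda record: satellite_number(record[0])
    ):
        row = merged.setdefault(timestamp, {})
        for column, value in values.items():
            if value is not None:  # a missing value never overwrites real data
                row[column] = value
    return merged


# Hypothetical records, not real GOES measurements.
records = [
    ("g13", "t0", {"col_a": 1.3, "col_b": None}),
    ("g10", "t0", {"col_a": 1.0, "col_b": 2.0}),
    ("g10", "t1", {"col_a": 4.0, "col_b": None}),
]

merged = merge_goes(records)
print(merged)
```

In this example the row for `t0` takes `col_a` from g13 and `col_b` from g10, which is how a single row can end up containing values from more than one satellite.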
