Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revising the noise calculation in FITACF3.0 #420

Merged
merged 4 commits into from Jun 19, 2021
Merged

Conversation

pasha-ponomarenko
Copy link
Contributor

This pull request deals with the problem described in issue #419 when very high (Inf) SNR values sporadically occurr across all range gates in BOREALIS data processed with FITACF3 (note that those are absent if you process the same data with FITACF2.5). This effect is regularly observed in CLY data:

20210203 1600 00 cly a v3

Testing of the bug fix is very simple: run make_fit from this branch on the respective rawacf file and plot it:

      make_fit -fitacf-version 3.0 20210203.1600.00.cly.a.rawacf > 20210203.1600.00.cly.a.v3_m.fitacf
          
      time_plot -png -km -frang 0 -erang 3500 -a -phi -vmax 100 -vmin -100 -b 5 -st 16:00 -ex 02:00 20210203.1600.00.cly.a.v3_m.fitacf > 20210203.1600.00.cly.a.v3_m.png 

The vertical stripes should disappear:

20210203 1600 00 cly a v3_m

Copy link

@ecbland ecbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pasha-ponomarenko On slide 7 of your notes in #419 you said the following:

A sanity check flagging zero skynoise values can be added here as well.

This sounds like a good idea. Do you want to add it?


Also, could you please add a new entry under "modifications" at the top of the file?

Modifications:
2020-03-11 Marina Schmidt (SuperDARN Canada) removed all defined
constants and included rmath.h

@pasha-ponomarenko
Copy link
Contributor Author

pasha-ponomarenko commented Jun 15, 2021

I just want to outline the problem background in a bit more detail so that we can find a suitable solution.
We have a significant portion of a historic dataset from which ACFs with lag 0 power below a certain (still unknown) threshold were dropped.
Currently, the data used for the mean noise calculation are extracted from the bottom third of the sorted lag 0 power values. If the average of non-zero values in this subset is below 1(???), the search noise level is used instead.
I can only guess about the logic behind this arbitrary selection procedure. It looks like the code developers anticipated that if too many ACFs are excluded form analysis due to their low power, then the noise estimate becomes unreliable, and they designed an alternative way to calculate it based on the “search” noise. According to the comments in the modification history, this procedure required some power level adjustment for the “search” noise to become compatible with the “sky” noise.
Based on the above, we have some issues to deal with:
(1) The removal of the low-power ACFs leads to overestimating the background noise level.
I am not sure if there is a proper way to compensate for it as an unknown amount of information about the low-power "tail" of the noise distribution is irreversibly lost.
(2) The use of the “clear frequency search” data as a substitute is rather questionable due to the way this parameter is evaluated. As follows form observations it might be just incompatible with the "normal" noise estimates.
(3) It is hard to understand why the replacement would be triggered at a non-zero mean noise value of one (1) rather than zero (0).
With this respect I have a strategic question: what is the reproducibility policy with respect to different versions of FITACF?

I would prefer to get rid of yet another ad hoc procedure: to select the “weakest” lag 0 powers from the whole set of non-zero values and to abandon using the “search noise” as a substitute.
While it will lead to overestimating the noise level/underestimating SNR in the old data,
I am pretty sure that this effect should be minimal. However, to be on the safe side, we would just need to look at a couple of examples with "dropped" ACFs.
Your thoughts?

@ecbland
Copy link

ecbland commented Jun 15, 2021

@pasha-ponomarenko With this respect I have a strategic question: what is the reproducibility policy with respect to different versions of FITACF?

My view is that we should avoid making modifications to FITACF2.5 that would have any impact on the data pre-selection or the fitted parameters. I think small "self-contained" changes are ok if there is a good reason to make them (e.g. #411), but nothing more significant than that. So I'd recommend that we review the noise calculation in FITACF3.0 only. We can rename it to FITACF3.1 if that seems appropriate.

Two general notes about reproducibility:

  • Changes to the analysis software that may affect the scientific interpretation of the data should be discussed with the PIs. Changing the noise calculation may fall into this category (depending on what effect it has on the data)
  • Each RST release has a DOI, so authors can specify which version was used to process their data so that their work is reproducible.

@pasha-ponomarenko I would prefer to get rid of yet another ad hoc procedure: to select the “weakest” lag 0 powers from the whole set of non-zero values and to abandon using the “search noise” as a substitute.

Completely agree...the search noise is not a good substitute for the "sky noise" -- If anyone is interested, I have some fun examples buried in my notes 😜

@ecbland
Copy link

ecbland commented Jun 15, 2021

Does anyone know when the radars stopped discarding ACFs below the threshold?

@ecbland ecbland changed the title Update preprocessing.c Revising the noise calculation in FITACF3.0 Jun 15, 2021
@pasha-ponomarenko
Copy link
Contributor Author

Adding for reference the clear frequency search code. I have got it from Dieter few years ago:
fclrgc214.txt
Do we have a repository for this sort of documents?

@ecbland
Copy link

ecbland commented Jun 15, 2021

@pasha-ponomarenko Adding for reference the clear frequency search code...Do we have a repository for this sort of documents?

Yes, there is a separate ROS archive on Github: https://github.com/SuperDARN/ros-archive

https://github.com/SuperDARN/ros-archive/blob/ros.1.25/qnx/code/src.lib/radarqnx4/fclrgc214.1.15/src/fclrgc214.c

@pasha-ponomarenko
Copy link
Contributor Author

pasha-ponomarenko commented Jun 18, 2021

@ecbland, I made a quick fix (now in lne 840 of preporcessing.c) that performs the skynoise replacement by the search noise only if the search noise value is non-zero. In this way we preserve the old analysis output but resolve the "streaks" issue for BOREALIS data (and any other data that have zero search noise values).
This is a temporary measure, and this issue should be resolved in a more fundamental fashion in the next release.
Please check my syntax :-)

@ecbland
Copy link

ecbland commented Jun 18, 2021

Thanks @pasha-ponomarenko! I'll take a look ASAP. There are indeed some other radars that haven't used the clear frequency search at one point or another, so I will test with them as well.

Copy link

@ecbland ecbland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this for a few radars. I didn't find any non-BOREALIS examples where this change would matter, but it is possible there are examples out there in the historical data. In any case, this PR fixes the divide by zero problem.

@pasha-ponomarenko
Copy link
Contributor Author

@ecbland, yes these modificaitons do make sense! Please go ahead and implement those changes!

@ecbland ecbland merged commit 32bef98 into develop Jun 19, 2021
@ecbland ecbland deleted the borealis_snr_streaks branch June 19, 2021 14:01
@ecbland ecbland mentioned this pull request Jun 21, 2021
11 tasks
@ecbland
Copy link

ecbland commented Aug 7, 2021

I'm adding another example here to answer a specific question from a PI about the Clyde River data on 29 May 2021.

The plots below show that this PR solves the issue with the vertical stripes in BOREALIS data.

  1. fitacf v3.0 from RST4.5
  2. fitacf v3.0 from RST4.6 (which includes the fix)
  3. fitacf v2.5 provided for comparison

20210529 cly fitacf3_RST4 5
20210529 cly fitacf3_RST4 6
20210529 cly fitacf2

pasha-ponomarenko added a commit to pasha-ponomarenko/backscatter that referenced this pull request Aug 9, 2021
Resolve an issue causing infinite SNR vlaues in BOREALIS data with noise level < 1. For more detail see SuperDARN/rst#420 (comment).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Very high SNR value sporadically occurring across all range gates in BOREALIS data processed with FITACF3
2 participants