Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

channel 4 BT to radiance conversion #57

Closed
carloshorn opened this issue Apr 15, 2020 · 42 comments · Fixed by #67
Closed

channel 4 BT to radiance conversion #57

carloshorn opened this issue Apr 15, 2020 · 42 comments · Fixed by #67

Comments

@carloshorn
Copy link
Collaborator

Dear pygac team,

My colleagues Marie Doutriaux Boucher and Oliver Sus at EUMETSAT encountered a potential mistake in the conversion from brightness temperature to radiance in pygac.

The critical part is the linear temperature transformation in the exponent. The temperature in the equation is an effective brightness temperature. In order to use the measured brightness temperature, it is possible to re-define the constants of the linear equation.

It appears like the coefficient A has changed sign in pygac, but has not been divided by B. Since B is close to one, the results will not change dramatically. However, the differences accumulate when transforming back and forth.

Abhay Devasthale is already double checking the coefficients and comparing to patmos-x.

@mraspaud
Copy link
Member

Thanks for reporting this! I'll let @abhaydd look into this first then.

@carloshorn
Copy link
Collaborator Author

Hi @mraspaud and @abhaydd,
I was asked to create a branch with the modified coefficients with respect to this issue. After looking into calibration.py, I would like to propose a change of the code.
According to a comment line after the long declaration of coeffs, all coefficients are taken from data files available at https://cimss.ssec.wisc.edu/patmosx/instrument_files/. I would suggest to remove the hard coded coefficients by a function that reads these data files and creates the dictionary coeffs. This has the advantage, that the code itself would not depend on the coefficients and allows the user to easily update/change them. Furthermore, it prevents copy-paste errors (e.g. #49 ) and would therefore facilitate the maintenance of the code.
Since this would move some responsibility to the user, the pygac software should somehow log the input. Do you think that the version tag at the bottom of these data files is sufficient?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 15, 2020

@carloshorn I understand your standpoint. But I think the code SHOULD depend on the coefficients. For climate applications, it is extremely important to be able to tag a particular version of PyGAC to a particular set of coefficients. It could easily happen that the files, from which we directly read the coefficients, will be misplaced or deleted by the user. Or it might get difficult to trace the history of the coefficients because the user might make some changes in those files later on, may they be deliberate or accidental. Also, which coefficient file format should PyGAC follow? What if the PATMOS-x changes its existing file format? So, in my opinion, the price to pay would be higher than trying to achieve some flexibility here. Maybe I am just missing something here or not understanding what you mean. Then please correct me. Thanks.

@carloshorn
Copy link
Collaborator Author

Hi @abhaydd,
Thanks for the quick reply. It is exactly the tagging of the input data which was also my concern.
In my opinion, reading these data files as an auxiliary input would be a similar step as reading the TLE files. In the case of the TLE files, it is already the responsibility of the user to get these files and to store them, or at least note the source for a reproducible processing campaign.
However, I see your point in trading a code dependency on the PATMOS-x format which is not under our control against the hard coded solution which is more error-prone in the implementation.
As a software developer, I would prefer that the user takes care of his files and provide some help to trace back the coefficients with some log messages or by providing a reference to the auxiliary data in the output product. As a user, it is of cause quite handy to have pygac as a single reference. However, at https://cimss.ssec.wisc.edu/patmosx/instrument_files/, I have seen that there are several versions of these coefficients. Do you think that there is a use case where someone wants to reprocess some files with old coefficients, but taking advantage of code features that were implemented in later releases of pygac? In this case, a control of these coefficients by the user via input files would also be desirable.
What do you think?

@mraspaud
Copy link
Member

How about having default coefs in the code but allow the user to replace it with a custom file, where a version number must be provided?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 15, 2020

@carloshorn I think your idea of reading calibration coefficients from the files (instead of having them hard-coded directly in the code) is good. But I just don't want PyGAC to depend on a particular file format to do that (e.g. PATMOS-x). If we were to have them in the separate files: 1) I would rather like to define our own format and let the users fill-in those coefficients. I agree that there will be a risk of making mistakes while copy-pasting, but the user has to bear then some responsibility. 2) And this next point is very important, each release of PyGAC should then also have the default calibration file tagged to it and stored on github as a part of that particular release. This is to ensure that a) we have a traceability for the climate applications and b) to help the users to use PyGAC without having to think about arranging those files. For example, when a satpy user calls pygac to just process/visualize few GAC orbits. In such low-demanding, but time-constrained cases, it would be too much to ask the user to define a calibration file. Having a default file would help.
Would something like this work from the software point of view?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 15, 2020

@mraspaud Ahhh...just saw your comment. Yes, pretty much what you said, but have everything in the files. User-defined as well as default ones.

@mraspaud
Copy link
Member

mraspaud commented Apr 15, 2020

Some ideas:

  1. About the versioning of the files: ideally they should come from a Git repository, so we could get the commit hash. Otherwise an md5sum of the file used, on top of version?
  2. File format: I don't think I really matters what we choose, but since the volume of data is small, something human readable would be good. Json?
  3. File formats: converting scripts could be created for other file formats.
  4. The pyspectral has quite a few calibration functions available already, would it make sense to delagate the pygac calibration to pyspectral? With the right coefficients obviously.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 16, 2020

I am not in a position nor do I have knowledge in this to comment your suggestions @mraspaud. I will only say this. We should try to keep it simple and avoid too many interdependencies. Would adding pyspectral make things easier or complex for a user? Can we guarantee that pyspectral in future will not diverge away in a fashion that would make dependency more complex (e.g. pyspectral may in future in turn depend on other additional packages that the user would have to think about)? To think about all of this just to read few cal-coefficients seems like a too much demand on a user. Or maybe nowadays anaconda will install and do everything smoothly. Don't know.

@mraspaud
Copy link
Member

pip and conda will take care of all the dependencies, so it's a non-issue imo. But it doesn't have to be linked to custom cal coefs, ie we can take this at another time.

@carloshorn
Copy link
Collaborator Author

I agree with @abhaydd that adding another dependency for reading a hand full of coefficients would make it more complicated. Especially, because the user would then need to reference the version of pyspectral for the coefficients which would break the benefit of a single reference.
I like the idea of using a simple human readable format to extract the data.
As a starting point, I attached a small python script to convert the data from the patmos-x files into some other format (.csv, .json).
store_coeffs.txt

@carloshorn
Copy link
Collaborator Author

Using

import json
with tar.with_suffix('.json').open('w') as json_file:
    json.dump(all_data, json_file, indent=4, sort_keys=True)

at the end of above script, the json file looks much prettier, e.g.

{
    "metopa": {
        "PRT1_0": "276.6194",
        "PRT1_1": "0.050919",
        "PRT1_2": "1.471e-06",
        "PRT1_3": "0.0",
        "PRT1_4": "0.0",
        "PRT2_0": "276.6511",
        "PRT2_1": "0.050892",
        "PRT2_2": "1.489e-06",
        "PRT2_3": "0.0",
        "PRT2_4": "0.0",
        "PRT3_0": "276.6597",
        "PRT3_1": "0.050845",
        "PRT3_2": "1.521e-06",
        "PRT3_3": "0.0",
        "PRT3_4": "0.0",
        "PRT4_0": "276.3685",
        "PRT4_1": "0.050992",
        "PRT4_2": "1.482e-06",
        ...

This is just an example, how the json file could look like. In pygac, I think we do not use all parameters defined in patmos-x files. Furthermore, pygac has a different naming convention for these parameters, e.g.

coeffs = {
    'metopb': {'ah': np.array([0.166, 0.183, 0.201]),
               'al': np.array([0.055, 0.061, 0.029]),
               'bh': np.array([2.019, 1.476, 1.748]),
               'bl': np.array([2.019, 1.476, 1.748]),
               'ch': np.array([-0.201, -0.137, -0.033]),
               'cl': np.array([-0.201, -0.137, -0.033]),
               'c_dark': np.array([39.70, 40.00, 40.30]),
               'c_s': np.array([501.12, 500.82, 501.32]),
               'l_date': 2012.77,
               ...

In any case, I would like to use some more self-explanatory names. Any suggestions or reference, where I could get some inspiration?

If you are fine with json, I would start with the implementation. Naming convention can be updated later on.

@mraspaud
Copy link
Member

I'm good with that. Should the first step be to write a script to export the pygac values to json ?

@carloshorn
Copy link
Collaborator Author

I started with a script that reads the patmos-x data, because the current pygac coefficients have been derived from them, see

pygac/pygac/calibration.py

Lines 408 to 409 in 4c458ba

"""Source Patmos-X Coeffs: Version Tag 'y2017r1_sbaf'
https://cimss.ssec.wisc.edu/patmosx/avhrr_cal.html"""

Now, I will write the mapping from patmos-x to pygac coefficients. This way, we avoid any typo issues or discrepancies to the original source. (I guess you have checked these numbers several times, but still another #49 could be hidden the hard coded values) .

Of cause, I will compare the outcome to the current values and let you know if I find something.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 20, 2020

Thanks a lot @carloshorn that you have started with this.

@carloshorn
Copy link
Collaborator Author

I did the comparison and found 16 deviating values, e.g. the last value of this arrays

'bh': np.array([0.609, 0.980, -0.016]),
'bl': np.array([0.609, 0.980, -0.016]),

should be 1.224 instead of -0.016 according to avhrr_2_instr.dat, second column of line with comment "! ch3a low gain S0,S1,S2" and the following line.

The following three scripts will reproduce my results:
store_coeffs.txt
Update of the above script with the translation to pygac name convention.
pygac_coeffs.txt
Simple pygac coeffs to json dump
compare_coeffs.txt
Comparison of both resulting json files. For a better comparison, I recommend to open the two json files with meld or any other diff editor.

@carloshorn
Copy link
Collaborator Author

@abhaydd, beside the patmos-x web page, I would also like to give a scientific paper as reference to these coefficients. Some papers are listed at the end of patmos-x Solar Calibration documentation, which one would you cite as a reference, or is there a more recent paper for the 2017 data?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 21, 2020

@carloshorn Thanks. I think the citation of Heidinger et al (2010) is still valid for the original methodology. I am not sure if they have published the most recent update in a peer-reviewed journal yet.

@carloshorn
Copy link
Collaborator Author

For the implementation, do you prefer to correct the 16 deviating coefficients in a separate step and compare/validate the results somehow, and in a later step change the pygac coefficient called 'a' according to the original issue and compare/validate again, or should I do both changes at once?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

@carloshorn I would suggest to wait till we evaluate and compare the existing calibration. Note that our version of coefficients may be different that what you see online there on patmos-x website. That doesn't mean it is incorrect.

@carloshorn
Copy link
Collaborator Author

I started with the implementation. The current version simply exports the current coefficients as they are into a json file.
If I use the json file directly created from patmos-x files, the test do not pass anymore. After taking a closer look, I noticed, that for example the field 'c_s' is not defined for all spacecrafts in the pygac coefficients, which however are possible to define from the patmos-x data.
This leads to a different program flow after the following line

if cal.c_s is not None:

@abhaydd, do you have time this week for a short skype or webex meeting to talk about the usage of these coefficients. I feel like the mapping from patmos-x to pygac coefficients is not as "linear" as expected and includes some switches or so.

@mraspaud, I did not start a PR yet, because I would like to know which flow you prefer. Do you like a separate PR for the export into json and another one for the change of coefficients, or is a single PR for both okay for you?

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

@carloshorn Shall we try Skype on Friday morning say at 9:30? I will be on leave the next full week.

@sfinkens
Copy link
Member

Coming a bit late to the discussion (again...) but I very much like the idea of separating software and coefficients. Two examples:

  • We sometimes want to update coefficients, but at the same time we want to keep the rest of pygac as it is. So I've been finding myself cherry-picking these calibration commits into the old code.
  • IIRC @helgaweb wanted to compare old vs new calibrations using satpy as "pygac interface". But the old version of pygac wasn't even satpy compatible. So she had the same problem.

Regarding the versioning: What if we create a DOI for each set of calibration coefficients? You can cite DOIs very easily, too.

@sfinkens
Copy link
Member

Oh and regarding pyspectral: @mraspaud was your idea to move the coefficients or the calibration methods to pyspectral?

@mraspaud
Copy link
Member

@sfinkens yes, DOI is good too. About pyspectral, I was thinking about the code, not the coeffs. But maybe pygac's calibration methods are too specific, I don't know.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

@sfinkens @mraspaud I am not sure we should be the ones creating DOIs for these coefficients. They are not ours. They officially belong to NOAA (Andrew Heidinger).

@helgaweb
Copy link

Hi,
@sfinkens - indeed :)
It would be very helpful to have a possibility to easily trace back the coeffs, which were applied.
I haven't followed the conversation - sorry for any ideas which may have come up earlier. I would even go further here and write the coeffs name and version in the netCDF header data, which could be done with a kind of library or DOIs. I’m not sure about the license/property rights of e.g. the PATMOS-x coeffs, as @abhaydd pointed out too. One of my colleagues created a python tool (kind of library) to pull VIS coeffs from different sources (@sfinkens you remember this), which might be interesting for this.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

I think all patmos-x level-2 files even contain the coefficients, as global attributes, that were applied to that orbit.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

There is so much potential to improve pygac and make it more user friendly. If we four groups (eumetsat, dwd, uni bern and smhi) could work together unofficially, it would lift pygac to another level. I will try to spend some time on it during the online pytroll workshop. Also we could in future tackle the LAC component of it.

@sfinkens
Copy link
Member

@carloshorn Your boss mentioned that he has frequent contact to the patmos-x developers. Do you think he could ask them whether they would be willing to create a DOI for their coefficients? If there is already a paper describing the methods, it's hopefully not that much work (of course we can offer help). Then we would be able to cite that DOI in the pygac FDR documentation.

@sfinkens
Copy link
Member

Regarding pyspectral: I would be ok with moving the methods to pyspectral (if they are not too specific - I don't know either). It's an extra dependency, but it runs in the pytroll family ;)

@rsgb-neuhaus
Copy link

Hi all, @helgaweb brought me here. I am the guy she referenced as the one who built a tool to calculate and inject VIS calibration coefficients. I developed the tool in C for the UniBe AVHRR processing chain, over the years a ton of different coefficient sources were added as well as lots of code because every source / project uses different notations and approaches. Some months ago @helgaweb approached me because of the PATMOS coefficients. I collected all available coefficient sets from the PATMOS project, cleaned up their data files such that no values were missing and the files could be automatically processed, created a CSV file, and finally built a sqlite database of it. As I am trying to get into python I developed a proof of concept tool to read the coefficients from the database, however the output does not take into account the sensor degradation. @carloshorn I think the CSV file and / or the sqlite database would help you immensely for a start as you can probably derive the json data easily from it.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 22, 2020

Welcome @rsgb-neuhaus 😊

@carloshorn
Copy link
Collaborator Author

carloshorn commented Apr 23, 2020

Hi all,
I started with #58 which should focus on the user interaction.
In a later step, we should make another PR focusing on the calibration itself dealing with the question of using pyspectral, dealing with the discrepancy between my attempt to map patmos-x data to pygac and the existing coeffs, and also dealing with the original question of this issue.
Welcome @rsgb-neuhaus sounds like you have a lot of experience with these coefficients. Maybe we can open a discussion board on slack.

@sfinkens
Copy link
Member

Hi @rsgb-neuhaus ! Interesting project! Are you planning to publish this coefficient database?

But I agree with @carloshorn that for the moment we should focus on moving the current coefficients to an external file with a well defined format. Once we have that sorted out, you would have a nice interface to pass any of the coefficient sets in your database to pygac.

@abhaydd
Copy link
Collaborator

abhaydd commented Apr 24, 2020

@carloshorn I am on skpye. Don't have your-id. Mine is: abhay.devasthale

@mraspaud
Copy link
Member

@abhaydd @carloshorn we're trying to organize a meeting on slack, check the #pygac channel

@rsgb-neuhaus
Copy link

Just started a project over at https://github.com/rsgb-neuhaus/viscal_db - time to teach an old svn dog some git tricks. I still need to figure out the license stuff for the scripts and of course the data; while I don't think that Andrew Heidinger as the author of the PATMOS coefficients will oppose to re-distributing the data I would like to get his permission first before I upload his data files.

@helgaweb
Copy link

@rsgb-neuhaus Well done documentation! And I especially like that you accounted for more than one set of PATMOS-x coeffs per year! Would have made my work much easier earlier this year ;)

@carloshorn
Copy link
Collaborator Author

Back to the original issue, I compared the coefficients of NOAA-15 in the code, the patmos-x coefficients and those from the KLM guide table (D.1-11) and found out, that there is actually a mistake in the pygac implementation.

The pygac code follows the steps from 7.1.2.4 in the KLM guide, which lists the linear transformation using these coefficients as equation (7.1.2.4-3), which is implemented here:

tsBB = cal.a[chan] + cal.b[chan] * tBB

The pygac coefficients coefficients for NOAA-15 are listed here:

pygac/pygac/calibration.py

Lines 300 to 303 in 2d7fbaf

'a': np.array([1.624481, 0.338243, 0.304856]),
'b': np.array([1.0 / 1.001989,
1.0 / 1.001283,
1.0 / 1.000977]),

where the values of 'b' evaluate to [0.99801495, 0.99871864, 0.99902395]

The patmos-x values are:

-1.624481   !a1_3b
1.001989    !a2_3b
-0.338243   !a1_4
1.001283    !a2_4
-0.304856   !a1_5
1.000977    !a2_5

which reveals that the pygac values for 'a' correspond to the negative patmos-x values of 'a1' and the pygac 'b' values correspond to the reciprocal values of the patmos-x values 'a2'.
However, the table in the KLM guide shows the following values

Channel A B
3B 1.621256 0.998015
4 0.337810 0.998719
5 0.304558 0.999024

The difference between the values in the KLM guide to the pygac values is exactly given by the division by 'b' as presumed at the beginning of this issue.

The mistake in pygac might have originated from a side note that I found in the KLM guide

In the previous version of this documentation, the A coefficients in Tables D.1-11 and D.2-12 were minus numbers and the B coefficients were slightly greater than one. They were used to convert radiance into equivalent blackbody temperature, and converted “effective” temperature TBB* into TBB, instead of the reverse way as shown in Equation 7.1.2.4-3.

@carloshorn
Copy link
Collaborator Author

As mentioned before, I will open the PR to correct this issue after finishing #58.

@mraspaud
Copy link
Member

@carloshorn thank you for your excellent detective work!

The pygac code follows the steps from 7.1.2.4 in the KLM guide, which lists the linear transformation using these coefficients as equation (7.1.2.4-3), which is implemented here:

Makes me think that it would be nice to have comments in the code referring to the equation numbers.

For the rest, I'll let @abhaydd give the final word, but it indeed looks suspicious.

This was referenced Jun 10, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants