Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try to include invlib in linemixing #281

Merged
merged 13 commits into from
Jan 19, 2021

Conversation

riclarsson
Copy link
Contributor

This is code that should not be merged before the linker error is fixed and the actual best-fits are done well.

@simonpf
I am trying here to include your optimizations.h. It creates linker issues because of the inclusion of the C++ files in the headers, as you noted in our email-exchange.

I want to use this style of minimization to mimic python curve_fit so ARTS can compute the line mixing parameters Rosenkranz coefficients by itself from data. The examples are in linemixing.cc's linemixing_ecs_rosenkranz_adaptation. This is one of several best-fits that I would like to attempt. The specific ones for now are:

DPL:

T4:

(I might possibly want to add a few more from lineshapemodel.h later.)

@riclarsson
Copy link
Contributor Author

@simonpf Thank you! I have some questions because I have been unable to use the minimization-routines even though it now compiles.

In python, I can write the following:

from numpy import array, mean
import scipy as sp
import scipy.optimize

T = array([200., 205., 210., 215., 220., 225., 230., 235., 240., 245., 250., 255., 260., 265., 270., 275., 280., 285., 290., 295., 300., 305., 310., 315., 320., 325., 330., 335., 340., 345., 350.])

Y = array([-3.01073e-06, -3.11330e-06, -3.21125e-06, -3.30471e-06, -3.39379e-06, -3.47865e-06, -3.55942e-06, -3.63624e-06, -3.70925e-06, -3.77860e-06, -3.84440e-06, -3.90681e-06, -3.96596e-06, -4.02197e-06, -4.07498e-06, -4.12510e-06, -4.17245e-06, -4.21715e-06, -4.25932e-06, -4.29905e-06, -4.33645e-06, -4.37162e-06, -4.40466e-06, -4.43566e-06, -4.46471e-06, -4.49189e-06, -4.51729e-06, -4.54097e-06, -4.56303e-06, -4.58353e-06, -4.60253e-06])

def T4(T, X0, X1, X2):
    T0 = 296.0
    G = T0 / T
    return (X0 + X1 * (G - 1)) * G**X2

x, cov = sp.optimize.curve_fit(T4, T, Y, [mean(Y), -mean(Y)*1e-2, -0.75])

This will yield me a decent approximation of the best-fit.

Now using your code, I am supposed to call

template<typename Real, typename Vector, typename CostFunction, typename Minimizer>
int minimize( CostFunction &J,
              Minimizer &M,
              const Vector &x0,
              Vector       &xi,
              unsigned int max_iter,
              Real         tol )

as far as I can see. My short question is: How do I call this so that it is similar to the python case above?

So far, I have tried the following things:

  1. Create a T4 struct that
    a) Defines Eigen::Vector3d gradient(const Eigen::Vector3d&). The gradient is simply computed analytically and summed up over the values above.
    b) Defines Eigen::Matrix3d Hessian(const Eigen::Vector3d&). The Hessian is also simply computed analytically and summed up over the values above.
    c) Defines Numeric cost_function(const Eigen::Vector3d&, bool). This returns the standard deviation of Y minus the evaluated values. I have no idea what the bool is supposed to do here, but it is required by invlib::LevenbergMarquardt
    d) Defines Numeric criterion(const Eigen::Vector3d&, const Eigen::Vector3d&).
  2. Define a solver that works for Eigen-types. I first tried using your eigen.h interface but it requires files that don't exist.
  3. Tried and failed to define a dot-function that works with invlib::LevenbergMarquardt. I think this is what the eigen.h interface once did.

I can get the gradient descent Minimizer to work. As in, it gives gives less good results thanscipy but not bad results. The other methods don't work very well. I have iterated the above several times with different ways of computing the gradient and Hessian (g = J.T * (y - evaluated), and H=J.T * J being one of the options) but nothing works very well and the current method seems best but not good. I am therefore very unsure now how I am to use the minimize-function. Is there some documentation I am missing?

@riclarsson
Copy link
Contributor Author

@olemke This now updates Eigen. I didn't find any issues with the update other than several new long to int compiler warnings. Since I want to use an "unsupported" module, I also had to add some new files (prompting the update of version because I don't want them from different versions). I changed the layout of the Eigen folder so that the "unsupported" module is separated from the supported module on a folder level. Is this OK?

I also attach a working version of a cminpack "port" that I first implemented before finding Eigen had the fits. I think there were a few interface changes I had to do to get it working with Eigen, but I prefer the Eigen code because it is readable. I attach the cminpack-code anyways since it worked with pure matpack types in case there is a good reason to not use Eigen unsupported here. I prefer the Eigen interface.

minpack.zip

@olemke
Copy link
Member

olemke commented Jan 12, 2021

I'm all for sticking with Eigen. I see two different warning behaviours with the updated Eigen. As you mentioned GCC warns about the conversions. With Clang, I get lots of warnings inside Eigen where enum values are shadowed within the same class.
To get rid of the warning clutter, I propose to remove these two warning categories from the toplevel CMakelists.txt file:

ARTS_ADD_COMPILER_FLAG (Wshadow)
ARTS_ADD_COMPILER_FLAG (Wconversion)

@riclarsson
Copy link
Contributor Author

@olemke Is there really no way to just say that some files should ignore those warnings? Otherwise I will remove them.

@olemke
Copy link
Member

olemke commented Jan 13, 2021

You can put this pragma boilerplate around the places where Eigen gets included in ARTS:

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wshadow"
#pragma GCC diagnostic ignored "-Wconversion"
#pragma GCC diagnostic ignored "-Wfloat-conversion"
#include <Eigen/Dense>
#pragma GCC diagnostic pop

This will also fix the warnings in Clang.

@riclarsson
Copy link
Contributor Author

@olemke I went with the warnings removal. I also added similar tricks in CMake for the imgui and wigner libraries to remove their warnings (using PRIVATE, which I think means the compilation options are not forwarded).

@riclarsson riclarsson marked this pull request as ready for review January 18, 2021 12:43
@riclarsson
Copy link
Contributor Author

This can be merged if all checks pass.

Merging this closes #282. Even if I did not end up using invlib because I couldn't figure it out, Simon's fix was great as a start. I think that the curve_fit-routine I linked can be used by others so there is some benefits.

Merging this also closes #136 . The latter is more controversial to close because we are not getting exactly the same results as the original paper. However, what we get is reasonable and it should just be a matter of finding minor differences now. When the issue was opened, the results were vastly different. Now, there is at least one instrument that would benefit from merging this code (TEMPERA) as it would be the only way for them to combine low altitude/high pressure line mixing and high altitude/low pressure Zeeman effect. To see the outputs here, it is possible to look at test_gui because it shows the band at high pressure. When this is merged, I will update arts-xml-data/spectroscopy/cat with line data for the band, and also the lbl_check-function for helping the users so that it is easy to use this code.

Note: Merging this should open a new issue for anyone that is mathematically inclined. The constructor of EquivalentLines that are at this time on lines 13-41 requires analytical partial derivatives. I have been unable to figure this out. I can get the eigenvalues derivative to mostly work in a python example, but I do not know how it applies to the presented problem. Since I cannot solve this issue, all partial derivatives are computed by perturbation. This is ugly, but it is all I could do for now. I will not open a new issue because I am not sure github is where we intend to keep issues that should be solved but for which we have no solutions.

riclarsson and others added 10 commits January 19, 2021 10:41
Also add curve_fit similar to python scipy.optimize.curve_fit
but with a little bit different interface

This addition required adding the unsupported/ modules of Eigen
if this is not desired, I have a port of the underlying MinPack
code that accepts a mostly similar interface.  I prefer using the
Eigen interface since it is possible to read and understand instead
of relying heavily on pointers inside lapack/blas functions

I cannot figure out how to use invlib for the task of minimizing these
things, and it seems to not be too important to do so since the resulting
output is actually not very easy to use (the Rosenkranz paramters don't
seem to actually work very well when coming from ECS).  Still, the
method to confirm that is tedious without this code, so the code must be
available somehow...
These just makes the type easier to deduce for the compiler removing
a lot of compiler complaints from some code that I was writing while
debugging the line mixing.  It also renames some of the conversions
to be named explicitly after what they do (no longer "hitran 2 arts"
but "kaycm 2 hz") since there are plenty of non-hitran uses of these
conversions

For some unknown reason, this also changes slightly the output of one
of the tests that rely heavily on these conversions, but it did not
seem critical enough to change since the new output should be at least
as good as the old
using GCC pragmas

Also began adding Jacobian calculations to the ECS linemixing module
as it has to be done via perturbations before I figure out how to do
a derivative of a Eigenvalue system...
Note that this requires perturbations because otherwise it will
not work.  The main problem is I do not know how to compute
derivatives reliable of the constructor of EquivalentLines
web-browser for Doxygen.  This option is recommended for two
reasons:

1) Use this if you do not have LaTeX installed; or
2) Use this if you want to formulas look prettier in the HTML output

Using old method, PNGs were generated with all the formulae in the
doxygen comments ("\f[ ... \f]" for outline, and "\f$ ... \f$" for
inline formulae).

I hope this works because the old method generates a whole lot of
very long and cumbersome error output...
@riclarsson
Copy link
Contributor Author

@olemke This branch is now up-to-date with the recent merges. It can be merged.

The caveats here are:

  1. I update Eigen. This lead to so many compiler warnings that I disabled some for Eigen includes using your pragma solution.
  2. I add several (private) CMake compile options to disable some more warnings for 3rdparty libraries that compiles. This just hides the warnings. The main problem here is the addition of using JavaScript LaTeX instead of ghostscript. I don't know if some systems will have problems because of this. Currently on my main branch, the equations are not rendering for me so the system in place was not working. The new way uses MathJax, which is advertised as "it just works". You can actually copy-paste the LaTeX from the generated doxygen, so I think it is just working and is a good replacement.

The advantages are:

  1. New absorption routine that is as up-to-date you can be for O2 (but will still need some minor adjustments).
  2. New curve_fit algorithm that has a clear interface (I think and hope others will agree).
  3. Only Fortran and a weird variable tracking warning remains as compiler complaints. I think the former needs to be given an earlier Fortran standard to compile towards to disappear since it seems some of our 3rdparty libraries are using non-compliant modern Fortran. But I don't understand Fortran and its warnings. The latter needs some tracking number to increase for a single file. I am not sure that's a good idea, and I think it might go away if we could just constexpr-ify some step in generating WsvRecord (since that would keep the number of variables low).
  4. Simon made his invlib linkable from elsewhere in ARTS.

@olemke
Copy link
Member

olemke commented Jan 19, 2021

I will not open a new issue because I am not sure github is where we intend to keep issues that should be solved but for which we have no solutions.

Putting it in Github would at least make sure that it's not completely lost. A comment/FIXME about the problem in the code could also helpful.

-Wno-format-overflow is GCC-specific and triggers an 'unknown warning'
warning in Clang. It is now only added for GCC.

Merge target properties for disotest and cdisort into one statement.
@riclarsson
Copy link
Contributor Author

I'll add a comment in the code. I'll also add a github issue when it is merged. I will mark the issue with what combination of tags? I will put these two "enhancement" and "help wanted" for now.

riclarsson and others added 2 commits January 19, 2021 13:05
Unmerge targets because only disotest needs
ENABLE_ORIGINAL_OUTPUT_HANDLING
@olemke
Copy link
Member

olemke commented Jan 19, 2021

I'll add a comment in the code. I'll also add a github issue when it is merged. I will mark the issue with what combination of tags? I will put these two "enhancement" and "help wanted" for now.

Yes, those two fit.

@olemke
Copy link
Member

olemke commented Jan 19, 2021

The main problem here is the addition of using JavaScript LaTeX instead of ghostscript. I don't know if some systems will have problems because of this. Currently on my main branch, the equations are not rendering for me so the system in place was not working. The new way uses MathJax, which is advertised as "it just works". You can actually copy-paste the LaTeX from the generated doxygen, so I think it is just working and is a good replacement.

Very nice. The MathML formulas look really good and should nowadays be supported by all modern browsers.

This was linked to issues Jan 19, 2021
@olemke olemke mentioned this pull request Jan 19, 2021
@olemke olemke merged commit e2e32fa into atmtools:master Jan 19, 2021
@riclarsson riclarsson deleted the invlib_in_linemixing branch January 19, 2021 13:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Using optimization in other parts of ARTS ECS Model needs fixes
2 participants