Fix #125 and #149#153
Conversation
This has changed the format of binary output population files, because previously value or norm and norminv was saved in the files. Also removed unused function sourceFunc from sourcefunc.c.
|
Apparently the old gcc version (at least with its default settings) used by travis-ci doesn't like the inlining. |
Fixed path length projection, half-index offset and doubling of the effect of the local cell. The fixes required redoing the whole velospline thing, which is now replaced with a piecewise linear approximation and erf lookup. The new lineshape calculation also does the velocity projection correctly.
Optimisations to critical loops and inlining of sourceFunc_cont and _line and FastExp.
|
I'm going to make some further changes to this, probably including fix to #154. |
|
I've updated this PR to include fix to #154 (and some stupid bugs in earlier commits...). It would now be very good if someone(s) had a look - I'd be surprised if I got everything right the first time. Some very preliminary benchmarking (#146 and others) indicates that this PR significantly reduces the disagreement between LIME and another code that I've been using. This PR includes some optimisations, including inlining of some functions. The speed gain is not huge but not negligible either, maybe 10 to 15 percent. However, inlining needs a compiler with C99 support (which is why the PR fails travis-ci checks). |
|
I'll try to look at it tomorrow. |
Previous commit had some non-working test code that wasn't meant to be committed.
We now use a more recent version of GCC that supports inlining.
This should be fixed with tlunttil/lime#1 |
Removed inlining of FastExp and sourceFunc_line and _cont. This slowed down the example model runtime from 1m35s to 1m55s on my laptop (with -f -n -p 8).
Fix travis build
|
I'm back from my vacation, should I look this over now? |
|
Yes, please. There may well be all sorts of strange things going on there. I meant to do further testing/thinking last week, but all sorts of things happened and I didn't have time for that in the end. I've removed the inlining for now, but we'll need to make some decision about what C language features are allowed in the code. |
For the time being I am afraid I vote pretty firmly for ANSI only. The reason for this is that my primary project or task involving Lime is to add facilities to allow it to be called from CASA. The latter does not seem to ship with its own version of gcc (thank heavens), but for various reasons we run CASA locally on machines with RHEL6. The gcc version on these RHEL6 machines is 4.4.7, which does not seem to support C99 fully, at least by default. Until I clarify the requirements for doing what I am tasked with doing, I would say, let's wait with C99. As soon as I can get a green board with this, then by all means, let's use these features; but I am just not at this point yet. |
|
Ok. We can always revert to my earlier commit with inlining when the time is ripe. There are some things in the code (at least variable length arrays in a couple of places and erf-function to construct the look-up table) that weren't officially added until in C99, but were supported by gcc by default even in old(ish) versioins. I guess these are fine? |
|
So long as it compiles on my RH6 machines, I don't mind. |
|
The changes fall into two groups, and I have different opinions about them.
I fully support 1 but only partially support 2. My reasoning in more detail is as follows. 2 breaks into 2 sub-parts: 2a) The way the projected speed v as a function of distance is approximated. 2b) The numerical integration scheme employed. For 2a you have gone from the old polynomial interpolation to piecewise linear; for 2b you have gone from rectangle-rule to error function. I like the change rectangle->erf but (at least for the present) I question whether we should move away from the polynomial scheme. The advantage of the latter was perhaps not so much the high order approximation of v, as the adaptive (in a crude fashion) generation of sample intervals. The variation of velocity (and thus line shape) between grid points seems to me to be at present the major bottleneck of approximation in Lime and thus I think we need to tread carefully before we make it any cruder. Indeed I think some movement towards more generic expression of various routines, controlled by global variables/macros, e.g. the polynomial order, even though this will probably slow things down, would be a good thing, because by playing with the various quantities (polynomial order, fineness of sampling etc) we can find out which are important and which not. There is I think a deal of such refinement left to do with Lime and we should be cautious about hard-wiring routines until we know which values are optimal. Thus what I would suggest would be a re-write of velospline.c:getVelocities() to bring it back similar to the present (1.5) scheme, but with (i) a flexible number of coefficients, ruled by a macro NUM_VEL_COEFFS, and (ii) storage of a polynomial for each cartesian coordinate; then recast photon.c:calcLineAmpPWLin() (what does 'PW' refer to, by the way?) so that it performs the erf integrations as now, but preserves the old adaptive sampling scheme, and obtains its linear approximations for v(x) from the polynomial scheme. I can PR this after the acceptance of your PR as-is if you like. Detailed comments are as follows.
and loop to FAST_EXP_MAX_TAYLOR at main.c:32 instead of hard-wiring the value 9.
where where where Making the change of variable yields I don't guarantee that there are no mistakes in my working here, but this is the correct line to pursue. (This is the single item which it is necessary to fix before the PR can be accepted - all the other suggestions are optional or simply discussion items for further work.)
|
piecewise, duh |
|
I'll need to look again at what I did with integration... Regarding the polynomial scheme: isn't the adaptiveness in some sense false accuracy, because the interpolating polynomial is fixed? There's no reason to think that sampling the polynomial more densely provides a better estimate of the true velocity. Answers to your other questions (in the same order):
|
True, but that is not really the point of the adaptive scheme. The problem is that where the velocity varies quickly (regardless of how accurately we track its precise changes), so will the line amplitude, in a highly nonlinear fashion. The adapative scheme doesn't really care about the accuracy, it just adapts to rapid changes in v by choosing finer intervals. I am advocating for retaining the polynomial scheme with (in the medium term) an easily-settable number of orders, but I would actually be surprised if it turned out that terms greater than order 2 or so made much difference. I do think we ought to find this out, though. |
Memory is terrible - I am ashamed to admit that I forgot that one could use |
I would vote 'no'. |
|
So would I, if for no other reason than easier merging... |
|
I have now merged the latest master branch and included a fix to #176. Both of these provided ample opportunities for mistakes, so another set of eyes looking at this would be good (I'm mainly thinking of @allegroLeiden and @ParfenovS ). I banished the spline/interpolation stuff, so the new erf-method is the only one remaining. Therefore the results are not identical to previous ones even with -DTEST, but at least the code compiles and the results in example model don't look hugely strange. As for #176, there seemed to be a strange case of 'premature integration' of all blended lines separately. I'm not sure what the idea here was - I think the old version was clearly wrong. But somehow I never noticed it until @ParfenovS pointed it out. This version is maybe ~20% slower than the previous one, mainly because of in/out splitting. There are many things that could be optimised, but I think it's better to leave all performance fine-tuning to a separate PR after the functional PRs are done. |
at lines 403 and 404 in photon.c (getjbar function) should be removed. |
|
Right, they were leftovers from the old version. I'll fix that and any other bugs once you have gone through the changes. |
|
expTau[iline] and tau[iline] are not increasing in photon.c One can move the following line: |
|
Oops. I was very careless in copy-pasting code from one place to another. We wouldn't even necessarily need the tau[] array anymore, because everything except maser check is done with expTau and it would be easy to change that, too. |
|
getjbar uses both |
|
vfac_loc comes from a pure Gaussian (that is, the local profile at the grid point), while vfac includes the effect of velocity gradient in the Voronoi cell containing the grid point (which is needed for calculating the change in intensity inside that cell). So I think it's needed, though the effect is probably very small in realistic cases. Though I think that the test at the beginning of the outermost loop in getjbar should be against, vfac_loc, not vfac. |
|
Ok. I see. |
|
I didn't find any other problems. |
|
Thanks for your very detailed work @ParfenovS . I'll fix the bugs you found and maybe add a few more comments in the code to explain what is being done and why. |
Fixed bugs. Changed large struct parameters to sourceFunc_line to pointers. Removed unneeded 'tau' array from photon().
|
I pushed a new version with bugs fixed (I hope). I also made the code a bit more readable by replacing magic numbers with things defined in the header file and removed the 'tau' array from photon() that wasn't really necessary. Although I wrote that I wouldn't do any optimisations in this PR, I decided to do one thing: calls to This PR should take care of issues #124, #125, #149, #154, and #176. |
|
It's funny, because I thought the |
|
I had done it in an earlier commit, but then it got reverted when I merged in #152. |
|
Ok, I've had a look through and could not see anything that did not look kosher. Image looks ok. I'll merge it. BTW I am thinking we could probably use the erfs also in raytrace in place of |
|
Yeah, I thought about the raytrace, too. It's perhaps not something to go for in 1.7, I think there are several more serious issues to take care of first. |
This PR tries to fix issues #125 and #149. I had to do some rather major changes to various parts of photon.c; most importantly, I changed the whole line profile/velocity integration.
This PR also includes some optimisations that had a small (but not insignificant) effect on running time.
There are still some things about blended lines that are not done quite correctly.