Parallelized code #50

Closed
allegroLeiden wants to merge 16 commits into lime-rt:master from allegroLeiden:parallel

Conversation

@allegroLeiden
Contributor

There are a lot of commits here, but most of them are rearrangements to prepare the way for the final one, which allows the user to divide the bottleneck processing between several threads (via a new optional user-settable parameter 'nThreads').

The task now uses the OpenMP API, provided by the gomp library, to divide two sections of code between threads: (i) the solution of the radiative transport equations; (ii) raytracing to make the final image.
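
A minimal sketch of how a loop over grid points can be shared between threads with OpenMP. The function names, the `result` array and the per-point work are hypothetical stand-ins, not the actual LIME code; only the `nThreads` parameter name comes from the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-grid-point work (for example,
   solving the level populations at one point). */
static double solve_point(size_t id)
{
    return 2.0 * (double)id;
}

/* Process nPoints grid points with up to nThreads OpenMP threads.
   Each iteration writes only to its own slot of result[], so the
   iterations are independent. Built without -fopenmp, the pragma is
   ignored and the loop runs serially with identical results. */
void solve_all_points(double *result, size_t nPoints, int nThreads)
{
    long i; /* signed loop index: required by older OpenMP versions */

    assert(result != NULL && nThreads > 0);

    #pragma omp parallel for num_threads(nThreads) schedule(dynamic)
    for (i = 0; i < (long)nPoints; i++) {
        result[i] = solve_point((size_t)i);
    }
}
```

The same pattern would apply to a loop over image pixels in the raytracing stage, provided each iteration writes only to its own pixel.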

In the new module fast_exp.c there was a line to include ieee754.h, as a crude attempt to test for compliance with the standard
that the FastExp function requires (see issue lime-rt#49). However, it seems that Mac OS X doesn't provide this header file, so I
have simply deleted the include for now.

The single main loop over grid cells is broken into three, in preparation for the parallelisation. No change in run time or
output is detected.

The reason for this is to separate the pointers which the parallel threads need to write to from the remainder of the molData
attributes, which are read-only within the crucial code blocks. I put three of the attributes in question, 'jbar', 'phot' and
'vfac', into a separate structure, which is now declared, allocated, written to and freed within the lines of code which will
eventually be within the parallel block. The attribute 'ds' does not depend on species, so it was just declared as a pointer to
double outside of any struct. I also renamed it to halfFirstDs to be a bit more descriptive.

The run time and output values appear to be unaffected.
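
The separation described above can be sketched roughly as follows. The struct layouts, array lengths and function names are assumptions made for illustration; only the attribute names 'jbar', 'phot' and 'vfac' and the struct name gridPointData (suffixed here with `Sketch`) come from this pull request.

```c
#include <assert.h>
#include <stdlib.h>

/* Read-only inside the parallel block: safe to share between threads.
   (Hypothetical, heavily reduced version of molData.) */
struct molDataSketch {
    size_t nline; /* number of transitions for this species */
};

/* Written inside the parallel block: each thread owns its own instance. */
struct gridPointDataSketch {
    double *jbar;
    double *phot;
    double *vfac;
};

/* Release the scratch arrays and reset the pointers. */
void point_data_free(struct gridPointDataSketch *g)
{
    free(g->jbar);
    free(g->phot);
    free(g->vfac);
    g->jbar = g->phot = g->vfac = NULL;
}

/* Allocate the scratch arrays; returns 0 on success, -1 on failure.
   The array lengths are illustrative assumptions. */
int point_data_alloc(struct gridPointDataSketch *g,
                     const struct molDataSketch *m, size_t nPhot)
{
    assert(g != NULL && m != NULL);
    g->jbar = malloc(sizeof(*g->jbar) * m->nline);
    g->phot = malloc(sizeof(*g->phot) * m->nline * nPhot);
    g->vfac = malloc(sizeof(*g->vfac) * nPhot);
    if (!g->jbar || !g->phot || !g->vfac) {
        point_data_free(g);
        return -1;
    }
    return 0;
}
```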

This is done to make it easier later on to corral the thread-private stuff in a parallel block.

Again, this is done to prepare for the parallelisation, when we will need separate RNGs for the separate threads. Note that we
will still get repeatable results when the TEST flag in the makefile is set. The output is now slightly different from that of
the previous commit, because in effect different random seeds are in use both in the random selection of initial photon
directions and in the random dither of starting rays in the raytracing section. The present output, however, is expected to be
identical to that produced by single-thread running of the parallelized code.
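
One way to picture "separate RNGs for separate threads" with repeatable output is an array of generators, each seeded deterministically from a base seed. In this sketch a tiny xorshift64* generator stands in for the real RNG library, and every name (`rngSketch`, `make_thread_rngs`, `baseSeed`) is hypothetical; a fixed `baseSeed` plays the role of a TEST-style build.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Minimal xorshift64* generator, standing in for the real RNG library. */
struct rngSketch {
    uint64_t state;
};

/* Scramble the seed with a splitmix64 step so that neighbouring seeds
   give unrelated starting states; xorshift state must be non-zero. */
static void rng_seed(struct rngSketch *r, uint64_t seed)
{
    uint64_t z = seed + 0x9E3779B97F4A7C15ULL;
    z = (z ^ (z >> 30)) * 0xBF58476D1CE4E5B9ULL;
    z = (z ^ (z >> 27)) * 0x94D049BB133111EBULL;
    z ^= z >> 31;
    r->state = z ? z : 1;
}

/* Uniform deviate in [0, 1) built from the top 53 bits of the output. */
double rng_uniform(struct rngSketch *r)
{
    uint64_t x = r->state;
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    r->state = x;
    return (double)((x * 0x2545F4914F6CDD1DULL) >> 11) / 9007199254740992.0;
}

/* One generator per thread, each with its own deterministic seed.
   Returns NULL on allocation failure; the caller frees the array. */
struct rngSketch *make_thread_rngs(int nThreads, uint64_t baseSeed)
{
    struct rngSketch *rngs;
    int t;

    assert(nThreads > 0);
    rngs = malloc(sizeof(*rngs) * (size_t)nThreads);
    if (rngs == NULL)
        return NULL;
    for (t = 0; t < nThreads; t++)
        rng_seed(&rngs[t], baseSeed + (uint64_t)t);
    return rngs;
}
```

With a fixed `baseSeed`, thread `t` always draws the same sequence, so a run is repeatable as long as the assignment of work to threads is also fixed.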

Pointer expTau in photon() is now freed after use.

I had called it molDataPrivate; now it is gridPointData.

There were many pointers which were malloced via a statement of the form

  <name> = malloc(sizeof(<data type of pointer>) * <number of entries>);

This is not very conservative. Better is

  <name> = malloc(sizeof(*<name>) * <number of entries>);

I've changed this in numerous instances.
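
A small illustration of why the second form is safer, using a hypothetical `cellSketch` type that is not part of LIME:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical element type, for illustration only. */
struct cellSketch {
    double pos[3];
    int id;
};

/* Allocate n cells. The element size is taken from the pointer itself,
   so the call stays correct if the type of `cells` is changed later.
   The fragile alternative, malloc(sizeof(double) * n), would still
   compile after such a change but would allocate too little memory. */
struct cellSketch *alloc_cells(size_t n)
{
    struct cellSketch *cells;

    assert(n > 0);
    cells = malloc(sizeof(*cells) * n);
    return cells; /* NULL on failure; the caller frees */
}
```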

In molinit(), when the opacities file contains no entries, bail_out() was called without a following call to exit().
This is now fixed.

Also defined defaultNThreads in lime.h; par.nThreads is set to this if the user leaves it unset. (We really need to separate
user-settable parameters from task configuration information!)

In f07953a I introduced separate random number generators in preparation for running several threads in parallel. This scheme,
however, gave rise to a data race on the pointer 'ran'. Although this seemed to be harmless in practice, in the interests of
conservative programming an alternative scheme is now used which should avoid the data race.
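
The commit message does not spell out the alternative scheme. As a general illustration only, one common way to avoid a race on a pointer is to declare it inside the parallel construct, so that each thread gets its own copy. The function and variable names below are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Fill out[i] with i*i, using a small heap-allocated scratch value per
   iteration. A pointer declared before the pragma and assigned inside
   the loop would be shared by default, so threads could overwrite each
   other's pointer value. Declaring it inside the loop body (or listing
   it in a private() clause) gives each thread its own copy. */
void fill_squares(double *out, int n, int nThreads)
{
    int i;

    assert(out != NULL && nThreads > 0);

    #pragma omp parallel for num_threads(nThreads)
    for (i = 0; i < n; i++) {
        double *scratch = malloc(sizeof(*scratch)); /* thread-private */
        if (scratch == NULL) {
            out[i] = -1.0; /* mark failure; break is not allowed here */
            continue;
        }
        *scratch = (double)i * (double)i;
        out[i] = *scratch;
        free(scratch);
    }
}
```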

Detailed changes:
  - Changed the definition in the Makefile of CC to gcc -fopenmp;
  - Added some tests on _OPENMP to lime.h;
  - Added the necessary OMP pragmas in aux.c, stateq.c and raytrace.c;
  - Added a new function greetings_parallel() in curses.c (not yet used).
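
The actual tests on `_OPENMP` in lime.h are not shown here. A typical pattern, sketched under that caveat, is to include `omp.h` only when the compiler defines `_OPENMP` (as `gcc -fopenmp` does) and to provide serial fallbacks otherwise. The function `threads_in_use` is a hypothetical helper for illustration.

```c
#include <assert.h>

#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the same source builds without -fopenmp. */
#define omp_get_num_threads()  1
#define omp_set_num_threads(n) ((void)(n))
#endif

/* Request nThreads threads and report how many are actually running
   inside a parallel region (always 1 in a serial build, where the
   pragmas are ignored). */
int threads_in_use(int nThreads)
{
    int count = 1;

    assert(nThreads > 0);
    omp_set_num_threads(nThreads);

    #pragma omp parallel
    {
        #pragma omp single
        count = omp_get_num_threads();
    }
    return count;
}
```

The runtime may grant fewer threads than requested, so the returned count is an upper-bounded observation rather than a guarantee.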

Running with 4 threads was observed to decrease run time by about a factor of 2.5. The output shows small differences, which are
probably due to a data race on the values of some grid-point-specific quantities. These differences do not appear to be
significant.
@smaret smaret added this to the Release 1.5 milestone Jul 15, 2015
@smaret smaret mentioned this pull request Jul 15, 2015
@smaret
Contributor

smaret commented Jul 15, 2015

Superseded by #56.

@smaret smaret closed this Jul 15, 2015
@allegroLeiden allegroLeiden deleted the parallel branch July 23, 2015 09:28