Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Visualization improvements #752

Merged

Conversation

gassmoeller
Copy link
Member

This hopefully addresses #749. @spco could you give this branch a try on your cluster? The new default behaviour should be more stable than before.
Unfortunately testing everything is quite hard, because tests will not catch if a file was written in a background thread, or in a temporary location first. So before merging we should give this a thorough manual testing.
Also reviewing the diffs is a pain, because the logic was very complicated (and still is, although its simpler now). I found it easier to just look at the old state and the new state separately.
Somewhat unrelated changes that occur in this pull request:

  • previously we supported writing in the background for file formats other than vtu and hdf5 (e.g. gnuplot). I found it easier and more readable to separate this into another branch of the if-loop and did not reimplement this feature. I guess that is fine, since nobody should use an ascii output for models so large that they require writing in a temporary storage and/or in a background thread.
  • previously we wrote .pvtu,. pvd and ,.visit files even for other output formats (except hdf5). Was this a glitch? They did not seem to work in Visit or Paraview anyway. I removed this behaviour.

@bangerth
Copy link
Contributor

Something seems amiss with the tests, and it appears to be output related. Can you take a look?

@bangerth
Copy link
Contributor

Specifically, test results are here: http://cdash.kyomu.43-1.org/viewTest.php?onlyfailed&buildid=1118

@@ -5,6 +5,22 @@
* 1.3. All entries are signed with the names of the author. </p>
*
* <ol>
* <li> Changed: ASPECT by default wrote one output file per process that
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you say "per MPI process" to make it clear what "process" refers to here?

@tjhei
Copy link
Member

tjhei commented Feb 15, 2016

Some tests expect the .visit/.pvtu even though they use gnuplot format.

@tjhei
Copy link
Member

tjhei commented Feb 15, 2016

I would like to propose we completely remove the background writing thing. If the runs are small is doesn't matter and if they are big you shouldn't write to an NFS anyway.

@gassmoeller
Copy link
Member Author

I dont know, the clusters that I used most a few years back only had NFS and local temp folders. I think in that configuration it made sense to write to local first and move to NFS in the background. Or are you proposing to simply remove the background thread, but keep the possibility to write to tmp first?

The last commit should fix a remaining bug and the tests (I removed the .visit/.pvtus).

@spco
Copy link
Contributor

spco commented Feb 15, 2016

I've run this on my cluster split over 6 nodes, total 17 cores. All seems to be working this end :)

@bangerth
Copy link
Contributor

I suppose that a lot of smaller clusters only have NFS, and that people run a lot of jobs on them with 100 or 200 cores. For these, I don't think that the current strategy is inherently a bad idea.

@tjhei
Copy link
Member

tjhei commented Feb 15, 2016

If you feel strongly about keeping it and it is not on by default (tmp and copy), I am okay with leaving it in. How does MPI IO with grouping behave on a standard NFS system in comparison to writing and copying hundreds of files per time step?

@bangerth
Copy link
Contributor

I must admit I don't know. On a different question, using MPI IO yields only one file -- does anyone have experience what that means for visualizing it in parallel? Does every one of potentially 64 or more cores have to read the entire VTU file?

@tjhei
Copy link
Member

tjhei commented Feb 15, 2016

You can pick how many files you want to generate.

@jperryhouts
Copy link
Contributor

My cluster uses NFS (file access is notoriously slow on this cluster)

$ time ls > /dev/null

real    0m2.195s
user    0m0.001s
sys 0m0.007s
  • that's only on a fresh login, I guess the second time it runs some of that information is cached

But anyway, I'm all for putting file access stuff in the background whenever possible.

@gassmoeller
Copy link
Member Author

@jperryhouts, would you have an hour to check @tjhei's question? How does a job with lets say 50 processes and visualization output for every timestep perform on your cluster with the current master for 'set Number of grouped files = 0' vs. 'set Number of grouped files = 1'? I currently do not have access to such a nice NFS cluster as yours 😄.
Anyway, even if we do not settle the question which is the best solution, do we agree that this PR improves the situation and removes crashes for a number of users (so we should consider just merging and thinking about further optimization afterwards)?

@jperryhouts
Copy link
Contributor

I won't have time to test it this week, but I can do it on Monday.

And yeah, go ahead with this PR, I didn't mean to hold that up, just wanted to point out that some of us are still computing as if we were in the stone age, and thus might need the legacy functionality. :)

@gassmoeller gassmoeller force-pushed the visualization_improvements branch 2 times, most recently from 8ac94e6 to 0bb8df9 Compare February 19, 2016 17:48
@gassmoeller
Copy link
Member Author

Ok, thanks. And I was not implying your cluster was slow, I hope you did not take that personally 😉. When I was still in Potsdam I was often computing on a cluster, which had a terribly unstable NFS, so that models often froze or crashed, although the computing performance was ok. That was what pushed me to applying for computing time on an outside cluster for the first time.
Unfortunately that kind of behaviour would not help us benchmark the output time here.

I rebased the branch and when the tester is finished this might be ready for merge, unless somebody wants to give it another round of review.

"On large clusters it can be advantageous to first write the "
"output to a temporary file on a local file system and later "
"move this file to a network file system. If this variable is "
"set it will be interpreted as a temporary storage location.");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...is set TO A NONEMPTY STRING, it will...

@bangerth
Copy link
Contributor

The patch is indeed hard to read, but to the best of my abilities I think it looks good. Can you squash and merge yourself?

(One question, though: the patch removes a bunch of .pvtu/.vtu/.visit files. What's up with that? Is this unrelated and an accident?)

@gassmoeller
Copy link
Member Author

No it is related and by purpose. So far we have written .pvtu, .visit and .pvd files even when writing .gnuplot or other output formats. These files do not work, and I did not see a point in keeping this behaviour so I removed this output in the code (they are now only written for .vtu output) and had to remove the test output as well.

@bangerth
Copy link
Contributor

Ah, ok, thanks for the explanation. I figured you had already explained that somewhere above, but I didn't quite get what was happening.

@bangerth
Copy link
Contributor

As I said, please squash and merge.

Fix parsing of parameter

Add entry to changes.h

Minor improvements

Fix comment

Minor fixes

Fix grouped output and tests

Improve documentation
@tjhei
Copy link
Member

tjhei commented Feb 19, 2016

I am late to the party, but I just looked over it again: 👍

gassmoeller added a commit that referenced this pull request Feb 19, 2016
@gassmoeller gassmoeller merged commit 7d21570 into geodynamics:master Feb 19, 2016
@gassmoeller gassmoeller deleted the visualization_improvements branch April 8, 2016 14:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants