Corrupt xvgs check slows down gromacs parser and makes unnecessary backups #81

nathanmlim · 2016-02-25T00:09:01Z

The newly added check for corrupt xvgs slows down the gromacs parser quite a bit. This may be related to the use of numpy.genfromtxt, though I have not confirmed that. It might be faster to iterate over the lines and flag any line whose number of elements differs from the rest.
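A minimal sketch of the line-by-line idea, not the parser's actual code. The function name `find_corrupt_lines` is hypothetical, and it assumes that a "corrupt" line is one whose field count differs from the most common field count in the file. Lines starting with `#` or `@` (xvg comments and plot directives) are skipped.

```python
from collections import Counter


def find_corrupt_lines(lines):
    """Return indices of data lines whose field count differs from the most common one."""
    widths = {}
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip blank lines, '#' comments and '@' plot directives.
        if not stripped or stripped[0] in "#@":
            continue
        widths[i] = len(stripped.split())
    if not widths:
        return []
    # Assumption: the majority of data lines are intact, so the most
    # common field count is the expected one.
    expected = Counter(widths.values()).most_common(1)[0][0]
    return [i for i, n in widths.items() if n != expected]
```

This only catches lines with the wrong number of fields. A line with the right number of fields but a garbled number in it would pass unnoticed.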

In the gromacs parser, before the call to removeCorruptLines(), I had it back up the xvgs to a separate folder. This is unnecessary when the xvgs contain no corrupt lines. Instead, it should back up only the offending xvgs.
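Roughly what I have in mind, as a sketch only. `has_corrupt_lines` and `backup_if_corrupt` are hypothetical names, not functions in the parser, and the check assumes a corrupt line is one whose field count differs from the first data line.

```python
import os
import shutil


def has_corrupt_lines(path):
    """Return True if any data line has a different field count than the first data line."""
    expected = None
    with open(path) as fh:
        for line in fh:
            stripped = line.strip()
            # Skip blank lines, '#' comments and '@' plot directives.
            if not stripped or stripped[0] in "#@":
                continue
            n = len(stripped.split())
            if expected is None:
                expected = n
            elif n != expected:
                return True
    return False


def backup_if_corrupt(path, backup_dir):
    """Copy path into backup_dir only if it looks corrupt; return the copy's path or None."""
    if not has_corrupt_lines(path):
        return None
    # The backup folder is created lazily, so clean runs leave nothing behind.
    os.makedirs(backup_dir, exist_ok=True)
    return shutil.copy2(path, backup_dir)
```

With this shape, a run over clean xvgs never touches the disk beyond reading the files.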

davidlmobley · 2016-02-25T00:31:33Z

Maybe run a bit of benchmarking?
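Something along these lines might be enough for a first look. It is a sketch with hypothetical names (`line_scan`, `best_time`) and synthetic data rather than real xvgs; a genfromtxt-based checker could be passed to `best_time` in the same way for comparison.

```python
import timeit


def line_scan(lines):
    """Candidate checker: True if data lines disagree on their field count."""
    widths = set()
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, '#' comments and '@' plot directives.
        if not stripped or stripped[0] in "#@":
            continue
        widths.add(len(stripped.split()))
    return len(widths) > 1


def best_time(checker, lines, number=5, repeat=3):
    """Best observed wall-clock seconds per call of checker(lines)."""
    timings = timeit.repeat(lambda: checker(lines), number=number, repeat=repeat)
    return min(timings) / number


# Synthetic stand-in for an xvg: 20000 clean four-column data lines.
synthetic = ["%d 1.0 2.0 3.0" % i for i in range(20000)]
print("line scan: %.4f s per call" % best_time(line_scan, synthetic))
```

Timings on synthetic data are only indicative; real xvgs from a long run would give a fairer comparison.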

Also, one fix could be to make the handling of corrupt xvgs optional, or possible to bypass. In some applications speed will be important.

nathanmlim · 2016-02-25T00:39:59Z

I'm not sure how best to handle this. The feature is currently gromacs-specific, which makes adding something like a feature flag maybe a bit unnecessary. The best approach may be to generalize the corruption check and then add a feature flag for it.

davidlmobley · 2016-02-25T00:44:52Z

Hannes has been arguing that we ought to have a way to pass parser-specific
options (he has a proposal for how to do it). This is another good argument
for that. Generalizing the corruption check to other data formats would be
a pain.

I'd fix the backup issue at this point, and then do the benchmarking to see
how much of a speed hit this is versus sensible alternatives. We can leave
the "making it optional" aspect for later.

nathanmlim added enhancement Medium priority labels Feb 25, 2016

nathanmlim mentioned this issue Feb 25, 2016

Only make backups of corrupt xvgs #82

Merged

davidlmobley closed this as completed in #82 Feb 25, 2016
