Refactor FEInterface to speed up dof_indices? #101

benkirk · 2013-04-26T19:02:39Z

I was looking into the DofMap::dof_indices()function to see what I might do to speed it up.

To see just how expensive e.g. FEInterface::n_dofs_at_node() is, I actually had to run the code through the preprocessor to see how all the macros expanded.

What I see is that it becomes a double switch which dispatches to another double switch, and it gets called obviously for each node on the element.

I was thinking about making the current n_dof_at_node() a private member of FEInterface, and using it to build up static arrays, say

_n_dofs_at_node[DIM][FETYPE][ELEM_TYPE][NODENUM];

but to do this right, it will probably involve cleaning up our FE_FAMILY enum so it is packed, which means writing a string or something instead of a number to the restart files.

@roystgnr, @jwpeterson, @pbauman, @friedmud - Thoughts on this or on the broader issue of FEInterface altogether?

-Ben

roystgnr · 2013-04-26T19:20:24Z

On Fri, 26 Apr 2013, Benjamin S. Kirk wrote:

I was looking into the DofMap::dof_indices()function to see what I might do to speed it up.

Sounds great! It'll be interesting to figure out how to benchmark the
change well, though...

What I see is that it becomes a double switch which dispatches to another double switch, and it
gets called obviously for each node on the element.

I was thinking about making the current n_dof_at_node() a private member of FEInterface, and
using it to build up static arrays, say

_n_dofs_at_node[DIM][FETYPE][ELEM_TYPE][NODENUM];

Isn't "turn switch statements into branch tables" a common compiler
optimization anyway? Although that's a compiler optimization we may
be inadvertently disabling with our sparse enum. It'd be interesting
to start by just packing the enum and seeing if the code gets any
faster with g++ or icpc.

but to do this right, it will probably involve cleaning up our
FE_FAMILY enum so it is packed, which means writing a string or
something instead of a number to the restart files.

@roystgnr, @jwpeterson, @pbauman, @friedmud - Thoughts on this

I want to support old xdr/xda restart files indefinitely, so that's
going to require maintaining an "old number to new-number/string"
table somewhere.

I'd like us to move to something more extensible for a restart file
format in the future. I was originally thinking hdf5-based; now that
that describes exodus/nemesis too I'm wondering if it might be
possible to add extensions there. This might be too much work for us
to bite off right now, but it might be a reason to just use the old
numbering via translation in XDR/XDA I/O for now, then switch to a
less ambiguous string identifier at the same time we're changing the
rest of the format too.

or on the broader issue of FEInterface altogether?

I despised the mass of switches in FEInterface until roughly last
week, when I discovered myself having to recommend basically the exact
same design pattern to someone to make up for the inability to do

templated virtual functions in C++.

Roy

roystgnr · 2013-04-26T19:22:56Z

Yaagh. I thought maybe the email-mangling that you guys had noticed had something to do with HTML email, but the quote characters and paragraph breaks in my alpine output just got shredded there.

benkirk · 2013-04-26T19:27:09Z

Isn't "turn switch statements into branch tables" a common compiler optimization anyway? Although that's a compiler optimization we may be inadvertently disabling with our sparse enum. It'd be interesting to start by just packing the enum and seeing if the code gets any faster with g++ or icpc.

It is, but when we have a double switch that then makes a function call to another double switch in a different translation unit and don't usually do inter-procedural optimization I can't imagine a compiler in the world that could do that too efficiently...

As for the restart bit - yes, probably worth its own thread. In the meantime I can have a simple int->string mapping to handle old restart files.

friedmud · 2013-04-26T19:49:21Z

On Fri, Apr 26, 2013 at 1:20 PM, roystgnr notifications@github.com wrote:

Sounds great! It'll be interesting to figure out how to benchmark the
change well, though...

I will be able to tell you how much of a speed up it is... we spend a
decent amount of time in dof_indices()....

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor FEInterface to speed up dof_indices? #101

Refactor FEInterface to speed up dof_indices? #101

benkirk commented Apr 26, 2013

roystgnr commented Apr 26, 2013

roystgnr commented Apr 26, 2013

benkirk commented Apr 26, 2013

friedmud commented Apr 26, 2013

Refactor FEInterface to speed up dof_indices? #101

Refactor FEInterface to speed up dof_indices? #101

Comments

benkirk commented Apr 26, 2013

roystgnr commented Apr 26, 2013

templated virtual functions in C++.

roystgnr commented Apr 26, 2013

benkirk commented Apr 26, 2013

friedmud commented Apr 26, 2013