Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor FEInterface to speed up dof_indices? #101

Open
benkirk opened this issue Apr 26, 2013 · 4 comments
Open

Refactor FEInterface to speed up dof_indices? #101

benkirk opened this issue Apr 26, 2013 · 4 comments

Comments

@benkirk
Copy link
Member

benkirk commented Apr 26, 2013

I was looking into the DofMap::dof_indices()function to see what I might do to speed it up.

To see just how expensive e.g. FEInterface::n_dofs_at_node() is, I actually had to run the code through the preprocessor to see how all the macros expanded.

What I see is that it becomes a double switch which dispatches to another double switch, and it gets called obviously for each node on the element.

I was thinking about making the current n_dof_at_node() a private member of FEInterface, and using it to build up static arrays, say

_n_dofs_at_node[DIM][FETYPE][ELEM_TYPE][NODENUM];

but to do this right, it will probably involve cleaning up our FE_FAMILY enum so it is packed, which means writing a string or something instead of a number to the restart files.

@roystgnr, @jwpeterson, @pbauman, @friedmud - Thoughts on this or on the broader issue of FEInterface altogether?

-Ben

@roystgnr
Copy link
Member

On Fri, 26 Apr 2013, Benjamin S. Kirk wrote:

I was looking into the DofMap::dof_indices()function to see what I might do to speed it up.

Sounds great! It'll be interesting to figure out how to benchmark the
change well, though...

What I see is that it becomes a double switch which dispatches to another double switch, and it
gets called obviously for each node on the element.

I was thinking about making the current n_dof_at_node() a private member of FEInterface, and
using it to build up static arrays, say

_n_dofs_at_node[DIM][FETYPE][ELEM_TYPE][NODENUM];

Isn't "turn switch statements into branch tables" a common compiler
optimization anyway? Although that's a compiler optimization we may
be inadvertently disabling with our sparse enum. It'd be interesting
to start by just packing the enum and seeing if the code gets any
faster with g++ or icpc.

but to do this right, it will probably involve cleaning up our
FE_FAMILY enum so it is packed, which means writing a string or
something instead of a number to the restart files.

@roystgnr, @jwpeterson, @pbauman, @friedmud - Thoughts on this

I want to support old xdr/xda restart files indefinitely, so that's
going to require maintaining an "old number to new-number/string"
table somewhere.

I'd like us to move to something more extensible for a restart file
format in the future. I was originally thinking hdf5-based; now that
that describes exodus/nemesis too I'm wondering if it might be
possible to add extensions there. This might be too much work for us
to bite off right now, but it might be a reason to just use the old
numbering via translation in XDR/XDA I/O for now, then switch to a
less ambiguous string identifier at the same time we're changing the
rest of the format too.

or on the broader issue of FEInterface altogether?

I despised the mass of switches in FEInterface until roughly last
week, when I discovered myself having to recommend basically the exact
same design pattern to someone to make up for the inability to do

templated virtual functions in C++.

Roy

@roystgnr
Copy link
Member

Yaagh. I thought maybe the email-mangling that you guys had noticed had something to do with HTML email, but the quote characters and paragraph breaks in my alpine output just got shredded there.

@benkirk
Copy link
Member Author

benkirk commented Apr 26, 2013

Isn't "turn switch statements into branch tables" a common compiler optimization anyway? Although that's a compiler optimization we may be inadvertently disabling with our sparse enum. It'd be interesting to start by just packing the enum and seeing if the code gets any faster with g++ or icpc.

It is, but when we have a double switch that then makes a function call to another double switch in a different translation unit and don't usually do inter-procedural optimization I can't imagine a compiler in the world that could do that too efficiently...

As for the restart bit - yes, probably worth its own thread. In the meantime I can have a simple int->string mapping to handle old restart files.

@friedmud
Copy link
Member

On Fri, Apr 26, 2013 at 1:20 PM, roystgnr notifications@github.com wrote:

Sounds great! It'll be interesting to figure out how to benchmark the
change well, though...

I will be able to tell you how much of a speed up it is... we spend a
decent amount of time in dof_indices()....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants