Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EnergyPlus crash when having unused constructions and conduction finite difference algorithms #8919

Closed
1 of 3 tasks
jcyuan2020 opened this issue Jul 26, 2021 · 8 comments · Fixed by #8937
Closed
1 of 3 tasks
Assignees

Comments

@jcyuan2020
Copy link
Contributor

jcyuan2020 commented Jul 26, 2021

Issue overview

It was reported that while running an up-converted v9.5 idf, the program that uses conduction finite difference algorithms would crash. However, the original v9.1 file the program runs fine.

An investigation reveals that the root cause for the crash is the "unused" construction surfaces in IDF files. EnergyPlus would not proceed to get a non-zero time finite difference time step when it encounters an "unused" construction--an "unused" construction means that the construction is defined but has never been referenced anyplace else in the idf. The zero finite difference time step will cause a divide-by-zero error in a function that will be called later, thus causing the program to crash.

The associated code blocks for the problems are here:

if (!state.dataConstruction->Construct(ConstrNum).IsUsed) continue;

Here it will skip some blocks if the construction is "unused"; as a result of the skipping, the time step will not get updated and will still keep the zero initialization here:
ConstructFD(ConstrNum).DeltaTime = Delt;

Then it will finally cause the div-by-zero error here and the program will crash:

for (int J = 1, J_end = nint(state.dataGlobal->TimeStepZoneSec / Delt); J <= J_end; ++J) { // PT testing higher time steps

This should be considered as a code bug as such "unused" constructions, although redundant, should still be tolerated and should not be even be treated as errors (some non-severe warning might be adequate). However, definitely it should not have caused a program crash.

Some code changes are needed to fix the bug. However, for such an IDF file to run correctly, a temporary workaround is to check the construction definitions and remove all "unused" constructions.

Details

Some additional details for this issue (if relevant):

  • Platform (Operating system, version): All
  • Version of EnergyPlus (if using an intermediate build, include SHA): Recent versions such as v9.5 and v9.6 pre-release have the problem; some earlier versions such as v9.1 do not have such a problem.
  • Helpdesk ticket number: 15755

Checklist

Add to this list or remove from it as applicable. This is a simple templated set of guidelines.

  • Defect file added (list location of defect file here)
  • Ticket added to Pivotal for defect (development team task)
  • Pull request created (the pull request will have additional tasks related to reviewing changes that fix this defect)
@RKStrand RKStrand self-assigned this Jul 26, 2021
@RKStrand
Copy link
Contributor

@jcyuan2020 Just checking--are you working on this or can I take this on? Thanks!

@RKStrand
Copy link
Contributor

@jcyuan2020 Also, do you have a sample idf that shows the crash? I took a file from the test suite that uses Conduction Finite Difference, added an unused construction, and ran it. It didn't crash so I'm wondering if there are some more specific conditions that cause the crash that the user noted.

@jcyuan2020
Copy link
Contributor Author

jcyuan2020 commented Jul 26, 2021

@RKStrand Please feel free to go head with the fix, and the defect case files are sent to your email.
I am not sure if the generalization applies to all the cases with some extra unused construction. But for this particular case, the following lines are related to the problem:

if (!state.dataConstruction->Construct(ConstrNum).IsUsed) continue;

Here it will skip some blocks if the construction is "unused"; as a result of the skipping, the time step will not get updated and will still keep the zero initialization here:
ConstructFD(ConstrNum).DeltaTime = Delt;

Then it will finally cause the numerical div-by-zero error here and the program will crash:

for (int J = 1, J_end = nint(state.dataGlobal->TimeStepZoneSec / Delt); J <= J_end; ++J) { // PT testing higher time steps

@RKStrand
Copy link
Contributor

Thanks, @jcyuan2020!

@RKStrand
Copy link
Contributor

This is interesting. I saw the crash with the supplied file and went into the debugger. It crashes at a place where it is trying to assign a local "auto const" where the index for the array ends up being zero. It is doing this for an internally created interzone surface. There are other interzone surfaces in the test suite that use CondFD. However, for those few files in the test suite that combine CondFD and interzone surfaces, the construction for the interzone surfaces is always a symmetrical construction that when reversed is the same. The test file that is crashing is non-symmetrical--two layers that are not the same so when reversed it is not the same. What appears to be happening in the code is that this is not handled properly? In any case, if I take this file and make the interzone construction symmetrical (by eliminating one of the material layers), the file runs to completion without a crash. So, I think that is good evidence that something is not right in the way the CondFD code is handling this type of situation. Or, it's a problem in the creation of the internally accounted for interzone surfaces, maybe specifically the case where the construction is not symmetrical. One of the unused constructions is the correct reverse description for the interzone surface but I think it isn't being used because the user file is going with the internally defined interzone surface method.

@mjwitte
Copy link
Contributor

mjwitte commented Jul 29, 2021

That should have produced a warning at least about the mismatched constructions, but maybe that check doesn't happen for CondFD the same way it does for CTF? And I thought the auto-generated surface searched for a companion reverse construction, but I haven't looked to see where that happens in the code.

Example file SingleFamilyHouse_HP_Slab.idf is one of several that uses the Zone outside boundary condition and the construction is asymmetrical.

@RKStrand
Copy link
Contributor

Update: it seems to be limited to the auto-generated interzone surfaces with non-symmetrical constructions. I took one of the test suite input files that use CondFD and has interzone partitions (but defined in both zones--not autogenerated in the second), modified a construction for one set to be non-symmetrical, and this file doesn't crash.

@mjwitte But the example file SingleFamilyHouse_HP_Slab.idf does not use CondFD, does it? I'm not finding any HeatBalanceAlgorithm statement in that file which means it defaults to CTF. It looks like this is specifically limited to when CondFD is used in conjunction with auto-defined interzone partitions that have a non-symmetrical construction. I don't think we have any input files that explore that possibility?

RKStrand added a commit that referenced this issue Jul 30, 2021
This commit includes the fix and the unit test for defect #8919.  Under a particular condition (CondFD used, interzone surface that has another zone as the outside boundary condition causing E+ to create it internally, and the user supplying the reversed construction but it isn't used by any user defined surface), a variable was not being set (IsUsed) for the reversed construction causing a crash (array index of zero).  This is now fixed and this particular case trapped by the new unit test.
@RKStrand
Copy link
Contributor

RKStrand commented Jul 30, 2021

Subtle, very subtle. For this particular case (CondFD solution, interzone surface that is defined with a zone outside boundary condition forcing E+ to internally create the other surface, where the user creates the reversed construction but it is not used by any user defined surface), IsUsed was not being set to true like it should have. This meant that in the CondFD routines, it was skipping the reversed construction when it was supposed to figure out the number of nodes. As a result, the number of nodes remained at 0 and this resulted in an array index issue (array index was 0) that caused EnergyPlus to crash. It is fixed now (user input file that showed the defect now runs successfully)...ci running (shouldn't bring up any problems) and PR on its way...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants