Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Row of Zeros in Jacobian for MechanicsWithTemperature Problem #3

Open
JustinClough opened this issue Apr 21, 2020 · 14 comments
Open

Row of Zeros in Jacobian for MechanicsWithTemperature Problem #3

JustinClough opened this issue Apr 21, 2020 · 14 comments

Comments

@JustinClough
Copy link
Contributor

I'm trying to add temperature as a DOF to the same panel-problem I've been working on. (Same as what I presented on back at the user's meeting.) I tried to follow this example but I'm getting an error that reads:

SuperLU's gsequ function returned with info = 4 > 0, and info <= A.nrow = 18904.  This means that row 4 of A is exactly zero.

I take this to mean there's a row of all zeros in my Jacobian; is that correct? My first thought was to check the BC's. But checking and double checking, both the mechanics and thermal problems have enough Dirichlet BC's to ground their respective problems. Any advice on what to look at next?

For completeness:
I'm using this version of Albany: 9f4290e
With this version of Trilinos: trilinos/Trilinos@a0ba716
My full output file is attached as well as the input and material yamls and the mesh (.msh) file.
mesh_and_lable_plate.msh.TXT
output.txt
input.yaml.TXT
material.yaml.TXT

@JustinClough
Copy link
Contributor Author

Related, what does the Thermal Transient Coefficient represent/do?

I've found that it's assigned in the MechanicsWithTemperature module as a material parameter (examples here, here, and here) that all give it a value of 1.0 or zero. I also found that it's a required parameter as the model won't run without it set.

I found that it's first used here and then used in transport evaluators like this one.

@lxmota, I was told that you might have some insights on this variable?

@lxmota
Copy link
Collaborator

lxmota commented Apr 21, 2020

@JustinClough The thermal transient coefficient is just a switch to turn on or off terms that depend on the time derivative of the variable in question. In your case, tdot. If these transient terms are significant in your simulation, the transient coefficient should be 1.0.

As for the Jacobian being potentially singular, I see that you are using very different settings for the solver to those I normally use. This is what I have normally:

    NOX:
      Direction:
        Method: Newton
        Newton:
          Forcing Term Method: Constant
          Rescue Bad Newton Solve: true
          Linear Solver:
            Tolerance: 1.0e-06
          Stratimikos Linear Solver:
            NOX Stratimikos Options: { }
            Stratimikos:
              Linear Solver Type: Belos
              Linear Solver Types:
                Belos:
                  VerboseObject:
                    Verbosity Level: none
                  Solver Type: Block GMRES
                  Solver Types:
                    Block GMRES:
                      Convergence Tolerance: 1.0e-06
                      Output Frequency: 1
                      Output Style: 1
                      Verbosity: 0
                      Maximum Iterations: 200
                      Block Size: 1
                      Num Blocks: 200
                      Flexible Gmres: false
              Preconditioner Type: MueLu
              Preconditioner Types:
                Ifpack2:
                  Prec Type: ILUT
                  Overlap: 1
                  Ifpack2 Settings:
                    'fact: ilut level-of-fill': 3.0
                MueLu:
                  verbosity: none
                  number of equations: 3
                  'coarse: max size': 500
                  multigrid algorithm: sa
                  max levels: 4
                  'aggregation: type': uncoupled
                  'aggregation: drop scheme': classical
                  'smoother: type': CHEBYSHEV
                  'smoother: params':
                    'chebyshev: degree': 2
                    'chebyshev: ratio eigenvalue': 7.0
                    'chebyshev: min eigenvalue': 1.0
                    'chebyshev: zero starting solution': true
                  'smoother: pre or post': both
                  'repartition: enable': true
                  'repartition: partitioner': zoltan2
                  'repartition: start level': 2
                  'repartition: min rows per proc': 800
                  'repartition: max imbalance': 1.1
                  'repartition: remap parts': false
      Line Search:
        Method: Backtrack
        Full Step:
          Full Step: 1.0
      Nonlinear Solver: Line Search Based
      Printing:
        Output Precision: 3
        Output Processor: 0
        Output Information:
          Error: true
          Warning: false
          Outer Iteration: true
          Parameters: false
          Details: false
          Linear Solver Details: false
          Stepper Iteration: true
          Stepper Details: false
          Stepper Parameters: false
      Solver Options:
        Status Test Check Type: Complete
      Status Tests:
        Test Type: Combo
        Combo Type: OR
        Number of Tests: 5
        Test 0:
          Test Type: RelativeNormF
          Tolerance: 1.0e-08
        Test 1:
          Test Type: MaxIters
          Maximum Iterations: 6
        Test 2:
          Test Type: Combo
          Combo Type: AND
          Number of Tests: 2
          Test 0:
            Test Type: NStep
            Number of Nonlinear Iterations: 2
          Test 1:
            Test Type: NormF
            Tolerance: 1.0e-06
        Test 3:
          Test Type: FiniteValue
        Test 4:
          Test Type: NormF
          Tolerance: 1.0e-06

The tolerance may not be appropriate for your problem, but I don't use Amesos2 or SuperLU at all, so I don't even know if those are working. Maybe it's worth trying the settings above.

Also, going forward, I won't be able to support the version of LCM hosted on SNLComputation. I forked Albany for the LCM project, and it's located here:

https://github.com/lxmota/Albany

Since you are using LCM only, consider using that fork instead. It's already diverging significantly, and so it's the only version of Albany for which I'll be able to answer questions due to limited time and resources.

@ikalash
Copy link
Collaborator

ikalash commented Apr 21, 2020

@lxmota : thanks for helping Justin! Justin needs SCOREC, so I am afraid he cannot use your fork.

@lxmota
Copy link
Collaborator

lxmota commented Apr 21, 2020

@ikalash As far as I know, he is using Gmesh, as I remember when I worked with him in ABQ, and from the input files above.

But you are right, if @JustinClough requires SCOREC, I removed that from my fork.

@JustinClough
Copy link
Contributor Author

@lxmota Thanks for explaining the thermal transient coefficient variable. I swapped out the solver in my input for yours and (after some more tweaking) it seems to be working!

Yes, I am using Gmsh for geometry and meshing. I'm planning on sharing a fork soon with the group from RPI and they'll need SCOREC compatibility. I will keep your fork in mind.

I'll leave this issue open for now and update it later today with how the run with Alejandro's input yaml goes in case I still have questions or run into problems.

@ikalash
Copy link
Collaborator

ikalash commented Apr 22, 2020

Glad to hear it's working @JustinClough ! Regarding SCOREC: I think RPI is moving away from SCOREC as well, at least this is what Antoinette said a few weeks ago. Perhaps given this situation it makes sense for USC and RPI to work in @lxmota 's branch? @RuixiongHu can probably weigh in.

@JustinClough
Copy link
Contributor Author

Sorry for the late reply. I tried using the solver parameters that you gave above, @lxmota. It was able to take a few time steps but only by reducing the step size ridiculously small. I found a step size of 0.1s to work for just the mechanics problem but this was reducing it to 10^-7.

Is there any reason why the issue I was having with the direct solver wouldn't also come up with the iterative solver?

Also, I've been trying to print the jacobian to file but haven't been able to. I thought I just needed to add:

  Debug Output:
    Write Jacobian to MatrixMartket: 1

near the top of the input file. Is there something elsewhere in the input file that I need to add or change?

@maxrpi
Copy link
Contributor

maxrpi commented Apr 24, 2020 via email

@JustinClough
Copy link
Contributor Author

No, not a cut and paste but thanks for catching the typo. It is spelled correctly in the input file, I just mis-typed when I wrote that comment.

@ikalash
Copy link
Collaborator

ikalash commented Apr 24, 2020

Try setting it to -1. That will dump all the Jacobians.

@JustinClough
Copy link
Contributor Author

Now that I have the Jacobians from my problem, it looks like the very first one Albany constructs has all zeros in every 4th row. Any Jacobian after that (either in the same time step but a different Newton solve, or in another time step) looks fine. I didn't check any values by hand, just going by the magnitude of the non-zeros and where they are in the matrix.

I did some digging and found that this is happening in one of the tests too. But not in one that's very similar.

I'll try to look into what the special difference between the two is. I know LCM isn't part of the current master branch, but is it okay to leave this issue open until I resolve it?

@ikalash
Copy link
Collaborator

ikalash commented Apr 27, 2020

The issue is what I hypothesized when you first mentioned this issue. The temperature gets evaluated here: https://github.com/lxmota/Albany/blob/master/src/LCM/evaluators/transport/TransportResidual_Def.hpp. You will see there is a lot of logic there to add to the evaluator only if delta_time_(0.0) > 0, e.g. https://github.com/lxmota/Albany/blob/master/src/LCM/evaluators/transport/TransportResidual_Def.hpp#L226 . This is the cause of the problem. @calleman21 put this option in, and has explained it to me several times in the past, but I don't recall the details of why the logic is there. I think it has to do with a hack-ish way to initialize the temperature dofs. @lxmota do you remember what Coleman was intending to accomplish when he put in the logic? You could try removing the delta_time_(0.0) logic and see what happens.

@lxmota
Copy link
Collaborator

lxmota commented Apr 28, 2020

@ikalash @JustinClough Coleman put that in there because he was having trouble getting negative values for delta_time. That is a huge problem for rate dependent materials such as the Crystal Plasticity model. Things have changed so much since that code was first introduced, that I don't know if it's needed anymore. Yes, you could try just removing that logic.

@JustinClough
Copy link
Contributor Author

I think the negative delta_time bug did get sorted out a few months ago (https://github.com/SNLComputation/Albany/issues/517) so I think it'd be safe to pull the logic check in the evaluator.

I need to wrap up a few thing before the semester ends so I probably won't be able to make the edit any time soon. Once I do get around to it, I'll make the changes in the RPI-USC fork. Thanks for the help!

@lxmota lxmota transferred this issue from sandialabs/Albany Apr 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants