
Numerically stable sigmoid function #436

Merged
nkoukpaizan merged 6 commits into develop from lukel/sigmoid-dev on Jun 8, 2026
@lukelowry lukelowry commented Jun 5, 2026

Collaborator

Description

Updates CommonMath smooth step function behavior and documentation, and removes a stale expected-failure marker for the New England contingency-analysis example now that it passes.

Proposed changes

  • Replace the logistic sigmoid with the stable tanh form.
  • Define the sigmoid scale MU in one place and use it consistently in sigmoid and ramp.
  • Add plots for the CommonMath documentation.
  • Remove dsigmoid, since it is unused.
  • Update CommonMath.md equations for the tanh sigmoid and smooth ramp form.
  • Remove WILL_FAIL TRUE from newengland_ca, since ContingencyAnalysis now completes successfully.
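The first two items above can be sketched roughly as follows. This is a minimal illustration, not the code in GridKit/CommonMath.hpp: the namespace `sketch`, the value chosen for `MU`, the function signatures, and the use of `std::log1p` in place of a plain logarithm are all assumptions made for the example.

```cpp
#include <cmath>

namespace sketch // hypothetical namespace, not GridKit's
{
    // Single definition of the sharpness parameter, shared by both functions.
    // The value is illustrative; the PR does not state the value GridKit uses.
    constexpr double MU = 240.0;

    // sigma(x) = (1 + tanh(MU * x / 2)) / 2
    // tanh saturates to +/-1 for large arguments, so no intermediate overflows.
    inline double sigmoid(double x)
    {
        return 0.5 * (1.0 + std::tanh(0.5 * MU * x));
    }

    // rho(x) = (x + |x|) / 2 + ln(1 + exp(-MU * |x|)) / MU
    // The exponent is never positive, so exp() can only underflow to zero.
    inline double ramp(double x)
    {
        const double ax = std::abs(x);
        return 0.5 * (x + ax) + std::log1p(std::exp(-MU * ax)) / MU;
    }
} // namespace sketch
```

With this form, `sketch::sigmoid` stays within [0, 1] and `sketch::ramp` stays finite even for arguments such as `1.0e6`, because no intermediate value can overflow.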

Checklist

  • All tests pass for the focused coverage listed below.
  • Code compiles cleanly with flags -Wall -Wpedantic -Wconversion -Wextra.
  • The new code follows GridKit™ style guidelines.
  • There are unit tests for the new code.
  • The new code is documented.
  • The feature branch is rebased with respect to the target branch.

Changelog changes N/A

Further comments

The ContingencyAnalysis fix was completely unexpected. I needed to fix this function for my feature branch of the REECB implementation to run correctly. I was happy to see that this fixed another issue.

@PhilipFackler How often did you encounter this when implementing ContingencyAnalysis? I am curious, since that would be useful for related commentary in the paper.

I'd also like to note that this likely means we can fix or remove the inconsistent scaling that exists in IEEET1, TGOV1, and some other models. The failures were originally thought to be a solver issue, but they were a NaN/inf issue. Before this change I was unable to make MU much larger than 240; now I can make it very large, which makes the piecewise approximations very sharp and is helpful for validation purposes.
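One plausible way a NaN can arise from the exp form is through its slope, where an overflowed exponential appears in both numerator and denominator. The sketch below illustrates that mechanism only; it is an assumption, not a confirmed account of what happened in GridKit, and the names `sketch_nan`, `slope_exp`, and `slope_tanh` are hypothetical.

```cpp
#include <cmath>

namespace sketch_nan // hypothetical namespace, for illustration only
{
    // Slope of 1 / (1 + exp(-mu x)), written naively:
    //   mu * exp(-mu x) / (1 + exp(-mu x))^2
    // Once exp(-mu x) overflows to +inf this evaluates inf / inf, which is NaN.
    inline double slope_exp(double x, double mu)
    {
        const double e = std::exp(-mu * x);
        return mu * e / ((1.0 + e) * (1.0 + e));
    }

    // The same slope written through tanh:
    //   (mu / 4) * (1 - tanh^2(mu x / 2))
    // tanh saturates to +/-1, so the result goes to zero instead of NaN.
    inline double slope_tanh(double x, double mu)
    {
        const double t = std::tanh(0.5 * mu * x);
        return 0.25 * mu * (1.0 - t * t);
    }
} // namespace sketch_nan
```

In IEEE double precision `std::exp` overflows once its argument exceeds roughly 709.78, so with `mu = 240` the naive slope turns into NaN for `x` below about -2.96, while the tanh version returns zero there.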

@lukelowry lukelowry requested review from PhilipFackler and pelesh June 5, 2026 12:26
@lukelowry lukelowry added the bug Something isn't working label Jun 5, 2026
@lukelowry lukelowry force-pushed the lukel/sigmoid-dev branch from 17bb4ac to 7bc160b Compare June 5, 2026 18:40

@nkoukpaizan nkoukpaizan left a comment

Collaborator

A few minor comments.

The sensitivity of tanh to mu is different from that of exp (and of the abs forms), so it makes sense that NaNs occur at different values of mu.

@lukelowry please make sure to document (plot) the different forms we are exploring somewhere (PR, technical document or paper) for future reference.

Comment thread GridKit/CommonMath.hpp
Comment thread GridKit/CommonMath.hpp Outdated
Comment thread GridKit/CommonMath.md Outdated
- \sigma(x) &= \dfrac{1}{1+e^{-\mu x}} \\
- \rho(x) &= \dfrac{(\mu x+\lvert\mu x\rvert)/2+\log(1+e^{-\lvert\mu x\rvert})}{\mu} \\
+ \sigma(x) &= \dfrac{1}{2}\left(1+\tanh\left(\dfrac{\mu x}{2}\right)\right) \\
+ \rho(x) &= \dfrac{x+\lvert x\rvert}{2}+\dfrac{\ln(1+e^{-\mu\lvert x\rvert})}{\mu} \\

Collaborator

Missed this in a previous merge, but the explanation of the "softplus" form for $\rho$ is no longer in the documentation. I'm now noticing this with the tanh versus exp forms. Could $\rho$ also be written with the same elemental functions as $\sigma$?

@lukelowry lukelowry Jun 7, 2026

Collaborator Author

The implementations of $\sigma$ before and after the proposed change are mathematically equivalent; only the numerical computation differs (i.e., for very large negative values of $x$ the exp definition computes like $\dfrac{1}{\infty}$, whereas the tanh version gives the identical result but is stable).
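The equivalence stated above can be checked in a few lines. Writing $z = \mu x$ and using only the definition of tanh (a standard identity, not text taken from CommonMath.md):

```latex
\begin{aligned}
\tanh\left(\dfrac{z}{2}\right)
  &= \dfrac{e^{z/2}-e^{-z/2}}{e^{z/2}+e^{-z/2}}
   = \dfrac{1-e^{-z}}{1+e^{-z}} \\
\dfrac{1}{2}\left(1+\tanh\left(\dfrac{z}{2}\right)\right)
  &= \dfrac{1}{2}\cdot\dfrac{(1+e^{-z})+(1-e^{-z})}{1+e^{-z}}
   = \dfrac{1}{1+e^{-z}}
\end{aligned}
```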

Regarding $\rho$, to my knowledge, no, it cannot use the same elemental functions and still be numerically stable.
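For context, the documented $\rho$ is the softplus function split at zero, and a rewrite in hyperbolic functions is possible on paper. The derivation below is a sketch under the assumption $\mu > 0$; it illustrates why one such rewrite is fragile and is not a proof that no stable tanh-based form exists.

```latex
\begin{aligned}
\dfrac{\ln(1+e^{\mu x})}{\mu}
  &= \dfrac{x+\lvert x\rvert}{2}+\dfrac{\ln(1+e^{-\mu\lvert x\rvert})}{\mu}
  && \text{(exponent never positive, so no overflow)} \\
\dfrac{\ln(1+e^{\mu x})}{\mu}
  &= \dfrac{x}{2}+\dfrac{\ln\left(2\cosh\left(\dfrac{\mu x}{2}\right)\right)}{\mu}
  && \text{(} \cosh \text{ overflows for large } \lvert\mu x\rvert \text{)}
\end{aligned}
```

The first line follows by factoring $e^{\mu x}$ out of the logarithm when $x \ge 0$ and by noting $\lvert x\rvert = -x$ when $x < 0$; the second uses $1+e^{\mu x} = 2e^{\mu x/2}\cosh(\mu x/2)$.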

I added justifications to the doc for clarity so that this information is tracked.

@lukelowry

Collaborator Author

@nkoukpaizan I went ahead and added plots since I already had some available. Let me know what you think.

@nkoukpaizan nkoukpaizan left a comment

Collaborator

There are a couple of rendering issues for CommonMath.md on GH. After those are fixed, this should be good to merge.

lukelowry commented Jun 7, 2026

Collaborator Author

> There are a couple of rendering issues for CommonMath.md on GH. After those are fixed, this should be good to merge.

@nkoukpaizan Fixed, thank you for catching those rendering issues!

@lukelowry lukelowry force-pushed the lukel/sigmoid-dev branch from 808a046 to 3e350e7 Compare June 7, 2026 22:29
@nkoukpaizan nkoukpaizan merged commit 88426c2 into develop Jun 8, 2026
6 checks passed
@lukelowry lukelowry deleted the lukel/sigmoid-dev branch June 8, 2026 17:42

Labels

bug Something isn't working

2 participants