fix: eliminate RuntimeWarnings in von Mises-Fisher loss backward pass#824

Merged
carlosm-silva merged 2 commits into graphnet-team:main from The-Blenderers:main
Sep 8, 2025
Conversation

@carlosm-silva
Collaborator

🎯 Summary

Fixes RuntimeWarnings in LogCMK.backward when processing arrays containing zero values, while maintaining mathematical accuracy and numerical stability.

🔬 Mathematical Background

This PR implements a numerically stable solution for computing gradients in the von Mises-Fisher loss function. The core issue was that while the mathematical limit:

$$\lim_{\kappa \to 0} \left(\frac{1}{\kappa} - \frac{1}{\tanh(\kappa)}\right) = 0$$

is well-defined, naive floating-point evaluation triggers division by zero warnings.
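The limit follows from the standard Laurent expansion of the hyperbolic cotangent about zero (restated here for convenience):

```latex
% coth(kappa) = 1/tanh(kappa) expanded about kappa = 0:
\coth\kappa = \frac{1}{\kappa} + \frac{\kappa}{3} - \frac{\kappa^{3}}{45} + O(\kappa^{5})
% hence the gradient term is
\frac{1}{\kappa} - \frac{1}{\tanh\kappa}
  = -\frac{\kappa}{3} + \frac{\kappa^{3}}{45} - O(\kappa^{5})
  \;\xrightarrow{\;\kappa \to 0\;}\; 0
```

Truncating after the linear term gives the Taylor approximation used below, with leading truncation error |κ|³/45.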

Detailed mathematical derivations and proofs are provided in:

  • Enhanced docstrings in src/graphnet/training/loss_functions.py
  • New documentation: docs/source/models/von_mises_fisher_mathematical_background.md

🔧 Changes

Code Changes

  • Fixed zero handling: Replaced np.where() with boolean masking, since np.where() evaluates both branches eagerly and the exact formula was still being computed at κ = 0
  • Enhanced docstrings: Added comprehensive mathematical background with Google-style formatting
  • Comprehensive tests: Added test_logcmk_backward_zero_handling() with warning capture
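The root cause is that np.where() computes both branches over the full array before selecting between them, so the exact expression still runs on the zero entries. A minimal reproduction of the original warnings (illustrative only, not the PR's actual code):

```python
import warnings
import numpy as np

kappa = np.array([0.0, 1.0])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # np.where evaluates the second branch eagerly, so 1/kappa and
    # 1/tanh(kappa) both run on the zero entry before selection happens.
    result = np.where(kappa == 0, 0.0, 1.0 / kappa - 1.0 / np.tanh(kappa))

# The selected output is still correct at kappa = 0...
assert result[0] == 0.0
# ...but the divide-by-zero / invalid-value RuntimeWarnings already fired.
assert any(issubclass(w.category, RuntimeWarning) for w in caught)
```

Boolean masking avoids this by indexing the array first, so the exact formula is only ever evaluated on entries safely away from zero.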

Implementation Details

  • Small κ: Uses Taylor approximation -κ/3 for |κ| < 1e-6
  • Large κ: Uses exact formula 1/κ - 1/tanh(κ)
  • Error bound: Truncation error ≤ |κ|³/45 ≈ 2×10⁻²⁰ at the threshold |κ| = 10⁻⁶
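The two-branch split can be sketched as follows. This is a self-contained illustration of the boolean-masking approach; the actual function and variable names inside LogCMK.backward may differ.

```python
import numpy as np

def grad_log_cmk_term(kappa: np.ndarray, threshold: float = 1e-6) -> np.ndarray:
    """Stably evaluate f(k) = 1/k - 1/tanh(k), which tends to 0 as k -> 0.

    Sketch of the boolean-masking approach (names are illustrative).
    """
    kappa = np.asarray(kappa, dtype=float)
    out = np.empty_like(kappa)
    small = np.abs(kappa) < threshold
    # Taylor branch: f(k) ~ -k/3, truncation error <= |k|^3 / 45
    out[small] = -kappa[small] / 3.0
    # Exact branch: only evaluated where kappa is safely away from zero
    out[~small] = 1.0 / kappa[~small] - 1.0 / np.tanh(kappa[~small])
    return out
```

Because the exact expression never sees κ = 0, no warning suppression or post-hoc NaN cleanup is needed.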

Documentation

  • Mathematical background: Complete derivations, Taylor series analysis, and error bounds
  • Implementation details: Explains boolean masking approach and threshold selection
  • Integration: Added to docs/source/models/models.rst

🧪 Testing

# Run the specific test
python -m pytest tests/training/test_loss_functions.py::test_logcmk_backward_zero_handling -v

# Expected: PASSED with NO RuntimeWarnings

Test Coverage

  • ✅ Zero values don't raise exceptions
  • ✅ No RuntimeWarnings generated
  • ✅ Gradients remain finite for all inputs
  • ✅ Mathematical accuracy: f(0) = 0 exactly
  • ✅ Correct Taylor approximation for small κ
  • ✅ Arrays with multiple zeros handled correctly
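The warning-capture pattern used by the test can be sketched as below. The real test lives in tests/training/test_loss_functions.py and exercises LogCMK.backward directly; the helper here merely stands in for that call.

```python
import warnings
import numpy as np

def stable_term(kappa: np.ndarray) -> np.ndarray:
    """Stand-in for the gradient term computed in LogCMK.backward."""
    out = np.empty_like(kappa)
    small = np.abs(kappa) < 1e-6
    out[small] = -kappa[small] / 3.0
    out[~small] = 1.0 / kappa[~small] - 1.0 / np.tanh(kappa[~small])
    return out

def test_logcmk_backward_zero_handling_sketch() -> None:
    kappa = np.array([0.0, 1e-8, 0.5, 0.0, 3.0])
    with warnings.catch_warnings():
        # Escalate any RuntimeWarning to an error so the test fails loudly
        warnings.simplefilter("error", RuntimeWarning)
        out = stable_term(kappa)
    assert np.all(np.isfinite(out))          # gradients stay finite
    assert out[0] == 0.0 and out[3] == 0.0   # f(0) = 0 exactly

test_logcmk_backward_zero_handling_sketch()
```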

📊 Before/After

Before:

RuntimeWarning: divide by zero encountered in divide
RuntimeWarning: invalid value encountered in subtract

After:

tests/training/test_loss_functions.py::test_logcmk_backward_zero_handling PASSED [100%]

🔗 References

  • von Mises-Fisher distribution: Wikipedia
  • arXiv:1812.04616, Section 8.2
  • Original MIT License implementation by Max Ryabinin (2019)

✅ Checklist

  • Follows PEP8 conventions
  • Google-style docstrings with type hints
  • Comprehensive unit tests added
  • Mathematical background documented
  • No RuntimeWarnings generated
  • All existing tests pass
  • Clean implementation without code duplication

Mathematical accuracy preserved • Numerical stability achieved • Warnings eliminated

- Replace np.where with boolean masking to avoid double evaluation
- Add comprehensive unit tests for zero handling in LogCMK.backward
- Enhance docstrings with mathematical background and implementation details
- Add mathematical background documentation in docs/source/models/

Fixes division by zero warnings while maintaining numerical accuracy.
Error bound analysis shows truncation error of at most |κ|³/45 ≈ 2×10⁻²⁰ for |κ| < 10⁻⁶.

@shubhamos-ai shubhamos-ai left a comment


📚 Well-structured code! Consider adding more detailed documentation, implementing CI/CD pipelines, and adding performance monitoring for production readiness.

Collaborator

@RasmusOrsoe RasmusOrsoe left a comment


Hey @carlosm-silva thank you very much for this clean contribution!

I did a few checks on the compatibility of the gradients from this change in backward w.r.t. the current implementation and was happy to find that they agree except at kappa = 0.

When I took a closer look at the documentation introduced in the PR, I found that the math appears to be rendered incorrectly (see here). Could you double-check that it's OK?

Tagging @Aske-Rosted for completeness. @Aske-Rosted Do you have comments?

@@ -0,0 +1,200 @@
# Mathematical Background: von Mises-Fisher Loss Implementation
Collaborator


Most of the math here appears not to render correctly in the markdown file.

Collaborator Author


Interestingly, it renders correctly when I copy and paste it into my local Markdown environment (Obsidian). The minus sign is being interpreted as the start of a list. I'm unfamiliar with GitHub's Markdown notation, so I'm unsure how to proceed.

Collaborator Author


Never mind, I fixed it in 510f4ba

Corrected mathematical expressions in the von Mises-Fisher documentation so they render properly in GitHub Markdown.
@RasmusOrsoe
Collaborator

@carlosm-silva thank you for the quick iteration! The docs now look much better. I have no further comments at this stage.

Let's give @Aske-Rosted a chance to catch up before we proceed.

@Aske-Rosted
Collaborator

Hi @carlosm-silva,

Thanks for this very well documented contribution!
I too think that this looks ready to go and do not have any further comments or suggestions.

@RasmusOrsoe RasmusOrsoe self-requested a review September 5, 2025 07:21
Collaborator

@RasmusOrsoe RasmusOrsoe left a comment


🚀

@carlosm-silva carlosm-silva merged commit 1bb42ad into graphnet-team:main Sep 8, 2025
7 checks passed