
Fix warning capture #1136

Merged
merged 3 commits into from
Sep 6, 2023

Conversation

dbochkov-flexcompute
Contributor

@dbochkov-flexcompute dbochkov-flexcompute commented Aug 31, 2023

While trying to incorporate warning capture into the GUI, the MC team noticed that when we have validation errors, setting td.log.set_capture(True) changes the returned validation errors to something not very informative. This happens because during initialization of models we use try-finally in Tidy3dBaseModel.__init__() https://github.com/flexcompute/tidy3d/pull/1136/files#diff-d331aef8c3290adc3733d4ce375c5a4b139ad41bd5634113ca404cc827e3d949L76-R82 to try to finish warning capture no matter what. But our warning capture parsing assumes the models are well-built. So, when there are validation errors, the instances are incomplete and parsing produces its own errors that are unrelated to simulation validation.
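As a stdlib-only sketch of the failure mode described above (the names `Logger`, `begin_capture`, and the toy `Model` are illustrative, not the actual tidy3d implementation):

```python
class ValidationError(ValueError):
    pass


class Logger:
    """Toy logger that records the warning-capture lifecycle."""

    def __init__(self):
        self.events = []

    def begin_capture(self):
        self.events.append("begin")

    def end_capture(self, model):
        # In tidy3d this parses the model tree and assumes it is well-built.
        self.events.append(f"end:{type(model).__name__}")


log = Logger()


class Model:
    def __init__(self, x):
        log.begin_capture()
        try:
            if x < 0:
                raise ValidationError("x must be non-negative")
            self.x = x
        finally:
            # Runs even on validation failure, handing end_capture()
            # a half-built instance -- the source of the confusing errors.
            log.end_capture(self)


try:
    Model(-1)
except ValidationError:
    pass
print(log.events)  # ['begin', 'end:Model']
```

The `finally:` block guarantees `end_capture()` fires even for the instance that never finished validating, which is exactly the case the capture parser cannot handle.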

A seemingly straightforward way to fix this is to just remove the try-finally construction, so that we don't attempt to build the warning capture tree if there are any errors. Initially this didn't work as intended, but after I added the missing discriminator=TYPE_TAG_STR everything is fine. I believe this is because when there is no discriminator, pydantic tries to initialize with every model possible, and those that fail exit before reaching log.end_capture(self) in Tidy3dBaseModel.__init__(). Previously, finally: would ensure we still executed that.
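A stdlib-only sketch (not pydantic itself, and the shape classes are hypothetical) of why the discriminator matters here: without one, a union field is validated by attempting each member model in turn, and the attempts that fail exit partway through `__init__`:

```python
class Square:
    def __init__(self, data):
        if data.get("type") != "square":
            raise ValueError("not a square")  # exits mid-__init__
        self.side = data["side"]


class Circle:
    def __init__(self, data):
        if data.get("type") != "circle":
            raise ValueError("not a circle")
        self.radius = data["radius"]


UNION = [Square, Circle]


def validate_without_discriminator(data):
    # Try every member; most constructors raise partway through,
    # skipping any post-init bookkeeping (like end_capture()).
    for model in UNION:
        try:
            return model(data)
        except ValueError:
            continue
    raise ValueError("no member matched")


def validate_with_discriminator(data):
    # The tag selects exactly one member, so only one __init__ ever runs.
    by_tag = {"square": Square, "circle": Circle}
    return by_tag[data["type"]](data)


shape = validate_with_discriminator({"type": "circle", "radius": 2.0})
print(type(shape).__name__)  # Circle
shape2 = validate_without_discriminator({"type": "square", "side": 3.0})
print(type(shape2).__name__)  # Square
```

With the discriminated dispatch, the only `__init__` that runs is the one that can actually succeed, so removing the try-finally no longer leaves captures dangling.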

Additional improvements:

  • added log.start_capture()/log.end_capture(self) into Simulation.validate_pre_upload() to capture those warnings too
  • added an optional argument custom_loc: List = None to log.warning() so that additional information can be provided about a warning's location. This is useful for Simulation validators, which otherwise would just point at the simulation itself.
  • added an optional argument capture: bool = True to log.warning() to avoid capturing the not very informative and somewhat arbitrary frequency passed to 'Medium.eps_model()' is outside of 'Medium.frequency_range' = ...,
  • expanded the warning capture test significantly to produce and capture pretty much every possible warning. Also added checks that td.log.set_capture(True) doesn't screw anything up if there are validation errors.
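The two new log.warning() arguments in the bullets above could be pictured roughly as follows (a hedged sketch; the real tidy3d logger differs in detail):

```python
from typing import List, Optional


class Logger:
    """Stand-in logger illustrating the custom_loc and capture kwargs."""

    def __init__(self):
        self.capture_enabled = False
        self.captured = []

    def warning(self, msg: str, custom_loc: Optional[List] = None, capture: bool = True):
        print(f"WARNING: {msg}")  # python client behavior is unchanged
        if self.capture_enabled and capture:
            # custom_loc lets a Simulation validator point at the offending
            # field instead of at the simulation object itself.
            self.captured.append({"loc": custom_loc or [], "msg": msg})


log = Logger()
log.capture_enabled = True
log.warning("source bandwidth too narrow", custom_loc=["sources", 0])
log.warning("eps_model frequency out of range", capture=False)
print(len(log.captured))  # 1
```

Both warnings still print, but only the first lands in the captured list, matching the behavior described in the bullets.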

Note that none of these changes affect python client warning behavior. They only matter for the output of td.log.captured_warnings() when td.log.set_capture(True) is on.

@dbochkov-flexcompute
Contributor Author

Having issues getting this PR to pass GitHub tests. Everything seems fine on linux and macos, but it's failing on windows: https://github.com/flexcompute/tidy3d/actions/runs/6086188923. There are two issues:

  1. monitor storage size for (at least) FieldTimeMonitor seems to be different on linux vs windows. In my test simulation setup on linux I get {'flux': 44.0, 'mode': 864.0, 'time': 19296371088.0, 'n2f_monitor': 1440.0} for sim.monitor_data_sizes, but in the windows test runs it is {'flux': 44.0, 'mode': 864.0, 'time': 2116501904.0, 'n2f_monitor': 1440.0} https://github.com/flexcompute/tidy3d/actions/runs/6086188923/job/16511966069#step:5:446
  2. the second issue is not being able to get the hash of jax-related objects. Weird that it shows up only in windows tests:
          # Check the class type and its superclasses for a matching encoder
          for base in obj.__class__.__mro__[:-1]:
              try:
                  encoder = ENCODERS_BY_TYPE[base]
              except KeyError:
                  continue
              return encoder(obj)
          else:  # We have exited the for loop without finding a suitable encoder
  >           raise TypeError(f"Object of type '{obj.__class__.__name__}' is not JSON serializable")
  E           jax._src.traceback_util.UnfilteredStackTrace: TypeError: Object of type 'JVPTracer' is not JSON serializable

@momchil-flex @tylerflex let me know if you have any suggestions

@tylerflex
Collaborator

Hm, not sure about the monitor size thing, but in the meantime it could be possible to just relax the monitor sizes in the test a bit until we figure it out.

Regarding the jax part. So the JVPTracer is what jax uses to store the gradient information. It can't be serialized to json, unfortunately. Does this come up when trying to serialize a warning or error due to the changes in this PR? I'm having trouble pinpointing where this is happening in the code. I also have no idea why this would occur in windows and not the other OS :/

@tylerflex
Collaborator

for some reason I couldn't see the error but now I do, so yeah, I guess:

tidy3d\log.py:167: in end_capture
[1056](https://github.com/flexcompute/tidy3d/actions/runs/6086188923/job/16511966069?pr=1136#step:5:1057)
      model_fields = model.get_submodels_by_hash()

I wonder if we need to instead wrap whatever block this is in a try/except TypeError and then find an alternative way to handle unhashable objects? Or we could try a different way to get the model fields; I think model.__fields__ should work?
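A minimal sketch of that fallback idea, with a toy class standing in for the real Tidy3dBaseModel (the helper name get_submodel_fields and the ToyModel fields are hypothetical):

```python
def get_submodel_fields(model):
    """Prefer hash-based submodel lookup, falling back on declared fields."""
    try:
        return model.get_submodels_by_hash()
    except TypeError:
        # Unhashable contents (like a jax JVPTracer) land here; pydantic
        # models still expose their declared fields via __fields__.
        return {name: getattr(model, name) for name in model.__fields__}


class ToyModel:
    __fields__ = ("size", "medium")
    size = 1.0
    medium = "vacuum"

    def get_submodels_by_hash(self):
        raise TypeError("unhashable tracer in fields")


fields = get_submodel_fields(ToyModel())
print(sorted(fields))  # ['medium', 'size']
```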

@dbochkov-flexcompute
Contributor Author

Actually, after looking more carefully, .get_submodels_by_hash() shouldn't even be triggered in those test cases. It should only be invoked if log.set_capture(True) is called. I guess what happened is that because of the failed test, log.set_capture(False) was not called, so every test afterwards continued warning capture. Making the very first failing test pass (so that log.set_capture(False) is successfully called) removed the failures in those adjoint tests as well.
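One way to guard against the cross-test leakage described above is to wrap capture in a context manager (or equivalently a pytest fixture) so that set_capture(False) runs even when a test body raises. A sketch with a stand-in logger, not the real tidy3d one:

```python
from contextlib import contextmanager


class Logger:
    def __init__(self):
        self.capturing = False

    def set_capture(self, value: bool):
        self.capturing = value


log = Logger()


@contextmanager
def captured_warnings():
    log.set_capture(True)
    try:
        yield log
    finally:
        # Guarantees later tests do not inherit an active capture session.
        log.set_capture(False)


try:
    with captured_warnings():
        raise AssertionError("failing test body")
except AssertionError:
    pass
print(log.capturing)  # False
```

Even though the "test" inside the with-block fails, capture is switched off before the exception propagates, so subsequent tests see a clean logger.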

We still need to figure out why the calculated monitor size is different on different OS's, and make warning capture work with adjoint-related objects. The latter will be needed when the GUI starts integrating the adjoint feature.

For now, this PR passes tests and is ready to be reviewed.

@dbochkov-flexcompute dbochkov-flexcompute marked this pull request as ready for review September 5, 2023 16:15
Collaborator

@tylerflex tylerflex left a comment


Thanks @dbochkov-flexcompute. It looks good to me. I don't 100% follow all of the details, so feel free to get another review if you are unsure, but given that the tests are passing and the issue seems cleared up, I approve. The improvements are also quite nice, thanks 👍

@@ -89,49 +93,166 @@ def test_logging_warning_capture():
monitor_flux = td.FluxMonitor(
center=(0, 0, 0),
size=(8, 8, 8),
freqs=freqs,
freqs=list(freqs),
Collaborator


did this not work with freqs passed as a numpy array? I would have thought ArrayLike would handle this?

Contributor Author


no, I was just playing around with adding frequencies outside the source range (something like freqs=list(freqs) + [1]), and it accidentally remained like that

@momchil-flex momchil-flex merged commit 10ef9bf into pre/2.4 Sep 6, 2023
14 checks passed
@momchil-flex momchil-flex deleted the daniil/fix-warning-capture branch September 6, 2023 21:35