Warning info attempt #899

Closed

Conversation

@dbochkov-flexcompute (Contributor)

Wanted to get some opinions on this attempt to trace warning info during object initialization. @tylerflex, @momchil-flex

Basically, the idea is to have a dict trace in the logger in which:

  1. objects (any derivative of Tidy3dBaseModel) hierarchically write their info when they get initialized
  2. the logger writes warnings (and anything else if needed)

so that it allows tracing exactly where each warning occurred. It is only initialized and used during Tidy3dBaseModel.__init__() and cleared out after it finishes. In some more detail, for each initialized object we hierarchically store three lists/dicts (see the sketch after the list):

  • "children": hashes of initialized children objects
  • "logs": warnings occured during initialization
  • "fields": names of parent's fields pointing at the current object (could be multiple if fields point at the same object)

This is what I get in tests/_test_warning_info.py:

{
    "3935611705871406881": {
        "children": {
            "4411795910099000673": {
                "children": {},
                "logs": [
                    [
                        "WARNING",
                        "A large number (30) of frequency points is used in a broadband source. This can slow down simulation time and is only needed if the mode fields are expected to have a very sharp frequency dependence. 'num_freqs' < 20 is sufficient in most cases."
                    ]
                ],
                "fields": [
                    [
                        "sources",
                        0
                    ]
                ],
                "type": "GaussianBeam"
            },
            "-4149933756106994820": {
                "children": {},
                "logs": [
                    [
                        "WARNING",
                        "The ``normal_dir`` field is relevant only for surface monitors and will be ignored for monitor flux, which is a box."
                    ]
                ],
                "fields": [
                    [
                        "monitors",
                        1
                    ]
                ],
                "type": "FluxMonitor"
            }
        },
        "logs": [
            [
                "WARNING",
                "Structure at structures[0] was detected as being less than half of a central wavelength from a PML on side x-min. To avoid inaccurate results, please increase gap between any structures and PML or fully extend structure through the pml. Skipping check for structure indexes > 0."
            ]
        ],
        "fields": [],
        "type": "Simulation"
    }
}

With some processing we can easily reconstruct:

sim -> sources -> 0 -> (GaussianBeam) "A large number (30) of frequency points is used in a broadband source. This can slow down simulation time and is only needed if the mode fields are expected to have a very sharp frequency dependence. 'num_freqs' < 20 is sufficient in most cases."
sim -> monitors -> 1 -> (FluxMonitor) "The ``normal_dir`` field is relevant only for surface monitors and will be ignored for monitor flux, which is a box."
sim -> (Simulation) "Structure at structures[0] was detected as being less than half of a central wavelength from a PML on side x-min. To avoid inaccurate results, please increase gap between any structures and PML or fully extend structure through the pml. Skipping check for structure indexes > 0."

which roughly mimics how pydantic reports error locations.
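
As a rough sketch of that processing (walk here is a hypothetical helper operating on the trace dict printed above, not tidy3d API):

    def walk(entries: dict, path: tuple = ()):
        """Print one line per warning, prefixed by the field path to the emitting object."""
        for entry in entries.values():
            # extend the path with the (field name, index) the parent uses for this object
            here = (path + tuple(entry["fields"][0])) if entry["fields"] else path
            for _level, message in entry["logs"]:
                location = " -> ".join(str(part) for part in ("sim",) + here)
                print(f'{location} -> ({entry["type"]}) "{message}"')
            walk(entry["children"], here)

    walk(trace)  # with ``trace`` being the captured dict shown above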

Not sure exactly if this would impact performance; it's probably best to assess this experimentally. If you don't see any major downsides or red flags, we could probably work on this further.

By the way, currently we don't have many warnings in the initialization of Simulation components, maybe just one or two in addition to the two already shown above. Most warnings are at the Simulation level, i.e., based on the interplay between different components.

@@ -75,9 +75,89 @@ def __hash__(self) -> int:

def __init__(self, **kwargs):
"""Init method, includes post-init validators."""

# initialize warning tracing and appoint ourselves as the root that started it
@tylerflex (Collaborator), May 24, 2023

In the future, it would be best to separate this into a couple of methods, so the init method would look something like:

    def __init__(self, **kwargs):
        """Init method, includes post-init validators."""
        self._initialize_log_cache()
        super().__init__(**kwargs)
        self._post_init_validators()
        self._consolidate_log_cache()

@dbochkov-flexcompute (Contributor, Author)

Yeah, I just left it all exposed for now. If this is something we want to move forward with, then I'll definitely wrap it all into separate functions.

@dbochkov-flexcompute (Contributor, Author)

@lucas-flexcompute

@lucas-flexcompute (Collaborator)

@dbochkov-flexcompute, @tylerflex I've tried to merge the capturing functionality with my caching PR over in #895. I have not tried to match ids (or hashes) to actual objects; I only logged them in a tree-like structure. I've moved the capture logic to the Logger class, because I believe that makes more sense (except for figuring out the child indexes and the parents' fields). In any case, doing this map with hashes will be expensive. I'm not sure if the GUI has access to python ids, but they could use those to finish the mapping when/if they need it, instead of doing it in the python front-end. I've also used a try/finally block because we want to make sure we close capture contexts.

The previous caching/suppression was changed to work with the capture system. Suppressed messages on the global cache have the highest priority, so if they are suppressed they won't be captured. Messages that are not displayed and are instead added to the log consolidation are still captured (but the final consolidation summary is not). I believe this should be the desired behavior, right?

So we use global suppression for deprecation messages or things like letting the user know there is a remote mode solver the first time they use the local one. The consolidation cache is used within a context manager to limit the emission of too many messages in loops: only the first is emitted and the rest are consolidated, but all of them are captured if the capture functionality is enabled.
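
To make that ordering concrete, here is a minimal hedged sketch of the behavior (the method names and internals below are my assumptions for illustration, not the actual #895 code):

    from contextlib import contextmanager

    class Logger:
        def __init__(self):
            self._suppressed = set()    # globally suppressed message keys
            self._capture_stack = []    # one tree node per open capture context

        @contextmanager
        def capture(self):
            """Collect emitted messages into a tree node; the context is always closed."""
            node = {"children": [], "logs": []}
            if self._capture_stack:
                self._capture_stack[-1]["children"].append(node)
            self._capture_stack.append(node)
            try:
                yield node
            finally:
                self._capture_stack.pop()  # close capture even if initialization raises

        def warning(self, message, key=None):
            if key in self._suppressed:
                return  # global suppression wins: neither displayed nor captured
            if self._capture_stack:
                self._capture_stack[-1]["logs"].append(("WARNING", message))
            print(f"WARNING: {message}")  # stand-in for the display/consolidation logic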

@tylerflex (Collaborator)

Thanks Lucas, I think this is a good way to define the desired logging behavior (with regard to caching, suppressing, and consolidating), and then work together to combine the functionality. I'd defer to both of you on the details of the specific information contained in the log and how this is architected.

In terms of general working behavior, my thinking is that maybe suppressed warnings should not be displayed in the python front end, but should still be captured in the log capture; that way the GUI or python front end can handle them however they like (e.g., by placing an orange "warning" indicator on each structure that warns).

Essentially this would mean something like:

  1. initialize object
  2. capture all warnings and log messages into a single data structure
  3. display the appropriate number of log messages by processing this data structure.

Perhaps the context manager / caching could be used to flag log messages in step 2 to be handled in step 3? I'm not sure if you think this approach is feasible or potentially too difficult.
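
For illustration only (the flat structure and the "suppressed" flag below are hypothetical, not an existing tidy3d data structure), step 3 could then look roughly like:

    # step 2 collects everything; each message carries flags set by the context manager
    # or cache, and step 3 decides what is actually shown to the user
    captured = [
        {"level": "WARNING", "msg": "first warning in a loop", "suppressed": False},
        {"level": "WARNING", "msg": "repeated warning in a loop", "suppressed": True},
    ]

    def display(captured, max_messages=20):
        visible = [m for m in captured if not m["suppressed"]]
        for m in visible[:max_messages]:
            print(f'{m["level"]}: {m["msg"]}')
        hidden = len(captured) - min(len(visible), max_messages)
        if hidden:
            print(f"({hidden} additional message(s) kept only in the log capture)")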

My thinking is that it might be good to have all of the warning info; however, my worry is that for hundreds of thousands of warnings or more, the log itself could be far too large to handle efficiently and might end up slowing down operation.

Just a few thoughts

@lucas-flexcompute (Collaborator)

Right now we capture only within BaseModel initialization. Do you mean capture all the messages all the time? I think that might be too much depending on the user session. That would be the role of adding a file logger to store the complete session history.

Inverting the logic between suppression and capturing is easy enough, so that all messages are captured before suppression. In fact, right now, that would make no difference, because the capture and suppression contexts have no overlap, but it is good to have that correctly set up in case it eventually happens.

@dbochkov-flexcompute What do you think about identifying the exact field and index of each child in the log tree? Is this something we could forward to the GUI, since we have no use for that information on the python interface and (as it is) it seems to be an O(n²) operation?

@dbochkov-flexcompute (Contributor, Author)

> In any case, doing this map with hashes will be expensive. I'm not sure if the GUI has access to python ids, but they could use those to finish the mapping when/if they need it, instead of doing it in the python front-end.

> @dbochkov-flexcompute What do you think about identifying the exact field and index of each child in the log tree? Is this something we could forward to the GUI, since we have no use for that information on the python interface and (as it is) it seems to be an O(n²) operation?

Regarding whether it's only for the GUI or not, I remember we briefly touched upon possibly providing hierarchy paths to warnings on the python frontend as well. But I am leaning to agree that it wouldn't actually be that useful on the python frontend, because the interaction with object initialization is quite different there. On the python frontend we typically initialize objects one by one, so we get warnings about simulation components instantly, and when we put them together into a Simulation object they are already initialized, so there is not much of a hierarchy to capture. In the GUI we parse the entire simulation for errors and warnings at once from a file (https://github.com/flexcompute/tidy3d-client-service/blob/master/framework-1.2/service.py#L47-L62), and there it makes more sense to capture the hierarchy of objects and warnings. Having said that, maybe the hierarchy capture should not be a default option and should only be triggered by hand.

Regarding how to map exactly, I think matching hashes after logging is done is definitely an alternative. But I have two concerns about it:

  1. Performance-wise: it feels like matching smaller sets as we go (as done currently) should be more optimized and less time consuming than trying to match the entire list of warnings against the entire object hierarchy at the end, maybe something like O(k*n^2) vs O((k*n)^2). However, maybe the overhead of executing many smaller matchings, compared to one big matching operation (something along the lines of set(warnings).intersection(objects)), outweighs the optimization gains.
  2. It seems like, to properly reconstruct warning paths for verbal display (e.g. "simulation > structures > 2 > geometry > "something wrong with polyslab..."), the GUI side would still need to construct a hierarchy of hashes and then parse it to make the matchings. Is there a better way to construct such a hierarchy compared to doing it in __init__ as in this PR? If not, maybe it would be better that this is provided by the tidy3d team anyway, since the GUI team is not as familiar with Tidy3D internals as we are. For the second part, matching hashes to fields, can we pull up an object using its hash to find out what its fields are?

Given all that, I am leaning towards the view that the easiest and not terribly performance-degrading option is just to have hierarchy capturing with field matching as a manually triggered functionality, invoked only when parsing on the GUI side.

@dbochkov-flexcompute (Contributor, Author)

So, in some sense, we could make it completely orthogonal: on the python frontend turn on hashing but turn off tree capturing; when parsing on the GUI side, do the opposite (so we can mark all warned objects).

@lucas-flexcompute (Collaborator)

I decomposed the field mapping into first building a hash map of all fields in the model, which should improve performance quite a bit. It is also done piece by piece, instead of all at the end, as you mentioned. Please take a look at the current state of #895.

I like the idea of shutting off this functionality on the python front-end. It could be a global config flag (td.config.capture_validation_logs) that the GUI could set on initialization.
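
Roughly, the field-map idea is a single pass over the model's fields so that each captured child hash is matched in O(1) afterwards; this is my own sketch with made-up names (build_field_map, hash_of), not the #895 implementation:

    def build_field_map(model) -> dict:
        """Map hash -> list of (field name[, index]) for the model's sub-components."""
        field_map = {}
        for name, value in model.__dict__.items():
            if isinstance(value, (list, tuple)):
                for index, item in enumerate(value):
                    field_map.setdefault(hash_of(item), []).append((name, index))
            else:
                field_map.setdefault(hash_of(value), []).append((name,))
        return field_map

    def hash_of(obj):
        """Stand-in for however sub-components are actually hashed."""
        try:
            return hash(obj)
        except TypeError:
            return id(obj)  # sketch-only fallback for unhashable values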

The logger is only printing this information at the moment. How do you see it interacting with the client services?

Also, feel free to push into the other branch, so that all changes are consolidated in a single place. (And move the discussion over there too?)

@dbochkov-flexcompute (Contributor, Author)

Closing this as it was implemented in #895

@dbochkov-flexcompute deleted the daniil/warning-info branch on September 5, 2023.