proper equality checking between tidy3d base model #1237

Merged
merged 1 commit into from
Nov 8, 2023

Collaborator

@tylerflex tylerflex commented Nov 7, 2023

Would be good to check:

  • my logic, just in case I screwed something up.
  • whether this slows down the backend in any meaningful way, as the equality checking is more robust.

Previously, we were checking equality between tidy3d components by comparing their _json_strings. However, these did not contain DataArray objects, so this equality checking was not complete.

If we instead compare via self.dict() == other.dict(), we run into problems whenever np.ndarray values are present: == on arrays is elementwise, and the comparison never makes it clear that we want to reduce the result with all().
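
A minimal illustration of that problem, assuming two small hypothetical dicts that stand in for the output of self.dict() and other.dict() (the key name eps is made up for this sketch):

```python
import numpy as np

# Hypothetical stand-ins for self.dict() and other.dict()
dict1 = {"eps": np.array([1.0, 2.0, 3.0])}
dict2 = {"eps": np.array([1.0, 2.0, 3.0])}

try:
    # dict equality calls == on each pair of values and then needs a single
    # bool, but == on arrays returns an elementwise array of bools
    result = dict1 == dict2
except ValueError as err:
    result = None
    message = str(err)

# Reducing explicitly with all() states the intent
explicit = bool((dict1["eps"] == dict2["eps"]).all())
```

Here the plain dict comparison raises a ValueError about the ambiguous truth value of an array, while the explicit all() reduction gives a single boolean.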

This PR does the equality checking via a recursive function, which handles various edge cases, including the np.ndarray.
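
As a rough sketch of what such a recursive check could look like (this is not the code in this PR; the name values_equal and the use of np.array_equal are assumptions made for illustration):

```python
import numpy as np


def values_equal(val1, val2):
    """Hypothetical helper: recursively compare nested dicts, tuples and arrays."""
    # exactly one side is None -> not equal
    if (val1 is None) != (val2 is None):
        return False
    # dicts: same keys, then compare value by value
    if isinstance(val1, dict) and isinstance(val2, dict):
        if val1.keys() != val2.keys():
            return False
        return all(values_equal(val1[key], val2[key]) for key in val1)
    # sequences: same length, then compare item by item
    if isinstance(val1, (tuple, list)) and isinstance(val2, (tuple, list)):
        if len(val1) != len(val2):
            return False
        return all(values_equal(a, b) for a, b in zip(val1, val2))
    # arrays: np.array_equal returns False on shape mismatch instead of raising
    if isinstance(val1, np.ndarray) or isinstance(val2, np.ndarray):
        return bool(np.array_equal(val1, val2))
    # everything else: fall back to the general == test
    return bool(val1 == val2)


# Made-up example data, loosely shaped like a component dict
a = {"name": "medium", "frequency_range": None, "data": np.array([1.0, 2.0])}
b = {"name": "medium", "frequency_range": None, "data": np.array([1.0, 2.0])}
c = dict(a, data=np.array([1.0, 2.0, 3.0]))   # different array shape
d = dict(a, frequency_range=(0.5, 1.0))       # None vs. tuple
```

With these inputs, a and b compare equal, while c and d each differ from a.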

Once this was working, it exposed a few bugs in our test_IO: the saved and loaded sims were no longer equal, because their data had changed due to some updates in CustomMedium. This was fixed by comparing _json_strings in those tests.

Note: this also fixes #1235 which was caused by the merging of two CustomMediums when they shouldn't have been merged (due to this lax equality checking)

There is one test failing in test_log.py @lucas-flexcompute could you look at it when you get time? I'm not sure why it would fail due to this PR, but this is the error message. Seems the simulation can't be validated?

______________________ test_logging_warning_capture _______________________

    def test_logging_warning_capture():
        # create sim with warnings
        domain_size = 12

        wavelength = 1
        f0 = td.C_0 / wavelength
        fwidth = f0 / 10.0
        source_time = td.GaussianPulse(freq0=f0, fwidth=fwidth)
        freqs = np.linspace(f0 - fwidth, f0 + fwidth, 11)

        # 1 warning: too long run_time
        run_time = 10000 / fwidth

        # 2 warnings: frequency outside of source frequency range; too many points
        mode_mnt = td.ModeMonitor(
            center=(0, 0, 0),
            size=(domain_size, 0, domain_size),
            freqs=list(freqs) + [0.1],
            mode_spec=td.ModeSpec(num_modes=3),
            name="mode",
        )

        # 2 warnings: too high num_freqs; too many points
        mode_source = td.ModeSource(
            size=(domain_size, 0, domain_size),
            source_time=source_time,
            mode_spec=td.ModeSpec(num_modes=2, precision="single"),
            mode_index=1,
            num_freqs=50,
            direction="-",
        )

        # 1 warning: ignoring "normal_dir"
        monitor_flux = td.FluxMonitor(
            center=(0, 0, 0),
            size=(8, 8, 8),
            freqs=list(freqs),
            name="flux",
            normal_dir="+",
        )

        # 1 warning: large monitor size
        monitor_time = td.FieldTimeMonitor(
            center=(0, 0, 0),
            size=(2, 2, 2),
            stop=1 / fwidth,
            name="time",
        )

        # 1 warning: too big proj distance
        proj_mnt = td.FieldProjectionCartesianMonitor(
            center=(0, 0, 0),
            size=(2, 2, 2),
            freqs=[250e12, 300e12],
            name="n2f_monitor",
            custom_origin=(1, 2, 3),
            x=[-1, 0, 1],
            y=[-2, -1, 0, 1, 2],
            proj_axis=2,
            proj_distance=1e10,
            far_field_approx=False,
        )

        # 2 warnings * 4 sources = 8 total: too close to each PML
        # 1 warning * 3 DFT monitors = 3 total: medium frequency range does not cover monitors freqs
        box = td.Structure(
            geometry=td.Box(center=(0, 0, 0), size=(11.5, 11.5, 11.5)),
            medium=td.Medium(permittivity=2, frequency_range=[0.5, 1]),
        )

        # 2 warnings: inside pml
        box_in_pml = td.Structure(
            geometry=td.Box(center=(0, 0, 0), size=(domain_size * 1.001, 5, 5)),
            medium=td.Medium(permittivity=10),
        )

        # 2 warnings: exactly on sim edge
        box_on_boundary = td.Structure(
            geometry=td.Box(center=(0, 0, 0), size=(domain_size, 5, 5)),
            medium=td.Medium(permittivity=20),
        )

        # 1 warning: outside of domain
        box_outside = td.Structure(
            geometry=td.Box(center=(50, 0, 0), size=(domain_size, 5, 5)),
            medium=td.Medium(permittivity=6),
        )

        # 1 warning: too high "num_freqs"
        # 1 warning: glancing angle
        gaussian_beam = td.GaussianBeam(
            center=(4, 0, 0),
            size=(0, 2, 1),
            waist_radius=2.0,
            waist_distance=1,
            source_time=source_time,
            direction="+",
            num_freqs=30,
            angle_theta=np.pi / 2.1,
        )

        plane_wave = td.PlaneWave(
            center=(4, 0, 0),
            size=(0, 1, 2),
            source_time=source_time,
            direction="+",
        )

        # 2 warnings: non-uniform grid along y and z
        tfsf = td.TFSF(
            size=(10, 15, 15),
            source_time=source_time,
            direction="-",
            injection_axis=0,
        )

        # 1 warning: bloch boundary is inconsistent with plane_wave
        bspec = td.BoundarySpec(
            x=td.Boundary.pml(), y=td.Boundary.periodic(), z=td.Boundary.bloch(bloch_vec=0.2)
        )

        # 1 warning * 1 structures (perm=20) * 4 sources = 20 total: large grid step along x
        gspec = td.GridSpec(
            grid_x=td.UniformGrid(dl=0.05),
            grid_y=td.AutoGrid(min_steps_per_wvl=15),
            grid_z=td.AutoGrid(min_steps_per_wvl=15),
            override_structures=[
                td.Structure(geometry=td.Box(size=(3, 2, 1)), medium=td.Medium(permittivity=4))
            ],
        )

>       sim = td.Simulation(
            size=[domain_size, 20, 20],
            sources=[gaussian_beam, mode_source, plane_wave, tfsf],
            structures=[box, box_in_pml, box_on_boundary, box_outside],
            # monitors=[monitor_flux, mode_mnt, monitor_time, proj_mnt],
            monitors=[monitor_flux, mode_mnt, proj_mnt],
            run_time=run_time,
            boundary_spec=bspec,
            grid_spec=gspec,
        )

tests/test_package/test_log.py:191:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
tidy3d/components/base.py:98: in __init__
    super().__init__(**kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

__pydantic_self__ = Simulation()
data = {'boundary_spec': BoundarySpec(x=Boundary(plus=PML(name=None, type='PML', num_layers=12, parameters=PMLParams(sigma_or...j_distance=10000000000.0, x=(-1.0, 0.0, 1.0), y=(-2.0, -1.0, 0.0, 1.0, 2.0))], 'run_time': 3.3356409519815207e-10, ...}
values = {'center': (0.0, 0.0, 0.0), 'courant': 0.99, 'medium': Medium(name=None, frequency_range=None, allow_gain=False, nonli...spec=None, heat_spec=None, type='Medium', permittivity=1.0, conductivity=0.0), 'run_time': 3.3356409519815207e-10, ...}
fields_set = {'boundary_spec', 'grid_spec', 'monitors', 'run_time', 'size', 'sources', ...}
validation_error = ValidationError(model='Simulation', errors=[{'loc': ('sources',), 'msg': "object of type 'NoneType' has no len()", 'ty...: 'type_error'}, {'loc': ('normalize_index',), 'msg': "object of type 'NoneType' has no len()", 'type': 'type_error'}])

    def __init__(__pydantic_self__, **data: Any) -> None:
        """
        Create a new model by parsing and validating input data from keyword arguments.

        Raises ValidationError if the input data cannot be parsed to form a valid model.
        """
        # Uses something other than `self` the first arg to allow "self" as a settable attribute
        values, fields_set, validation_error = validate_model(__pydantic_self__.__class__, data)
        if validation_error:
>           raise validation_error
E           pydantic.v1.error_wrappers.ValidationError: 5 validation errors for Simulation
E           sources
E             object of type 'NoneType' has no len() (type=type_error)
E           boundary_spec
E             'NoneType' object is not iterable (type=type_error)
E           monitors
E             could not validate `_warn_monitor_simulation_frequency_range` as `sources` failed validation (type=value_error.validation)
E           grid_spec
E             object of type 'NoneType' has no len() (type=type_error)
E           normalize_index
E             object of type 'NoneType' has no len() (type=type_error)

../../../../.pyenv/versions/3.10.9/lib/python3.10/site-packages/pydantic/v1/main.py:341: ValidationError
-------------------------- Captured stdout call ---------------------------
14:52:50 EST WARNING: A large number (50) of frequency points is used in a
             broadband source. This can lead to solver slow-down and increased
             cost, and even introduce numerical noise. This may become a hard
             limit in future Tidy3D versions.
             WARNING: The ``normal_dir`` field is relevant only for surface
             monitors and will be ignored for monitor flux, which is a box.
             WARNING: A large number (30) of frequency points is used in a
             broadband source. This can lead to solver slow-down and increased
             cost, and even introduce numerical noise. This may become a hard
             limit in future Tidy3D versions.
             WARNING: Angled source propagation axis close to glancing angle.
             For best results, switch the injection axis.
             WARNING: 'geometry=Box(type='Box', center=(50.0, 0.0, 0.0),
             size=(12.0, 5.0, 5.0)) name=None type='Structure'
             medium=Medium(name=None, frequency_range=None, allow_gain=False,
             nonlinear_spec=None, modulation_spec=None, heat_spec=None,
             type='Medium', permittivity=6.0, conductivity=0.0)' (at
             `simulation.structures[3]`) is completely outside of simulation
             domain.
             WARNING: Structure at 'structures[2]' has bounds that extend
             exactly to simulation edges. This can cause unexpected behavior. If
             intending to extend the structure to infinity along one dimension,
             use td.inf as a size variable instead to make this explicit.
             WARNING: Suppressed 1 WARNING message.
             WARNING: The medium associated with structures[0] has a frequency
             range: (5.000000e-01, 1.000000e+00) (Hz) that does not fully cover
             the frequencies contained in monitors[0]. This can cause
             inaccuracies in the recorded results.
             WARNING: Suppressed 2 WARNING messages.
             ERROR: could not validate
             `_warn_monitor_simulation_frequency_range` as `sources` failed
             validation
========================= short test summary info =========================
FAILED tests/test_package/test_log.py::test_logging_warning_capture - pydantic.v1.error_wrappers.ValidationError: 5 validation errors for Si..

Collaborator

@momchil-flex momchil-flex left a comment

Seems a bit precarious in terms of expandability / edge cases, but probably best to handle correctly yeah.

    return False

elif isinstance(val1, np.ndarray) or isinstance(val2, np.ndarray):
    if not np.allclose(np.array(val1), np.array(val2)):
Collaborator

@momchil-flex momchil-flex Nov 7, 2023

This errors if val1 and val2 are arrays of different shapes. There may be other edge cases too (val1 is array while val2 is something on which np.array(val2) errors?) How about:

try:
    are_equal = np.allclose(np.array(val1), np.array(val2))
except Exception:  # e.g. shape mismatch, or a value np.array() cannot handle
    return False

if not are_equal:
    return False

Collaborator Author

@tylerflex tylerflex Nov 7, 2023

hm, good point. I actually saw this function recommended on the internet for something related so maybe I'll do

try:
    np.testing.assert_equal(val1, val2)
except AssertionError:
    return False
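
For reference, a small sketch of the difference between the two approaches discussed here. The helper name arrays_equal is hypothetical; np.allclose and np.testing.assert_equal are the actual NumPy functions:

```python
import numpy as np

val1 = np.array([1.0, 2.0, 3.0])
val2 = np.array([1.0, 2.0])

# np.allclose raises on shapes that cannot be broadcast together
try:
    np.allclose(val1, val2)
    allclose_raised = False
except ValueError:
    allclose_raised = True


def arrays_equal(a, b):
    """Hypothetical wrapper: turn assert_equal's AssertionError into a bool."""
    try:
        np.testing.assert_equal(a, b)
    except AssertionError:
        return False
    return True


shape_mismatch = arrays_equal(val1, val2)        # shape mismatch -> False
same_data = arrays_equal(val1, val1.copy())      # identical data -> True
# assert_equal is exact, whereas allclose compares within a tolerance
tiny_difference = arrays_equal(np.array([1.0]), np.array([1.0 + 1e-9]))
```

One thing to keep in mind with this choice is that assert_equal compares exactly, so values that np.allclose would accept within tolerance are reported as different.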

Collaborator

Seems to work.

Collaborator Author

Thanks @ianwilliamson, that's a much better way to do it.

@dbochkov-flexcompute
Contributor

There is one test failing in test_log.py @lucas-flexcompute could you look at it when you get time? I'm not sure why it would fail due to this PR, but this is the error message. Seems the simulation can't be validated?

It probably makes sense for me to look at


if isinstance(val1, tuple) or isinstance(val2, tuple):
    val1 = dict(zip(range(len(val1)), val1))
    val2 = dict(zip(range(len(val2)), val2))
Contributor

test_logging_warning_capture seems to fail when it tries to compare two mediums with frequency_range == None and frequency_range != None. We need to take into account cases when either val1 or val2 is None
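
A minimal reproduction of that failure mode, assuming one frequency_range is None and the other is a tuple (the helper name tuple_to_dict is made up; its body mirrors the tuple conversion quoted above):

```python
val1 = None          # frequency_range of one medium
val2 = (0.5, 1.0)    # frequency_range of the other medium


def tuple_to_dict(val):
    # same conversion as in the snippet under review
    return dict(zip(range(len(val)), val))


try:
    if isinstance(val1, tuple) or isinstance(val2, tuple):
        converted1 = tuple_to_dict(val1)   # len(None) raises here
        converted2 = tuple_to_dict(val2)
    error_message = None
except TypeError as err:
    error_message = str(err)
```

The resulting message matches the "object of type 'NoneType' has no len()" entries in the validation error above.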

Collaborator Author

ok thanks for taking a look, I'll add that.

Contributor

adding

if (val1 is None) ^ (val2 is None):
    return False

makes everything pass for me

Collaborator Author

what if val1 is None and val2 is None, shouldn't it be True?

Collaborator Author

oh wait. this is exclusive or I guess?

Contributor

yeah, apparently this is XOR in Python

Collaborator

Note that this is a bit-wise XOR operator, not the logical one. Not that it matters in this case, but the logical XOR would be (val1 is None) != (val2 is None)
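
A quick check of that point, as a sketch: on bool operands the two spellings agree for every combination, and they only diverge on non-bool operands such as plain integers.

```python
# All four None / not-None combinations
cases = [(None, None), (None, 1.0), (1.0, None), (1.0, 2.0)]

bitwise = [(val1 is None) ^ (val2 is None) for val1, val2 in cases]
logical = [(val1 is None) != (val2 is None) for val1, val2 in cases]

# On non-bool operands ^ really is bit-wise: 0b01 ^ 0b10 == 0b11
int_xor = 1 ^ 2
# whereas a logical XOR of two truthy values is False
int_logical = bool(1) != bool(2)
```

Since `is None` always yields a bool, either form behaves the same in this check; `!=` just makes the logical intent explicit.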

@tylerflex tylerflex force-pushed the tyler/fix/eq branch 2 times, most recently from a9bda7e to 1dd8082 Compare November 8, 2023 11:55
Collaborator

@lucas-flexcompute lucas-flexcompute left a comment

Lists cannot be part of our models, right? That would be the only missing case that I can think of that is not covered by the general == test.

val2 = dict2[key]

# if one of val1 or val2 is None (exclusive OR)
if (val1 is None) ^ (val2 is None):
Collaborator

Might be even better to check that the values have the same type (unless we want to return true if a list has the same items as a tuple, for example): if type(val1) != type(val2): return False

This would also eliminate the need for all the or conditions in the remaining if clauses.

Collaborator Author

@tylerflex tylerflex Nov 8, 2023

one problem with this is that (I think? in ArrayLike) there can be situations where val1 is a list and val2 is an np.ndarray with the same data. But if we want this situation to be !=, then what you suggest makes a lot of sense.

Collaborator Author

yea this does seem to fail some tests, but might be worth looking into more deeply. Not sure.

Collaborator Author

in this case it was because one was type float and one was type numpy.float64

Any ideas on how to handle this type checking maybe more loosely?
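
To make the float vs. numpy.float64 mismatch concrete, here is a small sketch. The helper loosely_same_type is purely hypothetical and is only one possible way to relax the check, not something adopted in this PR:

```python
import numpy as np

val1 = 1.5               # plain Python float
val2 = np.float64(1.5)   # NumPy scalar holding the same value

# A strict type comparison rejects the pair even though the values agree
strict_types_match = type(val1) == type(val2)
values_match = bool(val1 == val2)


def loosely_same_type(a, b):
    """Hypothetical looser check: accept if either type subclasses the other."""
    return isinstance(a, type(b)) or isinstance(b, type(a))


# np.float64 subclasses Python's float, so this pair passes the looser check
loose_float_match = loosely_same_type(val1, val2)
# a list and a tuple remain different types under the same check
loose_list_tuple_match = loosely_same_type([1, 2], (1, 2))
```

This only covers types related by inheritance; unrelated types holding the same data (for example a list and an np.ndarray) would still be treated as different.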

Collaborator Author

another thing that comes up is comparing a tidy3dcomplex and complex.

Collaborator Author

I tried a few different approaches (such as trying to cast val1 to val2's type and vice versa) and it seems there are just a lot of edge cases to handle here. If you have a solution that works, let me know; otherwise I might leave it how it is for now.

Collaborator

I see, handling all those cases would be painful. It seems better to leave only the None check. We wouldn't be able to foresee user-derived types anyways (a custom extension of float for example).


@tylerflex tylerflex merged commit 890761c into pre/2.5 Nov 8, 2023
12 checks passed
@tylerflex tylerflex deleted the tyler/fix/eq branch November 8, 2023 22:30