Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add metadata to Tidy3D components #1493

Merged
merged 1 commit into from
May 1, 2024
Merged

Add metadata to Tidy3D components #1493

merged 1 commit into from
May 1, 2024

Conversation

tylerflex
Copy link
Collaborator

@tylerflex tylerflex commented Feb 23, 2024

Addresses #1352

How do you all feel about this?

  1. Used the name .attrs, because it follows the same convention as xarray, pandas, maybe others.
  2. attrs are dictionaries (mutable) so they can be set on the fly, sim.attrs['foo'] = bar.
  3. we still can't set the attrs directly since tidy3d objects dont support item assignment, eg.sim.attrs = {} doesn't work.
  4. in xarray, attrs remain when the objects are copied and updated. but are removed when the objects are modified. We only need to handle the copy part since we can't modify tidy3d objects. therefore if sim.copy(), then the new object's attrs are the same.
  5. Important point, since attrs are mutable, we want to ensure our internal code doesn't depend on attrs in any way. Otherwise we introduce state. So as a rule, we don't use attrs for any calculation? That being said, maybe we can set attrs internally if it makes sense to, but never code like if obj.attrs["something"] == something: ...

All of the logic is spelled out in the test.

Also please check the description and see if it is clear.

Thanks

@tylerflex tylerflex added the 2.7 will go into version 2.7.* label Feb 23, 2024
@tylerflex tylerflex self-assigned this Feb 23, 2024
@tylerflex tylerflex linked an issue Feb 23, 2024 that may be closed by this pull request
@tylerflex tylerflex force-pushed the tyler/attrs branch 2 times, most recently from 15e42eb to d8cabc8 Compare February 23, 2024 14:55
Copy link
Collaborator

@lucas-flexcompute lucas-flexcompute left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one question: it is possible to get an error if the user adds non-serializable data to the dictionary, because it will be included in the json representation of the model, right?

I think that's the correct behavior here, but we might also want to add a note about that in the description (and maybe try to catch it early and warn the user when creating the json string).

@tylerflex
Copy link
Collaborator Author

Just one question: it is possible to get an error if the user adds non-serializable data to the dictionary, because it will be included in the json representation of the model, right?

Yea it should be, good point. I'll add a test and also see what I can do about catching it.

@tylerflex tylerflex force-pushed the tyler/attrs branch 3 times, most recently from 728ec83 to f5601aa Compare February 23, 2024 19:05
@momchil-flex
Copy link
Collaborator

  1. Important point, since attrs are mutable, we want to ensure our internal code doesn't depend on attrs in any way. Otherwise we introduce state. So as a rule, we don't use attrs for any calculation? That being said, maybe we can set attrs internally if it makes sense to, but never code like if obj.attrs["something"] == something: ...

This point sounds like going against the idea of using these for the jax_info. Or would that be considered internal usage? Basically when submitting a JaxSimulation to the server, we split it into a Simulation and JaxInfo, and add the latter as attrs to the former.

One more question to bring up (@xin-flex) if two simulations only differ in their attrs should they be considered the same or different in terms of caching?

@tylerflex
Copy link
Collaborator Author

This point sounds like going against the idea of using these for the jax_info. Or would that be considered internal usage? Basically when submitting a JaxSimulation to the server, we split it into a Simulation and JaxInfo, and add the latter as attrs to the former.

I think we could consider setting jax_info as a pd.Field in the JaxSimulation instead of using attrs maybe?

@tylerflex
Copy link
Collaborator Author

Hm, but then how do we upload this information? maybe that doesn't work after all.

@tylerflex
Copy link
Collaborator Author

@momchil-flex could you take a look into this PR and let me know if you need any other changes? I think I'll be doing the jax refactor stuff we discussed without attrs, although I'm still working that out.

Copy link
Collaborator

@momchil-flex momchil-flex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way this is currently, identical simulations with different attrs will not go through the caching mechanism, right? I think this may be fine, but I do remember maybe at some point discussing having the opposite behavior? It's probably harder to implement though. Any thoughts?

"operation of Tidy3D as it is not used internally. "
"Note that, unlike regular Tidy3D fields, ``attrs`` are mutable. "
"For example, the following is allowed for setting an ``attr`` ``obj.attrs['foo'] = bar``. "
"Also note that if objects that are not serializable (eg types) are added to "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you mean by "eg types"? Like if I have something like [Callable] in my attrs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that you can't put int in your attrs, e.g. attrs = {'test': int}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just removed this eg types comment because it might just be confusing

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note that `Tidy3D`` will raise a ``TypeError`` if ``attrs`` contain objects that can not be serialized.

tidy3d/components/base.py Outdated Show resolved Hide resolved
tidy3d/components/grid/grid.py Outdated Show resolved Hide resolved
tidy3d/components/grid/grid.py Outdated Show resolved Hide resolved
@tylerflex
Copy link
Collaborator Author

The way this is currently, identical simulations with different attrs will not go through the caching mechanism, right? I think this may be fine, but I do remember maybe at some point discussing having the opposite behavior? It's probably harder to implement though. Any thoughts?

I'm not excactly sure how the caching functions. Do you know what it compares based of? In my mind, it probably makes most sense to ignore attrs when deciding whether to load a cached dataset but I don't know

@tylerflex tylerflex force-pushed the tyler/attrs branch 2 times, most recently from 3cb3266 to a99f294 Compare May 1, 2024 17:29
@momchil-flex
Copy link
Collaborator

The way this is currently, identical simulations with different attrs will not go through the caching mechanism, right? I think this may be fine, but I do remember maybe at some point discussing having the opposite behavior? It's probably harder to implement though. Any thoughts?

I'm not excactly sure how the caching functions. Do you know what it compares based of? In my mind, it probably makes most sense to ignore attrs when deciding whether to load a cached dataset but I don't know

I believe it compares the hdf5 file byte by byte, so it will take some engineering to modify this to exclude attrs.

But, in some use cases, that's probably safer. For example, the SimulationData will contain the Simulation that contains the attrs. If caching kicks in, the returned SimulationData will have different attrs than the Simulation that was submitted. To avoid this it would take even more engineering...

@tylerflex
Copy link
Collaborator Author

k, I feel comfortable taking the most conservative route and just caching based on attrs for now.. or until someone complains :)

@tylerflex tylerflex force-pushed the tyler/attrs branch 2 times, most recently from acd82f2 to c44de29 Compare May 1, 2024 17:39
@tylerflex tylerflex added the rc2 2nd pre-release label May 1, 2024
@momchil-flex momchil-flex merged commit cbd2349 into pre/2.7 May 1, 2024
16 checks passed
@momchil-flex momchil-flex deleted the tyler/attrs branch May 1, 2024 22:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
2.7 will go into version 2.7.* rc2 2nd pre-release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add metadata to tidy3d components
3 participants