Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: don't copy unit registry if it is already the same thing #3180

Merged
merged 1 commit into from
May 19, 2021

Conversation

Xarthisius
Copy link
Member

I'll admit: I don't know what I'm doing. If this gets merged in, there is a chance some unnamed PhD student will probably lose a couple of weeks trying to debug a subtle issue of nonconforming units somewhere affecting their results. They'll git bisect and then git blame and learn my name. They won't know me, but they'll hate me. With passion.
However, on a slim chance that what I'm suggesting doesn't break the Universe horribly, let's run a full testsuite on this PR. Afterwards, I'll provide a concrete example how this benefits a specific use case.

@Xarthisius Xarthisius added enhancement Making something better performance labels Apr 9, 2021
@matthewturk
Copy link
Member

noidea

@Xarthisius
Copy link
Member Author

It turns out this change doesn't do much. It's one of those cases where you cannot trust a profiler (whether it's pyinstrument or line_profiler)

(main) $ kernprof -lv doc/source/cookbook/hse_field.py
Total time: 43.1662 s
File: /home/xarth/codes/xarthisius/yt/yt/data_objects/construction_data_containers.py
Function: _sanitize_edge at line 754
Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   754                                               @profile
   755                                               def _sanitize_edge(self, edge):
   756      1182       3335.0      2.8      0.0          if not is_sequence(edge):
   757                                                       edge = [edge] * len(self.ds.domain_left_edge)
   758      1182      31009.0     26.2      0.1          if len(edge) != len(self.ds.domain_left_edge):
   759                                                       raise RuntimeError(
   760                                                           "Length of edges must match the dimensionality of the " "dataset"
   761                                                       )
   762      1182       1004.0      0.8      0.0          if hasattr(edge, "units"):
   763      1182   43113203.0  36474.8     99.9              edge_units = edge.units.copy()
   764      1182       4201.0      3.6      0.0              edge_units.registry = self.ds.unit_registry
   765                                                   else:
   766                                                       edge_units = "code_length"
   767      1182      13411.0     11.3      0.0          return self.ds.arr(edge, edge_units, dtype="float64")

(this PR) $ kernprof -lv doc/source/cookbook/hse_field.py
Total time: 0.034908 s
File: /home/xarth/codes/xarthisius/yt/yt/data_objects/construction_data_containers.py
Function: _sanitize_edge at line 754

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
   754                                               @profile
   755                                               def _sanitize_edge(self, edge):
   756      1182       3013.0      2.5      8.6          if not is_sequence(edge):
   757                                                       edge = [edge] * len(self.ds.domain_left_edge)
   758      1182      29481.0     24.9     84.5          if len(edge) != len(self.ds.domain_left_edge):
   759                                                       raise RuntimeError(
   760                                                           "Length of edges must match the dimensionality of the " "dataset"
   761                                                       )
   762      1182       1028.0      0.9      2.9          if hasattr(edge, "units"):
   763      1182        959.0      0.8      2.7              if edge.units.registry is self.ds.unit_registry:
   764      1182        427.0      0.4      1.2                  return edge
   765                                                       edge_units = edge.units.copy()
   766                                                       edge_units.registry = self.ds.unit_registry
   767                                                   else:
   768                                                       edge_units = "code_length"
   769                                                   return self.ds.arr(edge, edge_units, dtype="float64")

I really wish I could say I shaved 40s off, but:

(master) $ time python doc/source/cookbook/hse_field.py
real	0m24.542s
user	0m24.367s
sys	0m0.176s

(this PR) $ time python doc/source/cookbook/hse_field.py
real	0m24.794s
user	0m24.782s
sys	0m0.955s

While I still believe this change is "the right thing to do"^{TM} it's suddenly way less exciting...

@neutrinoceros
Copy link
Member

@Xarthisius do you still think this is worth merging ?

@matthewturk
Copy link
Member

I think this is still worth merging.

@neutrinoceros
Copy link
Member

sounds like we're at two implicit approvals. I'll formalise mine

@matthewturk matthewturk merged commit 202fca8 into yt-project:main May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Making something better performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants