Memory leak in repeated Simulation create/run/destroy cycles (Prism geometry) #3192

@Luochenghuang

Summary

When running parameter sweeps that repeatedly create, run, and destroy mp.Simulation objects (e.g., sweeping over azimuthal mode numbers or dipole positions in cylindrical coordinates), process RSS grows monotonically and is never reclaimed — even after calling sim.reset_meep(), del sim, and gc.collect().

The leak has two independent root causes spanning two repositories:

  1. MEEP (simulation.py): reset_meep() doesn't release self.geps — accounts for ~70% of the leak.
  2. libctl (utils/geom.c): init_prism() leaks arrays when called more than once on the same prism object — accounts for ~30% of the leak.

Minimal reproducer

import gc, resource
import meep as mp

mp.verbosity(0)

def rss_mb():
    # ru_maxrss is the peak RSS (high-water mark), reported in KB on Linux
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024

geo = [mp.Prism(
    vertices=[mp.Vector3(0, 0, 0.2*j), mp.Vector3(1, 0, 0.2*j),
              mp.Vector3(1, 0, 0.2*j + 0.15), mp.Vector3(0, 0, 0.2*j + 0.15)],
    height=mp.inf, axis=mp.Vector3(0, 1, 0),
    material=mp.Medium(index=3.32),
) for j in range(5)]

rss0 = rss_mb()
for i in range(20):
    sim = mp.Simulation(
        resolution=200,
        cell_size=mp.Vector3(2, 0, 2),
        dimensions=mp.CYLINDRICAL,
        m=i % 3,
        boundary_layers=[mp.PML(0.3, direction=mp.R)],
        sources=[mp.Source(
            src=mp.GaussianSource(2.0, fwidth=0.4),
            component=mp.Er,
            center=mp.Vector3(0.5, 0, 0.5),
        )],
        geometry=geo,
        force_complex_fields=True,
    )
    sim.run(until=2)
    sim.reset_meep()
    del sim
    gc.collect()
    print(f"iter {i}: RSS = {rss_mb():.0f} MB  (delta = {rss_mb() - rss0:.0f} MB)")
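
Because ru_maxrss is a peak value, it can show growth but cannot show memory being returned. A minimal sketch of a current-RSS reader that could be swapped in for rss_mb() is given below. It assumes Linux with /proc mounted, and current_rss_mb is a hypothetical helper name, not part of MEEP:

```python
def current_rss_mb():
    """Return the current (not peak) resident set size in MB.

    Assumption: Linux, where /proc/self/status contains a 'VmRSS: <n> kB' line.
    """
    with open("/proc/self/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024  # kB -> MB
    raise RuntimeError("VmRSS not found in /proc/self/status")


print(f"current RSS = {current_rss_mb():.0f} MB")
```

If the current RSS keeps climbing across iterations as well, the growth is not an artifact of the peak-based measurement.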

Observed behavior

RSS grows with every iteration and never stabilizes:

iter 0:  RSS = 142 MB (delta = 12 MB)
iter 5:  RSS = 148 MB (delta = 18 MB)
iter 10: RSS = 155 MB (delta = 25 MB)
iter 19: RSS = 162 MB (delta = 32 MB)

Python-side memory (tracked via tracemalloc) stays flat, so the leak is in the native C/C++ layer (MEEP and libctl) rather than in Python objects.
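
One way to make this comparison is sketched below: record the Python heap size reported by tracemalloc next to the process-level peak RSS after each iteration. The function name python_vs_process_memory and the stand-in workload are assumptions for illustration. In the reproducer, the workload would build, run, and reset a Simulation.

```python
import resource
import tracemalloc


def python_vs_process_memory(workload, iterations=5):
    """Run workload() repeatedly; return (python_heap_mb, peak_rss_mb) per iteration."""
    tracemalloc.start()
    samples = []
    try:
        for _ in range(iterations):
            workload()
            traced_bytes, _peak = tracemalloc.get_traced_memory()
            # ru_maxrss is the peak RSS, in KB on Linux
            peak_rss_kb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            samples.append((traced_bytes / 2**20, peak_rss_kb / 1024))
    finally:
        tracemalloc.stop()
    return samples


# Stand-in workload so the sketch runs without MEEP installed
for py_mb, rss in python_vs_process_memory(lambda: sum(range(10_000))):
    print(f"python heap = {py_mb:.2f} MB, peak RSS = {rss:.0f} MB")
```

A flat first column combined with a rising second column points at allocations that tracemalloc cannot see, i.e. native code.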

Expected behavior

RSS should remain roughly constant across iterations. reset_meep() should release all resources allocated during init_sim().

Root cause analysis

Bug 1: reset_meep() doesn't release self.geps (MEEP — ~70%)

reset_meep() clears self.fields and self.structure but does not clear self.geps — the geom_epsilon C++ object allocated by _set_materials() during init_sim(). This object holds the processed geometry tree, material data, and conductivity profiles. Each call to init_sim() allocates a new geps, but the old one is never freed by reset_meep().

Call chain that creates geps:

Simulation.init_sim()
  → Simulation._set_materials()
    → mp._set_materials(...)   # returns new geom_epsilon*
  → self.geps = ...            # stores it, but reset_meep() never clears it
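
The effect of an attribute that a reset method forgets to clear can be illustrated with a toy model. ToySim and NativeHandle below are hypothetical stand-ins, not MEEP classes, and the sketch only models Python references. It does not model how the Python wrapper owns the underlying native object.

```python
import gc
import weakref


class NativeHandle:
    """Stand-in for a wrapped native object such as geom_epsilon."""


class ToySim:
    def init_sim(self):
        self.structure = NativeHandle()
        self.geps = NativeHandle()

    def reset_without_fix(self):
        self.structure = None  # geps is left untouched

    def reset_with_fix(self):
        self.structure = None
        self.geps = None


def geps_survives(reset_name):
    """Return True if the geps handle is still alive after the given reset."""
    sim = ToySim()
    sim.init_sim()
    ref = weakref.ref(sim.geps)
    getattr(sim, reset_name)()
    gc.collect()
    return ref() is not None


print(geps_survives("reset_without_fix"))  # True
print(geps_survives("reset_with_fix"))     # False
```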

Bug 2: init_prism() leaks arrays on re-initialization (libctl — ~30%)

When MEEP constructs a geom_epsilon, it copies the geometry list via geometric_object_copy() → prism_copy(), which allocates 7 internal arrays per prism. Then it calls geom_fix_object_list() → geom_fix_object_ptr() → init_prism(), which unconditionally mallocs fresh arrays for 6 of those fields (the derived ones listed below), overwriting the pointers from prism_copy() without freeing them.

Leaked fields per prism per geom_epsilon construction:

  • vertices_p.items
  • vertices_top_p.items
  • top_polygon_diff_vectors_p.items
  • top_polygon_diff_vectors_scaled_p.items
  • vertices_top.items
  • workspace.items

Call chain:

geom_epsilon::geom_epsilon(...)
  → geometric_object_copy()     # prism_copy() allocates arrays
  → geom_fix_object_list()
    → geom_fix_object_ptr()
      → init_prism()            # malloc's new arrays, old pointers leaked

Confirmed via valgrind (--leak-check=full): 382 KB definitely lost in 3 iterations, all traced to init_prism() at geom.c lines 2678, 2720, 2802, 2809, 2816.
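
The overwrite pattern can be mimicked with a toy allocator that counts live allocations. The sketch below is a Python model, not libctl code: ToyHeap and the three functions are hypothetical stand-ins that only mirror the field names listed above.

```python
class ToyHeap:
    """Counts live allocations, standing in for malloc/free."""

    def __init__(self):
        self._next = 1
        self.live = set()

    def malloc(self):
        handle = self._next
        self._next += 1
        self.live.add(handle)
        return handle

    def free(self, handle):
        if handle is not None:  # mirrors free(NULL) being a no-op
            self.live.discard(handle)


DERIVED = ("vertices_p", "vertices_top_p", "top_polygon_diff_vectors_p",
           "top_polygon_diff_vectors_scaled_p", "vertices_top", "workspace")


def prism_copy(heap):
    """Allocate all 7 arrays: the vertex list plus the 6 derived fields."""
    prism = {"vertices": heap.malloc()}
    for name in DERIVED:
        prism[name] = heap.malloc()
    return prism


def init_prism(heap, prism, free_first):
    """Reallocate the derived fields, optionally freeing the old ones first."""
    for name in DERIVED:
        if free_first:
            heap.free(prism[name])
        prism[name] = heap.malloc()


def destroy(heap, prism):
    for handle in prism.values():
        heap.free(handle)


def leaked_after_cycle(free_first):
    heap = ToyHeap()
    prism = prism_copy(heap)
    init_prism(heap, prism, free_first)
    destroy(heap, prism)
    return len(heap.live)


print(leaked_after_cycle(free_first=False))  # 6
print(leaked_after_cycle(free_first=True))   # 0
```

In the model, the six arrays from prism_copy() become unreachable as soon as init_prism() overwrites their handles, which is why destroying the prism afterwards cannot release them.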

Key observations

  1. No leak without sim.run() — creating and destroying simulations without running them does not leak.
  2. No leak with empty geometry — the leak only manifests when geometry objects (especially Prism) are present.
  3. Leak grows with grid size:

     Resolution   Grid pixels   Leak/iter
     100          40,000        ~0.4 MB
     200          160,000       ~1.1 MB

  4. self.geps is the primary source (~70%): explicitly setting sim.geps = None before cleanup reduces the leak from ~1.7 to ~0.5 MB/iter.
  5. Residual ~30% is in libctl's init_prism: confirmed via valgrind; the leaked arrays are the ones allocated by prism_copy() and then overwritten by init_prism().
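
For a sense of scale, the grid-size measurements can be converted to leaked bytes per pixel per iteration. This is plain arithmetic on the figures reported above, assuming 1 MB = 2^20 bytes:

```python
MB = 1024 * 1024

# resolution -> (grid pixels, leak per iteration in MB), as reported above
measurements = {100: (40_000, 0.4), 200: (160_000, 1.1)}


def bytes_per_pixel(pixels, leak_mb):
    return leak_mb * MB / pixels


for res, (pixels, leak_mb) in measurements.items():
    print(f"resolution {res}: {bytes_per_pixel(pixels, leak_mb):.1f} bytes/pixel/iter")
# resolution 100: 10.5 bytes/pixel/iter
# resolution 200: 7.2 bytes/pixel/iter
```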

Proposed fixes

Fix 1: MEEP python/meep/simulation.py

 def reset_meep(self):
     self.fields = None
     self.structure = None
+    self.geps = None
     self.dft_objects = []
     self.num_chunks = self._num_chunks_original
     self.chunk_layout = self._chunk_layout_original
     self._is_initialized = False

Fix 2: libctl utils/geom.c

Free existing arrays in init_prism() before reallocating (safe because free(NULL) is a no-op):

 void init_prism(geometric_object *o) {
   prism *prsm = o->subclass.prism_data;
   ...
+  // Free any previously allocated arrays (init_prism may be called more than
+  // once on the same object, e.g. via geom_fix_object_list after prism_copy).
+  free(prsm->vertices_p.items);
+  free(prsm->vertices_top_p.items);
+  free(prsm->top_polygon_diff_vectors_p.items);
+  free(prsm->top_polygon_diff_vectors_scaled_p.items);
+  free(prsm->vertices_top.items);
+  free(prsm->workspace.items);
+
   // compute vertices in prism coordinate system
   prsm->vertices_p.num_items = num_vertices;
   prsm->vertices_p.items = (vector3 *)malloc(num_vertices * sizeof(vector3));

Zero-initialize the prism struct on first creation so the free() calls above are safe:

 geometric_object make_slanted_prism_with_center(...) {
   ...
   prism *prsm = o.subclass.prism_data = MALLOC1(prism);
   CHECK(prsm, "out of memory");
+  memset(prsm, 0, sizeof(prism));
   prsm->vertices.num_items = num_vertices;

Impact

Any workflow that runs many simulations in a loop (mode convergence sweeps, dipole position scans, optimization loops) accumulates leaked memory proportional to iterations × grid size.

Workaround

Until the fixes are merged, users can mitigate ~70% of the leak by manually clearing geps after each simulation:

sim.reset_meep()
sim.geps = None   # <-- workaround
del sim
gc.collect()
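
For sweeps with many call sites, the same steps could be wrapped in a small helper. This is a sketch: release_simulation is a hypothetical name, and it relies only on the reset_meep() method and geps attribute shown above.

```python
import gc


def release_simulation(sim):
    """Apply the workaround to an object exposing reset_meep() and a geps attribute."""
    sim.reset_meep()
    sim.geps = None  # workaround: drop the reference that reset_meep() leaves behind
    gc.collect()


# Usage inside a sweep loop (the caller still drops its own reference):
#     release_simulation(sim)
#     del sim
```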

Environment

  • MEEP built from source (commit 7e7b985, version 1.32.0-beta)
  • libctl 4.5.1
  • Python 3.11
  • Linux, single-process and MPI (leak occurs in both)
  • Cylindrical coordinates (dimensions=mp.CYLINDRICAL)
