Fix memory leaks #312

Merged
merged 1 commit into from Nov 29, 2021


hunse
Collaborator

@hunse hunse commented Feb 12, 2021

To address #311.

TODO:

  • Look at Probe.target to see if there's a similar memory leak there.
  • Find the memory leak when running (it's the biggest).
  • Look for more memory leaks when run isn't called (there's still one, but it's much smaller).

Test script:

This can also be run with run_steps commented out to test for builder memory leaks.

import gc

import nengo
import nengo_loihi
import numpy as np

from guppy import hpy

h = hpy()


class NewClass:
    def __init__(self):
        self.input_size = 10
        self.n_neurons = 500
        self.initialize_nengo()

    def initialize_nengo(self):
        network = nengo.Network()
        with network:
            def input_func(t):
                return np.ones(10)

            def output_func(t, x):
                self.output = x

            input_layer = nengo.Node(output=input_func, size_in=0, size_out=self.input_size)
            ensemble = nengo.Ensemble(n_neurons=self.n_neurons, dimensions=1)
            output_layer = nengo.Node(output=output_func, size_in=self.n_neurons, size_out=0)

            conn_in = nengo.Connection(input_layer, ensemble.neurons, transform=np.ones((self.n_neurons, self.input_size)))
            conn_out = nengo.Connection(ensemble.neurons, output_layer)

        self.network = network

    def run(self, steps, num_resets):
        print(h.heap())

        for i in range(num_resets):
            with nengo_loihi.Simulator(self.network, precompute=True) as sim:
            # with nengo.Simulator(self.network, progress_bar=False) as sim:
                sim.run_steps(steps)
                # pass

            del sim

            print('finished iteration:', i)
            if i % 25 == 0:
                gc.collect()
                print(h.heap())

steps = 1000
num_resets = 101

nengo_class = NewClass()
nengo_class.run(steps, num_resets)

@kshivvy

kshivvy commented Apr 22, 2021

What is the process to have this merged into the master branch and reflected in the next release? For now we are using the fix-memory-leak branch in our code.

@hunse
Collaborator Author

hunse commented Apr 27, 2021

Hi @kshivvy. We're planning on doing a release for the new version of NxSDK within the next month or so. We'll merge this into master and release it as part of that release.

@hunse
Collaborator Author

hunse commented Nov 18, 2021

Ok, fixed one last leak with the nodes in nengo_loihi.builder.inputs not being garbage collected because they were using self.update as their output function, creating a circular reference. This was a pretty significant leak, and I'm not seeing any more leaks now, so this is ready to go.
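To illustrate the kind of cycle described above, here is a minimal sketch that does not use nengo_loihi at all. `FakeNode`, `CyclicInput`, `Step`, `AcyclicInput` and `freed_without_gc` are hypothetical names made up for this example, not classes from the library. It assumes CPython, where an object with no remaining references is freed immediately by reference counting, while an object that is part of a reference cycle has to wait for the cycle collector.

```python
import gc
import weakref


class FakeNode:
    """Hypothetical stand-in for a node that stores an output callable."""

    def __init__(self, output):
        self.output = output


class CyclicInput:
    """Owns a node whose output is a bound method of the owner.

    The bound method holds a reference to `self`, so
    owner -> node -> bound method -> owner forms a cycle.
    """

    def __init__(self):
        self.node = FakeNode(output=self.update)

    def update(self, t):
        return t


class Step:
    """Standalone callable that holds no reference back to its owner."""

    def __call__(self, t):
        return t


class AcyclicInput:
    """Owns a node whose output is a separate step object (no cycle)."""

    def __init__(self):
        self.node = FakeNode(output=Step())


def freed_without_gc(cls):
    """Return True if an instance is freed by reference counting alone."""
    gc.collect()
    gc.disable()  # keep the cycle collector from running during the check
    try:
        obj = cls()
        ref = weakref.ref(obj)
        del obj
        return ref() is None
    finally:
        gc.enable()
        gc.collect()


print(freed_without_gc(CyclicInput))   # False: the cycle keeps it alive
print(freed_without_gc(AcyclicInput))  # True: freed as soon as it is deleted
```

The second variant mirrors the idea of moving the update logic into its own step class, so the node no longer points back at the object that owns it.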

@hunse
Collaborator Author

hunse commented Nov 18, 2021

For the record, here's the script that I was using to do the memory profiling:

import gc
import weakref

import nengo
import numpy as np

import nengo_loihi

use_tracemalloc = False
# use_tracemalloc = True

if use_tracemalloc:
    import tracemalloc

    # Keep up to 25 frames per traceback. Call start() only once:
    # it does nothing if tracing is already enabled.
    tracemalloc.start(25)

else:
    from guppy import hpy

    h = hpy()


def snapshot():
    if use_tracemalloc:
        return tracemalloc.take_snapshot()
    else:
        return h.heap()


def print_snapshot(snap):
    if use_tracemalloc:
        if (
            isinstance(snap, list)
            and len(snap) > 0
            and isinstance(snap[0], tracemalloc.StatisticDiff)
        ):
            for stat in snap[:20]:
                print(stat)
                # print("%s memory blocks: %.1f KiB" % (stat.count, stat.size / 1024))
                for i, line in enumerate(stat.traceback.format()):
                    print(line)
                    if i > 5:
                        break

        else:
            print(snap)
    else:
        print(snap)


def snapshot_diff(snap0, snap1):
    if use_tracemalloc:
        return snap1.compare_to(snap0, "traceback")
    else:
        return snap1 - snap0


class NewClass:
    def __init__(self):
        self.input_size = 10
        self.n_neurons = 500
        self.initialize_nengo()

    def initialize_nengo(self):
        network = nengo.Network()
        # with network:
        #     ens = nengo.Ensemble(n_neurons=self.n_neurons, dimensions=1, label="a")
        #     probe = nengo.Probe(ens)

        with network:

            def input_func(t):
                return np.ones(10)

            def output_func(t, x):
                self.output = x

            input_layer = nengo.Node(
                output=input_func, size_in=0, size_out=self.input_size
            )
            ensemble = nengo.Ensemble(n_neurons=self.n_neurons, dimensions=1)
            output_layer = nengo.Node(
                output=output_func, size_in=self.n_neurons, size_out=0
            )

            conn_in = nengo.Connection(
                input_layer,
                ensemble.neurons,
                transform=np.ones((self.n_neurons, self.input_size)),
            )
            conn_out = nengo.Connection(ensemble.neurons, output_layer)

        self.network = network

    def run(self, steps, num_resets):
        snap_pre = snapshot()
        print_snapshot(snap_pre)

        snap0 = None
        snapi = None
        for i in range(num_resets):
            with nengo_loihi.Simulator(self.network, precompute=True) as sim:
                sim.run_steps(steps)

            del sim

            if i % 25 == 0 or i == num_resets - 1:
                print("finished iteration:", i)
                gc.collect()

                snapi = snapshot()
                # print_snapshot(snapi)
                if i == 0:
                    snap0 = snapi

                snapd = snapshot_diff(snap_pre, snapi)
                print_snapshot(snapd)

        if snapi is not None and snap0 is not None and snapi is not snap0:
            print("Dynamic allocation (diff between first reset and n-th reset)")
            snapd = snapshot_diff(snap0, snapi)
            print_snapshot(snapd)
            # import pdb; pdb.set_trace()


# steps = 10
# steps = 100
steps = 1000

# num_resets = 101
# num_resets = 51
num_resets = 26
# num_resets = 10
# num_resets = 3
# num_resets = 1

nengo_class = NewClass()
nengo_class.run(steps, num_resets)

@tbekolay tbekolay force-pushed the fix-memory-leak branch 3 times, most recently from da23679 to 3a21e72 Compare November 26, 2021 21:37
@hunse hunse force-pushed the fix-memory-leak branch 2 times, most recently from f05831e to 8ecf6a1 Compare November 26, 2021 22:38
Member

@tbekolay tbekolay left a comment


All looks good!

- Use processes and step classes instead of member function
  to avoid cyclic reference. This fixes a memory leak and
  is cleaner on the builder/inputs.py side, but it requires
  a bit of hackery inside the Simulator because getting access
  to the contained step function requires introspecting on
  closure variables. Still, it works, and it means that multiple
  simulators can be built from the same model, which wasn't true
  before.
- Store spike_targets on Model instead of on Node.

Co-authored-by: Eric Hunsberger <eric.hunsberger@appliedbrainresearch.com>
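
As a rough illustration of what "introspecting on closure variables" can look like in plain Python, here is a sketch with made-up names (`CountingStep`, `make_wrapped_step`, `find_closure_value`); it is not the code used inside the Simulator. It relies only on the standard `__closure__` attribute of function objects and the `cell_contents` attribute of closure cells.

```python
class CountingStep:
    """Hypothetical step class: a callable object that keeps its own state."""

    def __init__(self):
        self.calls = 0

    def __call__(self, t):
        self.calls += 1
        return t * 2


def make_wrapped_step(step):
    """Return a plain function that closes over the step object."""

    def wrapped(t):
        return step(t)

    return wrapped


def find_closure_value(func, kind):
    """Return the first closure variable of `func` that is an instance of `kind`."""
    for cell in func.__closure__ or ():
        try:
            value = cell.cell_contents
        except ValueError:  # the cell exists but has not been filled yet
            continue
        if isinstance(value, kind):
            return value
    return None


wrapped = make_wrapped_step(CountingStep())
wrapped(1.0)

step = find_closure_value(wrapped, CountingStep)
print(step.calls)  # 1
```

The trade-off is the one the commit message hints at: the step object stays free of references back to its owner, but whoever needs the step later has to dig it out of the wrapping function.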

3 participants