Failure to acquire cache log can still crash simulation #1335

Closed
jgosmann opened this issue Jul 12, 2017 · 0 comments · Fixed by #1336
@jgosmann
Collaborator

This happens because the shrink method does not catch the timeout exception that can be raised when the lock on the cache index cannot be acquired.

Corresponding stack trace:

----------------------------------------------------------------------
Job started (Wed, 12 Jul 2017 16:24:54, EDT)
----------------------------------------------------------------------
running IMemTrial#20170712-162501-4d0998e6
Traceback (most recent call last):
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/utils/lock.py", line 21, in acquire
    self.filename, os.O_CREAT | os.O_RDWR | os.O_EXCL)
FileExistsError: [Errno 17] File exists: '/scratch/jgosmann/nengo-decoder-cache/index.lock'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/spool/slurmd/job22941/slurm_script", line 23, in <module>
    Worker(store=task.store).start(execute, sys.argv[1], sys.argv[2])
  File "/home/jgosmann/Documents/projects/psyrun/psyrun/backend/distribute.py", line 354, in start
    return_data=False)
  File "/home/jgosmann/Documents/projects/psyrun/psyrun/mapper.py", line 98, in map_pspace_hdd_backed
    store.append(filename, dict_concat((get_result(fn, p),)))
  File "/home/jgosmann/Documents/projects/psyrun/psyrun/mapper.py", line 36, in get_result
    result.update(fn(**params))
  File "/home/jgosmann/Documents/projects/imem/psy-tasks/task_tcm_immediate.py", line 39, in execute
    return IMemTrial().run(**kwargs)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/pytry/trial.py", line 87, in run
    result = self.execute_trial(p)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/pytry/nengo.py", line 39, in execute_trial
    self.sim = Simulator(model, dt=p.dt)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/simulator.py", line 153, in __init__
    self.model.build(network, progress_bar=self.progress_bar)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/builder/builder.py", line 121, in build
    built = Builder.build(self, obj, *args, **kwargs)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/builder/builder.py", line 216, in build
    return cls.builders[obj_cls](model, obj, *args, **kwargs)
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/builder/network.py", line 122, in build_network
    model.decoder_cache.shrink()
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/cache.py", line 609, in shrink
    with self._index:
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/cache.py", line 355, in __enter__
    with self._lock:
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/utils/lock.py", line 44, in __enter__
    self.acquire()
  File "/home/jgosmann/.virtualenvs/graham-imem/lib/python3.5/site-packages/nengo/utils/lock.py", line 29, in acquire
    filename=self.filename))
nengo.exceptions.TimeoutError: Could not acquire lock '/scratch/jgosmann/nengo-decoder-cache/index.lock'.
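
For illustration, here is a minimal, self-contained sketch of the kind of handling the shrink step is missing. It is not Nengo's actual code: `FileLock`, `LockTimeoutError`, and the `shrink(lock, do_shrink)` signature are hypothetical stand-ins, and the only detail taken from the traceback is that the lock file is created with `os.O_CREAT | os.O_RDWR | os.O_EXCL`. The idea is that a failure to acquire the lock leads to skipping the shrink with a warning rather than aborting the simulation.

```python
import os
import time
import warnings


class LockTimeoutError(Exception):
    """Hypothetical stand-in for the timeout error seen in the traceback."""


class FileLock:
    """Hypothetical lock based on exclusive creation of a lock file."""

    def __init__(self, filename, timeout=0.1, poll=0.01):
        self.filename = filename
        self.timeout = timeout  # seconds to keep retrying (assumed value)
        self.poll = poll        # seconds between attempts (assumed value)
        self._fd = None

    def acquire(self):
        deadline = time.monotonic() + self.timeout
        while True:
            try:
                # O_EXCL makes creation fail if the lock file already exists.
                self._fd = os.open(
                    self.filename, os.O_CREAT | os.O_RDWR | os.O_EXCL)
                return
            except FileExistsError:
                if time.monotonic() >= deadline:
                    raise LockTimeoutError(
                        "Could not acquire lock '%s'." % self.filename)
                time.sleep(self.poll)

    def release(self):
        if self._fd is not None:
            os.close(self._fd)
            os.remove(self.filename)
            self._fd = None

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.release()


def shrink(lock, do_shrink):
    """Run do_shrink() under the lock; skip instead of crashing on timeout."""
    try:
        with lock:
            do_shrink()
    except LockTimeoutError:
        warnings.warn("Decoder cache could not be shrunk: lock unavailable.")
        return False
    return True
```

Shrinking the cache is housekeeping, so under this assumption skipping it when another process holds the lock seems preferable to letting the exception propagate out of the build.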
@jgosmann jgosmann added the bug label Jul 12, 2017
@jgosmann jgosmann self-assigned this Jul 12, 2017
jgosmann added a commit that referenced this issue Jul 13, 2017
because of an unavailable lock.

Fixes #1335.
@jgosmann jgosmann removed their assignment Jul 13, 2017
tbekolay pushed a commit that referenced this issue Jul 19, 2017
because of an unavailable lock.

Fixes #1335.