The accounting file is out of sync with the system. When corrupt VMs expire, there is no corresponding REMOVED line. Reported by John Ouellette, thank you.
After looking through John's logs, here's what I think happened:
A VM starts:
2011-05-06 15:49:12,139 INFO defaults.CreationManagerImpl
WORKSPACE INSTANCE CREATED:
- Name: 'http://wst4'
- Start time: May 6, 2011 3:49:12 PM
- Shutdown time: May 8, 2011 3:49:12 PM
- Resource termination time: May 8, 2011 3:51:12 PM
- Creator: /C=CA/O=Grid/OU=hia.nrc.ca/CN=John Ouellette
- ID: 12958, VMM: proc5-28.nope
Then, at 2011-05-06 16:22:34,791, a nimbus-full-reset happens.
After that, we see Nimbus get a notification from the worker node about that VM,
but since the service has been reset, it no longer knows anything about it:
2011-05-06 16:22:40,258 WARN site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958
So the service never records the VM's removal, and you never see a REMOVED entry in the accounting file.
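
To check whether other VMs were affected the same way, something like the following could be run over the service log. This is only a rough sketch: the two regular expressions are guessed from the log excerpts quoted above (the "- ID: ..." line of a creation entry and the "unknown id ..." warning), and the function name find_orphaned_ids is made up for this example, not part of Nimbus.

```python
import re

# Assumed patterns, derived only from the two log excerpts in this report.
CREATED_ID = re.compile(r"-\s*ID:\s*(\d+)")
UNKNOWN_ID = re.compile(r"notification\s+about\s+unknown\s+id\s+(\d+)")


def find_orphaned_ids(log_text):
    """Return the sorted IDs that were created and later reported as unknown.

    An ID in this list belongs to a VM the service created but no longer
    tracked when the worker node sent a notification, so it is a candidate
    for a missing REMOVED line.
    """
    created = {int(i) for i in CREATED_ID.findall(log_text)}
    unknown = {int(i) for i in UNKNOWN_ID.findall(log_text)}
    return sorted(created & unknown)


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python find_orphans.py /path/to/service.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for vm_id in find_orphaned_ids(fh.read()):
            print(vm_id)
```

The whole file is read as one string so that a warning wrapped across several lines, as in the excerpt above, still matches. Any ID it prints would still need to be checked by hand against the accounting file.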