accounting txt file not updated when corrupt VMs expire #49

timf opened this Issue May 12, 2011 · 1 comment


Nimbus member

The accounting file is out of sync with the system: when corrupt VMs expire, no corresponding REMOVED line is written. Reported by John Ouellette, thank you.

Nimbus member

After looking through John's logs, here's what I think happened:

A VM starts:

2011-05-06 15:49:12,139 INFO  defaults.CreationManagerImpl
[ServiceThread-24,successPrint:1432] [NIMBUS-EVENT][id-12958]:

    - Name: 'http://wst4'
    - Start time:                May 6, 2011 3:49:12 PM
    - Shutdown time:             May 8, 2011 3:49:12 PM
    - Resource termination time: May 8, 2011 3:51:12 PM
    - Creator: /C=CA/O=Grid/ Ouellette
    - ID: 12958, VMM: proc5-28.nope

Then, at 2011-05-06 16:22:34,791, a nimbus-full-reset happens.

Then we see Nimbus get a notification from the worker node about that VM,
but since the service has been reset, it no longer knows anything about it:

2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958

So you never see a REMOVED entry for that VM in your log.
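
To check a service log for other VMs caught by the same sequence, something like the following might help. It is only a rough sketch: the regular expressions are inferred from the two excerpts quoted above, the function name `orphaned_ids` is made up for illustration, and the real log layout may differ (for example, each entry is probably on one line rather than wrapped).

```python
import re

# Patterns inferred from the two excerpts above; they are assumptions about
# the log format, not a documented Nimbus format.
EVENT_ID = re.compile(r"\[NIMBUS-EVENT\]\[id-(\d+)\]")
UNKNOWN_ID = re.compile(r"notification\s+about\s+unknown\s+id\s+(\d+)")


def orphaned_ids(log_text):
    """Return ids that appear in a NIMBUS-EVENT entry and are also
    reported by a later notification as an unknown id."""
    seen = set(EVENT_ID.findall(log_text))
    unknown = set(UNKNOWN_ID.findall(log_text))
    return sorted(seen & unknown, key=int)


# Sample text copied from the excerpts in this issue.
SAMPLE_LOG = """\
2011-05-06 15:49:12,139 INFO  defaults.CreationManagerImpl
[ServiceThread-24,successPrint:1432] [NIMBUS-EVENT][id-12958]:
2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958
"""

print(orphaned_ids(SAMPLE_LOG))
```

Any id this returns was known to the service at some point but unknown by the time the worker node reported on it, so it is a candidate for a missing REMOVED line in the accounting file.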
labisso closed this Jun 9, 2011