accounting txt file not updated when corrupt VMs expire #49

timf opened this Issue May 12, 2011 · 1 comment

Tim Freeman
timf commented May 12, 2011

The accounting file is out of sync with the system. When corrupt VMs expire, no corresponding REMOVED line is written. Reported by John Ouellette, thank you.

Patrick Armstrong

After looking through John's logs, here's what I think happened:

A VM starts:

2011-05-06 15:49:12,139 INFO  defaults.CreationManagerImpl
[ServiceThread-24,successPrint:1432] [NIMBUS-EVENT][id-12958]:

    - Name: 'http://wst4'
    - Start time:                May 6, 2011 3:49:12 PM
    - Shutdown time:             May 8, 2011 3:49:12 PM
    - Resource termination time: May 8, 2011 3:51:12 PM
    - Creator: /C=CA/O=Grid/ Ouellette
    - ID: 12958, VMM: proc5-28.nope

Then, at 2011-05-06 16:22:34,791, a nimbus-full-reset happens.

Next, Nimbus gets a notification from the worker node about that VM, but since the service has been reset, it no longer knows anything about it:

2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958

So the accounting file never gets a REMOVED entry for that VM.
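
To check whether other VMs ended up in the same state, something like the sketch below could scan the service log for that warning. It is only a sketch: the function name `unknown_ids` is made up for illustration, and it assumes the warning text matches the line quoted above. Each id it returns would be a candidate for a missing REMOVED line in the accounting file.

```python
import re

# Pattern taken from the WARN line quoted above. \s+ is used between words
# because the message may be wrapped across lines when pasted.
UNKNOWN_ID = re.compile(r"notification\s+about\s+unknown\s+id\s+(\d+)")


def unknown_ids(log_text):
    """Return the sorted, de-duplicated VM ids the service reported as unknown."""
    return sorted({int(m.group(1)) for m in UNKNOWN_ID.finditer(log_text)})


# The warning from this issue, used as sample input.
sample = """2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958"""

print(unknown_ids(sample))  # prints [12958]
```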
David LaBissoniere labisso closed this June 09, 2011