accounting txt file not updated when corrupt VMs expire #49

Closed
timf opened this Issue May 12, 2011 · 1 comment

Tim Freeman
Collaborator
timf commented May 12, 2011

The accounting file is out of sync with the system. When corrupt VMs expire, there is no corresponding REMOVED line. Reported by John Ouellette, thank you.

Patrick Armstrong
Collaborator

After looking through John's logs, here's what I think happened:

A VM starts:

2011-05-06 15:49:12,139 INFO  defaults.CreationManagerImpl
[ServiceThread-24,successPrint:1432] [NIMBUS-EVENT][id-12958]:

WORKSPACE INSTANCE CREATED:
    - Name: 'http://wst4'
    - Start time:                May 6, 2011 3:49:12 PM
    - Shutdown time:             May 8, 2011 3:49:12 PM
    - Resource termination time: May 8, 2011 3:51:12 PM
    - Creator: /C=CA/O=Grid/OU=hia.nrc.ca/CN=John Ouellette
    - ID: 12958, VMM: proc5-28.nope

Then, at 2011-05-06 16:22:34,791, a nimbus-full-reset happens.

Next, Nimbus gets a notification from the worker node about that VM,
but since the service has been reset, it no longer knows anything
about it:

2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958

So you never see a corresponding REMOVED entry in your log.
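
To check a service log for other VMs hit by the same sequence (created, then reported as an unknown id after a reset), something like the following might help. It is only a sketch: the regexes are modelled on the two excerpts quoted above and assume the real log uses the same wording, and `orphaned_ids` and `SAMPLE` are made-up names, not part of Nimbus.

```python
import re

# Patterns are based only on the two log excerpts quoted in this issue;
# the actual log layout may differ (assumption).
CREATED_ID = re.compile(r"WORKSPACE INSTANCE CREATED:.*?- ID:\s*(\d+)", re.DOTALL)
UNKNOWN_ID = re.compile(r"notification\s+about\s+unknown\s+id\s+(\d+)")


def orphaned_ids(log_text):
    """Return IDs that were created and later reported as unknown."""
    created = set(CREATED_ID.findall(log_text))
    unknown = set(UNKNOWN_ID.findall(log_text))
    # IDs in both sets match the scenario described in this comment.
    return sorted(created & unknown, key=int)


# Sample text copied from the excerpts in this issue.
SAMPLE = """2011-05-06 15:49:12,139 INFO  defaults.CreationManagerImpl
[ServiceThread-24,successPrint:1432] [NIMBUS-EVENT][id-12958]:

WORKSPACE INSTANCE CREATED:
    - Name: 'http://wst4'
    - ID: 12958, VMM: proc5-28.nope

2011-05-06 16:22:40,258 WARN  site.NotificationPoll
[Timer-1,oneNotification:113] received workspace-control notification
about unknown id 12958
"""

print(orphaned_ids(SAMPLE))  # prints ['12958']
```

Any ID this reports would be a candidate for a missing REMOVED line, though it does not read the accounting file itself.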
David LaBissoniere labisso closed this June 09, 2011