Issues with Docker Redeploys #76

lagalbra · 2017-09-29T15:17:55Z

Re-raised from microsoft/OMS-Agent-for-Linux#579

Really hoping someone can point me in the right direction here. Every time we redeploy our Docker instances, our SCX log (/var/opt/microsoft/scx/log/scx.log) begins to fill very rapidly with the following messages:

2017-09-29T15:02:39,523Z Error [scx.core.common.pal.system.disk.statisticallogicaldiskinstance:############] statvfs() failed for /var/lib/docker/overlay/######################/merged; errno = 2

Running systemctl restart omsagent##### seems to take care of this, but we were expecting the agent to notice when a container went away and to stop trying to stat the directory it used to be mounted on.
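For readers unfamiliar with the error: errno = 2 is ENOENT, which is what statvfs() reports when the path no longer exists. The sketch below is only an illustration of that behavior, not the agent's code. It uses a temporary directory as a stand-in for a removed container's merged directory, and the helper names are hypothetical.

```python
import errno
import os
import tempfile


def statvfs_errno(path):
    """Return 0 if statvfs() succeeds on path, otherwise the errno it failed with."""
    try:
        os.statvfs(path)
        return 0
    except OSError as exc:
        return exc.errno


def demo_removed_directory():
    """Stat a directory before and after removing it (stand-in for a removed mount)."""
    path = tempfile.mkdtemp()
    before = statvfs_errno(path)   # directory exists, so the call succeeds
    os.rmdir(path)
    after = statvfs_errno(path)    # directory is gone, so the call fails
    return before, after


if __name__ == "__main__":
    before, after = demo_removed_directory()
    print("before removal:", before)
    print("after removal:", after, errno.errorcode.get(after))
```

Any monitor that keeps a cached list of mount points and stats them on a timer will hit this failure repeatedly until the cached entry is dropped, which matches the log growth described above.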

So far we have tried removing and reinstalling the OMS bundle. Is there something else we are doing wrong here?

@kevi5702

lagalbra · 2017-09-29T15:18:26Z

@samisms @keikhara How do you currently handle Docker instances being removed and added to the host which the OMSAgent is installed on?

kevi5702 · 2017-09-29T15:19:48Z

Thanks for moving this, @lagalbra. I was under the impression this repo was only for instances where Docker was deployed inside a container.

For clarification, this is the latest OMS Agent deployed on a CentOS 7 VM on Azure. Things work smoothly until we redeploy our Docker instances on the VM; then we start seeing the errors above, which look like the agent is trying to stat the old bind mount locations that no longer exist.

I was curious whether we set something up incorrectly. I know we can exclude /var/lib/docker/ from filesystem monitoring, but I was expecting the agent to be a bit more intelligent when containers are removed. Please let me know if something is set up incorrectly or if I need to provide more information!

kevi5702 · 2017-10-02T17:48:09Z

Digging around, this error seems to match the code here in the PAL software:

https://github.com/Microsoft/pal/blob/master/source/code/scxsystemlib/disk/statisticallogicaldiskinstance.cpp#L269

I'm wondering if overlay needs to be added to the excludes somewhere. However, manually creating a basic overlay mount doesn't produce the statvfs errors when it's unmounted, only the warning about overlay not being recognized.

I was able to reproduce this on a new CentOS image with a fresh OMS deploy, just by running a basic hello-world container.

Restarting omid.service appears to make this go away, so I'm not sure whether something needs to be notified to update this when a container is removed.

This only seems to trigger when logical disk performance counters are enabled and only when the file system is a Docker overlay FS mount.
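To see which mount points are producing the errors, and how often, the log can be summarized with a short script. This is a rough sketch that assumes the message format quoted earlier in the thread ("statvfs() failed for <path>; errno = <n>"). The function name and the sample path are made up for illustration.

```python
import re
from collections import Counter

# Matches the message text quoted in this issue; the format is assumed
# from that single example line.
STATVFS_FAILURE = re.compile(
    r"statvfs\(\) failed for (?P<path>.+?); errno = (?P<errno>\d+)"
)


def count_statvfs_failures(lines):
    """Count statvfs() failures per (path, errno) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = STATVFS_FAILURE.search(line)
        if match:
            counts[(match.group("path"), int(match.group("errno")))] += 1
    return counts


if __name__ == "__main__":
    # Log location as given in the original report.
    with open("/var/opt/microsoft/scx/log/scx.log", errors="replace") as log:
        for (path, err), n in count_statvfs_failures(log).most_common():
            print(n, err, path)
```

If every reported path sits under /var/lib/docker/overlay and ends in /merged, that is consistent with the observation that only Docker overlay mounts trigger the problem.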

kevi5702 · 2017-10-02T19:28:11Z

Running a test based on the old removal tasks for disks, I was able to clean this up without restarting omid.service:

[root@dockercentos log]# /opt/omi/bin/omicli iv root/scx { SCX_FileSystem } RemoveByName { Name /var/lib/docker/overlay/######/merged }
instance of SCX_FileSystem
{
    Caption=File system information
    Description=Information about a logical unit of secondary storage
    [Key] Name=/var/lib/docker/overlay/######/merged
    [Key] CSCreationClassName=SCX_ComputerSystem
    [Key] CSName=dockercentos
    [Key] CreationClassName=SCX_FileSystem
    Root=/var/lib/docker/overlay/######/merged
    BlockSize=0
    FileSystemSize=0
    AvailableSpace=0
    ReadOnly=false
    EncryptionMethod=Unknown
    CompressionMethod=Unknown
    CaseSensitive=true
    CasePreserved=true
    MaxFileNameLength=0
    FileSystemType=overlay
    PersistenceType=0
    IsOnline=false
}
instance of RemoveByName
{
    ReturnValue=true
}
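If several stale mounts pile up after a redeploy, the same call could be scripted. The sketch below only builds the argument list for the omicli invocation shown above and filters out paths that still exist. It does not run anything. The omicli location and the RemoveByName syntax are copied from the command in this comment; the function names are hypothetical, and whether this is a supported workaround is not confirmed anywhere in this thread.

```python
import os

# Path taken from the command shown in this comment; adjust if OMI is installed elsewhere.
OMICLI = "/opt/omi/bin/omicli"


def remove_by_name_argv(mount_path, omicli=OMICLI):
    """Build the argv for: omicli iv root/scx { SCX_FileSystem } RemoveByName { Name <path> }."""
    return [
        omicli, "iv", "root/scx",
        "{", "SCX_FileSystem", "}",
        "RemoveByName",
        "{", "Name", mount_path, "}",
    ]


def commands_for_stale_mounts(mount_paths):
    """Return one argv per path that no longer exists on disk."""
    return [remove_by_name_argv(p) for p in mount_paths if not os.path.exists(p)]
```

Each returned list can be passed to subprocess.run as-is, which avoids shell quoting issues with the literal brace arguments.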

samisms · 2017-10-03T21:30:29Z

Raised microsoft/SCXcore#89 with the SCX provider. Closing this here.

lagalbra mentioned this issue Sep 29, 2017

Issues with Docker Redploys microsoft/OMS-Agent-for-Linux#579

Closed

samisms mentioned this issue Oct 3, 2017

Issues with Docker Redeploys microsoft/SCXcore#89

Open

samisms closed this as completed Oct 3, 2017
