Nightly "No such file or directory" from sdc-lastcomm #202

Closed
smokris opened this Issue Apr 18, 2013 · 13 comments

smokris commented Apr 18, 2013

Every night for the last week I've been getting emails like the following:

From: Super-User <root@my-smartos-server>
Subject: Cron <root@my-smartos-server> /smartdc/bin/sdc-lastcomm -R 30
Date: 2013.04.17 20:00:01 EDT
To: root@my-smartos-server

find: stat() error ./20130417183242.not_terminated.my-smartos-server: No such file or directory

There is a find invocation in /smartdc/bin/sdc-lastcomm but it doesn't seem to be doing anything unusual:

find . -mtime +$DAYS -exec rm -f "{}" \;

The only situation I can think of that would trigger that error is the not_terminated file being deleted after find has started enumerating files but before find has actually deleted it. But given that there are only about 30 files in /var/audit, I wouldn't expect much of a window for that to happen; certainly not enough for it to happen consistently every night for the last week.
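One detail worth checking against this theory: the file named in the error was created the same day, so `-mtime +$DAYS` with 30 days should never select it for deletion. A minimal sketch, using a throwaway directory and made-up file names (`myhost`, the year-2000 file, and the `list_expired` helper are assumptions for illustration, not part of sdc-lastcomm), shows which entries the expression matches:

```shell
#!/bin/sh
# Hypothetical sandbox: names mimic /var/audit, but nothing here touches it.

# Print the files under directory $1 whose mtime is more than $2 days old.
list_expired() {
    ( cd "$1" && find . -type f -mtime +"$2" )
}

sandbox=$(mktemp -d) || exit 1

# An old, already-terminated audit file (mtime forced to 2000-01-01 00:00).
touch -t 200001010000 "$sandbox/20000101000000.20000102000000.myhost"
# The currently open audit file, created just now.
touch "$sandbox/20130417183242.not_terminated.myhost"

expired=$(list_expired "$sandbox" 30)
echo "$expired"    # prints ./20000101000000.20000102000000.myhost

rm -rf "$sandbox"
```

If the same holds on an affected system, the message would come from find calling stat() on the entry while walking the directory (it has to, in order to evaluate `-mtime`), not from the `rm` step.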

Any ideas?

jjlawren commented Jan 2, 2014

I've noticed this happening to me, too. It looks like a race between find and the auditd service. auditd creates a file in /var/audit named <start_time>.<end_time>.<hostname>, with 'not_terminated' as the end time, and then renames it to use an actual timestamp while the find command is running. Not sure of the proper way to avoid this.
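The sequence described above can be replayed by hand. This is only a simulation under assumptions: it uses a temporary directory, a made-up hostname (`myhost`) and end timestamp, and a plain `mv` standing in for whatever auditd does internally. It is not how find itself is implemented; it just shows why a name taken from an earlier directory listing can be gone by the time it is checked.

```shell
#!/bin/sh
# Simulate: enumerate names, rename the open audit file, then check the
# names from the (now stale) listing.
simulate_race() {
    dir=$(mktemp -d) || return 1
    (
        cd "$dir" || exit 1
        touch 20130417183242.not_terminated.myhost

        # Step 1: take a directory listing, as a traversal would.
        names=$(ls)

        # Step 2: the file is renamed to carry a real end time
        # (hypothetical timestamp) before the listing is processed.
        mv 20130417183242.not_terminated.myhost \
           20130417183242.20130418000000.myhost

        # Step 3: look up each name from the stale listing.
        for n in $names; do
            if [ -e "$n" ]; then
                echo "present: $n"
            else
                echo "vanished: $n"
            fi
        done
    )
    rm -rf "$dir"
}

simulate_race    # prints vanished: 20130417183242.not_terminated.myhost
```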

Member

bahamat commented Mar 16, 2014

This looks like it happens because it runs at the exact same time that auditd rolls over.

Could this not be fixed by offsetting the cron job by 1 minute?
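As a rough sketch of what that offset could look like: the actual crontab entry is not shown in this thread, so the `0 0 * * *` schedule below is an assumption based on the mail timestamps, and `offset_lastcomm_minute` is a hypothetical helper. It rewrites only lines that start at minute 0 and mention sdc-lastcomm, moving them to minute 1.

```shell
#!/bin/sh
# Read crontab text on stdin; move sdc-lastcomm entries that fire at
# minute 0 to minute 1. All other lines pass through unchanged.
offset_lastcomm_minute() {
    sed 's|^0\([[:space:]].*sdc-lastcomm\)|1\1|'
}

# Assumed entry, for illustration only.
printf '%s\n' '0 0 * * * /smartdc/bin/sdc-lastcomm -R 30' |
    offset_lastcomm_minute
# prints 1 0 * * * /smartdc/bin/sdc-lastcomm -R 30
```

Whether an edited crontab persists, and whether one minute is enough to miss the rotation, would need checking on the system in question.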

badboy commented Mar 17, 2014

Any update on this? I still see this error on 20140111T020931Z

ccrusius commented Mar 27, 2014

Same here - system was ok for about 10 days and now I'm getting daily messages with this error.

jclulow added the scrub0 label May 19, 2014

Contributor

bahamas10 commented Oct 15, 2014

[root@datadyne ~]# uname -a
SunOS datadyne 5.11 joyent_20140919T024804Z i86pc i386 i86pc
[root@datadyne ~]# uptime
21:40:16    up 19 day(s),  1:32,  1 user,  load average: 0.08, 0.07, 0.07
[root@datadyne ~]# mail
From root@datadyne.rapture.com Sun Oct 12 00:00:01 2014
Date: Sun, 12 Oct 2014 00:00:01 GMT
From: Super-User <root@datadyne.rapture.com>
Message-Id: <201410120000.s9C001ss033778@datadyne.rapture.com>
To: root@datadyne.rapture.com
Subject: Cron <root@datadyne> /smartdc/bin/sdc-lastcomm -R 30
Content-Length: 88

find: stat() error ./20141011000000.not_terminated.datadyne: No such file or directory


? 

same here

alek-p commented Oct 21, 2014

This is a race between rm and auditd trying to rotate files as requested by "audit -n".
I've addressed it on my setup but I have a feeling the real fix will be to add "captive mode log rotate" to auditd. See alek-p@08407a6 patch for immediate relief.
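Separately from the patch referenced above (whose contents are not reproduced here), one way to quiet the nightly mail while leaving other errors visible is to filter only the rotation-related message out of find's stderr. This is a sketch, not the project's fix: `filter_rotation_noise` is a hypothetical helper, and the "Permission denied" line is made-up sample input.

```shell
#!/bin/sh
# Drop only "No such file or directory" complaints about files that
# were still named *.not_terminated.* when they were listed; pass
# everything else through.
filter_rotation_noise() {
    # grep -v exits non-zero when it prints nothing; that is not an error here.
    grep -v '\.not_terminated\..*: No such file or directory$' || true
}

# Sample input: the first line is from this thread, the second is invented.
printf '%s\n' \
    'find: stat() error ./20141011000000.not_terminated.datadyne: No such file or directory' \
    'find: cannot open ./secret: Permission denied' |
    filter_rotation_noise
# prints find: cannot open ./secret: Permission denied
```

In a cleanup script it could be wired in along the lines of `find . -mtime +"$DAYS" -exec rm -f {} \; 2>&1 >/dev/null | filter_rotation_noise >&2`, at the cost of losing find's exit status through the pipe.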

Contributor

lloydde commented Apr 29, 2015

[root@headnode (coal-1) ~]# uname -a
SunOS headnode 5.11 joyent_20150331T183020Z i86pc i386 i86pc
From root@headnode.example.com Sun Apr 12 03:00:00 2015
Date: Sun, 12 Apr 2015 03:00:00 GMT
From: Super-User <root@headnode.example.com>
Message-Id: <201504120300.t3C300HT038301@headnode.example.com>
To: root@headnode.example.com
Subject: Cron <root@headnode> /smartdc/bin/sdc-lastcomm -R 30
Content-Length: 88

find: stat() error ./20150412000055.not_terminated.headnode: No such file or directory

Contributor

lloydde commented Jun 29, 2015

Continue to hit this every few days on my single server testing env.

unclejack commented Jul 17, 2015

I'm seeing this as well.

Contributor

sjorge commented Jul 23, 2015

Been seeing this on and off, happening a lot the last few weeks.

farmergreg commented Sep 30, 2015

same here:

[root@smarts/]# uname -a
SunOS smarts 5.11 joyent_20150917T232817Z i86pc i386 i86pc

From root@smarts Wed Sep 30 00:00:00 2015
Date: Wed, 30 Sep 2015 00:00:00 GMT
From: Super-User <root@smarts >
Message-Id: <201509300000.t8U000gI060918@smarts >
To: root@smarts
Subject: Cron root@smarts /smartdc/bin/sdc-lastcomm -R 30
Content-Length: 85

find: stat() error ./20150929000000.not_terminated.smarts: No such file or directory

Contributor

bahamas10 commented Oct 5, 2015

bahamas10 closed this Oct 5, 2015

Contributor

lloydde commented Oct 5, 2015

@bahamas10 You are gold!
