network paths #4

Closed
secjunkie opened this issue Dec 3, 2021 · 10 comments

@secjunkie
Contributor

I was working on a case today and the script hung. After a bit of investigation, we figured out it was the get_executables function, which runs:
find / -type f -perm -o+rx -print0 | xargs -0 sha1sum
The reason behind the hang was that the system had a network share (fuse.sshfs) mounted, but outgoing/incoming traffic was heavily restricted because of the incident, so the mountpoint was effectively broken.

To avoid this in similar situations, we could do a quick fix of:
find / ! \( -fstype nfs -o -fstype sshfs -o -fstype fuse.sshfs \) -type f -perm -o+rx -print0 | xargs -0 sha1sum

Or we could explore a more inclusive solution that makes sure the mount points are alive before running the find (or runs it with parameters defining specific paths), if we really want ALL the bins from ALL the things.
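
One way the "make sure the mount points are alive" idea could look is a probe that gives up after a few seconds. This is only a sketch: the helper name `is_mount_alive` and the 5-second default are assumptions, and it relies on `timeout` and `stat` from GNU coreutils. A process stuck in uninterruptible sleep may not react even to SIGKILL, so the probe itself is not guaranteed to return on every kernel or filesystem.

```shell
# Hypothetical helper, not part of the script discussed in this issue.
# Returns 0 if stat on the path answers within the time limit,
# non-zero if it times out or the path cannot be stat'ed at all.
is_mount_alive() {
	# $1 = mountpoint path, $2 = optional timeout in seconds (default 5)
	timeout -s KILL "${2:-5}" stat -- "$1" >/dev/null 2>&1
}

# Example: probe two paths and report the outcome for each.
for mp in / /nonexistent-example; do
	if is_mount_alive "$mp" 3; then
		echo "$mp: alive"
	else
		echo "$mp: not responding or missing"
	fi
done
```

The probe runs in a child process on purpose, so that a hang hits the disposable `stat` rather than the collection script itself.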

@56616c6f72
Collaborator

56616c6f72 commented Dec 3, 2021

Hey man!

Thanks for reporting this back! :)

I'm thinking of the more inclusive solution, the one that makes sure the mount points are alive before running.

I think the solution would be something like the function below. It works on Ubuntu and I expect it to work on all GNU hosts. In theory, it should also work for the broken network directories.

Any chance you can test this?

get_executables(){ #Production

	#Hash executables on the root filesystem only. -xdev stops find from descending into other mounts
	find / -xdev -type f -perm -o+rx -print0 | xargs -0 sha1sum > "$OUTROOT/$OUTDIR/Misc/$OUTFILE-exec-perm-files.txt"

	#Hash executables in mountpoints if they are alive
	for i in $(df -h --output=target | tail -n +3); do
		if mountpoint -q "$i" &>/dev/null; then
			find "$i" -type f -perm -o+rx -print0 | xargs -0 sha1sum >> "$OUTROOT/$OUTDIR/Misc/$OUTFILE-exec-perm-files.txt"
		else
			echo "$i mountpoint was not alive!" >> "$OUTROOT/$OUTDIR/$OUTFILE-console-error-log.txt"
		fi
	done

}

@56616c6f72 56616c6f72 added bug Something isn't working good first issue Good for newcomers labels Dec 3, 2021
@secjunkie
Contributor Author

There is a bit to unpack, so I will try to keep this as short as possible.
At first glance the first find seems pointless, to be honest, because df will always include / in its output, so the for loop will produce those results anyway.

Some test results

Server (Debian): serves SSH, and firewalls with DROP after the ssh mount is established, to simulate the issue

Clients: Ubuntu / CentOS 7 (same as the client that reported the issue)

Ubuntu:
mountpoint hangs but fails in a few secs with Input/output error
no residual /proc/self/mountinfo path
no df path after fail
dmesg tells no story

CentOS:
mountpoint hangs the same way as find
the mount point is lost in df but it remains in /proc/self/mountinfo for a while (like 15mins or so) until find and mountpoint come back with "Input/output error"
dmesg tells the story:
[ 2279.400973] INFO: task mountpoint:4359 blocked for more than 120 seconds.
[ 2279.400977] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2279.400980] mountpoint D ffff8ebae17505e0 0 4359 3232 0x00000084
[ 2279.400984] Call Trace:
[ 2279.400993] [] schedule+0x29/0x70
[ 2279.401016] [] __fuse_request_send+0xf5/0x2e0 [fuse]
[ 2279.401021] [] ? wake_up_atomic_t+0x30/0x30
[ 2279.401028] [] fuse_request_send+0x12/0x20 [fuse]
[ 2279.401036] [] fuse_do_getattr+0x10a/0x330 [fuse]
[ 2279.401043] [] fuse_update_attributes+0x75/0x80 [fuse]
[ 2279.401050] [] fuse_getattr+0x40/0x60 [fuse]
[ 2279.401054] [] vfs_getattr+0x49/0x80
[ 2279.401058] [] vfs_fstatat+0x75/0xc0
[ 2279.401061] [] SYSC_newstat+0x2e/0x60
[ 2279.401066] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401069] [] ? system_call_after_swapgs+0x96/0x13a
[ 2279.401073] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401076] [] ? system_call_after_swapgs+0x96/0x13a
[ 2279.401080] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401083] [] ? system_call_after_swapgs+0x96/0x13a
[ 2279.401086] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401090] [] ? system_call_after_swapgs+0x96/0x13a
[ 2279.401093] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401097] [] ? system_call_after_swapgs+0x96/0x13a
[ 2279.401100] [] ? system_call_after_swapgs+0xa2/0x13a
[ 2279.401103] [] SyS_newstat+0xe/0x10
[ 2279.401107] [] system_call_fastpath+0x25/0x2a
[ 2279.401111] [] ? system_call_after_swapgs+0xa2/0x13a

I have also tested my own suggestion further, by the way. Even though my target is /mnt/sshfs and I am pointing find at /mnt (to make life easier), it STILL gives the same results as mountpoint or as the find without the exclusions.

Back to the drawing board.

@56616c6f72
Collaborator

The tail -n +3 in df -h --output=target | tail -n +3 is there to cut the / path out of the loop. Also, the first command runs with -xdev, which stops it from descending into the mounted directories. This is how we had it initially, instead of covering all directories.

Sounds like mountpoint is not the right tool for this.
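
Since the position of / in df output can vary between systems, a filter that drops it by value rather than by line number may be more predictable. The sketch below is only an illustration: the helper name `non_root_targets` is an assumption, and the sample input is made up for the demo.

```shell
# Hypothetical helper: read `df --output=target` style text on stdin and
# print every mount target except the header line and the root filesystem.
non_root_targets() {
	awk 'NR > 1 && $0 != "/"'
}

# Usage on a live system (GNU df assumed):
#   df --output=target | non_root_targets
# Demo with made-up df output:
printf 'Mounted on\n/dev\n/run\n/\n/boot\n/mnt/sshfs\n' | non_root_targets
```

With this filter the loop never sees /, wherever df happens to list it.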

While we look into this, I will change the function to exclude the hashing of the mounted directories to remediate this issue for the time.

What if we did not hash the files in the mounted dirs directly, but ran the file command on them first? I wonder if it has better error handling. If file succeeds, we run sha1sum.

So something like

find "$i" -type f -perm -o+rx -print0 | xargs -0 -I {} sh -c 'file "$1" >> "$2" && sha1sum "$1" >> "$3"' _ {} "$OUTROOT/$OUTDIR/Misc/$OUTFILE-exec-perm-files-file.txt" "$OUTROOT/$OUTDIR/Misc/$OUTFILE-exec-perm-files.txt"

@56616c6f72
Collaborator

Updated the latest version to avoid network mounted directories while we work on this issue. Script will check for executables in

/  
/dev/
/proc/
/run/

Using -xdev to stop the find on / from descending into mounted network drives

@secjunkie
Contributor Author

secjunkie commented Dec 6, 2021

I was going to suggest

for i in $(mount | grep -v 'cifs\|fuse.sshfs\|nfs' | awk '{print $3}'); do
	find "$i" -xdev -type f -perm -o+rx -print0 | xargs -0 sha1sum >> "$OUTROOT/$OUTDIR/Misc/$OUTFILE-exec-perm-files.txt"
done
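
A side note on the grep -v filter above: it matches anywhere on the line, so a local mount whose path merely contains "nfs" would be dropped as well. A sketch that matches only the filesystem-type field could look like the following. The helper name `network_mounts` and the list of types are assumptions, and the sample table is made up. Reading the mount table does not touch the mounts themselves, but this only tightens the filter. It does not address the hang that occurs when find on / reaches the broken directory.

```shell
# Hypothetical helper: print mountpoints whose filesystem type looks
# network-backed. $1 = file in /proc/mounts format (default /proc/self/mounts).
# Fields are: device, mountpoint, fstype, options, dump, pass.
# Note: spaces in mountpoints appear as \040 in this format and are not decoded here.
network_mounts() {
	awk '$3 ~ /^(nfs|nfs4|cifs|smb3|fuse\.sshfs)$/ { print $2 }' "${1:-/proc/self/mounts}"
}

# Demo with a made-up mount table: only the sshfs and nfs4 entries are printed,
# while the ext4 mount at /srv/nfs_exports is left alone.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/sda1 / ext4 rw,relatime 0 0
/dev/sdb1 /srv/nfs_exports ext4 rw,relatime 0 0
user@host:/data /mnt/sshfs fuse.sshfs rw,nosuid,nodev 0 0
server:/export /mnt/nfs nfs4 rw,relatime 0 0
EOF
network_mounts "$sample"
rm -f "$sample"
```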

but neither works really.. as soon as the find of / goes into /mnt and lists the sshfs directory... caboom.

Don't get me wrong: when the firewall is off, it will see it and IGNORE it. But when the firewall is on and the traffic is DROPped, seeing it makes it go boom.

@56616c6f72 56616c6f72 self-assigned this Dec 8, 2021
@56616c6f72
Collaborator

I will try to deploy a testing environment of my own and work on this issue over the weekend. For now though, the recent update I made to the public version of the function should resolve this issue with some feature reduction :/

Assigning this to myself.

@secjunkie
Contributor Author

I've been trying to think of a way, to be honest, but I keep coming up empty on this one. That said, the hang time I see is not the same as what the client described: CentOS takes considerably more time to get the input/output error, due to the process being picked up as dead, but it in no way reaches "overnight" levels.

@56616c6f72
Collaborator

Have you tried the recent version of the script? v1.3.2 should stop all hanging. I've made adjustments to avoid network drives for now.

@secjunkie
Contributor Author

Yeah the
find / -xdev -type f -perm -o+rx -print0 | xargs -0 sha1sum
is pretty much what the for loop I posted does at the end of the day.
The weird thing is that the first time I tried it a few days ago, the eventual error
find: ‘/mnt/sshfs’: Input/output error
took forever (3-5 mins), which was still nowhere close to the reported "left it running overnight".
I tried it just now on the same environment, just to be sure, and the error came out in a similar time as on the Debian test server (which was always much quicker to see the process hanging: a few seconds, like it should be!).
I cannot replicate the long hang times anymore, which makes me think there was some other functional issue with that box.

The icing on this cake:
I tried the original find with the broken mount point and it worked just fine (error and all but not taking forever)..
find / -type f -perm -o+rx -print0 | xargs -0 sha1sum

@56616c6f72
Collaborator

Reverted back to v1.3.1 for now.

I think we leave it as is for now, as the issue does not seem to be reproducible.
