rpm install failed with "cpio: utime failed - Resource temporarily unavailable" #449

Closed
colorfulshark opened this issue Jun 12, 2018 · 7 comments

Comments

@colorfulshark

The installation of an RPM package named "libjpeg62-1.5.2-r0.0.core2_64.rpm" failed with the following log:

error: unpacking of archive failed on file /usr/lib64/libjpeg.so.62;5b1f5f5a: cpio: utime failed - Resource temporarily unavailable
error: libjpeg62-1:1.5.2-r0.0.core2_64: install failed

I suspected that the NFS rootfs was causing the error, so I tried installing the package with rpm on both an NFS rootfs and a standard rootfs. The error appears with NFS, but with the standard rootfs everything works.

I used strace to trace the system calls rpm makes and their return codes, and found the following in the strace log:

utimensat(AT_FDCWD, "/usr/lib64/libjpeg.so.62;5b111347", [{tv_sec=1527818564, tv_nsec=0} /* 2018-06-01T02:02:44+0000 */, {tv_sec=1527818564, tv_nsec=0} /* 2018-06-01T02:02:44+0000 */], AT_SYMLINK_NOFOLLOW) = -1 ESTALE (Stale file handle)

This indicates a problem with utime during rpm installation. I found that rpm sets timestamps in the function fsmUtime in lib/fsm.c.

I added some debug code and found that rpm uses lutimes, which means HAVE_LUTIMES is defined on this system. perror also reports the error code "ESTALE".

I searched for it and found that this error code is returned by the NFS server. On the NFS FAQ website (http://nfs.sourceforge.net/#faq_c10), I found five situations that cause this error, and the fifth one may fit my situation:

5. The exported file system doesn't support permanent inode numbers. Exporting FAT file systems via NFS is problematic for this reason. This problem can be avoided by exporting only local filesystems which have good NFS support. See question C6 for more information.

I tried to use touch to modify the timestamp, but that did not work either. If this is the cause, we should relax the utime check when using an NFS rootfs. So I added some code to check the filesystem type and, if it is NFS, to ignore a utime failure.

I tested the new rpm package. The problem is fixed and I have not found any bugs in it. But even the latest rpm source code does not change this code, so I want to know whether this kind of modification makes sense.

RPM version:

RPM version 4.13.90

My NFS export config:

/nfsroot (rw,no_root_squash,no_subtree_check,async,insecure)

NFS client config:

root=/dev/nfs nfsroot=10.0.2.2:/nfsroot,nfsvers=3,port=3063,udp,mountport=3062 rw highres=off  console=ttyS0 mem=256M ip=dhcp vga=0 uvesafb.mode_option=640x480-32 oprofile.timer=1 uvesafb.task_timeout=-1
diff --git a/lib/fsm.c b/lib/fsm.c
index dcfce82..3fdd7df 100644
--- a/lib/fsm.c
+++ b/lib/fsm.c
@@ -10,6 +10,8 @@
 #if WITH_CAP
 #include <sys/capability.h>
 #endif
+#include <sys/vfs.h>
+#include <linux/magic.h>

 #include <rpm/rpmte.h>
 #include <rpm/rpmts.h>
@@ -604,6 +606,9 @@ static int fsmUtime(const char *path, mode_t mode, time_t mtime)
        { .tv_sec = mtime, .tv_usec = 0 },
        { .tv_sec = mtime, .tv_usec = 0 },
     };
+    struct statfs fs;
+
+    statfs(path, &fs);

 #if HAVE_LUTIMES
     rc = lutimes(path, stamps);
@@ -616,8 +621,9 @@ static int fsmUtime(const char *path, mode_t mode, time_t mtime)
        rpmlog(RPMLOG_DEBUG, " %8s (%s, 0x%x) %s\n", __func__,
               path, (unsigned)mtime, (rc < 0 ? strerror(errno) : ""));
     if (rc < 0)        rc = RPMERR_UTIME_FAILED;
-    /* ...but utime error is not critical for directories */
-    if (rc && S_ISDIR(mode))
+
+    /* ...but utime error is not critical for directories and nfs fs */
+    if (rc && (S_ISDIR(mode) || fs.f_type == NFS_SUPER_MAGIC))
        rc = 0;
     return rc;
 }
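
For reference, the filesystem check that the patch relies on can be sketched as a standalone helper. This is only an illustration under assumptions: `path_is_on_nfs` is a hypothetical name, not an rpm function, and the code is Linux-specific. Unlike the patch, it checks the return value of statfs(), because statfs() resolves symlinks in the path and can itself fail (for example on a symlink whose target does not exist), in which case the contents of the buffer are not meaningful.

```c
#include <sys/vfs.h>      /* statfs(), struct statfs (Linux-specific) */
#include <linux/magic.h>  /* NFS_SUPER_MAGIC */

/*
 * Hypothetical helper, not part of rpm.
 * Returns 1 if path is on an NFS mount, 0 if it is on some other
 * filesystem, and -1 if statfs() failed (errno is set by statfs()).
 */
static int path_is_on_nfs(const char *path)
{
    struct statfs fs;

    if (statfs(path, &fs) != 0)
        return -1;  /* fs is not filled in; do not inspect f_type */
    return fs.f_type == NFS_SUPER_MAGIC;
}
```

A caller would treat -1 as "unknown" rather than as "not NFS", for example `if (path_is_on_nfs(path) == 1) { ... }`.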
@pmatilai
Member

No, that's not the way to go. We're not adding an extra statfs() call to every single file/dir operation rpm does for an esoteric case that basically only happens when going against direct recommendations on NFS usage. The troubles with Linux on FAT-based filesystems are not going to end here, why not use an actual POSIX filesystem to export over NFS?

@colorfulshark
Author

Thanks for the reply. In fact, my NFS host filesystem is ext4. The real problem is that if a symlink points to a target that does not exist, lutimes() fails with the error code ESTALE returned by the NFS server. I found that rpm always creates the symlink first and tries to modify its timestamp before the target file has been extracted.

@pmatilai
Member

If it's ext4, then why did you post an excerpt that talks about filesystems that don't support permanent inodes, such as FAT? :) Anyway, with ext4 underneath the story is entirely different.

This seems more like an NFS bug though, it's clearly being told NOT to follow the symlink:

utimensat(AT_FDCWD, "/usr/lib64/libjpeg.so.62;5b111347", [{tv_sec=1527818564, tv_nsec=0} /* 2018-06-01T02:02:44+0000 */, {tv_sec=1527818564, tv_nsec=0} /* 2018-06-01T02:02:44+0000 */], AT_SYMLINK_NOFOLLOW) = -1 ESTALE (Stale file handle)

Thinking about it a bit, utimensat() is a relatively new thing and e.g. NFS v3 (never mind v2) wouldn't know anything about it even if our kernel does, but then I'd think it'd be the responsibility of the NFS stack to deal with this.

You say you're getting this with nfsroot, but if you can easily test it, it'd be interesting to know whether it happens with a regular NFS client too. And the versions involved: which NFS version, kernel, userland parts.

Finally, I think this should work around the problem:

+++ b/lib/fsm.c
@@ -638,6 +638,9 @@ static int fsmUtime(const char *path, mode_t mode, time_t mtime)
 
 #if HAVE_LUTIMES
     rc = lutimes(path, stamps);
+    /* NFS doesn't necessarily support lutimes() even if our kernel does */
+    if (rc < 0 && errno == ESTALE && S_ISLNK(mode))
+       rc = 0;
 #else
     if (!S_ISLNK(mode))
        rc = utimes(path, stamps);

...but please confirm. And I'd also like to understand the affected versions before considering actually applying such a workaround.
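
To make the intent of that workaround easy to try outside of rpm, here is a minimal standalone sketch. It is an assumption-laden illustration, not rpm code: `set_mtime_tolerant` is a hypothetical name. One detail worth keeping in mind is that lutimes() returns -1 on failure and reports the cause in errno, so a check of this kind has to look at errno rather than at the return value, and the standard macro for testing a symlink mode is S_ISLNK.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose lutimes() and S_ISLNK */
#endif
#include <errno.h>
#include <sys/stat.h>     /* S_ISLNK, mode_t */
#include <sys/time.h>     /* lutimes(), struct timeval */
#include <time.h>         /* time_t */

/*
 * Hypothetical stand-in for the check discussed above; not rpm code.
 * Sets atime and mtime of path (without following symlinks) to mtime.
 * An ESTALE failure on a symlink is treated as non-fatal.
 * Returns 0 on success or tolerated failure, -1 otherwise.
 */
static int set_mtime_tolerant(const char *path, mode_t mode, time_t mtime)
{
    const struct timeval stamps[2] = {
        { .tv_sec = mtime, .tv_usec = 0 },  /* access time */
        { .tv_sec = mtime, .tv_usec = 0 },  /* modification time */
    };
    int rc = lutimes(path, stamps);

    /* lutimes() signals failure with -1; the reason is in errno */
    if (rc < 0 && errno == ESTALE && S_ISLNK(mode))
        rc = 0;
    return rc;
}
```

Any other error, such as ENOENT for a path that does not exist, is still reported to the caller.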

@colorfulshark
Author

You are right.

This should be an NFS bug. But the situation is a little complex...
I tested the standard NFS server, nfs-kernel-server, and it works just fine.
Then I traced the log of runqemu (Yocto); it seems that Yocto uses unfsd (unfs3) as the NFS server for nfsroot.
I downloaded unfs3 and ran it on my PC, and the error occurred.

Finally, I found that the error occurs if I use lutimes to modify the timestamp of a symlink pointing to a nonexistent file. If the symlink points to an existing file, there is no error.
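
A small reproducer for this observation could look like the sketch below. It is only an illustration: the function name `lutimes_on_dangling_symlink`, the link name "dangling" and the target name "no-such-target" are made up for the example. It creates a symlink to a nonexistent target inside a directory you choose and then calls lutimes() on the link.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose lutimes() and symlink() */
#endif
#include <errno.h>
#include <stdio.h>
#include <sys/time.h>     /* lutimes(), struct timeval */
#include <time.h>         /* time_t */
#include <unistd.h>       /* symlink(), unlink() */

/*
 * Hypothetical reproducer helper.
 * Creates "<dir>/dangling" pointing at a target that does not exist,
 * sets its timestamps with lutimes(), then removes the link again.
 * Returns 0 on success, or the errno value of the call that failed.
 */
static int lutimes_on_dangling_symlink(const char *dir, time_t mtime)
{
    char link_path[4096];
    const struct timeval stamps[2] = {
        { .tv_sec = mtime, .tv_usec = 0 },
        { .tv_sec = mtime, .tv_usec = 0 },
    };
    int err = 0;

    snprintf(link_path, sizeof(link_path), "%s/dangling", dir);
    unlink(link_path);  /* ignore failure: the link may not exist yet */
    if (symlink("no-such-target", link_path) != 0)
        return errno;
    if (lutimes(link_path, stamps) != 0)
        err = errno;    /* save before unlink() can change errno */
    unlink(link_path);
    return err;
}
```

Called with a directory on a local filesystem this should return 0. Going by the behaviour described in this thread, pointing it at a directory mounted from unfs3 would be expected to return ESTALE instead.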

So I guess the unfs3 server somehow follows the symlink even though the client code does not. If I can't find a solution in unfs3, your workaround will be the best choice.

Thanks.

@pmatilai
Member

Ok, thanks for narrowing it down to unfs3. Never heard of such a thing before now, which goes to explain why such an issue has never turned up before...

Mmh, looking at unfs3 code, the following is the only reference to utimes that I could find:

/*
 * setting of time, races with local filesystem
 *
 * there is no futimes() function in POSIX or Linux
 */
static nfsstat3 set_time(const char *path, backend_statstruct buf, sattr3 new)

That's a wee bit outdated, but then the commit is from 2004.

@pmatilai
Member

...anyway... it eventually goes down to call utime() to set the time. Which of course does follow symlinks by definition and thus fails with -ENOENT, which unfs3 translates to -ESTALE on the way back. You'll need someone who knows about NFS to look further though...
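
The symlink-following behaviour of utime() is easy to check locally. The sketch below is an illustration only, with a hypothetical helper name `utime_errno`; it says nothing about how unfs3 maps the error afterwards.

```c
#include <errno.h>
#include <time.h>    /* time_t */
#include <utime.h>   /* utime(), struct utimbuf */

/*
 * Hypothetical helper: calls utime() on path and returns 0 on success,
 * or the errno value that utime() set on failure.
 * utime() follows symlinks, so for a symlink whose target does not
 * exist the call fails with ENOENT.
 */
static int utime_errno(const char *path, time_t mtime)
{
    const struct utimbuf times = { .actime = mtime, .modtime = mtime };

    return utime(path, &times) == 0 ? 0 : errno;
}
```

Running this against a dangling symlink on a local filesystem returns ENOENT, while lutimes() on the same link succeeds because it operates on the link itself.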

@colorfulshark
Author

Thanks for your effort on this!
I finally made a workaround in unfs3. Because unfs3 cannot determine whether the NFS client wants to modify the symlink itself or to follow it, I made it use lutimes for symlinks if this function exists, just like the code in fsm.c. And the rpm code will remain clean. :)
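
For readers who want the general shape of such a change, here is a rough sketch. It is not the actual unfs3 patch: `set_times_nofollow` is a hypothetical name, and the code simply assumes that lutimes() is available (as it is on glibc) instead of testing for it at build time.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* glibc: expose lutimes(), utimes(), S_ISLNK */
#endif
#include <sys/stat.h>     /* lstat(), S_ISLNK */
#include <sys/time.h>     /* lutimes(), utimes(), struct timeval */
#include <time.h>         /* time_t */

/*
 * Hypothetical sketch, not unfs3 code.
 * Sets the access and modification times of path. If path is a
 * symlink, the times of the link itself are changed, so a missing
 * target does not cause a failure.
 * Returns 0 on success, -1 on failure (errno is set).
 */
static int set_times_nofollow(const char *path, time_t atime, time_t mtime)
{
    const struct timeval stamps[2] = {
        { .tv_sec = atime, .tv_usec = 0 },
        { .tv_sec = mtime, .tv_usec = 0 },
    };
    struct stat st;

    if (lstat(path, &st) != 0)
        return -1;
    if (S_ISLNK(st.st_mode))
        return lutimes(path, stamps);  /* act on the link itself */
    return utimes(path, stamps);       /* files, directories, ... */
}
```

The lstat() call is what lets the function decide per path, since the caller in this scenario cannot say whether the link or its target was meant.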
