New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Signed LineageOS build fails to boot on OnePlus One (bacon) #78
Comments
This might actually be a kernel bug, but it's still surprising that signing the Robotnix build triggers this. |
It might be worth trying to apply the patch referenced in that bug. You can easily do this in robotnix with
|
I quickly checked and it appears that this patch is indeed not part of the kernel tree. The image is rebuilding now and I'll report back when I tested it. |
The upstream patch is torvalds/linux@dac7a4b but that didn't apply cleanly, so I had to cherry-pick it by hand :/ PatchFrom 29cf3a89ce8d6c17a66e8128fce06b36e3e7c62f Mon Sep 17 00:00:00 2001
From: Henri Menke <henri@henrimenke.de>
Date: Fri, 1 Jan 2021 22:32:44 +0100
Subject: [PATCH] ext4: lock the xattr block before checksuming it
We must lock the xattr block before calculating or verifying the
checksum in order to avoid spurious checksum failures.
https://bugzilla.kernel.org/show_bug.cgi?id=193661
Reported-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@vger.kernel.org
(cherry picked from commit dac7a4b4b1f664934e8b713f529b629f67db313c)
---
fs/ext4/xattr.c | 60 ++++++++++++++++++++++++-------------------------
1 file changed, 29 insertions(+), 31 deletions(-)
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 890bfe137f24..820cbda82e86 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -145,33 +145,27 @@ static __le32 ext4_xattr_block_csum(struct inode *inode,
}
static int ext4_xattr_block_csum_verify(struct inode *inode,
- sector_t block_nr,
- struct ext4_xattr_header *hdr)
+ struct buffer_head *bh)
{
+ struct ext4_xattr_header *hdr = BHDR(bh);
+ int ret = 1;
if (EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
- EXT4_FEATURE_RO_COMPAT_METADATA_CSUM) &&
- (hdr->h_checksum != ext4_xattr_block_csum(inode, block_nr, hdr)))
- return 0;
- return 1;
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM)) {
+ lock_buffer(bh);
+ ret = (hdr->h_checksum == ext4_xattr_block_csum(inode,
+ bh->b_blocknr, hdr));
+ unlock_buffer(bh);
+ }
+ return ret;
}
static void ext4_xattr_block_csum_set(struct inode *inode,
- sector_t block_nr,
- struct ext4_xattr_header *hdr)
+ struct buffer_head *bh)
{
- if (!EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
- EXT4_FEATURE_RO_COMPAT_METADATA_CSUM))
- return;
-
- hdr->h_checksum = ext4_xattr_block_csum(inode, block_nr, hdr);
-}
-
-static inline int ext4_handle_dirty_xattr_block(handle_t *handle,
- struct inode *inode,
- struct buffer_head *bh)
-{
- ext4_xattr_block_csum_set(inode, bh->b_blocknr, BHDR(bh));
- return ext4_handle_dirty_metadata(handle, inode, bh);
+ if (EXT4_HAS_RO_COMPAT_FEATURE(inode->i_sb,
+ EXT4_FEATURE_RO_COMPAT_METADATA_CSUM))
+ BHDR(bh)->h_checksum = ext4_xattr_block_csum(inode,
+ bh->b_blocknr, BHDR(bh));
}
static inline const struct xattr_handler *
@@ -232,7 +226,7 @@ ext4_xattr_check_block(struct inode *inode, struct buffer_head *bh)
if (buffer_verified(bh))
return 0;
- if (!ext4_xattr_block_csum_verify(inode, bh->b_blocknr, BHDR(bh)))
+ if (!ext4_xattr_block_csum_verify(inode, bh))
return -EIO;
error = ext4_xattr_check_names(BFIRST(bh), bh->b_data + bh->b_size,
bh->b_data);
@@ -568,8 +562,9 @@ ext4_xattr_release_block(handle_t *handle, struct inode *inode,
le32_add_cpu(&BHDR(bh)->h_refcount, -1);
if (ce)
mb_cache_entry_release(ce);
+ ext4_xattr_block_csum_set(inode, bh);
unlock_buffer(bh);
- error = ext4_handle_dirty_xattr_block(handle, inode, bh);
+ error = ext4_handle_dirty_metadata(handle, inode, bh);
if (IS_SYNC(inode))
ext4_handle_sync(handle);
dquot_free_block(inode, EXT4_C2B(EXT4_SB(inode->i_sb), 1));
@@ -804,13 +799,14 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
s->here);
ext4_xattr_cache_insert(bs->bh);
}
+ ext4_xattr_block_csum_set(inode, bs->bh);
unlock_buffer(bs->bh);
if (error == -EIO)
goto bad_block;
if (!error)
- error = ext4_handle_dirty_xattr_block(handle,
- inode,
- bs->bh);
+ error = ext4_handle_dirty_metadata(handle,
+ inode,
+ bs->bh);
if (error)
goto cleanup;
goto inserted;
@@ -879,10 +875,11 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
le32_add_cpu(&BHDR(new_bh)->h_refcount, 1);
ea_bdebug(new_bh, "reusing; refcount now=%d",
le32_to_cpu(BHDR(new_bh)->h_refcount));
+ ext4_xattr_block_csum_set(inode, new_bh);
unlock_buffer(new_bh);
- error = ext4_handle_dirty_xattr_block(handle,
- inode,
- new_bh);
+ error = ext4_handle_dirty_metadata(handle,
+ inode,
+ new_bh);
if (error)
goto cleanup_dquot;
}
@@ -937,11 +934,12 @@ ext4_xattr_block_set(handle_t *handle, struct inode *inode,
goto getblk_failed;
}
memcpy(new_bh->b_data, s->base, new_bh->b_size);
+ ext4_xattr_block_csum_set(inode, new_bh);
set_buffer_uptodate(new_bh);
unlock_buffer(new_bh);
ext4_xattr_cache_insert(new_bh);
- error = ext4_handle_dirty_xattr_block(handle,
- inode, new_bh);
+ error = ext4_handle_dirty_metadata(handle, inode,
+ new_bh);
if (error)
goto cleanup;
}
--
2.29.2
Unfortunately, the error is still there (see attached log) but when modifying the kernel I also noticed that there are tons of fixes missing for different xattr deadlocks. Maybe I should just open a bug upstream at https://gitlab.com/LineageOS/issues/android. Attachments |
I'm curious if you can still reproduce this issue using a recent robotnix commit. We're on |
I don't have a device currently. Postponed until further notice. |
When building a self-signed LineageOS image the resulting product does not boot on the OnePlus One (bacon). The device starts but remains stuck at the vendor image screen for several minutes. Then it restarts automatically into recovery saying “Cannot load Android system. Your data may be corrupt.”
The configuration is similar to
The setting
variant = "eng";
was chosen to be able to attach ADB during boot and inspect the logs for indication as to why boot is failing. From thedmesg
it looks like there is some kind of filesystem corruption which results in the system UI crashing with an I/O error:At first sight this looks like a hardware defect, but curiously an unsigned Robotnix build (
signing.enable = false;
), the stock ROM, and the official LineageOS ROM all boot fine. The filesystem image inside the signedTargetFiles also reports clean withe2fsck
, so the corruption is not due to the image. In the attachment there are thedmesg
outputs for an unsigned build that boots fine and a signed build that fails to boot.Attachments
dmesg_signed.log
dmesg_unsigned.log
The text was updated successfully, but these errors were encountered: