Files has damaged after deduplication #91

Closed
Nefelim4ag opened this issue Aug 9, 2015 · 9 comments

@Nefelim4ag (Contributor) commented Aug 9, 2015

I use my Steam data for dedup testing purposes (it can be easily repaired).
I did something like:
find ~/CS.GO/ -type f -exec md5sum {} \; > ~/md5sums
duperemove -rdb 8k ~/CS.GO/
find ~/CS.GO/ -type f -exec md5sum {} \; > ~/md5sums_deduped
diff -up ~/md5sums ~/md5sums_deduped

And got:
[$] diff -up ~/md5sum.log ~/md5sum_deduped.log [0:41:25]
--- /home/nefelim4ag/md5sum.log 2015-08-10 00:31:17.449352832 +0300
+++ /home/nefelim4ag/md5sum_deduped.log 2015-08-10 00:35:29.133182665 +0300
@@ -462,7 +462,7 @@ a1c373679d2722571d1095cc5b50e917 /home/
17f3e7c3c370c7cfb8460c49b500d6c1 /home/nefelim4ag/CS.GO/csgo/maps/de_vertigo.jpg
ef031e8b197066cbe8f9424bb8b4404a /home/nefelim4ag/CS.GO/csgo/maps/de_vertigo.nav
4f415d2a1f9d48d85fe12c04d673a9ec /home/nefelim4ag/CS.GO/csgo/maps/de_vertigo_cameras.txt
-00c3a46b124a38034594d2259989c5c2 /home/nefelim4ag/CS.GO/csgo/maps/de_zoo.bsp
+42b7a7ecc3243fb16ad2f1fdcb111765 /home/nefelim4ag/CS.GO/csgo/maps/de_zoo.bsp
5cebc0565ad91b77fb01085912b1366d /home/nefelim4ag/CS.GO/csgo/maps/de_zoo.jpg
bed7a7fca8473b336d5c391486ea08ab /home/nefelim4ag/CS.GO/csgo/maps/de_zoo.txt
aeac02d70f71d84ebb9cfdbc3d21a687 /home/nefelim4ag/CS.GO/csgo/maps/gd_assault.bsp
@@ -1379,7 +1379,7 @@ b84ab433a6b5cc3a78b0c47d2d334f22 /home/
8e08a40d915d3a1ebada0857675c4b7b /home/nefelim4ag/CS.GO/csgo/resource/overviews/de_vertigo_radar.dds
5074bb09bb5640a2abbe877c0497fa89 /home/nefelim4ag/CS.GO/csgo/resource/overviews/de_zoo.txt
05b0637ac62ed4f3a75522b1c82e60cb /home/nefelim4ag/CS.GO/csgo/resource/overviews/de_zoo_radar.dds
-b833acc16a3592b515533fdf534bc559 /home/nefelim4ag/CS.GO/csgo/resource/overviews/de_zoo_radar_spectate.dds
+51e161e5e9f791c86aa6059122a5e815 /home/nefelim4ag/CS.GO/csgo/resource/overviews/de_zoo_radar_spectate.dds
99a64bf1938dafe004bdbe35541d5392 /home/nefelim4ag/CS.GO/csgo/resource/overviews/gd_bank.txt
ea3d932c2da9fbc1fa43ca350fa9db34 /home/nefelim4ag/CS.GO/csgo/resource/overviews/gd_bank_radar.dds
a3a64ce1816734363e7a35a8f34fa2eb /home/nefelim4ag/CS.GO/csgo/resource/overviews/gd_cbble.txt
@@ -1922,21 +1922,21 @@ b3611c7f9fbe5cc2e03c344a1dced3db /home/
820de0df44e5455b5f5f8897b4b9605b /home/nefelim4ag/CS.GO/csgo/pak01_008.vpk
3a08346113d27853d0515e4deec81c13 /home/nefelim4ag/CS.GO/csgo/pak01_009.vpk
92ec86ffb0575e9791965fe73d25ef26 /home/nefelim4ag/CS.GO/csgo/pak01_010.vpk
-ed07bdffb3e56b362ad4b31e38ed94f4 /home/nefelim4ag/CS.GO/csgo/pak01_011.vpk
+59a76f77d48fcd088b07e45a9182d4d4 /home/nefelim4ag/CS.GO/csgo/pak01_011.vpk
f009c902413b924f02049052966ee03c /home/nefelim4ag/CS.GO/csgo/pak01_012.vpk
-610e116aa91b847c2f36c0230c8beeb7 /home/nefelim4ag/CS.GO/csgo/pak01_013.vpk
+13d05c3aa2b6cc367b2b3cf7d1de61ca /home/nefelim4ag/CS.GO/csgo/pak01_013.vpk
bae9c971ac82e810c37504b5fd182ea8 /home/nefelim4ag/CS.GO/csgo/pak01_014.vpk
-04dc1ce59622f51d7a4a32d877655752 /home/nefelim4ag/CS.GO/csgo/pak01_015.vpk
+89ff1b5ca1065a3e41c7e2ac1b323297 /home/nefelim4ag/CS.GO/csgo/pak01_015.vpk
f109aa5805bf14dc6f5eb1150abf0333 /home/nefelim4ag/CS.GO/csgo/pak01_016.vpk
c2095ce58c222f9a28b384b32c1d2968 /home/nefelim4ag/CS.GO/csgo/pak01_017.vpk
-72b4faa3a115b52d4bbb33ed415aaf89 /home/nefelim4ag/CS.GO/csgo/pak01_018.vpk
+8b2407376ac4c7e578e375dcff7fbcfb /home/nefelim4ag/CS.GO/csgo/pak01_018.vpk
ade79ddcf959044c1e639249e258cf30 /home/nefelim4ag/CS.GO/csgo/pak01_019.vpk
-b976d895290b2ebeca0e4428821de2f4 /home/nefelim4ag/CS.GO/csgo/pak01_020.vpk
+8441ae336721ee06303db7aae2df012c /home/nefelim4ag/CS.GO/csgo/pak01_020.vpk
776533cbe48fa831193e9715cd7f13e2 /home/nefelim4ag/CS.GO/csgo/pak01_021.vpk
4eaf19800bc04a58de506d9a830647fd /home/nefelim4ag/CS.GO/csgo/pak01_022.vpk
-d9241676cddb810c133eea98a8542a76 /home/nefelim4ag/CS.GO/csgo/pak01_023.vpk
-cf9c5f1a6e663e4f354be6c83effefd8 /home/nefelim4ag/CS.GO/csgo/pak01_024.vpk
-f5fa2a4837649a7a6450fa733638f495 /home/nefelim4ag/CS.GO/csgo/pak01_025.vpk
+5f30bf02cf9cbe5005ca9b47b1c7c6d9 /home/nefelim4ag/CS.GO/csgo/pak01_023.vpk
+fc9d1b7682a4288b83345a22017498b3 /home/nefelim4ag/CS.GO/csgo/pak01_024.vpk
+0f916bba4135f1f03348042125a7b15c /home/nefelim4ag/CS.GO/csgo/pak01_025.vpk
d9e115d3f97ce761f4ff404094edce86 /home/nefelim4ag/CS.GO/csgo/pak01_026.vpk
9df203a609960202ebb34aa11231480f /home/nefelim4ag/CS.GO/csgo/pak01_027.vpk
ee8b3e61f9a00d493db25c7a9189565a /home/nefelim4ag/CS.GO/csgo/pak01_028.vpk
@@ -1960,8 +1960,8 @@ cae5d29d56721de431ddb5ddba56f5d0 /home/
1090352e5ef70830233d6ffeee8402a9 /home/nefelim4ag/CS.GO/csgo/pak01_046.vpk
edd4df576dad2d2def12ba1c8b0d7a2f /home/nefelim4ag/CS.GO/csgo/pak01_047.vpk
e540472dc1254f706ff01877abe2839a /home/nefelim4ag/CS.GO/csgo/pak01_048.vpk
-6afabf774fd748bd48a67213d3b834e5 /home/nefelim4ag/CS.GO/csgo/pak01_049.vpk
-5bbf08d13f87e080851527eeb66314f8 /home/nefelim4ag/CS.GO/csgo/pak01_050.vpk
+f00b7cf24ebe6ac074e772b54d0d5ba1 /home/nefelim4ag/CS.GO/csgo/pak01_049.vpk
+25ae62bebb88ac2b3055fe1193b70b8c /home/nefelim4ag/CS.GO/csgo/pak01_050.vpk
0edd4bb737d342904b90b2dd72a087ec /home/nefelim4ag/CS.GO/csgo/pak01_051.vpk
9dead3da4f20a6e5399db3047587d3c9 /home/nefelim4ag/CS.GO/csgo/pak01_052.vpk
5d370a5de2b2e7444b46c4d276af17b0 /home/nefelim4ag/CS.GO/csgo/pak01_053.vpk

As far as I know, the kernel must check that the data is really identical, so this should be a very safe operation, but as I can see, something unexpected happens.
Kernel: linux-next 07.08.2015
Filesystem: btrfs with zlib compression
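
The before/after checksum comparison in this report could also be scripted so that only the changed paths are listed. The following is a minimal sketch, not part of the original report: the function names `md5_of_file`, `build_manifest`, and `changed_files` are made up for illustration, and it assumes the tree is not modified by anything other than the dedup run between the two snapshots.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map every regular file under root (symlinks skipped) to its MD5 digest."""
    manifest = {}
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            if os.path.isfile(path) and not os.path.islink(path):
                manifest[path] = md5_of_file(path)
    return manifest


def changed_files(before, after):
    """Return sorted paths whose digest differs or that exist in only one manifest."""
    paths = set(before) | set(after)
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

The intended use would be to call `build_manifest` on the directory once before running duperemove and once after, then pass both results to `changed_files`. Comparing dictionaries keyed by path avoids depending on the order in which the files are listed.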

@markfasheh (Owner) commented Aug 10, 2015

Yeah that's interesting, thanks very much for running this test.

I'll just hit you with a bunch of questions if that's ok. Basically I want to verify that the data got corrupted (you've done most of this for us already) and then obviously we want to go figure out how this happened in the first place.

  • Can you run btrfsck on the device to see if there's any metadata corruption that it can find?
  • How reproducible is this: did it happen only once, on every run, etc.?
  • Was any process accessing the files while they were being deduped, or was duperemove getting more or less 'exclusive' access to them? (This isn't a requirement, but it would tell me whether we need to look into some sort of race condition.)

Last question might be impossible for you to answer, which is obviously fine :)

  • Are the contents of any of the corrupted files something you can verify visually? By that I mean, can you load them in an editor or viewer of sorts and see what they look like now? I'm curious if it's some random garbage, an older version of the data, or even data from another file type.
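
For files that cannot be inspected in a viewer, one option is to compare a damaged file against a known-good copy block by block and see which byte ranges changed. The sketch below is only an illustration under assumptions: it presumes a good copy is available (for example one restored by Steam to a separate location), and the names `differing_blocks` and `merge_ranges` as well as the 4096-byte default block size are hypothetical choices, not anything duperemove or btrfs provides.

```python
def differing_blocks(good_path, suspect_path, block_size=4096):
    """Return byte offsets of blocks whose contents differ between two files.

    A block present in only one file (sizes differ) counts as differing.
    """
    offsets = []
    with open(good_path, "rb") as good, open(suspect_path, "rb") as suspect:
        offset = 0
        while True:
            good_block = good.read(block_size)
            suspect_block = suspect.read(block_size)
            if not good_block and not suspect_block:
                break  # both files are exhausted
            if good_block != suspect_block:
                offsets.append(offset)
            offset += block_size
    return offsets


def merge_ranges(offsets, block_size=4096):
    """Collapse consecutive block offsets into (start, end) byte ranges, end exclusive."""
    ranges = []
    for offset in offsets:
        if ranges and ranges[-1][1] == offset:
            # This block directly follows the previous range, so extend it.
            ranges[-1] = (ranges[-1][0], offset + block_size)
        else:
            ranges.append((offset, offset + block_size))
    return ranges
```

If the damaged ranges turned out to line up with the dedup block size, that might hint at which deduplicated extents were affected, although this sketch cannot say whether the wrong bytes are garbage, stale data, or data from another file.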
@markfasheh closed this Aug 10, 2015

@Nefelim4ag (Contributor, Author) commented Aug 11, 2015

Not a problem, Mark.
This is a very rare problem.
At first I thought Steam was just being stupid and checking mtime, but in the 4.2 kernel you fixed the btrfs dedup ioctl, and the problem did not go away. So I tried checking data consistency and found a difference.

  1. Can you run btrfsck on the device to see if there's any metadata corruption that it can find?
    I ran btrfs check on my root partition. This is a fresh filesystem (I created it several days ago), and btrfs-progs can't find any errors on it.
  2. It happens often, but it is not easy to reproduce in general; I have killed an Ubuntu system by running dedup on the rootfs.
    // I can reproduce it really easily, every time, on the same data set %) (Steam games)
  3. I think that from userspace duperemove works in exclusive mode. On the kernel side... I don't know.
  4. Sorry, but I can't check this content with other software.

@Nefelim4ag (Contributor, Author) commented Aug 11, 2015

Also, half joking: you could install Steam with Team Fortress and run the latest kernel and duperemove.
With 128 KB and 32 KB blocks, files are not damaged, but with 8k many files are damaged; the block size only changes the chance of it happening.

@juliantaylor commented Aug 11, 2015

How large are the corrupted files?
There was a corruption fix for clone on inline extents recently: http://www.spinics.net/lists/linux-btrfs/msg45443.html

@Nefelim4ag (Contributor, Author) commented Aug 11, 2015

@juliantaylor, thanks.
They are larger than a few MB.
Maybe btrfs can inline pieces of a file (1 extent with changes?)?

// It's hard to get corruption with 128k blocks, but I also get it in some cases.

@Nefelim4ag (Contributor, Author) commented Sep 29, 2015

FYI:
It looks like the patch
Btrfs: fix read corruption of compressed and shared extents

partially fixed my issue.

@juliantaylor commented Sep 29, 2015

There was another patch for this posted yesterday:
http://www.spinics.net/lists/linux-btrfs/msg47557.html
Maybe it fully fixes your issue?

@Nefelim4ag (Contributor, Author) commented Sep 29, 2015

@juliantaylor,
thanks, I will try to apply it and test.

@Nefelim4ag (Contributor, Author) commented Sep 29, 2015

@juliantaylor,
Thanks, this fully fixes my issue.
