When a cache disk read returns mistaken data, the first read of a file is inconsistent #5981

Closed
@YunhuiChen

Description

What happened:
1. Mount JuiceFS:
bin/mount.juicefs redis://redis.kube-system:6379/0 /jfs/pvc-64dab018-ed69-4f82-a1aa-52ff8c82a5fb-umvyyf -o enable-cap,writeback=false,cache-dir=/data/cache,cache-items=10,enable-xattr,max-readahead=100,subdir=pvc-64dab018-ed69-4f82-a1aa-52ff8c82a5fb

2. Simulate mistaken reads on the cache disk using Chaos Mesh (IOChaos, action: mistake):

    - name: juicefs-io-mistake-read
      templateType: IOChaos
      ioChaos:
        action: mistake
        mode: one
        selector:
          namespaces:
            - kube-system
          labelSelectors:
            app.kubernetes.io/name: juicefs-mount
            chaostest: "true"
          nodes:
            - chaos-k8s-001
        volumePath: /data/cache
        path: /data/cache/**/*
        mistake:
          filling: random
          maxOccurrences: 1
          maxLength: 1
        methods:
          - READ
        percent: 100

3. Run a data consistency check using vdbench. It reports data inconsistency:

04:09:58.839 localhost-0: 04:09:58.825    All 1 sectors in this key block are corrupted.
04:09:58.840 localhost-0: 04:09:58.825    All corruptions are of the same type:
04:09:58.840 localhost-0: 04:09:58.825    ===> Compression pattern miscompare.
04:09:58.840 localhost-0: 04:09:58.825    Only the FIRST sector will be reported:
04:09:58.840 localhost-0: 04:09:58.825
04:09:58.840 localhost-0: 04:09:58.825         Data Validation error for fsd=fsd1; FSD lba: 0x2df01400; Key block size: 512; relative sector in data block: 0x00
04:09:58.841 localhost-0: 04:09:58.825         File name: /data/vdb.1_3.dir/vdb_f0002.file; file block lba: 0x05500000; bad sector file lba: 0x05501400
04:09:58.841 localhost-0: 04:09:58.827 0x000   00000000 2df01400 ........ ........   00000000 2df01400 00000196 37a37a18
04:09:58.841 localhost-0: 04:09:58.827 0x010   01..0000 31647366 20202020 00000000   01030000 31647366 20202020 0000001f
04:09:58.842 localhost-0: 04:09:58.828 0x0c0*  7f3850fa 3b59e396 1abbfdd2 01374e77   7f3850fa 3b59e396 1abbfdd2 01374eec
04:09:58.846 localhost-0: 04:09:58.833 Key block lba: 0x2df24800
04:09:58.846 localhost-0: 04:09:58.833    Key block of 512 bytes has 1 512-byte sectors.
04:09:58.847 localhost-0: 04:09:58.833    Timeline:
04:09:58.847 localhost-0: 04:09:58.833    Tue Apr 15 2025 04:09:42.680 GMT Sector last written. (As found in the first corrupted sector, timestamp is taken just BEFORE the actual write).
04:09:58.847 localhost-0: 04:09:58.833    Tue Apr 15 2025 04:09:58.758 GMT Key block first found to be corrupted during a read-before-write.
04:09:58.848 localhost-0: 04:09:58.833
04:09:58.848 localhost-0: 04:09:58.833    All 1 sectors in this key block are corrupted.
04:09:58.848 localhost-0: 04:09:58.833    All corruptions are of the same type:
04:09:58.848 localhost-0: 04:09:58.834    ===> Compression pattern miscompare.
04:09:58.849 localhost-0: 04:09:58.834    Only the FIRST sector will be reported:
04:09:58.849 localhost-0: 04:09:58.834
04:09:58.850 localhost-0: 04:09:58.834         Data Validation error for fsd=fsd1; FSD lba: 0x2df24800; Key block size: 512; relative sector in data block: 0x00
04:09:58.850 localhost-0: 04:09:58.834         File name: /data/vdb.1_3.dir/vdb_f0002.file; file block lba: 0x05500000; bad sector file lba: 0x05524800
04:09:58.850 localhost-0: 04:09:58.842 0x000   00000000 2df24800 ........ ........   00000000 2df24800 00000196 37a37a18
04:09:58.851 localhost-0: 04:09:58.843 0x010   01..0000 31647366 20202020 00000000   01030000 31647366 20202020 0000001f
04:09:58.851 localhost-0: 04:09:58.844 0x140*  5fa8f949 168a7473 6b723be5 3cebdcd8   5fa84449 168a7473 6b723be5 3cebdcd8
04:09:58.859 localhost-0: 04:09:58.850 Key block lba: 0x2df4c000
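In the two starred rows above (0x0c0 and 0x140), the left and right columns differ in a single 32-bit word each. A minimal sketch for locating the differing bytes is below; the helper name `differing_bytes` is hypothetical and only compares two hex strings copied from the log. A single differing byte per sector would be consistent with the `maxLength: 1` setting in the IOChaos spec, although that link is an assumption rather than something the log proves.

```python
def differing_bytes(left_hex, right_hex):
    """Return (index, left_byte, right_byte) for every byte position that differs.

    Both arguments are hex strings copied from one word of the vdbench dump,
    for example "01374e77" and "01374eec".
    """
    left = bytes.fromhex(left_hex)
    right = bytes.fromhex(right_hex)
    if len(left) != len(right):
        raise ValueError("inputs must have the same length")
    return [(i, l, r) for i, (l, r) in enumerate(zip(left, right)) if l != r]


# Word pairs taken from the starred rows of the log above.
row_0x0c0 = differing_bytes("01374e77", "01374eec")
row_0x140 = differing_bytes("5fa8f949", "5fa84449")
```

For these inputs, each call returns exactly one entry: byte 3 of the word in row 0x0c0 (0x77 versus 0xec) and byte 2 of the word in row 0x140 (0xf9 versus 0x44).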

4. Run md5sum on a file repeatedly. The checksum from the first read differs from all subsequent reads:

root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir# md5sum vdb_f0001.file
51fc69b3d5030f5dafc1ea00cced02fc  vdb_f0001.file
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir#
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir# md5sum vdb_f0001.file
febf97c778d7a4fe3d586dd84fa5c767  vdb_f0001.file
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir#
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir# md5sum vdb_f0001.file
febf97c778d7a4fe3d586dd84fa5c767  vdb_f0001.file
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir# md5sum vdb_f0001.file
febf97c778d7a4fe3d586dd84fa5c767  vdb_f0001.file
root@dynamic-ce-juicefs-85545b6bfc-844z6:/data/vdb.1_3.dir# md5sum vdb_f0001.file
febf97c778d7a4fe3d586dd84fa5c767  vdb_f0001.file
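The manual md5sum check above can be scripted. The following is a rough sketch, not part of JuiceFS or vdbench: the function names are hypothetical, and it simply hashes the same file several times and flags the pattern seen here, where the first digest differs and the later ones agree. Note that repeated reads may be served from the kernel page cache rather than the JuiceFS cache disk, so agreement among later reads does not by itself show which copy is correct.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def hash_reads(path, reads=5):
    """Read and hash the same file `reads` times, returning digests in order."""
    return [md5_of_file(path) for _ in range(reads)]


def first_read_differs(digests):
    """True when the first digest differs and all later digests agree."""
    if len(digests) < 2:
        return False
    later = set(digests[1:])
    return len(later) == 1 and digests[0] not in later
```

For example, `first_read_differs(hash_reads("/data/vdb.1_3.dir/vdb_f0001.file"))` would be expected to return `True` for the sequence of checksums shown above.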

What you expected to happen:

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?

Environment:

  • JuiceFS version (use juicefs --version) or Hadoop Java SDK version:
    root@chaos-k8s-001:~/chenyunhui/juicefs# ./juicefs version
    juicefs version 1.3.0-dev+2025-04-14.196db13e
  • Cloud provider or hardware configuration running JuiceFS:
  • OS (e.g cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Object storage (cloud provider and region, or self maintained):
  • Metadata engine info (version, cloud provider managed or self maintained):
  • Network connectivity (JuiceFS to metadata engine, JuiceFS to object storage):
  • Others:
