Inconsistent results when walking the filesystem concurrently #780

Closed
dylan-ferreira opened this issue Oct 18, 2018 · 17 comments

Comments

@dylan-ferreira

commented Oct 18, 2018

The Issue

We found this during the audit phase of a new software project. LizardFS gives inconsistent results when walking the mounted filesystem. Running find ./ returns mostly correct files, but also a selection of duplicates (the exact same path/file two or more times), as well as a selection of files that don't actually exist. From what we can tell, all files stored on the filesystem do exist, but occasionally filenames are returned with incorrect paths. The incorrect paths also exist, but they are not where the file in question is stored.

This has been tested and recreated on 3.11.3 and 3.12.0 on Ubuntu Xenial in multiple datacenters (i.e. separate hardware).

Recreating

Set up a new 3.12.0 LizardFS environment with one or more chunk servers.

On the master server:

  • Mount the LizardFS root (e.g. /mnt/lfs).
  • Write ~3 million files to the filesystem distributed in paths 3 or more levels deep (e.g. ./1/2/3/file1 etc).
  • When the files are in place, run concurrent find commands (or any other script etc that walks the filesystem), sorted out to a file, e.g.
    find /mnt/lfs | sort > /var/tmp/find.out.1
    find /mnt/lfs | sort > /var/tmp/find.out.2
  • Then compare the output:
    comm -3 /var/tmp/find.out.1 /var/tmp/find.out.2

In most cases, if the find commands are run concurrently, you will get a handful of differences. In a sample size of 3 million, we're seeing 50 - 200 incorrect items.
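To see which paths are affected, and not only that the listings differ, the two sorted output files can be inspected further. The following is a minimal sketch, not part of the original report: the function names list_duplicates and list_differences are made up for illustration, both expect input that is already sorted (as in the find | sort commands above), and paths containing tab characters are not handled.

```shell
# Hypothetical helpers (names are illustrative, not from the report) for
# inspecting listings produced by `find ... | sort`.

# Print every path that occurs more than once in one sorted listing.
list_duplicates() {
    uniq -d "$1"
}

# Print paths that appear in only one of two sorted listings,
# tagged with the side they came from.
list_differences() {
    # comm -3 hides lines common to both files; lines unique to the
    # second file are prefixed with a tab, so awk sees an empty $1.
    comm -3 "$1" "$2" | awk -F'\t' '
        $1 != "" { print "only-in-first\t" $1; next }
        { print "only-in-second\t" $2 }'
}
```

For example, list_duplicates /var/tmp/find.out.1 would show the paths reported twice, and list_differences /var/tmp/find.out.1 /var/tmp/find.out.2 would show the paths on which the two walks disagree.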

@onlyjob

Member

commented Oct 19, 2018

I could not reproduce on 3.13.0~rc1 (jemalloc build) under Debian amd64 in directory tree with 4_000_000+ files/directories, mounted with cacheexpirationtime=0.

@dylan-ferreira

Author

commented Oct 19, 2018

I gave it a try on 3.12.0 under Ubuntu Xenial, mounted with cacheexpirationtime=0, 3138754 files.
Running the following commands concurrently on a static filesystem (nothing writing to the filesystem):
find /mnt/lfs | sort > find.out.1.s
find /mnt/lfs | sort > find.out.2.s
The resulting file counts matched, but the md5sums were different.

find.out.1.s had 2 duplicate lines and there were 40 differences between find.out.1.s and find.out.2.s

I haven't tried upgrading to 3.13.0~rc1 yet.

@onlyjob

Member

commented Oct 21, 2018

In my case md5sums matched. Please be extremely careful upgrading to 3.13.0~rc if you have any EC chunks: #746.
Anyway there are no official Debian packages for 3.13.0~rc1 hence not much you can do on Ubuntu...

Just to clarify, did you try 3.12.0+dfsg-1 from native Ubuntu repositories?

@dylan-ferreira

Author

commented Oct 22, 2018

We're running the packages referenced from the LizardFS website: http://packages.lizardfs.com/ubuntu/xenial/dists/xenial/main/binary-amd64/Packages

@onlyjob

Member

commented Oct 23, 2018

I don't trust those poor quality packages...

@dylan-ferreira

Author

commented Oct 24, 2018

I set up a brand-new 3.13.0~rc1 cluster using the packages from LizardFS (https://lizardfs.com/download). The master and chunk servers are all running on Ubuntu (Trusty) 14.04. The master is running a stock config for this test. In my original setup, I ran the master in an LXC container, but for this setup, I put the master on the hardware. There are 3 chunk servers also running on hardware, each with 4 disks.

I loaded this filesystem with 2.2 million small files and was able to reproduce the issue.
4 concurrent find commands produced 3 identical output files and 1 file that is different.

In the "bad" (different) file, I found:

  • 2 files were reported twice (duplicates).
  • 38 files that only appeared in this file. These 38 files do not actually exist in the filesystem although they follow the same path/filename pattern as the existing files (possibly path & filename being transposed?).
  • 40 files that are missing (exist in the other 3 files).

@matthiaz

Contributor

commented Oct 25, 2018

@onlyjob "I don't trust those poor quality packages..." why not? What is bad about those?
Honest question, I would like to know.

@onlyjob

Member

commented Oct 25, 2018

Confirmed: I could not reproduce the problem with two parallel find but when I increased concurrency to four the output became non-deterministic and all four results differ with unique MD5.

@onlyjob onlyjob added the bug label Oct 25, 2018

@dylan-ferreira

Author

commented Oct 25, 2018

I've possibly found a little more here. So far, I can only reproduce this problem when the filenames are non-unique across the path I'm recursing. If the filenames are unique across the recursed path, I cannot reproduce.

Here's a set of shell scripts I'm running with:
lfs-test.zip

Test Data Creation

lfs-make-files-unique.sh

Makes a 3 level deep directory tree [a-z]/[a-z]/[a-z] with files at bottom that are unique across the entire tree. The contents of each file is the filename.

lfs-make-files-duplicates.sh

Makes a 3 level deep directory tree [a-f]/[a-f]/[a-f]. Each bottom-level directory contains 14000 files named 1 to 14000 (so plenty of identical filenames in the tree). The contents of each file is the filename.
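The scripts themselves are only attached as a zip, so the following is a rough sketch of what a generator matching this description could look like, not the attached script. The function name make_duplicates_tree and its optional arguments are assumptions, and whether the original writes a trailing newline into each file is unknown.

```shell
# Hypothetical re-creation of the described data set (not the attached script).
# Usage: make_duplicates_tree ROOT [LETTERS] [FILES_PER_DIR]
make_duplicates_tree() {
    root=$1
    letters=${2:-"a b c d e f"}   # directory names used at each of the 3 levels
    count=${3:-14000}             # files per bottom-level directory
    for x in $letters; do
        for y in $letters; do
            for z in $letters; do
                dir="$root/$x/$y/$z"
                mkdir -p "$dir" || return 1
                n=1
                while [ "$n" -le "$count" ]; do
                    # Each file is named after its number and contains that name.
                    printf '%s\n' "$n" > "$dir/$n"
                    n=$((n + 1))
                done
            done
        done
    done
}
```

With the defaults this creates 6 x 6 x 6 = 216 directories holding 14000 files each, about 3 million files in total, so a smaller count is advisable for a first trial run.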

Test Data Testing

lfs-test-unique.sh & lfs-test-duplicates.sh

Runs 99 tests. Each test has the following steps:

  • Runs 4 (mostly) concurrent directory walks of a path into sorted output files.
  • Tests the output files for differences (md5sum).
  • If a difference is detected, it reports and the output files are gzipped.
  • If no differences are detected, the output files are deleted.

Results

Results from the test scripts above:

lfs-test-unique.sh : 0 consistency errors found over 99 runs
lfs-test-duplicates.sh : 5 consistency errors found over 99 runs

Second set of results:

lfs-test-unique.sh : 0 consistency errors found over 1000 runs
lfs-test-duplicates.sh : 9 consistency errors found over 99 runs

@dylan-ferreira

Author

commented Nov 8, 2018

Checking the progress on this ticket.

I've run the same tests on a MooseFS v3.0.101-1 setup on the same hardware and can't replicate.

@matthiaz

Contributor

commented Nov 8, 2018

@onlyjob onlyjob added this to the 3.13.0 milestone Dec 18, 2018

@onlyjob

Member

commented Dec 18, 2018

@onlyjob "I don't trust those poor quality packages..."

@matthiaz: why not? What is bad about those? Honest question, I would like to know.

I tried to elaborate on that once in #456: #456 (comment)

Outside of Debian, vendors simply cannot accomplish proper integration with Debian (through packaging). It is just not their speciality. Naturally they miss peer review, policy requirements, best practice, vigorous Debian quality control and CI, don't follow/answer Debian-specific bugs, and build packages for a few hardware architectures at best.

LizardFS vendor packages are many commits behind official Debian packaging and not even trying to sync. As far as I'm concerned, it is a redundant duplication of effort for LizardFS team to try providing their own binary packages for Debian. Besides when you get your packages from Debian, it means that at least package maintainer checked them (or tested), maybe cherry-picked fixes, etc.

Basically Don't Break Debian by using inferior unprofessional packages...

@onlyjob onlyjob modified the milestones: 3.13.0, 3.12.1 Dec 18, 2018

@dylan-ferreira

Author

commented Feb 6, 2019

Is there any status on this ticket? We're still struggling to get consistent results from the filesystem, i.e. we can't back up, etc.

@onlyjob

Member

commented Feb 6, 2019

There has been no progress on anything in months.
You are not the only one concerned: #805
This project appears to be dormant... :(

@pbeza

Member

commented Mar 27, 2019

Unfortunately I can confirm the problem.

I've compiled the latest LizardFS master branch (ver. 3.13.0) and tested it in my Docker-based test environment using @dylan-ferreira's lfs-make-files-duplicates.sh and lfs-test-duplicates.sh scripts.

This is the result of running lfs-test-duplicates.sh:

root@lizardfs-client:~/lfs-test# ./lfs-test-duplicates.sh
mkdir: created directory '/var/tmp/lfs-test-duplicates.output'
found a problem on run 8!  (/var/tmp/lfs-test-duplicates.output/find.out.8.*.is)
gzip: /var/tmp/lfs-test-duplicates.output/find.out.8.01.is: file size changed while zipping
gzip: /var/tmp/lfs-test-duplicates.output/find.out.8.03.is: file size changed while zipping
found a problem on run 9!  (/var/tmp/lfs-test-duplicates.output/find.out.9.*.is)
found a problem on run 12!  (/var/tmp/lfs-test-duplicates.output/find.out.12.*.is)
found a problem on run 13!  (/var/tmp/lfs-test-duplicates.output/find.out.13.*.is)
gzip: /var/tmp/lfs-test-duplicates.output/find.out.13.01.is: file size changed while zipping
found a problem on run 15!  (/var/tmp/lfs-test-duplicates.output/find.out.15.*.is)
gzip: /var/tmp/lfs-test-duplicates.output/find.out.15.02.is: file size changed while zipping
gzip: /var/tmp/lfs-test-duplicates.output/find.out.15.03.is: file size changed while zipping
found a problem on run 16!  (/var/tmp/lfs-test-duplicates.output/find.out.16.*.is)
found a problem on run 17!  (/var/tmp/lfs-test-duplicates.output/find.out.17.*.is)
found a problem on run 18!  (/var/tmp/lfs-test-duplicates.output/find.out.18.*.is)
gzip: /var/tmp/lfs-test-duplicates.output/find.out.18.03.is: file size changed while zipping
found a problem on run 23!  (/var/tmp/lfs-test-duplicates.output/find.out.23.*.is)
gzip: /var/tmp/lfs-test-duplicates.output/find.out.23.01.is: file size changed while zipping
gzip: /var/tmp/lfs-test-duplicates.output/find.out.23.02.is: file size changed while zipping
found a problem on run 25!  (/var/tmp/lfs-test-duplicates.output/find.out.25.*.is)
found a problem on run 27!  (/var/tmp/lfs-test-duplicates.output/find.out.27.*.is)
found a problem on run 28!  (/var/tmp/lfs-test-duplicates.output/find.out.28.*.is)
found a problem on run 30!  (/var/tmp/lfs-test-duplicates.output/find.out.30.*.is)
found a problem on run 31!  (/var/tmp/lfs-test-duplicates.output/find.out.31.*.is)
found a problem on run 32!  (/var/tmp/lfs-test-duplicates.output/find.out.32.*.is)
found a problem on run 33!  (/var/tmp/lfs-test-duplicates.output/find.out.33.*.is)
found a problem on run 35!  (/var/tmp/lfs-test-duplicates.output/find.out.35.*.is)
found a problem on run 36!  (/var/tmp/lfs-test-duplicates.output/find.out.36.*.is)
found a problem on run 37!  (/var/tmp/lfs-test-duplicates.output/find.out.37.*.is)
found a problem on run 42!  (/var/tmp/lfs-test-duplicates.output/find.out.42.*.is)
found a problem on run 43!  (/var/tmp/lfs-test-duplicates.output/find.out.43.*.is)
found a problem on run 44!  (/var/tmp/lfs-test-duplicates.output/find.out.44.*.is)
found a problem on run 51!  (/var/tmp/lfs-test-duplicates.output/find.out.51.*.is)
found a problem on run 52!  (/var/tmp/lfs-test-duplicates.output/find.out.52.*.is)
found a problem on run 53!  (/var/tmp/lfs-test-duplicates.output/find.out.53.*.is)
found a problem on run 55!  (/var/tmp/lfs-test-duplicates.output/find.out.55.*.is)
found a problem on run 56!  (/var/tmp/lfs-test-duplicates.output/find.out.56.*.is)
found a problem on run 59!  (/var/tmp/lfs-test-duplicates.output/find.out.59.*.is)
found a problem on run 60!  (/var/tmp/lfs-test-duplicates.output/find.out.60.*.is)
found a problem on run 63!  (/var/tmp/lfs-test-duplicates.output/find.out.63.*.is)
found a problem on run 66!  (/var/tmp/lfs-test-duplicates.output/find.out.66.*.is)
found a problem on run 67!  (/var/tmp/lfs-test-duplicates.output/find.out.67.*.is)
found a problem on run 71!  (/var/tmp/lfs-test-duplicates.output/find.out.71.*.is)
found a problem on run 72!  (/var/tmp/lfs-test-duplicates.output/find.out.72.*.is)
found a problem on run 73!  (/var/tmp/lfs-test-duplicates.output/find.out.73.*.is)
found a problem on run 74!  (/var/tmp/lfs-test-duplicates.output/find.out.74.*.is)
found a problem on run 75!  (/var/tmp/lfs-test-duplicates.output/find.out.75.*.is)
found a problem on run 76!  (/var/tmp/lfs-test-duplicates.output/find.out.76.*.is)
found a problem on run 77!  (/var/tmp/lfs-test-duplicates.output/find.out.77.*.is)
found a problem on run 78!  (/var/tmp/lfs-test-duplicates.output/find.out.78.*.is)
found a problem on run 80!  (/var/tmp/lfs-test-duplicates.output/find.out.80.*.is)
found a problem on run 81!  (/var/tmp/lfs-test-duplicates.output/find.out.81.*.is)
found a problem on run 82!  (/var/tmp/lfs-test-duplicates.output/find.out.82.*.is)
found a problem on run 83!  (/var/tmp/lfs-test-duplicates.output/find.out.83.*.is)
found a problem on run 86!  (/var/tmp/lfs-test-duplicates.output/find.out.86.*.is)
found a problem on run 87!  (/var/tmp/lfs-test-duplicates.output/find.out.87.*.is)
found a problem on run 89!  (/var/tmp/lfs-test-duplicates.output/find.out.89.*.is)
found a problem on run 90!  (/var/tmp/lfs-test-duplicates.output/find.out.90.*.is)
found a problem on run 91!  (/var/tmp/lfs-test-duplicates.output/find.out.91.*.is)
found a problem on run 93!  (/var/tmp/lfs-test-duplicates.output/find.out.93.*.is)
found a problem on run 94!  (/var/tmp/lfs-test-duplicates.output/find.out.94.*.is)
found a problem on run 96!  (/var/tmp/lfs-test-duplicates.output/find.out.96.*.is)
found a problem on run 98!  (/var/tmp/lfs-test-duplicates.output/find.out.98.*.is)
found a problem on run 99!  (/var/tmp/lfs-test-duplicates.output/find.out.99.*.is)

To make sure that the find implementation itself is not at fault, I ran the same test without LizardFS: just an ext4 filesystem, a lot of files generated by @dylan-ferreira's lfs-make-files-duplicates.sh, and find traversing the directory tree via lfs-test-duplicates.sh. It works fine, so there is probably a problem with the LizardFS master server. For both tests (with and without LizardFS) I used the same find version, 4.6.0.225-235f.

EDIT

I was about to click the Comment button to confirm the problem, but I've just noticed that you run 4 find instances in parallel and don't wait for them to finish before comparing the output files and gzipping them. That's why you can see:

gzip: /var/tmp/lfs-test-duplicates.output/find.out.23.01.is: file size changed while zipping

in the ./lfs-test-duplicates.sh output. I'm not sure whether it also explains why output files such as find.out.99.*.is are not identical, but it may. I will retest soon and report back.

@pbeza

Member

commented Mar 28, 2019

Update

I retested it with wait after all of the four find instances:

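# ROOT (tree to walk) and OUTPUT (result directory) are set earlier in the script.
for a in {1..99};
do
    # Start 4 concurrent walks, each writing a sorted listing.
    for i in {1..4};
    do
        find "${ROOT}" | sort > "${OUTPUT}/find.out.${a}.0${i}.is" &
        pids[${i}]=$!
    done

    # Wait for every walk to finish before touching the output files.
    for pid in "${pids[@]}";
    do
        wait "$pid"
    done

    # The pattern is left unquoted below on purpose so the shell expands the glob.
    file_pattern="${OUTPUT}/find.out.${a}.*.is"
    # Number of distinct checksums; more than 1 means the listings disagree.
    diff_count=$(md5sum ${file_pattern} | awk '{print $1}' | sort -u | wc -l)
    # -gt compares numerically; ">" inside [[ ]] would compare strings.
    if [[ ${diff_count} -gt 1 ]]; then
        echo "found a problem on run ${a}!  (${OUTPUT}/find.out.${a}.*.is)"
        gzip -1 ${file_pattern}
    else
        rm -f ${file_pattern}
    fi
done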

and there is still a problem (note fewer error messages):

root@lizardfs-client:~/lfs-test# ./lfs-test-duplicates.sh 
mkdir: created directory '/var/tmp/lfs-test-duplicates.output'
found a problem on run 42!  (/var/tmp/lfs-test-duplicates.output/find.out.42.*.is)
found a problem on run 49!  (/var/tmp/lfs-test-duplicates.output/find.out.49.*.is)
found a problem on run 50!  (/var/tmp/lfs-test-duplicates.output/find.out.50.*.is)

The most striking thing is that in every failing test case the same files are missing/duplicated (files no. 135 and 5824):

patryk@patryk:/tmp/lfs-test-duplicates.output$ diff find.out.42.01.is find.out.42.03.is 
1460018a1460019
> /mnt/lizardfs/duplicates/c/f/c/135
1465490d1465490
< /mnt/lizardfs/duplicates/c/f/c/5824

patryk@patryk:/tmp/lfs-test-duplicates.output$ diff find.out.49.01.is find.out.49.02.is 
801963d801962
< /mnt/lizardfs/duplicates/b/d/d/135
807434a807434
> /mnt/lizardfs/duplicates/b/d/d/5824

patryk@patryk:/tmp/lfs-test-duplicates.output$ diff find.out.50.01.is find.out.50.03.is 
2062070a2062071
> /mnt/lizardfs/duplicates/e/a/d/135
2067542d2067542
< /mnt/lizardfs/duplicates/e/a/d/5824

Updated update

I narrowed the problem with git bisect. I'm quite sure that the bug was introduced in commit no. f68e389bdf21c06f26e4811b81193941f2f64997.

A temporary workaround is to mount the LizardFS share with mfsdirentrycacheto=0 (unfortunately, this introduces a noticeable slowdown).
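As a rough illustration of that workaround, the option can be passed to the client at mount time. The wrapper below is only a sketch: the function name, the mount point argument and the default master hostname mfsmaster are placeholders, and it assumes the mfsmount client with its usual -H (master host) and -o (mount option) switches.

```shell
# Hypothetical wrapper (name and defaults are placeholders): mount a LizardFS
# share with the client's directory entry cache disabled.
# Usage: mount_without_direntry_cache MOUNTPOINT [MASTER_HOST]
mount_without_direntry_cache() {
    mountpoint=$1
    master=${2:-mfsmaster}
    # mfsdirentrycacheto=0 sets the directory entry cache timeout to zero seconds.
    mfsmount "$mountpoint" -H "$master" -o mfsdirentrycacheto=0
}
```

For example, mount_without_direntry_cache /mnt/lizardfs would mount the share from the default master with the cache timeout set to zero.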

commit f68e389bdf21c06f26e4811b81193941f2f64997 (refs/bisect/bad)
Author: Hazeman <hazeman@skytechnology.pl>
Date:   Fri Mar 24 13:10:04 2017 +0100

    mount: Upgrade to new directory cache
    
    This commit replaces old directory cache with new one
    based on class DirEntryCache.
    
    Change-Id: I71dd4bb6ecc5a35622c0159aba17973f4aef5ffb

This problem may also be related to commit no. 22ef355de5d19e297c9e9a9381a50866afb2e384. The DirEntryCache::insertSubsequent() function is probably buggy.

What is interesting: if I comment out those two lines, then all tests pass without a problem. It may be some race condition related to entry_index.

Work in progress... :-)

@trzysiek

Member

commented Sep 1, 2019

Fix

I think I managed to find the source of the problem and fix it.

As suspected, the problem was with restoring files stored in client's DirEntryCache. I looked through its entire code and it all seemed fine, but after a while I found one small glitch, which caused all these errors.

The entire fix, after which everything (including the tests discussed here previously) seems to work, is a change to just one line of code. After adding further checks to that condition, so that it reads:

if (!gDirEntryCache.isValid(it) || it->index != entry_index ||
        it->parent_inode != ino || it->uid != ctx.uid || it->gid != ctx.gid) {
    break;
}

all tests pass peacefully.

The code before the fix was correct as long as only one process accessed the same data at a time. When multiple threads or processes request the contents of the same directory within a very short time, definitely shorter than the cache validity period set by the mfsdirentrycacheto option (0.25 s by default), it could lead to serious errors. As observed before, files can go missing from a directory's listing and files from other directories can be listed incorrectly, which might have grievous consequences, e.g. if a listing operation is combined with removal.
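To make the failure mode easier to picture, here is a toy model, not LizardFS code. It assumes, purely for illustration, that cached entries are iterated in order and that the earlier check only compared the running index. The function name walk_cache and the text format of the cache file are made up, and the uid/gid checks from the fix are left out because they would work the same way as the parent check.

```shell
# Toy model (not LizardFS code): list a directory from a flat "cache" file.
# Usage: walk_cache CACHE_FILE INO START_INDEX CHECK_PARENT
# CACHE_FILE holds one entry per line: "<parent_inode> <index> <name>",
# in the order the toy cache iterates them.
walk_cache() {
    awk -v ino="$2" -v want="$3" -v check_parent="$4" '
        # Skip ahead to the first entry of the requested directory.
        !started && ($1 != ino || $2 != want) { next }
        { started = 1 }
        # Stop at a gap in the index sequence.
        $2 != want { exit }
        # Extra check: stop when the entry belongs to another directory.
        check_parent == 1 && $1 != ino { exit }
        { print $3; want++ }
    ' "$1"
}
```

If the cache holds entries 0 and 1 of directory 10 followed by an entry with index 2 that belongs to directory 20, calling walk_cache with CHECK_PARENT set to 0 lists that foreign entry as part of directory 10, while CHECK_PARENT set to 1 stops before it.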

Updated tests

lfs-tests.zip

I include zipped tests which I used for testing. It consists of a previous test (lfs-test-duplicates), an updated lfs-test-unique, with structure same as in lfs-test-duplicates (14k files in directories /a-f/a-f/a-f), as well as a third, smaller one, so that it runs faster, but still causes quite a few errors. It has unique filenames, but smaller number of files in it.

Previously, unique tests worked fine, but it was only because number of files in a directory was too small for errors to occur. Unsurprisingly to me, as someone who looked carefully at the code, the updated unique tests fail much more often than the duplicates one.

All those tests passed with a 3.13.0 client recompiled with the change. I ran all 99 iterations of each test twice, so it's unlikely that the problem persists. The unmodified master branch (3.13.0) fails those tests with the following results:

root@lizardfs-client:/# ./lfs-test-duplicates.sh 
found a problem on run 6!  (/var/tmp/lfs-test-duplicates.output/find.out.6.*.is)
found a problem on run 8!  (/var/tmp/lfs-test-duplicates.output/find.out.8.*.is)
root@lizardfs-client:/# ./lfs-test-unique-small.sh 
found a problem on run 6!  (/var/tmp/lfs-test-unique-small.output/find.out.6.*.is)
found a problem on run 11!  (/var/tmp/lfs-test-unique-small.output/find.out.11.*.is)
found a problem on run 44!  (/var/tmp/lfs-test-unique-small.output/find.out.44.*.is)
found a problem on run 50!  (/var/tmp/lfs-test-unique-small.output/find.out.50.*.is)
found a problem on run 57!  (/var/tmp/lfs-test-unique-small.output/find.out.57.*.is)
found a problem on run 61!  (/var/tmp/lfs-test-unique-small.output/find.out.61.*.is)
found a problem on run 77!  (/var/tmp/lfs-test-unique-small.output/find.out.77.*.is)
found a problem on run 99!  (/var/tmp/lfs-test-unique-small.output/find.out.99.*.is)
root@lizardfs-client:/# ./lfs-test-unique.sh 
found a problem on run 2!  (/var/tmp/lfs-test-unique.output/find.out.2.*.is)
found a problem on run 4!  (/var/tmp/lfs-test-unique.output/find.out.4.*.is)
found a problem on run 8!  (/var/tmp/lfs-test-unique.output/find.out.8.*.is)
found a problem on run 10!  (/var/tmp/lfs-test-unique.output/find.out.10.*.is)
found a problem on run 11!  (/var/tmp/lfs-test-unique.output/find.out.11.*.is)
found a problem on run 12!  (/var/tmp/lfs-test-unique.output/find.out.12.*.is)
found a problem on run 13!  (/var/tmp/lfs-test-unique.output/find.out.13.*.is)
found a problem on run 14!  (/var/tmp/lfs-test-unique.output/find.out.14.*.is)
found a problem on run 15!  (/var/tmp/lfs-test-unique.output/find.out.15.*.is)
found a problem on run 16!  (/var/tmp/lfs-test-unique.output/find.out.16.*.is)
found a problem on run 17!  (/var/tmp/lfs-test-unique.output/find.out.17.*.is)
found a problem on run 19!  (/var/tmp/lfs-test-unique.output/find.out.19.*.is)
found a problem on run 20!  (/var/tmp/lfs-test-unique.output/find.out.20.*.is)
found a problem on run 21!  (/var/tmp/lfs-test-unique.output/find.out.21.*.is)
found a problem on run 22!  (/var/tmp/lfs-test-unique.output/find.out.22.*.is)
found a problem on run 23!  (/var/tmp/lfs-test-unique.output/find.out.23.*.is)
found a problem on run 25!  (/var/tmp/lfs-test-unique.output/find.out.25.*.is)
found a problem on run 26!  (/var/tmp/lfs-test-unique.output/find.out.26.*.is)
found a problem on run 29!  (/var/tmp/lfs-test-unique.output/find.out.29.*.is)
found a problem on run 34!  (/var/tmp/lfs-test-unique.output/find.out.34.*.is)
found a problem on run 35!  (/var/tmp/lfs-test-unique.output/find.out.35.*.is)
found a problem on run 36!  (/var/tmp/lfs-test-unique.output/find.out.36.*.is)
found a problem on run 38!  (/var/tmp/lfs-test-unique.output/find.out.38.*.is)
found a problem on run 39!  (/var/tmp/lfs-test-unique.output/find.out.39.*.is)
found a problem on run 41!  (/var/tmp/lfs-test-unique.output/find.out.41.*.is)
found a problem on run 43!  (/var/tmp/lfs-test-unique.output/find.out.43.*.is)
found a problem on run 53!  (/var/tmp/lfs-test-unique.output/find.out.53.*.is)
found a problem on run 54!  (/var/tmp/lfs-test-unique.output/find.out.54.*.is)
found a problem on run 57!  (/var/tmp/lfs-test-unique.output/find.out.57.*.is)
found a problem on run 58!  (/var/tmp/lfs-test-unique.output/find.out.58.*.is)
found a problem on run 59!  (/var/tmp/lfs-test-unique.output/find.out.59.*.is)
found a problem on run 61!  (/var/tmp/lfs-test-unique.output/find.out.61.*.is)
found a problem on run 62!  (/var/tmp/lfs-test-unique.output/find.out.62.*.is)
found a problem on run 63!  (/var/tmp/lfs-test-unique.output/find.out.63.*.is)
found a problem on run 66!  (/var/tmp/lfs-test-unique.output/find.out.66.*.is)
found a problem on run 69!  (/var/tmp/lfs-test-unique.output/find.out.69.*.is)
found a problem on run 70!  (/var/tmp/lfs-test-unique.output/find.out.70.*.is)
found a problem on run 74!  (/var/tmp/lfs-test-unique.output/find.out.74.*.is)
found a problem on run 77!  (/var/tmp/lfs-test-unique.output/find.out.77.*.is)
found a problem on run 78!  (/var/tmp/lfs-test-unique.output/find.out.78.*.is)
found a problem on run 79!  (/var/tmp/lfs-test-unique.output/find.out.79.*.is)
found a problem on run 81!  (/var/tmp/lfs-test-unique.output/find.out.81.*.is)
found a problem on run 82!  (/var/tmp/lfs-test-unique.output/find.out.82.*.is)
found a problem on run 83!  (/var/tmp/lfs-test-unique.output/find.out.83.*.is)
found a problem on run 84!  (/var/tmp/lfs-test-unique.output/find.out.84.*.is)
found a problem on run 87!  (/var/tmp/lfs-test-unique.output/find.out.87.*.is)
found a problem on run 88!  (/var/tmp/lfs-test-unique.output/find.out.88.*.is)
found a problem on run 89!  (/var/tmp/lfs-test-unique.output/find.out.89.*.is)
found a problem on run 90!  (/var/tmp/lfs-test-unique.output/find.out.90.*.is)
found a problem on run 94!  (/var/tmp/lfs-test-unique.output/find.out.94.*.is)
found a problem on run 95!  (/var/tmp/lfs-test-unique.output/find.out.95.*.is)
found a problem on run 97!  (/var/tmp/lfs-test-unique.output/find.out.97.*.is)
found a problem on run 99!  (/var/tmp/lfs-test-unique.output/find.out.99.*.is)

Aftermath

I'll run a couple more tests, on a bigger data sample and with other operations besides a simple find, but the problem seems to be fixed.

I strongly encourage everyone interested in this issue to try the fix for themselves, as so far it has only been tested locally on my computer.

@trzysiek trzysiek self-assigned this Sep 3, 2019
