Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed data race in threads found thread sanitizer #2209

Merged
merged 1 commit into from
Jul 9, 2023

Conversation

ggtakec
Copy link
Member

@ggtakec ggtakec commented Jul 9, 2023

Relevant Issue (if applicable)

n/a

Details

The following error in the ThreadSinitizer test was detected, so it was fixed.

WARNING: ThreadSanitizer: data race (pid=22052)
  Read of size 4 at 0x7b4c000119a8 by thread T9 (mutexes: write M0):
    #0 FdEntity::IsDirtyNewFile() const /__w/s3fs-fuse/s3fs-fuse/src/fdcache_entity.cpp:2546 (s3fs+0x5dc836) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 s3fs_release(char const*, fuse_file_info*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:3023 (s3fs+0x54aae9) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #2 fuse_fs_release ??:? (libfuse.so.2+0xd345) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Previous write of size 4 at 0x7b4c000119a8 by thread T44 (mutexes: write M1):
    #0 FdEntity::UploadPending(int, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache_entity.cpp:2450 (s3fs+0x5d983b) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 s3fs_release(char const*, fuse_file_info*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:3026 (s3fs+0x54ab2b) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #2 fuse_fs_release ??:? (libfuse.so.2+0xd345) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Location is heap block of size 448 at 0x7b4c00011800 allocated by thread T1:
    #0 operator new(unsigned long) ??:? (s3fs+0x53747b) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 FdManager::Open(int&, char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache.cpp:593 (s3fs+0x5c7df3) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #2 AutoFdEntity::Open(char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache_auto.cpp:103 (s3fs+0x5e9413) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #3 s3fs_create(char const*, unsigned int, fuse_file_info*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:1135 (s3fs+0x54c193) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #4 fuse_fs_create ??:? (libfuse.so.2+0xddff) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Mutex M0 (0x7b4c00011920) created at:
    #0 pthread_mutex_init ??:? (s3fs+0x4b0354) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 FdEntity /__w/s3fs-fuse/s3fs-fuse/src/fdcache_entity.cpp:124 (s3fs+0x5ce9c4) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #2 FdManager::Open(int&, char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache.cpp:593 (s3fs+0x5c7e27) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #3 AutoFdEntity::Open(char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache_auto.cpp:103 (s3fs+0x5e9413) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #4 s3fs_create(char const*, unsigned int, fuse_file_info*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:1135 (s3fs+0x54c193) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #5 fuse_fs_create ??:? (libfuse.so.2+0xddff) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Mutex M1 (0x7b4c00011800) created at:
    #0 pthread_mutex_init ??:? (s3fs+0x4b0354) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 FdEntity /__w/s3fs-fuse/s3fs-fuse/src/fdcache_entity.cpp:120 (s3fs+0x5ce967) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #2 FdManager::Open(int&, char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache.cpp:593 (s3fs+0x5c7e27) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #3 AutoFdEntity::Open(char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, long, timespec const&, int, bool, bool, bool, AutoLock::Type) /__w/s3fs-fuse/s3fs-fuse/src/fdcache_auto.cpp:103 (s3fs+0x5e9413) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #4 s3fs_create(char const*, unsigned int, fuse_file_info*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:1135 (s3fs+0x54c193) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #5 fuse_fs_create ??:? (libfuse.so.2+0xddff) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Thread T9 (tid=22089, running) created by thread T1 at:
    #0 pthread_create ??:? (s3fs+0x4ae98f) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 <null> <null> (libfuse.so.2+0xc4e0) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Thread T44 (tid=23204, running) created by thread T38 at:
    #0 pthread_create ??:? (s3fs+0x4ae98f) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 <null> <null> (libfuse.so.2+0xc4e0) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Thread T1 (tid=22054, running) created by main thread at:
    #0 pthread_create ??:? (s3fs+0x4ae98f) (BuildId: ab83e9890914b4c6d66a0fd5ea48fdf43707afe7)
    #1 <null> <null> (libfuse.so.2+0xc4e0) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)
    #2 __libc_start_call_main ??:? (libc.so.6+0x27b49) (BuildId: 245240a31888ad5c11bbc55b18e02d87388f59a9)

SUMMARY: ThreadSanitizer: data race /__w/s3fs-fuse/s3fs-fuse/src/fdcache_entity.cpp:2546 in FdEntity::IsDirtyNewFile() const

This is because some methods that deal with the pending_status variable use different locking variables than others.
These methods used fdent_data_lock and should have used fdent_lock.
(FdEntity::UploadPending calls the Flush method, so fdent_data_lock doesn't work)

Also, the comments in s3fs.cpp were confirmed by this fix and have been removed.

@ggtakec ggtakec requested a review from gaul July 9, 2023 05:35
@ggtakec
Copy link
Member Author

ggtakec commented Jul 9, 2023

It looks like it needs a little more fixing, so I will change to draft once.

@ggtakec
Copy link
Member Author

ggtakec commented Jul 9, 2023

Coincidentally, the following datarace was also detected, so we took action.

WARNING: ThreadSanitizer: data race (pid=15527)
  Write of size 8 at 0x7b0800024780 by thread T129:
    #0 free ??:? (s3fs+0x4ad379) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #1 OPENSSL_sk_free ??:? (libcrypto.so.3+0x205cfa) (BuildId: 3273586fcac67f861ec301429787310dc38b9e7e)
    #2 S3fsCurl::HeadRequest(char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) /__w/s3fs-fuse/s3fs-fuse/src/curl.cpp:3301 (s3fs+0x58151e) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #3 get_object_attribute(char const*, stat*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, bool, bool*, bool) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:513 (s3fs+0x538f52) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #4 check_object_access(char const*, int, stat*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:694 (s3fs+0x550394) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #5 s3fs_getattr(char const*, stat*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:949 (s3fs+0x54266b) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #6 fuse_reply_attr ??:? (libfuse.so.2+0x168a7) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Previous read of size 8 at 0x7b0800024780 by thread T14:
    #0 memcpy ??:? (s3fs+0x4babd4) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #1 OPENSSL_sk_dup ??:? (libcrypto.so.3+0x207433) (BuildId: 3273586fcac67f861ec301429787310dc38b9e7e)
    #2 S3fsCurl::HeadRequest(char const*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >&) /__w/s3fs-fuse/s3fs-fuse/src/curl.cpp:3301 (s3fs+0x58151e) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #3 get_object_attribute(char const*, stat*, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, header_nocase_cmp, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >*, bool, bool*, bool) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:513 (s3fs+0x538f52) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #4 check_object_access(char const*, int, stat*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:694 (s3fs+0x550394) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #5 s3fs_getattr(char const*, stat*) /__w/s3fs-fuse/s3fs-fuse/src/s3fs.cpp:949 (s3fs+0x54266b) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #6 fuse_reply_attr ??:? (libfuse.so.2+0x168a7) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Thread T129 (tid=16169, running) created by thread T8 at:
    #0 pthread_create ??:? (s3fs+0x4ae98f) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #1 <null> <null> (libfuse.so.2+0xc4e0) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

  Thread T14 (tid=15585, running) created by thread T2 at:
    #0 pthread_create ??:? (s3fs+0x4ae98f) (BuildId: 8557aa27952623128f107410c458dd909a62c523)
    #1 <null> <null> (libfuse.so.2+0xc4e0) (BuildId: 4db3553f25db655fdcad5600bb328da5dc18736b)

SUMMARY: ThreadSanitizer: data race ??:? in __interceptor_free

Not enough information from ThreadSanitizer about this fix. But...

It seems that there is a conflict between openssl memory destruction and creation called by libcurl.
A possible part of this is the S3fsCurl::ClearInternalData method called in the S3fsCurl::DestroyCurlHandle method, and this method calls curl_slist_free_all.
And since this error occurred in the test test_concurrent_directory_updates which calls S3fsCurl::HeadRequest repeatedly, I think it's probably due to an unprotected call to S3fsCurl::ClearInternalData.

So that, I have included this fix in this PR.
We need to continue to watch for ThreadSanitizer errors.

@gaul gaul merged commit b253705 into s3fs-fuse:master Jul 9, 2023
17 checks passed
@gaul
Copy link
Member

gaul commented Jul 9, 2023

I am very happy with the ThreadSanitizer errors in CI and subsequent fixes!

@ggtakec ggtakec deleted the fix_data_race branch July 9, 2023 12:13
@ggtakec
Copy link
Member Author

ggtakec commented Jul 9, 2023

@gaul It seems that the problem of #2209 (comment) still remains.
I'll look into this later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants