-
Notifications
You must be signed in to change notification settings - Fork 121
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Tweak Remote class and test multi-threaded file remote access #3834
Conversation
06ae34a
to
c8be7e5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read about Singleton yesterday to make sure I understand what you are doing here and this code is correct, but I have a few comments that I want to check with you.
- The only reason you need the mutex is because you need the
first
variable to be true only for the thread that creates the singleton. The recommendation is to create the singleton in a lambda function that also does all the operations that need to be executed only once. The lambda is guaranteed to be called by a single thread. So, my first question is, could you do all theRegisterFormats
andCMregister_handler
calls in the Singleton? By looking at the code I would say you can. If yes we don't need the mutex. - There is a small memory leak caused by
new
. Unless there is a reason to return a pointer I would return the singleton by reference.
I wrote a bunch of code to try to understand how the singleton works so if you agree with the two points I can send a draft commit with what I'm thinking and you can modify it or trash it. This being said your code does the job, I just think it might be cleaner with the changes based of what I've read.
These are the changes I'm proposing: Author: anagainaru <ana.gainaru@gmail.com>
Date: Sat Oct 7 13:17:28 2023 -0400
Clean the singleton code
diff --git a/source/adios2/toolkit/remote/Remote.cpp b/source/adios2/toolkit/remote/Remote.cpp
index 0fe319942..2adb8bb80 100644
--- a/source/adios2/toolkit/remote/Remote.cpp
+++ b/source/adios2/toolkit/remote/Remote.cpp
@@ -50,15 +50,12 @@ void ReadResponseHandler(CManager cm, CMConnection conn, void *vevent, void *cli
return;
};
-void Remote::InitCMData()
+CManagerSingleton& CManagerSingleton::Instance(RemoteCommon::Remote_evpath_state &ev_state)
{
- std::lock_guard<std::mutex> lockGuard(m_CMInitMutex);
- bool first = true;
- auto CM = CManagerSingleton::Instance(first);
- ev_state.cm = CM->m_cm;
- RegisterFormats(ev_state);
- if (first)
- {
+ static CManagerSingleton self = [&ev_state] {
+ CManagerSingleton instance;
+ ev_state.cm = instance.m_cm;
+ RegisterFormats(ev_state);
CMfork_comm_thread(ev_state.cm);
CMregister_handler(ev_state.OpenResponseFormat, (CMHandlerFunc)OpenResponseHandler,
&ev_state);
@@ -68,7 +65,14 @@ void Remote::InitCMData()
(CMHandlerFunc)OpenSimpleResponseHandler, &ev_state);
CMregister_handler(ev_state.ReadResponseFormat, (CMHandlerFunc)ReadResponseHandler,
&ev_state);
- }
+ return instance;
+ } ();
+ return self;
+}
+
+void Remote::InitCMData()
+{
+ (void) CManagerSingleton::Instance(ev_state);
}
void Remote::Open(const std::string hostname, const int32_t port, const std::string filename,
diff --git a/source/adios2/toolkit/remote/Remote.h b/source/adios2/toolkit/remote/Remote.h
index 0e39d8964..aaa1b89e9 100644
--- a/source/adios2/toolkit/remote/Remote.h
+++ b/source/adios2/toolkit/remote/Remote.h
@@ -63,34 +63,20 @@ private:
bool m_Active = false;
};
+#ifdef ADIOS2_HAVE_SST
class CManagerSingleton
{
public:
-#ifdef ADIOS2_HAVE_SST
+ static CManagerSingleton& Instance(RemoteCommon::Remote_evpath_state &ev_state);
+
+private:
CManager m_cm = NULL;
-#endif
- static CManagerSingleton *Instance(bool &first)
- {
- static std::mutex init_mutex;
- const std::lock_guard<std::mutex> lock(init_mutex);
- static CManagerSingleton *ptr = new CManagerSingleton();
- static bool internal_first = true;
- first = internal_first;
- internal_first = false;
- return ptr;
- }
-
-protected:
-#ifdef ADIOS2_HAVE_SST
+
CManagerSingleton() { m_cm = CManager_create(); }
~CManagerSingleton() { CManager_close(m_cm); }
-#else
- CManagerSingleton() {}
-
- ~CManagerSingleton() {}
-#endif
};
+#endif
} // end namespace adios2 |
Hmm. I tend to agree with your comments. I had been following my original thinking, where my focus was on making sure we shared one CM per process, so I put it into the singleton. That was fine with the Get() level stuff, but the transports being threaded wasn't something I had thought about in advance and I ran into data races. Returning the |
I went down a C++ rabbit hole and I have a better idea. There is |
293a3b2
to
bbf96ae
Compare
Well, I've rewritten and done this without std::atexit. Normal C++ shutdown calls the destructor for the singleton, so there's no need for atexit. BUT, two builds, ascent-nvhpc and el8-icc-ompi, are failing. I could ignore the former as an anomoly, but I think I have to dig into the latter. Both may be failing because the server is dying, but since that's run as a background ctest fixture, there's no indication if it dies. Anyhow, you might look at the code as it stands and see if you like it better. I may not get to the failures today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is why some builds fail but see below for a possible explanation.
ev_state = instance.internalEvState; | ||
return instance; | ||
}(); | ||
ev_state = self.internalEvState; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a thread might reach this line before the thread calling the lambda finished in which case there will be a thread with an invalid internal ev state. If this is the thread that ends up calling std::call_once we will have a problem. Is there a reason why you want the CMregister_handler
calls outside the singleton?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The state should be complete once we have registered the formats. We don't need the handlers to have been registered until we get messages arriving. (I had them outside of the Singleton because they weren't visible inside it. I could have added the externs before it, but I don't think it should be necessary.). My understanding (which may be flawed), is that the singleton creation is thread safe, so anyone that gets it will get it after its constructor has been completed. So, the formats should have been registered and the internalEvState fully populated. So, the ev_state in every Remote instance should be appropriately populated. One of them will register the handlers, and from what I read of the call_once docs, there will be no race conditions. I.E. if a thread invokes the call_once, it will only proceed past that once someone has completed the call, even if it's some other thread doing it. I don't think there are holes here, but I can't explain the failures. Will be trying to get it into a docker image to see if I can get more info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it wasn't the internal state thing. Instead icc seems not agree that this pattern produces a singleton. Specifically, the constructor gets called (creating the CManager), then the destructor gets called (closing the CManager), then we start registering handlers but get a segfault because the CManager has been closed and deallocated. I'm guessing that the other failure, from nvhpc is from the same problem. The gcc-based compilers seem happy, but not the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you are right, the internal state is guaranteed to be created when threads reach that line. Let me play with godbolt a bit to look at the assembly created by icc. I am very surprised only gcc likes this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand why the lambda doesn't behave like a mutex for icc/nvhpc but I'm giving up. Let's merge it like this and if I feel like spending some time on understanding this we can revisit in the future.
* master: (183 commits) Adding tests for writing null blocks with and without compression Replace LookupWriterRec's linear search on RecList with an unordered_map. For 250k variables, time goes from 21sec to ~1sec in WSL. The order of entries in RecList was not necessary for the serializer to work correctly. Merge pull request ornladios#3805 from pnorbert/fix-bpls-string-scalar Merge pull request ornladios#3804 from pnorbert/fix-aws-version Merge pull request ornladios#3759 from pnorbert/bp5dbg-metadata new attempt to commit query support of local array. (ornladios#3868) MPI::MPI_Fortran should be INTERFACE not PUBLIC Fix hip example compilation error (ornladios#3865) Server Improvements (ornladios#3862) ascent,ci: remove unshallow flag Remove Slack as a contact mechanism (ornladios#3866) bug fix: syntax error in json output (ornladios#3857) Update the bpWriterReadHip example's cmake to run on crusher Examples: Use BPFile instead of BP3/4/5 for future-proof inlineMWE example: Close files at the end Examples: Add BeginStep/EndStep wherever it was missing BP5Serializer: handle local variables that use operators (ornladios#3859) Blosc2 USE ON: Fix Module Fallback (ornladios#3774) SST,MPI,DP: soft handle peer error SST,MPI,DP: improve uniq identifier Fix destdir install test (ornladios#3850) cmake: include ctest before detectoptions ci: enable tau check Add/Improve the ReadMe.md files in examples directory Disable BUILD_TESTING and ADIOS2_BUILD_EXAMPLES by default Remove testing based on ADIOS2-examples Fix formatting issue in DetectOptions.cmake Add examples from ADIOS2-Examples Improve existing examples MPI_DP: do not call MPI_Init (ornladios#3847) cmake: update minimum cmake to 3.12 (ornladios#3849) MPI: add timeout for conf test for MPI_DP (ornladios#3848) Tweak Remote class and test multi-threaded file remote access (ornladios#3834) Add prototype testing of remote functionality (ornladios#3830) Try always using the MPI version Try always using the MPI version Import tests from bp to staging common, implement memory selection in SST ci: fix codeql ignore path (ornladios#3772) install: export adios2 device variables (ornladios#3819) added support to query BP5 files (ornladios#3809) ffs 2023-09-19 (67e411c0) Fix abs/rel step in BP5 DoCount fix dummy Win build Pass Array Order of reader to remote server for proper Get() operation Fix the ADIOS_USE_{_} variable names to use ADIOS2 Remove BP5 BetweenStepPairs variable that hides Engine.h counterpart Delete experimental examples yaml-cpp: support 0.8.0 version Mod not to overload to make some compilers happy Mod to keep MinBlocksInfo private to BP5Writer, not reuse Engine.h API. gha,ci: update checkout to v4 Limit testing changes to BP5 tests: fix windows 64bit type issues Add location of hdf5.dll to PATH for windows tests Fix more compiler warnings from ompi build clang format Fix size_t -> int conversion warnings Allow building with Visual Studio and HDF5 shared libs Fixup local var reading by block with test, master branch use documented initial cache variable to find hdf5 on windows ci: Add HDF5 to a windows build Remove unused SelectionType values Update ADIOS2 HDF5 VOL with basic set of capability flags defines reorder Remove file before writing, catch exceptions format Reader-side Profiling removed a comment ci: disable MGARD static build cmake: fix ffs dependency removed comments and cleaned up more code bp5: remove ADIOS2_USE_BP5 option operators: fix module library ci: Create static minimal build cmake: correct prefer_shared_blosc behavior removing warning from auto clang-format match type of timestep for h5 engine to size_t (same as adios VariableBase class) when storing to hdf5, check the limit of size_t and match to the right type (uint, ulong, default ulonglong) BP5Deserialize: modify changes to keep abi compt Rename test. fix windows compile errors. add codeql workflow Fix ChunkV maintaining CurOffset when downsizing current chunk in Allocate() and creating the next chunk. ci: fix docker images names removed kwsys gitattributes KWSys 2023-05-25 (c9f0da47) fixup! ci: add mgard dependency to spack builds ci: remove unused script part cmake: remove redundant pugixml dependency ci: add mgard dependency to spack builds cmake: Remove enet config warning cmake: resolve cmake python deprecation enet 2023-08-15 (7587eb15) EVPath 2023-08-15 (657c7fa4) dill 2023-08-15 (235dadf0) atl 2023-08-15 (7286dd11) remove data.py since BP5 data file has no format format bp5dbg parse records and check sizes during it for mmd.0 and md.0. No decoding FFS records. cmake: correct info.h installation path ...
…ornladios#3834)" This reverts commit 107164d.
* master: (126 commits) ReadMe.md: Mention 2.9.2 release Cleanup server output a bit (ornladios#3914) ci: set openmpi and openmp params Example using Kokkos buffers with SST Changes to MallocV to take into consideration the memory space of a variable Change install directory of Gray scott files again ci,crusher: increase supported num branches ci: add shellcheck coverage to source and testing Change install directory of Gray scott files Only rank 0 should print the initialization message in perfstub Defining and computing derived variables (ornladios#3816) Add Remote "-status" command to see if a server is running and where (ornladios#3911) examples,hip: use find_package(hip) once in proj Add Steps Tutorial Add Operators Tutorial Add Attributes Tutorial Add Variables Tutorial Add Hello World Tutorial Add Tutorials' Download and Build section Add Tutorials' Overview section Improve bpStepsWriteRead* examples Rename bpSZ to bpOperatorSZWriter Convert bpAttributeWriter to bpAttributeWriteRead Improve bpWriter/bpReader examples Close file after reading for hello-world.py Fix names of functions in engine Fix formatting warnings Add dataspaces.rst in the list of engines Add query.rst cmake: find threads package first docs: update new_release.md Bump version to v2.9.2 ci: update number of task for mpich build clang-format: Correct format to old style Merge pull request ornladios#3878 from anagainaru/test-null-blocks Merge pull request ornladios#3588 from vicentebolea/fix-mpi-dp Adding tests for writing null blocks with and without compression bp5: make RecMap an static anon namespaced var Replace LookupWriterRec's linear search on RecList with an unordered_map. For 250k variables, time goes from 21sec to ~1sec in WSL. The order of entries in RecList was not necessary for the serializer to work correctly. Replace LookupWriterRec's linear search on RecList with an unordered_map. For 250k variables, time goes from 21sec to ~1sec in WSL. The order of entries in RecList was not necessary for the serializer to work correctly. (ornladios#3877) Fix data length calculation for hash (ornladios#3875) Merge pull request ornladios#3823 from eisenhauer/SstMemSel Merge pull request ornladios#3805 from pnorbert/fix-bpls-string-scalar Merge pull request ornladios#3804 from pnorbert/fix-aws-version Merge pull request ornladios#3759 from pnorbert/bp5dbg-metadata new attempt to commit query support of local array. (ornladios#3868) MPI::MPI_Fortran should be INTERFACE not PUBLIC Fix hip example compilation error (ornladios#3865) Server Improvements (ornladios#3862) ascent,ci: remove unshallow flag Remove Slack as a contact mechanism (ornladios#3866) bug fix: syntax error in json output (ornladios#3857) Update the bpWriterReadHip example's cmake to run on crusher Examples: Use BPFile instead of BP3/4/5 for future-proof inlineMWE example: Close files at the end Examples: Add BeginStep/EndStep wherever it was missing BP5Serializer: handle local variables that use operators (ornladios#3859) gha,ci: update checkout to v4 Blosc2 USE ON: Fix Module Fallback cmake: correct prefer_shared_blosc behavior cmake: correct info.h installation path ci: disable MGARD static build operators: fix module library ci: add downloads readthedocs cmake: Add Blosc2 2.10.1 compatibility. Blosc2 USE ON: Fix Module Fallback (ornladios#3774) Fix destdir install test (ornladios#3850) cmake: update minimum cmake to 3.12 (ornladios#3849) MPI: add timeout for conf test for MPI_DP (ornladios#3848) MPI_DP: do not call MPI_Init (ornladios#3847) install: export adios2 device variables (ornladios#3819) Merge pull request ornladios#3799 from vicentebolea/support-new-yaml-cpp Merge pull request ornladios#3737 from vicentebolea/fix-evpath-plugins-path SST,MPI,DP: soft handle peer error SST,MPI,DP: improve uniq identifier Fix destdir install test (ornladios#3850) cmake: include ctest before detectoptions ci: enable tau check Add/Improve the ReadMe.md files in examples directory Disable BUILD_TESTING and ADIOS2_BUILD_EXAMPLES by default Remove testing based on ADIOS2-examples Fix formatting issue in DetectOptions.cmake Add examples from ADIOS2-Examples Improve existing examples MPI_DP: do not call MPI_Init (ornladios#3847) cmake: update minimum cmake to 3.12 (ornladios#3849) MPI: add timeout for conf test for MPI_DP (ornladios#3848) Tweak Remote class and test multi-threaded file remote access (ornladios#3834) Add prototype testing of remote functionality (ornladios#3830) Try always using the MPI version Try always using the MPI version Import tests from bp to staging common, implement memory selection in SST ci: fix codeql ignore path (ornladios#3772) install: export adios2 device variables (ornladios#3819) added support to query BP5 files (ornladios#3809) Partial FFS Upstream, only changes to type_id ffs 2023-09-19 (67e411c0) Fix abs/rel step in BP5 DoCount fix dummy Win build Pass Array Order of reader to remote server for proper Get() operation ...
No description provided.