Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Medium]Convert everything on the heap to smart pointers #193

Closed
Feynstein opened this issue May 8, 2020 · 23 comments
Closed

[Medium]Convert everything on the heap to smart pointers #193

Feynstein opened this issue May 8, 2020 · 23 comments

Comments

@Feynstein
Copy link

Feynstein commented May 8, 2020

This is a very good practice in C++ as it makes memory leaks virtually impossible. Might need to include boost in the CMakeLists.txt file, it's pretty easy to do though. There also might have ways to couple this use with std::any in C++17 ... it's also a very powerful tool. Like by making a std::map or hashmap bucket of string,std::any that can carry any kind of data from one module to the other.

@Feynstein Feynstein changed the title Convert everything on the heap in boost smart pointers (or stl) Convert everything on the heap to boost smart pointers (or stl) May 8, 2020
@Feynstein Feynstein changed the title Convert everything on the heap to boost smart pointers (or stl) [CS-7][Medium]Convert everything on the heap to boost smart pointers (or stl) May 8, 2020
@Feynstein Feynstein changed the title [CS-7][Medium]Convert everything on the heap to boost smart pointers (or stl) [Medium]Convert everything on the heap to boost smart pointers (or stl) May 8, 2020
@YsHaNgM
Copy link
Contributor

YsHaNgM commented May 8, 2020

It is good practice to use unique_ptr for any heap allocated array, or even shared_ptr for use of copy/assignment if using C++17 with gcc-7/msvc-19. There’s no need to use boost, the compilers mentioned before has perfect implements. I have used them with either OpenMP and MPI, there’s only some need to invoke ptr.get() if you really have to expose the raw pointer to certain algorithms.

@Feynstein
Copy link
Author

Feynstein commented May 8, 2020

@YsHaNgM Well, I'm still going to use boost anyway for program_options so might as well do it the way I prefer with boost::shared_ptr and dynamic_pointer_casts and stuff if need.

@Feynstein
Copy link
Author

I dont't think boost is that big of a deal to add though... And I'm used to work with it.

@YsHaNgM
Copy link
Contributor

YsHaNgM commented May 8, 2020

Don’t get me wrong. I am not resisting boost, and program_options is a good idea. I am just discussing the title mentioned smart pointer which largely overlaid in both boost and standard library. I am happy to use boost version if the project have already included it, and it indeed provides better portability (index operand supported from gcc-7).

@Feynstein
Copy link
Author

Ah ok, what do you suggest as a title then?

@Feynstein Feynstein changed the title [Medium]Convert everything on the heap to boost smart pointers (or stl) [Medium]Convert everything on the heap to smart pointers (or stl) May 9, 2020
@Feynstein Feynstein changed the title [Medium]Convert everything on the heap to smart pointers (or stl) [Medium]Convert everything on the heap to smart pointers May 9, 2020
@chrisatang
Copy link

Hi Feynstein,

I was actually right in the middle of converting everything on the heap to smart pointers, but realized that due to the OpenMP library they're using, that adding C++1z into the mix isn't the best idea (another thread here explaining why).

Thus, as the # of elements for all their arrays are essentially static, I've opted to start switching their variables over to std::vector, as there is quite a bit of code to go through.

Please let me know if switching to smart pointers actually is a good idea, given this.

Thanks for starting the discussion :-)

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

Hi! I was also thinking of doing that but I'd like you to check what I did in my fork... Since there are very too many parameters I was thinking of an unordered_map of string and std::any

@Feynstein
Copy link
Author

Meaning that you create a txt file (like documentation stuff) with all the keys and their descriptions and insert them on the fly in the code without the need to have a thousand definition in the code itself. Maybe you can merge your branch in mine or the reverse so that we can work together on that?

@Feynstein
Copy link
Author

@chrisatang That's where I think mutexes would be useful? Accessing those pointers one by one might be the way to create a something that's thread safe? Also I know that there exists a way to catch an exception pointer from inside an openmp for loop, maybe that could help us in managing this stuff.... let me check for that...

@YsHaNgM
Copy link
Contributor

YsHaNgM commented May 9, 2020

Hi Feynstein,

I was actually right in the middle of converting everything on the heap to smart pointers, but realized that due to the OpenMP library they're using, that adding C++1z into the mix isn't the best idea (another thread here explaining why).

Thus, as the # of elements for all their arrays are essentially static, I've opted to start switching their variables over to std::vector, as there is quite a bit of code to go through.

Please let me know if switching to smart pointers actually is a good idea, given this.

Thanks for starting the discussion :-)

Each thread need to hold their own copy of data as private type for omp, that’s why you can not use unique_ptr, but you can use shared_ptr instead. In such case, it is safer to let smart pointer manage your dynamic array, and make it private in omp. Plus, vector may have more overhead compared to a plain array.

@Feynstein
Copy link
Author

Hi Feynstein,
I was actually right in the middle of converting everything on the heap to smart pointers, but realized that due to the OpenMP library they're using, that adding C++1z into the mix isn't the best idea (another thread here explaining why).
Thus, as the # of elements for all their arrays are essentially static, I've opted to start switching their variables over to std::vector, as there is quite a bit of code to go through.
Please let me know if switching to smart pointers actually is a good idea, given this.
Thanks for starting the discussion :-)

Each thread need to hold their own copy of data as private type for omp, that’s why you can not use unique_ptr, but you can use shared_ptr instead. In such case, it is safer to let smart pointer manage your dynamic array, and make it private in omp. Plus, vector may have more overhead compared to a plain array.

I agree that vectors might have more overhead. But if we include them into a more elegant data structure like a map and then before entering any processing loop we transfer them to shared arrays? Something of the like: https://stackoverflow.com/questions/13061979/shared-ptr-to-an-array-should-it-be-used

@YsHaNgM
Copy link
Contributor

YsHaNgM commented May 9, 2020

@chrisatang That's where I think mutexes would be useful? Accessing those pointers one by one might be the way to create a something that's thread safe? Also I know that there exists a way to catch an exception pointer from inside an openmp for loop, maybe that could help us in managing this stuff.... let me check for that...

I encountered the same situation before. There’s heavy data race if different threads are trying to access/modify the same array within omp region. I don’t think it is a good way to mix up C++ standard mutex with omp, we can probably use omp atomic or just put that statement in omp critical/single region. Another approach I could have in mind is quite similar to the way in MPI. we manually spilt an array in separate parts then give in omp as firstprivate, then use a user defined reduction (lambda) to collect data from each thread.

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

@YsHaNgM yes like some-kind of multi-thread for_each? And I know about atomics, but I think this is all static(constant? sry french lol) data anyway...

@YsHaNgM
Copy link
Contributor

YsHaNgM commented May 9, 2020

Hi Feynstein,
I was actually right in the middle of converting everything on the heap to smart pointers, but realized that due to the OpenMP library they're using, that adding C++1z into the mix isn't the best idea (another thread here explaining why).
Thus, as the # of elements for all their arrays are essentially static, I've opted to start switching their variables over to std::vector, as there is quite a bit of code to go through.
Please let me know if switching to smart pointers actually is a good idea, given this.
Thanks for starting the discussion :-)

Each thread need to hold their own copy of data as private type for omp, that’s why you can not use unique_ptr, but you can use shared_ptr instead. In such case, it is safer to let smart pointer manage your dynamic array, and make it private in omp. Plus, vector may have more overhead compared to a plain array.

I agree that vectors might have more overhead. But if we include them into a more elegant data structure like a map and then before entering any processing loop we transfer them to shared arrays? Something of the like: https://stackoverflow.com/questions/13061979/shared-ptr-to-an-array-should-it-be-used

Yeah, unordered map is a good idea. But bear in mind unordered map uses hash key to make sure they are unique, so you can only use plain data or pointer as key. In other words, it is pointless to wrap and pass a pointer value (address) into omp. I don’t know what data you want to include in the map, so I might be wrong.

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

Basically everything thats included in this structure:

typedef struct PARAM {

Officially, I was not supposed to talk about a solution in here because of etiquette but I'll explain.

I want to create a hash map<std::string, std::any> of all the parameters in the PARAM structure. And then clear them all from the header... and Instead use documentation to explain everything that is contained in this map. Including all the default stuff that's initialized. I plan to put this map in a singleton class with getter and setters and whatever protection needed. Then whenever you need to add or get something from the parameters you simply look at what is possible to include inside it and can either check if it exist already or you can set its value. I'll probably use some kind of xml or txt file to place all the possible strings, load them in the constructor and verify them when a user tries to set a value. This is only in order to move all the params in one place all over the code, when you need something you take it or set it. Basically like a supermarket trolley?

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

That way before doing your any_cast you can do an if condition to see if it exists or not, or maybe just use throws in it and use this thing in a try-catch in order not to break all the multithreading thats happening

@Feynstein
Copy link
Author

And it will remove the thousands upon thousands of definitions because in the back of it it will only be strings carrying all the information about what's in the map...

@Feynstein
Copy link
Author

And if some data doesn't need to be in it until some point, you could delay the load of the data to try and save RAM... you could also use it as an output container... so yeah like a virtual trolley with good documentation surrounding it lol

@Feynstein
Copy link
Author

And then just before the processing of the data you use, you take it out inside a shared array

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

At first I plan to use the name of the variables as the keys so that I can trace the code properly with it and use it at every occurences of the use of the original parameters. But once this is done, adding parameters is only adding a line in a text file basically... Need to add the type in the documentation so that it's not lost when put in an any

@Feynstein
Copy link
Author

Oh yeah and then this class is put in a smart pointer so that if someone drops it, it dies... Like if you're dealing with output...

@Feynstein
Copy link
Author

Feynstein commented May 9, 2020

Hum maybe not a singleton but you get the idea.... I'm not done lol... Because of the Any you can add scalable complexity at the shake of a wand basically.. like different levels of complexity that load and unload at different times... I already saw something like that somewhere but never let my thoughts flow on it like I just did...

@Feynstein
Copy link
Author

Lead to another issue... Closed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants