Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Macbuild #49

Merged
merged 3 commits into from
Jun 4, 2022
Merged

Macbuild #49

merged 3 commits into from
Jun 4, 2022

Conversation

objarni
Copy link
Collaborator

@objarni objarni commented Jun 3, 2022

This is an experimental branch containing several changes to make building on my MacBook Air M1 possible.

@electronicsbyjulie
Copy link
Contributor

It's an impressive amount of changes. I'll check it out in a few hours and try it out. Thank you for doing this!

@objarni
Copy link
Collaborator Author

objarni commented Jun 4, 2022

It's an impressive amount of changes. I'll check it out in a few hours and try it out. Thank you for doing this!

The most 'scary' thing is the removal of all array initializations on the form ]={} or ]{}, which could change behaviour depending on the interpreration of those expressions by the compiler. I'm still not 100% sure what it means, and the C23 I found earlier this week isn't convincing me anymore, since it's a C programming language standard - C++ standards are abbreviated C++NN not CNN!

It is especially troublesome/undefined when the expression in [] is dynamic (e.g. n+2).

This expression:

int a[5] = {0};

.. is specified to set all elemnts of the statically sized array to 0.

This expression:

int a[5] = {};

.. I'm not sure what it means really. What is your intention of the meaning?

This expression:

int a[n+2] = {};

.. I feel is even more unclear to me.

This confusion would just disappate if we switch to std::vector, as pre-deciding the size of the array (which it internally is!) isn't something you would generally do (it grows in runtime, on program execution, as needed) even if it is possible (you know you will need at least 1000 elements, so tell vector constructor to allocate that directly instead of dynamically expanding).

You may have taught me some C++ I've been unaware of:

https://www.cplusplus.com/doc/tutorial/arrays/

However, clang on MacBook complains a lot on the [n+2] and similar size expression calculated in runtime, so at least those need to go for mac build to work. One could see this as part of the goal of using std::vector more (not everywhere).

@electronicsbyjulie
Copy link
Contributor

int a[5] = {0}; and int a[5] = {}; should do the same thing. Does the former compile okay on your Mac? I tested the docker and the protein assembly and didn't see anything amiss; if any new segfaults occur then they could be from uninitialized values and it's just a matter of initializing them. But in my Object** arrays I've done my best to add nulls at the end of everything.

int a[n+2] = {}; just means allocate an array of n+2 ints. One place these occur is when copying an old array to a new array, for example atoms of a molecule where n can be any positive integer. The +2 is just a little bit of buffer space to allow adding a null and to minimize the chance of segfaults. std::vector of shared_ptr would be so much better I agree.

Why not use std::vector everywhere?

@objarni
Copy link
Collaborator Author

objarni commented Jun 4, 2022

Good question - why not use std::vector everywhere?

Reason is potential runtime performance hit. This is since there are more to std::vector than dynamic growth; it also does bounds-checking - which is great for safety/correctness - but it does come with a runtime performance hit.

However, I like the principle of Donald Knuth: "premature optimization is the root of all evil" and the general idea when doing optimization - don't make assumption, measure, measure, measure! Find the bottleneck, optimize there (not everywhere).

So I'm curious to see the difference, and will hack together a little benchmark program!

Therefore I added a folder src/spikes/vector_vs_array (which uses CMake so I can use the full strength of CLion...) for these kinds of investigations.

@objarni
Copy link
Collaborator Author

objarni commented Jun 4, 2022

There is now a spike performance test here: https://github.com/primaryodors/podock/tree/main/src/spikes/vector_vs_array

Conclusion: there is a difference, however I think the test proves a valid point, as the array test actually segfaults (!) for NUMBER_OF_LOOPS greater than 20.

@electronicsbyjulie
Copy link
Contributor

Any performance hit is potentially bad since docking is a CPU intensive calculation. If it's a small hit, then the benefits can outweigh the lost performance. Is the performance only impacted for adding/removing elements or are array access and iteration functions also affected? Because if access/iterations are not slowed then it'll be perfectly fine for docking.

Checking out the spike test now, very excited to see what it does...

@electronicsbyjulie
Copy link
Contributor

The array summer was using one-based indices (i.e. i=1; i<=n) which can cause segfaults because on an int[n] the last array element's index is actually n-1, so array[n] is technically an index out of bounds.

The other thing that was happening was some kind of weird stack overflow on trying to declare a 2 million member int array. It resolved when I changed it from int[n] to new int[n].

Unfortunately the performance hit is pretty severe, 211% greater on my machine. :( For the volume of processing my purposes require, a threefold difference is just not usable.

@electronicsbyjulie
Copy link
Contributor

@objarni Cool if I merge this PR?

@objarni
Copy link
Collaborator Author

objarni commented Jun 4, 2022

Sure!

@electronicsbyjulie electronicsbyjulie merged commit a6d712e into main Jun 4, 2022
electronicsbyjulie pushed a commit that referenced this pull request Jun 5, 2022
electronicsbyjulie pushed a commit that referenced this pull request Jun 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants