Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] make a bgzf_thread_count a single variable #2752

Merged
merged 1 commit into from
Jan 18, 2022

Conversation

SGSSGene
Copy link
Contributor

@SGSSGene SGSSGene commented Jun 14, 2021

bgzf_thread_count was declared inline static outside of the scope of a
class.
In this case the inline allows the variable to break the one
definition rule. Meaning there can be multiple symbols across different
translation that all declare bgzf_thread_count. In this case the
linker will choose one of the defined symbols. (Without inline it would
throw a multiple definition error).

The static keyword (in this context) means that the created symbol is
only visible inside a current translation unit. Meaning, there are no symbols
for the linker to work with. This will cause every translation unit to have there
own bgzf_thread_count variable.

This combination leads to every translation unit having there own
bgzf_thread_count variable. The fix is simple, we remove the static
keyword. This should have the intended behavior.

  • Changelog entry (API change)
  • Adapt cookbook entry

@vercel
Copy link

vercel bot commented Jun 14, 2021

This pull request is being automatically deployed with Vercel (learn more).
To see the status of your deployment, click below or on the icon next to each commit.

🔍 Inspect: https://vercel.com/seqan/seqan3/9deSNYGGQRf4KqYobzvAM3US4DQM
✅ Preview: https://seqan3-git-fork-sgssgene-fix-bgzfthreadcountvariable-seqan.vercel.app

@codecov
Copy link

codecov bot commented Jun 14, 2021

Codecov Report

Merging #2752 (b778975) into master (0f5fb9d) will decrease coverage by 0.00%.
The diff coverage is n/a.

❗ Current head b778975 differs from pull request most recent head 1ec1bba. Consider uploading reports for the commit 1ec1bba to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2752      +/-   ##
==========================================
- Coverage   98.28%   98.28%   -0.01%     
==========================================
  Files         267      266       -1     
  Lines       11462    11455       -7     
==========================================
- Hits        11265    11258       -7     
  Misses        197      197              
Impacted Files Coverage Δ
include/seqan3/utility/simd/algorithm.hpp 100.00% <0.00%> (ø)
include/seqan3/utility/tuple/pod_tuple.hpp 100.00% <0.00%> (ø)
test/include/seqan3/test/tmp_directory.hpp 100.00% <0.00%> (ø)
...nclude/seqan3/io/sam_file/input_format_concept.hpp 100.00% <0.00%> (ø)
...clude/seqan3/io/sam_file/output_format_concept.hpp 100.00% <0.00%> (ø)
...nclude/seqan3/alphabet/views/validate_char_for.hpp

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0f5fb9d...1ec1bba. Read the comment docs.

Copy link
Member

@simonsasse simonsasse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@SGSSGene
Copy link
Contributor Author

SGSSGene commented Jul 12, 2021

Seqan2 "solves" this by using a macro/define instead of a global variable.

iostream_bgzf.h defines

#ifndef SEQAN_BGZF_NUM_THREADS
#define SEQAN_BGZF_NUM_THREADS 16
#endif

And then it is similar used similar to seqan3:


    basic_bgzf_streambuf(ostream_reference ostream_,
                         size_t numThreads = SEQAN_BGZF_NUM_THREADS,
                         size_t jobsPerThread = 8) :

taken from:
https://github.com/seqan/seqan/blob/f5f658343c366c9c3d44ba358ffc9317e78a09ed/include/seqan/stream/iostream_bgzf.h

@marehr marehr requested review from a team and marehr and removed request for a team August 9, 2021 08:50
@SGSSGene SGSSGene marked this pull request as draft October 25, 2021 10:24
@smehringer
Copy link
Member

Core Meeting 25.10.2021 - This is not release critical.

@@ -38,7 +38,7 @@ namespace seqan3::contrib
/*!\brief A static variable indicating the number of threads to use for the bgzf-streams.
* Defaults to std::thread::hardware_concurrency.
*/
inline static uint64_t bgzf_thread_count = std::thread::hardware_concurrency();
inline uint64_t bgzf_thread_count = std::thread::hardware_concurrency();
Copy link
Member

@marehr marehr Oct 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a hybrid idea: Use SeqAn 2 solution with macro:

#ifndef SEQAN3_BGZF_THREAD_COUNT
#define SEQAN3_BGZF_THREAD_COUNT (std::thread::hardware_concurrency())
#define SEQAN3_BGZF_THREAD_COUNT (bgzf_thread_count)
#endif // SEQAN3_BGZF_THREAD_COUNT

basic_bgzf_streambuf(ostream_reference ostream_,
                         size_t numThreads = SEQAN3_BGZF_THREAD_COUNT,
                         size_t jobsPerThread = 8) :

This allows:

  • having consistent definition over all translation unit
  • making it configurable to a fixed number, to a function, to anything that reduces to a number
  • AND: It is a hack and will always be a hack.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave it some thought, but this doesn't really work.
The issue is if I declare -DSEQAN3_BGZF_THREAD_COUNT=(std::thread::hardware_concurrency()) It is required that the header includes #include <thread>. Same goes if I want to use a user defined variable.
It would require that other headers are being included before. I don't see the advantages.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with static: each translation unit gets it's own variable.
- you have to set it in each unit, cannot alter it globally
+ you can alter it for each unit

without static:
+ you set it globally
- different compilation unit cannot use different values
- if you set the variable in multiple places, the compiler chooses the variable at random

@Irallia
Copy link
Contributor

Irallia commented Nov 24, 2021

@marehr friendly reminder.

Would you like to clarify this please? If I see correctly, this change means that I no longer unintentionally occupy all threads on the server with iGenVar, although I have specified much less.

@smehringer
Copy link
Member

Core meeting 06.12.2021 - The common use case is that you want the same variable across your translation units. So removing static enables the user to set the variable globally for his/her application at runtime (e.g. via command line input).

bgzf_thread_count was declared `inline static` outside of the scope of a
class.
In this case the `inline` allows the variable to break the one
definition rule. Meaning there can be multiple symbols across different
translation that all declare `bgzf_thread_count`. In this case the
linker will choose one of the defined symbols. (Without inline it would
throw a multiple definition error).

The `static` keyword (in this context) means that the created symbol is
only visible inside a current translation unit. Meaning, there are no symbols
for the linker to work with. This will cause every translation unit to have there
own `bgzf_thread_count` variable.

This combination leads to every translation unit having there own
`bgzf_thread_count` variable, which is not what a user of seqan3 expects.
The fix is simple, we remove the `static` keyword. This should have the
intended behavior.

Thought: maybe it should be declared `thread_local`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants