Track parameter estimation for cpu and cuda #85
Conversation
I already made a lot of comments; I would stop here for now. I think we'd need to sort out what we want to do with the container classes first, since right now I just don't think that they add any value over using the underlying vecmem types directly.
@@ -55,7 +55,7 @@ struct seed_finding {
/// Callable operator for the cuda seed finding
///
/// @return seed_container is the vecmem seed container
host_seed_container operator()(
host_seed_container& operator()(
Wait... what? Are you returning a permanent object all of a sudden? This does not seem thread-safe at all... 🤔
The reason for this change was to use the return value of one algorithm directly as the input of the next. With the copy method, the memory resource is missing, so the output cannot be used directly for the next algorithm:
auto output = cuda_algorithm(input); // output cannot be used as an input for the next algorithm
But as you mentioned it is not thread-safe, so I reverted it to the copy method with a small modification:
algorithm::output_type output(mr);
output = cuda_algorithm(input);
Is that okay with you?
Sure. (Sorry for the late answer... 😦)
track_params_estimation(vecmem::memory_resource* mr)
    : m_mr(mr), params(mr) {}

output_type& operator()(input_type& i) {
Again, returning a non-const reference to a member object seems terribly thread-unsafe. I would've just opted for copy elision on return in this function...
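A minimal sketch of the contrast (hypothetical simplified types, not the actual traccc containers): returning a non-const reference to a member shares mutable state between callers, while returning a local by value lets the compiler elide the copy (NRVO) and keeps each call independent.

```cpp
#include <cassert>
#include <vector>

// Stand-in for the algorithm's output container (assumption for illustration).
using output_type = std::vector<int>;

// Returning a non-const reference to a member: every caller receives the
// same object, which is overwritten on each call - not thread-safe.
struct stateful_algorithm {
    output_type m_params;
    output_type& operator()(int n) {
        m_params.assign(n, 1);  // mutates shared state
        return m_params;
    }
};

// Returning a local by value: the copy is elided (NRVO) or moved, and each
// call yields an independent object, so concurrent calls cannot interfere.
struct stateless_algorithm {
    output_type operator()(int n) const {
        output_type params(n, 1);  // local object
        return params;             // copy elided or moved
    }
};
```

The stateful version makes the hazard visible: two successive calls hand back references to the very same object, so the first result is silently clobbered by the second call.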
answered above
@@ -22,7 +22,9 @@ class seeding_algorithm {
using output_type =
    std::pair<host_internal_spacepoint_container, host_seed_container>;

seeding_algorithm(vecmem::memory_resource* mr = nullptr) : m_mr(mr) {
seeding_algorithm(vecmem::memory_resource* mr)
I guess this is not the only place where this pattern shows up, but this is where I'll comment about it...
We should only pass the memory resources around by pointer if it is allowed to pass a null pointer. In this case it's not really allowed anymore. So it would make for a better interface (in my opinion) to receive the memory resource as a reference (vecmem::memory_resource&).
It is indeed annoying that the std classes receive memory resources through pointers, since they do allow the pointers to be invalid (in which case they fall back on a default resource). But with vecmem, in most cases we just need to pass explicitly created memory resources around. (Otherwise compilation on macOS can fail, as the Apple Clang compiler doesn't handle "default memory resources" quite correctly. Which provides a pretty good debugging tool for vecmem in the end. 😛)
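A sketch of the suggested interface, using std::pmr::memory_resource as a stand-in for vecmem::memory_resource (an assumption, purely for illustration): the reference parameter makes "must not be null" part of the signature, while storing a pointer internally keeps the class copyable.

```cpp
#include <cassert>
#include <memory_resource>

// Hypothetical simplified algorithm class; the point is the constructor
// signature, not the algorithm itself.
struct seeding_algorithm {
    // Taking the resource by reference: no nullptr check, no default
    // fallback - the caller must hand over an existing resource.
    explicit seeding_algorithm(std::pmr::memory_resource& mr) : m_mr(&mr) {}

    std::pmr::memory_resource& resource() const { return *m_mr; }

private:
    // Stored as a pointer so the class stays copy- and move-assignable.
    std::pmr::memory_resource* m_mr;
};
```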
Okay!
Hey wait. If the input is a reference, it's not polymorphic anymore for different memory_resource types.
Do you just want it to be (vecmem::cuda::memory_resource& mr) for the cuda algorithm and (vecmem::sycl::memory_resource& mr) for the sycl algorithm?
References in C++ behave the same as pointers in terms of polymorphism, so this should not be a problem. 🙂
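A tiny illustration of that point (hypothetical stand-in types, not vecmem's actual hierarchy): virtual dispatch works identically whether the base class is passed by reference or by pointer.

```cpp
#include <cassert>

// Minimal base/derived pair standing in for a polymorphic resource hierarchy.
struct memory_resource_base {
    virtual int kind() const { return 0; }
    virtual ~memory_resource_base() = default;
};

struct device_resource : memory_resource_base {
    int kind() const override { return 1; }
};

// Both calls dispatch dynamically to the derived override.
int kind_via_reference(const memory_resource_base& mr) { return mr.kind(); }
int kind_via_pointer(const memory_resource_base* mr) { return mr->kind(); }
```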
OK thanks. The changes are applied
private:
output_type params;
std::shared_ptr<traccc::cuda::track_params_estimation> tp;
Is it really necessary to use a shared pointer here? Remember, these things are not cheap.
I don't think this is necessarily a problem here; this is a pointer to a function object, so it is unlikely to be replaced very often. Since std::shared_ptr has zero de-referencing overhead, I don't foresee any performance degradation here.
Whether it is really necessary is debatable, but I can see some benefit to this approach given that function objects in C++ are often assumed to be copyable. Using shared pointers here can feasibly reduce the overhead of that copying. But perhaps Beomki had different reasons in mind?
@stephenswat is right - these algorithm objects are generated only once, before the actual algorithm runs. They could also be normal objects, but I was too lazy to write a copy operator for the algorithm classes. If it doesn't introduce any overhead, we can just keep it this way.
Just as for traccc::track_params_estimation_algorithm, I don't see the purpose of this class. 😕 What interface does it add on top of traccc::cuda::track_params_estimation?
Also, why does it not inherit from the traccc::algorithm base class? 😕
I updated algebra_plugins to the latest commit: the rotate function of the eigen algebra is missing in the current version.
The current PR compiles, but there is a runtime error in the cuda track parameter estimation.
Naive estimation shows that the averaged track parameter estimation time is: … So cuda is slightly slower than cpu, unfortunately. It's not a serious problem, since the cuda seed finding time is usually 0.05 sec.
If we're talking about 1.2 milliseconds here, we're basically teetering on the border of what is achievable given memory transfer overhead and launch delays. Just following Amdahl's law, I wouldn't necessarily recommend spending too much time optimizing this part of the code. Great job! 😄
@stephenswat OK. I will check whether the results are the same as the Acts track parameter estimation before this PR goes in. Will let you know once it is done.
I added a test file. @asalzburger or anyone who has write access to traccc-data, can you review the MR?
Other than traccc-data_MR1, I also need algebra-plugin_PR17 to be merged to make this PR work (@niermann999 or anybody could have a look?).
Oh, I made the MR in the wrong repository. I moved it to the official one: traccc_data_mr_1
Are there more comments on this?
I'm still very much in favour of this development, but I think the code could be simplified a bit in a number of places...
const seedfinder_config m_seedfinder_config;
const seedfilter_config m_seedfilter_config;
multiplet_estimator m_estimator;
seed_filtering m_seed_filtering;

vecmem::memory_resource& m_mr;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know, I know... I keep harping on about this one issue... 😦
In my mind a class that uses vecmem::memory_resource
should look like:
class my_class {
public:
    my_class( ..., vecmem::memory_resource& mr, ... ) : ..., m_mr( &mr ), ... {}
    my_class( const my_class& ) = default;
    my_class( my_class&& ) = default;
    my_class& operator=( const my_class& ) = default;
    my_class& operator=( my_class&& ) = default;
    vecmem::memory_resource& memory_resource() { return *m_mr; }
    const vecmem::memory_resource& memory_resource() const { return *m_mr; }
private:
    vecmem::memory_resource* m_mr;
};
So while its public interface should receive and hand out memory resources by reference (showing that the memory resource is definitely an existing one), internally it should use a pointer. This is because a reference member makes objects of the class non-copyable, since references cannot be reseated after creation. But if the member is a simple pointer, the default copy/move constructors/operators work just fine for the class.
As I had to realise through acts-project/vecmem#114, it's probably an even better idea to use std::reference_wrapper<vecmem::memory_resource> as the variable that "points at" the memory resource. That both ensures that we initialise it correctly under all circumstances with a reference, and also allows us to copy/assign our own objects with such variables.
https://en.cppreference.com/w/cpp/utility/functional/reference_wrapper
So in all the places where I was complaining about vecmem::memory_resource member variables, the most ideal option would seem to be switching to std::reference_wrapper<vecmem::memory_resource>.
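A sketch of that pattern, again with std::pmr::memory_resource standing in for vecmem::memory_resource (an assumption for illustration): a std::reference_wrapper member must be initialised with a valid reference (it has no null state), yet the class stays copy- and move-assignable, unlike with a plain reference member.

```cpp
#include <cassert>
#include <functional>
#include <memory_resource>

// Hypothetical simplified algorithm class holding its resource via
// std::reference_wrapper, as recommended above.
class track_params_estimation {
public:
    explicit track_params_estimation(std::pmr::memory_resource& mr)
        : m_mr(mr) {}

    std::pmr::memory_resource& resource() const { return m_mr.get(); }

private:
    std::reference_wrapper<std::pmr::memory_resource> m_mr;
};
```

Assignment rebinds the wrapper to the other object's resource, which is exactly what a plain reference member would forbid.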
Okay, definitely reference_wrapper would be better. Fixed now.
@@ -0,0 +1,26 @@
/** TRACCC library, part of the ACTS project (R&D line)
This is absolutely a question of convention... But I would rather call this file track_params_estimating.hpp. I think we should only use the .cuh extension on files that have actual CUDA code in them. Like a template class that launches kernels. (Which is not a good design idea, I just brought it up as an example.)
I myself took to organising my own code like the following recently:
- helper_function.hpp: declaration of the helper function, using regular C++;
- helper_function.cu: implementation of the helper function, using the CUDA runtime API, and implementing device code.
In such a setup one can adopt the relatively simple convention that .cuh header files must only be included by .cu source files, as .cpp source files would not be able to interpret them, while .hpp header files can be included by both .cpp and .cu files.
I see, if that's the usual convention. I modified all .cuh files into .hpp.
namespace cuda {

/// track parameter estimation for cuda
class track_params_estimation {
🤔 I'm not sure what the best names would be, but traccc::cuda::track_params_estimating and traccc::cuda::track_params_estimation are way too similar.
But I'm also not sure what role this class is playing in general. 😕 Was this necessary for the previous implementation of how algorithms worked? In the current setup I think this is an unnecessary middle layer between traccc::track_params_estimation_algorithm and traccc::cuda::track_params_estimating.
Right. I removed track_params_estimating since it is confusing, as you mentioned. Instead, I directly implemented operator() in track_params_estimation.cu. Now it looks much better.
For the algorithm part, I will comment below as well; I removed track_params_estimation_algorithm as it does not seem needed for the moment. The initial motivation for the algorithm version was to set the configuration parameters without touching the original track_params_estimation in the core directory. But it is apparent that we don't need the algorithm version, because there are no configuration parameters for track parameter estimation at the moment.
/// @param resource vecmem memory resource
void track_params_estimating(host_seed_container& seeds,
                             host_bound_track_parameters_collection& params,
                             vecmem::memory_resource* resource);
This should rather be a vecmem::memory_resource& reference.
Right. Anyway, this file is removed.
namespace traccc {

class track_params_estimation_algorithm
I'm continuing to be confused...
Is this class supposed to (eventually) decide between using the host or the CUDA implementation? If not, why does it exist? What additional value does it give on top of using traccc::track_params_estimation directly?
I'm probably missing something, but the naive UI in my head would be to have:
- traccc::track_params_estimation_algorithm: an "algorithm" for running the track parameter estimation on the host;
- traccc::cuda::track_params_estimation_algorithm: an "algorithm" for running the track parameter estimation using CUDA.
I think some of the layers introduced in this PR are not actually necessary. 🤔
As I explained above, these are removed now.
examples/run/cpu/seq_example.cpp
traccc::clusterization_algorithm::input_type ca_input(cells_per_event);
traccc::clusterization_algorithm::output_type ca_output;

ca_output = ca(ca_input);

auto& measurements_per_event = ca_output.first;
auto& spacepoints_per_event = ca_output.second;

/*-----------------------
  Seeding algorithm
  -----------------------*/

traccc::seeding_algorithm::input_type sa_input(spacepoints_per_event);
traccc::seeding_algorithm::output_type sa_output;

sa_output = sa(sa_input);

auto& internal_sp_per_event = sa_output.first;
auto& seeds = sa_output.second;

/*----------------------------
  Track params estimation
  ----------------------------*/

traccc::track_params_estimation_algorithm::input_type tp_input(seeds);
traccc::track_params_estimation_algorithm::output_type tp_output;

tp_output = tp(tp_input);

auto& params = tp_output;
With the updated algorithms semantics this could surely be simplified quite a bit. Most of these lines should not be needed anymore.
Done
// Google Test can be run manually from the main() function
// or, it can be linked to the gtest_main library for an already
// set-up main() function primed to accept Google Test test cases.
int main(int argc, char** argv) {
    ::testing::InitGoogleTest(&argc, argv);

    return RUN_ALL_TESTS();
}
The test should rather be linked against GTest::gtest_main than implement the main() function ourselves.
Okay
I simply removed int main(..) since it was already linked to gtest. Not sure if that is what you meant though.
I applied the updated algorithm semantics to the seed finding as well as the track parameter estimation, even though that was not the initial purpose of this PR. I used rvalue references for the cuda algorithms, since const lvalue references don't work for vecmem containers. Maybe @stephenswat can look at the code to check whether I followed the intention of the semantics. I didn't edit
Squashed into one commit
Rebased with the main branch. Even though it is not very desirable, let's make this happen in a future PR... :) Pinging @krasznaa @stephenswat. P.S. the overall run time does not change much between the array and eigen plugins.
Lots of good work here overall! It feels like there is a lot of unrelated work in this PR though - is that here for a reason or did it get included by accident? I also have some other comments, but otherwise I think this should be close to ready. 😄
m.variance = var / totalWeight;
m.variance = 1 / totalWeight * var;
Is there a reason to make this change? It seems to me like it would be slower and less numerically accurate. 😦
It is just because operator/ is not defined for array in algebra-plugins.
I will mark it with a TODO for a later change.
#include <cuda/seeding/doublet_counting.cuh>
#include <cuda/seeding/doublet_finding.cuh>
#include <cuda/seeding/seed_selecting.cuh>
#include <cuda/seeding/triplet_counting.cuh>
#include <cuda/seeding/triplet_finding.cuh>
#include <cuda/seeding/weight_updating.cuh>
#include <cuda/seeding/doublet_counting.hpp>
#include <cuda/seeding/doublet_finding.hpp>
#include <cuda/seeding/seed_selecting.hpp>
#include <cuda/seeding/triplet_counting.hpp>
#include <cuda/seeding/triplet_finding.hpp>
#include <cuda/seeding/weight_updating.hpp>
Why are we moving away from the .cuh extension? That's quite standard for CUDA programs. 😕
I thought so. But these files don't have any cuda-related code inside, so I decided to use the .hpp extension for them.
This was one of my recommendations: use .hpp on headers that can be included not only by .cu files, but by other C++ source files as well.
As I explained before, I like to use .cuh as a "contagious" extension. Marking something as .cuh should be done only when that particular header has code for launching a kernel, or when it includes some other .cuh file. In all other cases we should just use .hpp.
mutable std::shared_ptr<std::mutex> mutex{std::make_shared<std::mutex>()};
mutable host_doublet_counter_container doublet_counter_container;
mutable host_doublet_container mid_bot_container;
mutable host_doublet_container mid_top_container;
mutable host_triplet_counter_container triplet_counter_container;
mutable host_triplet_container triplet_container;
mutable host_seed_container seed_container;
I'm a little surprised to find this here. Are we relying on the internal state of our algorithms? I really think we should aim to have our algorithms be stateless!
I know.
Without these, I would have to create the host doublet/triplet containers in the operator() function for every event, which makes the entire algorithm much slower (almost a 50% slowdown). Do you have a better idea for this?
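One hedged alternative (a sketch with hypothetical simplified containers, not the traccc types): let the caller own a scratch struct and pass it into operator(). The algorithm then stays const and stateless, reusing the scratch across events keeps the allocated capacity, and each thread can own its own scratch instead of sharing a mutex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Caller-owned scratch buffers, reused across events.
struct scratch_space {
    std::vector<int> doublet_buffer;
    std::vector<int> triplet_buffer;
};

// Simplified stand-in for the seed finding: operator() is const and
// touches no member state, so concurrent calls with separate scratch
// objects are safe without locking.
struct seed_finding_sketch {
    std::size_t operator()(const std::vector<int>& spacepoints,
                           scratch_space& scratch) const {
        scratch.doublet_buffer.clear();  // keeps capacity from earlier events
        for (int sp : spacepoints) {
            scratch.doublet_buffer.push_back(sp);
        }
        return scratch.doublet_buffer.size();
    }
};
```

Whether this beats the mutable-member approach depends on how expensive the containers are to construct, but it avoids both the per-event allocations and the coarse-grained lock.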
/// @return seed_collection is the vector of seeds per event
output_type operator()(
    host_internal_spacepoint_container&& i) const override {
    std::lock_guard<std::mutex> lock(*mutex.get());
If we really need to do locking (which I hope we don't!) I think this is probably a little coarse-grained. It will be hard to get any decent acceleration from parallelism this way.
I also didn't want to make this change. The only way to remove the mutable internal members would be to take them as external parameters - but that is also coarse-grained.
And I don't think we have to achieve more parallelization in the cuda operator(), which works at grid-bin level.
auto uvTransform = [](const vector3& local) -> vector2 {
    vector2 uv;
    scalar denominator = local[0] * local[0] + local[1] * local[1];
    uv[0] = local[0] / denominator;
    uv[1] = local[1] / denominator;
    return uv;
};
This function doesn't actually act as a closure - maybe we can just define it as a free function? Or maybe move it to the algebra plugins?
Hmm, I don't recommend moving it into the algebra plugins. I will just make it a free function in the same file.
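Since the lambda captures nothing, it translates directly into a free function. A sketch (with scalar/vector2/vector3 as simplified stand-ins for the traccc algebra types):

```cpp
#include <array>
#include <cassert>

// Simplified stand-ins for the algebra-plugin types used in the snippet.
using scalar = double;
using vector2 = std::array<scalar, 2>;
using vector3 = std::array<scalar, 3>;

// The capture-less lambda above, rewritten as a free function in the
// same file: maps a local position to (u, v) = (x, y) / (x^2 + y^2).
inline vector2 uv_transform(const vector3& local) {
    const scalar denominator = local[0] * local[0] + local[1] * local[1];
    return {local[0] / denominator, local[1] / denominator};
}
```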
class seeding_algorithm {
public:
    using input_type = host_spacepoint_container&;
    using output_type =
        std::pair<host_internal_spacepoint_container, host_seed_container>;
class seeding_algorithm
    : public algorithm<
          std::pair<host_internal_spacepoint_container, host_seed_container>(
              host_spacepoint_container&&)> {
Are these changes related to the track parameter seeding in some way or have they snuck into this PR?
They snuck.
std::reference_wrapper<vecmem::memory_resource> m_mr;

// mutable internal objects for multiplets
mutable std::shared_ptr<std::mutex> mutex{std::make_shared<std::mutex>()};
Are people okay with these mutable objects and the mutex lock?
Without them, I have to declare the host doublet/multiplet containers in the operator() function for every event, which makes the entire algorithm much slower (almost a 50% slowdown).
output_type operator()(host_seed_container&& seeds) const override;

private:
    std::reference_wrapper<vecmem::memory_resource> m_mr;
It's something I don't understand well. What is a normal wrapper?
No good reason for that... :( I will try to make PRs clearer in the future.
Are there more comments on this? Otherwise I want to get it merged in.
@stephenswat @krasznaa Are there more issues that I need to address?
Looks fine to me - just two small things I would like to follow up on. When that is done we can check whether Attila or anyone else has any comments, and then I would be in favour of moving to merge. 😄
}

private:
    std::reference_wrapper<vecmem::memory_resource> m_mr;
Not a huge deal, but I still don't see the need to use reference wrappers here as opposed to normal wrappers.
That's something I am not sure about - it was recommended by @krasznaa. I think he can explain the motivation for this to us.
I'll sign off on this for now, but I will leave just a bit of time for others to comment if they want.
@stephenswat looks like there are no more comments. Can you get this in? We can come back to the wrapper thing at any time in the future.
It's squashed into one commit.
Let's do it! 👍
traccc_pr_75 should be merged before this PR.
This is for track parameter estimation from found seeds.
The algorithm is almost a copy-and-paste of EstimateTrackParamsFromSeed in Acts - a difference is that the local position on the surface is obtained from the measurement directly, while the Acts one obtains it with a global-to-local transformation and a surface object.
There are a couple of features and issues that people might be interested in:
- (core/include/definitions/qualifiers.hpp)
- I tried to keep a std::optional<measurement> member in spacepoint to refer to when I obtain the local track parameter, but it turned out that std::optional doesn't work in cuda device code - or does anyone know how to make it work??
Meanwhile I have observed that seed finding doesn't work with the eigen plugin and double scalar type... I will look into that and make another PR to resolve it.
TO-DO list:
Update on Sep 22
This can be merged after #92