Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding dNdX to edm4hep::Track #137

Merged
merged 24 commits into from
May 26, 2022
Merged

Adding dNdX to edm4hep::Track #137

merged 24 commits into from
May 26, 2022

Conversation

clementhelsens
Copy link
Contributor

BEGINRELEASENOTES

@tmadlener
Copy link
Contributor

Thanks. As also discussed during the meeting, leaving this open for a day or two to allow for some discussion.

My generalized proposal would have looked something like this:

  1. Introduce a new component
components:
  edm4hep::DxQuantity:
    Members:
      - int16_t type // flag identifying how to interpret the data
      - float value // the value
      - float error // the error
  1. Make the Track use that and remove dEdx
datatypes:
 edm4hep::Track:
   Description: "Reconstructed track"
   Author : "F.Gaede, DESY"
   Members:
     - int32_t type                         //flagword that defines the type of track.Bits 16-31 are used internally
     - float chi2                       //Chi^2 of the track fit
     - int32_t ndf                          //number of degrees of freedom of the track fit
     - float radiusOfInnermostHit       //radius of the innermost hit that has been used in the track fit
   VectorMembers:
     - int32_t subDetectorHitNumbers        //number of hits in particular subdetectors.Check/set collection variable TrackSubdetectorNames for decoding the indices 
     - edm4hep::TrackState trackStates  //track states
     - edm4hep::DxQuantity dxQuantities // different measurements of dx quantities
   OneToManyRelations:
     - edm4hep::TrackerHit trackerHits  //hits that have been used to create this track
     - edm4hep::Track tracks            //tracks (segments) that have been combined to create this track

This would allow to store up to potentially 256 different dx categories and essentially a memory bound amount of actual dx measurements per track. The interface would become much less intuitive though and it only makes sense if there are actually more categories of dx measurements other than dN and dE.

@clementhelsens
Copy link
Contributor Author

Hi @tmadlener, I like your idea, as it allows for example storing various methods to compute dE/dX. Thus I suggest we proceed with your implementation and start using it in k4simDelphes

@gaede
Copy link
Collaborator

gaede commented Mar 2, 2022

Hi Thomas, nice idea. Do we actually need an extra 16/8 bit type integer here or can we not make use of some bits in the type flag of the track ? I would not expect more than 8 (3 bits) different types here. Would safe a bit of disk space and also have better alignment...

@tmadlener
Copy link
Contributor

There are two problems that I see if we use some bits in the type flag of the track

  • From a UI perspective, I think it is nicer if the information on how to interpret a given dx quantity comes with the value
  • It will no longer be easily possible to store more than one different dx quantity type with the track, because that would require to use another 3 bits for each type that we want to store. Additionally, we would have an implicit ordering requirement for the quantities that are stored in the dxQuantities, because we would need to know which type belongs to which index in the end.

In this case I think the second one is the more limiting one, because one of the main reasons to implement it in this way is to have the possibility to e.g. store a dE/dx and a dN/dx measurement (or multiples of them) in one track.

@gaede
Copy link
Collaborator

gaede commented Mar 2, 2022

Hi Thomas, I have overlooked the vector members here. So yes, then you are right of course. Still another option would be to have only one dxValue/Error pair per track...

@tmadlener
Copy link
Contributor

That would work. I can't really judge if having multiple different dx measurements makes sense for one track. If we only have one per track, then defining a dedicated component might also be a bit of an overkill, and we could get away with just proper naming.

@clementhelsens
Copy link
Contributor Author

@gaede , @tmadlener, I think we could have multiple definitions of dE/dX for a given track, meaning we can have different ways of computing the truncated mean. This is in my opinion a good argument to keep a vector member with arbitrary number of dq/dx

@liwd2003
Copy link

liwd2003 commented Mar 9, 2022

  1. The number of dedx measurements used to calculate dedx of a track is less than or equal to the number of hits on the track because some dedx measurements will be dropped when the truncate mean method is applied. Both HitNumbers and number of dedx measurements should be member variables?

@clementhelsens
Copy link
Contributor Author

I believe we could even generalise @tmadlener suggestion with a generic component Quantity that would have an identifier, a value and an error, no need to make it specific to dx

components:
  edm4hep::Quantity:
    Members:
      - int16_t type // flag identifying how to interpret the data
      - float value // the value
      - float error // the error

Copy link
Collaborator

@andresailer andresailer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved in the meeting

@clementhelsens
Copy link
Contributor Author

Dear all,
At the Key4hep meeting 2 weeks ago (March 15) we agreed to postpone this PR until a preliminary schema-evolution is implemented. As far as I can see, there is no corresponding pull request yet in PODIO. Given that schema evolution has been under discussion since quite some time now, and that it is becoming a show stopper for FCC physics studies at this point, could you please give us a brief update on the current status and the expected timeline for a first implementation? @hegner @tmadlener

@tmadlener tmadlener enabled auto-merge (squash) May 24, 2022 17:36
@tmadlener
Copy link
Contributor

Merging this, since root's automatic schema evolution works for all relevant use cases for now. (As discussed during the meeting on May 24

@tmadlener
Copy link
Contributor

It is currently not clear to me why the CI builds against key4hep are failing for this. It looks like a dictionary mixup at first glance, but I am not sure that is actually the case, as there are also some unresolved symbol issues that seem to be unrelated to podio and EDM4hep. Additionally, only the RDataFrame test fails, but not the others which in principle also use the dictionaries and which are run in the same environment.

@tmadlener
Copy link
Contributor

I can also not seem to reproduce these problems locally, i.e. when I actually run what is run in CI in the same container as in the CI. @vvolkl any ideas what could be going on here?

@vvolkl
Copy link
Collaborator

vvolkl commented May 26, 2022

The tests are missing the ROOT_INCLUDE_PATH environment variable, and are thus looking at the outdated headers on cvmfs. Actually, all the other tests print the error too, but none of them return a fail exit code, it'll definitely be necessary to fail on a regex here too.

@tmadlener tmadlener merged commit 0ba2243 into key4hep:master May 26, 2022
@tmadlener
Copy link
Contributor

Thanks for fixing this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

None yet

6 participants