Data V7, main branch (2024.05.03.) #561

Merged (1 commit) on May 3, 2024


krasznaa
Member

@krasznaa krasznaa commented May 3, 2024

Switched to version 7 of the traccc data file.

It has 2 major updates:

  • The ODD geometry files have been updated to the "latest" ones generated by @asalzburger. (He since seems to have generated slightly newer ones, but I stuck with the ones that I tested the traccc code with already.)
  • Replaced the existing ODD Geant4 and Fatras simulation files with a new set of Geant4 files.

I did the latter with the code, and instructions from: acts-project/acts#3169

As explained in that ticket, the current main branch of Acts can run simulations on the ODD geometry out of the box, which I did in the following way:

  • Cloned the HEAD of the Acts main branch;
    • Making sure that the ODD submodule of the repository would be set up correctly. (git submodule init && git submodule update)
  • Started a Docker container with the following command in the directory where I cloned Acts:
docker run -it --rm -v $PWD:$PWD -w $PWD ghcr.io/acts-project/ubuntu2204:v41
  • Inside the container, used the following BASH script to build the code, and run some specific simulations:
#!/bin/bash
#
# Script for running a set of ODD simulation jobs for traccc
# performance measurements.
#

# Stop on errors, including failures inside the "python3 ... | tee" pipelines.
set -e
set -o pipefail

# Default script arguments.
ACTS_DIR="./acts"
BUILD_DIR="./build"

# Helper function for printing usage information for the script.
usage() {
    echo "Script for running ODD simulations for traccc using the"
    echo "ghcr.io/acts-project/ubuntu2204:v41 Docker image."
    echo ""
    echo "Usage: ${BASH_SOURCE[0]} [options]"
    echo "Options:"
    echo "  -h/--help:      Print this message."
    echo ""
    echo "  -a/--acts-dir:  Directory with the Acts sources"
    echo "                  [${ACTS_DIR}]"
    echo "  -b/--build-dir: Build directory for Acts"
    echo "                  [${BUILD_DIR}]"
    echo ""
}

# Parse the command line argument(s).
while [[ $# -gt 0 ]]
do
    case $1 in
	-a|--acts-dir)
	    ACTS_DIR=$2
	    shift
	    ;;
	-b|--build-dir)
	    BUILD_DIR=$2
	    shift
	    ;;
	-h|--help)
	    usage
	    exit 0
	    ;;
	*)
	    echo "ERROR: Unknown argument: $1"
	    echo ""
	    usage
	    exit 1
	    ;;
    esac
    shift
done

# Check if the Acts directory exists.
if [ ! -d "${ACTS_DIR}" ]
then
    echo "ERROR: Acts directory (${ACTS_DIR}) not found"
    exit 1
fi

# Check if the build directory exists. If not, make it.
if [ ! -d "${BUILD_DIR}" ]
then
    cmake -G Ninja \
	  -DCMAKE_BUILD_TYPE=Release \
	  -DCMAKE_CXX_STANDARD=17 \
	  -DACTS_ENABLE_LOG_FAILURE_THRESHOLD=ON \
	  -DACTS_BUILD_EVERYTHING=ON \
	  -DACTS_BUILD_ODD=ON \
	  -DACTS_BUILD_EXAMPLES_PYTHON_BINDINGS=ON \
	  -S "${ACTS_DIR}" \
	  -B "${BUILD_DIR}"
    cmake --build "${BUILD_DIR}"
fi

# Set up the environment from the build directory.
source "${BUILD_DIR}/this_acts.sh"
source "${BUILD_DIR}/python/setup.sh"

# Make sure that the Geant4 datasets are downloaded.
geant4-config --install-datasets

# Run all the muon simulations.
for NMUON in 1 10
do
    for PT in 1 5 10 50 100
    do
	python3 "${ACTS_DIR}/Examples/Scripts/Python/sim_digi_odd.py" \
		--digi-config odd-digi-geometric-config.json \
		--geant4 \
		--gun-multiplicity ${NMUON} \
		--gun-pt-range ${PT} ${PT} \
		--events 100 \
		--output geant4_${NMUON}muon_${PT}GeV 2>&1 | \
	    tee geant4_${NMUON}muon_${PT}GeV.log
    done
done

# Run all ttbar simulations.
for MU in 20 40 60 80 100 140 200 300
do
    python3 "${ACTS_DIR}/Examples/Scripts/Python/sim_digi_odd.py" \
	    --digi-config odd-digi-geometric-config.json \
	    --geant4 \
	    --ttbar \
	    --ttbar-pu ${MU} \
	    --events 10 \
	    --output geant4_ttbar_mu${MU} 2>&1 | \
	tee geant4_ttbar_mu${MU}.log
done

(The odd-digi-geometric-config.json file used by the simulation is the same included in this PR's v7 file as well.)

Unfortunately the ODD simulations turned out to be humongous, so I only included the muon simulations in the v7 data file. The ttbar simulations, even with just 10 events each, look like this:

[bash][celeborn]:acts > du -sh geant4_ttbar_mu*/ | sort -h
2.0G	geant4_ttbar_mu20/
3.1G	geant4_ttbar_mu40/
4.0G	geant4_ttbar_mu60/
5.2G	geant4_ttbar_mu80/
6.4G	geant4_ttbar_mu100/
9.0G	geant4_ttbar_mu140/
13G	geant4_ttbar_mu200/
19G	geant4_ttbar_mu300/
[bash][celeborn]:acts >

So yeah, these are not going into the file that our CI setup uses... I'm currently compressing them, to put them on CERNBox.

But with the simulations being so humongous, we'll probably need to find some better way of storing these things. I think we'll have to consider putting them on EOS in such a way that inside of CERN we could use them without having to download them locally. I'm open to proposals on this front. 🤔
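
To put numbers on the storage question, here is a minimal Python sketch that sums the du output quoted above and derives a per-event size. The sizes are copied from the listing as reported by du, the 10 events per sample come from the --events 10 option in the script, and the variable names are only illustrative.

```python
# Directory sizes (in "G" as printed by du -sh above), keyed by pileup value.
sizes_g = {20: 2.0, 40: 3.1, 60: 4.0, 80: 5.2, 100: 6.4, 140: 9.0, 200: 13.0, 300: 19.0}

# Each ttbar sample was produced with "--events 10".
EVENTS_PER_SAMPLE = 10

total_g = sum(sizes_g.values())
per_event_g = {mu: size / EVENTS_PER_SAMPLE for mu, size in sizes_g.items()}

print(f"total: {total_g:.1f}G for {len(sizes_g) * EVENTS_PER_SAMPLE} events")
for mu, size in per_event_g.items():
    print(f"mu={mu}: about {size:.2f}G per event")
```

These figures are for the uncompressed output directories; the compressed archive would presumably be smaller.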

Finally: While the current state of our reconstruction code seemingly swallows these files fine, performance checking doesn't work on them. 😦 As I explained to @beomki-yeo already, the logic in our code does some funky stuff, leading to the following crash:

[bash][celeborn]:traccc > gdb ./build/bin/traccc_seq_example
GNU gdb (Ubuntu 12.1-0ubuntu1~22.04) 12.1
Copyright (C) 2022 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./build/bin/traccc_seq_example...
(gdb) run --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/geant4_1muon_1GeV/ --input-events=1 --check-performance
Starting program: /home/krasznaa/ATLAS/projects/traccc/build/bin/traccc_seq_example --detector-file=geometries/odd/odd-detray_geometry_detray.json --grid-file=geometries/odd/odd-detray_surface_grids_detray.json --use-detray-detector --digitization-file=geometries/odd/odd-digi-geometric-config.json --input-directory=odd/geant4_1muon_1GeV/ --input-events=1 --check-performance
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

Running Full Tracking Chain on the Host

>>> Detector Options <<<
  Detector file       : geometries/odd/odd-detray_geometry_detray.json
  Material file       : 
  Surface rid file    : geometries/odd/odd-detray_surface_grids_detray.json
  Use detray::detector: yes
  Digitization file   : geometries/odd/odd-digi-geometric-config.json
>>> Input Data Options <<<
  Input data format             : csv
  Input directory               : odd/geant4_1muon_1GeV/
  Number of input events        : 1
  Number of input events to skip: 0
>>> Clusterization Options <<<
  Target cells per partition: 1024
>>> Track Seeding Options <<<
  None
>>> Track Finding Options <<<
  Track candidates range   : 3:100
  Minimum step length for the next surface: 0.5 [mm] 
  Maximum step counts for the next surface: 100
  Maximum Chi2             : 30
  Maximum branches per step: 4294967295
  Maximum number of skipped steps per candidates: 3
>>> Track Propagation Options <<<
  Constraint step size  : 3.40282e+38 [mm]
  Overstep tolerance    : -100 [um]
  Minimum mask tolerance: 1e-05 [mm]
  Maximum mask tolerance: 1 [mm]
  Search window         : 0 x 0
  Runge-Kutta tolerance : 0.0001
>>> Track Ambiguity Resolution Options <<<
  Run ambiguity resolution : yes
>>> Performance Measurement Options <<<
  Run performance checks: yes

WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: No material in detector
WARNING: No entries in volume finder
Detector check: OK
WARNING: @traccc::io::csv::read_cells: 1 duplicate cells found in /home/krasznaa/ATLAS/projects/traccc/traccc/data/odd/geant4_1muon_1GeV/event000000000-cells.csv

Program received signal SIGSEGV, Segmentation fault.
traccc::event_map2::event_map2 (this=0x7fffffff9f50, event=<optimized out>, measurement_dir=..., hit_dir=..., particle_dir=...) at /home/krasznaa/ATLAS/projects/traccc/traccc/io/src/event_map2.cpp:90
90	        const auto csv_ptc = particles[csv_hit.particle_id];
(gdb) bt
#0  traccc::event_map2::event_map2 (this=0x7fffffff9f50, event=<optimized out>, measurement_dir=..., hit_dir=..., 
    particle_dir=...) at /home/krasznaa/ATLAS/projects/traccc/traccc/io/src/event_map2.cpp:90
#1  0x000000000044db6a in seq_run (input_opts=..., detector_opts=..., seeding_opts=..., finding_opts=..., propagation_opts=..., 
    resolution_opts=..., performance_opts=...)
    at /home/krasznaa/ATLAS/projects/traccc/traccc/examples/run/cpu/seq_example.cpp:295
#2  0x000000000044f906 in main (argc=<optimized out>, argv=<optimized out>)
    at /home/krasznaa/ATLAS/projects/traccc/traccc/examples/run/cpu/seq_example.cpp:368
(gdb)

Let's discuss this either in this PR or on some other forum, because I don't even understand how that code is supposed to work exactly. 😕 The "particle IDs" used in the simulation are very large uint64_t numbers, not indices starting from 0. 😕 (So we'll need to use something like a map in that code instead of a vector. 🤔)

@krasznaa krasznaa added the "build" label (This relates to the build system) May 3, 2024
@beomki-yeo
Contributor

beomki-yeo commented May 3, 2024

Hmmm, these are the first two lines of the hit file of the single 1 GeV muon ODD simulation (odd/geant4_1muon_1GeV/event000000000-hits.csv):

particle_id,geometry_id,tx,ty,tz,tt,tpx,tpy,tpz,te,deltapx,deltapy,deltapz,deltae,index
4503599694479360,1152922329240578050,-132.517853,-109.295212,-842.599976,1296.16541,-0.80127883,-0.598622918,-5.04116726,5.14051819,-0.00023341192,4.34259928e-05,7.85181619e-05,-4.56686867e-05,4

As you can see, te, which I think is the truth energy of the particle, is 5.14 GeV. I am not sure how a particle with such a big momentum could be generated in the ODD simulation. And 4503599694479360 does not seem to be included in either particles-initial or particles-final.

@krasznaa
Member Author

krasznaa commented May 3, 2024

Not sure why you say that that particle ID would not be included. 😕 I see:

[bash][celeborn]:geant4_1muon_1GeV > grep "4503599694479360" event000000000-particles_*
event000000000-particles_final.csv:4503599694479360,-13,0,-683.296387,-364.839783,-4003,4516.3916,-0.66901058,-0.0603341982,-4.21662664,0.105658375,1
event000000000-particles_initial.csv:4503599694479360,-13,0,-0.0068779313,0.00644291332,26.2992783,410.251038,-0.739709318,-0.672926545,-5.055408,0.105658382,1
[bash][celeborn]:geant4_1muon_1GeV >

So that possibly slightly weird-looking hit should be coming from the 1 muon of the event. 🤔

@beomki-yeo
Contributor

Sorry, I just quickly browsed the file without checking it thoroughly.
Do you have any idea why it has 5 GeV though?

@krasznaa
Member Author

krasznaa commented May 3, 2024

Is it absolutely for certain that that particular property is measured in GeV? 🤔 Though even as MeV it would be way too much for how much energy a muon should lose in a silicon layer. 🤔 (And while I can imagine the Acts simulation using MeV, it certainly doesn't use keV...)

Still, with all that, I think that is just a red herring. If you look at for instance:

[bash][celeborn]:traccc > head data/detray_simulation/telescope/kf_validation/1_GeV_0_phi/event000000000-particles.csv 
particle_id,particle_type,process,vx,vy,vz,vt,px,py,pz,m,q
0,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
1,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
2,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
3,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
4,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
5,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
6,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
7,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
8,0,0,0,0,0,0,1,0,6.123233995736766e-17,0,-1
[bash][celeborn]:traccc >

In that toy simulation, the particle IDs start from 0 and count up, while in this latest G4 simulation we have:

[bash][celeborn]:traccc > head data/odd/geant4_1muon_1GeV/event000000000-particles_initial.csv 
particle_id,particle_type,process,vx,vy,vz,vt,px,py,pz,m,q
4503599644147712,13,0,-0.0068779313,0.00644291332,26.2992783,410.251038,0.556904197,0.830576718,1.06415033,0.105658382,-1
4503599644213249,11,0,241.868393,1243.0564,1415.2373,2318.17432,-1.09519851e-05,-4.98540794e-06,4.38029019e-06,0.000510998885,-1
4503599644213250,11,0,255.827927,1169.23669,1334.9187,2207.8811,-0.000287963398,-6.49540743e-05,0.000454213965,0.000510998885,-1
4503599644213251,11,0,256.66626,1164.1405,1329.37378,2200.28223,0.00152186817,0.00986581575,0.00815776736,0.000510998885,-1
4503599644213252,11,0,258.453583,1153.00684,1317.3075,2183.72217,0.000291749922,2.94843212e-05,0.000428735773,0.000510998885,-1
4503599644213253,11,0,269.193176,1068.70593,1225.87292,2058.5542,-2.35209045e-05,3.10777068e-05,-1.89763032e-05,0.000510998885,-1
4503599644213254,11,0,272.956726,1026.00427,1179.76099,1995.42603,-1.1201015e-06,1.07465752e-07,-1.25962345e-07,0.000510998885,-1
4503599644213255,11,0,234.6642,540.838684,657.654602,1278.53601,9.5930136e-07,3.63300899e-07,-4.8290525e-07,0.000510998885,-1
4503599644213256,11,0,173.804993,339.263153,432.710236,969.504639,-1.89713978e-06,8.96205904e-07,-3.45779441e-08,0.000510998885,-1
[bash][celeborn]:traccc >

So again, I think the logic in traccc::event_map2 is just fundamentally flawed. 🤔

@beomki-yeo
Contributor

Yeah I remember that I simplified the particle id scheme in the detray simulation. But how does it have something to do with event_map2 logic which is based on unordered std::map?

@asalzburger
Contributor

Hey, I did have a bug first where the eta values had been taken for pt as well in the production script - but I thought I had fixed that.

@beomki-yeo
Contributor

beomki-yeo commented May 3, 2024

Is it absolutely for certain that that particular property is measured in GeV? 🤔 Though even as MeV it would be way too much for how much energy a muon should lose in a silicon layer.

Check the first line of the particle file, which might be the primary muon:

particle_id,particle_type,process,vx,vy,vz,vt,px,py,pz,m,q
4503599644147712,13,0,241.868393,1243.0564,1415.2373,2318.17432,-0.185933262,0.871767998,0.952647626,0.105658375,-1

It has (-0.185933262,0.871767998,0.952647626) for momentum, which is of order 1, so GeV is the right scale.
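
The unit question can also be cross-checked numerically from the hit row quoted earlier in this thread. The sketch below is only a sanity check under stated assumptions: that tpx, tpy, tpz are momentum components, that te is the total energy, and that everything (including the "m" column of the particle rows) is in GeV. The variable names are illustrative.

```python
import math

# Muon mass from the "m" column of the particle rows in this thread (assumed GeV).
MUON_MASS = 0.105658382

# tpx, tpy, tpz, te copied from the first hit of event000000000-hits.csv above.
tpx, tpy, tpz, te = -0.80127883, -0.598622918, -5.04116726, 5.14051819

pt = math.hypot(tpx, tpy)                       # transverse momentum
p = math.sqrt(tpx**2 + tpy**2 + tpz**2)         # total momentum
energy = math.sqrt(p**2 + MUON_MASS**2)         # E^2 = p^2 + m^2
eta = math.asinh(tpz / pt)                      # pseudorapidity

print(f"pT = {pt:.4f}, |p| = {p:.4f}, E = {energy:.4f}, te = {te:.4f}, eta = {eta:.2f}")
```

Under these assumptions the transverse momentum comes out very close to 1 and the recomputed energy agrees with te, which would be consistent with a muon generated at pT = 1 GeV (the --gun-pt-range value) in a forward direction. This is an observation from the numbers only; nothing in the thread confirms it.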

@krasznaa
Member Author

krasznaa commented May 3, 2024

Yeah I remember that I simplified the particle id scheme in the detray simulation. But how does it have something to do with event_map2 logic which is based on unordered std::map?

Check the code! The particles are first read into an std::vector, in the order in which they appear in the CSV file.

Then, while going through the hits, the code tries to find the particle belonging to the hit, through its vector index.

But for that to work, the indices would always have to be the same as the particle IDs. Which is only the case for the Detray simulation files.
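
The mismatch can be illustrated with a minimal Python sketch. This is not the traccc C++ code: lookup_by_index and particles_by_id are hypothetical names, and the CSV text is just the particle and hit rows quoted earlier in this thread. It contrasts a positional lookup, which only works when particle IDs equal row indices, with a lookup keyed on the particle ID.

```python
import csv
import io

# Rows copied from event000000000-particles_initial.csv and -hits.csv above.
PARTICLES_CSV = """particle_id,particle_type,process,vx,vy,vz,vt,px,py,pz,m,q
4503599694479360,-13,0,-0.0068779313,0.00644291332,26.2992783,410.251038,-0.739709318,-0.672926545,-5.055408,0.105658382,1
"""
HITS_CSV = """particle_id,geometry_id,tx,ty,tz,tt,tpx,tpy,tpz,te,deltapx,deltapy,deltapz,deltae,index
4503599694479360,1152922329240578050,-132.517853,-109.295212,-842.599976,1296.16541,-0.80127883,-0.598622918,-5.04116726,5.14051819,-0.00023341192,4.34259928e-05,7.85181619e-05,-4.56686867e-05,4
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, in file order."""
    return list(csv.DictReader(io.StringIO(text)))


def lookup_by_index(particles, particle_id):
    """Positional lookup: only valid if particle_id is a row index."""
    if particle_id < len(particles):
        return particles[particle_id]
    return None  # an unchecked C++ vector access here is undefined behaviour


particles = read_rows(PARTICLES_CSV)
hits = read_rows(HITS_CSV)

# Keyed lookup: works for arbitrary 64-bit particle IDs.
particles_by_id = {int(row["particle_id"]): row for row in particles}

hit_pid = int(hits[0]["particle_id"])
print("by index:", lookup_by_index(particles, hit_pid))
print("by ID   :", particles_by_id.get(hit_pid))
```

In C++ the keyed variant would correspond to something like a map from the 64-bit particle ID to the particle record, as suggested in the PR description.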

@krasznaa
Member Author

krasznaa commented May 3, 2024

Is it absolutely for certain that that particular property is measured in GeV? 🤔 Though even as MeV it would be way too much for how much energy a muon should lose in a silicon layer.

Check the first line of the particle file, which might be the primary muon:

particle_id,particle_type,process,vx,vy,vz,vt,px,py,pz,m,q
4503599644147712,13,0,241.868393,1243.0564,1415.2373,2318.17432,-0.185933262,0.871767998,0.952647626,0.105658375,-1

It has (-0.185933262,0.871767998,0.952647626) for momentum, which is of order 1, so GeV is the right scale.

The particle and hit files may still save stuff in different units, since different parts of the simulation code take care of creating those files.

But as I wrote, I don't think this is the thing that we should be focusing on right now...

@beomki-yeo
Contributor

OK. Definitely using std::vector was a stupid move. I will fix that

@krasznaa
Member Author

krasznaa commented May 3, 2024

For completeness, I've put the ttbar simulations here: https://cernbox.cern.ch/s/aLswvi2pNcBX9wr

It was quite a challenge even to upload it as a single file...

@krasznaa
Member Author

krasznaa commented May 3, 2024

One of you should approve it eventually. That would make it a bit easier to fix the performance measurement code on top of this. 🤔

Labels
build This relates to the build system

3 participants