Hi all,
I am working on a research project about MPI / OpenMPI 5.0.8 (with Peruse), and we are trying to trace applications with ucTrace.
I used a Slurm system with 8 nodes and without NVIDIA / CUDA, and ran LULESH 2.0 with the following job script:
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --tasks-per-node=8
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=2G
#SBATCH --output=lulesh-%j.out
#SBATCH --error=lulesh-%j.err
export LD_PRELOAD=/home/<USERNAME>/git/so-files-lib/preload.so
# This folder contains the .so files built from the ucTrace folder
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/<USERNAME>/git/so-files-lib/
srun /home/<USERNAME>/git/LULESH/lulesh2.0
Everything worked and I got the 8 .asd.gz files.
After that, I tried to execute parse.py to parse the log files:
python3 parse.py my-outputs/ -p my_output.pkl -n 0
Unfortunately, I got an error while the script was parsing the COMMS:
Traceback (most recent call last):
  File "/home/<USERNAME>/ucTrace/parser/parse.py", line 2017, in <module>
    main()
    ~~~~^^
  File "/home/<USERNAME>/ucTrace/parser/parse.py", line 1966, in main
    all_data, parsed_comms = parse_folder(args.folder_path, args)
                             ~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<USERNAME>/ucTrace/parser/parse.py", line 1793, in parse_folder
    match_ucp_send_recv(parsed_comms, all_data)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/<USERNAME>/ucTrace/parser/parse.py", line 1253, in match_ucp_send_recv
    recvs = tmp_recvs[target_pid]
            ~~~~~~~~~^^^^^^^^^^^^
KeyError: "Error decoding hex string-- 'ascii' codec can't decode byte 0xf0 in position 8"
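The key in the KeyError looks like an error message rather than a PID, so my guess is that a hex-encoded field failed to decode as ASCII and the resulting message text was later used as target_pid. Below is a minimal sketch of that assumed failure mode. The helper decode_hex_ascii, the sample hex string, and the contents of tmp_recvs are hypothetical and are not taken from parse.py; only the error text and the lookup on tmp_recvs come from the traceback above.

```python
def decode_hex_ascii(hex_string):
    """Hypothetical helper (NOT from parse.py): decode a hex string as ASCII.

    On failure it returns the error text instead of raising, which is the
    behaviour I assume leads to the odd dictionary key in the traceback.
    """
    try:
        return bytes.fromhex(hex_string).decode("ascii")
    except ValueError as exc:  # UnicodeDecodeError is a subclass of ValueError
        return f"Error decoding hex string-- {exc}"


# Hypothetical input: eight ASCII bytes ("lulesh20") followed by the byte 0xf0,
# so the decode fails at position 8 as in the reported message.
sample = "6c756c6573683230" + "f0"
target_pid = decode_hex_ascii(sample)

# Hypothetical table of receives keyed by PID, standing in for tmp_recvs.
tmp_recvs = {"12345": ["recv-a", "recv-b"]}

# tmp_recvs[target_pid] would raise KeyError here; a guarded lookup lets the
# affected record be reported or skipped instead of aborting the whole parse.
recvs = tmp_recvs.get(target_pid)
if recvs is None:
    print(f"no receives for key {target_pid!r}")
```

If this is what happens, the real cause would be the non-ASCII byte in the trace record, and the guarded lookup would only hide the symptom.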
How can I fix this?
Thanks in advance!