## About .ptu Format

The .ptu format is a binary file format used by PicoQuant time-tagging devices to store photon detection events with high temporal resolution. A .ptu file typically contains a metadata header (configuration and hardware information) followed by time-tagged photon event records. It is used in time-correlated single photon counting (TCSPC) and quantum optics experiments.
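
As a rough illustration of the header layout, the sketch below checks the first 16 bytes of a stream the way PicoQuant's public demo reader does: an 8-byte, null-padded magic string (`PQTTTR`) followed by an 8-byte version string. The function name `read_ptu_preamble` and the sample bytes are hypothetical, made up for this example; the tagged header entries and the event records that follow are not parsed here.

```python
import io

PTU_MAGIC = "PQTTTR"  # magic string checked by PicoQuant's demo reader


def read_ptu_preamble(stream):
    """Return the version string if the stream starts like a PTU file."""
    magic = stream.read(8).decode("ascii", errors="replace").strip("\0")
    if magic != PTU_MAGIC:
        raise ValueError(f"not a PTU file (magic={magic!r})")
    return stream.read(8).decode("ascii", errors="replace").strip("\0")


# Synthetic 16-byte preamble for illustration only, not a real recording.
fake_file = io.BytesIO(b"PQTTTR\0\0" + b"1.0.00\0\0")
print(read_ptu_preamble(fake_file))
```

For a real file, the same function could be called on `open(path, "rb")`.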

__PyPTU__ is a Python package and command-line tool developed by __Mike Rye__ for parsing PicoQuant PTU files. It extracts photon, marker, and overflow data into Pandas DataFrames and can export them to JSON and CSV for further analysis. It follows the same parsing approach as PicoQuant’s Python demo but is up to __40×__ faster because it uses NumPy’s vectorized operations.

Link: `https://gitlab.inria.fr/jrye/pyptu`

In this project, I install the package and convert a .ptu file to tab-separated text so that the photon records can be processed with vectorized operations.

## Install:

`pip install pyptu`

In [1]:
import pyptu

print("pyptu is installed.")
print(pyptu.__file__)


pyptu is installed.
C:\Users\mhassa11\AppData\Local\miniconda3\envs\intelARC\lib\site-packages\pyptu\__init__.py


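__pyptu__ does not appear to expose a `__version__` attribute, so the cell below sets one by hand. This only labels the imported module object and does not report the installed version: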

In [2]:
pyptu.__version__ = "dev-local"
print("pyptu version:", pyptu.__version__)

pyptu version: dev-local


To read the installed version, query the package metadata with `importlib.metadata`, in the form `print(md.version("<package_name>"))`:

In [5]:
import importlib.metadata as md
print(md.version("pyptu"))

2024.2
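
`md.version()` raises `PackageNotFoundError` when the distribution is not installed, so a small wrapper can make the check safe to run in any environment. This is a minimal sketch; the helper name `installed_version` is hypothetical and not part of pyptu.

```python
import importlib.metadata as md


def installed_version(package_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return md.version(package_name)
    except md.PackageNotFoundError:
        return None


# Prints the version string when pyptu is installed, otherwise None.
print(installed_version("pyptu"))
```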


# Using PyPTU for data conversion

In [8]:

import os
import time
import pandas as pd
from pyptu import PTUParser


start_0 = time.time()
#------------------------------------------------------------------------------------------#
PC          = os.getlogin()
input_file  = fr'C:\Users\{PC}\Documents\JupyterNotebook\Summer_25\UTC\raw_10s_run.ptu'
output_file = fr'C:\Users\{PC}\Documents\JupyterNotebook\Summer_25\UTC\ptu_to_txt.dat'

parser = PTUParser(input_file)
parser.load()

parser.photons.to_csv(output_file, sep="\t", index=False)
#------------------------------------------------------------------------------------------#
end_0   = time.time()

file_size_bytes = os.path.getsize(output_file)
file_size_MB    = file_size_bytes / (1024 * 1024)

print(f"File size        \t {file_size_MB:.2f} MB")
print(f"Conversion time  \t {end_0 - start_0:.3f} sec")

# Preview output
df = pd.read_csv(output_file, sep="\t", nrows=20)
df.head(10)


File size        	 779.35 MB
Conversion time  	 33.074 sec


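|   | Record Index | Channel | Time Tag | Resolved Time Tag | Dtime |
|---|---|---|---|---|---|
| 0 | 0 | 3 | 28191 | 140955.0 | 0 |
| 1 | 1 | 0 | 36653 | 183265.0 | 0 |
| 2 | 2 | 4 | 41872 | 209360.0 | 0 |
| 3 | 3 | 2 | 42471 | 212355.0 | 0 |
| 4 | 4 | 2 | 438737 | 2193685.0 | 0 |
| 5 | 5 | 4 | 451327 | 2256635.0 | 0 |
| 6 | 6 | 0 | 559075 | 2795375.0 | 0 |
| 7 | 7 | 0 | 577964 | 2889820.0 | 0 |
| 8 | 8 | 3 | 579264 | 2896320.0 | 0 |
| 9 | 9 | 2 | 603304 | 3016520.0 | 0 |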
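
In the ten preview rows, `Resolved Time Tag` is exactly five times `Time Tag`. The sketch below checks that arithmetic on the values copied from the preview. Whether the factor of 5 is the device's time resolution, and whether it holds for the whole file, are assumptions that this output alone does not confirm.

```python
# (Time Tag, Resolved Time Tag) pairs copied from the ten preview rows.
preview = [
    (28191, 140955.0), (36653, 183265.0), (41872, 209360.0),
    (42471, 212355.0), (438737, 2193685.0), (451327, 2256635.0),
    (559075, 2795375.0), (577964, 2889820.0), (579264, 2896320.0),
    (603304, 3016520.0),
]

# Collect the distinct ratios; a single value means a constant scale factor.
ratios = {resolved / tag for tag, resolved in preview}
print(ratios)
```

If the factor is constant across the file, the resolved column can be recomputed from `Time Tag` and does not need to be exported.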


#### Let's see if we can make the procedure faster by keeping only necessary columns

In [10]:

import os
import time
import pandas as pd
from pyptu import PTUParser


start_1 = time.time()
#------------------------------------------------------------------------------------------#
PC          = os.getlogin()
input_file  = fr'C:\Users\{PC}\Documents\JupyterNotebook\Summer_25\UTC\raw_10s_run.ptu'
output_file = fr'C:\Users\{PC}\Documents\JupyterNotebook\Summer_25\UTC\ptu_to_txt_concise.dat'

parser = PTUParser(input_file)
parser.load()

parser.photons[['Channel', 'Time Tag']].to_csv(output_file, sep="\t", index=False)
#------------------------------------------------------------------------------------------#
end_1   = time.time()

file_size_bytes = os.path.getsize(output_file)
file_size_MB    = file_size_bytes / (1024 * 1024)

print(f"File size        \t {file_size_MB:.2f} MB")
print(f"Conversion time  \t {end_1 - start_1:.3f} sec")

# Preview output
df = pd.read_csv(output_file, sep="\t", nrows=20)
df.head(10)


File size        	 299.76 MB
Conversion time  	 11.110 sec


|   | Channel | Time Tag |
|---|---|---|
| 0 | 3 | 28191 |
| 1 | 0 | 36653 |
| 2 | 4 | 41872 |
| 3 | 2 | 42471 |
| 4 | 2 | 438737 |
| 5 | 4 | 451327 |
| 6 | 0 | 559075 |
| 7 | 0 | 577964 |
| 8 | 3 | 579264 |
| 9 | 2 | 603304 |
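
To quantify the effect of dropping columns, the two runs can be compared directly. The numbers below are copied from the outputs of the two cells. Each is a single measurement that includes parsing time, so the ratios are indicative only.

```python
# Measurements copied from the two conversion cells (one run each).
full = {"size_mb": 779.35, "seconds": 33.074}      # all photon columns
concise = {"size_mb": 299.76, "seconds": 11.110}   # Channel and Time Tag only

speedup = full["seconds"] / concise["seconds"]
size_reduction = 1 - concise["size_mb"] / full["size_mb"]

print(f"Speed-up: {speedup:.2f}x")
print(f"File size reduction: {size_reduction:.1%}")
```

On these numbers, the two-column export is roughly three times faster and produces a file about 60% smaller.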


## Make Chunks for parallel processing

In [11]:

import os, time
import pandas as pd
from pyptu import PTUParser
from joblib import Parallel, delayed
from tqdm import tqdm

# ─────────────────────────────── user paths ───────────────────────────────────
PC          = os.getlogin()
base_dir    = fr"C:\Users\{PC}\Documents\JupyterNotebook\Summer_25\UTC"
in_file     = os.path.join(base_dir, "raw_10s_run.ptu")
out_dir     = os.path.join(base_dir, "chunks")          # → …\chunks\chunk_01.txt …
os.makedirs(out_dir, exist_ok=True)


start_0 = time.time()
#─────────────────────────────────────────────────────────────────────────────────────────────────────────#
parser = PTUParser(in_file)
parser.load()

df = parser.photons[["Channel", "Time Tag"]]            # keep only the 2 columns
n_events  = len(df)
N_chunks  = 10
chunk_sz  = -(-n_events // N_chunks)                    # ceiling division: rows per chunk
print(f"Total events: {n_events:,}  →  {chunk_sz:,} rows per chunk")

def save_chunk(idx: int):
    """Write chunk idx (0-based) to <out_dir>/chunk_XX.txt"""
    start = idx * chunk_sz
    end   = min((idx + 1) * chunk_sz, n_events)
    out   = os.path.join(out_dir, f"chunk_{idx + 1:02d}.txt")
    df.iloc[start:end].to_csv(out, sep="\t", index=False)

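# Note: tqdm wraps the task generator, so the bar advances as tasks are dispatched, not as files finish writing.
Parallel(n_jobs=-1)( delayed(save_chunk)(i) for i in tqdm(range(N_chunks), desc="Writing chunks") )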
#─────────────────────────────────────────────────────────────────────────────────────────────────────────#
stop_0  = time.time()

print(f"Done in {stop_0 - start_0:.3f} s → files in {out_dir}")


Total events: 19,114,591  →  1,911,460 rows per chunk
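
The chunk size reported above comes from a ceiling division, which makes the last chunk slightly shorter than the others. The sketch below reproduces the slice boundaries for the event count shown in the output. The helper name `chunk_bounds` is hypothetical. Unlike the cell above, it stops once all rows are covered, so it never produces an empty trailing chunk when there are fewer rows than chunks.

```python
def chunk_bounds(n_rows, n_chunks):
    """Return (start, end) row slices of ceil(n_rows / n_chunks) rows each."""
    if n_rows == 0:
        return []
    size = -(-n_rows // n_chunks)  # ceiling division without importing math
    return [(start, min(start + size, n_rows)) for start in range(0, n_rows, size)]


bounds = chunk_bounds(19_114_591, 10)
print(len(bounds), bounds[0], bounds[-1])
print("rows in last chunk:", bounds[-1][1] - bounds[-1][0])
```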


Writing chunks: 100%|████████████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 1208.80it/s]


Done in 3.887 s → files in C:\Users\mhassa11\Documents\JupyterNotebook\Summer_25\UTC\chunks
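
The progress bar above reports 1208.80 it/s, while the whole cell took 3.887 s. This is most likely because `tqdm` wraps the generator that feeds `Parallel`, so it counts tasks as they are handed out rather than as they finish. The sketch below shows one way to count completions instead. It deliberately swaps joblib for the standard library's `concurrent.futures` with threads, and it writes synthetic rows to a temporary directory, so it illustrates the idea rather than replacing the cell above.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def write_chunk(path, lines):
    """Write one tab-separated chunk file and return its path."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("Channel\tTime Tag\n")
        fh.writelines(lines)
    return path


# Synthetic events for illustration only (not data from the .ptu file).
rows = [f"{i % 5}\t{i * 100}\n" for i in range(1_000)]
n_chunks = 10
size = -(-len(rows) // n_chunks)  # ceiling division, as in the cell above

with tempfile.TemporaryDirectory() as out_dir, ThreadPoolExecutor() as pool:
    futures = [
        pool.submit(
            write_chunk,
            os.path.join(out_dir, f"chunk_{i + 1:02d}.txt"),
            rows[i * size:(i + 1) * size],
        )
        for i in range(n_chunks)
    ]
    done = 0
    for future in as_completed(futures):  # yields each task as it finishes
        future.result()                   # re-raise any write error
        done += 1
        print(f"{done}/{n_chunks} chunks written")
    written = sorted(os.listdir(out_dir))
```

Here the counter only advances when a file has actually been written, so the reported progress matches the elapsed time.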
