2. ReadROOT's ROOT file reader
To use the .root file reader provided by the ReadROOT package, we can import the reader straight from ReadROOT:
>>> from ReadROOT import read_root
Currently, read_root contains two different readers. The first and original version was used with the old GUI and used C++ for the TOF analysis. The second version is currently used by the new GUI. read_root can also be used on its own to read root files.
The define_cut function takes as input a minimum value start, a maximum value stop and some data data_set. It will return all the indexes that match the defined cut.
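The idea behind define_cut can be sketched in a few lines. This is only an illustration, not the package's code: the name `define_cut_sketch` is hypothetical, and we assume that both bounds are inclusive, which the description above does not specify.

```python
def define_cut_sketch(start, stop, data_set):
    """Return the indexes of the values lying in [start, stop].

    Assumption: both bounds are inclusive; the real define_cut may differ.
    """
    return [i for i, value in enumerate(data_set) if start <= value <= stop]


energies = [120, 450, 980, 300, 1500]
print(define_cut_sketch(200, 1000, energies))  # [1, 2, 3]
```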
The get_unfiltered function takes a DataFrame containing a .root file's data as input. Looking at the Flags of the .root file, it only keeps the data that has a flag equal to 16384. The returned DataFrame contains only the unfiltered data, so no pileup events or saturation events.
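As a rough illustration of the flag selection described above, the sketch below filters plain dictionaries instead of a DataFrame. The helper name `keep_unfiltered` and the sample events are hypothetical; only the flag value 16384 and the `Flags` name come from the text. With a pandas.DataFrame the same selection is usually written as a boolean mask, `df[df["Flags"] == 16384]`.

```python
UNFILTERED_FLAG = 16384  # 0x4000, the value quoted in the text above


def keep_unfiltered(rows):
    """Keep only the events whose flag equals UNFILTERED_FLAG."""
    return [row for row in rows if row["Flags"] == UNFILTERED_FLAG]


# Hypothetical events; the second one carries an extra flag bit and is dropped.
events = [
    {"Energy": 410, "Flags": 16384},
    {"Energy": 905, "Flags": 16384 + 128},
    {"Energy": 77, "Flags": 16384},
]
print(keep_unfiltered(events))
```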
The generate_csv_name function takes a file path file_path, a start channel start, a stop channel stop, and a time window window as input. Cuts on the data can be specified and will be added at the end of the newly generated file name:
{run_folder}_{tree_folder}_CH{start}-CH{stop}_{window}_{cuts}.csv
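The naming pattern above can be reproduced with a short sketch. The function `csv_name_sketch` is hypothetical, and two things are assumed here that the text does not confirm: the .root file sits in a `<run_folder>/<tree_folder>/` directory, and the cuts are passed in as an already formatted string.

```python
from pathlib import PurePosixPath


def csv_name_sketch(file_path, start, stop, window, cuts):
    """Build a file name following the pattern quoted above.

    Assumptions: the .root file sits in <run_folder>/<tree_folder>/, and
    `cuts` is already a formatted string.
    """
    path = PurePosixPath(file_path)
    tree_folder = path.parent.name
    run_folder = path.parent.parent.name
    return f"{run_folder}_{tree_folder}_CH{start}-CH{stop}_{window}_{cuts}.csv"


print(csv_name_sketch("DAQ/run_12/UNFILTERED/data.root", 0, 1, 100000, "E0-4000"))
# run_12_UNFILTERED_CH0-CH1_100000_E0-4000.csv
```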
For the three following functions, the compression can be turned off. Using the compress keyword and setting it to False will read the .csv file with no compression. If the wrong compression is selected, the default pandas error will appear, letting you know that the file cannot be read using the selected compression.
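To show what such a compress switch does, here is a stand-in written with the standard library only. It is not how ReadROOT reads its files (the package relies on pandas); the helper `read_rows` and the sample file are hypothetical.

```python
import bz2
import csv
import os
import tempfile


def read_rows(path, compress=True):
    """Read a .csv file, treating it as bz2-compressed when compress is True."""
    opener = bz2.open if compress else open
    with opener(path, "rt", newline="") as handle:
        return list(csv.reader(handle))


# Write a small bz2-compressed .csv file, then read it back.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "tof.csv")
    with bz2.open(path, "wt", newline="") as handle:
        csv.writer(handle).writerows([["tof", "energy"], ["1250", "410"]])
    rows = read_rows(path, compress=True)

print(rows)  # [['tof', 'energy'], ['1250', '410']]
```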
The get_cpp_tof_hist function reads the TOF data saved in a bz2-compressed .csv file. It returns the histogram's data for the TOF, that is, the time difference between the start and the stop channel.
The get_cpp_evse_hist function reads the TOF data saved in a bz2-compressed .csv file. It returns a 2D histogram containing the start energies and stop energies.
The get_cpp_tofvse_hist function reads the TOF data saved in a bz2-compressed .csv file. It returns a 2D histogram containing the TOF data and the stop energies.
It is a better idea to use the newer version of the reader. It is more efficient and easier to use.
Originally, the _root_reader class was made to look for files using Tkinter's filedialog window. Not all of the CoMPASS histograms were coded into it; it is missing all the 2D histograms using the TOF's data. For those histograms, it is easier to use the newer version. To use the old reader, the following commands can be executed:
>>> from ReadROOT import read_root
>>> old_reader = read_root._root_reader
If only the reader is needed, the following commands can be executed instead:
>>> import ReadROOT as r
>>> old_reader = r.reader
The __getdata__ function takes in a file path, a specified TTree key, and a boolean to downcast the data from unsigned integers to integers. The TTree key is used to access the data of the .root file. This function will return a pandas.DataFrame containing all the data from the .root file plus the calculated PSD values.
The __energyhist__ function needs a file path. It will read the data using __getdata__ and then return a 1D histogram of the energy.
The __psdhist__ function needs a file path. It will read the data using __getdata__ and then return a 1D histogram of the PSD.
The __timehist__ function needs a file path. It will read the data using __getdata__ and then return a 1D histogram of the time.
The __tofhist__ function needs two file paths. Both files will be read and the data from those files will then be binned. This function will return a 1D histogram of the time difference between the two selected files.
The __CPPTOF__ function needs two file paths, two cuts, and a time window specified in picoseconds. This function will use the C++ code to find the events that coincide and will return their time difference.
If this function returns an error about a header not being found, this means you need to redo the file configuration for this function to work. The configuration can be done with another part of the ReadROOT package. See the IOClasses to change the configuration.
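The coincidence search performed by __CPPTOF__ can be pictured with the pure Python sketch below. It is not the package's C++ code. We assume that both timestamp lists are sorted, that times are in picoseconds, and that a start event is paired with the first stop event at or after it; the energy cuts are left out. The name `coincidences_sketch` is hypothetical.

```python
def coincidences_sketch(start_times, stop_times, window):
    """Return stop - start for each start that has a stop within `window`.

    Assumptions: both lists are sorted, times are in picoseconds, and a
    start is paired with the first stop at or after it.
    """
    differences = []
    j = 0
    for t_start in start_times:
        # Skip the stops that happened before this start.
        while j < len(stop_times) and stop_times[j] < t_start:
            j += 1
        if j < len(stop_times) and stop_times[j] - t_start <= window:
            differences.append(stop_times[j] - t_start)
    return differences


starts = [1_000, 50_000, 120_000]
stops = [1_400, 90_000, 120_250]
print(coincidences_sketch(starts, stops, window=1_000))  # [400, 250]
```

Because both lists are sorted, the stop index only moves forward, so the search runs in a single pass over the two lists.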
The __PSDvsE__ function takes as input one file. After calculating the PSD values in __getdata__, the function returns a 2D histogram showing the PSD values against the energy values.
The __MCSgraph__ function takes as input one file. A histogram is used to find the number of counts per second and then show the result as a line plot.
The __transform_root_to_excel function shouldn't be used for large files. It takes the data from a .root file and places it inside a .csv file without compressing it, so the .csv file can be quite large and not easy to read.
This is the new version of the .root file reader. All CoMPASS histograms should work as expected, meaning they should return the same thing CoMPASS does. To use this new reader, the following commands can be executed:
>>> from ReadROOT import read_root
>>> new_reader = read_root.root_reader_v2
If only the reader is needed, the following commands can be executed instead:
>>> import ReadROOT as r
>>> new_reader = r.reader_v2
To create a root_reader_v2 instance, a file path to a .root file and a TTree key must be passed to the function.
For all histograms made with the root_reader_v2, the data used to create the histogram is also returned with the histogram's data.
The open function will try to open the .root file passed to the root_reader_v2 instance. If the file cannot be opened or if there is no data in the file, open will return None. Otherwise, all TBranch objects of the TTree are read. The PSD values are then calculated and the timestamps are downcast to integers if we do not need them as unsigned integers (only used for TOF). The .root file will then be closed and the function will return a pandas.DataFrame containing all the data plus the calculated PSD values.
The get_energy_hist function returns the energy histogram of the selected file. A number of bins can be specified to change how many bins the returned histogram will have.
The get_psd_hist function returns the PSD histogram of the selected file. Since the PSD values do not exist by default in the .root files saved by CoMPASS, they are calculated the moment we use the open function. A number of bins can be specified to change how many bins the returned histogram will have. The range of the histogram will always be set to be from 0 to 1 since PSD values can only be in that range.
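The sketch below shows one way the PSD values and their fixed 0 to 1 histogram could be computed. It assumes the usual CoMPASS-style definition, PSD = (long - short) / long, applied to the long-gate and short-gate energies; the package's actual formula is not given in this section. The helper names and the zero-energy guard are hypothetical.

```python
def psd_sketch(energy, energy_short):
    """PSD = (long - short) / long, an assumed CoMPASS-style definition."""
    if energy == 0:
        return 0.0  # assumption: guard against division by zero
    return (energy - energy_short) / energy


def histogram_0_1(values, bins):
    """Count values into `bins` equal-width bins spanning [0, 1]."""
    counts = [0] * bins
    for value in values:
        if 0 <= value <= 1:
            # A value of exactly 1.0 is placed in the last bin.
            index = min(int(value * bins), bins - 1)
            counts[index] += 1
    return counts


pairs = [(1000, 800), (1000, 500), (400, 100)]  # (long, short) energies
psd = [psd_sketch(long, short) for long, short in pairs]
print(psd)  # [0.2, 0.5, 0.75]
print(histogram_0_1(psd, bins=4))  # [1, 0, 1, 1]
```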
The get_time_hist function returns the time histogram of the selected file. This is basically the time difference between value 0 and value 1, value 1 and value 2, etc. This means that the data used for the histogram has one less value than the data from the .root file. The time histogram is also one of the few ranged histograms, meaning that a minimum and maximum value can be selected to change the histogram. A number of bins can be specified to change how many bins the returned histogram will have.
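The consecutive differences and the ranged binning described above can be sketched as follows. Both helper names are hypothetical, and the binning is a plain Python stand-in rather than the package's implementation.

```python
def consecutive_differences(timestamps):
    """Return value 1 - value 0, value 2 - value 1, and so on."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def ranged_histogram(values, minimum, maximum, bins):
    """Count the values lying in [minimum, maximum] into equal-width bins."""
    width = (maximum - minimum) / bins
    counts = [0] * bins
    for value in values:
        if minimum <= value <= maximum:
            index = min(int((value - minimum) / width), bins - 1)
            counts[index] += 1
    return counts


timestamps = [0, 150, 400, 450, 900]
deltas = consecutive_differences(timestamps)
print(len(timestamps), len(deltas))  # 5 4
print(deltas)  # [150, 250, 50, 450]
print(ranged_histogram(deltas, 0, 400, bins=4))  # [1, 1, 1, 0]
```

The last difference (450) falls outside the selected range and is therefore not counted.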
This function (like all functions requiring the TOF data) needs two .root files to be used. If we do not have two files, or if the two files do not have the same length of data, the function will return None.
The get_tof_hist function returns the TOF histogram of the selected files. This is the time difference between two different files. The TOF histogram is a ranged histogram, meaning that a minimum and maximum value can be selected to change the histogram. A number of bins can be specified to change how many bins the returned histogram will have.
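A minimal sketch of the time difference behind the TOF histogram, including the None result mentioned in the note on TOF functions. The name `tof_sketch` is hypothetical, and pairing the events by position is an assumption suggested by the same-length requirement.

```python
def tof_sketch(start_times, stop_times):
    """Return stop - start for each pair of events, or None if unusable.

    Assumption: events are paired by position, which is why both inputs
    must hold the same number of values.
    """
    if start_times is None or stop_times is None:
        return None
    if len(start_times) != len(stop_times):
        return None
    return [stop - start for start, stop in zip(start_times, stop_times)]


print(tof_sketch([1_000, 5_000], [1_400, 5_250]))  # [400, 250]
print(tof_sketch([1_000, 5_000], [1_400]))  # None
```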
This function (like all functions requiring the TOF data) needs two .root files to be used. If we do not have two files, or if the two files do not have the same length of data, the function will return None.
The get_evse_hist function returns a 2D histogram of the stop energies versus start energies. This is quite helpful when looking for what start events coincide with what stop events. A colormap going from blue to red will show the number of counts in each bin. The more red it is the larger the number of counts.
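The 2D histograms count how many event pairs fall in each cell of a grid. The sketch below does this in plain Python for start and stop energies; the function name, the shared range for both axes and the sample values are hypothetical choices made for the example.

```python
def hist2d_sketch(x_values, y_values, bins, value_range):
    """Count (x, y) pairs into a bins x bins grid over the same range."""
    low, high = value_range
    width = (high - low) / bins
    counts = [[0] * bins for _ in range(bins)]
    for x, y in zip(x_values, y_values):
        if low <= x <= high and low <= y <= high:
            ix = min(int((x - low) / width), bins - 1)
            iy = min(int((y - low) / width), bins - 1)
            counts[ix][iy] += 1
    return counts


start_energies = [100, 120, 900, 880]
stop_energies = [300, 310, 700, 50]
print(hist2d_sketch(start_energies, stop_energies, bins=2, value_range=(0, 1000)))
# [[2, 0], [1, 1]]
```

In this example the first index follows the start energy and the second index follows the stop energy. A colormap then turns each count into a color.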
This function (like all functions requiring the TOF data) needs two .root files to be used. If we do not have two files, or if the two files do not have the same length of data, the function will return None.
The get_tofvse_hist function returns a 2D histogram of the TOF data versus the stop energies. A colormap going from blue to red will show the number of counts in each bin. The more red it is the larger the number of counts.
The get_psdvse_hist function returns a 2D histogram of the calculated PSD values versus the energy values of a single channel. This histogram is useful for particle type discrimination. A colormap going from blue to red will show the number of counts in each bin. The more red it is the larger the number of counts.
The get_mcs_graph function returns the number of events per second. This isn't considered a histogram even if it uses a histogram to find the number of counts per second.
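Counting events per second amounts to binning the timestamps into one-second bins. The sketch below assumes that the timestamps are integers in picoseconds; the function name and the sample data are hypothetical.

```python
PS_PER_SECOND = 10**12  # assumption: timestamps are given in picoseconds


def counts_per_second(timestamps_ps):
    """Return the number of events falling in each one-second bin."""
    if not timestamps_ps:
        return []
    n_seconds = max(timestamps_ps) // PS_PER_SECOND + 1
    counts = [0] * n_seconds
    for timestamp in timestamps_ps:
        counts[timestamp // PS_PER_SECOND] += 1
    return counts


# Events at 0.1 s, 0.4 s, 0.9 s, 1.2 s, 2.5 s and 2.6 s.
timestamps = [ms * 10**9 for ms in (100, 400, 900, 1200, 2500, 2600)]
print(counts_per_second(timestamps))  # [3, 1, 2]
```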