# Converting raw .txt files from LabVIEW to FRETBursts-compatible HDF5 files

This notebook will convert multiple TXT files into a single combined HDF5 file, useful if you want to analyse multiple acquisitions as if they were all part of the same run.

In [1]:
import csv
import os

import numpy as np
import phconvert as phc

print('phconvert version:', phc.__version__)

# 1.  Read in data

Start by listing the input files and the name of the combined output file, then read each input file in.

In [2]:
T = 180000000000  # length of EACH experiment, in the same units as the timestamps
filenames = ["definitiveset/1c1.txt", 
            "definitiveset/1c2.txt"]
savename = 'definitiveset/1cx.hdf5'
print(filenames)


['definitiveset/1c1.txt', 'definitiveset/1c2.txt']
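
The loading cell that follows reads each file with `csv.reader` using a tab delimiter, taking the first column as timestamps and the second as detector IDs. As a minimal sketch of that layout, the snippet below parses a small in-memory sample. The sample values are made up for illustration, and `parse_columns` is a hypothetical helper, not part of phconvert.

```python
import csv
import io

# Made-up sample in the assumed layout: <timestamp><TAB><detector ID> per line.
sample_text = "7295\t0\n76304\t1\n113235\t0\n"

def parse_columns(text_file):
    """Return (timestamps, detectors) as lists of ints from a tab-separated file."""
    timestamps, detectors = [], []
    for row in csv.reader(text_file, delimiter="\t"):
        if not row:  # skip blank lines
            continue
        timestamps.append(int(row[0]))
        detectors.append(int(row[1]))
    return timestamps, detectors

sample_timestamps, sample_detectors = parse_columns(io.StringIO(sample_text))
print(sample_timestamps)
print(sample_detectors)
```

If a real file has a header row or extra columns, this sketch would need adjusting before the values are converted to integers.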


In [35]:
detectors = np.empty([0], dtype=np.uint8)
timestamps = np.empty([0], dtype=np.int64)

for files, file in enumerate(filenames):
    # Each file is tab-separated: column 0 holds timestamps, column 1 detector IDs.
    # Read the file once and split it into columns.
    with open(file) as inf:
        reader = csv.reader(inf, delimiter="\t")
        columns = list(zip(*reader))

    ftimestamps = np.asarray(columns[0]).astype(np.int64)
    fdetectors = np.asarray(columns[1]).astype(np.uint8)

    # The following code is necessary because sometimes there are a couple of photons
    # counted after the stated end of the experiment. I don't know why; my degree
    # was in biology.
    # The search runs in reverse because the late photons are at the end, and
    # starting at the beginning would mean going through several million values.
    bleed = 0
    for x in reversed(ftimestamps):
        if x > T:
            bleed += 1
        else:
            break
    ftimestamps = ftimestamps[:ftimestamps.size - bleed]
    fdetectors = fdetectors[:fdetectors.size - bleed]

    # Shift the timestamps by the total length of the experiments before this one
    ftimestamps = ftimestamps + (files * T)
    print(ftimestamps)
    timestamps = np.concatenate([timestamps, ftimestamps])
    detectors = np.concatenate([detectors, fdetectors])

print("done")

[        7295        76304       113235 ... 179991641539 179991642830
 179991651103]
[180000301012 180000362888 180000543792 ... 359990845480 359990853519
 359990933468]
done
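
The trimming and shifting done in the loop above can also be written without an explicit Python loop over photons. The following is a sketch on toy data, assuming the timestamps within each file are sorted in increasing order; `trim_and_shift` is a hypothetical helper and the numbers are invented.

```python
import numpy as np

def trim_and_shift(file_timestamps, file_detectors, duration, file_index):
    """Drop photons stamped after `duration`, then shift by whole acquisitions."""
    # Timestamps are assumed sorted, so searchsorted locates the cut point directly.
    n_keep = np.searchsorted(file_timestamps, duration, side="right")
    shifted = file_timestamps[:n_keep] + file_index * duration
    return shifted, file_detectors[:n_keep]

duration = 100  # toy acquisition length, same units as the timestamps
runs = [
    (np.array([5, 40, 99, 103], dtype=np.int64), np.array([0, 1, 0, 1], dtype=np.uint8)),
    (np.array([2, 50, 100, 101], dtype=np.int64), np.array([1, 1, 0, 0], dtype=np.uint8)),
]

parts = [trim_and_shift(t, d, duration, i) for i, (t, d) in enumerate(runs)]
all_timestamps = np.concatenate([p[0] for p in parts])
all_detectors = np.concatenate([p[1] for p in parts])
print(all_timestamps)  # [  5  40  99 102 150 200]
print(all_detectors)   # [0 1 0 1 1 0]
```

As in the loop above, a photon stamped exactly at the end of the acquisition is kept and only later ones are dropped.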


# 2. Create some metadata

Fill in the fields below, keeping each value between quotation marks.

In [36]:
description = 'blah blah blah meta data'

author = 'Author Name'
author_affiliation = 'Name of Research Institution'

sample_name = 'describe the sample here'
buffer_name = 'describe the buffer here'
dye_names = 'Cy3B, ATTO647N'   # Comma separates names of fluorophores

# 3. Create Photon-HDF5 data structure

In this section we create all the mandatory and non-mandatory groups.
Not all of them are required to save a valid Photon-HDF5 file
(see the example in section 4).

## 3.1 `photon_data` group

Contains arrays of photon-data: timestamps, detectors, nanotimes, etc...

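Our timestamps are in nanoseconds, so `timestamps_unit`, which is given in seconds, is `1e-9`. (Note that `10e-9` would mean 10 ns per tick.)

*See [photon_data group reference](http://photon-hdf5.readthedocs.org/en/latest/phdata.html#photon-data-group)*

In [37]:
timestamps_unit = 1e-9  # seconds per timestamp tick (1 ns)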
photon_data = dict(
    timestamps=timestamps,
    detectors=detectors,
    timestamps_specs={'timestamps_unit': timestamps_unit})
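
Since `timestamps_unit` is the length of one timestamp tick in seconds, multiplying a raw timestamp by it gives the time in seconds. The snippet below is a small sketch of that arithmetic, assuming 1 ns ticks as stated in the text; the variable names are illustrative only.

```python
# Assumption: one timestamp tick is 1 ns, as stated in the text.
tick_seconds = 1e-9

# A few raw values taken from the output printed in section 1.
raw_timestamps = [7295, 76304, 179991651103]
seconds = [t * tick_seconds for t in raw_timestamps]
print(seconds)

# The per-file length T used in section 1, converted the same way.
duration_ticks = 180000000000
duration_seconds = duration_ticks * tick_seconds
print(duration_seconds)  # about 180 s under the 1 ns assumption
```

A quick check like this helps catch a unit that is off by a factor of ten, since the acquisition length in seconds should match what was set on the instrument.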

## 3.2 `setup` group

The `/setup` group contains information about the measurement setup. 

*See [setup group reference](http://photon-hdf5.readthedocs.org/en/latest/phdata.html#setup-group).*

In [38]:
setup = dict(
    ## Mandatory fields
    num_pixels = 2,                   # using 2 detectors
    num_spots = 1,                    # a single confocal excitation
    num_spectral_ch = 2,              # donor and acceptor detection 
    num_polarization_ch = 1,          # no polarization selection 
    num_split_ch = 1,                 # no beam splitter
    modulated_excitation = True,      # us-ALEX: excitation is modulated in wavelength
    excitation_alternated = [True, True],  # both lasers are alternated
    lifetime = False,                 # no TCSPC in detection
    
    ## Optional fields
    excitation_wavelengths = [532e-9, 635e-9],  # List of excitation wavelengths (m)
    excitation_cw = [True, True],               # List of booleans, one per wavelength: True if CW
    detection_wavelengths = [580e-9, 640e-9],   # Nominal center wavelength (m)
                                                # for each detection channel
)

## 3.3 `provenance` group

Non-mandatory group containing information about the original file
prior to Photon-HDF5 conversion. If some information is not
available, the corresponding field may be omitted.

If `filename` is the path to an original file that exists on disk, information about that file is stored in the HDF5 file. With a placeholder name, as used here, saving prints a message that the file info will not be added.

*See [provenance group documentation](http://photon-hdf5.readthedocs.org/en/latest/phdata.html#provenance-group).*

In [39]:
provenance = dict(
    filename='original_data_file.dat', 
    software='Acquisition Software Name')

## 3.4 `identity` group

Non-mandatory group containing information about
this specific Photon-HDF5 file.

*See [identity group documentation](http://photon-hdf5.readthedocs.org/en/latest/phdata.html#identity-group).*

In [40]:
identity = dict(
    author=author,
    author_affiliation=author_affiliation)

## 3.5 `measurement_specs` group

The optional /photon_data/measurement_specs group contains 
additional information allowing unambiguous interpretation 
of the data for each specific type of measurement.

*See [measurement_specs group documentation](http://photon-hdf5.readthedocs.org/en/latest/phdata.html#measurement-specs).*

In [41]:
measurement_specs = dict(
    measurement_type = 'smFRET-usALEX',
    detectors_specs = {'spectral_ch1': [0],  # list of donor's detector IDs
                       'spectral_ch2': [1]},  # list of acceptor's detector IDs,
    alex_period = 10000,  # alternation period, in timestamp units
    )
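
To illustrate what `alex_period` means for the data, the position of a photon inside its alternation cycle can be obtained by taking the timestamp modulo the period. This is only a sketch of that arithmetic on three timestamps from the section 1 output; it is not how any particular analysis package is implemented, and no alternation offset is taken into account.

```python
import numpy as np

alex_period = 10000  # same value as in measurement_specs, in timestamp units
example_timestamps = np.array([7295, 76304, 113235], dtype=np.int64)

# Position of each photon inside its alternation cycle, in timestamp units.
phase = example_timestamps % alex_period
print(phase)  # [7295 6304 3235]
```

A histogram of these positions over all photons is a common way to see which part of the cycle belongs to each laser.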

# 4. Save Photon-HDF5 files

To save a file we need to join together the root fields and groups
in a single dictionary.

In [42]:
photon_data['measurement_specs'] = measurement_specs

data = dict(
    description=description,
    photon_data = photon_data,
    setup=setup,
    identity=identity,
    provenance=provenance
)
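
Before saving, it can be worth running a few consistency checks on the merged arrays. The function below is a hypothetical helper (not part of phconvert) sketching three checks that follow from how the data were built: equal array lengths, non-decreasing timestamps after merging, and detector IDs that all appear in `detectors_specs`. It is shown on toy data.

```python
import numpy as np

def check_photon_data(photon_data):
    """Return a list of problems found in a photon_data dict (empty if none)."""
    problems = []
    timestamps = np.asarray(photon_data["timestamps"])
    detectors = np.asarray(photon_data["detectors"])
    if timestamps.size != detectors.size:
        problems.append("timestamps and detectors differ in length")
    if np.any(np.diff(timestamps) < 0):
        problems.append("timestamps are not sorted")
    specs = photon_data.get("measurement_specs", {}).get("detectors_specs", {})
    known_ids = {i for ids in specs.values() for i in ids}
    unknown = set(np.unique(detectors).tolist()) - known_ids
    if specs and unknown:
        problems.append(f"detector IDs not listed in detectors_specs: {sorted(unknown)}")
    return problems

# Toy data with one deliberately unexpected detector ID (2).
toy = dict(
    timestamps=np.array([5, 40, 99, 102], dtype=np.int64),
    detectors=np.array([0, 1, 0, 2], dtype=np.uint8),
    measurement_specs=dict(detectors_specs={"spectral_ch1": [0], "spectral_ch2": [1]}),
)
print(check_photon_data(toy))
```

On the real data this would be called as `check_photon_data(photon_data)`, and an empty list means none of the three checks failed.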

In [43]:
phc.hdf5.save_photon_hdf5(data, h5_fname=savename, overwrite=True)

         File info in provenance group will not be added.

Saving: definitiveset/1cx.hdf5


> **NOTE:** a user of this file can correctly interpret the data by
> reading that the measurement type is 'smFRET-usALEX' (meaning smFRET with
> microsecond alternating laser excitation and 2-color detection), the
> alternation period (`alex_period`), and the IDs of the donor and acceptor detectors
> (from `detectors_specs/spectral_ch1` and `spectral_ch2` respectively).