Home

1 Run

From the bin directory, run ./RealTrace [-options] with the following options:

-h, --help                         this help message
-i, --infile                       (required) input data file
-b, --parameter_bounds             (required) file(s) setting the type, step, bounds of the parameters
-c, --csv_config                   file that sets the columns that will be used from the input file
-o, --outdir                       specify output directory and do not use default
-t, --tolerance_maximization       absolute tolerance of maximization between optimization steps, default: 1e-10
-r, --rel_tolerance_joints         relative tolerance of joint calculation: default 1e-10
-space, --search_space             search parameter space in {'log'|'linear'} space, default: 'log'
-noise, --noise_model              measurement noise of fp content {'const'|'scaled'} default: 'const'
-div, --cell_division              cell division model {'gauss'|'binomial'} default: 'gauss'
-m, --maximize                     run maximization
-p, --predict                      run prediction
-j, --joints                       run calculation of joint probabilities

Example: ./RealTrace -c csv_config.txt -b parameters.txt -i data/example.csv -o out/ -l 1 -t 1e-10 -m -p

2.1 Required arguments

infile sets the input file that contains the data, e.g. as provided by MOMA (see 2.1.1)
parameter_bounds sets the file that defines the parameter space (see 2.1.2)

2.1.1 Input file

The input file is assumed to fulfill the following:

The data points of a cell appear as consecutive rows and are in the correct order with respect to time.
The data set has to include all columns that are set via the csv_config file, i.e. time_col, length_col, fp_col.
The cells can be uniquely identified via the tags provided via parent_tags and cell_tags and each mother cell has at most 2 daughter cells. If that is not the case, the parent_tags and cell_tags are not sufficient and a warning will be printed.
In order to estimate the initial covariance matrix, the data set needs to contain at least (!) 2 cells.
An optional column may be added to use segments (see 2.2).
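The assumptions above can be checked before running the tool. The following is a minimal sketch, not part of RealTrace: the function name `check_input_rows` is hypothetical, rows are assumed to be dictionaries as produced by `csv.DictReader`, time is assumed to increase strictly within a cell, and a row with an empty parent tag is assumed to belong to a cell without a known parent.

```python
from collections import defaultdict


def check_input_rows(rows, time_col="time", cell_tags=("cell_id",),
                     parent_tags=("parent_id",)):
    """Return a list of problems found in `rows` (a list of dicts)."""
    problems = []
    seen = set()                  # cells encountered so far
    last_time = {}                # cell -> time of its most recent row
    daughters = defaultdict(set)  # parent -> set of daughter cells
    previous = None
    for row in rows:
        cell = tuple(row[c] for c in cell_tags)
        parent = tuple(row[c] for c in parent_tags)
        t = float(row[time_col])
        if cell != previous:
            # A cell that shows up again after another cell is not consecutive.
            if cell in seen:
                problems.append(f"rows of cell {cell} are not consecutive")
            seen.add(cell)
            previous = cell
        elif t <= last_time[cell]:
            # Assumption: time must increase strictly within a cell.
            problems.append(f"time does not increase within cell {cell}")
        last_time[cell] = t
        # Assumption: an empty parent tag marks a cell without known parent.
        if all(parent):
            daughters[parent].add(cell)
    for parent, cells in daughters.items():
        if len(cells) > 2:
            problems.append(f"parent {parent} has more than 2 daughter cells")
    if len(seen) < 2:
        problems.append("data set contains fewer than 2 cells")
    return problems
```

An empty result means none of these checks failed; it does not guarantee that RealTrace accepts the file.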

2.1.2 Parameter file

How the different parameters are treated during the likelihood maximization is defined by the following syntax:

free_parameter = init, step
bound_parameter = init, step, lower, upper
fixed_parameter = init

An example file can look like this:

mean_lambda = 0.01, 1e-3
gamma_lambda = 0.01, 1e-3, 1e-4, 0.05
var_lambda = 1e-07

mean_q = 10, 1e-1
gamma_q = 0.01, 1e-3, 1e-4, 0.05
var_q = 1, 1e-2

beta = 5e-2

var_x = 1e-3, 1e-5
var_g = 1, 1e-3

var_dx = 1e-4, 1e-5
var_dg = 1, 1e-2

ALL parameters are restricted to positive numbers by default, which avoids unphysical or meaningless parameter ranges. However, this can be overridden by setting bounds explicitly.

During the maximization, step is used as the initial step size. From the NLopt documentation: "For derivative-free local-optimization algorithms, the optimizer must somehow decide on some initial step size to perturb x by when it begins the optimization. This step size should be big enough that the value of the objective changes significantly, but not too big if you want to find the local optimum nearest to x."
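To illustrate the syntax, the sketch below classifies each line of a parameter file by the number of values it carries (1: fixed, 2: free, 4: bound). It is not the parser used by RealTrace; the function names are hypothetical, and it assumes the file contains only blank lines and lines of the form `name = values`.

```python
def parse_parameter_line(line):
    """Split 'name = init[, step[, lower, upper]]' into a name and a spec."""
    name, _, values = line.partition("=")
    numbers = [float(v) for v in values.split(",")]
    kinds = {1: "fixed", 2: "free", 4: "bound"}
    if len(numbers) not in kinds:
        raise ValueError(f"expected 1, 2 or 4 values: {line!r}")
    keys = ("init", "step", "lower", "upper")
    spec = {"kind": kinds[len(numbers)]}
    spec.update(zip(keys, numbers))
    return name.strip(), spec


def parse_parameter_file(text):
    """Parse the content of a parameter file into {name: spec}."""
    params = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines separate groups of parameters
        name, spec = parse_parameter_line(line)
        params[name] = spec
    return params
```

For the example file above, `mean_lambda` would be classified as free, `gamma_lambda` as bound, and `var_lambda` as fixed.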

2.2 Using segments

To analyze data sets in which different data points need to be fitted by different sets of underlying parameters, segment indices can be used. To do so, specify a segment_col in the csv_config file. This column should contain, for each data point, the index of the segment it belongs to. The segment indices are required to be consecutive and to start at index 0.
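The requirement on the segment indices can be checked with a few lines of code. This is a sketch under one assumption about the wording above: "consecutive" is read as "the set of indices has no gaps", i.e. the indices are exactly 0, 1, ..., k-1. The function name `count_segments` is hypothetical.

```python
def count_segments(segment_column):
    """Return the number of segments, or raise if the indices are invalid.

    `segment_column` is an iterable with the entries of the segment column,
    given as integers or as strings read from the input file.
    """
    present = sorted({int(value) for value in segment_column})
    # Valid index sets are exactly [0], [0, 1], [0, 1, 2], ...
    if not present or present != list(range(len(present))):
        raise ValueError(f"segment indices must be 0..k-1, got {present}")
    return len(present)
```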

The likelihood maximization that determines the parameter estimates is run independently for each segment. That means there is no difference between running different segments in separate runs or as part of the same data set. The same holds for 1D scans. However, the predictions as well as the joint probabilities that are used for the correlation functions are calculated by iterating through the entire data set. For that, the following scheme is used. Note that the prior calculations to go from time point 2 to 3 and vice versa both take the parameters of the 0th segment.

For each segment in the data set, one parameter file is required. The files must be passed in the order of the segment indices. For example:

./RealTrace -b parametersA.txt parametersB.txt ...

will use the parameters in the file parametersA.txt for the segment with index 0, the parameters in the file parametersB.txt for the segment with index 1, and so on.
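When the call is assembled by a script, the ordering rule can be enforced by sorting the parameter files by segment index. The sketch below only builds the argument list and does not run anything; `build_command` and its arguments are hypothetical names, and only options listed in section 1 are used.

```python
def build_command(infile, parameter_files, csv_config=None, outdir=None,
                  flags=("-m", "-p")):
    """Build the argument list for a RealTrace call.

    `parameter_files` maps each segment index to its parameter file.
    """
    # Pass the parameter files in the order of the segment indices.
    ordered = [parameter_files[i] for i in sorted(parameter_files)]
    cmd = ["./RealTrace", "-i", infile, "-b", *ordered]
    if csv_config is not None:
        cmd += ["-c", csv_config]
    if outdir is not None:
        cmd += ["-o", outdir]
    cmd += list(flags)
    return cmd
```

The resulting list can be passed to `subprocess.run` from within the bin directory.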

2.2.3 Optional arguments

(Defaults are in brackets.)

csv_config sets the file that contains information on which columns will be used from the input file (see 2.3.1)
tolerance_maximization (1e-10) sets the stopping criterion of the maximization: the optimization stops when a step changes the function value by less than the tolerance. Very low tolerances can lead to rounding issues; in that case, the last valid step is taken and a warning is printed to stderr.
rel_tolerance_joints (1e-10) sets the stopping criterion for the joint calculation. The calculation is stopped when the cross-covariances between the two time points are smaller than the product of the corresponding means times the set tolerance: $\frac{\text{Cov}(z_{n+m}, z_n)_{i,j}}{\langle z_{n+m}\rangle_i \langle z_n\rangle_j} < \text{tolerance}$
outdir overrides the default output directory, which is dir/example_out/ for the infile dir/example.csv
search_space (log) sets the search space of the parameters to be either in log space or linear space. The parameter file does not need to be changed as everything is done internally.
noise_model (scaled) defines how the measurement noise depends on the fluorescence protein (FP) content. const means that the measurement noise is constant with variance var_g. scaled means that the variance of the measurement noise scales linearly with the FP content. In this case var_g is the prefactor of the scaling.
cell_division (binomial) defines the model for cell division. binomial splits the FP content according to the cell sizes of the daughter cells and binomial sampling. In this case, the parameter var_dg is the conversion factor between the FP input and the physical number of independent molecules that can be distributed across cells. gauss refers to a model where the FP contents of the daughter cells are drawn from a Gaussian with variance var_dg centered around half of the mother cell FP content.
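Two of the options above translate directly into short formulas. The following sketch illustrates them and is not taken from the RealTrace source: the function names are hypothetical, the joint criterion is assumed to be required for all index pairs (i, j), and it is written in the product form used in the text, which matches the ratio form only for positive means.

```python
def measurement_variance(noise_model, var_g, fp_content):
    """Variance of the FP measurement noise for the two noise models."""
    if noise_model == "const":
        return var_g               # constant variance
    if noise_model == "scaled":
        return var_g * fp_content  # var_g is the prefactor of the scaling
    raise ValueError(f"unknown noise model: {noise_model!r}")


def joints_converged(cross_cov, mean_later, mean_earlier, tolerance=1e-10):
    """Check Cov(z_{n+m}, z_n)_{i,j} < tolerance * <z_{n+m}>_i * <z_n>_j.

    `cross_cov` is a nested list indexed as cross_cov[i][j].
    Assumption: the criterion has to hold for every pair (i, j).
    """
    return all(
        cross_cov[i][j] < tolerance * mean_later[i] * mean_earlier[j]
        for i in range(len(mean_later))
        for j in range(len(mean_earlier))
    )
```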

2.3.1 Csv_config file

Example:

time_col = time_min
rescale_time = 60
length_col = length_um
fp_col = GFP
cell_tags = date, cell_id
parent_tags = date, parent_id

The following settings define how the input file will be interpreted. (Defaults are in brackets.)

time_col (time): column from which the time is read
rescale_time (1): the factor by which time will be divided at the start, thus changing the time unit (e.g. rescale_time=60 may change the time unit from sec to min)
length_col (length): column from which the length of the cell is read
length_islog (false): indicates if the cell length in the data file is in logscale (true) or not (false)
fp_col (gfp): column from which the fluorescence protein content is read
delm (,): delimiter between columns, probably ',' or ';'
segment_col (): column from which the segment index is read. Not setting segment_col in the file indicates that segment indices will not be used.
filter_col (): column from which the filter will be read. To include a data point, set the entry in this column to True, true, TRUE or 1. To exclude a data point, set the entry in this column to False, false, FALSE or 0. Not setting filter_col in the file indicates that the input file will not be filtered.
cell_tags (cell_id): columns that will make up the unique cell id, separated by ','
parent_tags (parent_id): columns that will make up the unique cell id of the parent cell, separated by ','
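The settings and defaults listed above can be mirrored in a small reader, for example to pre-process a data file in the same way. This is a sketch and not the RealTrace implementation: the names `parse_csv_config`, `keep_row` and `rescaled_time` are hypothetical, and raising an error for filter entries outside the listed values is an assumption.

```python
DEFAULTS = {
    "time_col": "time", "rescale_time": "1", "length_col": "length",
    "length_islog": "false", "fp_col": "gfp", "delm": ",",
    "segment_col": "", "filter_col": "",
    "cell_tags": "cell_id", "parent_tags": "parent_id",
}


def parse_csv_config(text):
    """Parse 'key = value' lines and fill in the documented defaults."""
    config = dict(DEFAULTS)
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    config["rescale_time"] = float(config["rescale_time"])
    config["length_islog"] = config["length_islog"].lower() == "true"
    for key in ("cell_tags", "parent_tags"):
        config[key] = [tag.strip() for tag in config[key].split(",")]
    return config


def keep_row(row, config):
    """Apply filter_col to a row (a dict); keep all rows if it is unset."""
    column = config["filter_col"]
    if not column:
        return True
    if row[column] in ("True", "true", "TRUE", "1"):
        return True
    if row[column] in ("False", "false", "FALSE", "0"):
        return False
    # Assumption: any other entry is treated as an error.
    raise ValueError(f"unexpected filter entry: {row[column]!r}")


def rescaled_time(row, config):
    """Time of a row after dividing by rescale_time."""
    return float(row[config["time_col"]]) / config["rescale_time"]
```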
