Merge pull request #37 from 4dn-dcic/2022_10_18_updates

small documentation updates
4dn-dcic · Jan 31, 2023 · e3f53ec
2 parents f8298ef + b1dc2bf
commit e3f53ec
Showing 14 changed files with 65 additions and 36 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,3 +1,4 @@
 
 .DS_Store
 *DS_Store
+.idea/*
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -6,14 +6,20 @@
 Changelog
 ---------
 
+0.1.2
+-----
+
+`PR 37 <https://github.com/4dn-dcic/fish_omics_format/pull/37>`_
+
+* Minor documentation updates.
 
 0.1.1
 -----
 
 `PR 36 <https://github.com/4dn-dcic/fish_omics_format/pull/36>`_
 `PR 29 <https://github.com/4dn-dcic/fish_omics_format/pull/29>`_
 
-* Add this ``CHANGELOG.rst``
+* Add this ``CHANGELOG.rst``.
 * Fix various errors in some examples.
 
 0.1.0

diff --git a/docs/source/bio.rst b/docs/source/bio.rst
@@ -33,7 +33,7 @@ This Table can be indexed mandatorily by Spot_ID.
 
 The header MUST contain a mandatory set of fields that describe any algorithm that was used to produce/process data in this table. In case more than one algorithm were used, please use the same set of fields for each of them.
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/bio_header.csv

diff --git a/docs/source/cell.rst b/docs/source/cell.rst
@@ -27,7 +27,7 @@ File Header
 - The first line in the header is always "##FOF-CT_version=vX.X"
 - The second line in the header is always "##Table_namespace=4dn_FOF-CT_cell"
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/cell_header.csv

diff --git a/docs/source/core.rst b/docs/source/core.rst
@@ -12,12 +12,13 @@ Tracing. This table is used to record and exchange the primary results of
 Chromatin Tracing experiments. The Table is organized around individual DNA
 bright Spots that are spatially linked together in a three-dimensional (3D)
 polymeric Trace using a 3D polymeric tracing algorithm. As a result, all Spots
-that share the same Trace_ID, by definition belong to the same Trace.
+that share the same ``Trace_ID``, by definition belong to the same Trace.
 
-Each row reports the X, Y, Z localization, and the Trace assignment
-(i.e., Trace_ID) of a FISH-omics bright Spot and corresponds to a specific
-genomic DNA target sequence identified by chromosome ID (Chrom), and by start
-(Chrom_Start) and end (Chrom_End) chromosome coordinates. In this table the reported X, Y and Z coordinates are assumed to result
+Each row reports the ``X``, ``Y``, ``Z`` localization, and the Trace assignment
+(i.e., ``Trace_ID``) of a FISH-omics bright Spot and corresponds to a specific
+genomic DNA target sequence identified by chromosome ID (``Chrom``), and by start
+(``Chrom_Start``) and end (``Chrom_End``) chromosome coordinates.
+In this table the reported ``X``, ``Y``, ``Z`` coordinates are assumed to result
 from post-processing and quality control procedures and therefore
 correspond to the final localization of the DNA target under study.
 
@@ -28,14 +29,23 @@ structures, cells or extra cellular structures (e.g., Tissue) are identified
 as part of this experiment, this table has to mandatorily include the ID of
 the Sub_Cellular, Cell or Extra Cellular Structure Region of Interest (ROI)
 each Spot/Trace is associated with.
+
 All other spot properties must be kept in the two additional tables
 :ref:`quality` and :ref:`bio`, indexed by Spot_ID and as described in the
 instructions for those tables.
+Additionally, in the case in which the final localization of DNA target results
+from combining multiple detection events (e.g., by combining localization events
+from different focal planes or times), the underlying raw data can be recorded
+in the corresponding :ref:`demultiplexing` table as described in the
+instructions of that table.
+
+Finally, ``Spot_ID`` identifiers are unique across the entire dataset, thus
+allowing to identify unambiguously a Spot in the :ref:`quality`, :ref:`bio` and
+:ref:`demultiplexing`.
 
-Finally, in the case in which the final localization of DNA target results from combining multiple
-detection events (e.g., by combining localization events from different focal planes or times),
-the underlying raw data can be recorded in the corresponding
-:ref:`demultiplexing` table as described in the instructions of that table.
+NOTE: Also RNA Spots have a ``Spot_ID`` (in the :ref:`rna`). Thus, when
+assigning an identifier to each Spot, make sure that this is unique not only
+within the :ref:`core`, but also in the :ref:`rna` if present.
 
 
 Example
@@ -69,7 +79,7 @@ The first columns are always: **Spot_ID**, **Trace_ID**, **X**, **Y**, **Z**,
 **Chrom**, **Chrom_Start**, **Chrom_End**. Additionally in case sub-cellular
 structures, cells or extra cellular structures are identified as part of this
 experiment, the subsequent columns must mandatorily be *Sub_Cell_ROI_ID*,
-*Cell_ID* or *Extra_Cell_ROI_ID*, respectively. 
+*Cell_ID* or *Extra_Cell_ROI_ID*, respectively.
 
 The order of the rows is at user's discretion.
 

diff --git a/docs/source/demultiplexing.rst b/docs/source/demultiplexing.rst
@@ -34,7 +34,7 @@ algorithm that was used to produce/process data in this table.
 In case more than one algorithm were used, please use the same set of fields
 for each of them.
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/demultiplexing_header.csv

diff --git a/docs/source/examples/mapping b/docs/source/examples/mapping
@@ -3,12 +3,12 @@
 ##XYZ_unit=micron
 ##intensity_unit=a.u.
 ##Sub_Cell_ROI_type=PML_body
-##ROI_boundaries_format=(X1,Y1, X2,Y2 Xn,Yn)
+##ROI_boundaries_format=(X1,Y1 X2,Y2 Xn,Yn)
 #^ROI_volume: the volume of this ROI expressed in micron^3.
 #^ROI_intensity: the integrated average signal intensity measured within the boundaries of this ROI, of the marker used to identify this nuclear feature.
 #additional_tables: 4dn_FOF-CT_core, 4dn_FOF-CT_rna, 4dn_FOF-CT_trace
 ##columns=(Sub_Cell_ROI_ID, ROI_boundaries, ROI_volume, ROI_intensity)
-1, (0,0 1,2 3,5)
-2, (0,0 2,3 4,6)
-3, (0,0 3,2 7,5)
-4, (0,0 9,2 9,5)
+1, (0,0 1,2 3,5), 100, 1.00
+2, (0,0 2,3 4,6), 48, 0.90
+3, (0,0 3,2 7,5), 63, 0.67
+4, (0,0 9,2 9,5), 88, 0.10
diff --git a/docs/source/examples/quality b/docs/source/examples/quality
@@ -27,7 +27,7 @@
 #^Z_Loc_Precision: lower and upper bound of 95% confidence interval on Z-position after fit
 #additional_tables: 4dn_FOF-CT_core, 4dn_FOF-CT_rna, 4dn_FOF-CT_trace, 4dn_FOF-CT_cell
 ##columns=(Spot_ID, Channel_ID, Peak_Intensity, Raw_X, Raw_Y, Raw_Z, X_Drift, Y_Drift, Z_Drift, X_Chromatic_Shift, Y_Chromatic_Shift, Z_Chromatic_Shift, X_Loc_Precision, Y_Loc_Precision, Z_Loc_Precision)
-1 1 100 1.1 1.05, 1.2 0.1 0.05, 0.2 0.2 0.2 0.2 0.01, 0.01, 0.01
-2 1 200 1.11, 1.055 1.22, 0.11, 0.055 0.22, 0.22, 0.22, 0.22, 0.012 0.012 0.012
-3 2 500 1.12, 1.054 1.21, 0.12, 0.054 0.21, 0.22, 0.22, 0.22, 0.012 0.012 0.012
-4 3 333 1.13, 1.15, 1.202 0.13, 0.15, 0.202 0.23, 0.23, 0.23, 0.013 0.013 0.013
+1, 1, 100, 1.1, 1.05, 1.2, 0.1, 0.05, 0.2, 0.2, 0.2, 0.2, 0.01, 0.01, 0.01
+2, 1, 200, 1.11, 1.055, 1.22, 0.11, 0.055, 0.22, 0.22, 0.22, 0.22, 0.012, 0.012, 0.012
+3, 2, 500, 1.12, 1.054, 1.21, 0.12, 0.054, 0.21, 0.22, 0.22, 0.22, 0.012, 0.012, 0.012
+4, 3, 333, 1.13, 1.15, 1.202, 0.13, 0.15, 0.202, 0.23, 0.23, 0.23, 0.013, 0.013, 0.013
diff --git a/docs/source/extracell.rst b/docs/source/extracell.rst
@@ -30,7 +30,7 @@ File Header
 - The first line in the header is always "##FOF-CT_version=vX.X"
 - The second line in the header is always "##Table_namespace=4dn_FOF-CT_extracell"
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/extracell_header.csv

diff --git a/docs/source/mapping.rst b/docs/source/mapping.rst
@@ -11,7 +11,8 @@ This table is used to provide the boundaries of Cells and other ROIs
 identified as part of this experiment, and it is required in case Cell and
 other ROI segmentation data were collected as part of this experiment.
 
-This table is mandatory in case a :ref:`subcell`, :ref:`cell`, and/or :ref:`extracell` tables are deposited with this submission.
+This table is mandatory in case a :ref:`subcell`, :ref:`cell`, and/or
+:ref:`extracell` tables are deposited with this submission.
 
 The table is organized on a Cell or ROI basis via a Cell or ROI ID and
 provides the Cell or ROI boundaries in global coordinates as specified by
@@ -52,7 +53,7 @@ File Header
 - The first line in the header is always "##FOF-CT_version=vX.X"
 - The second line in the header is always "##Table_namespace=4dn_FOF-CT_mapping"
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/mapping_header.csv

diff --git a/docs/source/quality.rst b/docs/source/quality.rst
@@ -39,7 +39,7 @@ algorithm that was used to produce/process data in this table.
 In case more than one algorithm were used, please use the same set of fields
 for each of them.
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/quality_header.csv

diff --git a/docs/source/rna.rst b/docs/source/rna.rst
@@ -23,14 +23,25 @@ be 6 (or 7) data columns. These are required. All other data columns are
 optional.
 
 In this table the reported X, Y and Z coordinates are assumed to result
-from post-processing and quality control procedures performed on primary localization events and therefore
-correspond to what is considered the best-bet location of the RNA molecule under study.
+from post-processing and quality control procedures performed on primary
+localization events and therefore correspond to what is considered the best-bet
+location of the RNA molecule under study.
 
-In the case of multiplexed FISH experiments (i.e., `MERFISH <https://doi.org/10.1073/pnas.1912459116>`_) in which the
+In the case of multiplexed FISH experiments (i.e.,
+`MERFISH <https://doi.org/10.1073/pnas.1912459116>`_) in which the
 final location of RNA molecule results from combining multiple
-detection events (e.g., by combining individual Localization events detected in separate planes or images),
-the underlying raw data can be recorded in the corresponding
-:ref:`demultiplexing` as described in the instructions of that table.
+detection events (e.g., by combining individual Localization events detected in
+separate planes or images), the underlying raw data can be recorded in the
+corresponding :ref:`demultiplexing` as described in the instructions of that table.
+
+``Spot_ID`` identifiers are unique across the entire dataset, thus
+allowing to identify unambiguously a Spot in the :ref:`quality`, :ref:`bio` and
+:ref:`demultiplexing`.
+
+NOTE: Also DNA Spots have a ``Spot_ID`` (in the :ref:`core`). Thus, when
+assigning an identifier to each Spot, make sure that this is unique not only
+within the :ref:`rna`, but also in the :ref:`core`.
+
 
 Example
 -------
@@ -47,7 +58,7 @@ algorithm(s) that were used to identify and localize bright Spots.
 In case more than one algorithm were used, please use the same set of fields
 for each of them.
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/rna_header.csv

diff --git a/docs/source/subcell.rst b/docs/source/subcell.rst
@@ -29,7 +29,7 @@ File Header
 - The first line in the header is always "##FOF-CT_version=vX.X"
 - The second line in the header is always "##Table_namespace=4dn_FOF-CT_subcell"
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/subcell_header.csv

diff --git a/docs/source/trace.rst b/docs/source/trace.rst
@@ -21,7 +21,7 @@ File Header
 - The first line in the header is always "##FOF-CT_version=vX.X"
 - The second line in the header is always "##Table_namespace=4dn_FOF-CT_trace"
 
-The header should include a detailed description of each optional columns used.
+The header MUST include a detailed description of each optional columns used.
 
 .. csv-table::
   :file: tables/trace_header.csv
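
The notes added to ``core.rst`` and ``rna.rst`` in this change require ``Spot_ID`` values to be unique across both tables. A minimal sketch of how a submitter might check this before depositing is given below. It is not part of the format or of any 4DN tooling: the function names ``read_spot_ids`` and ``shared_spot_ids`` are hypothetical, the two sample tables are made up and abbreviated (the RNA table in particular omits required columns), and the parser assumes comma-separated rows preceded by a ``##columns=(...)`` header line, as in the example files touched by this diff.

```python
def read_spot_ids(text):
    """Collect Spot_ID values from a FOF-CT-style table given as a string."""
    columns = None
    spot_ids = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("##columns="):
            # Column names are listed as ##columns=(A, B, C)
            inner = line[len("##columns="):].strip().strip("()")
            columns = [name.strip() for name in inner.split(",")]
            continue
        if line.startswith("#"):
            continue  # any other header line
        if columns is None:
            raise ValueError("data row found before the ##columns= line")
        values = [value.strip() for value in line.split(",")]
        spot_ids.append(values[columns.index("Spot_ID")])
    return spot_ids


def shared_spot_ids(core_text, rna_text):
    """Return the Spot_ID values that occur in both tables (should be empty)."""
    return sorted(set(read_spot_ids(core_text)) & set(read_spot_ids(rna_text)))


# Hypothetical, abbreviated tables used only to exercise the check.
CORE = """##FOF-CT_version=vX.X
##Table_namespace=4dn_FOF-CT_core
##columns=(Spot_ID, Trace_ID, X, Y, Z, Chrom, Chrom_Start, Chrom_End)
1, 1, 1.1, 1.05, 1.2, chr1, 100, 200
2, 1, 1.11, 1.055, 1.22, chr1, 200, 300
"""

RNA = """##FOF-CT_version=vX.X
##Table_namespace=4dn_FOF-CT_rna
##columns=(Spot_ID, X, Y, Z)
2, 2.1, 2.05, 2.2
3, 2.11, 2.055, 2.22
"""

duplicates = shared_spot_ids(CORE, RNA)
if duplicates:
    print("Spot_ID values reused across core and rna tables:", duplicates)
```

In this made-up data the identifier ``2`` appears in both tables, so the check reports it. Identifiers are compared as strings, which avoids assuming that ``Spot_ID`` is always numeric.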