documentation updates

jaredgk · Jul 15, 2020 · 165e282
1 parent 78ddaae
commit 165e282

Showing 7 changed files with 49 additions and 30 deletions.
diff --git a/...urce/PPP_pages/Functions/stat_sampler.rst → ...urce/PPP_pages/Utilities/stat_sampler.rst b/...urce/PPP_pages/Functions/stat_sampler.rst → ...urce/PPP_pages/Utilities/stat_sampler.rst
diff --git a/docs/source/PPP_pages/examples.rst b/docs/source/PPP_pages/examples.rst
@@ -12,16 +12,21 @@ PPP functions may be called at the command-line as shown in this example:
         
         vcf_filter.py --vcf examples/files/merged_chr1_10000.vcf.gz --filter-only-biallelic --out-format bcf
 
-Details on the usage of a specific function may be found within the *Example usage* section of the function in question. In addition, all example files used may be found within **examples/files** directory.  
-
+Details on the usage of each specific function may be found within the *Example usage* section of the function’s documentation. In addition, all files shown within these examples may be found within the **examples/files** directory of the PPP repository.
 
 ##########################
 Jupyter Notebook Pipelines
 ##########################
 
-All PPP functions may also be used within a `Jupyter Notebook <https://jupyter.org/>`_. We have included some examples below:
+All PPP functions may also be used within a `Jupyter Notebook <https://jupyter.org/>`_. We have included two example notebooks.
 
 .. toctree::
    :maxdepth: 1
 
-   jupyter/example_pipeline_pan.ipynb
+   jupyter/example_pipeline_pan.ipynb
+
+.. only:: html
+
+   The Jupyter Notebooks may also be downloaded:
+
+   * :download:`Example Jupyter Pipeline <jupyter/example_pipeline_pan.ipynb>`.
diff --git a/docs/source/PPP_pages/functions.rst b/docs/source/PPP_pages/functions.rst
@@ -1,6 +1,6 @@
-=============
-PPP Functions
-=============
+==============
+Core Functions
+==============
 
 The functions below were developed to perform many of the core operations typically used in population genetic analyses. Each of these functions were designed to perform a single operation (i.e. filtering, phasing, etc.).
 
@@ -9,8 +9,8 @@ The functions below were developed to perform many of the core operations typica
 
    Functions/vcf_filter
    Functions/vcf_calc
-   Functions/stat_sampler
    Functions/loci_filter
    Functions/vcf_split
    Functions/vcf_phase
-   Functions/vcf_four_gamete
+   Functions/vcf_four_gamete
+
diff --git a/docs/source/PPP_pages/intro.rst b/docs/source/PPP_pages/intro.rst
@@ -0,0 +1,29 @@
+============
+Introduction
+============
+
+The Popgen Pipeline Platform (PPP) was written using the Python programming language and designed to operate using Python 3.7. In comparison to a fixed pipeline, the PPP was designed as a collection of modular functions that may be combined to generate a wide variety of analyses and pipelines.
+
+For simplicity, PPP functions are separated into four categories:
+# **Core functions**: Frequently used methods and procedures in population genomic pipelines (e.g. phasing, filtering, four-gamete test, etc.). 
+# **Input file generators**: Generators for creating the necessary input for population genomic analyses (e.g. IMa3, TreeMix, G-PhoCS, etc.)
+# **Analyses**: Common population genomic analyses (e.g. isolation and migration, admixture, linkage disequilibrium, etc.)
+# **Utilities**: Simple file-specific procedures often required in population genomic pipelines
+
+For details on specific functions, please see the documentation for each section.
+
+.. image:: PPP_assets/PPP_Pipeline_Figure.png
+   :scale: 50 %
+   :align: center
+
+.. centered::
+   Figure 1: Structure of the PPP
+
+
+##################
+Creating Pipelines
+##################
+
+Most PPP-based pipelines are expected to consist primarily of core functions. To simplify development, all core functions were designed to operate using VCF-based files. The VCF format was selected due to the frequent support for the format among publicly available datasets and population genomics software. At present, pipelines may be generated in one of two ways: i) calling each function at the command line or ii) calling the function within a script, such as a Jupyter Notebook. Example usage of both methods may be found within `examples <examples.rst>`__.
+
+
diff --git a/docs/source/PPP_pages/model.rst b/docs/source/PPP_pages/model.rst
@@ -2,7 +2,7 @@
 Model File and Creation
 =======================
 
-A core aspect of the PPP is the use of Model files, JSON-based files used to assign and store **population models**. A population model primarily consists of: the populations within the model; the individuals in each population; and a population tree. Model files offer various benefits within the PPP: i) automatic assignment of relevant populations, individuals, or other potential meta-data; ii) simplifed process to examine multiple models; and iii) a single repository of all relevant meta-data. 
+A core aspect of the PPP is the use of Model files, JSON-based files used to assign and store **population models**. A population model primarily consists of: the populations within the model; the individuals in each population; and a population tree. Model files offer various benefits within the PPP: i) automatic assignment of relevant populations, individuals, or other potential meta-data; ii) simplified process to examine multiple models; and iii) a single repository of all relevant meta-data.
 
 
 Model files may be created and edited using our model creator. 

diff --git a/docs/source/PPP_pages/utilities.rst b/docs/source/PPP_pages/utilities.rst
@@ -10,3 +10,4 @@ The utility functions were developed to perform various tasks often needed when
    Utilities/vcf_utilities
    Utilities/bed_utilities
    Utilities/vcf_bed_to_seq
+   Utilities/stat_sampler
diff --git a/docs/source/index.rst b/docs/source/index.rst
@@ -2,32 +2,16 @@
 Popgen Pipeline Platform
 ========================
 
-.. only:: not html
-
-   ------------
-   Introduction
-   ------------
+The Popgen Pipeline Platform (PPP) is a software platform with the goal of reducing the computational expertise required for conducting population genomic analyses. The PPP was designed as a collection of scripts that facilitate common population genomic workflows in a consistent and standardized environment. Functions were developed to encompass entire workflows, including: input preparation, file format conversion, various population genomic analyses, and output generation. By facilitating entire workflows, the PPP offers several benefits to prospective end users: it reduces the need for redundant in-house software and scripts that would require development time and may be error-prone or incorrect, depending on the expertise of the investigator. The platform has also been developed with reproducibility and extensibility of analyses in mind.
 
-The Popgen Pipeline Platform (PPP) is a software platform with the goal of reducing the computational expertise required for conducting population genomic analyses. The PPP was designed as a collection of scripts that facilitate common population genomic workflows in a consistent and standardized environment. Functions were developed to encompass entire workflows, including: input preparation, file format conversion, various population genomic analyses, output generation, and visualization. By facilitating entire workflows, the PPP offers several benefits to prospective end users - it reduces the need of redundant in-house software and scripts that would require development time and may be error-prone, or incorrect, depending on the expertise of the investigator. The platform has also been developed with reproducibility and extensibility of analyses in mind.
-
-The PPP was written using the Python programming language and designed to operate using either Python 2.7 or 3.7. However, as `Python 2 will no longer be maintained past January 1, 2020 <https://www.python.org/dev/peps/pep-0373/>`_ we strongly recommend using Python 3. We designed the PPP as a collection of modular functions that users may combine to generate a wide variety of analyses and pipelines. The functions within the PPP are also seperated into four groups: core VCF-based functions; optional BED/STAT file functions; file conversion functions; and analysis functions (Figure 1). 
-
-.. image:: PPP_assets/PPP_Pipeline_Figure.png
-   :scale: 50 %
-   :align: center
-
-.. centered::
-   Figure 1: Structure of the PPP
-
-The core functions of the PPP were designed to operate using VCF-based files primarily due to frequent support for the format among publicly available datasets and population genomics software. Most users will begin their pipelines with these core functions before moving onto an analysis function. Please note that most analysis functions require a preceding file conversion function to operate. 
-
-Please Note: This documentation is currently being devloped and will be updated freqeuntly in the coming days
+Please Note: This documentation is currently being developed and will be updated frequently in the coming days.
 
 .. toctree::
    :maxdepth: 2
    :caption: Contents:
-   :hidden:
+   :hidden: 
 
+   PPP_pages/intro
    PPP_pages/install
    PPP_pages/examples
    PPP_pages/functions