Merge branch 'dev'

zqfang · Dec 26, 2017 · f88e3b8
2 parents c6568cc + 92c142e

Showing 13 changed files with 176 additions and 177 deletions.
diff --git a/.gitignore b/.gitignore
@@ -3,6 +3,7 @@
 *.eggs
 *.EGG
 *.EGG-INFO
+*.idea
 bin
 build
 develop-eggs

diff --git a/.idea/vcs.xml b/.idea/vcs.xml
diff --git a/README.rst b/README.rst
@@ -30,7 +30,7 @@ GSEAPY: Gene Set Enrichment Analysis in Python.
 
 
 
-The main documentation for GSEAPY can be found at http://gseapy.rtfd.io/
+The main documentation for GSEApy can be found at http://gseapy.rtfd.io/
 
 An example to use gseapy, please click here: `Example <http://gseapy.readthedocs.io/en/master/gseapy_example.html>`_
 
@@ -39,7 +39,7 @@ An example to use gseapy, please click here: `Example <http://gseapy.readthedocs
 GSEAPY is a python wrapper for **GSEA** and **Enrichr**.
 --------------------------------------------------------------------------------------------
 
-GSEAPY could be used for **RNA-seq, ChIP-seq, Microarry** data. It's used for convenient GO enrichments and produce **publishable quality figures** in python.
+GSEAPY could be used for **RNA-seq, ChIP-seq, Microarry** data. It's used for convenient GO enrichment and produce **publishable quality figures** in python.
 
 
 GSEAPY has five sub-commands available: ``gsea``, ``prerank``, ``ssgsea``, ``replot`` ``enrichr``.
@@ -49,7 +49,7 @@ GSEAPY has five sub-commands available: ``gsea``, ``prerank``, ``ssgsea``, ``rep
 :prerank: The ``prerank`` module produce **Prerank tool** results.  The input expects a pre-ranked gene list dataset with correlation values, which in .rnk format, and gene_sets file in gmt format.  ``prerank`` module is an API to `GSEA` pre-rank tools.
 :ssgsea: The ``ssgsea`` module perform **single sample GSEA(ssGSEA)** analysis.  The input expects a pd.Series (indexed by gene name), or pd.DataFrame (include ``GCT`` file) with expression values and ``GMT`` file. For multi sample input, ssGSEA reconigze gct format, too. ssGSEA enrichment score for the gene set as described by `D. Barbie et al 2009 <http://www.nature.com/nature/journal/v462/n7269/abs/nature08460.html>`_.
 
-:replot: The ``replot`` module reproduce GSEA desktop version results.  The only input for GSEAPY is the location to ``GSEA`` Desktop output results.
+:replot: The ``replot`` module reproduce GSEA desktop version results.  The only input for GSEApy is the location to ``GSEA`` Desktop output results.
 
 :enrichr: The ``enrichr`` module enable you perform gene set enrichment analysis using ``Enrichr`` API. Enrichr is open source and freely available online at: http://amp.pharm.mssm.edu/Enrichr . It runs very fast.
 
@@ -71,7 +71,7 @@ do gene set enrichment analysis in python. So, here is my reason:
 
 * **Running inside python interactive console without switch to R!!!**
 * User friendly for both wet and dry lab usrers.
-* Produce or reproduce pubilishable figures.
+* Produce or reproduce publishable figures.
 * Perform batch jobs easy.
 * Easy to use in bash shell or your  data analysis workflow, e.g. snakemake.
 
@@ -229,12 +229,12 @@ see detail here: `Example <http://gseapy.readthedocs.io/en/master/gseapy_example
 .. code:: python
 
 
-    # assign dataframe, and use enrichr libary data set 'KEGG_2016'
+    # assign dataframe, and use enrichr library data set 'KEGG_2016'
     expression_dataframe = pd.DataFrame()
 
     sample_name = ['A','A','A','B','B','B'] # always only two group,any names you like
 
-    # assign gene_sets parameter with enrichr library name or gmt file on your local computor.
+    # assign gene_sets parameter with enrichr library name or gmt file on your local computer.
     gseapy.gsea(data=expression_dataframe, gene_sets='KEGG_2016', cls= sample_names, outdir='test')
 
     # using prerank tool
@@ -263,7 +263,7 @@ see detail here: `Example <http://gseapy.readthedocs.io/en/master/gseapy_example
 GSEAPY supported gene set libaries :
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-To see the full list of gseapy supported gene set librarys, please click here: `Library <http://amp.pharm.mssm.edu/Enrichr/#stats>`_
+To see the full list of gseapy supported gene set libraries, please click here: `Library <http://amp.pharm.mssm.edu/Enrichr/#stats>`_
 
 Or use ``get_library_name`` function inside python console.
 

diff --git a/docs/gseapy_tutorial.rst b/docs/gseapy_tutorial.rst
@@ -4,11 +4,11 @@
 A Protocol to Prepare files for GSEAPY
 ======================================
 
-As a biological reseacher, I like protocols, so as other reseachers, too.
+As a biological researcher, I like protocols, so as other researchers, too.
 
 Here is an short tutorial to walk you through gseapy.
 
-For file format explaination, please see `here <http://software.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html.>`_
+For file format explanation, please see `here <http://software.broadinstitute.org/gsea/doc/GSEAUserGuideFrame.html.>`_
 
 In order to run gseapy successfully, install gseapy use pip.
 
@@ -27,7 +27,7 @@ Use ``gsea`` command, or :func:`gsea`
 Follow the steps blow.
 
 One thing you should know is that the gseapy input files are totally the same as
-``GSEA`` desktop requried. You can use these files below to run ``GSEA`` desktop, too.
+``GSEA`` desktop required. You can use these files below to run ``GSEA`` desktop, too.
 
 
 1. Prepare an tabular text file of gene expression like this:
@@ -44,7 +44,7 @@ commands below:
     df = pd.read_table('./test/gsea_data.txt')
     df.head()
 
-    #or assign df to the paramter 'data'
+    #or assign df to the parameter 'data'
 
 
 .. raw:: html
@@ -157,7 +157,7 @@ An example of cls file looks like below.
 
 All you need to do is to download gene set database file from ``GSEA`` website.
 
-Or you could use enrichr library. In this case, just provide libarary name to parameter 'gene_sets'
+Or you could use enrichr library. In this case, just provide library name to parameter 'gene_sets'
 
 If you would like to use you own gene_sets.gmts files, build such a file use excel,
 and then rename to gene_sets.gmt.
@@ -243,7 +243,7 @@ Use ``ssgsea`` command, or :func:`ssgsea`
 Use ``enrichr`` command, or :func:`enrichr`
 ===============================================================
 
-The only thing you need to prepeare is a gene list file.
+The only thing you need to prepare is a gene list file.
 
 **Note**: Enrichr uses a list of Entrez gene symbols as input.
 

diff --git a/docs/index.rst b/docs/index.rst
@@ -24,8 +24,8 @@ GSEAPY: Gene Set Enrichment Analysis in Python.
 
 .. image:: https://img.shields.io/badge/license-MIT-blue.svg
     :target:  https://img.shields.io/badge/license-MIT-blue.svg
-.. image:: https://img.shields.io/badge/python-3.5-blue.svg
-    :target:   https://img.shields.io/badge/python-3.5-blue.svg
+.. image:: https://img.shields.io/badge/python-3.6-blue.svg
+    :target:   https://img.shields.io/badge/python-3.6-blue.svg
 .. image:: https://img.shields.io/badge/python-2.7-blue.svg
     :target:  https://img.shields.io/badge/python-2.7-blue.svg
 
@@ -69,8 +69,8 @@ I would like to use Pandas to explore my data, but I did not find a  convenient
 do gene set enrichment analysis in python. So, here is my reason: 
 
 * **Running inside python interactive console without switch to R!!!**
-* User friendly for both wet and dry lab usrers.
-* Produce and reproduce pubilishable figures.
+* User friendly for both wet and dry lab users.
+* Produce and reproduce publishable figures.
 * Perform batch jobs easy(using for loops).
 * Easy to use in bash shell or your  data analysis workflow, e.g. snakemake.  
 

diff --git a/docs/introduction.rst b/docs/introduction.rst
@@ -21,8 +21,8 @@ GSEAPY: Gene Set Enrichment Analysis in Python.
 
 .. image:: https://img.shields.io/badge/license-MIT-blue.svg
     :target:  https://img.shields.io/badge/license-MIT-blue.svg
-.. image:: https://img.shields.io/badge/python-3.5-blue.svg
-    :target:   https://img.shields.io/badge/python-3.5-blue.svg
+.. image:: https://img.shields.io/badge/python-3.6-blue.svg
+    :target:   https://img.shields.io/badge/python-3.6-blue.svg
 .. image:: https://img.shields.io/badge/python-2.7-blue.svg
     :target:  https://img.shields.io/badge/python-2.7-blue.svg
 
@@ -56,7 +56,7 @@ do gene set enrichment analysis in python. So, here is my reason:
 
 * **Running inside python interactive console without switch to R!!!**
 * User friendly for both wet and dry lab usrers.
-* Produce pubilishable figures.
+* Produce publishable figures.
 * Perform batch jobs easy(using for loops).
 * Easy to use in bash shell or your  data analysis workflow, e.g. snakemake.  
 
@@ -101,11 +101,11 @@ A graphical introduction of Enrichr
 
 **Note**: Enrichr uses a list of Entrez gene symbols as input. You should convert all gene names to uppercase.
 
-For example, both a list object and txt file are supported for ``enrchr`` API
+For example, both a list object and txt file are supported for ``enrichr`` API
 
 .. code:: python
 
-    # if you perfer to run gseapy.enrchr() inside python console, you could assign a list object to 
+    # if you prefer to run gseapy.enrchr() inside python console, you could assign a list object to
     # gseapy like this.
     gene_list = ['SCARA3', 'LOC100044683', 'CMBL', 'CLIC6', 'IL13RA1', 'TACSTD2', 'DKKL1',
                     'CSF1', 'CITED1', 'SYNPO2L']
@@ -148,7 +148,7 @@ Installation
    # for windows users 
    $ conda install -c bioninja gseapy
 
-   # or use pip to install the lastest release 
+   # or use pip to install the latest release
    $ pip install gseapy
 
 | You may instead want to use the development version from Github, by running

diff --git a/gseapy/__main__.py b/gseapy/__main__.py
@@ -14,7 +14,7 @@
 __version__ = '0.9.1'
 
 def main():
-    """The Main function/pipeline for GSEAPY."""
+    """The Main function/pipeline for GSEApy."""
 
     # Parse options...
     argparser = prepare_argparser()
@@ -175,7 +175,7 @@ def add_prerank_parser(subparsers):
     # group for input files
     prerank_input = argparser_prerank.add_argument_group("Input files arguments")
     prerank_input.add_argument("-r", "--rnk", dest="rnk", action="store", type=str, required=True,
-                             help="ranking dataset file in .rnk format.Same with GSEA.")
+                             help="ranking metric file in .rnk format.Same with GSEA.")
     prerank_input.add_argument("-g", "--gmt", dest="gmt", action="store", type=str, required=True,
                              help="Gene set database in GMT format. Same with GSEA.")
     prerank_input.add_argument("-l", "--label", action='store', nargs=2, dest='label',
@@ -224,11 +224,11 @@ def add_singlesample_parser(subparsers):
 
     # group for General options.
     group_opt = argparser_gsea.add_argument_group("GSEA advanced arguments")
-    group_opt.add_argument("--norm-method", dest = "norm", action="store", type=str,
+    group_opt.add_argument("--nm", "--norm-method", dest = "norm", action="store", type=str,
                            default='rank', metavar='normalize',
                            choices=("rank", "log", "log_rank", "custom"),
                            help="Sample normalization method. Choose from {'rank', 'log', 'log_rank','custom'}. Default: rank")
-    group_opt.add_argument("--no-scale", action='store_false', dest='scale', default=True,
+    group_opt.add_argument("--ns", "--no-scale", action='store_false', dest='scale', default=True,
                            help="If the flag was set, don't normalize the enrichment scores by number of genes.")
     group_opt.add_argument("-n", "--permu-num", dest = "n", action="store", type=int, default=1000, metavar='perNum',
                            help="Number of random permutations. For calculating esnulls. Default: 1000")
@@ -275,18 +275,18 @@ def add_enrichr_parser(subparsers):
     enrichr_opt.add_argument("-i", "--input-list", action="store", dest="gene_list", type=str, required=True, metavar='geneSymbols',
                               help="Enrichr uses a list of Entrez gene symbols as input.")
     enrichr_opt.add_argument("-g", "--gene-sets", action="store", dest="library", type=str, required=True, metavar='gmt',
-                              help="Enrichr library name required. see online tool for libary names.")
-    enrichr_opt.add_argument("--description", action="store", dest="descrip", type=str, default='enrichr', metavar='strings',
+                              help="Enrichr library name required. see online tool for library names.")
+    enrichr_opt.add_argument("--ds", "--description", action="store", dest="descrip", type=str, default='enrichr', metavar='strings',
                               help="It is recommended to enter a short description for your list so that multiple lists \
                               can be differentiated from each other if you choose to save or share your list.")
-    enrichr_opt.add_argument("--cut-off", action="store", dest="thresh", metavar='float', type=float, default=0.05,
+    enrichr_opt.add_argument("--cut", "--cut-off", action="store", dest="thresh", metavar='float', type=float, default=0.05,
                               help="Adjust-Pval cutoff, used for generating plots. Default: 0.05.")
     enrichr_opt.add_argument("-t", "--top-term", dest="term", action="store", type=int, default=10, metavar='int',
                               help="Numbers of top terms showed in the plot. Default: 10")
     #enrichr_opt.add_argument("--scale", dest = "scale", action="store", type=float, default=0.5, metavar='float',
     #                          help="scatter dot scale in the dotplot. Default: 0.5")
     enrichr_opt.add_argument("--no-plot", action='store_true', dest='no_plot', default=False,
-                              help="Suppress the plot output.This is useful only if data are intrested. Default: False.")
+                              help="Suppress the plot output.This is useful only if data are interested. Default: False.")
 
 
     enrichr_output = argparser_enrichr.add_argument_group("Output figure arguments")