Merge branch 'master' of https://github.com/BioNinja/GSEApy
Zhuoqing Fang committed Nov 24, 2017
2 parents e8e6a2e + fe210c6 commit b55f021
Showing 1 changed file with 44 additions and 44 deletions.
88 changes: 44 additions & 44 deletions README.rst
@@ -29,23 +29,23 @@ GSEAPY: Gene Set Enrichment Analysis in Python.




The main documentation for GSEAPY can be found at https://pythonhosted.org/gseapy

For an example of how to use gseapy, see: `Example <http://pythonhosted.org/gseapy/gseapy_example.html>`_

The main documentation for GSEAPY can be found at http://gseapy.rtfd.io/

For an example of how to use gseapy, see: `Example <http://gseapy.readthedocs.io/en/latest/gseapy_example.html>`_

**Release notes** : https://github.com/BioNinja/gseapy/releases

GSEAPY is a python wrapper for **GSEA** and **Enrichr**.
--------------------------------------------------------------------------------------------

GSEAPY can be used for **RNA-seq, ChIP-seq, Microarray** data. It provides convenient GO enrichment analysis and produces **publication-quality figures** in Python.


GSEAPY has five sub-commands available: ``gsea``, ``prerank``, ``ssgsea``, ``replot``, ``enrichr``.


:gsea: The ``gsea`` module produces `GSEA <http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page>`_ results. The input requires a txt file (FPKM, expected counts, TPM, etc.), a cls file, and a gene_sets file in gmt format.
:prerank: The ``prerank`` module produces **Prerank tool** results. The input expects a pre-ranked gene list dataset with correlation values, in .rnk format, and a gene_sets file in gmt format. The ``prerank`` module is an API to the `GSEA` pre-rank tool.
:ssgsea: The ``ssgsea`` module performs **single sample GSEA (ssGSEA)** analysis. The input expects a gene list with expression values (same format as a ``.rnk`` file) and a gene_sets file in gmt format. For multi-sample input, ssGSEA recognizes the gct format, too. The ssGSEA enrichment score for the gene set is computed as described by `D. Barbie et al 2009 <http://www.nature.com/nature/journal/v462/n7269/abs/nature08460.html>`_.
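
The ``.gmt`` and ``.rnk`` inputs named above are plain tab-delimited text. As a minimal sketch (standard library only; ``write_gmt`` and ``write_rnk`` are hypothetical helpers, not part of gseapy, and the set name and scores are made-up placeholders), such files could be written like this:

.. code:: python

    import csv

    def write_gmt(path, gene_sets):
        """Write gene sets in GMT layout: set name, description, then one gene per column."""
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            for name, (description, genes) in gene_sets.items():
                writer.writerow([name, description, *genes])

    def write_rnk(path, scores):
        """Write a two-column RNK file (gene, score), highest score first."""
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
            writer.writerows(ranked)
        return ranked

    # placeholder data, for illustration only
    gene_sets = {"DEMO_SET": ("illustrative set", ["CSF1", "BGN", "PTX3"])}
    scores = {"BGN": -1.2, "CSF1": 2.5, "PTX3": 0.3}

    write_gmt("gene_sets.gmt", gene_sets)
    ranked = write_rnk("gsea_data.rnk", scores)

The resulting file names match the ones used in the usage examples of this README, so they can be passed to the ``gene_sets`` and ``rnk`` arguments as-is.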

@@ -58,7 +58,7 @@ Please use 'gseapy COMMAND -h' to see the detail description for each option of


The full ``GSEA`` is far too extensive to describe here; see
`GSEA <http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Main_Page>`_ documentation for more information. All file formats for GSEApy are identical to those of the ``GSEA`` desktop version.


**If you use gseapy in your research, you should cite the original ``GSEA`` and ``Enrichr`` papers.**
@@ -67,16 +67,16 @@ Why GSEAPY
-----------------------------------------------------

I would like to use Pandas to explore my data, but I did not find a convenient tool to
do gene set enrichment analysis in Python. So, here are my reasons:

* **Running inside the Python interactive console without switching to R!!!**
* User friendly for both wet and dry lab users.
* Produce or reproduce publishable figures.
* Perform batch jobs easily.
* Easy to use in a bash shell or your data analysis workflow, e.g. snakemake.


GSEA Java version output:
-------------------------------------------------
This is an example of GSEA desktop application output

@@ -97,23 +97,23 @@ Using ``Prerank`` or ``replot`` module will reproduce the same figure for GSEA J





Generated by GSEAPY
**GSEAPY figures can be saved in any figure format supported by matplotlib.**

You can modify ``GSEA`` plots easily in .pdf files. Please Enjoy.



GSEAPY ``enrichr`` module
-----------------------------------------------
**Note:** For now, the ``enrichr`` module only downloads enrichment results.

**TODO:** Save enriched table, grids, networks, bar graphs from website server using ``phantomJS`` and ``selenium``.

A graphical introduction of Enrichr

.. figure:: docs/enrichr.PNG
    :height: 300px
@@ -132,15 +132,15 @@ Installation

.. code:: shell

    # if you have conda
    $ conda install -c bioconda gseapy

    # install the latest release
    # and for Windows users
    $ conda install -c bioninja gseapy

    # or use pip to install the latest release
    $ pip install gseapy

| You may instead want to use the development version from Github, by running
@@ -156,16 +156,16 @@ Dependency
Mandatory
~~~~~~~~~

* Numpy
* Pandas
* Matplotlib
* Beautifulsoup4
* Requests (for enrichr API)

You may also need to install **lxml** and **html5lib** if XML files cannot be parsed.




Run GSEAPY
-----------------

@@ -179,12 +179,12 @@ For command line usage:
~~~~~~~~~~~~~~~~~~~~~~~

.. code:: bash

    # An example to reproduce figures using the replot module.
    $ gseapy replot -i ./Gsea.reports -o test

    # An example to run GSEA using the gseapy gsea module
    $ gseapy gsea -d exptable.txt -c test.cls -g gene_sets.gmt -o test
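
For batch jobs, the same command line can be generated from Python. The following is only a sketch: ``gsea_command`` is a hypothetical helper, the ``.gmt`` file names are placeholders, and only the ``-d``, ``-c``, ``-g`` and ``-o`` flags shown above are used. Nothing is executed; the commands are just printed.

.. code:: python

    import shlex

    def gsea_command(data, cls, gene_sets, outdir):
        """Build a ``gseapy gsea`` argument list from the flags shown above."""
        return ["gseapy", "gsea", "-d", data, "-c", cls, "-g", gene_sets, "-o", outdir]

    # hypothetical gene set files to loop over
    gmt_files = ["kegg.gmt", "hallmark.gmt"]

    commands = [
        gsea_command("exptable.txt", "test.cls", gmt, "test_" + gmt.rsplit(".", 1)[0])
        for gmt in gmt_files
    ]

    for argv in commands:
        # to actually run a job: subprocess.run(argv, check=True)
        print(shlex.join(argv))

Keeping each command as a list avoids shell quoting problems when file names contain spaces.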
@@ -204,15 +204,15 @@ Run gseapy inside python console:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Prepare the expression.txt, gene_sets.gmt and test.cls files required by GSEA; then you can run:

.. code:: python

    import gseapy

    # run GSEA.
    gseapy.gsea(data='expression.txt', gene_sets='gene_sets.gmt', cls='test.cls', outdir='test')

    # run prerank
    gseapy.prerank(rnk='gsea_data.rnk', gene_sets='gene_sets.gmt', outdir='test')

    # run ssGSEA
@@ -228,16 +228,16 @@ Run gseapy inside python console:
For details, see: `Example <http://pythonhosted.org/gseapy/gseapy_example.html>`_

.. code:: python

    import pandas as pd
    import gseapy

    # assign a dataframe (empty placeholder here), and use the enrichr library data set 'KEGG_2016'
    expression_dataframe = pd.DataFrame()

    sample_names = ['A','A','A','B','B','B'] # always exactly two groups, any names you like

    # assign the gene_sets parameter an enrichr library name or a gmt file on your local computer.
    gseapy.gsea(data=expression_dataframe, gene_sets='KEGG_2016', cls=sample_names, outdir='test')

    # using the prerank tool
    gene_ranked_dataframe = pd.DataFrame()
    gseapy.prerank(rnk=gene_ranked_dataframe, gene_sets='KEGG_2016', outdir='test')
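
If a ``.cls`` file is needed instead of a Python list (for example, for the command line), it can be generated from the same labels. This is a sketch under the assumption that the file follows the categorical class layout used by the GSEA desktop version: a header line with the sample count, the class count and ``1``, then a ``#`` line naming the classes, then one label per sample. ``cls_text`` is a hypothetical helper, not a gseapy function.

.. code:: python

    def cls_text(sample_names):
        """Return the text of a categorical .cls file for per-sample group labels."""
        classes = list(dict.fromkeys(sample_names))  # unique labels, first-seen order
        if len(classes) != 2:
            raise ValueError("expected exactly two groups, got %d" % len(classes))
        header = "%d %d 1" % (len(sample_names), len(classes))
        lines = [header, "# " + " ".join(classes), " ".join(sample_names)]
        return "\n".join(lines) + "\n"

    sample_names = ['A', 'A', 'A', 'B', 'B', 'B']
    print(cls_text(sample_names), end="")

Writing the returned text to ``test.cls`` gives a file that can be passed to the ``cls`` argument or the ``-c`` option.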
@@ -251,15 +251,15 @@ see detail here: `Example <http://pythonhosted.org/gseapy/gseapy_example.html>`_
.. code:: python

    # assign a list object to enrichr
    gl = ['SCARA3', 'LOC100044683', 'CMBL', 'CLIC6', 'IL13RA1', 'TACSTD2', 'DKKL1', 'CSF1',
          'SYNPO2L', 'TINAGL1', 'PTX3', 'BGN', 'HERC1', 'EFNA1', 'CIB2', 'PMP22', 'TMEM173']

    gseapy.enrichr(gene_list=gl, description='pathway', gene_sets='KEGG_2016', outdir='test')

    # or a txt file path.
    gseapy.enrichr(gene_list='gene_list.txt', description='pathway', gene_sets='KEGG_2016',
                   outdir='test', cutoff=0.05, format='png')

GSEAPY supported gene set libraries:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -269,7 +269,7 @@ To see the full list of gseapy supported gene set libraries, please click here: `
Or use the ``get_library_name`` function inside a Python console.

.. code:: python

    # see the full list of the latest enrichr library names, which can be passed to the -g parameter:
    names = gseapy.get_library_name()
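
The returned list is long, so it can help to filter it by keyword. A small sketch, assuming ``names`` is a plain list of strings; ``find_libraries`` is a hypothetical helper, and the short list below is only a stand-in for the real result so that the example runs without network access:

.. code:: python

    def find_libraries(names, keyword):
        """Return the library names that contain keyword, ignoring case."""
        return sorted(name for name in names if keyword.lower() in name.lower())

    # stand-in for: names = gseapy.get_library_name()
    names = ["KEGG_2016", "GO_Biological_Process_2017", "KEGG_2015"]

    kegg_libraries = find_libraries(names, "kegg")
    print(kegg_libraries)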