Commit 39de9bf (1 parent: 67c2a15)
Showing 4 changed files with 204 additions and 1 deletion.
@@ -0,0 +1,128 @@
> Python arguments are equivalent to long-option arguments (`--arg`), unless otherwise specified. Flags are True/False arguments in Python. The manual for any gget tool can be called from the command line using the `-h` `--help` flag.
## gget cellxgene 🍱
Query data from [CZ CELLxGENE Discover](https://cellxgene.cziscience.com/) using the [CZ CELLxGENE Discover Census](https://github.com/chanzuckerberg/cellxgene-census).

**General optional arguments**
`-s` `--species`
Choice of 'homo_sapiens' or 'mus_musculus'. Default: 'homo_sapiens'.

`-g` `--gene`
Str or list of gene name(s) or Ensembl ID(s). Default: None.
NOTE: Use `-e / --ensembl` (Python: `ensembl=True`) when providing Ensembl ID(s) instead of gene name(s).
See https://cellxgene.cziscience.com/gene-expression for examples of available genes.

`-cn` `--column_names`
List of metadata columns to return (stored in AnnData.obs).
Default: ['dataset_id', 'assay', 'suspension_type', 'sex', 'tissue_general', 'tissue', 'cell_type']
For more options see: https://api.cellxgene.cziscience.com/curation/ui/#/ -> Schemas -> dataset

`-o` `--out`
Path to the file in which to save the generated AnnData .h5ad file (or .csv file when using `-mo / --meta_only`; Python: `anndata=False`).
Required when using the command line!

**General flags**
`-e` `--ensembl`
Use when genes are provided as Ensembl IDs instead of gene names.

`-mo` `--meta_only`
Command-line only. Returns only the metadata dataframe (corresponds to AnnData.obs).
Python: Use `anndata=False`.

`-q` `--quiet`
Command-line only. Prevents progress information from being displayed.
Python: Use `verbose=False` to prevent progress information from being displayed.
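
The relationship between these command-line flags and their Python counterparts can be sketched with `argparse`. This is only an illustration under stated assumptions, not gget's actual parser: `build_parser` and `to_python_kwargs` are hypothetical helpers, and only the option names and defaults are taken from the descriptions above.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring a few of the options documented above."""
    parser = argparse.ArgumentParser(prog="gget cellxgene")
    parser.add_argument(
        "-s", "--species",
        default="homo_sapiens",
        choices=["homo_sapiens", "mus_musculus"],
    )
    parser.add_argument("-g", "--gene", nargs="+", default=None)
    parser.add_argument("-o", "--out", required=True)
    # Flags carry no value; they are simply True when present.
    parser.add_argument("-e", "--ensembl", action="store_true")
    parser.add_argument("-mo", "--meta_only", action="store_true")
    parser.add_argument("-q", "--quiet", action="store_true")
    return parser


def to_python_kwargs(args):
    """Translate parsed CLI arguments into the equivalent Python keyword arguments."""
    return {
        "species": args.species,
        "gene": args.gene,
        "ensembl": args.ensembl,
        # The two command-line-only flags are inverted in Python.
        "anndata": not args.meta_only,
        "verbose": not args.quiet,
    }


cli = ["--gene", "ACE2", "--meta_only", "-q", "-o", "example_meta.csv"]
kwargs = to_python_kwargs(build_parser().parse_args(cli))
print(kwargs)
```

Note how `--meta_only` and `--quiet` map to `anndata=False` and `verbose=False`, while `--ensembl` keeps the same name and truth value on both sides.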

**Optional arguments corresponding to CZ CELLxGENE Discover metadata attributes**
`--tissue`
Str or list of tissue(s), e.g. ['lung', 'blood']. Default: None.
See https://cellxgene.cziscience.com/gene-expression for examples of available tissues.

`--cell_type`
Str or list of cell type(s), e.g. ['mucus secreting cell', 'neuroendocrine cell']. Default: None.
See https://cellxgene.cziscience.com/gene-expression and select a tissue to see examples of available cell types.

`--development_stage`
Str or list of development stage(s). Default: None.

`--disease`
Str or list of disease(s). Default: None.

`--sex`
Str or list of sex(es), e.g. 'female'. Default: None.

`--dataset_id`
Str or list of CELLxGENE dataset ID(s). Default: None.

`--tissue_general_ontology_term_id`
Str or list of high-level tissue UBERON ID(s). Default: None.
Tissue labels and their corresponding UBERON IDs are listed [here](https://github.com/chanzuckerberg/single-cell-data-portal/blob/9b94ccb0a2e0a8f6182b213aa4852c491f6f6aff/backend/wmg/data/tissue_mapper.py).

`--tissue_general`
Str or list of high-level tissue label(s). Default: None.
Tissue labels and their corresponding UBERON IDs are listed [here](https://github.com/chanzuckerberg/single-cell-data-portal/blob/9b94ccb0a2e0a8f6182b213aa4852c491f6f6aff/backend/wmg/data/tissue_mapper.py).

`--tissue_ontology_term_id`
Str or list of tissue ontology term ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--assay_ontology_term_id`
Str or list of assay ontology term ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--assay`
Str or list of assay(s) as defined in the CELLxGENE dataset schema. Default: None.

`--cell_type_ontology_term_id`
Str or list of cell type ontology term ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--development_stage_ontology_term_id`
Str or list of development stage ontology term ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--disease_ontology_term_id`
Str or list of disease ontology term ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--donor_id`
Str or list of donor ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--self_reported_ethnicity_ontology_term_id`
Str or list of self-reported ethnicity ontology ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--self_reported_ethnicity`
Str or list of self-reported ethnicity(ies) as defined in the CELLxGENE dataset schema. Default: None.

`--sex_ontology_term_id`
Str or list of sex ontology ID(s) as defined in the CELLxGENE dataset schema. Default: None.

`--suspension_type`
Str or list of suspension type(s) as defined in the CELLxGENE dataset schema. Default: None.
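
Because every attribute above accepts either a single string or a list and defaults to None, one way to picture how several of them narrow a query together is as clauses joined with `and`. The sketch below is not gget's implementation: `as_list` and `build_value_filter` are hypothetical helpers, and the filter syntax is an assumption loosely modelled on the value-filter strings used by the CZ CELLxGENE Discover Census.

```python
def as_list(value):
    """Accept either a single string or a list of strings, as the arguments above do."""
    return [value] if isinstance(value, str) else list(value)


def build_value_filter(**attributes):
    """Combine metadata attributes into one filter string (hypothetical helper).

    Attributes left at None are skipped; the rest are joined with 'and'.
    """
    clauses = []
    for name, value in attributes.items():
        if value is None:
            continue
        clauses.append(f"{name} in {as_list(value)}")
    return " and ".join(clauses)


value_filter = build_value_filter(
    tissue="lung",
    cell_type=["mucus secreting cell", "neuroendocrine cell"],
    sex=None,  # left at its default, so it does not restrict the query
)
print(value_filter)
```

Each attribute that is set narrows the selection further, while attributes left at None place no restriction on the returned cells.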

### Examples
```bash
gget cellxgene --gene ACE2 ABCA1 SLC5A1 --tissue lung --cell_type 'mucus secreting cell' 'neuroendocrine cell' -o example_adata.h5ad
```
```python
# Python
import gget

adata = gget.cellxgene(
    gene = ["ACE2", "ABCA1", "SLC5A1"],
    tissue = "lung",
    cell_type = ["mucus secreting cell", "neuroendocrine cell"]
)
adata
```
→ Returns an AnnData object containing the scRNAseq ACE2, ABCA1, and SLC5A1 count matrix of 3322 human lung mucus secreting and neuroendocrine cells from CZ CELLxGENE Discover and their corresponding metadata.

Fetch metadata (corresponds to AnnData.obs) only:
```bash
gget cellxgene --meta_only --gene ENSMUSG00000015405 --ensembl --tissue lung --species mus_musculus -o example_meta.csv
```
```python
# Python
import gget

df = gget.cellxgene(
    anndata = False,
    gene = "ENSMUSG00000015405",
    ensembl = True,
    tissue = "lung",
    species = "mus_musculus"
)
df
```
→ Returns only the metadata from ENSMUSG00000015405 (ACE2) expression datasets corresponding to mouse lung cells.
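
Once the metadata has been saved with `-o`, it can be summarized with the standard library alone. The sketch below assumes the saved .csv file has a header row containing the requested metadata columns (such as `cell_type`); the rows are made up for illustration and `count_by_column` is a hypothetical helper, not part of gget.

```python
import csv
import io
from collections import Counter

# Made-up rows standing in for a file such as example_meta.csv;
# real output will contain different values and many more rows.
toy_csv = io.StringIO(
    "dataset_id,tissue,cell_type\n"
    "toy-dataset-1,lung,cell type A\n"
    "toy-dataset-1,lung,cell type B\n"
    "toy-dataset-2,lung,cell type A\n"
)


def count_by_column(handle, column):
    """Count how many rows (cells) share each value of the given metadata column."""
    return Counter(row[column] for row in csv.DictReader(handle))


counts = count_by_column(toy_csv, "cell_type")
print(counts.most_common())
```

For a real file, replace `toy_csv` with an open file handle, for example `open("example_meta.csv", newline="")`.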
@@ -0,0 +1,73 @@
# Welcome to gget's contributing guide <!-- omit in toc -->

Thank you for investing your time in contributing to our project! Any contribution you make will be reflected on the [gget repo](https://github.com/pachterlab/gget). ✨

Read our [Code of Conduct](./CODE_OF_CONDUCT.md) to keep our community approachable and respectable.

In this guide you will get an overview of the contribution workflow, from opening an issue and creating a PR to reviewing and merging it.

## Getting started

### Issues

#### Create a new issue

If you spot a problem with gget or you have an idea for a new feature, [check if an issue already exists](https://github.com/pachterlab/gget/issues). If a related issue doesn't exist, you can open a new issue using the relevant [issue form](https://github.com/pachterlab/gget/issues/new/choose).

#### Solve an issue

Scan through our [existing issues](https://github.com/pachterlab/gget/issues) to find one that interests you. You can narrow down the search using `labels` as filters. If you find an issue to work on, you are welcome to open a PR with a fix.

### Make Changes

### Getting started

1. Fork the repository.
   - Using GitHub Desktop:
     - [Getting started with GitHub Desktop](https://docs.github.com/en/desktop/installing-and-configuring-github-desktop/getting-started-with-github-desktop) will guide you through setting up Desktop.
     - Once Desktop is set up, you can use it to [fork the repo](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/cloning-and-forking-repositories-from-github-desktop)!

   - Using the command line:
     - [Fork the repo](https://docs.github.com/en/github/getting-started-with-github/fork-a-repo#fork-an-example-repository) so that you can make your changes without affecting the original project until you're ready to merge them.

2. Create a working branch and start with your changes!

### Commit your update

Commit the changes once you are happy with them.

### ‼️ Self-review the following before creating a Pull Request ‼️

1. Review the content for technical accuracy.
2. Copy-edit the changes/comments for grammar, spelling, and adherence to the general style of existing gget code.
3. Format your code using [black](https://black.readthedocs.io/en/stable/getting_started.html).
4. Make sure the unit tests pass:
   - Developer dependencies can be installed with `pip install -r dev-requirements.txt`
   - Run existing unit tests from the gget repository root with `coverage run -m pytest -ra -v tests && coverage report --omit=main.py,tests*`
5. Add new unit tests if applicable:
   - Arguments and expected results are stored in json files in ./tests/fixtures/
   - Unit tests can be added to ./tests/test_*.py and will be automatically detected
6. Make sure the edits are compatible with both the Python and the command-line interface:
   - The command-line interface and arguments are defined in ./gget/main.py
7. Add new modules/arguments to the documentation if applicable:
   - The manual for each module can be edited/added as ./docs/src/*.md
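
Step 5 above can be sketched as a fixture-driven unit test. This is a minimal illustration under stated assumptions: the fixture layout (test name mapped to `args` and `expected_result`) and the `reverse_complement` function are hypothetical stand-ins, and the JSON is inlined here so the example is self-contained rather than read from ./tests/fixtures/.

```python
import json
import unittest

# Assumed fixture layout: test name -> {"args": ..., "expected_result": ...}.
# In the repository this would be read from a json file in ./tests/fixtures/.
FIXTURE_JSON = """
{
  "test_reverse_complement": {
    "args": {"sequence": "ATGC"},
    "expected_result": "GCAT"
  }
}
"""


def reverse_complement(sequence):
    """Toy function standing in for the gget module under test."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(sequence))


class TestReverseComplement(unittest.TestCase):
    def setUp(self):
        # Load arguments and expected results once per test.
        self.fixtures = json.loads(FIXTURE_JSON)

    def test_reverse_complement(self):
        case = self.fixtures["test_reverse_complement"]
        result = reverse_complement(**case["args"])
        self.assertEqual(result, case["expected_result"])


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestReverseComplement)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())
```

Keeping arguments and expected results in a fixture separates test data from test logic, so a new case can be added without touching the test code.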

If you have any questions, feel free to start a [discussion](https://github.com/pachterlab/gget/discussions) or create an issue as described above.

### Pull Request

When you're finished with the changes, [create a pull request](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/proposing-changes-to-your-work-with-pull-requests/creating-a-pull-request), also known as a PR.

‼️ Please make all PRs against the `dev` branch of the gget repository.

- Don't forget to [link the PR to the issue](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue) if you are solving one.
- Enable the checkbox to [allow maintainer edits](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/allowing-changes-to-a-pull-request-branch-created-from-a-fork) so the branch can be updated for a merge.
- If you run into any merge issues, check out this [git tutorial](https://github.com/skills/resolve-merge-conflicts) to help you resolve merge conflicts and other issues.

Once you submit your PR, a gget team member will review your proposal. We may ask questions or request additional information.

### Your PR is merged!

Congratulations! 🎉 The gget team thanks you. ✨

Once your PR is merged, your contributions will be publicly visible on the [gget repo](https://github.com/pachterlab/gget).