# Download ancient stem cells

Download the FASTQ files for NCBI BioProject PRJNA244851: https://www.ncbi.nlm.nih.gov/bioproject/PRJNA244851



## The ancestral gene repertoire of animal stem cells

Alié A, Hayashi T, Sugimura I, Manuel M, Sugano W, Mano A, Satoh N, Agata K, Funayama N.
Proc Natl Acad Sci U S A. 2015 Dec 22;112(51):E7093-100. doi: 10.1073/pnas.1514789112. Epub 2015 Dec 7.

### Abstract
Stem cells are pivotal for development and tissue homeostasis of multicellular animals, and the quest for a gene toolkit associated with the emergence of stem cells in a common ancestor of all metazoans remains a major challenge for evolutionary biology. We reconstructed the conserved gene repertoire of animal stem cells by transcriptomic profiling of totipotent archeocytes in the demosponge Ephydatia fluviatilis and by tracing shared molecular signatures with flatworm and Hydra stem cells. Phylostratigraphy analyses indicated that most of these stem-cell genes predate animal origin, with only few metazoan innovations, notably including several partners of the Piwi machinery known to promote genome stability. The ancestral stem-cell transcriptome is strikingly poor in transcription factors. Instead, it is rich in RNA regulatory actors, including components of the "germ-line multipotency program" and many RNA-binding proteins known as critical regulators of mammalian embryonic stem cells.

KEYWORDS:
Porifera; RNA binding; evolution; stem cells; uPriSCs

## How to use `download_sra.rf`

```
$ reflow run download_sra.rf
flag errors:
        missing mandatory flag -output
        missing mandatory flag -sra_id
usage of download_sra.rf:
  -fastq_dump_disk uint
        GiB of storage for converting to fastq.gz files (per file)
         (default 50)
  -fastq_dump_threads uint
        GiB of memory for samtools cat
         (default 8)
  -output string
        S3 folder location to put the downloaded files (required)
  -sra_disk uint
        GiB of storage for downloading SRA files (per file)
         (default 50)
  -sra_id string
        Can be any of SRR, ERR, or SRX ids. Pipe-separate for multiple, e.g. 'SRR1539523|SRR1539569|SRR1539570' (required)
```
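
The `-sra_id` help text asks for multiple accessions to be pipe-separated. A minimal sketch of building that string in Python follows; the helper name `join_sra_ids` is hypothetical, and the accessions are the ones quoted in the help text.

```python
def join_sra_ids(accessions):
    """Join accession IDs with '|', the separator named in the -sra_id help text."""
    # Strip stray whitespace and drop empty entries before joining
    cleaned = [a.strip() for a in accessions if a and a.strip()]
    return "|".join(cleaned)


sra_id = join_sra_ids(["SRR1539523", "SRR1539569", "SRR1539570"])
print(sra_id)
```

Because `|` is also the shell's pipe character, the joined value should stay quoted when typed on a command line, as it is in the help text.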

In [6]:
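import pandas as pd

# One row per download; the 'sra_id' and 'output' columns are named after
# the mandatory flags of download_sra.rf
data = {'sra_id': ['PRJNA244851'], 'output': ['s3://olgabot-maca/ancient_stem_cells/fastqs/']}
samples = pd.DataFrame(data)

# Use a copy of the accession as the row index, written out as the 'id' column
samples['id'] = samples['sra_id']
samples = samples.set_index('id')
samples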

| id | sra_id | output |
| --- | --- | --- |
| PRJNA244851 | PRJNA244851 | s3://olgabot-maca/ancient_stem_cells/fastqs/ |


In [7]:
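import json

folder = '../ancient_stem_cells/'

# Create the run folder and write the sample sheet into it
! mkdir -p $folder
samples.to_csv(f'{folder}/samples.csv')

# Batch config: the workflow to run and the sample sheet that lists the runs
config = {
    "program": "../../reflow-workflows/workflows/download_sra.rf",
    "runs_file": "samples.csv",
}

with open(f'{folder}/config.json', 'w') as f:
    json.dump(config, f)

# Show the first lines of both files to confirm what was written
! head -n 2 $folder/samples.csv $folder/config.json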

==> ../ancient_stem_cells//samples.csv <==
id,sra_id,output
PRJNA244851,PRJNA244851,s3://olgabot-maca/ancient_stem_cells/fastqs/

==> ../ancient_stem_cells//config.json <==
{"program": "../../reflow-workflows/workflows/download_sra.rf", "runs_file": "samples.csv"}
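
The columns of `samples.csv` share their names with the `-sra_id` and `-output` flags. Assuming each row stands for one run of the workflow with those columns passed as flags (an assumption; the notebook does not show how the batch is launched), the equivalent single-run command lines can be previewed as sketched below. `preview_commands` is a hypothetical helper, and nothing is executed.

```python
import csv
import io
import json
import shlex


def preview_commands(config_text, samples_text):
    """Return one 'reflow run ...' command string per row of the sample sheet."""
    config = json.loads(config_text)
    commands = []
    for row in csv.DictReader(io.StringIO(samples_text)):
        args = ["reflow", "run", config["program"]]
        for name, value in row.items():
            if name == "id":  # assumed to be a row label, not a workflow flag
                continue
            args += [f"-{name}", value]
        # shlex.quote protects characters such as '|' from the shell
        commands.append(" ".join(shlex.quote(a) for a in args))
    return commands


# Contents copied from the two files shown above
config_text = '{"program": "../../reflow-workflows/workflows/download_sra.rf", "runs_file": "samples.csv"}'
samples_text = (
    "id,sra_id,output\n"
    "PRJNA244851,PRJNA244851,s3://olgabot-maca/ancient_stem_cells/fastqs/\n"
)
for command in preview_commands(config_text, samples_text):
    print(command)
```

Printing the commands instead of running them keeps the check free of side effects; a pipe-separated `sra_id` would come out single-quoted.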