## Creation of a SoS workflow from interactive analysis

### Interactive data analysis

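Interactive data analysis can be performed in cells that use different kernels, as shown below. Because SoS extends Python 3, SoS cells accept arbitrary Python statements. In subkernel cells, the `%expand` magic interpolates SoS variables written as `{name}` into the cell content before the cell is executed.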

In [1]:
fastq_file = 'raw_data.fastq'
count_file = 'aligned.csv'
output_pdf = 'myfigure.pdf'

In [2]:
%expand
echo count_reads --infile {fastq_file} --outfile {count_file}
echo "A,B" > {count_file}
echo "1,2" >> {count_file}

count_reads --infile raw_data.fastq --outfile aligned.csv
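
As a rough illustration of what `%expand` does to the first line of the shell cell, the sketch below reproduces the interpolation in plain Python with `str.format`. It is only an analogy and not how SoS implements the magic; the `template` and `expanded` names are introduced here for illustration.

```python
# Variables defined in the first SoS cell.
fastq_file = 'raw_data.fastq'
count_file = 'aligned.csv'

# First line of the shell cell, with the {name} placeholders still unexpanded.
template = 'echo count_reads --infile {fastq_file} --outfile {count_file}'

# Replace each placeholder with the value of the matching variable. This
# approximates what %expand does before the script reaches the subkernel.
expanded = template.format(fastq_file=fastq_file, count_file=count_file)
print(expanded)
```

Running the expanded command in a shell prints the `count_reads ...` line shown in the cell output.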


In [3]:
%expand
count.data <- read.csv('{count_file}')
pdf('{output_pdf}')
plot(count.data)
dev.off()

### Conversion to a SoS Workflow

A SoS workflow in a SoS Notebook consists of sections, each introduced by a section header of the form `[name: option]`. Definitions that are shared by all steps belong in a `[global]` section.

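Each subkernel script must also be wrapped in a SoS action (such as `sh:` or `R:`) so that it is executed as a **complete** script, with the `%expand` magic replaced by the action option `expand=True`. Remember to change the kernel of each converted cell from the subkernel to SoS.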

In [4]:
[global]
fastq_file = 'raw_data.fastq'
count_file = 'aligned.csv'
output_pdf = 'myfigure.pdf'

In [5]:
[analysis_10 (align)]
sh: expand=True
    echo count_reads --infile {fastq_file} --outfile {count_file}
    echo "A,B" > {count_file}
    echo "1,2" >> {count_file}

In [6]:
[analysis_20 (plot)]
R: expand=True
    count.data <- read.csv('{count_file}')
    pdf('{output_pdf}')
    plot(count.data)
    dev.off()
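
The numeric suffixes in `analysis_10` and `analysis_20` determine the order in which the steps of the `analysis` workflow run, while `[global]` is not a numbered step. The sketch below mimics that ordering in plain Python. It is an illustration under simplifying assumptions, not SoS's own parser; `split_step`, `headers`, and `steps` are hypothetical names.

```python
import re

# Section headers as they appear in the notebook cells, deliberately unordered.
headers = ['[analysis_20 (plot)]', '[global]', '[analysis_10 (align)]']

def split_step(header):
    """Return (workflow, index, description) for a numbered step header, else None."""
    match = re.fullmatch(r'\[(\w+)_(\d+)(?:\s*\((.*)\))?\]', header)
    if match is None:
        return None  # e.g. [global], which carries no numeric index
    name, index, description = match.groups()
    return name, int(index), description

# Keep only numbered steps and sort them by workflow name, then by index.
steps = sorted(s for s in map(split_step, headers) if s is not None)
for name, index, description in steps:
    print(f'{name}_{index}: {description}')
```

The loop lists `analysis_10` before `analysis_20`, which is the order in which the `align` and `plot` steps are run.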

In [7]:
%preview --workflow

In [8]:
%sosrun

count_reads --infile raw_data.fastq --outfile aligned.csv
null device 
          1 


## Parameters and signatures