location fix
ktpolanski committed Dec 13, 2016
1 parent 3bbb664 commit 116ee36
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -14,15 +14,15 @@ CSI tries to explain the expression of a child gene by taking the expression of

## Test Run

-If you want to take CSI out for a spin without using your own data, this can be done with the aid of one of the 10-gene synthetic networks originally used in the CSI and hCSI publications. The dataset to be used on input can be found at `cyverseuk/csi_testdata/dream4_5.csv` under Community Data. Leave all the parameter values as defaults.
+If you want to take CSI out for a spin without using your own data, this can be done with the aid of one of the 10-gene synthetic networks originally used in the CSI and hCSI publications. The dataset to be used on input can be found at `iplantcollaborative/example_data/cyverseuk/csi_testdata/dream4_5.csv` under Community Data. Leave all the parameter values as defaults.

## Input in Detail

**Note: CSI is quite computationally intensive.** Consider doing some preliminary analysis before feeding your data into it, with gene totals not exceeding the order of hundreds being preferable. Some good ways to reduce dimensionality include performing differential expression analysis and limiting the gene lists to transcription factors only.

### Gene Expression CSV

-**Obligatory input.** Comma-delimited file, with expression data ordered to have genes as rows and time points as columns. In terms of headers, the first column should contain gene IDs, the first row should contain replicate names (repeated for each time point part of the replicate), and the second row should contain the corresponding time of the time point in that replicate. For reference on formatting, consult `cyverseuk/csi_testdata/dream4_5.csv` under Community Data. You can use multiple conditions on input, but CSI will treat them as replicates and not individual conditions, subsequently attempting to infer a joint regulatory model across all of the provided data.
+**Obligatory input.** Comma-delimited file, with expression data ordered to have genes as rows and time points as columns. In terms of headers, the first column should contain gene IDs, the first row should contain replicate names (repeated for each time point part of the replicate), and the second row should contain the corresponding time of the time point in that replicate. For reference on formatting, consult `iplantcollaborative/example_data/cyverseuk/csi_testdata/dream4_5.csv` under Community Data. You can use multiple conditions on input, but CSI will treat them as replicates and not individual conditions, subsequently attempting to infer a joint regulatory model across all of the provided data.
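
The layout described above (gene IDs in the first column, replicate names in the first row, time points in the second row) can be checked before submitting a job. The following is a minimal sketch in Python, not part of CSI itself: the function name `parse_expression_csv`, the inline example data, and the assumption that the first cell of each header row is empty or ignored are all illustrative and not taken from the CSI documentation.

```python
import csv
import io
from collections import defaultdict


def parse_expression_csv(handle):
    """Parse a genes-by-time-points CSV that has two header rows.

    Hypothetical helper: returns {replicate: {gene: [(time, value), ...]}}.
    """
    rows = [row for row in csv.reader(handle) if row]
    if len(rows) < 3:
        raise ValueError("expected two header rows and at least one gene row")

    # First row: replicate name for every data column (first cell is ignored).
    replicates = rows[0][1:]
    # Second row: time of each data column within its replicate.
    times = [float(t) for t in rows[1][1:]]
    if len(replicates) != len(times):
        raise ValueError("replicate and time header rows differ in length")

    data = defaultdict(dict)
    for row in rows[2:]:
        gene, values = row[0], row[1:]
        if len(values) != len(replicates):
            raise ValueError(
                f"gene {gene!r} has {len(values)} values, "
                f"expected {len(replicates)}"
            )
        for rep, t, v in zip(replicates, times, values):
            data[rep].setdefault(gene, []).append((t, float(v)))
    return dict(data)


# Made-up example: two replicates, three time points each, two genes.
EXAMPLE = """\
,rep1,rep1,rep1,rep2,rep2,rep2
,0,1,2,0,1,2
G1,0.1,0.4,0.9,0.2,0.5,1.0
G2,1.0,0.7,0.3,0.9,0.6,0.2
"""

parsed = parse_expression_csv(io.StringIO(EXAMPLE))
print(sorted(parsed))        # replicate names found in the header
print(parsed["rep1"]["G1"])  # (time, expression) pairs for one gene
```

Grouping columns by replicate name mirrors the note that CSI treats multiple conditions as replicates; whether CSI reads the file in this way internally is not stated in the README.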

### Parental Set Depth

@@ -44,7 +44,7 @@ CSI's EM optimisation of the individual parent set fits sees a lot of computatio

### Transcription Factor List

-By default, CSI treats all the genes provided on input as transcription factors and identifies a network of interactions between them. Every gene is allowed to be both a parent and a child. However, in some specific use cases, a user may want to model the influence of transcription factors on very relevant downstream genes which are not transcription factors themselves. As such, allowing them as parents in the modelling would be contradictory with their biological nature. Providing a list of transcription factors, one line per gene ID (refer to `cyverseuk/csi_testdata/tflist.txt` under Community Data), will tell CSI which of the genes it's allowed to use as parents and which are only to be used as children. Computational constraints apply as always, be very selective with your downstream targets.
+By default, CSI treats all the genes provided on input as transcription factors and identifies a network of interactions between them. Every gene is allowed to be both a parent and a child. However, in some specific use cases, a user may want to model the influence of transcription factors on very relevant downstream genes which are not transcription factors themselves. As such, allowing them as parents in the modelling would be contradictory with their biological nature. Providing a list of transcription factors, one line per gene ID (refer to `iplantcollaborative/example_data/cyverseuk/csi_testdata/tflist.txt` under Community Data), will tell CSI which of the genes it's allowed to use as parents and which are only to be used as children. Computational constraints apply as always, be very selective with your downstream targets.
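
To illustrate the parent/child distinction described above, here is a small Python sketch of how a one-ID-per-line transcription factor list partitions the input genes. It is an illustration only, not CSI's implementation: `read_tf_list`, `split_roles`, and the gene IDs are hypothetical, and rejecting listed IDs that are missing from the expression data is an assumed policy.

```python
def read_tf_list(lines):
    """Return gene IDs from a one-ID-per-line list, skipping blank lines."""
    return [line.strip() for line in lines if line.strip()]


def split_roles(all_genes, tf_ids):
    """Split genes into allowed parents (TFs) and child-only genes.

    Every gene can still be modelled as a child; only the listed
    transcription factors may act as parents.
    """
    tf_set = set(tf_ids)
    unknown = sorted(tf_set - set(all_genes))
    if unknown:
        # Assumed policy: a listed TF absent from the expression data is an error.
        raise ValueError(f"TF IDs missing from expression data: {unknown}")
    parents = [g for g in all_genes if g in tf_set]
    child_only = [g for g in all_genes if g not in tf_set]
    return parents, child_only


# Made-up gene IDs; the trailing blank line in the list is ignored.
genes = ["G1", "G2", "G3", "G4"]
tf_ids = read_tf_list("G1\nG3\n\n".splitlines())
parents, child_only = split_roles(genes, tf_ids)
print(parents)     # genes that may be used as parents
print(child_only)  # downstream targets, modelled as children only
```

Running a check like this before submission makes it easy to see how many child-only targets are being added, which matters given the computational constraints mentioned above.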

### Data Normalisation
