Merge f792dce into 0c893ad
aschroed committed May 13, 2019
2 parents 0c893ad + f792dce commit 5f871ae
Showing 8 changed files with 201 additions and 90 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -24,7 +24,7 @@ pip3 install submit4dn --upgrade

### Troubleshooting

If you encounter an error containing something like:
If you encounter an error containing something like:

```
Symbol not found: _PyInt_AsLong
@@ -85,17 +85,17 @@ get_field_info --type Biosample --comments --outfile biosample.xls

Example list of sheets:
~~~~
get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type IndividualMouse --type ExperimentHiC --type ExperimentSetReplicate --type ExperimentCaptureC --type Target --type GenomicRegion --type ExperimentSet --type Image --comments --outfile MetadataSheets.xls
get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type IndividualMouse --type ExperimentHiC --type ExperimentSetReplicate --type ExperimentCaptureC --type BioFeature --type GenomicRegion --type ExperimentSet --type Image --comments --outfile MetadataSheets.xls
~~~~

Example list of sheets: (using python scripts)
~~~~
python3 -m wranglertools.get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type IndividualHuman --type ExperimentHiC --type ExperimentCaptureC --type Target --type GenomicRegion --type ExperimentSet --type ExperimentSetReplicate --type Image --comments --outfile MetadataSheets.xls
python3 -m wranglertools.get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type IndividualHuman --type ExperimentHiC --type ExperimentCaptureC --type BioFeature --type GenomicRegion --type ExperimentSet --type ExperimentSetReplicate --type Image --comments --outfile MetadataSheets.xls
~~~~

Example list of sheets: (Experiment seq)
~~~~
python3 -m wranglertools.get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type ExperimentSeq --type Target --type GenomicRegion --type ExperimentSet --type ExperimentSetReplicate --type Image --comments --outfile exp_seq_all.xls
python3 -m wranglertools.get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentAgent --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type ExperimentSeq --type BioFeature --type GenomicRegion --type ExperimentSet --type ExperimentSetReplicate --type Image --comments --outfile exp_seq_all.xls
~~~~

Example list of sheets: (Experiment seq simple)
21 changes: 19 additions & 2 deletions doc/metadata_submission.md
@@ -69,7 +69,7 @@ In some cases a field value must be formatted in a certain way or the Item will
In other cases a field value must match a certain pattern. For example, if a field requires a DNA sequence then the submitted value must contain only the characters A, T, G, C or N.


_Database Cross Reference (DBxref) fields_, which contain identifiers that refer to external databases, are another case requiring special formatting. In many cases the values of these fields need to be in database\_name:ID format. For example, an SRA experiment identifier would need to be submitted in the form ‘SRA:SRX1234567’ (see also [Basic fields example](#basic-field) above). Note that in a few cases where the field takes only identifiers for one or two specific databases the ID alone can be entered - for example, when entering gene symbols in the *'targeted\_genes’* field of the Target Item you can enter only the gene symbols i.e. PARK2, DLG1.
_Database Cross Reference (DBxref) fields_, which contain identifiers that refer to external databases, are another case requiring special formatting. In many cases the values of these fields need to be in database\_name:ID format. For example, an SRA experiment identifier would need to be submitted in the form ‘SRA:SRX1234567’ (see also [Basic fields example](#basic-field) above).
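
The two formatting rules described above (DNA sequences limited to A, T, G, C or N, and DBxref values in database\_name:ID form) can be checked before submission with a small helper. This is a minimal sketch, not the validation the 4DN schemas actually perform: the regular expressions, the function names, and the choice to accept only uppercase letters are all assumptions made for illustration.

```python
import re

# Assumed patterns for illustration; the real schemas may be stricter or looser.
DNA_SEQUENCE = re.compile(r"[ATGCN]+")   # only A, T, G, C or N (uppercase assumed)
DBXREF = re.compile(r"[^\s:]+:\S+")      # database_name:ID, e.g. SRA:SRX1234567


def is_dna_sequence(value: str) -> bool:
    """Return True if the whole value consists of A, T, G, C or N."""
    return DNA_SEQUENCE.fullmatch(value) is not None


def is_dbxref(value: str) -> bool:
    """Return True if the value looks like database_name:ID."""
    return DBXREF.fullmatch(value) is not None
```

With these helpers, `is_dbxref("SRA:SRX1234567")` passes, while a bare identifier such as `SRX1234567` is flagged so the database prefix can be added before the sheet is submitted.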

####When a field specifies a linked item
Some fields in a Sheet for an Item may contain references to another Item. These may be of the same type or different types. Examples of this type of field include the *‘biosource’* field in Biosample or the *‘files’* field in the ExperimentHiC. Note that the latter is also an example of a list field that can take multiple values.
@@ -264,7 +264,24 @@ The scripts accepts the following parameters:.

**To get the complete list of relevant sheets in one workbook:**

get_field_info --type Publication --type Document --type Vendor --type Protocol --type BiosampleCellCulture --type Biosource --type Enzyme --type Construct --type TreatmentChemical --type TreatmentRnai --type Modification --type Biosample --type FileFastq --type FileSet --type IndividualHuman --type IndividualMouse --type ExperimentHiC --type ExperimentCaptureC --type ExperimentRepliseq --type Target --type GenomicRegion --type ExperimentSet --type ExperimentSetReplicate --type Image --comments --outfile AllItems.xls
get_field_info --type all --comments --outfile AllItems.xls


**You can also generate the sheets needed for a particular type of experiment using pre-set options**

get_field_info --type hic --comments --outfile HiCMetadata.xls

Current presets include:
- hic for most types of Hi-C eg. in situ, dilution, single cell
- chipseq for ChIP-seq
- repliseq for 2-phase or multi-phase Repli-seq
- atacseq for ATAC-seq
- damid for DamID-seq
- chiapet for CHIA-Pet and PLAC-seq
- capturec for Capture Hi-C
- fish for RNA and DNA FISH
- spt for Single Particle Tracking Imaging experiments
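
When sheets for several presets or item types are generated from a script, the command line can be assembled programmatically. The sketch below only builds the argument list from the flags shown above (`--type`, `--comments`, `--outfile`); the helper name `build_get_field_info_command` is hypothetical and is not part of the package.

```python
import shlex


def build_get_field_info_command(types, outfile, comments=True):
    """Build the argv list for get_field_info (hypothetical helper)."""
    argv = ["get_field_info"]
    for item_type in types:
        # Each item type or preset gets its own --type flag.
        argv += ["--type", item_type]
    if comments:
        argv.append("--comments")
    argv += ["--outfile", outfile]
    return argv


cmd = build_get_field_info_command(["hic"], "HiCMetadata.xls")
# Quote each argument so the printed line is safe to paste into a shell.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The resulting list can be passed directly to `subprocess.run` when the `get_field_info` script is installed and on the `PATH`.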


##<a name="rest"></a>Submission of metadata using the 4DN REST API
The 4DN-DCIC metadata database can be accessed using a Hypertext-Transfer-Protocol-(HTTP)-based, Representational-state-transfer (RESTful) application programming interface (API) - aka the REST API. In fact, this API is used by the ```import_data``` script used to submit metadata entered into excel spreadsheets as described [in this document](https://docs.google.com/document/d/1Xh4GxapJxWXCbCaSqKwUd9a2wTiXmfQByzP0P8q5rnE). This API was developed by the [ENCODE][encode] project so if you have experience retrieving data from or submitting data to ENCODE use of the 4DN-DCIC API should be familiar to you. The REST API can be used both for data submission and data retrieval, typically using scripts written in your language of choice. Data objects exchanged with the server conform to the standard JavaScript Object Notation (JSON) format. Libraries written for use with your chosen language are typically used for the network connection, data transfer, and parsing of data (for example, requests and json, respectively for Python). For a good introduction to scripting data retrieval (using GET requests) you can refer to [this page](https://www.encodeproject.org/help/rest-api/) on the [ENCODE][encode] web site that also has a good introduction to viewing and understanding JSON formatted data.
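
As a rough illustration of the retrieval side, the sketch below prepares a GET request that asks for JSON and parses a JSON response body. It uses the standard library's `urllib.request` in place of `requests` so that it has no third-party dependency. The server address, the `biosamples/EXAMPLE` path, and both function names are assumptions for illustration only, and authentication is omitted.

```python
import json
from urllib.request import Request

# Hypothetical server address; replace with the address of the real portal.
BASE_URL = "https://example.org"


def build_get_request(item_path: str) -> Request:
    """Prepare (but do not send) a GET request asking for a JSON response."""
    url = BASE_URL.rstrip("/") + "/" + item_path.strip("/")
    return Request(url, headers={"Accept": "application/json"}, method="GET")


def parse_item(body: bytes) -> dict:
    """Decode a JSON response body into a Python dictionary."""
    return json.loads(body.decode("utf-8"))


request = build_get_request("biosamples/EXAMPLE")  # hypothetical item path
```

Sending the prepared request with `urllib.request.urlopen(request)` and passing the bytes it returns to `parse_item` would complete the round trip against a real server.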
5 changes: 3 additions & 2 deletions doc/schema_info.md
@@ -6,6 +6,7 @@ award.json | Award | award(s)
biosample.json | Biosample | biosample(s)
biosample\_cell\_culture.json | BiosampleCellCulture | biosample-cell-cultures, biosample\_cell\_culture
biosource.json | Biosource | biosource(s)
bio_feature.json | BioFeature | bio-features, bio\_feature
construct.json | Construct | construct(s)
document.json | Document | document(s)
enzyme.json | Enzyme | enzyme(s)
@@ -19,6 +20,7 @@ file\_fastq.json | FileFastq | files-fastq, file\_fastq
file\_processed.json | FileProcessed | files-processed, file\_processed
file\_reference.json | FileReference | files-reference, file\_reference
file\_set.json | FileSet | file-sets, file\_set
gene.json | Gene | gene(s)
genomic\_region.json | GenomicRegion | genomic-regions, genomic\_region
image.json | Image | image(s)
individual\_human.json | IndividualHuman | individuals-human, individual\_human
@@ -37,8 +39,7 @@ software.json | Software | software(s)
sop\_map.json | SopMap | sop-maps, sop\_map
summary\_statistic.json | SummaryStatistic | summary-statistics, summary\_statistic
summary\_statistic\_hi\_c.json | SummaryStatisticHiC | summary-statistics-hi-c, summary\_statistic\_hi\_c
target.json | Target | target(s)
treatment\_chemical.json | TreatmentChemical | treatments-chemical, treatment\_chemical
treatment\_agent.json | TreatmentAgent | treatments-agent, treatment\_agent
treatment\_rnai.json | TreatmentRnai | treatments-rnai, treatment\_rnai
user.json | User | user(s)
vendor.json | Vendor | vendor(s)