file_formats
The system directly supports the following standard data file formats:
- BED6: http://genome.ucsc.edu/FAQ/FAQformat.html#format1
- BedGraph: http://genome.ucsc.edu/goldenPath/help/bedgraph.html
- NarrowPeak: http://genome.ucsc.edu/FAQ/FAQformat.html#format12
- BroadPeak: http://genome.ucsc.edu/FAQ/FAQformat.html#format13
- VCF: https://samtools.github.io/hts-specs/VCFv4.2.pdf
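As an illustration of what a file in one of these formats contains, here is a minimal Python sketch that parses and checks a single BED6 line. It is not part of the system: the names `Bed6Record` and `parse_bed6_line` are hypothetical, and it assumes tab-separated fields in the UCSC order (chrom, chromStart, chromEnd, name, score, strand) with an integer score.

```python
from typing import NamedTuple


class Bed6Record(NamedTuple):
    """One BED6 region (hypothetical helper type, for illustration only)."""
    chrom: str
    start: int   # chromStart, 0-based
    end: int     # chromEnd, exclusive
    name: str
    score: int
    strand: str  # "+", "-" or "."


def parse_bed6_line(line: str) -> Bed6Record:
    """Parse one tab-separated BED6 line, raising ValueError if it is malformed."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 6:
        raise ValueError(f"expected 6 tab-separated fields, got {len(fields)}")
    chrom, start, end, name, score, strand = fields
    # int() itself raises ValueError on non-numeric coordinates or score.
    start_i, end_i, score_i = int(start), int(end), int(score)
    if start_i < 0 or end_i < start_i:
        raise ValueError(f"invalid coordinates: {start_i}-{end_i}")
    if strand not in {"+", "-", "."}:
        raise ValueError(f"invalid strand: {strand!r}")
    return Bed6Record(chrom, start_i, end_i, name, score_i, strand)


record = parse_bed6_line("chr1\t100\t200\tpeak1\t500\t+\n")
print(record.chrom, record.end - record.start)
```

A check of this kind can be run on a file before upload to confirm that it matches the "File type" to be selected.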
Users can upload data files in any of the above formats, provided they select the correct "File type" value in the "Add/Upload new dataset" feature of the Web interface. Alternatively, users can upload data files in a Custom format (either GTF or TAB-delimited). In this case, they must also provide an XML schema file that describes the fields (and their types) used in the data files. The recommended way to create such a schema file is to download one of the ready-to-use XML schemas in the gmql_conf folder and modify its field names and types to match the GTF or TAB-delimited data file format. Note that, in the XML schema file, field type is defined by the type attribute in the gmqlSchema element.
For further details on the GMQL schema format in XML, please refer to the folder gmql_conf and the paper Masseroli et al. (2016) [PDF].