update doc for better rendering
Juke34 committed Apr 15, 2021
1 parent 4bb78db commit 911a10a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/gxf.md
@@ -407,7 +407,7 @@ Originally Ensembl has created the GTF format that has been then slightly modifi
### Main points and differences between GFF formats

format version | year | col1 - seqname | col2 - source | col3 - feature | col4 - start | col5 - end | col6 - score | col7 - strand | col8 - frame | col9 - attribute | Comment
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
GFF1| 1997 | | | can be anything | integer | integer | numerical value or 0 | '+', '-' or '.' | '0', '1', '2' or '.' | This field was originally called group. An optional string-valued field that can be used as a name to group together a set of records. | Each string had to be under 256 characters and the whole line under 32,000 characters |
GFF2| 2000 | | | can be anything | integer | integer | numerical value or '.' | '+', '-' or '.' | '0', '1', '2' or '.' | This optional field must have a tag-value structure following the syntax used within objects in a .ace file, flattened onto one line by semicolon separators. Tags must be standard identifiers ([A-Za-z][A-Za-z0-9_]*). Free text values must be quoted with double quotes. Note: all non-printing characters in such free text value strings (e.g. newlines, tabs, control characters, etc.) must be explicitly represented by their C (UNIX) style backslash-escaped representation (e.g. newlines as '\n', tabs as '\t'). As in ACEDB, multiple values can follow a specific tag. Form: **Target "HBA_HUMAN" 11 55 ; E_value 0.0003** | The START and STOP codons are included in the CDS |
GFF3| 2004 | \[a-zA-Z0-9.:^*$@!+_?-\|\] | | Column renamed to `<type>`. This is constrained to be either a term from the Sequence Ontology or an SO accession number. | integer | integer | numerical value or '.' | '+', '-', '.' or '?' | Column renamed to `<phase>`. '0', '1', '2' or '.' | Multiple tag=value pairs are separated by semicolons. URL escaping rules are used for tags or values containing the following characters: ",=;". Spaces are allowed in this field, but tabs must be replaced with the %09 URL escape. Attribute values do not need to be and should not be quoted. The quotes should be included as part of the value by parsers and not stripped. Form: **ID=cds00004;Parent=mRNA00001,mRNA00002;Name=edenprotein.4**. Some tags have predefined meanings; they start with a capital letter. The ID attributes are only mandatory for those features that have children (the gene and mRNAs), or for those that span multiple lines. Consequently, features having parents must have the Parent attribute. | The START and STOP codons are included in the CDS
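
The difference between the GFF2 and GFF3 attribute columns described in the table can be illustrated with a minimal Python sketch. It is not part of the documented tooling: the function names `parse_gff3_attributes` and `parse_gff2_attributes` are hypothetical, only the standard library is used, and the handling of stray whitespace and of unknown backslash escapes is an assumption rather than something either specification mandates.

```python
import re
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Parse a GFF3 column 9 string into {tag: [values]} (illustrative only)."""
    attributes = {}
    for pair in column9.split(";"):
        pair = pair.strip()  # assumption: tolerate spaces around separators
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Split on commas before decoding %XX escapes, so that an escaped
        # comma (%2C) stays inside a single value.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


# One token is a double-quoted string, a semicolon, or a bare word.
_GFF2_TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"|(;)|([^\s;"]+)')
_C_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}


def _decode_c_escapes(text):
    # Unknown escapes are left untouched (an assumption of this sketch).
    return re.sub(r"\\(.)", lambda m: _C_ESCAPES.get(m.group(1), m.group(0)), text)


def parse_gff2_attributes(column9):
    """Parse a GFF2 column 9 string into {tag: [values]} (illustrative only)."""
    attributes = {}
    tag = None
    for match in _GFF2_TOKEN.finditer(column9):
        quoted, semicolon, bare = match.groups()
        if semicolon is not None:
            tag = None  # the next token starts a new tag
        elif tag is None:
            tag = bare if bare is not None else quoted
            attributes.setdefault(tag, [])
        elif quoted is not None:
            attributes[tag].append(_decode_c_escapes(quoted))
        else:
            attributes[tag].append(bare)
    return attributes


if __name__ == "__main__":
    print(parse_gff3_attributes("ID=cds00004;Parent=mRNA00001,mRNA00002;Name=edenprotein.4"))
    print(parse_gff2_attributes('Target "HBA_HUMAN" 11 55 ; E_value 0.0003'))
```

Both functions return every value as a string inside a list, since GFF2 allows several values after a tag and GFF3 allows comma-separated values; converting numbers such as `11` or `0.0003` is left to the caller.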