From 8364699b52b3a5d8d00b5b1f064f75b76f7fd1c0 Mon Sep 17 00:00:00 2001 From: Jon Ison Date: Mon, 9 May 2016 16:02:48 +0100 Subject: [PATCH] Misc changes from GitHub metagenomics concepts, simplification of visualisation operations, removing redundant synonyms etc. --- EDAM_dev.owl | 637 +++++++++++++++++++++++++++++++-------------------------- HOW_TO_EDIT.md | 3 + 2 files changed, 354 insertions(+), 286 deletions(-) diff --git a/EDAM_dev.owl b/EDAM_dev.owl index a350c65..5076f29 100644 --- a/EDAM_dev.owl +++ b/EDAM_dev.owl @@ -33,6 +33,7 @@ formats "EDAM data formats" EDAM Jon Ison, Matus Kalas, Hervé Ménager + 24:02:2016 21:54GMT identifiers "EDAM types of identifiers" data "EDAM types of data" relations "EDAM relations" @@ -48,10 +49,9 @@ 3730 Matúš Kalaš EDAM_format http://edamontology.org/format_ "EDAM data formats" - 1.15_dev topics "EDAM topics" - 24:02:2016 21:54GMT Hervé Ménager + 1.15_dev EDAM is an ontology of well established, familiar concepts that are prevalent within bioinformatics, including types of data and data identifiers, data formats, operations and topics. EDAM is a simple ontology - essentially a set of terms with synonyms and definitions - organised into an intuitive hierarchy for convenient use by curators, software developers and end-users. EDAM is suitable for large-scale semantic annotations and categorization of diverse bioinformatics resources. EDAM is also suitable for diverse application including for example within workbenches and workflow-management systems, software distributions, and resource registries. @@ -418,8 +418,8 @@ - In very unusual cases. true + In very unusual cases. @@ -465,16 +465,16 @@ - true - In very unusual cases. 
+ OBO_REL:has_participant + 'OBO_REL:has_participant' is narrower in the sense that it only relates ontological categories (concepts) that are a 'process' (span:Process) with ontological categories that are a 'continuant' (snap:Continuant), and broader in the sense that it relates with any participating objects not just inputs or input arguments of the subject. - + - 'OBO_REL:has_participant' is narrower in the sense that it only relates ontological categories (concepts) that are a 'process' (span:Process) with ontological categories that are a 'continuant' (snap:Continuant), and broader in the sense that it relates with any participating objects not just inputs or input arguments of the subject. - OBO_REL:has_participant + true + In very unusual cases. - + @@ -505,8 +505,8 @@ - true In very unusual cases. + true @@ -541,8 +541,8 @@ - true In very unusual cases. + true @@ -598,22 +598,22 @@ - In very unusual cases. - true + OBO_REL:inheres_in + Is defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:inheres_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'specifically_dependent_continuant' (snap:SpecificallyDependentContinuant) with ontological categories that are an 'independent_continuant' (snap:IndependentContinuant), and broader in the sense that it relates any borne subjects not just functions. - + - OBO_REL:function_of Is defined anywhere? Not in the 'unknown' version of RO. 'OBO_REL:function_of' only relates subjects that are a 'function' (snap:Function) with objects that are an 'independent_continuant' (snap:IndependentContinuant), so for example no processes. It does not define explicitly that the subject is a function of the object. + OBO_REL:function_of - OBO_REL:inheres_in - Is defined anywhere? Not in the 'unknown' version of RO. 
'OBO_REL:inheres_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'specifically_dependent_continuant' (snap:SpecificallyDependentContinuant) with ontological categories that are an 'independent_continuant' (snap:IndependentContinuant), and broader in the sense that it relates any borne subjects not just functions. + true + In very unusual cases. - + @@ -694,14 +694,14 @@ - 'OBO_REL:participates_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'continuant' (snap:Continuant) with ontological categories that are a 'process' (span:Process), and broader in the sense that it relates any participating subjects not just outputs or output arguments. It is also not clear whether an output (result) actually participates in the process that generates it. OBO_REL:participates_in + 'OBO_REL:participates_in' is narrower in the sense that it only relates ontological categories (concepts) that are a 'continuant' (snap:Continuant) with ontological categories that are a 'process' (span:Process), and broader in the sense that it relates any participating subjects not just outputs or output arguments. It is also not clear whether an output (result) actually participates in the process that generates it. - In very unusual cases. true + In very unusual cases. @@ -741,8 +741,8 @@ - In very unusual cases. true + In very unusual cases. @@ -800,22 +800,22 @@ - EDAM does not distinguish a data record (a tool-understandable information artefact) from data or datum (its content, the tool-understandable encoding of an information). - Data record + EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set). + Data set - + - EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set). 
Datum + EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set). - EDAM does not distinguish the multiplicity of data, such as one data item (datum) versus a collection of data (data set). - Data set + Data record + EDAM does not distinguish a data record (a tool-understandable information artefact) from data or datum (its content, the tool-understandable encoding of an information). - + @@ -5998,8 +5998,8 @@ - A protein entity has the MIRIAM data type 'UniProt', and an enzyme has the MIRIAM data type 'Enzyme Nomenclature'. UniProt|Enzyme Nomenclature + A protein entity has the MIRIAM data type 'UniProt', and an enzyme has the MIRIAM data type 'Enzyme Nomenclature'. @@ -9112,13 +9112,13 @@ - + - + beta12orEarlier @@ -18402,13 +18402,13 @@ - + - + beta12orEarlier @@ -20464,13 +20464,13 @@ - + - + Identifier of a lipid. @@ -23894,6 +23894,60 @@ + + + + Sequencing metadata name + + 1.15 + Data concerning a sequencing experiment that may be specified as an input to some tool. + + + + + + + + + + Flow cell + + A flow cell is used to immobilise, amplify and sequence millions of molecules at once. In Illumina machines, a flow cell is composed of 8 "lanes", which allow 8 experiments in a single analysis. + An identifier of a flow cell of a sequencing machine. + 1.15 + + + + + + + + + + Lane + + An identifier of a lane within a flow cell of a sequencing machine, within which millions of sequences are immobilized, amplified and sequenced. + 1.15 + + + + + + + + + + Run + + 1.15 + A number corresponding to the number of an analysis performed by a sequencing machine. For example, if it's the 13th analysis, the run is 13. + + + + + + + @@ -26526,8 +26580,8 @@ - Data model A defined data format has its implicit or explicit data model, and EDAM does not distinguish the two. 
Some data models however do not have any standard way of serialisation into an exchange format, and those are thus not considered formats in EDAM. (Remark: even broader - or closely related - term to 'Data model' would be an 'Information model'.) + Data model @@ -29376,19 +29430,19 @@ - + - + - + BioXSD XML format @@ -29874,13 +29928,13 @@ - + - + Format of a bibliographic reference. @@ -32442,8 +32496,8 @@ pepXML - + http://sashimi.sourceforge.net/schema_revision/pepXML/pepXML_v118.xsd Open data format for the storage, exchange, and processing of peptide sequence assignments of MS/MS scans, intended to provide a common data output format for many different MS/MS search engines and subsequent peptide-level analyses. 1.12 @@ -33066,8 +33120,8 @@ experiments employing a combination of technologies. - Computational tool Computational tool provides one or more operations. + Computational tool @@ -33087,14 +33141,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -33424,14 +33478,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -33481,14 +33535,14 @@ experiments employing a combination of technologies. - - + + - - + + This might be a residue-level search for properties such as solvent accessibility, hydropathy, secondary structure, ligand-binding etc. @@ -33741,14 +33795,14 @@ experiments employing a combination of technologies. - - + + - - + + @@ -34045,14 +34099,14 @@ experiments employing a combination of technologies. - - + + - - + + Analyse experimental protein-protein interaction data from for example yeast two-hybrid analysis, protein microarrays, immunoaffinity chromatography followed by mass spectrometry, phage display etc. @@ -34101,14 +34155,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -34245,14 +34299,14 @@ experiments employing a combination of technologies. 
- - + + - - + + beta12orEarlier @@ -34306,7 +34360,7 @@ experiments employing a combination of technologies. - + @@ -34318,20 +34372,20 @@ experiments employing a combination of technologies. - - + + - - + + - - + + beta12orEarlier @@ -34391,14 +34445,14 @@ experiments employing a combination of technologies. - - + + - - + + Sequence distance matrix construction @@ -34545,14 +34599,14 @@ experiments employing a combination of technologies. - - + + - - + + @@ -34578,8 +34632,8 @@ experiments employing a combination of technologies. - - + + @@ -34590,8 +34644,8 @@ experiments employing a combination of technologies. - - + + Structural profile generation @@ -34611,20 +34665,20 @@ experiments employing a combination of technologies. - - + + - - + + - - + + Sequence profile alignment @@ -34646,14 +34700,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -34812,14 +34866,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -34837,14 +34891,14 @@ experiments employing a combination of technologies. - - + + - - + + @@ -34888,14 +34942,14 @@ experiments employing a combination of technologies. - - + + - - + + @@ -34929,26 +34983,26 @@ experiments employing a combination of technologies. - - + + - - + + - - + + - - + + Predict and/or optimize oligonucleotide probes for DNA microarrays, for example for transcription profiling of genes, or for genomes and gene families. @@ -34968,14 +35022,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -35061,12 +35115,15 @@ experiments employing a combination of technologies. + Metagenomic inference Expression profiling + The measurement of the expression of multiple genes in a cell, tissue, sample etc., in order to get an impression of biological function. Gene expression profile construction Functional profiling - Generate a gene expression profile or pattern, for example from microarray data. 
beta12orEarlier Gene expression profile generation + Metagenomic inference is the profiling of phylogenetic marker genes in order to predict metagenome function. + Gene expression profiling generates some sort of gene expression profile or pattern, for example from microarray data. @@ -35249,14 +35306,14 @@ experiments employing a combination of technologies. - - + + - - + + Phylogenetic trees are usually constructed from a set of sequences from which an alignment (or data matrix) is calculated. @@ -35400,13 +35457,13 @@ experiments employing a combination of technologies. - + - + Protein SNP mapping @@ -35443,14 +35500,14 @@ experiments employing a combination of technologies. - - + + - - + + Predict and optimise zinc finger protein domains for DNA/RNA binding (for example for transcription factors and nucleases). @@ -35468,14 +35525,14 @@ experiments employing a combination of technologies. - - + + - - + + beta12orEarlier @@ -35525,8 +35582,8 @@ experiments employing a combination of technologies. - - + + @@ -35537,8 +35594,8 @@ experiments employing a combination of technologies. - - + + Visualization @@ -35925,13 +35982,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - + - + @@ -36513,14 +36570,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Calculate pH-dependent properties from pKa calculations of a protein sequence. 
@@ -36781,14 +36838,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Sequence feature detection (nucleic acid) @@ -37878,14 +37935,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + beta12orEarlier @@ -38042,7 +38099,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Dotplot plotting - + @@ -38284,14 +38341,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + RNA secondary structure alignment generation @@ -38794,7 +38851,6 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Gene expression profile analysis true - Functional profiling beta12orEarlier Analyse one or more gene expression profiles, typically to interpret them in functional terms. 1.6 @@ -39093,14 +39149,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Identify a plausible model of DNA substitution that explains a molecular (DNA or protein) sequence alignment. @@ -39147,14 +39203,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Predict families of genes and gene function based on their position in a phylogenetic tree. @@ -39329,18 +39385,19 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + + Visualise, format or render a molecular sequence or sequences such as a sequence alignment, possibly with sequence features or properties shown. beta12orEarlier - Visualise, format or render a molecular sequence, possibly with sequence features or properties shown. 
+ Sequence alignment visualisation Sequence rendering @@ -39352,25 +39409,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Sequence alignment visualisation - - - - - - - - - - - - - - - Sequence alignment rendering + + 1.15 + true Visualise, format or print a molecular sequence alignment. beta12orEarlier - - + + @@ -39420,19 +39465,13 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern RNA secondary structure visualisation - - - - - - - - - RNA secondary structure rendering + + true + 1.15 Visualise RNA secondary structure, knots, pseudoknots etc. beta12orEarlier - - + + @@ -39440,19 +39479,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - Protein secondary structure rendering Protein secondary structure visualisation - - - - - - - + Render and visualise protein secondary structure. beta12orEarlier - - + true + 1.15 + + @@ -39475,9 +39509,11 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - Structure rendering - Visualise or render a molecular tertiary structure, for example a high-quality static picture or animation. beta12orEarlier + Visualise or render molecular structure, for example a high-quality static picture or animation. This includes secondary structure such as knots, pseudoknots etc. as well as tertiary and quaternary structure. 
+ Structure rendering + RNA secondary structure visualisation + Protein secondary structure visualisation @@ -40566,7 +40602,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Transmembrane protein visualisation - + @@ -40698,14 +40734,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Structure analysis (protein) @@ -40861,14 +40897,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + This is a broad concept and is used a placeholder for other, more specific concepts. @@ -40888,14 +40924,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Analyse known protein secondary structure data. @@ -41411,14 +41447,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Compare two or more molecular sequences. @@ -41615,14 +41651,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Identify or predict protein-protein interactions, interfaces, binding sites etc. @@ -41813,19 +41849,19 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - + - - + + - - + + Simulate molecular (typically protein) conformation using a computational model of physical forces and computer simulation. @@ -41911,14 +41947,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + Analyse nucleic acid tertiary structural data. 
@@ -41968,7 +42004,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Helical wheel drawing - + @@ -41988,7 +42024,7 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern Topology diagram drawing - + @@ -42604,14 +42640,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + beta12orEarlier @@ -42909,14 +42945,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + beta12orEarlier @@ -43009,14 +43045,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + beta12orEarlier @@ -43479,14 +43515,14 @@ sequences matching a given sequence motif or pattern, such as a Prosite pattern - - + + - - + + @@ -44370,13 +44406,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - + - + @@ -44495,14 +44531,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - - + + - - + + Generate a checksum of a molecular sequence. @@ -44554,13 +44590,13 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - + - + 1.4 @@ -44605,14 +44641,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - - + + - - + + Recognition of which format the given data is in. @@ -45484,12 +45520,6 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - - - - - - @@ -45500,6 +45530,12 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp + + + + + + 1.12 Identify semantic relationships within a text or between two or more texts using text mining techniques. 
@@ -45614,14 +45650,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp - - + + - - + + 1.12 @@ -46114,6 +46150,33 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp + + + + Cross-assembly + + 1.15 + Construction of a single sequence assembly of all reads from different samples, typically as part of a comparative metagenomic analysis. + Sequence assembly (cross-assembly) + + + + + + + + + + Sample comparison + + 1.15 + The comparison of samples from a metagenomics study, for example, by comparison of metagenome shotgun reads or assembled contig sequences, by comparison of functional profiles, or some other method. + + + + + + + @@ -46583,12 +46646,14 @@ Trim sequences (typically from an automated DNA sequencer) to remove sequence-sp Proteomics beta12orEarlier + Metaproteomics Protein and peptide identification, especially in the study of whole proteomes of organisms. Protein and peptide identification Peptide identification Proteomics includes any methods (especially high-throughput) that separate, characterize and identify expressed proteins such as mass spectrometry, two-dimensional gel electrophoresis and protein microarrays, as well as in-silico methods that perform proteolytic or mass calculations on a protein sequence and other analyses of protein expression data, for example in different cells or tissues. true http://purl.bioontology.org/ontology/MSH/D040901 + Includes metaproteomics: proteomics analysis of an environmental sample. Protein expression diff --git a/HOW_TO_EDIT.md b/HOW_TO_EDIT.md index fb9967b..8163cee 100644 --- a/HOW_TO_EDIT.md +++ b/HOW_TO_EDIT.md @@ -128,6 +128,9 @@ Note that : - Exact synonym (`oboInOwl:hasExactSynonym`) - bog-standard synyonsm - Narrow synonym (`oboInOwl:hasNarrowSynonym`) - specialisms of the term - Broad synonym (`oboInOwl:hasBroadSynonym`) - generalisations of the term + +NB: Do **not** include American spellings or case variants as synonyms. 
+ - The **definition** should be a concise and lucid description of the concept, without acronyms, and avoiding jargon. - Peripheral but important information can go in the **comment** (`rdfs:comment`).