Commit

Update Readme
NicoRiedel committed Oct 7, 2020
1 parent c86abdb commit f60127d
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions README.md
@@ -104,8 +104,8 @@ Individual keyword categories:
| UPON_REQUEST | Phrase describing that data are only available upon request | ("upon request" OR "on request" OR "upon reasonable request") |
| ALL_DATA | Set of words describing all data or raw data | ("all data" OR "all array data" OR "raw data" OR "full data set" OR "full dataset" OR "crystallographic data" OR "subject-level data") |
| NOT_DATA | Set of negations of the data phrases | ("not all data" OR "not all array data" OR "no raw data" OR "no full data set" OR "no full dataset") |
| FIELD_SPECIFIC_REPO | Set of names and abbreviations of field-specific repositories | ("GEO" OR "Gene Expression Omnibus" OR "European Nucleotide Archive" OR "National Center for Biotechnology Information" OR "European Molecular Biology Laboratory" OR "EMBL-EBI" OR "BioProject" OR "Sequence Read Archive" OR "SRA" OR "ENA" OR "MassIVE" OR "ProteomeXchange" OR "Proteome Exchange" OR "ProteomeExchange" OR "MetaboLights" OR "Array-Express" OR "ArrayExpress" OR "Array Express" OR "PRIDE" OR "DNA Data Bank of Japan" OR "DDBJ" OR "Genbank" OR "Protein Databank" OR "Protein Data Bank" OR "PDB" OR "Metagenomics Rapid Annotation using Subsystem Technology" OR "MG-RAST" OR "metabolights" OR "OpenAgrar" OR "Open Agrar" OR "Electron microscopy data bank" OR "emdb" OR "Cambridge Crystallographic Data Centre" OR "CCDC" OR "Treebase" OR "dbSNP" OR "dbGaP" OR "IntAct" OR "ClinVar" OR "accession number" OR "accession code" OR "accession numbers" OR "accession codes") |
| ACCESSION_NR | Set of regular expressions that represent the accession number formats of different (biomedicine-related) repositories | ("G(SE\|SM\|DS\|PL)[[:digit:]]{2,}" OR "PRJ(E\|D\|N\|EB\|DB\|NB)[:digit:]+" OR "SAM(E\|D\|N)[A-Z]?[:digit:]+" OR "[A-Z]{1}[:digit:]{5}" OR "[A-Z]{2}[:digit:]{6}" OR "[A-Z]{3}[:digit:]{5}" OR "[A-Z]{4,6}[:digit:]{7,9}" OR "GCA_[:digit:]{9}\\.[:digit:]+" OR "PRJNA[[:digit:]]{3,}" OR "SR(P\|R\|X\|S\|Z)[[:digit:]]{3,}" OR "E-[A-Z]{4}-[:digit:]{1,}" OR "[:digit:]{1}[A-Z]{1}[[:alnum:]]{2}" OR "MTBLS[[:digit:]]{2,}" OR "10.17590" OR "10.5073" OR "EMD-[[:digit:]]{4,}" OR "[[:digit:]]{7}" OR "[A-Z]{2}_[:digit:]{6,}" OR "[A-Z]{2}-[:digit:]{4,}") |
| FIELD_SPECIFIC_REPO | Set of names and abbreviations of field-specific repositories | ("GEO" OR "Gene Expression Omnibus" OR "European Nucleotide Archive" OR "National Center for Biotechnology Information" OR "European Molecular Biology Laboratory" OR "EMBL-EBI" OR "BioProject" OR "Sequence Read Archive" OR "SRA" OR "ENA" OR "MassIVE" OR "ProteomeXchange" OR "Proteome Exchange" OR "ProteomeExchange" OR "MetaboLights" OR "Array-Express" OR "ArrayExpress" OR "Array Express" OR "PRIDE" OR "DNA Data Bank of Japan" OR "DDBJ" OR "Genbank" OR "Protein Databank" OR "Protein Data Bank" OR "PDB" OR "Metagenomics Rapid Annotation using Subsystem Technology" OR "MG-RAST" OR "metabolights" OR "OpenAgrar" OR "Open Agrar" OR "Electron microscopy data bank" OR "emdb" OR "Cambridge Crystallographic Data Centre" OR "CCDC" OR "Treebase" OR "dbSNP" OR "dbGaP" OR "IntAct" OR "ClinVar" OR "European Variation Archive" OR "dbVar" OR "Mgnify" OR "NCBI Trace Archive" OR "NCBI Assembly" OR "UniProtKB" OR "Protein Circular Dichroism Data Bank" OR "PCDDB" OR "Crystallography Open Database" OR "Coherent X-ray Imaging Data Bank" OR "CXIDB" OR "Biological Magnetic Resonance Data Bank" OR "BMRB" OR "Worldwide Protein Data Bank" OR "wwPDB" OR "Structural Biology Data Grid" OR "NeuroMorpho" OR "G-Node" OR "Neuroimaging Informatics Tools and Resources Collaboratory" OR "NITRC" OR "EBRAINS" OR "GenomeRNAi" OR "Database of Interacting Proteins" OR "IntAct" OR "Japanese Genotype-phenotype Archive" OR "Biological General Repository for Interaction Datasets" OR "PubChem" OR "Genomic Expression Archive" OR "PeptideAtlas" OR "Environmental Data Initiative" OR "LTER Network Information System Data Portal" OR "Global Biodiversity Information Facility" OR "GBIF" OR "Integrated Taxonomic Information System" OR "ITIS" OR "Knowledge Network for Biocomplexity" OR "Morphobank" OR "Kinetic Models of Biological Systems" OR "KiMoSys" OR "The Network Data Exchange" OR "NDEx" OR "FlowRepository" OR "ImmPort" OR "Image Data Resource" OR "Cancer Imaging Archive" OR "SICAS Medical Image Repository" OR "Coherent X-ray Imaging Data Bank" OR "CXIDB" OR "Cell Image Library" OR "Eukaryotic Pathogen Database Resources" OR "EuPathDB" OR "Influenza Research Database" OR "Mouse Genome Informatics" OR "Rat Genome Database" OR "VectorBase" OR "Xenbase" OR "Zebrafish Model Organism Database" OR "ZFIN" OR "HIV Data Archive Program" OR "NAHDAP" OR "National Database for Autism Research" OR "NDAR" OR "PhysioNet" OR "National Database for Clinical Trials related to Mental Illness" OR "NDCT" OR "Research Domain Criteria Database" OR "RdoCdb" OR "Synapse" OR "UK Data Service" OR "caNanoLab" OR "ChEMBL" OR "IoChem-BD" OR "Computational Chemistry Datasets" OR "STRENDA" OR "European Genome–phenome Archive" OR "European Genome phenome Archive" OR "accession number" OR "accession code" OR "accession numbers" OR "accession codes") |
| ACCESSION_NR | Set of regular expressions that represent the accession number formats of different (biomedicine-related) repositories | ("G(SE\|SM\|DS\|PL)[[:digit:]]{2,}" OR "PRJ(E\|D\|N\|EB\|DB\|NB)[:digit:]+" OR "SAM(E\|D\|N)[A-Z]?[:digit:]+" OR "[A-Z]{1}[:digit:]{4}" OR "[A-Z]{2}[:digit:]{6}" OR "[A-Z]{3}[:digit:]{5}" OR "[A-Z]{4,6}[:digit:]{3,}" OR "GCA_[:digit:]{9}\\.[:digit:]+" OR "SR(P\|R\|X\|S\|Z)[[:digit:]]{3,}" OR "(E\|P)-[A-Z]{4}-[:digit:]{1,}" OR "[:digit:]{1}[A-Z]{1}[[:alnum:]]{2}" OR "MTBLS[[:digit:]]{2,}" OR "10.17590" OR "10.5073" OR "10.25493" OR "10.6073" OR "10.15468" OR "10.5063" OR "[[:digit:]]{6}" OR "[A-Z]{2,3}_[:digit:]{5,}" OR "[A-Z]{2,3}-[:digit:]{4,}" OR "[A-Z]{2}[:digit:]{5}-[A-Z]{1}" OR "DIP:[:digit:]{3}" OR "FR-FCM-[[:alnum:]]{4}" OR "ICPSR [:digit:]{4}" OR "SN [:digit:]{4}") |
| REPOSITORIES | Set of names of general-purpose repositories | ("figshare" OR "dryad" OR "zenodo" OR "dataverse" OR "DataverseNL" OR "osf" OR "open science framework" OR "mendeley data" OR "GIGADB" OR "GigaScience database" OR "OpenNeuro") |
| FILE_FORMATS | Set of file formats | ("csv" OR "zip" OR "xls" OR "xlsx" OR "sav" OR "cif" OR "fasta") |
| GITHUB | Github for data has to be treated differently, as we need additional information that data and not only code was shared on Github | (“github”) |
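
As a rough illustration of how categories like these could be applied to text, the sketch below combines a phrase list and a few accession-number patterns into regular expressions in Python. It is not the repository's own implementation. The helper names `or_pattern` and `find_accessions` are hypothetical, only a small subset of the table is used, the POSIX class `[[:digit:]]` is rewritten as `\d` because Python's `re` module does not support POSIX bracket classes, and the surrounding word boundaries are an added assumption.

```python
import re

# Subset of the UPON_REQUEST category from the table above.
UPON_REQUEST = ["upon request", "on request", "upon reasonable request"]

# Subset of the ACCESSION_NR category, with [[:digit:]] translated to \d
# and [[:alnum:]] to [A-Za-z0-9] (an assumption made for Python's `re`).
ACCESSION_NR = [
    r"G(SE|SM|DS|PL)\d{2,}",    # GEO series, samples, datasets, platforms
    r"SR(P|R|X|S|Z)\d{3,}",     # Sequence Read Archive
    r"MTBLS\d{2,}",             # MetaboLights
    r"FR-FCM-[A-Za-z0-9]{4}",   # FlowRepository
]


def or_pattern(phrases):
    """Join literal phrases into one case-insensitive alternation (the OR in the table)."""
    return re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)


def find_accessions(text):
    """Return every substring that matches one of the accession patterns."""
    # Word boundaries are an added assumption to avoid matching inside longer tokens.
    pattern = re.compile(r"\b(?:" + "|".join(ACCESSION_NR) + r")\b")
    return [m.group(0) for m in pattern.finditer(text)]


if __name__ == "__main__":
    statement = "Data were deposited in GEO under GSE12345 and in SRA under SRP098765."
    print(find_accessions(statement))
    print(bool(or_pattern(UPON_REQUEST).search("Data are available upon reasonable request.")))
```

Several of the patterns in the table are short and generic (for example, a bare six-digit number), so a match on an accession pattern alone is weak evidence; pairing it with a repository name from FIELD_SPECIFIC_REPO in the same sentence is one way to reduce false positives.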