COBALAB version of the original evaluation script for the MEDDOCAN task.
[PlanTL/medicine/document/NLP preprocessing] Software to convert PDF files into HTML, TXT or XML files and to normalize EHRs.
Miscellaneous utilities and scripts.
[PlanTL/medicine/word embeddings] Word embeddings generated from Spanish corpora.
[PlanTL/medicine/document annotation/NLP preprocessing/part-of-speech] Part-of-Speech Tagger for medical domain corpus in Spanish based on FreeLing.
Official evaluation script of the Medical Document Anonymization (MEDDOCAN) task.
Script to convert files between MEDDOCAN-Brat, MEDDOCAN-XML, and i2b2 formats.
[PlanTL/medicine/annotated corpus/guidelines/anonymization] Protected Health Information (PHI) annotations in the Spanish Clinical Case Corpus.
[PlanTL/medicine/document annotation/NLP preprocessing/sentence splitter] Sentence splitting model created using the Apache OpenNLP machine learning toolkit.
[PlanTL/medicine/document annotation/NLP preprocessing/tokenization] Tokenization model created using the Apache OpenNLP machine learning toolkit.
[PlanTL/medicine/terminological resource retrieval] A medical negated terms extraction tool.
[PlanTL/medicine/document annotation/negation] Negation detector for Spanish clinical texts based on Wendy Chapman's NegEx algorithm.
[PlanTL/medicine/lexical/terminological resource] A Spanish Medical Abbreviation DataBase.
[PlanTL/medicine/semantic annotation] Software used to generate the Spanish Medical Abbreviation DataBase (AbreMES-DB).
[PlanTL/medicine/terminological resource retrieval] A multilingual medical term extraction tool.
[PlanTL/medicine/document annotation/time] HeidelTime grammar for temporal tagging of Spanish Electronic Health Records (EHR).
[PlanTL/medicine/annotated corpus/guidelines/Part-of-Speech] Part-of-Speech annotations in the Spanish Clinical Case Corpus.
[PlanTL/medicine/annotated corpus/guidelines/tokenization] Tokenization annotations in the Spanish Clinical Case Corpus.
[PlanTL/medicine/annotated corpus/guidelines/sentence splitting] Sentence splitting annotations in the Spanish Clinical Case Corpus.
[PlanTL/medicine/document] Spanish Clinical Case Corpus.
[PlanTL/medicine/annotated corpus/guidelines/named entities] Annotations of chemical mentions, drugs and biosimilars with therapeutic relevance in clinical reports in the Spanish Clinical Case Corpus.
[PlanTL/medicine/annotated corpus/anonymization] Software that masks Protected Health Information (PHI) in documents previously annotated with Brat.
[PlanTL/medicine/dataset generation/retrieval] Crawler to download all the publications written in Spanish from the Spanish SciELO server.
[PlanTL/medicine/neural machine translation/translation models] Files needed to use the Neural Machine Translation system for the Biomedical Domain.
[PlanTL/medicine/lexical/terminological resource] Bilingual medical glossaries for various language pairs.
[PlanTL/medicine/annotated corpus/guidelines/named entities] Medical concept and named entity annotations in the Spanish Clinical Case Corpus.