🧫 A curated list of resources relevant to doing Biomedical Information Extraction (including BioNLP)
Updated May 26, 2024
An R package with over 50 highly cited, ready-to-use, up-to-date COVID-19 pandemic data resources
BERT fine-tuned on downstream NER tasks
Measuring and visualizing biomedical data variability/heterogeneity across data sources
Three basic data analysis workflows for biomedical data in Python. Level: beginner (~200 lines of pure code).
Bioinformatics Classifier Project — TCGA BRCA Dataset. Exploratory analysis and machine learning classification on TCGA BRCA gene expression data, focusing on PAM50 breast cancer subtypes.
Multiclass classification of breast cancer subtypes using gene expression profiles. Evaluated and compared multiple models (Logistic Regression, Random Forest, HistGradientBoosting) on synthetically generated data using classification metrics, confusion matrices, and ROC-AUC analysis with Youden’s J statistic.
A lightweight Flask application for CSV upload, tabular preview, and basic data visualisation using Pandas and Matplotlib. Final project for CS50x.
A lightweight R script for text mining and harmonizing medical phenotype data. Cleans, standardizes, and maps diagnoses to ICD-10 codes, with clinical annotations for enhanced data usability.
Multiclass classification of breast cancer subtypes using synthetic gene expression data. Refactored code to use a single function for model evaluation across Logistic Regression, Random Forest, and HistGradientBoosting, including metrics and ROC-AUC with Youden’s J statistic.