BiomedicalTextMining

A text mining framework that recognizes various biomedical entities in text retrieved from databases like PubMed or DrugBank, using machine learning algorithms such as conditional random fields (CRF) and naive Bayes.

General Info

An interactive biomedical text mining framework built with Python. It retrieves scientific abstracts online, pre-processes them, and extracts drug entities using state-of-the-art machine learning techniques such as CRF and naive Bayes. The framework recognizes new drug names in real-time biomedical abstracts from PubMed as well as in existing benchmark test datasets such as the DDI2013 corpus. The project was built for the chemistry department at Karunya Institute of Technology and Sciences to speed up their research into newer cancer drugs by scouring medical databases and extracting new drug names related to cancer alleviation.
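
To make the entity extraction step more concrete, here is a minimal sketch of the kind of per-token features a CRF tagger is commonly trained on. The tokenizer, the feature names, and the functions `tokenize`, `token_features`, and `sentence_features` are assumptions made for illustration; the feature set actually used in this repository may differ.

```python
import re


def tokenize(text):
    """Split text into word tokens (keeping internal hyphens) and punctuation."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*|[^\sA-Za-z0-9]", text)


def token_features(tokens, i):
    """Build a feature dict for tokens[i] (hypothetical feature set)."""
    word = tokens[i]
    features = {
        "lower": word.lower(),
        "suffix3": word[-3:].lower(),  # drug names often share suffixes
        "is_title": word.istitle(),
        "is_upper": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "has_hyphen": "-" in word,
    }
    # Context features: neighbouring tokens, or sentence-boundary markers.
    if i > 0:
        features["prev_lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    if i < len(tokens) - 1:
        features["next_lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(text):
    """Return the tokens of a sentence and one feature dict per token."""
    tokens = tokenize(text)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]
```

A list of such dicts per sentence, paired with one label per token, is the input shape that several CRF toolkits accept. The same features could also be vectorized for a naive Bayes classifier.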

The GUI was built with Python's tkinter package, and the data for training and testing the machine learning algorithms was obtained through PubMed's API.
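
As an illustration of how abstracts can be requested from PubMed, the sketch below builds request URLs for the NCBI E-utilities `esearch` and `efetch` endpoints using only the standard library. The helper names (`build_search_url`, `build_fetch_url`, `fetch_abstracts`) and the default parameter values are assumptions; this is not necessarily how the repository's own scripts call the API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_search_url(term, retmax=20):
    """Build an esearch URL that returns PubMed IDs matching a query term."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def build_fetch_url(pmids):
    """Build an efetch URL that returns plain-text abstracts for the given IDs."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(pmid) for pmid in pmids),
        "rettype": "abstract",
        "retmode": "text",
    }
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"


def fetch_abstracts(pmids, timeout=30):
    """Download abstracts as text (requires an internet connection)."""
    with urlopen(build_fetch_url(pmids), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping URL construction separate from the network call makes the query logic easy to check offline; only `fetch_abstracts` needs a live connection.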

Application GUI

Technologies

Python
Tkinter
scikit-learn

Setup

Clone the repository and run Gui.py to launch the application. Make sure the dependencies, such as tkinter and scikit-learn, are installed. You will also need an internet connection to access biomedical databases like PubMed through the GUI.
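
Before launching the GUI, it can help to confirm that the dependencies are importable. The snippet below is a small sketch of such a check and is not part of the repository; `missing_modules` is a hypothetical helper, and `sklearn` is the import name of scikit-learn.

```python
from importlib.util import find_spec

# Import names of the dependencies named in this README.
REQUIRED_MODULES = ("tkinter", "sklearn")


def missing_modules(modules=REQUIRED_MODULES):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing dependencies:", ", ".join(missing))
    else:
        print("All dependencies found.")
```

Note that tkinter ships with most Python installers but is packaged separately on some Linux distributions, so it may need to be installed through the system package manager rather than pip.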

To compare the naive Bayes and CRF algorithms, run their individual files separately.
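
One simple way to compare the two algorithms is to score each model's predicted labels against the same gold labels. The sketch below computes token-level precision, recall, and F1 for a single label. The function name and the `DRUG`/`O` label names are assumptions for illustration; the repository's scripts may report different metrics.

```python
def precision_recall_f1(gold, predicted, label="DRUG"):
    """Token-level precision, recall, and F1 for one label (assumed tag names)."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    true_pos = sum(1 for g, p in pairs if g == label and p == label)
    false_pos = sum(1 for g, p in pairs if g != label and p == label)
    false_neg = sum(1 for g, p in pairs if g == label and p != label)
    # Guard against division by zero when a model predicts no positives
    # or the gold data contains none.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1
```

Calling the function once with the naive Bayes predictions and once with the CRF predictions, on the same test sentences, gives directly comparable scores.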


ChuksXD/BiomedicalTextMining

Folders and files

Latest commit

History

Repository files navigation

BiomedicalTextMining

Table of contents

General Info

Application GUI

Technologies

Setup

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages