
TabInOut (Table Information Out) - Framework for information extraction from tables

TabInOut is a framework for information extraction from tables and a GUI tool for generating information extraction rules from tables in the literature. The tool depends on TableDisentangler and represents the second step in the extraction pipeline. First, tables are processed, disentangled and annotated using the TableDisentangler tool. TabInOut then uses the database created by TableAnnotator, together with all the functional and structural annotations produced by TableDisentangler, to extract information from the tables. It also creates an additional table in the MySQL database, where it stores the extracted information.

The framework consists of:

  • Methodology and recipe for information extraction from tables
  • Language for describing the syntax of cell content and assigning values to parts of the cell content
  • A GUI wizard that makes it easy to describe an information extraction task
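
To illustrate what "describing the syntax of cell content and assigning values to parts of the cell content" can mean in practice, here is a minimal sketch. It does not use TabInOut's actual rule language. It assumes a plain Python regular expression with named groups as a stand-in, and the pattern, the function name `extract_mean_sd` and the example cells are all hypothetical.

```python
import re

# Hypothetical rule (NOT TabInOut's rule syntax): a cell such as
# "45.3 ± 12.1" or "27 (4.5)" holds a mean followed by a standard deviation.
# Named groups assign a meaning to each part of the cell content.
MEAN_SD = re.compile(
    r"^\s*(?P<mean>\d+(?:\.\d+)?)\s*(?:±|\+/-|\()\s*(?P<sd>\d+(?:\.\d+)?)\s*\)?\s*$"
)


def extract_mean_sd(cell_text):
    """Return {'mean': float, 'sd': float} if the cell matches the rule, else None."""
    match = MEAN_SD.match(cell_text)
    if match is None:
        return None
    # Convert every named part of the cell to a number.
    return {name: float(value) for name, value in match.groupdict().items()}


if __name__ == "__main__":
    for cell in ["45.3 ± 12.1", "27 (4.5)", "n/a"]:
        print(repr(cell), "->", extract_mean_sd(cell))
```

In this sketch a cell that does not fit the pattern yields `None`, so a caller could fall back to another rule or skip the cell.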

For more information, see the project's GitHub Wiki.

We are currently working on a paper that will present the TabInOut methodology. It is based on a case study and a hybrid approach already presented at the BIOSTEC and BelBi conferences. The relevant papers we have published are listed below.

The project is part of my PhD project, funded by EPSRC and AstraZeneca.

The main application (the Wizard) is located in the Wizard folder. You can run it by starting the TkGUIFirstScreen.py file. Alternatively, you can start the wizard by running TableInOutStarter.sh from the main directory.

Relevant publications:

User guide

For more information about how to use and run TabInOut, please see our User Guide.