TabText: a Systematic Approach to Aggregate Knowledge Across Tabular Data Structures

kimvc7/TabText

TabText

TabText: A Flexible and Contextual Approach to Tabular Data Representation

This repository contains the code for the TabText paper (https://arxiv.org/abs/2110.15829) by Kimberly Villalobos Carballo, Liangyuan Na, Yu Ma, Léonard Boussioux, Cynthia Zeng, Luis R. Soenksen, and Dimitris Bertsimas.

The paper presents a systematic framework that uses language to extract contextual information from tabular structures, resulting in more complete data representations. We investigate how several syntactic parsing schemes affect the performance of TabText representations, and we show that TabText is effective for labor-intensive data preprocessing. Our experiments demonstrate that augmenting tabular data with TabText representations can improve the AUC score by up to 6% across nine healthcare classification tasks.
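To make the idea of turning a table row into language concrete, here is a minimal sketch. It is not the code in this repository: the function `row_to_text`, its parameters, the "column is value" phrasing, and the sample patient record are all hypothetical, chosen only to illustrate how a row might be written as a sentence before being passed to a language model.

```python
import math


def row_to_text(row, descriptions=None, skip_missing=True):
    """Turn one table row (a dict of column -> value) into a sentence.

    Hypothetical helper for illustration only; the actual TabText
    implementation may phrase and order columns differently.
    """
    descriptions = descriptions or {}
    parts = []
    for column, value in row.items():
        # Treat None and NaN as missing values.
        is_missing = value is None or (
            isinstance(value, float) and math.isnan(value)
        )
        if is_missing:
            if skip_missing:
                continue
            value = "missing"
        # Use a human-readable column description when one is provided.
        name = descriptions.get(column, column)
        parts.append(f"{name} is {value}")
    return ", ".join(parts) + "." if parts else ""


# Made-up example record; not taken from the paper's datasets.
patient = {"age": 67, "sbp": 142.0, "smoker": "yes", "bmi": None}
labels = {"sbp": "systolic blood pressure", "smoker": "smoking status"}

sentence = row_to_text(patient, labels)
print(sentence)
# prints: age is 67, systolic blood pressure is 142.0, smoking status is yes.
```

In a full pipeline, a sentence like this would typically be embedded with a language model and the embedding appended to the original tabular features. That step is omitted here because it depends on the model chosen.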
