VLDB-2019/8-TextCube: Automated Construction and Multidimensional Exploration #285

BrambleXu · 2019-11-08T06:13:33Z

Summary:

Proposes TextCube, a data-structure framework, and describes the techniques used to construct it automatically. From the same team as #275.

Resource:

pdf
[code](
[paper-with-code](

Paper information:

Author:
Dataset:
keywords:

Notes:

TextCube provides a critical information organization structure, enhancing text exploration and analysis for various applications.

We focus on new TextCube construction methods that are scalable, weakly-supervised, domain-independent, language-agnostic, and effective (i.e., generating quality TextCubes from large corpora of various domains).

Module I. Mining Structural Primitives from Text: Phrases, Entities and Relations

AutoPhrase
AutoNER
ReMine [58], which extracts high-confidence relational phrases from domain-specific texts in an end-to-end manner.

Module II. Automated Construction of TextCubes

Taxonomy construction: clusters similar concepts and generates a hierarchy of “concept clusters” from a massive corpus. Model: TaxoGen [53], a recursive framework that leverages word distributional representations and constructs a cluster-based taxonomy using adaptive spherical clustering and local embedding.
Embedding learning: serves as the preliminary step for document classification and TextCube construction. Models: JoSE, an unsupervised text embedding framework that jointly learns word embedding and paragraph embedding by incorporating both local and global contexts to capture more complete text semantics; and TopicMine [24], a category-name guided word embedding framework that endows word embedding with discriminative power over the specific set of categories.
Supervised methods: the paper presents how to adapt supervised methods for text cube construction, along with their strengths and drawbacks.
Weakly-supervised methods: WeSTClass [25] and WeSHClass [26], which generate pseudo training data for neural classifier pre-training and then bootstrap the classifier by self-training on unlabeled documents.
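
To make the weakly-supervised idea concrete, here is a minimal sketch of the "pseudo labels first, then self-training" loop. It is not WeSTClass or WeSHClass: those generate pseudo documents and train a neural classifier, whereas this sketch assumes seed-keyword matching for the pseudo labels and a nearest-centroid bag-of-words classifier as a stand-in. The seed words, toy documents, `margin` threshold, and all function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical seed keywords per class (the only supervision provided).
SEEDS = {"sports": {"game", "team"}, "economy": {"market", "stock"}}

# Hypothetical toy corpus; documents 2 and 3 contain no seed word.
DOCS = [
    "the team won the game",
    "stock market rallied today",
    "the coach praised the players after the win",
    "investors sold shares as prices fell",
    "the team coach and players",
    "market investors watch prices",
]


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pseudo_label(vectors, seeds):
    """Label a document only if it contains seed words of exactly one class."""
    labels = {}
    for i, vec in enumerate(vectors):
        hits = [c for c, words in seeds.items() if any(w in vec for w in words)]
        if len(hits) == 1:
            labels[i] = hits[0]
    return labels


def class_centroids(vectors, labels):
    """Sum the term counts of all documents currently assigned to each class."""
    cents = {}
    for i, c in labels.items():
        cents.setdefault(c, Counter()).update(vectors[i])
    return cents


def self_train(docs, seeds, margin=0.2, max_rounds=5):
    """Bootstrap labels: add unlabeled documents whose top class wins by `margin`."""
    vectors = [vectorize(d) for d in docs]
    labels = pseudo_label(vectors, seeds)
    for _ in range(max_rounds):
        cents = class_centroids(vectors, labels)
        if not cents:
            break  # no seed matched anything, nothing to bootstrap from
        added = False
        for i, vec in enumerate(vectors):
            if i in labels:
                continue
            scores = sorted(((cosine(vec, cent), c) for c, cent in cents.items()), reverse=True)
            runner_up = scores[1][0] if len(scores) > 1 else 0.0
            if scores[0][0] - runner_up >= margin:
                labels[i] = scores[0][1]
                added = True
        if not added:
            break
    return labels


labels = self_train(DOCS, SEEDS)
```

Only confident predictions are promoted to labels in each round, which is the part of the loop that corresponds to self-training on unlabeled documents; the confidence measure used here (gap between the two best cosine scores) is an illustrative choice.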

Module III. Multi-Dimensional Exploration of TextCubes

TextCube facilitates multidimensional text analysis

Cube-based multidimensional analysis:
Text summarization:
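
As a rough illustration of cube-based multidimensional analysis, the sketch below stores documents in cells keyed by one value per dimension and then runs a slice and a roll-up. The dimensions, documents, and function names are hypothetical and chosen only for illustration; in the paper, assigning documents to cells is what Module II automates, and this sketch simply assumes that assignment is already done.

```python
from collections import defaultdict

# Hypothetical dimensions and pre-tagged documents.
DIMENSIONS = ("topic", "location", "year")
DOCS = [
    {"id": "d1", "topic": "economy", "location": "US", "year": 2018},
    {"id": "d2", "topic": "economy", "location": "China", "year": 2018},
    {"id": "d3", "topic": "sports", "location": "US", "year": 2019},
    {"id": "d4", "topic": "economy", "location": "US", "year": 2019},
]


def build_cube(docs, dimensions):
    """Group document ids into cells keyed by a tuple of dimension values."""
    cube = defaultdict(list)
    for doc in docs:
        cell = tuple(doc[dim] for dim in dimensions)
        cube[cell].append(doc["id"])
    return dict(cube)


def slice_cube(cube, dimensions, **fixed):
    """Return ids of all documents whose cell matches the fixed dimension values."""
    idx = {dim: i for i, dim in enumerate(dimensions)}
    return sorted(
        doc_id
        for cell, ids in cube.items()
        if all(cell[idx[d]] == v for d, v in fixed.items())
        for doc_id in ids
    )


def roll_up(cube, dimensions, keep):
    """Aggregate document counts over every dimension not listed in `keep`."""
    idx = [dimensions.index(d) for d in keep]
    counts = defaultdict(int)
    for cell, ids in cube.items():
        counts[tuple(cell[i] for i in idx)] += len(ids)
    return dict(counts)


cube = build_cube(DOCS, DIMENSIONS)
us_economy = slice_cube(cube, DIMENSIONS, topic="economy", location="US")
by_topic = roll_up(cube, DIMENSIONS, keep=("topic",))
```

Here `us_economy` collects the documents in every cell with topic "economy" and location "US" regardless of year, and `by_topic` collapses location and year into per-topic document counts. A text summarization step could then operate on the documents of a selected cell.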

Model Graph:

Result:

Thoughts:

Next Reading:

BrambleXu self-assigned this Nov 8, 2019

BrambleXu added the NER(T) Named Entity Recognition Task label Nov 8, 2019
