
HINT (SINE is an older version)

A two-stage process for long-document classification and interpretable evaluation:

  1. two parallel sentence representation learning modules:
    a) att-lstm b) TF-IDF prior VAE
  2. fully-connected GAT (a minimal sketch of the architecture follows this list)
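
A minimal, hypothetical sketch of this two-stage architecture (not the repo's actual code; the module names, dimensions, and the single-layer GAT are assumptions):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttLSTMEncoder(nn.Module):
    """Stage 1a: BiLSTM over words with additive attention pooling."""
    def __init__(self, emb_dim, hid_dim):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hid_dim, 1)

    def forward(self, words):                     # (n_sents, seq_len, emb_dim)
        h, _ = self.lstm(words)                   # (n_sents, seq_len, 2*hid_dim)
        a = torch.softmax(self.att(h), dim=1)     # word attention weights
        return (a * h).sum(dim=1), a.squeeze(-1)  # sentence vectors, weights

class TfidfPriorVAE(nn.Module):
    """Stage 1b: sentence VAE whose KL term pulls q(z|x) toward a
    Gaussian prior centred on a TF-IDF-derived mean."""
    def __init__(self, bow_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(bow_dim, z_dim)
        self.logvar = nn.Linear(bow_dim, z_dim)

    def forward(self, bow, prior_mu):             # bow: (n_sents, bow_dim)
        mu, logvar = self.mu(bow), self.logvar(bow)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterise
        kl = 0.5 * ((mu - prior_mu).pow(2) + logvar.exp() - 1 - logvar).sum(-1)
        return z, kl.mean()                       # KL(q(z|x) || N(prior_mu, I))

class FullyConnectedGAT(nn.Module):
    """Stage 2: one graph-attention layer over a fully connected sentence graph."""
    def __init__(self, dim):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)
        self.a = nn.Linear(2 * dim, 1, bias=False)

    def forward(self, x):                         # x: (n_sents, dim)
        h = self.W(x)
        n = h.size(0)
        pairs = torch.cat([h.repeat_interleave(n, 0), h.repeat(n, 1)], dim=-1)
        e = F.leaky_relu(self.a(pairs)).view(n, n)  # pairwise attention logits
        alpha = torch.softmax(e, dim=-1)            # every sentence attends to all
        return alpha @ h                            # updated sentence representations
```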
Baselines:

VMASK
HAN

Datasets

IMDB, YELP, Guardian News

Interpretable evaluation

  1. soft metrics: Completeness and Sufficiency, which measure how the prediction changes after the important words are removed.
    We create two folds of the text: one with the important words removed (cset) and one containing only the extracted important words (sset); a sketch of the calculation follows this list. The calculation can be found here.
  2. hard metrics: agreement with human-annotated rationales.
    Input 1: annotated text spans; Input 2: important words identified by the model. These metrics include partial span-level match and token-level match (also sketched below). Details can be found here.
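
A hypothetical sketch of the soft metrics under the usual erasure-style definitions (the `model` interface and batch shapes are assumptions): Completeness is the drop in the predicted-class probability after deleting the important words (the cset), and Sufficiency is the drop when keeping only the important words (the sset).

```python
import torch

@torch.no_grad()
def soft_metrics(model, full_doc, cset_doc, sset_doc):
    """`model` maps a batch of documents to class logits; each *_doc is a
    batch-shaped tensor the classifier accepts (hypothetical interface)."""
    p_full = torch.softmax(model(full_doc), dim=-1)
    y = p_full.argmax(dim=-1)                        # predicted class on full text
    p_cset = torch.softmax(model(cset_doc), dim=-1)  # important words removed
    p_sset = torch.softmax(model(sset_doc), dim=-1)  # important words only
    idx = torch.arange(len(y))
    completeness = (p_full[idx, y] - p_cset[idx, y]).mean().item()
    sufficiency = (p_full[idx, y] - p_sset[idx, y]).mean().item()
    return completeness, sufficiency
```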
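
And a hypothetical sketch of the hard metrics (the function names and the overlap threshold are assumptions): token-level match as set-overlap F1 between model-identified and annotated tokens, and partial span-level match counting a predicted span as a hit when it sufficiently overlaps some annotated span.

```python
def token_f1(gold_tokens, pred_tokens):
    """Token-level match: F1 over the sets of gold and predicted token positions."""
    gold, pred = set(gold_tokens), set(pred_tokens)
    tp = len(gold & pred)
    p = tp / len(pred) if pred else 0.0
    r = tp / len(gold) if gold else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0

def partial_span_match(gold_spans, pred_spans, iou_threshold=0.5):
    """Partial span-level match: spans are (start, end) token offsets; a
    predicted span counts if its best IoU with any gold span clears the
    (assumed) threshold."""
    def iou(a, b):
        inter = max(0, min(a[1], b[1]) - max(a[0], b[0]))
        union = (a[1] - a[0]) + (b[1] - b[0]) - inter
        return inter / union if union else 0.0
    hits = sum(any(iou(p, g) >= iou_threshold for g in gold_spans) for p in pred_spans)
    return hits / len(pred_spans) if pred_spans else 0.0
```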

Visualization (highlight text by attention weight)

We also conduct human evaluation of the model-generated interpretations, which are displayed in a separate PDF (generated by LaTeX) for each document. The explanations are given at the word level, where the label-dependent words extracted by Context Representation Learning are highlighted in yellow and the topic-associated words identified by Topic Representation Learning are highlighted in blue, and also at the sentence level. The Python code for generating the attention-highlighted LaTeX text can serve as a good template; the code is in vis_sine.py, and a minimal sketch follows.
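
A minimal sketch in the spirit of vis_sine.py, not the script itself (the helper names and the per-word colour choice are assumptions; the emitted LaTeX needs the xcolor package):

```python
def word_to_latex(word, weight, color):
    """Shade one word with \\colorbox (requires LaTeX's xcolor package);
    `weight` in [0, 1] controls colour intensity."""
    intensity = int(round(weight * 60))   # cap shading so text stays legible
    return rf"\colorbox{{{color}!{intensity}}}{{\strut {word}}}"

def doc_to_latex(words, context_weights, topic_weights):
    """Yellow for label-dependent (context) words, blue for topic-associated
    words; each word takes the colour of its larger attention weight."""
    out = []
    for w, cw, tw in zip(words, context_weights, topic_weights):
        color, weight = ("yellow", cw) if cw >= tw else ("blue", tw)
        out.append(word_to_latex(w, weight, color))
    return " ".join(out)

print(doc_to_latex(["great", "movie"], [0.9, 0.1], [0.2, 0.7]))
# -> \colorbox{yellow!54}{\strut great} \colorbox{blue!42}{\strut movie}
```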

An example can be found here.
