remove rank entries for continuous spans on import #317

thomaskrause · 2014-05-06T15:47:41Z

For each spanning relation (that is, each edge from a span to a token it covers) there is an entry with that edge in the rank table. This information is never used during query generation, only when reconstructing the graph.

For continuous spans the graph can be re-constructed without this explicit storage of the edges (using the left/right_token_index). We should remove the unnecessary rank entries on import.

This will dramatically reduce the size of the facts table and its indexes on corpora that only contain spans.
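A minimal sketch of the reconstruction idea, not the actual import code: assuming the left/right token index of a span is inclusive, the covered-token edges of a continuous span can be derived from those two values alone. The `Span` class, its field names, and both helper functions are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Span:
    # Hypothetical stand-in for a span node; the two index fields are
    # modelled on the left/right token index and assumed to be inclusive.
    name: str
    left_token: int
    right_token: int


def is_continuous(covered_indexes: Iterable[int]) -> bool:
    """True if the covered tokens form one gap-free range."""
    ordered = sorted(set(covered_indexes))
    if not ordered:
        return False
    return ordered[-1] - ordered[0] + 1 == len(ordered)


def covered_token_edges(span: Span) -> List[Tuple[str, int]]:
    """Rebuild the (span, token index) edges of a continuous span
    without reading any stored edge entries."""
    return [
        (span.name, idx)
        for idx in range(span.left_token, span.right_token + 1)
    ]
```

Under these assumptions, only spans for which `is_continuous` is false would still need their edges stored explicitly; for all others the entries are redundant and can be dropped on import.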

thomaskrause added this to the 3.2.0 milestone May 6, 2014

thomaskrause added the enhancement label May 6, 2014

thomaskrause self-assigned this May 6, 2014

thomaskrause added a commit that referenced this issue May 6, 2014

remove rank entries for continuous spans on import (#317)

778fa7a

thomaskrause closed this as completed May 6, 2014
