Alignment: use a simplified ragged type for performance #10319

Merged
merged 6 commits into explosion:master from alignment-array
Apr 1, 2022

danieldk
Contributor

@danieldk danieldk commented Feb 17, 2022

Description

This PR introduces the AlignmentArray type, a simplified version of Ragged that performs better on the simple(r) indexing needed for alignment.

When training a biaffine parser pipeline with morphology and tagging on a GPU, this change speeds up training by about 19%. It should improve performance for all TrainablePipes that use get_alignment to get gold standard annotations.
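
To illustrate the idea behind a simplified ragged type, here is a minimal pure-Python sketch. It is not the code from this PR: the class name `SimpleAlignmentArray` and its attributes are hypothetical, and a real implementation would presumably work on arrays rather than lists. It assumes that alignment lookups only need a single row or a contiguous range of rows, which is what allows the structure to be simpler than a general ragged array.

```python
from itertools import accumulate
from typing import List, Sequence, Union


class SimpleAlignmentArray:
    """Hypothetical sketch of a ragged array specialised for alignment lookups."""

    def __init__(self, alignment: Sequence[Sequence[int]]) -> None:
        # Number of aligned indices per row (one row per source token).
        self.lengths: List[int] = [len(row) for row in alignment]
        # All rows concatenated into one flat list.
        self.data: List[int] = [i for row in alignment for i in row]
        # Row r occupies data[starts_ends[r]:starts_ends[r + 1]].
        self.starts_ends: List[int] = [0] + list(accumulate(self.lengths))

    def __len__(self) -> int:
        return len(self.lengths)

    def __getitem__(self, idx: Union[int, slice]) -> List[int]:
        if isinstance(idx, int):
            if idx < 0:
                idx += len(self)
            if not 0 <= idx < len(self):
                raise IndexError("row index out of range")
            first_row, end_row = idx, idx + 1
        elif isinstance(idx, slice):
            # Assumption: only contiguous row ranges are ever requested.
            if idx.step not in (None, 1):
                raise ValueError("only contiguous slices are supported")
            first_row, end_row, _ = idx.indices(len(self))
            end_row = max(first_row, end_row)
        else:
            raise TypeError("index must be an int or a slice")
        # Two offset lookups and one flat slice, regardless of row count.
        return self.data[self.starts_ends[first_row]:self.starts_ends[end_row]]


x2y = SimpleAlignmentArray([[0], [1, 2], [], [3]])
print(x2y[1])    # [1, 2]
print(x2y[1:4])  # [1, 2, 3]
```

Because the row boundaries are precomputed once, each lookup is two offset reads plus one slice of the flat data, with no general-purpose index handling on the hot path.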

Types of change

Performance improvement.

Checklist

  • I confirm that I have the right to submit this contribution under the project's MIT license.
  • I ran the tests, and all new and existing tests passed.
  • My changes don't require a change to the documentation, or if they do, I've added all required information.

@danieldk danieldk added the perf / speed Performance: speed label Feb 17, 2022
@svlandeg svlandeg added the feat / training Feature: Training utils, Example, Corpus and converters label Feb 21, 2022
@danieldk danieldk marked this pull request as ready for review March 23, 2022 12:37
@danieldk
Contributor Author

Out of draft: trained a transformer model and didn't see any regressions in accuracy.

@danieldk danieldk marked this pull request as draft March 29, 2022 06:20
@danieldk danieldk marked this pull request as ready for review March 29, 2022 08:03
@adrianeboyd
Contributor

Also tested on a full set of trained pipeline builds.

@adrianeboyd adrianeboyd merged commit c90dd6f into explosion:master Apr 1, 2022
@danieldk danieldk deleted the alignment-array branch April 1, 2022 07:30