Document PyKEEN's performance tweaks #56

cthoyt · 2020-07-21T10:45:33Z

Closes #55

This documentation should give non-technical users a high-level explanation of what in PyKEEN makes it better than the alternatives. A good first step would be to copy over the material that @lvermue wrote for the software paper.

- Aligned entity/relation ID and embeddings vector position
- Automatic memory optimization
- Sub-batching
- Improvements enabled by separate implementations of `score_ht`, `score_rt`, `score_hr`, `score_hrt`
- Filtering with index-based masking
- ~~Fast OWA~~ (not implemented yet)
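
To make the sub-batching item concrete, here is a minimal toy sketch of the general idea: a batch that is too large to process at once is split into sub-batches, and their gradients are accumulated, so the result matches the full-batch gradient. The one-parameter model and the helper names `grad_sum_squared_error` and `sub_batched_grad` are hypothetical and only for illustration; this is not PyKEEN's implementation.

```python
def grad_sum_squared_error(w, batch):
    """Gradient w.r.t. w of sum((w * x - y) ** 2) over the batch."""
    return sum(2 * (w * x - y) * x for x, y in batch)


def sub_batched_grad(w, batch, sub_batch_size):
    """Accumulate the gradient over consecutive slices of the batch."""
    total = 0
    for start in range(0, len(batch), sub_batch_size):
        sub_batch = batch[start:start + sub_batch_size]
        total += grad_sum_squared_error(w, sub_batch)
    return total


# Toy data: (x, y) pairs with integer values so the comparison is exact.
batch = [(1, 2), (2, 3), (3, 7), (4, 8)]
w = 1

full_grad = grad_sum_squared_error(w, batch)
accumulated_grad = sub_batched_grad(w, batch, sub_batch_size=2)
```

Because the loss is a sum over examples, the accumulated gradient equals the full-batch gradient while only one sub-batch has to be held in memory at a time.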

Any other ideas?

References #55

lvermue · 2020-07-24T23:48:12Z

@mberr I think that this covers most of what we've done so far.
The formatting and notation definitely still need some fine-tuning.
Anyway, any comments are welcome :)

@cthoyt Your scrutiny is welcome as well :)

mberr · 2020-07-25T12:25:53Z

I revised the first part about the TriplesFactory / ID-based triple representation. It would be nice to properly link to the properties of TriplesFactory, such as entity_label_to_id.

mberr · 2020-07-25T12:27:25Z

I am not sure whether this is the correct place, but somewhere we should also highlight the risk of manually modifying label-to-ID mappings, and the necessity of keeping the mappings consistent between train/test/validation.
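
A small sketch of why the mappings have to stay consistent, assuming triples are plain `(head, relation, tail)` label tuples. The helpers `build_label_to_id` and `map_triples` are hypothetical and do not use the PyKEEN API; the point is only that a mapping rebuilt from the test set alone assigns different IDs to the same labels.

```python
def build_label_to_id(triples):
    """Assign IDs to entity and relation labels in sorted order."""
    entities = sorted({label for h, _, t in triples for label in (h, t)})
    relations = sorted({r for _, r, _ in triples})
    entity_to_id = {label: i for i, label in enumerate(entities)}
    relation_to_id = {label: i for i, label in enumerate(relations)}
    return entity_to_id, relation_to_id


def map_triples(triples, entity_to_id, relation_to_id):
    """Convert label-based triples to ID-based triples."""
    return [(entity_to_id[h], relation_to_id[r], entity_to_id[t]) for h, r, t in triples]


train = [
    ("berlin", "capital_of", "germany"),
    ("paris", "capital_of", "france"),
    ("berlin", "located_in", "germany"),
]
test = [("paris", "located_in", "france")]

# Consistent: reuse the training mappings for the test triples.
entity_to_id, relation_to_id = build_label_to_id(train)
test_ids = map_triples(test, entity_to_id, relation_to_id)

# Inconsistent: mappings rebuilt from the test set alone give different IDs.
wrong_entity_to_id, wrong_relation_to_id = build_label_to_id(test)
wrong_ids = map_triples(test, wrong_entity_to_id, wrong_relation_to_id)
```

Here `test_ids` is `[(3, 1, 1)]`, while `wrong_ids` is `[(1, 0, 0)]`, which a model trained with the training mappings would read as (france, capital_of, berlin).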

cthoyt · 2020-07-25T20:55:37Z

@mberr I've written a bit about this in the "bring your own data" tutorial and also made some improvements to it in #54 that haven't been merged yet

I made some improvements to the language, and also started to improve the notation (it was pretty confusing before with all of the stars, since RST interpreted them as italics)

cthoyt · 2020-08-06T12:45:37Z

@lvermue I made edits for clarity and made the notation a bit more consistent. Maybe you want to look at the algorithm at the bottom of "Filtering with Index-based Masking" and decide whether it needs more notation, or whether it's okay just as text
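
As one possible way to spell the filtering idea out in code rather than notation, here is a minimal sketch for tail prediction. It assumes that position `i` of the score list belongs to entity ID `i`, so the IDs of known true tails can be used directly as indices. The function names, the scores, and the optimistic rank definition are all illustrative assumptions, not the algorithm as written in the documentation.

```python
import math


def filter_tail_scores(scores, known_tail_ids, true_tail_id):
    """Mask all known true tails except the one currently being evaluated."""
    filtered = list(scores)
    for tail_id in known_tail_ids:
        if tail_id != true_tail_id:
            # The entity ID doubles as the index into the score list.
            filtered[tail_id] = -math.inf
    return filtered


def rank_of(scores, target_id):
    """1-based rank: number of strictly higher scores, plus one."""
    target = scores[target_id]
    return 1 + sum(1 for score in scores if score > target)


# Hypothetical scores for (h, r, ?) over five entities.
scores = [0.1, 0.9, 0.4, 0.7, 0.2]
known_tail_ids = {1, 3}  # entity 1 is known from training, entity 3 is the test tail
true_tail_id = 3

raw_rank = rank_of(scores, true_tail_id)
filtered_scores = filter_tail_scores(scores, known_tail_ids, true_tail_id)
filtered_rank = rank_of(filtered_scores, true_tail_id)
```

In this toy case the raw rank is 2 because the known true tail with ID 1 scores higher, and the filtered rank is 1 once that entry has been masked.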

cthoyt · 2020-08-12T14:38:42Z

@mali-git @mberr Thanks for looking at this, but let's still wait for @lvermue to see if he's happy

Add beginning of outline

1daf603

References #55

cthoyt assigned lvermue and mberr Jul 21, 2020

lvermue added 7 commits July 24, 2020 18:50

Add Tuple broadcasting performance improvement description

8b14f51

Add sub-batching performance description

35699d5

Update sub-batching description

2d7362c

Add automated memory optimization description

c0ab111

Add aligned entity/relation ID performance description

c7c78db

Add filtering with index-based masking performance description

e137213

Merge branch 'master' into add-peformance-explanation

aa4e51c

Update performance.rst

dddffdb

Update description of TriplesFactory

cthoyt added 2 commits July 26, 2020 16:46

Update performance.rst

f74755f

Update performance.rst

bb5e6f5

cthoyt added the documentation Improvements or additions to documentation label Jul 26, 2020

lvermue and others added 3 commits July 27, 2020 01:07

Update formatting

af912a3

Revision 1

779561a

Pass doc8

ccabcc0

cthoyt marked this pull request as ready for review August 6, 2020 12:44

cthoyt requested a review from mberr August 6, 2020 12:44

cthoyt requested review from lvermue and mali-git August 6, 2020 12:45

mali-git approved these changes Aug 12, 2020

mberr approved these changes Aug 12, 2020

Merge branch 'master' into add-peformance-explanation

c574f3e

cthoyt added this to the PyKEEN 1.0.3 milestone Aug 12, 2020

Grammar

631e244

lvermue approved these changes Aug 12, 2020

Merge branch 'master' into add-peformance-explanation

67b560d

cthoyt merged commit 22f8815 into master Aug 12, 2020

cthoyt deleted the add-peformance-explanation branch August 12, 2020 23:51
