
🔬🔁 Evaluation loop #768

Merged · @mberr merged 57 commits into master from evaluation-loop on May 25, 2022
Conversation

@mberr (Member) commented on Feb 2, 2022

This PR adds an evaluation loop based on torch's data loaders and delegates automatic batch size optimization to torch-max-mem. It also adds support for relation prediction evaluation.
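
For illustration only, here is a minimal sketch (not the PR's actual implementation) of the general pattern: iterate over evaluation triples with a torch DataLoader and let torch-max-mem's maximize_memory_utilization decorator retry with smaller batch sizes when a CUDA out-of-memory error occurs (assuming its default handling of a batch_size parameter). The evaluate function and its placeholder scoring are made up for this example.

import torch
from torch.utils.data import DataLoader, TensorDataset
from torch_max_mem import maximize_memory_utilization


@maximize_memory_utilization()  # retries with a smaller batch_size on CUDA OOM
def evaluate(triples: torch.Tensor, batch_size: int) -> torch.Tensor:
    scores = []
    for (batch,) in DataLoader(TensorDataset(triples), batch_size=batch_size):
        # placeholder scoring; a real evaluation loop would score candidate
        # heads/relations/tails for each triple with the trained model here
        scores.append(batch.float().sum(dim=-1))
    return torch.cat(scores)


scores = evaluate(torch.randint(0, 14, size=(100, 3)), batch_size=1024)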

@cthoyt added this to the PyKEEN v1.9.0 milestone on Feb 13, 2022
@mberr (Member, Author) commented on May 25, 2022

Target-specific evaluation datasets have been moved to a new branch, cf. evaluation-loop-2, since they require additional changes to the evaluation to enable separate, independently optimized batch sizes for different targets.

@mberr changed the title from "WIP: Evaluation loop" to "🔬🔁 Evaluation loop" on May 25, 2022
@mberr marked this pull request as ready for review on May 25, 2022 at 14:22
@cthoyt (Member) commented on May 25, 2022

@mberr LGTM, but it would be nice to get a second reviewer.

@mberr requested a review from @migalkin on May 25, 2022 at 16:00
@migalkin (Member) commented on May 25, 2022

Tried the branch on the ILPC codebase; it works, and I can reproduce the numbers 👍
My only small wish (which is more about result tracking than this PR) is that the default console result tracker dumps all 100+ metrics to the console after each evaluation step, which is too much. I'd like a way to see only 1-3 representative metrics, e.g. realistic hits@10 and the inverse harmonic mean rank.

@mberr (Member, Author) commented on May 25, 2022


Did you know about the metric_filter parameter of the ConsoleTracker? 😇 e.g.

from pykeen.pipeline import pipeline

result = pipeline(
    dataset="nations",
    model="mure",
    result_tracker="console",
    # regex filter: only metrics whose name matches are printed
    result_tracker_kwargs=dict(metric_filter=r"both.realistic.(hits_at_10|inverse_harmonic_mean_rank)"),
)

will only print realistic H@10 and MRR averaged over head & tail.
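
For reference, the same filter can presumably also be passed when constructing the tracker directly; the ConsoleResultTracker class lives in pykeen.trackers, and the metric_filter keyword is the one described above.

from pykeen.trackers import ConsoleResultTracker

# keep only realistic hits@10 and MRR, averaged over head & tail prediction
tracker = ConsoleResultTracker(
    metric_filter=r"both.realistic.(hits_at_10|inverse_harmonic_mean_rank)",
)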

@mberr merged commit 605ebec into master on May 25, 2022
@mberr deleted the evaluation-loop branch on May 25, 2022 at 17:37