Dataset about IMDB #4

Closed
Theheavens opened this issue Aug 20, 2021 · 5 comments


Theheavens commented Aug 20, 2021

The IMDB dataset is multi-label, but I am not sure how to save the prediction file for evaluation, because I don't see that step in run_multi.py. Also, which sklearn API should I use to evaluate multi-label predictions?

Theheavens changed the title from "Dataset" to "Dataset about IMDB" on Aug 20, 2021

Theheavens (Author) commented

In info.dat for IMDB, the "meaning" at #L8160 should probably be 'movie->actor'.
In info.dat for ACM, the 'Attribute Dimension' of author is not zero, right?

1049451037 (Member) commented

For reference, this is where the prediction file for IMDB is generated: https://github.com/THUDM/HGB/blob/master/NC/benchmark/methods/baseline/run_multi.py#L151
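On the second part of the question (which sklearn API to use for multi-label evaluation), here is a minimal sketch. It assumes the labels and predictions are binary indicator matrices of shape (num_nodes, num_classes) and that Micro-F1 and Macro-F1 are the metrics of interest; the thread does not confirm either point, and the toy arrays below are made up for illustration.

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multi-label data (assumption): 4 nodes, 3 classes.
# Each row is one node; a 1 in column j means the node has label j.
y_true = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])
y_pred = np.array([
    [1, 0, 0],
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
])

# f1_score accepts multi-label indicator matrices directly.
# "micro" pools true/false positives over all classes;
# "macro" averages the per-class F1 scores with equal weight.
micro_f1 = f1_score(y_true, y_pred, average="micro")
macro_f1 = f1_score(y_true, y_pred, average="macro")

print(f"Micro-F1: {micro_f1:.4f}")
print(f"Macro-F1: {macro_f1:.4f}")
```

If per-class scores are needed for debugging, passing `average=None` to `f1_score` returns one F1 value per label column.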

1049451037 (Member) commented

Yes, #L8160 should be 'movie->actor'; that is a typo. The info.dat files contain some information that we haven't used so far. To load the data and check the attribute dimension, you can use our data_loader. Detailed documentation is on the way. In the meantime, see the code in the baseline folder for usage examples. Sorry for the inconvenience.

Theheavens (Author) commented

Can we get more chances to submit results each day?

1049451037 (Member) commented

We recommend tuning performance on the validation set, which saves your submission budget.
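One way to act on this advice is to choose hyperparameters locally against held-out validation labels before spending a submission. The sketch below is only an illustration under stated assumptions: the model is assumed to output per-class scores in [0, 1], the decision threshold is the hyperparameter being tuned, and `pick_threshold` plus the toy arrays are hypothetical names and data that do not come from the HGB code.

```python
import numpy as np
from sklearn.metrics import f1_score


def pick_threshold(val_scores, val_labels, candidates):
    """Return (threshold, micro_f1) for the candidate with the best validation Micro-F1."""
    best_t, best_f1 = None, -1.0
    for t in candidates:
        # Turn per-class scores into a binary indicator matrix.
        preds = (val_scores >= t).astype(int)
        f1 = f1_score(val_labels, preds, average="micro", zero_division=0)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy validation data (assumption): 4 nodes, 3 classes.
val_scores = np.array([
    [0.90, 0.20, 0.60],
    [0.10, 0.80, 0.30],
    [0.70, 0.65, 0.20],
    [0.40, 0.35, 0.90],
])
val_labels = np.array([
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
])

best_t, best_f1 = pick_threshold(val_scores, val_labels, [0.3, 0.5, 0.7])
print(f"best threshold: {best_t}, validation Micro-F1: {best_f1:.4f}")
```

The same loop shape works for other hyperparameters (learning rate, hidden size, and so on): train, score on the validation split, and submit only the configuration that scores best locally.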
