IAB version of uploaded dataset #2
Comments
Also, just wanted to confirm a couple more things:
Hi @thefirebanks, the dataset follows IAB v1. Migrating to v2 would be a huge effort and we have a small team, so we abandoned it. As I remember, the v2 release CSV file included a (partial) mapping to v1 categories. Some documents are assigned multiple labels, so the total number of unique documents is smaller than 1.16M. Yes, the tgz file is the sample evaluation dataset. Sorry, I can't provide the full eval set.
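To illustrate the point about multiple labels, here is a minimal sketch of why the number of label assignments can exceed the number of unique documents. The `(doc_id, label)` row layout, the sample values, and the helper name `count_assignments_and_documents` are assumptions made for illustration; they are not the dataset's actual schema.

```python
# Hypothetical (doc_id, label) pairs; not taken from the actual dataset.
labeled_rows = [
    ("doc-1", "Sports"),
    ("doc-1", "Health & Fitness"),  # doc-1 carries a second label
    ("doc-2", "Travel"),
    ("doc-3", "Sports"),
]


def count_assignments_and_documents(rows):
    """Return (number of label assignments, number of unique documents)."""
    unique_docs = {doc_id for doc_id, _ in rows}
    return len(rows), len(unique_docs)


assignments, documents = count_assignments_and_documents(labeled_rows)
print(assignments, documents)
```

A document with several labels contributes one row per label, so counting rows overstates the number of documents; deduplicating on the document identifier gives the smaller figure.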
Got it, thank you!!! I can confirm that there is a mapping in the second sheet of the v2 Taxonomy Excel file.
Hey @YipingNUS! Excellent work here. I was looking at the IAB content taxonomy website and I see that they have released versions up to 3.0. When I look at older versions (say 2.0), there are around 35 tier 1 categories and around 560 tier 2 categories. However, in your paper (and in the data), there are only 23 tier 1 categories and 354 tier 2 categories.
Which version of the IAB taxonomy did you use for this data? 1.0? The website says that version 1.0 is deprecated. If you did use 1.0, do you know if there's a mapping between the categories in versions 1.0 and 2.0? Thanks!