This repository has been archived by the owner on Feb 13, 2021. It is now read-only.
Thank you for providing the code.

We were trying to mine Wikipedia with this shell script for our entity linker, using the 2018/05/01 dump. We were able to generate the hash file, but surprisingly it was only 284 MB. In contrast, the pre-trained model you provide, the English hash trained from the November 2015 Wikipedia, is 1.3 GB.

@aasish, could you suggest what might be going wrong? Is it a matter of compression, or are we missing some entities? Also, is there a way to combine the two hash files so that we can account for the more recent entities?
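A minimal sketch of the kind of merge being asked about, assuming the hash files are pickled Python dictionaries mapping surface forms to per-entity counts; the actual on-disk format depends on this repository's mining script, so the `merge_hashes` helper and both path arguments here are hypothetical:

```python
import pickle

def merge_hashes(path_old, path_new, path_out):
    """Merge two entity-hash files by summing per-entity counts.

    Assumes each file is a pickled dict of the form
    {surface_form: {entity_title: count}} -- an assumption, not the
    repository's documented format.
    """
    with open(path_old, "rb") as f:
        old = pickle.load(f)
    with open(path_new, "rb") as f:
        new = pickle.load(f)

    # Deep-copy the inner dicts so the loaded inputs are not mutated.
    merged = {surface: dict(entities) for surface, entities in old.items()}

    for surface, entities in new.items():
        bucket = merged.setdefault(surface, {})
        for entity, count in entities.items():
            bucket[entity] = bucket.get(entity, 0) + count

    with open(path_out, "wb") as f:
        pickle.dump(merged, f)
```

Note that simply summing counts mixes statistics from two different dump dates, so downstream probabilities would need renormalizing; if the real format is not a pickled dict, the same merge logic applies once the load/save steps are swapped for the repository's own serialization.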