Make: 2019-03 enwiki fails during wiki.page_link with "A fatal error has been detected" #396
Comments
Strange. Quoting the comment in the code: "This is to keep the build of commons and its categories under the 10 attach table limit."
Oops. Wrote down the wrong table above. It should be the pagelinks db, not the category db. Note that pagelinks will always have the most records (it keeps track of all links from one page to another).

@desb42: Congrats on building enwiki (and thanks for the issues; I will go through them later, but am spending part of the weekend to release a new XOWA version). I'm somewhat surprised this didn't fail for you, but given that the error is in the SQLite lib (somewhere in C, not Java), it's possible this is an OS-related issue. For comparison, the error above occurred on an openSUSE 13 system.

It took me a few attempts, but I finally got the code working. Part of the problem was that XOWA was indexing 1.2 billion varchar fields, which seems to break some internal SQLite code (at least on Linux). The other part of the problem is that SQLite starts having issues when the database size goes beyond 50 GB (VACUUMing starts failing, but this time in Java). At any rate, the build is continuing on my side. Thanks!
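The load-then-index pattern being discussed can be sketched as below. This is a minimal illustration, not XOWA's actual code: the table and column names are hypothetical, and the real build operates on ~1.2 billion rows, where the single `CREATE INDEX` pass is the step that reportedly failed.

```python
import sqlite3

# Sketch of the bulk-load pattern at issue: insert all rows first,
# then build the index in one pass over the loaded data. Deferring the
# index is much faster than maintaining it per-insert, but at enwiki
# scale (~1.2 billion varchar values) this CREATE INDEX is what broke.
# Schema and names here are illustrative, not XOWA's actual schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE temp_pagelinks (src_id INTEGER, trg_title TEXT)")

rows = [(1, "Earth"), (1, "Moon"), (2, "Earth")]  # stand-in for the dump
conn.executemany("INSERT INTO temp_pagelinks VALUES (?, ?)", rows)

# The deferred index build: one pass over all loaded rows.
conn.execute("CREATE INDEX temp_pagelinks__trg ON temp_pagelinks (trg_title)")

count = conn.execute("SELECT COUNT(*) FROM temp_pagelinks").fetchone()[0]
print(count)  # 3
```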
Having just read your comments in Pglnk_tempdb_mgr.java, I can confirm. I built on a Windows 10 box. My enwiki xowa.wiki.pagelinks.sqlite3 is 55.5 GB, with tables and row counts: Mine is just below what might be the threshold of a billion rows causing the issue (as suggested).
Thanks for the confirmation. I suspected you were on Windows, but didn't know which version. The 1.2 billion number is from the original enwiki-20190301-pagelinks.sql dump. I believe this collapses to 998 million because some pages will have multiple links to the same page; pseudo-example below. My numbers are basically the same except for penalty_summary. See below. Hope this helps.
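The pseudo-example referred to above did not survive in this copy of the thread; the following is a hedged reconstruction of the point being made. It shows how a raw per-occurrence link count shrinks when multiple links from one page to the same target are collapsed to distinct pairs. Table and column names follow MediaWiki's `pagelinks` convention but the data is invented.

```python
import sqlite3

# Hypothetical reconstruction of the "multiple links to the same page"
# point: a raw dump row per link occurrence, where a page linking to
# [[Earth]] three times contributes only one distinct (from, title) pair.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pagelinks (pl_from INTEGER, pl_title TEXT)")
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?)",
    [
        (10, "Earth"),  # page 10 links to Earth...
        (10, "Earth"),  # ...and twice more elsewhere in the same article
        (10, "Earth"),
        (10, "Moon"),
        (20, "Earth"),
    ],
)

raw = conn.execute("SELECT COUNT(*) FROM pagelinks").fetchone()[0]
distinct = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT pl_from, pl_title FROM pagelinks)"
).fetchone()[0]
print(raw, distinct)  # 5 3
```

Five raw link occurrences collapse to three distinct source/target pairs, which is the mechanism proposed for the 1.2 billion figure shrinking.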
Just one other note before closing the issue:
Actually, the main reason the count probably shrinks is missing links (redlinks): the dump includes all links, even ones that don't refer to a real page.
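The redlink effect can be shown with a small sketch (simplified schema, invented data): joining the link table against the page table drops links whose target page does not exist, shrinking the effective count.

```python
import sqlite3

# Sketch of the redlink point: the pagelinks dump records every link,
# including links to pages that do not exist (redlinks). Joining against
# the page table drops those rows. Schema is simplified; real MediaWiki
# tables carry more columns (namespaces, ids, etc.).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page (page_title TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE pagelinks (pl_from INTEGER, pl_title TEXT)")

conn.executemany("INSERT INTO page VALUES (?)", [("Earth",), ("Moon",)])
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?)",
    # The last two targets have no page row: they are redlinks.
    [(1, "Earth"), (1, "Moon"), (1, "Marz"), (2, "Xyzzy")],
)

total = conn.execute("SELECT COUNT(*) FROM pagelinks").fetchone()[0]
resolved = conn.execute(
    "SELECT COUNT(*) FROM pagelinks JOIN page ON pl_title = page_title"
).fetchone()[0]
print(total, resolved)  # 4 2
```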
The 2019-03 enwiki build failed with the error below
I think the problem is that the CREATE INDEX failed on the temp categorypagelinks table b/c it has 1,215,821,988 rows. I should have a fix sometime tomorrow.