This repository has been archived by the owner on Nov 25, 2019. It is now read-only.

mmap.error: [Errno 12] Cannot allocate memory when starting to compile wiki dictionary #17

Closed
oha59 opened this issue Mar 1, 2012 · 4 comments

Comments

@oha59

oha59 commented Mar 1, 2012

After processing of the CDB files from dewiki finished, I got an error just as aardc was about to start compiling the .aar files.

100.00% t: 2 days, 16:11:42 avg: 10.0/s a: 1373941 r: 946049 s: 0 e: 0 to: 0 f: 0
Compiling .aar files
Traceback (most recent call last):
  File "/home/oliver/env-aard/bin/aardc", line 9, in <module>
    load_entry_point('aardtools==0.8.3', 'console_scripts', 'aardc')()
  File "/home/oliver/env-aard/lib/python2.6/site-packages/aardtools/compiler.py", line 1094, in main
    compiler.compile()
  File "/home/oliver/env-aard/lib/python2.6/site-packages/aardtools/compiler.py", line 507, in compile
    for volume in self.make_volumes(create_volume_func, articles):
  File "/home/oliver/env-aard/lib/python2.6/site-packages/aardtools/compiler.py", line 526, in make_volumes
    for title, serialized_article in articles:
  File "/home/oliver/env-aard/lib/python2.6/site-packages/aardtools/compiler.py", line 388, in sorted
    article_store = mmap.mmap(article_store_f.fileno(), 0)
mmap.error: [Errno 12] Cannot allocate memory

Compilation of an older dewiki dump finished successfully a couple of days ago.

Now I have the following files in the aardc working directory:
-rw------- 1 oliver oliver 38905125 2012-03-01 14:38 aa-8x_Ux4.titles
-rw------- 1 oliver oliver 3374091660 2012-03-01 14:38 aa-Pg6S6z.articles
-rw------- 1 oliver oliver 41759820 2012-03-01 14:38 aa-R5GzZO.index

Processing took two days and 16 hours, so I don't feel like doing it again.
Free disk space is 7.7 GB, but I'm not sure whether that is the problem.
Is there a way to start compiling the .aar files from the existing titles, articles and index files, without having to create them again from the CDB files?

@itkach
Member

itkach commented Mar 1, 2012

It's probably not too hard to hack compiler.py to pick up and continue, but this is not something that works out of the box. Are you trying to do this on a 32-bit machine, by any chance? If so, this is not going to work: the compiler uses memory-mapped files, and 32-bit doesn't provide enough address space for something as big as dewiki.
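For illustration only: the traceback shows the whole article store being mapped in one call (`mmap.mmap(fileno, 0)`). A general way to read a large file in a small address space is to map one bounded window at a time. The sketch below is not how aardc works and is not a patch for compiler.py; `read_range` and its `window` parameter are hypothetical names, and the code targets Python 3 rather than the Python 2.6 in the traceback.

```python
import mmap
import os


def read_range(path, start, length, window=64 * 1024 * 1024):
    """Return bytes [start, start + length) of a file.

    Only one window of at most `window` bytes is mapped at any time,
    so the address space needed stays bounded regardless of file size.
    """
    gran = mmap.ALLOCATIONGRANULARITY
    # A window of at least two granules guarantees progress on every pass.
    window = max(window, 2 * gran)
    chunks = []
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        end = min(start + length, size)
        pos = start
        while pos < end:
            # mmap offsets must be a multiple of the allocation granularity.
            base = (pos // gran) * gran
            map_len = min(window, size - base)
            m = mmap.mmap(f.fileno(), map_len,
                          access=mmap.ACCESS_READ, offset=base)
            try:
                lo = pos - base
                hi = min(end - base, map_len)
                chunks.append(m[lo:hi])
                pos = base + hi
            finally:
                m.close()
    return b"".join(chunks)
```

A call such as `read_range("aa-Pg6S6z.articles", 0, 1024)` would then touch only the first window of the file. Whether the compiler's sorting step could be restructured this way is an open question.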

@oha59
Author

oha59 commented Mar 1, 2012

I am currently on a 64-bit machine, but I'm running 32-bit Ubuntu 10.10. I also have a 32-bit machine, but unfortunately it has no Ubuntu installed at the moment. Could my problem be related to insufficient disk space?

@oha59 oha59 closed this as completed Mar 1, 2012
@oha59 oha59 reopened this Mar 1, 2012
@oha59
Author

oha59 commented Mar 1, 2012

Hit the wrong button...

@itkach
Member

itkach commented Mar 1, 2012

I don't think this is related to disk space. The fact that the actual hardware is 64-bit doesn't matter - if the OS is 32-bit, it is 32-bit for all intents and purposes. The article store is >3G. While 32-bit can technically address 4G, finding >3G of contiguous address space (which is what mmap needs) is a tough proposition. To compile something like dewiki, aardc needs a 64-bit OS, sorry.
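A rough sketch of that arithmetic, using the article store size from the directory listing above (3374091660 bytes). The 3 GiB figure for 32-bit user address space is an assumption (a common default split on 32-bit Linux, not something confirmed for this machine), and `pointer_bits` and `can_map_whole_file` are hypothetical helper names, not part of aardtools.

```python
import struct

# Size of aa-Pg6S6z.articles, taken from the directory listing.
ARTICLE_STORE_BYTES = 3374091660

# Assumption: 32-bit Linux commonly leaves about 3 GiB of address
# space to each user process; the real limit depends on the kernel.
ASSUMED_32BIT_USER_SPACE = 3 * 1024 ** 3


def pointer_bits():
    """Pointer width of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def can_map_whole_file(file_size, bits, user_space_32=ASSUMED_32BIT_USER_SPACE):
    """Crude upper-bound check: could one mmap of file_size bytes fit at all?

    This ignores fragmentation, so a True result is necessary but not
    sufficient for the mapping to succeed.
    """
    limit = user_space_32 if bits == 32 else 2 ** (bits - 1)
    return file_size < limit


if __name__ == "__main__":
    for bits in (32, 64):
        print(bits, can_map_whole_file(ARTICLE_STORE_BYTES, bits))
```

Under these assumptions the file is larger than the entire 32-bit user address space, so the mapping cannot succeed no matter how much RAM or disk is free. The OS and interpreter width matter here rather than the CPU: `pointer_bits()` reports 32 for a 32-bit Python even on 64-bit hardware.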
