
Fix unlink chunk with mismatched lengths status taking minutes #76

Closed
Fenixin wants to merge 1 commit

Conversation

Fenixin
Contributor

@Fenixin Fenixin commented Oct 4, 2014

Hello!

Just found what I think is a bug. When you try to unlink a chunk that has the mismatched lengths status, and the length field of the chunk is ridiculously big, the call can take minutes to finish because of how requiereblocks() calculates the block size.

Here is a small fix for that problem.

Please, if you don't like the fix, feel free to implement it in any way you like.
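
(For illustration, a minimal sketch of the kind of change involved; the names and the loop are simplified stand-ins, not the actual NBT code. The point is that the number of 4 KiB sectors a chunk needs can be computed with one ceiling division instead of being counted up step by step, so a bogus multi-gigabyte length no longer costs hundreds of thousands of iterations.)

```python
SECTOR_LENGTH = 4096  # a Minecraft region file is divided into 4 KiB sectors

def required_blocks_slow(length):
    """Illustrative only: grow the sector count one step at a time.
    With a bogus length of ~3 GB this loops roughly 730,000 times per call."""
    blocks = 1
    while blocks * SECTOR_LENGTH < length:
        blocks += 1
    return blocks

def required_blocks_fast(length):
    """Equivalent result via ceiling division: constant time."""
    return max(1, (length + SECTOR_LENGTH - 1) // SECTOR_LENGTH)

assert required_blocks_slow(10000) == required_blocks_fast(10000) == 3
```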

@macfreek
Collaborator

Hi @Fenixin, thanks for your report. Indeed, the loop could be smarter. I just committed a patch to my own repository: macfreek@2e168fe

I also wrote a test case for your scenario, but couldn't reproduce the slowness you reported: on my laptop, running all tests takes 1.9 seconds, before and after the patch.

Could you check if this fix also helps your performance issue? If not, could you email me a region file that exhibited this issue?
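
(A rough sketch of such a test: the region-file layout below follows the standard Minecraft format, but RegionFile(fileobj=...) and unlink_chunk(x, z) are assumptions about the library's API, not a copy of the test that was actually committed.)

```python
import io
import struct
import time

from nbt import region

SECTOR = 4096

def make_corrupt_region():
    """Build an in-memory region file whose first chunk header claims a
    ridiculously large length, far beyond the end of the file."""
    buf = io.BytesIO()
    # Location table: chunk (0,0) starts at sector 2 and occupies 1 sector.
    buf.write(struct.pack(">I", (2 << 8) | 1))
    buf.write(b"\x00" * (SECTOR - 4))          # rest of the location table
    buf.write(b"\x00" * SECTOR)                # timestamp table
    # Chunk header: 4-byte big-endian length, 1-byte compression type.
    buf.write(struct.pack(">I", 2987075640))   # bogus, mismatched length
    buf.write(b"\x02")                         # zlib compression
    buf.write(b"\x00" * 100)                   # a little garbage "data"
    buf.seek(0)
    return buf

def test_mismatched_length_is_fast():
    start = time.time()
    reg = region.RegionFile(fileobj=make_corrupt_region())
    try:
        reg.unlink_chunk(0, 0)                 # should return quickly
    except Exception:
        pass                                   # a corrupt-chunk error is fine
    assert time.time() - start < 2.0
```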

@macfreek
Collaborator

PS: Sorry for the slow feedback! You submitted the patch on my son's 2nd birthday, and I completely forgot about it afterwards.

@macfreek
Collaborator

@Fenixin, let me know if you had a chance to look at the patch.

@Fenixin
Contributor Author

Fenixin commented Oct 24, 2014

Sorry. Will do this weekend!

@Fenixin
Contributor Author

Fenixin commented Oct 26, 2014

I won't be able to look at this until tomorrow. Sorry again.

@macfreek
Collaborator

No problem at all! I was just curious if this patch works for you, especially since I could not reproduce your problem. I didn't mean to rush you. :)

@Fenixin
Contributor Author

Fenixin commented Oct 28, 2014

I'm really busy right now. I will give it a look as soon as I get some time.

It seems I have lost the problematic region file :/. If I find it, I will send it to you.

@Fenixin
Contributor Author

Fenixin commented Nov 29, 2014

Sorry for my dramatic disappearance.

Just tested your commit and I can say that it solves the issue. Thanks!

I'm sending you the (very broken) region file so you can test it yourself (if you want to). I will use the public email address from your GitHub account.

Feel free to close this pull request and push your changes.

@macfreek
Collaborator

macfreek commented Dec 8, 2014

@Fenixin Thanks for the file. I could not reproduce your result at first. Reading the file was about as fast before and after my fix.

However, I have a hunch about the root cause.

What happens is that nbt.region.RegionFile attempts to read a chunk, even if it seems corrupt. This is by design: the idea is that it may still be possible to retrieve some of the data after all. If an error occurs while reading the data, it raises the root cause: a corrupted chunk.

In the current code, if the headers say the chunk is 2987075640 bytes, it attempts to read 2987075640 bytes, even if the file is only 4202496 bytes. What happens next depends highly on the underlying operating system: Python simply proxies the IO call to the IO library of the OS. On my laptop, the behaviour also depends on exactly how the file was opened. In Python 2, using the open() function, reading beyond the end of the file returned immediately with an empty string. In Python 3, or in Python 2 using the io.open() function (which returns an io.BufferedReader object), reading beyond the end of the file raises an IOError: [Errno 22] Invalid argument (EINVAL). I suspect that your operating system actually attempts to read 2987075640 bytes (nearly 3 GByte), which is what took such a long time.
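
(To make the platform difference concrete, a small self-contained snippet; the exact outcome of the oversized read depends on the OS and on how the file was opened, as described above.)

```python
import io

# A tiny stand-in for the 4202496-byte region file.
with open("small.bin", "wb") as f:
    f.write(b"x" * 4096)

bogus_length = 2987075640   # the chunk length claimed by the corrupt header

# Seek beyond the end of the file and ask for ~3 GB, as the current code does.
# Under Python 2's built-in open() this returned b"" immediately on my laptop;
# under Python 3 (or Python 2's io.open(), i.e. an io.BufferedReader) the same
# call raised IOError: [Errno 22] Invalid argument.  Other operating systems
# may instead really try to read ~3 GB, which is the reported slowness.
with io.open("small.bin", "rb") as f:
    f.seek(5000)
    try:
        data = f.read(bogus_length)
        print(f"read returned {len(data)} bytes")
    except OSError as err:          # IOError is an alias of OSError in Python 3
        print(f"read failed: {err}")
```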

I think the general design is still correct: attempt to read a seemingly corrupt chunk, and only raise an exception if that fails. What I'll fix is that a read operation should not attempt to read beyond the end of the file. I still have to think about whether we should surround the seek() and read() operations with a try/except block.
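
(A minimal sketch of that clamping idea, using a hypothetical helper rather than the exact patch:)

```python
import os

def safe_read(fileobj, offset, declared_length):
    """Read at most the bytes that actually exist between offset and the end
    of the file, instead of trusting a possibly corrupt chunk header."""
    fileobj.seek(0, os.SEEK_END)
    file_size = fileobj.tell()
    if offset >= file_size:
        return b""                                  # header points past EOF
    length = min(declared_length, file_size - offset)
    fileobj.seek(offset)
    return fileobj.read(length)
```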

macfreek added a commit to macfreek/NBT that referenced this pull request Jan 1, 2015
This closes twoolie#76.
Make it clear that get_blockdata() always attempts to read data, even if the headers are corrupt. In nearly all cases, it will still fail, but at least it tried.
@macfreek macfreek closed this in #79 Jan 4, 2015
macfreek added a commit that referenced this pull request Jan 4, 2015
@macfreek
Collaborator

macfreek commented Jan 4, 2015

Fixed with merge #79.
