Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash while loading Fuji XTrans files #3154

Closed
pi99y opened this issue Feb 17, 2016 · 160 comments
Closed

Crash while loading Fuji XTrans files #3154

pi99y opened this issue Feb 17, 2016 · 160 comments
Assignees
Labels
type: bug Something is not doing what it's supposed to be doing

Comments

@pi99y
Copy link

pi99y commented Feb 17, 2016

Hi, I get quite frequent crashes that seem to be caused by older pp3 files, but it's really hard to pinpoint the issue, since it's a bit "random"...

Tested on:

  • OS: Windows 8.1 64bit, 8GB ram
  • RT: RawTherapee_WinVista_64_Gtk3_Debug_4.2.730.zip Changeset: f5d5083
  • Fuji X-Pro1 XTrans RAF files

I open up a folder with a at least a few images, try to load a file and it sometimes crashes. It can either crash while still in (or returning to) "file browser" view or already in the "editing view", but seems like it always crashes while either "loading", "demosaicing" or "processing" - not later on while editing (if i can get to that stage).

It seems not to be tied to specific image since it sometimes crashes on one file, some other time on another file, but nothing really consistant.

I tried to narrow it down, but I can only say that it appears that at least two things help while reproducing:

  • thumbnails of images are not yet created (It's easier to reproduce if I clear the cache before opening RT) Since this mostly happens during the thumbnails are being generated it helps to have many images in the folder - so you have more time to reproduce it. This is probably also why I can't reproduce this with a single file (that crashed before).
  • images have old pp3 files present (it doesn't happen if I create new pp3's with the current RT version, although the general "profile" I load has roughly the same preferences - since that's the one I always start with)
    PP3 files in this case say "AppVersion=4.1.80 Version=321".
  • it could be raf/xtrans specific (can't test other)

Most of the time it crashed I get the debug report saying "pure virtual method called, terminate called without an active exception" but not always (sometimes there is no error message in the debug window)

Also note that the "failed to fetch network locations" error you see in the screenshot pops up every time I start RT and seems not to be related to this issue.

Screenshot: https://i.imgur.com/PMbtxiM.jpg
PP3: http://filebin.net/tvof90mays
(I'm not missing any files that are referred to in a profile, just the RT version is newer)

@heckflosse heckflosse added type: bug Something is not doing what it's supposed to be doing scope: file format Camera or image file formats labels Feb 17, 2016
@heckflosse
Copy link
Collaborator

@pi99y It's related to xtrans files. I can reproduce it in Release build of master branch. It's not related to old pp3 files (at least the crash I get).

@heckflosse heckflosse changed the title Crash while loading images (with older pp3 files) Crash while loading Fuji XTrans files Feb 17, 2016
@heckflosse heckflosse self-assigned this Feb 17, 2016
@sguyader
Copy link

sguyader commented Feb 17, 2016 via email

@heckflosse
Copy link
Collaborator

I can sometimes reproduce it with a X-Pro 2 dng file. Sometimes it crashes, sometimes I got black output when saving to tif (bit not when saving to jpg) and sometimes it works fine. Looking

@pi99y
Copy link
Author

pi99y commented Feb 17, 2016

@heckflosse Great! I mean, great you can reproduce it. You're much likley to find the cause than me. Thanks for looking into it.

@heckflosse
Copy link
Collaborator

Unfortunately I can't reproduce it in debug build. I'll run it through valgrind

@pi99y
Copy link
Author

pi99y commented Feb 17, 2016

For me it's also present (the crash) in:
Version: 4.2.699, Changeset: 3c016c4
Version: 4.2.674, Changeset: 676a444
Version: 4.2.670, Changeset: 1a91217
Version: 4.2.539, Changeset: 6260b47
Version: 4.2.510, Changeset: 7ef35ab

EDIT:
I can now CONFIRM that it also happens (but seems less frequent) with pp3s of current version (.730). However it never happened to me if I removed the pp3 altogether. So it might be still something with reading/applying pp3s.

EDIT: "backtrace" screenshot
https://i.imgur.com/1GAusId.jpg

EDIT: whole gdb log
http://filebin.net/ctjdqlr0j7

@pi99y
Copy link
Author

pi99y commented Feb 17, 2016

Disabling film simulation in those pp3 files seems to "fix" it...

I "mass edited" the:

[Film Simulation]
Enabled=true

to

[Film Simulation]
Enabled=false

in the pp3 files and now I either cannot reproduce or it is at least much less frequent (no crash yet)... so my crash is somehow related to that...

@heckflosse
Copy link
Collaborator

I opened #3156 because I think the crashes I observed have a different reason than the crashes pi99y observed.

@pi99y
Copy link
Author

pi99y commented Feb 17, 2016

I did a test:

I downloaded a PEF image and copied it 50 times, applied my usual default profile and then applied another bundled profile on top of that.
I then set about 20 different film simulations across those 50 images.
I closed RT and cleared cache.
I ran RT again and tried to crash it, I couldn't.

Then I did the same test, but I uset a single FUJI Xtrans file (50 copies of it).
Other steps were the same...
I could crash it again.

So for now it looks it could be a combination of Xtrans & film-simulation...

@pi99y
Copy link
Author

pi99y commented Feb 18, 2016

@heckflosse Sadly, @sguyader 's new RawTherapee_WinVista_64_Gtk3_Debug_4.2.764 Changeset: 5086c0d didn't solve this.

Here are two new backtraces of this crash: http://filebin.net/ctjdqlr0j7/gdb2.txt and http://filebin.net/ctjdqlr0j7/gdb3.txt

Thanks.

@sguyader
Copy link

@pi99y Ingo's fix was pushed to the Master branch, so if you want to test it you should try my Master (4.2.677) build.

@pi99y
Copy link
Author

pi99y commented Feb 18, 2016

Oh, cool, so there still is a chance! :) I'll report back.

@pi99y
Copy link
Author

pi99y commented Feb 18, 2016

Unfortunately it's still the same (with Master branch: 4.2.677)... still crashes. New backtrace log: http://filebin.net/ctjdqlr0j7/gdb4.txt

@Karteileiche
Copy link

Hello!

any news about the bug? If required, i can provide raw files too which crash RT.

@heckflosse
Copy link
Collaborator

Please provide files 👍

@Karteileiche
Copy link

The problem exists since i updated my XE2 to FW4.0.

I have uploaded 3 files at http://filebin.net/hv52p3oyjh . Different content but same parameters. If you need RAW with different settings please specify.

I use RT 4.2.0 with Linux Mint and with Windows 7 64bit (4.2.699, 64bit). Both quit when i want to open a file from the file manager to the edit window. The file manager itself shows all files with correct meta data.

@heckflosse
Copy link
Collaborator

@Karteileiche After solving the trouble with your zip (it contains a tar file with extension .RAF) I could open all three files without problems using rt 4.2.745 on Win7/64

@Karteileiche
Copy link

Sorry for the ZIP-problem.. i just used the "compress" option in Mint/Dolphin file manager.

I did not know that there is a newer version, it is not listed at http://rawtherapee.com/downloads.
But i found the links at the discuss platform and tried. It seems to be fixed in that version.

Thanks for your Help.

@heckflosse
Copy link
Collaborator

@Karteileiche Ok, thank you for the report and the example files 👍

@pi99y
Copy link
Author

pi99y commented Mar 7, 2016

Sorry for the inactivity. I can still easily reproduce the crash (with the images I provided to @heckflosse) on the latest "master" RawTherapee version 4.2.752 b3f6f1c
But mainly on those files with old pp3s. Editing images without (or current) profiles seems totaly fine.

@heckflosse
Copy link
Collaborator

@pi99y I tried to reproduce the crashes with the images you provided. But all works fine here. If you have one image with the corresponding pp3 which allows to clearly reproduce the crash, please tell me.

@pi99y
Copy link
Author

pi99y commented Mar 7, 2016

No, sorry, for me it seems it's the process of creating previews/thumbnails while trying to edit what seems to crash it. A specific SET of files. Sometimes one fine works fine, the other time it crashes...

ALSO: could it be that you can't reproduce it from my files since you don't have CLUTs at the right location (as referenced in my pp3s)? Previously I reported that it seems film simulation related...

@heckflosse
Copy link
Collaborator

@pi99y I don't remember. I'll try again

@candardo
Copy link

RAF files from my X-E2 firmware 4.0 crash RawTherapee on Linux (Debian sid, RT 4.2.something) and work on Windows (W10 RT 4.2.905). Having no problems with the latest (windows) version I expected this bug to closed and filed a bug to the Debian BTS asking for a new build.

Any updates on this issue?

@Karteileiche
Copy link

@candardo

you are right, the linux version lacks on this fix. i am also waiting for an update.

@heckflosse
Beside the XE2 FW4.0 incompatibility, i experienced crashes several times when saving files during the edit process (saving directly form the edit window, not using the queue).

@Beep6581
Copy link
Owner

  1. Trouble with a raw file? Always provide a sample raw file, use http://filebin.net/
  2. Crash? http://rawpedia.rawtherapee.com/How_to_write_useful_bug_reports

@Floessie
Copy link
Collaborator

Floessie commented May 9, 2016

@heckflosse Okay, I'll investigate tomorrow. :)

@heckflosse
Copy link
Collaborator

@Floessie great 👍

@Floessie
Copy link
Collaborator

Floessie commented May 10, 2016

@heckflosse I've got an idea for my evening session: If we can't go the high road (__m128i), we'll go the low road (__m64) with _mm_movepi64_pi64, _mm_slri_si128, and _mm_cvtpu16_ps. That should be viable.

I've already started a "dedusting" branch last evening, which I will push for you to test when I'm through with the vfloat2 conversion. I guess, we will gain some additional speed from this. 👍

@Floessie
Copy link
Collaborator

@heckflosse Could you please check, if it still crashes for you?

@heckflosse
Copy link
Collaborator

@Floessie I just pulled your changes. Making generic build now...

@heckflosse
Copy link
Collaborator

@Floessie Generic Release build crashes, making Generic Debug build now...

@heckflosse
Copy link
Collaborator

@Floessie Sorry, tested wrong branch....

@Floessie
Copy link
Collaborator

@heckflosse Phew, you like to raise the tension! :)

@heckflosse
Copy link
Collaborator

@Floessie dedusting... branch still crashes with generic debug build. Making further tests now...

@Floessie
Copy link
Collaborator

@heckflosse I not only changed getClutValue() but also the STFUs etc. in improcfun.cc...

@heckflosse
Copy link
Collaborator

@Floessie with added -msse4 flag dedusting... generic build works fine!

@Floessie
Copy link
Collaborator

@heckflosse So it's not the alignment but either _mm_movepi64_pi64 (new, thus unlikely) or _mm_cvtpu16_ps (used before, so likely). _mm_cvtpu16_ps translates to multiple instructions, so your compiler might be wrong here...

@heckflosse
Copy link
Collaborator

@Floessie I made further tests. Crashes with Generic Release build without -msse4. Works fine with Generic Release build with -msse4.
I'll test the brute force _mm_set_ps() variant now on non -msse4 Generic build

@heckflosse
Copy link
Collaborator

@Floessie Brute force _mm_set_ps() variant work fine for non -msse4 Generic build

@heckflosse
Copy link
Collaborator

@Floessie Here's the code of _mm_cvtpu16_ps from xmmintrin.h at my system if that's of any use:

/* Convert the four unsigned 16-bit values in A to SPFP form.  */
extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, __artificial__))
_mm_cvtpu16_ps (__m64 __A)
{
  __v2si __hisi, __losi;
  __v4sf __zero, __ra, __rb;

  /* Convert the four words to doublewords.  */
  __losi = (__v2si) __builtin_ia32_punpcklwd ((__v4hi)__A, (__v4hi)0LL);
  __hisi = (__v2si) __builtin_ia32_punpckhwd ((__v4hi)__A, (__v4hi)0LL);

  /* Convert the doublewords to floating point two at a time.  */
  __zero = (__v4sf) _mm_setzero_ps ();
  __ra = __builtin_ia32_cvtpi2ps (__zero, __losi);
  __rb = __builtin_ia32_cvtpi2ps (__ra, __hisi);

  return (__m128) __builtin_ia32_movlhps (__ra, __rb);
}

@Floessie
Copy link
Collaborator

@heckflosse GCC 5.3.1 has exactly the same code. I've got a clang 3.5.0 version here (annotations by me), which is a bit different, and could be tried by pasting it as mm_cvtpu16_ps right in front of getClutValues():

static __inline__ __m128 __attribute__((__always_inline__, __nodebug__))
_mm_cvtpu16_ps(__m64 __a)
{
  __m64 __b, __c;
  __m128 __r;

  __b = _mm_setzero_si64(); // MMX
  __c = _mm_unpackhi_pi16(__a, __b); // MMX
  __r = _mm_setzero_ps(); // SSE
  __r = _mm_cvtpi32_ps(__r, __c); // SSE
  __r = _mm_movelh_ps(__r, __r); // SSE
  __c = _mm_unpacklo_pi16(__a, __b); // MMX
  __r = _mm_cvtpi32_ps(__r, __c); // SSE

  return __r;
}

I have no idea, what's wrong with your compiler and _mm_cvtpu16_ps. Maybe we are on the wrong track, don't know. The last thing I would try is inline assembly, but only when worse comes to worse.

@Floessie
Copy link
Collaborator

Floessie commented May 11, 2016

@heckflosse Don't know how old and still relevant this is, but there are some hints you could try.

Ah, and also see this and this. There is definitely a problem with MinGW, stack alignment, and SSE, at least on 32 bits.

HTH,
Flössie

@heckflosse
Copy link
Collaborator

@Floessie Yes, that's a known problem. For this reason RT win32 builds are made with -mstackrealign. For win64 builds this was not necessary in past and should still not be necessary.
Also, the sse4 code works fine here. It's just the non sse4 code which crashes win64 release builds.

@heckflosse
Copy link
Collaborator

heckflosse commented May 11, 2016

@Floessie I found a solution for SSE2 (non SSE4) code which does not need MMX registers and does not crash. Here's some test code which can easily be adopted to getClutValues():

    uint16_t dummy[8] = {1,2,3,4};
    __m128i v_values = _mm_loadu_si128(reinterpret_cast<const __m128i*>(dummy));

    vint testi = _mm_shuffle_epi32(v_values, _MM_SHUFFLE(1,0,1,0));
    vint testi2 = _mm_shufflelo_epi16( testi, _MM_SHUFFLE(1,1,0,0));
    testi2 = _mm_shufflehi_epi16( testi2, _MM_SHUFFLE(3,3,2,2));
    testi2 = vandm(testi2,_mm_set1_epi32(0x0000ffff));
    vfloat test = _mm_cvtepi32_ps(testi2);

    float test2[4];
    _mm_storeu_ps(test2,test);
    printf("%f %f %f %f\n",test2[0],test2[1],test2[2],test2[3]);

@heckflosse
Copy link
Collaborator

@Floessie Here is a patch for your dedusting branch.

@Floessie
Copy link
Collaborator

@heckflosse YEAH!!! Breakthrough! 👍 Not only this shall work (still cannot test, as I'm not on Windows), but it's nearly as fast as the SSE4 version (I was a bit skeptical at first due to the sheer number of instructions):

  • 233ms for SSE4
  • 243ms for SSE2
  • 275ms with _mm_set_ps() implemented for comparison

It's committed on my dedusting branch for you to retest.

Great work! Thanks a lot!

@heckflosse
Copy link
Collaborator

@Floessie Retested! Works fine. Feel free to change the var names and so on to your coding style and make a PR. Maybe astyle it too. Don't remember if I did that.

@heckflosse
Copy link
Collaborator

@Floessie _mm_set1_epi32(0x0000ffff) could be called once instead of twice in my code. But maybe compiler optimizes it away...

@heckflosse
Copy link
Collaborator

heckflosse commented May 11, 2016

@Floessie

(I was a bit skeptical at first due to the sheer number of instructions)

The number of instructions are:

  1. SSE4 version : 6 per getClutValues() call
  2. SSE2 version: 12 per getClutValues() call (13 when _mm_set1_epi32(0x0000ffff) is called twice)
  3. MMX version: 16 per getClutValues() call

if I counted correctly

@Floessie
Copy link
Collaborator

@heckflosse Yes, it's clear to me that a line count doesn't translate to the number of instructions, that's why I wrote "at first". :) I'll make the mask a constant and file the PR this evening. Again, thanks for solving that crash. Still, there is no explanation as to why the MMX instructions miscompile in this case, or have you found an answer?

@heckflosse
Copy link
Collaborator

@Floessie I searched a bit for information on Windows 64 bit and MMX but found nothing reliable. Anyway, it's solved now. That's the important thing.

@Floessie
Copy link
Collaborator

@heckflosse Definitely! I'm very glad, you found such a perfect solution. And here is my PR.

@heckflosse
Copy link
Collaborator

heckflosse commented May 12, 2016

@Floessie May I suggest a tiny change?
In the current implementation the +8 at buffer allocation can be reduced to +4 and the comment in that line is also not true anymore as we use the unaligned 16 bytes load now for non SSE4 code too ;-)

heckflosse added a commit that referenced this issue May 14, 2016
HaldCLUT cleanups after the dust (#3154) has settled
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Something is not doing what it's supposed to be doing
Projects
None yet
Development

No branches or pull requests

8 participants