Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault with dilate #125

Closed
ramyma opened this issue Jan 21, 2024 · 14 comments
Closed

Segmentation fault with dilate #125

ramyma opened this issue Jan 21, 2024 · 14 comments

Comments

@ramyma
Copy link
Contributor

ramyma commented Jan 21, 2024

When I use dilate with higher pixel values (depending on the image), I get a segmentation fault and the whole Beam crashes:

28708 segmentation fault (core dumped)  iex -S mix phx.server

I created a livebook to replicate the issue (uploaded as a zip file since GitHub doesn't support direct livemd file uploads):
dilate_seg_fault.zip

Input image:
image

<!-- livebook:{"file_entries":[{"name":"alpha_test.png","type":"attachment"},{"name":"image.png","type":"attachment"}]} -->

# Image Dilate: Segmentation Fault

```elixir
Mix.install([
  {:kino, "~> 0.12.2"},
  {:image, "~> 0.41.0"}
])
```

## Load image

![](files/image.png)

```elixir
image =
  Kino.FS.file_path("image.png")
  |> File.read!()
  |> Image.from_binary!()
```

## Extract mask from alpha and dilate

```elixir
new_image =
  image
  |> Image.convert_to_mask!()
```

```elixir
new_image
|> Image.dilate!(500)
```

I used a high pixel value to demonstrate that it crashes; with lower values like 300 it crashes sporadically.

The code simply extracts the alpha channel as a mask and applies dilate to it:

image
        |> Image.convert_to_mask!()
        |> Image.dilate!(500)

It depends on the image, but for higher pixel values like 500 it definitely crashes.

@kipcole9
Copy link
Collaborator

A segmentation fault isn't good - I will put together a minimal case using your example and report to @akash-akya and the [vix](https://github.com/akash-akya/vix repo) since the underlying cause would seem to come from libvips itself.

However, I've not seen a use case for a large dilation amount. Can you let me know what you're trying to achieve - perhaps there is another way?

The implementation of Image.dilate/2 is definitely very inefficient for large values (its actually reducing over the range 1..n and generating a new image dilated by one pixel for each reduction using a median filter). I added it primarily to support the Image.meme/3 function.

In addition, I would expect the results to be visually quite unpleasant given the lack of anti-aliasing.

There are some alternative implementations I can look at using libvips morphology which may - or may not - improve the situation.

You might find the following discussions on the libvips repo of some help:

@kipcole9
Copy link
Collaborator

kipcole9 commented Jan 21, 2024

There are some alternative implementations I can look at using libvips morphology which may - or may not - improve the situation.

I recall now why the implementation uses a rank filter. It's because the rank filter can work on any non-complex image whereas the morphogical functions expect a binary (0 or 255) mask. So there may not be a "better" implementation.

It's possible that using Vix.Vips.Operation.rank/4 with a larger window might help but I haven't tried that. The current implementation of Image.dilate/2 uses a 3x3 window (index should be m * m - 1 where m is the window size for dilation).

@ramyma
Copy link
Contributor Author

ramyma commented Jan 22, 2024

I used an exaggerated value in the example for demonstration purposes.

For the real use case, I'm using a value of 40 and it crashes with an image size of 512 x 512.
I'm using dilate to expand a mask to get better blending; blur and feather kinda erode the mask rather than expand it.

Like you said, segmentation fault is irrecoverable either way.

Please let me know if I can provide any more context.

Thanks!

@kipcole9
Copy link
Collaborator

Agreed that segmentation fault is a bug - I'll work on an upstream report on that. I'd be curious to see what you're working on if thats appropriate in the future.

@ramyma
Copy link
Contributor Author

ramyma commented Jan 22, 2024

Agreed that segmentation fault is a bug - I'll work on an upstream report on that. I'd be curious to see what you're working on if thats appropriate in the future.

Appreciate it 🙏

You can check out the project that I'm working on here.

@kipcole9
Copy link
Collaborator

Closing in favour of akash-akya/vix#144 (will reopen if required).

@jcupitt
Copy link

jcupitt commented Jan 27, 2024

Huge pipelines will crash with a C stack overflow.

Repeating operations in a loop works well for immediate mode image processing systems, but it will perform very badly for demand-driven systems like libvips -- repeating an operation 500 times will make a pipeline 500 nodes long!

It's best to find another way to solve the problem. For example, you can dilate the mask with a big radius, then use the dilated mask to dilate the image.

You can also render each iteration to a new memory image. In python:

for _ in range(500):
    image = image.dilate(3).copy_memory()

Now you are making 500 short pipelines and rendering each one to a big area of memory, in effect what systems like PIL or imagemagick do. You'll lose streaming and large image support, but pipeline length will be constrained.

@ramyma
Copy link
Contributor Author

ramyma commented Jan 27, 2024

Huge pipelines will crash with a C stack overflow.

Repeating operations in a loop works well for immediate mode image processing systems, but it will perform very badly for demand-driven systems like libvips -- repeating an operation 500 times will make a pipeline 500 nodes long!

It's best to find another way to solve the problem. For example, you can dilate the mask with a big radius, then use the dilated mask to dilate the image.

You can also render each iteration to a new memory image. In python:

for _ in range(500):
    image = image.dilate(3).copy_memory()

Now you are making 500 short pipelines and rendering each one to a big area of memory, in effect what systems like PIL or imagemagick do. You'll lose streaming and large image support, but pipeline length will be constrained.

It crashes even with a value of 30 or 40 for some images, moreover, crashing the VM is irrecoverable; at least getting an exception thrown would be more graceful.
The 500 value was just an extreme value to be able to reproduce the crash consistently.

@jcupitt
Copy link

jcupitt commented Jan 27, 2024

It depends how large your C stack is, how much is used by your VM, and where the stack pointer is when you start evaluation. I tried in python:

#!/usr/bin/env python

import sys
import pyvips

image = pyvips.Image.new_from_file(sys.argv[1])

for _ in range (int(sys.argv[2])):
    image = 255 - image

data = image.write_to_memory()

I see:

$ ./overflow.py ~/pics/k2.jpg 1
$ ./overflow.py ~/pics/k2.jpg 10
$ ./overflow.py ~/pics/k2.jpg 100
$ ./overflow.py ~/pics/k2.jpg 1000
$ ./overflow.py ~/pics/k2.jpg 10000
Killed

It's not something that libvips can really warn about or control, though vix might be able to. See eg. libvips/libvips#3832 (comment)

@akash-akya
Copy link
Contributor

akash-akya commented Jan 27, 2024

@kipcole9 the issue you shared fail with error bus error. But here in the description I see segmentation fault. Are we sure the sample test script is reproducing the same issue?

@ramyma is it possible to share core-dump?

@ramyma
Copy link
Contributor Author

ramyma commented Jan 28, 2024

@ramyma is it possible to share core-dump?

Weirdly there's no erl_crash.dump generated, it just crashes with Segmentation fault (core dumped).

@kipcole9
Copy link
Collaborator

I have refactored Image.dilate/2 and Image.erode/2 based upon this discussion and @jcupitt's advice. The changelog for the upcoming release now reads:

Breaking Changes

  • Image.erode/2 and Image.dilate/2 now take a radius parameter rather than a pixels parameter. Both functions have been refactored to allow a radius in the range 1..5 with a default of 1. The radius represents the dimension of the matrix used in the Vix.Vips.Operations.range/4 function that underpins dilation and erosion. As such they represent the approximate number of pixels eroded or dilated. In addition, this function now results in a single libvips operation. The previous implementation created n operations (where n was the value of the pixels param) that could result in a slow imaging pipeline and in some cases a segfault of the entire VM due to stack space exhaustion in libvips.

  • The signature for Image.linear_gradient/{1..3} has changed. The function now takes:

    • An image and an optional keyword list of options
    • A width and height as numbers and a keyword list of options
  • Image.dominant_color/2 now returns an {:ok, rgb_color} tuple rather than a [r, g, b] list. Use Image.dominant_color!/2 if only the color value return is required.

@jcupitt
Copy link

jcupitt commented Jan 28, 2024

That sounds great!

refactored to allow a radius in the range 1..5 with a default of 1.

You could probably raise that, I guess?

@kipcole9
Copy link
Collaborator

Yes, it could be raised. I'm not sure if visually it's going to be good enough to accept for the library but I'll do some more tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants