On segmented images terminated atom groups or atoms are not included #107

Open
alexey-krasnov opened this issue Feb 7, 2024 · 4 comments

alexey-krasnov commented Feb 7, 2024

Hi guys,

There is a problem with the segmentation of some images: terminal atom groups or atoms are not included in the segmented images. I tried both expand=True and expand=False, and even with expand=False some atoms are still cut off. Could you please explain where the problem comes from and how to avoid it?

Here is the output I got when using visualization=True.

Example output with expand=True:
[image: US-20220048929-A1_image_1674_output_expand_True]

Example output with expand=False:
[image: US-20220048929-A1_image_1674_output_expand_False]

The segmented saved files and original image are in the archive:
US-20220048929-A1_image_1674.zip

Best regards,
Aleksei

Kohulan added the bug (Something isn't working) label Feb 7, 2024
Owner

Kohulan commented Feb 7, 2024

Hi @alexey-krasnov ,

Thanks for bringing this to our attention. We will look into it and get back to you with an update.

Kind regards,
Kohulan

Collaborator

OBrink commented Feb 15, 2024

Hey @alexey-krasnov,

We have a bit of a dilemma here. The expansion is based on connected-object detection in the binarised and dilated image. If we use a bigger kernel for the dilation, we wrongly include more objects around the structures. If we choose a smaller kernel, we get the problem that you have described above.
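The trade-off described above can be illustrated with a toy example. This is only a sketch in pure Python on a single made-up scanline, not the code used in DECIMER-Image-Segmentation: `dilate_1d`, `count_components`, and the pixel layout are hypothetical, and the symmetric `radius` window is a simplification of a 2-D kernel.

```python
def dilate_1d(row, radius):
    """Turn a pixel on if any pixel within `radius` positions of it is on."""
    n = len(row)
    return [
        int(any(row[max(0, i - radius): min(n, i + radius + 1)]))
        for i in range(n)
    ]


def count_components(row):
    """Count runs of consecutive on-pixels (connected objects in 1-D)."""
    count = 0
    previous = 0
    for pixel in row:
        if pixel and not previous:
            count += 1
        previous = pixel
    return count


# Made-up scanline: structure, 3-pixel gap, atom label, 6-pixel gap, unrelated text.
row = [1] * 5 + [0] * 3 + [1] * 2 + [0] * 6 + [1] * 4

# Number of connected objects after dilation, for three window sizes.
results = {radius: count_components(dilate_1d(row, radius)) for radius in (1, 2, 3)}
for radius, n_objects in results.items():
    print(f"radius={radius}: {n_objects} connected object(s)")
```

In this toy layout, the smallest window leaves the atom label as a separate object (it would be cut off), the middle one merges it with the structure, and the largest one also merges the unrelated text.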

In DECIMER-Image-Segmentation/decimer_segmentation/complete_structure.py, in the function complete_structure_mask, line 286ff, the kernel is defined as follows:

    blur_factor = (
        int(image_array.shape[1] / 185) if image_array.shape[1] / 185 >= 2 else 2
    )
    kernel = np.ones((blur_factor, blur_factor))

If you want to experiment with this, reduce the 185 to, for example, 100 and check how that affects the results. We have tested this extensively in our analysis and concluded that page_width/185 is a good compromise for page formats. For different use cases like this image, you may want to use image_height/185 instead or play around with the values. As the image does not have a typical page format and its width is comparatively small, you end up with a relatively small kernel here. The values have been optimised for page formats on our side.
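To see how the divisor and the chosen image dimension change the kernel size, the arithmetic can be sketched as follows. The helper `blur_factor` and the image sizes are assumptions for illustration only. The library hard-codes 185 and the image width, and the actual size of the image in this issue is not stated here.

```python
def blur_factor(reference_size, divisor=185, minimum=2):
    """Kernel edge length: reference_size / divisor, truncated, but at least `minimum`.

    For minimum=2 this matches the conditional expression quoted above.
    """
    return max(minimum, int(reference_size / divisor))


# Hypothetical sizes in pixels, not taken from the issue.
page_width = 2480                        # e.g. an A4 page scanned at 300 dpi
narrow_width, narrow_height = 400, 1200  # a tall, narrow image

print(blur_factor(page_width))                 # width / 185 on a page
print(blur_factor(narrow_width))               # width / 185 on the narrow image
print(blur_factor(narrow_width, divisor=100))  # width / 100 on the narrow image
print(blur_factor(narrow_height))              # height / 185 on the narrow image
```

With these assumed sizes, the page gets a 13-pixel kernel, while the narrow image falls back to the minimum of 2 unless the divisor is lowered or the height is used instead.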

I hope this helps!
Otto

Collaborator

OBrink commented Feb 15, 2024

Another approach that would probably work: In the function get_seeds in DECIMER-Image-Segmentation/decimer_segmentation/complete_structure.py, we define that the connected object detection includes everything that is covered by the mask and that is in a shrunk bounding box around the mask.


    x_min_limit = mask_x_values.min() + mask_x_diff / 10
    x_max_limit = mask_x_values.max() - mask_x_diff / 10
    y_min_limit = mask_y_values.min() + mask_y_diff / 10
    y_max_limit = mask_y_values.max() - mask_y_diff / 10

The purpose of this is to avoid the wrong inclusion of non-structural elements around the structures that might have been included in the original mask. If you delete the +/- mask_x/y_diff / 10 terms, the expansion would work for your example as all of the elements are touched by the original mask.

But again, this might lead to the wrong inclusion of elements in other cases. There is no simple way to create a function that works for all cases here.
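The effect of removing the shrink terms can be sketched in isolation. The function below is a hypothetical stand-in, not the library's get_seeds: `seed_limits` and `shrink_divisor` are invented names, and it assumes that mask_x_diff and mask_y_diff are the mask's extent (max minus min) along each axis.

```python
def seed_limits(mask_x_values, mask_y_values, shrink_divisor=10):
    """Return (x_min_limit, x_max_limit, y_min_limit, y_max_limit).

    shrink_divisor=10 mirrors the `diff / 10` terms quoted above;
    shrink_divisor=None drops them and keeps the full bounding box.
    """
    x_min, x_max = min(mask_x_values), max(mask_x_values)
    y_min, y_max = min(mask_y_values), max(mask_y_values)
    if shrink_divisor is None:
        x_margin = y_margin = 0
    else:
        # Assumption: the diff values are the mask's extent along each axis.
        x_margin = (x_max - x_min) / shrink_divisor
        y_margin = (y_max - y_min) / shrink_divisor
    return (x_min + x_margin, x_max - x_margin,
            y_min + y_margin, y_max - y_margin)


# Made-up mask coordinates.
xs, ys = [10, 60, 110], [0, 50]
print(seed_limits(xs, ys))                       # shrunk bounding box
print(seed_limits(xs, ys, shrink_divisor=None))  # full bounding box
```

In this made-up case, a mask pixel at x=12 lies outside the shrunk box (which starts at x=20) but inside the full one, which is the kind of edge element that the shrink step excludes.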

@alexey-krasnov
Author

Hi @OBrink, thanks for the provided explanation!

I checked both options separately and together. The best option right now is to change only the blur factor:

    blur_factor = (
        int(image_array.shape[1] / 100) if image_array.shape[1] / 100 >= 2 else 2  # replaced 185 with 100
    )

This gives more reasonable results, although the problem persists on some images. It probably needs further testing with a range of these parameters.

Best regards,
Aleksei
