[Enhancement request] H3C can't be recognized well. #8

kongwf5813 · 2020-07-30T08:39:42Z

First, thank you all. You developed a very good tool.

I tested it today and found two things:

1. The colored image is not recognized as well as a black version that I produced from it.
2. H3C was recognized as H or H2PNH.

I saw that the StructureImageExtractor class converts a colored image to grayscale, and some colors in the original image may end up white after this operation. That is why the output for the black image (which I produced) is better than the output for the colored one.

I produced the black image with the code below:

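import java.awt.Color;
import java.awt.image.BufferedImage;
import java.awt.image.ConvolveOp;
import java.awt.image.Kernel;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

import gov.nih.ncats.molvec.Molvec;

// The class name is only a placeholder so that the snippet compiles.
public class BlackImageOcr {

    // Turns every sufficiently dark or colored pixel black (modifies srcPic in place).
    public static BufferedImage blackImage(BufferedImage srcPic) {
        for (int x = 0; x < srcPic.getWidth(); x++) {
            for (int y = 0; y < srcPic.getHeight(); y++) {
                Color col = new Color(srcPic.getRGB(x, y), true);
                // 600 is a threshold found by trial and error
                if (col.getRed() + col.getGreen() + col.getBlue() <= 600) {
                    // Opaque black; a plain 0 would be fully transparent on images with alpha
                    srcPic.setRGB(x, y, 0xFF000000);
                }
            }
        }
        return srcPic;
    }

    // Applies a 3x3 sharpening kernel and returns the result as a new image.
    public static BufferedImage sharpenImage(BufferedImage srcPic) {
        int imageWidth = srcPic.getWidth();
        int imageHeight = srcPic.getHeight();

        BufferedImage destPic = new BufferedImage(imageWidth, imageHeight,
                BufferedImage.TYPE_3BYTE_BGR);
        float[] data = {-1.0f, -1.0f, -1.0f, -1.0f, 11.0f, -1.0f, -1.0f, -1.0f, -1.0f};
        Kernel kernel = new Kernel(3, 3, data);
        ConvolveOp co = new ConvolveOp(kernel, ConvolveOp.EDGE_NO_OP, null);
        co.filter(srcPic, destPic);
        return destPic;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the image path is passed as the first argument
        // (the original snippet used an undefined "file" variable).
        File file = new File(args[0]);
        BufferedImage blackImage = blackImage(ImageIO.read(file));
        BufferedImage sharpenImage = sharpenImage(blackImage);
        String moleFileString = Molvec.ocr(sharpenImage);
        System.out.println(moleFileString);
    }
}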

Please let me know if these two things would be considered.

dkatzel-ncats · 2020-07-30T14:26:29Z

Thank you for submitting this!

Binarization is something that we're continuing to improve. We will add this image to our test suite and see if we can tweak our existing approach to handle it correctly while still maintaining overall performance.

We will keep this ticket open until it is fixed and pushed to Maven Central.

tylerperyea · 2020-10-09T03:29:53Z

@kongwf5813 the current master branch includes this test image, which now seems to work okay. This code has some modifications to the grayscaling procedure that seem to make it handle things like this a little better. The previous default grayscale code took a weighted average of RGB to match the "luminance" calculations typically applied in most software (equivalent to equation 7 here https://openaccess.thecvf.com/content_cvpr_2017/papers/Nguyen_Why_You_Should_CVPR_2017_paper.pdf). While this approach should typically work well, MolVec has a strange quirk where it first inverts the colors of the images, and then performs the grayscale and thresholding. There is an inherent bias to consider black pixels "off" and white pixels "on" in a few steps.
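To illustrate why a weighted average can lose colored foreground once the image has been inverted, here is a minimal sketch. It is not MolVec's actual code: the class and method names are hypothetical, the Rec. 601 weights (0.299, 0.587, 0.114) are an assumption about what "luminance" means here, and the yellow pixel is just an example color.

```java
public class LuminanceSketch {

    /** Returns the negative of a packed 0xRRGGBB pixel. */
    static int invert(int rgb) {
        return ~rgb & 0xFFFFFF;
    }

    /** Weighted-average gray level in [0, 255] using Rec. 601 weights (an assumption). */
    static int luminance(int rgb) {
        int r = (rgb >> 16) & 0xFF;
        int g = (rgb >> 8) & 0xFF;
        int b = rgb & 0xFF;
        return (int) Math.round(0.299 * r + 0.587 * g + 0.114 * b);
    }

    public static void main(String[] args) {
        int yellow = 0xFFFF00; // hypothetical colored foreground pixel
        int black = 0x000000;  // ordinary black foreground pixel

        // After inversion, black ink becomes fully "on" ...
        System.out.println(luminance(invert(black)));  // 255
        // ... while yellow ink becomes a dark blue that is close to "off".
        System.out.println(luminance(invert(yellow))); // 29
    }
}
```

With these assumed weights, a bright yellow stroke ends up almost indistinguishable from the background after inversion, so any reasonable threshold would drop it.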

Due to MolVec using the negative image for grayscale/thresholding, converting to HSV space and simply using the "Value" component actually does a decent job of promoting "foreground" elements in a colored image like the one you present. Please try it out and let us know!
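As a rough sketch of the idea (again, not the code that was merged; the class and method names are hypothetical), the HSV "Value" of a pixel is simply the largest of its R, G, and B components. Applied to the negative image, any pixel that is not close to white in the original gets a clearly nonzero gray level:

```java
public class HsvValueSketch {

    /**
     * Gray level in [0, 255] obtained by inverting a packed 0xRRGGBB pixel
     * and taking the HSV Value component, i.e. max(R, G, B), of the negative.
     */
    static int valueOfNegative(int rgb) {
        int inv = ~rgb & 0xFFFFFF;
        int r = (inv >> 16) & 0xFF;
        int g = (inv >> 8) & 0xFF;
        int b = inv & 0xFF;
        return Math.max(r, Math.max(g, b));
    }

    public static void main(String[] args) {
        System.out.println(valueOfNegative(0xFFFFFF)); // white background -> 0
        System.out.println(valueOfNegative(0x000000)); // black ink        -> 255
        System.out.println(valueOfNegative(0xFFFF00)); // yellow ink       -> 255
        System.out.println(valueOfNegative(0xFF0000)); // red ink          -> 255
    }
}
```

Because only a pure white pixel has all three inverted components at zero, saturated colors are treated the same as black ink, which is the "promotion" of foreground elements described above. Very pale colors would still map to low values, so a threshold is needed afterwards.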

Ultimately, these thresholding and grayscale operations could be improved using other techniques. If you've got more examples of images that aren't working well, we could add them to the training set and try to adjust a few things. Eventually there may need to be a larger decision tree on which grayscale/thresholding technique to use based on more heuristics. A sharpenImage method like the one you suggest could be great for handling some of these edge cases too!

dkatzel-ncats · 2020-10-14T02:05:07Z

These changes have been released in 0.9.8. Closing ticket. Thanks for your help making Molvec better!

kongwf5813 changed the title ~~H3C can't be reconginized well.~~ [Enhancement request]H3C can't be reconginized well. Jul 30, 2020

dkatzel-ncats mentioned this issue Oct 8, 2020

Colorfix #10

Merged

dkatzel-ncats closed this as completed Oct 14, 2020
