How to get character bbox annotation? #8

Closed
GuokunWang opened this issue Sep 15, 2021 · 5 comments

Comments

@GuokunWang

This project is very helpful for generating synthetic text for scene text recognition. It seems to generate each text image by combining several character images, but the outputs don't contain any per-character information. Is it possible to get character annotations, for example each character and its location?

@moonbings
Collaborator

Hi,
It generates images by rendering several characters, but it can't provide character bboxes because the transformation is applied after the characters are merged.
The rendering process needs to be changed to get character bboxes.
I'll try to improve the rendering process so that it can output character bboxes (and a text mask).
Thanks.

@kwanghyukAn

@moonbings
Hi,
Do you have any plans to add code for getting text boxes? Or is this project over?
Thanks

@Mahmuod1

Is it possible to get the word or sentence bbox?

@moonbings
Collaborator

moonbings commented Aug 25, 2022

Hi,
Sorry for the late reply.
Unfortunately, I can't maintain this project because of personal reasons. 😢

You can modify this code to get character/word bboxes: https://github.com/clovaai/synthtiger/blob/master/examples/synthtiger/template.py#L177-L190
For character bboxes, make temporary copies of the character layers, apply the same transformations to those temporary layers, and then return their bboxes.
For the word bbox, merge the character bboxes.
Note that these bboxes are in world coordinates, so you need to convert them to coordinates relative to the merged text layer.

Here's an example.

def _generate_fg(self, color, style):
    ...

    char_layers = [layers.TextLayer(char, **font) for char in chars]
    self.shape.apply(char_layers)
    self.layout.apply(char_layers, {"meta": {"vertical": self.vertical}})

    layer = layers.Group(char_layers).merge()
    self.color.apply([layer], color)
    self.texture.apply([layer])

    self.style.apply([layer], style)
    self.style.apply(char_layers, style) # added

    transform = self.transform.sample() # added
    self.transform.apply([layer], transform) # changed
    self.transform.apply(char_layers, transform) # added

    self.fit.apply([layer])
    self.fit.apply(char_layers) # added

    self.pad.apply([layer])
    out = layer.output()

    # change coordinates
    for char_layer in char_layers:
        char_layer.topleft -= layer.topleft

    # get bboxes
    char_bboxes = [char_layer.bbox for char_layer in char_layers] # [[left, top, width, height], ...]
    word_bbox = utils.merge_bbox(char_bboxes) # [left, top, width, height]

    return out, label, char_bboxes, word_bbox
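
The two bbox steps in the example above (shifting from world coordinates and merging character boxes into a word box) can be illustrated without SynthTIGER. This is only a minimal sketch: `merge_bboxes` and `to_local` are hypothetical stand-ins written for this illustration, not the library's `utils.merge_bbox`, and the numbers are made up. It assumes the `[left, top, width, height]` layout noted in the comments above.

```python
from typing import List, Sequence

BBox = List[float]  # [left, top, width, height]


def merge_bboxes(bboxes: Sequence[Sequence[float]]) -> BBox:
    """Return the smallest [left, top, width, height] box covering all inputs."""
    if not bboxes:
        raise ValueError("need at least one bbox")
    left = min(b[0] for b in bboxes)
    top = min(b[1] for b in bboxes)
    right = max(b[0] + b[2] for b in bboxes)
    bottom = max(b[1] + b[3] for b in bboxes)
    return [left, top, right - left, bottom - top]


def to_local(bbox: Sequence[float], origin: Sequence[float]) -> BBox:
    """Shift a world-coordinate bbox so that `origin` (x, y) becomes (0, 0)."""
    left, top, width, height = bbox
    return [left - origin[0], top - origin[1], width, height]


# Made-up numbers: two character boxes in world coordinates and the
# top-left corner of the merged text layer.
world_char_bboxes = [[110, 52, 20, 30], [132, 50, 18, 34]]
layer_topleft = (100, 40)

char_bboxes = [to_local(b, layer_topleft) for b in world_char_bboxes]
word_bbox = merge_bboxes(char_bboxes)

print(char_bboxes)  # [[10, 12, 20, 30], [32, 10, 18, 34]]
print(word_bbox)    # [10, 10, 40, 34]
```

The shift changes only `left` and `top`; width and height are unaffected, which is why the example above can simply subtract `layer.topleft` from each character layer's `topleft`.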

Then you need to modify this part of the template to save the bboxes:
https://github.com/clovaai/synthtiger/blob/master/examples/synthtiger/template.py#L132-L153
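
How the bboxes are saved is up to you. As one possible sketch, assuming you want one tab-separated line per image with each box written as `xmin,ymin,xmax,ymax`, a helper like the one below could build that line. `format_coord_line` is a hypothetical name, not part of the template, and the image path is made up.

```python
from typing import List, Sequence


def format_coord_line(image_path: str, bboxes: Sequence[Sequence[float]]) -> str:
    """Build '<image_path>\\t<bbox>\\t<bbox>...' from [left, top, width, height] boxes.

    Each bbox is written as 'xmin,ymin,xmax,ymax' (an assumed on-disk layout).
    """
    fields: List[str] = [image_path]
    for left, top, width, height in bboxes:
        fields.append(f"{left},{top},{left + width},{top + height}")
    return "\t".join(fields)


# Made-up example values in [left, top, width, height] form.
line = format_coord_line("images/0/0.jpg", [[10, 12, 20, 30], [32, 10, 18, 34]])
print(line)
```

Writing corners instead of width and height is a design choice for this sketch only; keep whichever layout your downstream training code expects.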

After changing the template, you can generate data with the following command.

python -m synthtiger -o results -w 4 -v examples/synthtiger/template.py SynthTiger examples/synthtiger/config_horizontal.yaml

Thanks.

@moonbings
Collaborator

SynthTIGER can now output character bboxes and text masks.
Character bbox data is written to the coord.txt file, and mask data is stored in the masks directory.
The format of coord.txt is <image_path>\t<bbox>\t<bbox>\t<bbox>..., where <bbox>=<xmin>,<ymin>,<xmax>,<ymax>.
Check out the latest code.

Thanks.
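
For anyone reading coord.txt afterwards, here is a minimal parsing sketch based only on the format described above. `parse_coord_line` is a hypothetical helper, the sample line is made up, and values are parsed as floats because the thread does not say whether the file stores integers or floats.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def parse_coord_line(line: str) -> Tuple[str, List[Box]]:
    """Parse '<image_path>\\t<bbox>\\t<bbox>...' with bbox = xmin,ymin,xmax,ymax."""
    image_path, *fields = line.rstrip("\n").split("\t")
    bboxes: List[Box] = []
    for field in fields:
        if not field:
            continue  # tolerate a trailing tab
        xmin, ymin, xmax, ymax = (float(v) for v in field.split(","))
        bboxes.append((xmin, ymin, xmax, ymax))
    return image_path, bboxes


# Made-up sample line in the described format.
sample = "images/0/0.jpg\t10,12,30,42\t32,10,50,44\n"
path, boxes = parse_coord_line(sample)
print(path, boxes)
```

To process a whole file, call `parse_coord_line` on each line of coord.txt.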

4 participants