infer_mmlab_text_detection


Run text detection models from MMLAB.

🚀 Use with Ikomia API

1. Install Ikomia API

We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.

pip install ikomia
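
To confirm that the package went into a virtual environment rather than the system interpreter, a quick standard-library check can help. This is a minimal sketch: the function name `in_virtualenv` is illustrative, and it only detects `venv`/`virtualenv` style environments, not conda ones.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
```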

2. Create your workflow

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Run on your image
wf.run_on(url="https://discuss.poynt.net/uploads/default/original/2X/6/60c4199364474569561cba359d486e6c69ae8cba.jpeg")

# Get results
original_image_output = algo.get_output(0)
text_detection_output = algo.get_output(1)

# Display results
display(original_image_output.get_image_with_graphics(text_detection_output))
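
The example above reads its image from a URL. To run on a local file instead, `run_on` can be given a `path` keyword rather than `url`. The helper below is a small sketch that picks the right keyword from a single string; `run_on_kwargs` is a hypothetical name and is not part of the Ikomia API.

```python
from urllib.parse import urlparse


def run_on_kwargs(source: str) -> dict:
    """Build the keyword argument for Workflow.run_on from a URL or a local path."""
    # Only http(s) sources are treated as URLs; everything else is assumed
    # to be a path on the local file system.
    scheme = urlparse(source).scheme.lower()
    if scheme in ("http", "https"):
        return {"url": source}
    return {"path": source}
```

With a workflow built as above, this could be used as `wf.run_on(**run_on_kwargs("images/receipt.jpg"))`, where `images/receipt.jpg` is a placeholder path.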

☀️ Use with Ikomia Studio

Ikomia Studio offers a friendly UI with the same features as the API.

  • If you haven't started using Ikomia Studio yet, download and install it from this page.

  • For additional guidance on getting started with Ikomia Studio, check out this blog post.

📝 Set algorithm parameters

  • model_name (str, default="dbnet"): pre-trained model name.
  • cfg (str, default="dbnet_resnet18_fpnc_1200e_icdar2015.py"): config of the pretrained model.
  • cuda (bool, default=True): CUDA acceleration if True, run on CPU otherwise.
  • config_file (str, default=""): path to model config file (.py). Only for custom model.
  • model_weight_file (str, default=""): path to model weights file (.pt). Only for custom model.

To run a specific pretrained model, set model_name and cfg. To run a custom model, for example one trained with train_mmlab_text_detection, set config_file and model_weight_file instead.

Note: both the keys and the values of the parameter dictionary must be strings, including for boolean parameters such as cuda.

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

algo.set_parameters({
    "model_name": "dbnetpp",
    "cfg": "dbnetpp_resnet50-oclip_fpnc_1200e_icdar2015",
    "cuda": "True"
})

# Run on your image  
wf.run_on(url="https://discuss.poynt.net/uploads/default/original/2X/6/60c4199364474569561cba359d486e6c69ae8cba.jpeg")

# Get results
original_image_output = algo.get_output(0)
text_detection_output = algo.get_output(1)

# Display results
display(original_image_output.get_image_with_graphics(text_detection_output))
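
The example above selects a pretrained model. For a custom model, the same `set_parameters` call takes config_file and model_weight_file instead. The sketch below builds such a dictionary and converts every value to a string, as the note on string formats requires. The helper name `custom_model_parameters` and the file paths in the usage line are hypothetical.

```python
def custom_model_parameters(config_file, model_weight_file, cuda=True) -> dict:
    """Return a parameter dictionary for a custom model, with string values only."""
    params = {
        "config_file": config_file,              # path to the model config (.py)
        "model_weight_file": model_weight_file,  # path to the model weights
        "cuda": cuda,                            # True -> "True", False -> "False"
    }
    # Paths (str or pathlib.Path) and booleans are all converted with str().
    return {key: str(value) for key, value in params.items()}
```

It could then be passed to the algorithm with `algo.set_parameters(custom_model_parameters("my_model/config.py", "my_model/weights.pt"))`.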

To list all available (model_name, cfg) pairs, run this code snippet:

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Get possible parameters
possible_parameters = algo.get_model_zoo()

# Print them
print(possible_parameters)

# Pick one entry to select a pretrained model, here the first in the list
algo.set_parameters(possible_parameters[0])

# Then run on your image...
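
The model zoo can be long, so it may be convenient to narrow it down to one architecture before choosing an entry. The sketch below assumes that `get_model_zoo()` returns a list of dictionaries with "model_name" and "cfg" keys, which is suggested by the way an entry is passed to `set_parameters` above but is not confirmed here. `filter_model_zoo` is a hypothetical helper, and `sample_zoo` is stand-in data built from values quoted in this README.

```python
def filter_model_zoo(zoo, model_name):
    """Keep only the zoo entries whose model_name matches (assumed dict entries)."""
    return [entry for entry in zoo if entry.get("model_name") == model_name]


# Stand-in for the list returned by algo.get_model_zoo() (assumed shape).
sample_zoo = [
    {"model_name": "dbnet", "cfg": "dbnet_resnet18_fpnc_1200e_icdar2015.py"},
    {"model_name": "dbnetpp", "cfg": "dbnetpp_resnet50-oclip_fpnc_1200e_icdar2015"},
]

if __name__ == "__main__":
    for entry in filter_model_zoo(sample_zoo, "dbnetpp"):
        print(entry["model_name"], entry["cfg"])
```

In a real workflow, `sample_zoo` would be replaced by `algo.get_model_zoo()`.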

🔍 Explore algorithm outputs

Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Run on your image  
wf.run_on(url="https://discuss.poynt.net/uploads/default/original/2X/6/60c4199364474569561cba359d486e6c69ae8cba.jpeg")

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
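
The loop above calls `to_json()` but discards the result. To keep the exported data, each output can be written to its own file. This is a sketch that assumes `to_json()` returns a JSON string. `save_outputs_as_json` and the `output_<index>.json` naming are illustrative choices, and `FakeOutput` is a stand-in used only to make the example runnable without Ikomia.

```python
import json
from pathlib import Path


def save_outputs_as_json(outputs, folder):
    """Write the to_json() result of each output to <folder>/output_<index>.json."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for index, output in enumerate(outputs):
        target = folder / f"output_{index}.json"
        # Assumption: to_json() returns the serialized output as a string.
        target.write_text(output.to_json(), encoding="utf-8")
        written.append(target)
    return written


class FakeOutput:
    """Stand-in for an Ikomia output object (not part of the Ikomia API)."""

    def to_json(self):
        return json.dumps({"detections": []})
```

With a real workflow, the call would look like `save_outputs_as_json(algo.get_outputs(), "results")`, where `results` is a placeholder folder name.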