Highlights

Image API

In this release we have added a new Image API as a placeholder to support a more generic multi-backend API. You can import and export images from files, NumPy arrays, and DLPack.

>>> # from a torch.tensor
>>> data = torch.randint(0, 255, (3, 4, 5), dtype=torch.uint8)  # CxHxW
>>> pixel_format = PixelFormat(
...     color_space=ColorSpace.RGB,
...     bit_depth=8,
... )
>>> layout = ImageLayout(
...     image_size=ImageSize(4, 5),
...     channels=3,
...     channels_order=ChannelsOrder.CHANNELS_FIRST,
... )
>>> img = Image(data, pixel_format, layout)
>>> assert img.channels == 3
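The `ChannelsOrder.CHANNELS_FIRST` layout used above stores data as CxHxW, while NumPy-style image arrays are commonly HxWxC. As a library-independent illustration of what switching between the two orders means, here is a minimal sketch on plain nested lists. It does not use the Image API, and the helper name `chw_to_hwc` is hypothetical.

```python
def chw_to_hwc(chw):
    """Reorder a channels-first nested list (C x H x W) to channels-last (H x W x C)."""
    channels = len(chw)
    height = len(chw[0])
    width = len(chw[0][0])
    return [
        [[chw[c][y][x] for c in range(channels)] for x in range(width)]
        for y in range(height)
    ]


# A 3 x 2 x 2 image in which channel c is filled with the constant value c * 10.
chw = [[[c * 10] * 2 for _ in range(2)] for c in range(3)]
hwc = chw_to_hwc(chw)
```

After the reordering, every pixel `hwc[y][x]` holds its three channel values side by side.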

Object Detection API

We have added the ObjectDetector that includes by default the RT-DETR model. The detection pipeline is fully configurable by supplying a pre-processor, a model, and a post-processor. Example usage is shown below.

from io import BytesIO

import cv2
import numpy as np
import requests
import torch
from PIL import Image
import matplotlib.pyplot as plt

from kornia.contrib.models.rt_detr import RTDETR, DETRPostProcessor, RTDETRConfig
from kornia.contrib.object_detection import ObjectDetector, ResizePreProcessor

model_type = "hgnetv2_x"  # also available: resnet18d, resnet34d, resnet50d, resnet101d, hgnetv2_l
checkpoint = f"https://github.com/kornia/kornia/releases/download/v0.7.0/rtdetr_{model_type}.ckpt"
config = RTDETRConfig(model_type, 80, checkpoint=checkpoint)
model = RTDETR.from_config(config).eval()

detector = ObjectDetector(model, ResizePreProcessor(640), DETRPostProcessor(0.3))

url = "https://github.com/kornia/data/raw/main/soccer.jpg"
img = Image.open(BytesIO(requests.get(url).content))
img = np.asarray(img, dtype=np.float32) / 255
img_pt = torch.from_numpy(img).permute(2, 0, 1)
detection = detector.predict([img_pt])

for cls_score_xywh in detection[0].numpy():
    class_id = int(cls_score_xywh[0])
    score = cls_score_xywh[1]
    x, y, w, h = cls_score_xywh[2:].round().astype(int)
    cv2.rectangle(img, (x, y, w, h), (255, 0, 0), 3)

    text = f"{class_id}, {score:.2f}"
    font = cv2.FONT_HERSHEY_SIMPLEX
    (text_width, text_height), _ = cv2.getTextSize(text, font, 1, 2)
    cv2.rectangle(img, (x, y - text_height, text_width, text_height), (255, 0, 0), cv2.FILLED)
    cv2.putText(img, text, (x, y), font, 1, (255, 255, 255), 2)

plt.imshow(img)
plt.show()
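The drawing loop above reads each detection row as class id, score, and an x, y, w, h box. The following sketch shows how such rows could be filtered by score and converted to corner coordinates before further processing. It assumes that x, y is the top-left corner (as the `cv2.rectangle` call suggests); the helper names and the sample rows are made up for illustration and are not part of kornia.

```python
def xywh_to_xyxy(x, y, w, h):
    """Convert a top-left/size box to top-left/bottom-right corners."""
    return x, y, x + w, y + h


def filter_detections(rows, min_score):
    """Keep rows of (class_id, score, x, y, w, h) whose score is at least min_score."""
    kept = []
    for class_id, score, x, y, w, h in rows:
        if score >= min_score:
            kept.append((int(class_id), score, xywh_to_xyxy(x, y, w, h)))
    return kept


# Made-up rows, for illustration only.
rows = [
    (0.0, 0.9, 10.0, 20.0, 30.0, 40.0),
    (2.0, 0.2, 5.0, 5.0, 10.0, 10.0),
]
confident = filter_detections(rows, min_score=0.5)
```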

Deep Models

As part of the kornia.contrib module, we started building a models module that will host Deep Learning models for Computer Vision (Semantic Segmentation, Object Detection, etc.).

Starting from the abstract base class ModelBase, we will implement and expose these deep learning models (e.g. Segment Anything). Similarly, we provide standard structures for the results of these models, such as SegmentationResults.

The idea is to abstract and standardize how these models behave with our high-level APIs, for example when interacting with the Visual Prompter backend (Segment Anything is available today).

ModelBase provides methods for loading checkpoints (load_checkpoint) and for compiling itself via the torch.compile API. We plan to extend it according to the needs of the community.

This release also makes other models available, such as RT_DETR and tiny_vit.

Example of using these abstractions to implement a model:

# Each model should be a submodule inside the `kornia.contrib.models`, and the Model class itself will be exposed under this
# `models` module.

from kornia.contrib.models.base import ModelBase
from dataclasses import dataclass
from kornia.contrib.models.structures import SegmentationResults
from enum import Enum

class MyModelType(Enum):
    """Map the model types."""
    a = 0
    ...

@dataclass
class MyModelConfig:
    model_type: str | int | MyModelType | None = None
    checkpoint: str | None = None
    ...

class MyModel(ModelBase[MyModelConfig]):
    def __init__(...) -> None:
        ...

    @staticmethod
    def from_config(config: MyModelConfig) -> MyModel:
        """Build the model based on the config"""
        ...

    def forward(...) -> SegmentationResults:
        ...

RT-DETR

In most object detection models, non-maximum suppression (NMS) is necessary to remove overlapping and similar bounding boxes. This post-processing algorithm has high latency, preventing object detectors from reaching real-time speed. DETR is a new class of detectors that eliminates the NMS step by using a transformer decoder to directly predict bounding boxes. RT-DETR enhances Deformable DETR to achieve real-time speed on server-class GPUs by using an efficient backbone. See the RT-DETR paper for more details.
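For readers unfamiliar with the NMS step mentioned in this section, a minimal pure-Python sketch of greedy NMS follows. It is a generic illustration, not code from kornia or RT-DETR; the function names, the (x1, y1, x2, y2) box format, and the 0.5 threshold are assumptions chosen for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # A box survives only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.68 and is suppressed, so `kept` contains indices 0 and 2. Each candidate has to be compared against the boxes already kept, and this is the post-processing that DETR-style detectors skip.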

TinyViT

TinyViT is an efficient and high-performing transformer model for images. It achieves a top-1 accuracy of 84.8% on ImageNet-1k with only 21M parameters. See TinyViT for more information.

MobileSAM

MobileSAM replaces the heavy ViT-H backbone in the original SAM with TinyViT, which is more than 100 times smaller in terms of parameters and around 40 times faster in terms of inference speed. See MobileSAM for more details.

To use MobileSAM, simply specify "mobile_sam" in the SamConfig:

from kornia.contrib.visual_prompter import VisualPrompter
from kornia.contrib.models.sam import SamConfig

prompter = VisualPrompter(SamConfig("mobile_sam", pretrained=True))

LightGlue matcher

Added a LightGlue-based matcher to the kornia API. It is based on the original code from the paper “LightGlue: Local Feature Matching at Light Speed”. See [LSP23] for more details.

The LightGlue algorithm won a cash prize in the Image Matching Challenge 2023 @ CVPR23: https://www.kaggle.com/competitions/image-matching-challenge-2023/overview

See a working example integrating with COLMAP: #2469

New Sensors API

New kornia.sensors module to interface with sensors such as Camera, IMU, GNSS, etc.

For now, we have added CameraModel, PinholeModel, and CameraModelBase.

Usage example:

Define a CameraModel

>>> # Pinhole Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.PINHOLE, torch.Tensor([328., 328., 320., 240.]))
>>> # Brown Conrady Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.BROWN_CONRADY, torch.Tensor([1.0, 1.0, 1.0, 1.0,
... 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]))
>>> # Kannala Brandt K3 Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.KANNALA_BRANDT_K3, torch.Tensor([1.0, 1.0, 1.0,
... 1.0, 1.0, 1.0, 1.0, 1.0]))
>>> # Orthographic Camera Model
>>> cam = CameraModel(ImageSize(480, 640), CameraModelType.ORTHOGRAPHIC, torch.Tensor([328., 328., 320., 240.]))
>>> cam.params
tensor([328., 328., 320., 240.])
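To make the four pinhole parameters concrete, the sketch below projects a 3D point to pixel coordinates in plain Python. It assumes the parameter order is (fx, fy, cx, cy), which is consistent with the 480x640 image size above but is not stated in these notes; `project_pinhole` is a hypothetical helper, not a kornia function.

```python
def project_pinhole(point, params):
    """Project a 3D point (x, y, z) in the camera frame to pixel coordinates (u, v)."""
    fx, fy, cx, cy = params  # assumed order: focal lengths, then principal point
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return fx * x / z + cx, fy * y / z + cy


params = (328.0, 328.0, 320.0, 240.0)
u, v = project_pinhole((0.0, 0.0, 2.0), params)    # a point on the optical axis
u2, v2 = project_pinhole((1.0, 0.5, 2.0), params)  # an off-axis point
```

Under this assumption a point on the optical axis lands on the principal point (320, 240), and the off-axis point lands at (484, 322).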

Added `kornia.geometry.solvers` submodule

New module for geometric vision solvers, which currently hosts the polynomial solvers (see #2465 and #2484).

This is part of an upgrade of find_fundamental to support the 7POINT algorithm.
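As background on why the 7-point algorithm needs a polynomial solver: seven correspondences leave a two-dimensional family of candidate matrices a * F1 + (1 - a) * F2, and requiring a zero determinant gives a cubic in a. The sketch below recovers that cubic's coefficients in plain Python. It is a textbook-style illustration, not kornia's implementation, and all function names and the sample matrices are hypothetical.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )


def blend(f1, f2, a):
    """Return a * f1 + (1 - a) * f2 element-wise."""
    return [[a * f1[i][j] + (1 - a) * f2[i][j] for j in range(3)] for i in range(3)]


def cubic_coefficients(f1, f2):
    """Coefficients (c3, c2, c1, c0) of p(a) = det(a * f1 + (1 - a) * f2).

    p is a cubic in a, so four samples determine it exactly.
    """
    p0, p1, pm1, p2 = (det3(blend(f1, f2, a)) for a in (0.0, 1.0, -1.0, 2.0))
    c0 = p0
    c2 = (p1 + pm1) / 2 - c0
    odd = (p1 - pm1) / 2  # equals c3 + c1
    c3 = (p2 - c0 - 4 * c2 - 2 * odd) / 6
    c1 = odd - c3
    return c3, c2, c1, c0


# Toy matrices chosen so that the determinant is 1 - a**2.
f1 = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0]]
f2 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs = cubic_coefficients(f1, f2)
```

For these toy inputs the cubic is 1 - a^2, so its real roots a = 1 and a = -1 each give a singular blend. In a real solver the real roots of the cubic would each yield one candidate fundamental matrix.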

Image terminal printing

Added the kornia.utils.print_image API for printing a given image tensor or image path to the terminal.

>>> kornia.utils.print_image("panda.jpg")
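One common way to show an image in a terminal is to emit 24-bit ANSI background color codes, one colored cell per pixel. The sketch below illustrates that idea in plain Python. It is not necessarily how `kornia.utils.print_image` works, `render_ansi` is a hypothetical helper, and the output only displays correctly on terminals with truecolor support.

```python
def render_ansi(pixels):
    """Render rows of (r, g, b) tuples as text using 24-bit ANSI background colors."""
    reset = "\x1b[0m"
    lines = []
    for row in pixels:
        # Two spaces per pixel keep the cells roughly square in most fonts.
        cells = "".join(f"\x1b[48;2;{r};{g};{b}m  " for r, g, b in row)
        lines.append(cells + reset)
    return "\n".join(lines)


# A 2 x 2 image: red, green / blue, white.
pixels = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (255, 255, 255)],
]
text = render_ansi(pixels)
print(text)
```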

What's Changed

fix skipped tests for cuda TestColorJiggleGen by @johnnv1 in #2341
remove inplace operation by @Parskatt in #2346
Replace bandit, flake8, isort, pyupgrade, and yesqa with ruff by @cclauss in #2292
fix unused import on geometry.conversions by @johnnv1 in #2357
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2345
Visual prompter by @edgarriba in #2356
Integrate vector to liegroups by @cjpurackal in #2344
add from_wxyz and from_qxyz to So3 and Se3 by @cjpurackal in #2359
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2360
RandomGrayscale: add rgb_weights parameter by @adamjstewart in #2361
add support for tuple of types on KORNIA_CHECK_TYPE + bug fix by @cjpurackal in #2353
Remove examples in favor of kornia/tutorials repo by @johnnv1 in #2366
add tutorials testing ci by @johnnv1 in #2367
Update version to 0.6.13-dev by @johnnv1 in #2368
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2370
Optional raise in kornia check functions by @alex-jw-brooks in #2375
remove unused circle ci by @johnnv1 in #2378
Bump accelerate from 0.18.0 to 0.19.0 by @dependabot in #2381
Replace miniconda for setup-python@v4 on env setup CI by @johnnv1 in #2380
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2382
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2388
Use separable filter for SSIM calculation to speed up. by @dodobyte in #2383
Bump pytest-cov from 4 to 4.1.0 by @dependabot in #2392
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2394
Batched quaternion to matrix by @jatentaki in #2395
Batched draw_lines by @cjpurackal in #2373
Fix AugmentationSequential support for RandomResizedCrop by @adamjstewart in #2398
drop python 3.7 support by @johnnv1 in #2400
Bump accelerate from 0.19.0 to 0.20.3 by @dependabot in #2402
Bump pytest from 7.3.1 to 7.3.2 by @dependabot in #2401
Makefile: Drop Python code formatter yapf in favor of psf/black by @cclauss in #2404
update the kornia governance page by @edgarriba in #2403
Update readthedocs.yml with python 3.8 by @edgarriba in #2406
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2408
Add collector job on CI PR's, reusable typing CI and remove unused CI by @johnnv1 in #2379
remove old artefacts in visual_prompting.rst by @edgarriba in #2409
use reshape instead view to convert matrix from quaternion by @edgarriba in #2413
Revamped camera API(introduce kornia.sensors) by @cjpurackal in #2349
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2415
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2418
Bump pytest from 7.3.2 to 7.4.0 by @dependabot in #2416
Fix PLR0124 by @ashnair1 in #2422
Add missing docs for warp_grid by @edgarriba in #2423
Add GPU support apple silicon augmentation 2D by @NripeshN in #2425
Add mps to testing init and fixing test cases by @NripeshN in #2428
drop JIT support from geometry.conversions by @johnnv1 in #2424
Update rst(docs) files to support MPS by @NripeshN in #2430
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2429
fix engine arg missing in opening and closing by @wintertee in #2431
Added terminal printing by @shijianjian in #2407
add support for quaternion in Se3 instantiation by @cjpurackal in #2433
fix missing assertion by @qingpeng9802 in #2435
Add RT-DETR and ObjectDetection API by @gau-nernst in #2363
Update ruff configs by @johnnv1 in #2358
Use ruff to discover and limit code complexity by @cclauss in #2442
Apply some ruff pytest fixes by @cclauss in #2444
Remove deprecated code from kornia.geometry.conversion by @pri1311 in #2437
Python linting: Add more ruff rules by @cclauss in #2441
fix the alpha of focal loss by @qingpeng9802 in #2393
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2448
Enable disallow_untyped_defs on mypy by @johnnv1 in #2252
Adding LightGlue by @ducha-aiki in #2449
Add MobileSAM by @gau-nernst in #2446
Make TinyViT available as a standalone image classifier by @gau-nernst in #2455
Add doc for MobileSAM by @gau-nernst in #2458
Add 7 point algorithm by @anandhupvr in #2390
Add pre-trained flag for SAM by @gau-nernst in #2456
Add Image API by @edgarriba in #1562
Fix auto augmentation transformation matrix by @shijianjian in #2355
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2462
Fix some docs by @qingpeng9802 in #2461
fix aug sharpness duplicated typing by @johnnv1 in #2468
axis_angle typo fix by @cjpurackal in #2463
solvers : Separating General Functionalities(polynomial solver) into solvers Submodule by @anandhupvr in #2465
kornia.sensors docs update by @cjpurackal in #2477
Fix KORNIA_CHECK_SAME_DEVICE cuda test by @johnnv1 in #2479
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2472
Bump accelerate from 0.20.3 to 0.21.0 by @dependabot in #2459
[feat] from_matrix for Se3 and Se2 by @cjpurackal in #2473
Add draw point2d by @alex-jw-brooks in #2387
[CI] bump pytorch to 2.0.1 by @johnnv1 in #2369
include print image in the Image API by @edgarriba in #2481
Fix typo SAM docs by @johnnv1 in #2483
kornia.geometry.solvers (polynomial_solvers) docs by @anandhupvr in #2484
fix image print docs by @edgarriba in #2488
[pre-commit.ci] pre-commit suggestions by @pre-commit-ci in #2491
Add missing docs for new PinholeModel camera model by @edgarriba in #2492
bump version 0.7.0 by @edgarriba in #2476

New Contributors

@alex-jw-brooks made their first contribution in #2375
@dodobyte made their first contribution in #2383
@NripeshN made their first contribution in #2425
@wintertee made their first contribution in #2431
@qingpeng9802 made their first contribution in #2435
@anandhupvr made their first contribution in #2390

Full Changelog: v0.6.12...v0.7.0

v0.7.0 Image API, RT-DETR and Object Detection API, LightGlue Matcher, MobileSam, new Sensors API and many more