Import Nx tensor that is in {width, height, channels} shape with RGB data format #80

kipcole9 · 2022-09-26T23:17:55Z

Cocoa, I have tried everything I can think of to take an Nx tensor that is of the shape {width, height, channels} that contains RGB format data, and convert it in eVision to {height, width, channels} in BGR format.

Any chance I might ask for your advice and recommendations?

What I've tried

Here is what I have tried which is, I think, the required process but the saved image is definitely not what is expected!

tensor = File.read!("path_to/color_checker.etf") |> :erlang.binary_to_term()
{:ok, mat} = Evision.Nx.to_mat(tensor)
{:ok, transposed} = Evision.Mat.transpose(mat, [1, 0, 2])
{:ok, bgr} = Evision.cvtColor(transposed, Evision.cv_COLOR_RGB2BGR())
Evision.imwrite "some_path/color_checker.jpg", bgr

I have followed the converse process in Image and the results line up with expectations, which is not too surprising since Image expects data in {width, height, channels} and RGB format. I mention this just to note that I have verified that the tensor does represent the underlying example image.

Artifacts

The image converted to a tensor and stored with :erlang.term_to_binary/1 and zipped:
color_checker.etf.zip

The original image:


cocoa-xu · 2022-09-27T11:51:07Z

This seems to be another hidden pitfall in OpenCV: cvtColor expects the input image to have at most 2 dims. However, as a result of yesterday's fix, the transposed image's dims were [441, 297, 3] and its number of channels was 1.

This is understandable, as OpenCV mainly aims to solve computer vision problems. Although its cv::Mat can be a generic tensor, some functions expect the Mat to be a "valid image": a "2D" tensor with 3 channels (or 1, depending on the function).

So I added another function as a workaround, Evision.Mat.last_dim_as_channel/1. This function converts an image with dims [441, 297, 3] to a 3-channel image with dims [441, 297].

And I added another function, Evision.Nx.to_mat/2. The second argument tells the NIF the actual underlying shape of the binary data.

tensor = File.read!("color_checker.etf") |> :erlang.binary_to_term()
mat = Evision.Nx.to_mat!(tensor, {297, 441, 3})
transposed = Evision.Mat.transpose!(mat, [1, 0, 2])
transposed = Evision.Mat.last_dim_as_channel!(transposed)
bgr = Evision.cvtColor!(transposed, Evision.cv_COLOR_RGB2BGR())
Evision.imwrite "color_checker.jpg", bgr
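To make the dims/channels distinction concrete, here is a minimal sketch in plain Python. It only models the bookkeeping described above: the same buffer can be described either as a 3-D single-channel matrix or as a 2-D 3-channel image. The helper names `last_dim_as_channel` and `byte_count` are hypothetical and written for illustration; this is not how Evision implements it.

```python
def last_dim_as_channel(dims, channels):
    """Model of folding the last dimension into the channel count.

    Hypothetical helper for illustration only; it mirrors the behaviour
    described for Evision.Mat.last_dim_as_channel/1, not its implementation.
    """
    if channels != 1:
        raise ValueError("expected a single-channel matrix")
    if len(dims) < 2:
        raise ValueError("need at least two dimensions")
    return tuple(dims[:-1]), dims[-1]


def byte_count(dims, channels):
    """Number of u8 elements described by (dims, channels)."""
    total = channels
    for d in dims:
        total *= d
    return total


before = ((441, 297, 3), 1)   # 3-D, single channel: not a "valid image"
after = last_dim_as_channel(*before)
print(after)                  # ((441, 297), 3)

# The underlying data does not change size, only its description does.
assert byte_count(*before) == byte_count(*after)
```

The point of the sketch is that no pixel data moves: only the metadata that functions such as cvtColor inspect is changed.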

The content below is outdated.

Outdated Information

~~However, Nx.transpose seems to work in a different way, while numpy's np.transpose gives the expected result.~~

tensor = File.read!("color_checker.etf") |> :erlang.binary_to_term()
data = Nx.to_binary(tensor)
File.write("data.bin", data)

# Nx.BinaryBackend
transposed_1 = Nx.transpose(tensor, axes: [1, 0, 2])
Nx.shape(transposed_1)
data = Nx.to_binary(transposed_1)
File.write("t1.bin", data)
transposed_1 = Evision.Nx.to_mat_2d!(transposed_1)
Evision.imwrite("transposed_1.jpg", transposed_1)

# Torchx.Backend
torchx_tensor = Nx.backend_copy(tensor, Torchx.Backend)
transposed_2 = Nx.transpose(torchx_tensor, axes: [1, 0, 2])
Nx.shape(transposed_2)
data = Nx.to_binary(transposed_2)
File.write("t2.bin", data)
transposed_2 = Evision.Nx.to_mat_2d!(transposed_2)
Evision.imwrite("transposed_2.jpg", transposed_2)

transposed_1.jpg

transposed_2.jpg

import numpy as np
import cv2

img = np.fromfile("data.bin", dtype=np.uint8).reshape((297, 441, 3))
cv2.imwrite("data.jpg", img)
transposed = np.transpose(img, [1, 0, 2])
cv2.imwrite("np.jpg", transposed)

t1 = np.fromfile("t1.bin", dtype=np.uint8).reshape((441, 297, 3))
cv2.imwrite("t1.jpg", t1)
t2 = np.fromfile("t2.bin", dtype=np.uint8).reshape((441, 297, 3))
cv2.imwrite("t2.jpg", t2)

np.jpg

t1.jpg

t2.jpg

kipcole9 · 2022-10-01T01:57:14Z

@cocoa-xu sorry for the slow reply to some really great work. My observation is that since mat = Evision.Nx.to_mat!(tensor, {297, 441, 3}) is already inverting the width and height, transposed = Evision.Mat.transpose!(mat, [1, 0, 2]) isn't required and in fact results in the image being rotated 90 degrees.

For me the following produced the expected result:

tensor = File.read!("color_checker.etf") |> :erlang.binary_to_term()
mat = Evision.Nx.to_mat!(tensor, {297, 441, 3})
transposed = Evision.Mat.last_dim_as_channel!(mat)
bgr = Evision.cvtColor!(transposed, Evision.cv_COLOR_RGB2BGR())
Evision.imwrite "color_checker.jpg", bgr

Does that accord with your expectations?
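A small sketch of why the extra transpose changes the picture, assuming the buffer is already stored row by row in {height, width} order. It uses plain Python lists and one letter per pixel; `reshape_rows` and `transpose` are illustrative helpers, not Evision or Nx functions.

```python
def reshape_rows(flat, height, width):
    """Interpret a flat row-major buffer as `height` rows of `width` pixels."""
    assert len(flat) == height * width
    return [flat[r * width:(r + 1) * width] for r in range(height)]


def transpose(rows):
    """Swap the two axes: the element at [r][c] moves to [c][r]."""
    return [list(col) for col in zip(*rows)]


# A 2-high, 3-wide image stored row by row (each letter is one pixel).
flat = ["a", "b", "c",
        "d", "e", "f"]

image = reshape_rows(flat, height=2, width=3)
print(image)             # [['a', 'b', 'c'], ['d', 'e', 'f']]
print(transpose(image))  # [['a', 'd'], ['b', 'e'], ['c', 'f']]
```

Declaring the correct shape already yields the intended image. Transposing afterwards mirrors it across the diagonal, so a landscape image comes out as a portrait one.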

cocoa-xu · 2022-10-01T02:16:01Z

Hi @kipcole9, don't worry, I'm glad to help!

Also sorry that I thought you were trying to rotate the image 90 degrees (because when I first read this issue, I assumed the data of the tensor, color_checker.etf, was in WHC format) so that you can get the HWC-format image.

And while I was solving the dims and channel issue, I figured out that the actual data layout of the tensor was HWC. But then I wasn't quite sure what your expected result was, so I posted the code that does all three things:

transforming with the correct underlying data layout, Evision.Nx.to_mat/2;
transposing the image (WHC -> HWC), Evision.Mat.transpose!/2;
making the last dim the channel, Evision.Mat.last_dim_as_channel/1.

For me the following produced the expected result:

mat = Evision.Nx.to_mat!(tensor, {297, 441, 3})
transposed = Evision.Mat.last_dim_as_channel!(mat)
bgr = Evision.cvtColor!(transposed, Evision.cv_COLOR_RGB2BGR())
Evision.imwrite "color_checker.jpg", bgr

Does that accord with your expectations?

Yes, this should give the original image (height 297, width 441) in BGR format. :)
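For readers unfamiliar with what the RGB to BGR step does to the data, here is a minimal sketch on an interleaved 8-bit buffer in plain Python. `rgb_to_bgr` is a hypothetical helper written for this illustration; it is not the OpenCV implementation of cvtColor, it only shows the byte reordering.

```python
def rgb_to_bgr(pixels: bytes) -> bytes:
    """Swap the first and third byte of every 3-byte pixel."""
    if len(pixels) % 3 != 0:
        raise ValueError("buffer length must be a multiple of 3")
    out = bytearray(pixels)
    out[0::3] = pixels[2::3]  # blue moves to the front
    out[2::3] = pixels[0::3]  # red moves to the back
    return bytes(out)


# Two pixels: pure red, then (10, 20, 30).
rgb = bytes([255, 0, 0, 10, 20, 30])
print(list(rgb_to_bgr(rgb)))  # [0, 0, 255, 30, 20, 10]
```

The green byte stays in place, and applying the swap twice returns the original buffer.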

kipcole9 · 2022-10-01T02:21:21Z

@cocoa-xu, one last question (hope I'm not pushing too much - just so close now!). The following image is a B&W 2-channel QRcode. Following the "recipe" you kindly created:

iex> mat = Evision.Nx.to_mat!(tensor, {440, 440, 2})                             
%Evision.Mat{
  channels: 1,
  dims: 3,
  type: {:u, 8},
  raw_type: 0,
  shape: {440, 440, 2},
  ref: #Reference<0.2774951902.3863347232.59537>
}
iex> transposed = Evision.Mat.last_dim_as_channel!(mat)                          
%Evision.Mat{
  channels: 2,
  dims: 2,
  type: {:u, 8},
  raw_type: 8,
  shape: {440, 440, 2},
  ref: #Reference<0.2774951902.3863347232.59538>
}
# No color conversion since it's black and white 2-channel. Just save.
iex> Evision.imwrite "/Users/kip/Desktop/qrcode_evision.png", transposed         
** (ArgumentError) argument error
    (evision 0.1.6) :evision_nif.imwrite([filename: "/Users/kip/Desktop/qrcode_evision.png", img: #Reference<0.2774951902.3863347232.59538>])
    (evision 0.1.6) lib/generated/evision.ex:14429: Evision.imwrite/2
    iex:34: (file)

Erlang term file of the Nx tensor

qrcode_bw.etf.zip

Original image (B&W matrix)

kipcole9 · 2022-10-01T02:22:27Z

BTW, the new Evision.Mat struct is really great, thank you for doing that.

cocoa-xu · 2022-10-01T02:27:59Z

Sorry, I'm afraid that OpenCV does not support saving 2-channel images (yet). I tried the following code in Python:

import cv2
import numpy as np
img = np.zeros((200, 200, 2), dtype=np.uint8)
cv2.imwrite("a.png", img)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
cv2.error: OpenCV(4.5.5) /Users/runner/work/opencv-python/opencv-python/opencv/modules/imgcodecs/src/loadsave.cpp:737: error: (-215:Assertion failed) image.channels() == 1 || image.channels() == 3 || image.channels() == 4 in function 'imwrite_'

As the error suggests, OpenCV's imwrite only accepts images with 1, 3, or 4 channels. I think perhaps we can convert the tensor to a single-channel one and save it?
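One possible way to get a single-channel image, sketched in plain Python on an interleaved buffer. This assumes the first channel carries the black-and-white values and the second is something like alpha that can be dropped; that is an assumption about the data, not something confirmed in this thread. `first_channel` and `SUPPORTED_CHANNELS` are illustrative names.

```python
# Channel counts accepted by imwrite, per the assertion in the error above.
SUPPORTED_CHANNELS = (1, 3, 4)


def first_channel(interleaved: bytes, channels: int) -> bytes:
    """Keep only channel 0 of every pixel in an interleaved u8 buffer.

    Assumption: channel 0 holds the grey values and the rest can be dropped.
    """
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("buffer length must be a multiple of `channels`")
    return interleaved[0::channels]


# Two 2-channel pixels: (0, 255) and (255, 255).
pixels = bytes([0, 255, 255, 255])
gray = first_channel(pixels, channels=2)
print(list(gray))               # [0, 255]
print(1 in SUPPORTED_CHANNELS)  # True
```

With the second channel dropped, the result has 1 channel per pixel, which is one of the counts the assertion allows.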

kipcole9 · 2022-10-01T02:29:38Z

Cool - as long as that's the expected result I can certainly deal with it!

Really appreciate all your support - now on to actually exploiting all the great capabilities of eVision/OpenCV!!!!

cocoa-xu · 2022-10-01T02:32:39Z

Glad I and this library can be of help! And thank you for using it :)

kipcole9 · 2022-10-01T03:23:44Z

Success! (promise not to pollute this issue any more):

iex(1)> {:ok, image} = Image.open "test/support/images/qr_code_con.png"
{:ok, %Vix.Vips.Image{ref: #Reference<0.3981987137.3338010659.102834>}}
iex(2)> Image.QRcode.decode image                                     
{:ok, "MECARD:N:Joe;EMAIL:Joe@bloggs.com;;"}

Super happy and no way could it be done without eVision!!! Lots more to be done but I'm going to focus on object detection in the next release series of Image.

cocoa-xu · 2022-10-01T09:46:58Z

Congratulations on the success 🎉!!

And please don't worry about replying more to this issue. It was automatically closed by GitHub because a linked PR was merged, so please always feel free to reply to or reopen it (or any other issue) whenever you need to!

cocoa-xu mentioned this issue Sep 27, 2022

Fix dims and channel #81

Merged

cocoa-xu closed this as completed in #81 Sep 27, 2022
