How to downscale the pixel values in a UWP App (C#)? #22

Closed

kchaitanyabandi opened this issue Jul 30, 2018 · 8 comments

Comments

@kchaitanyabandi

Hi,

I built a UWP app (C#) based on the SqueezeNet example provided in the repository, using a deep learning model (ONNX) for image classification. I built the model in PyTorch, where the image's pixel values are scaled down from the range [0, 255] to [0, 1] and then normalized with the channel-wise (RGB) standard deviation and mean. So this model expects pixel values outside the [0, 255] range.

But in the UWP app, I'm unable to perform this downscaling of the pixel values before binding the inputs to the model. I have searched the SoftwareBitmap class but couldn't find a way to perform this operation. Any help would be very much appreciated.

I need to apply this operation somewhere in between these lines of code:


            await LoadModel();

            // Trigger file picker to select an image file
            var picker = new FileOpenPicker();
            picker.ViewMode = PickerViewMode.Thumbnail;
            picker.SuggestedStartLocation = PickerLocationId.PicturesLibrary;
            picker.FileTypeFilter.Add(".jpg");
            picker.FileTypeFilter.Add(".jpeg");
            picker.FileTypeFilter.Add(".png");
            StorageFile file = await picker.PickSingleFileAsync();

            outputTextBlock.Text = $"The selected Image: {file.Name}";

            SoftwareBitmap softwareBitmap;
            using (IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.Read))
            {
                // Create the decoder from the stream 
                BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);

                PixelDataProvider random = await decoder.GetPixelDataAsync();
                // byte[] pD =  random.DetachPixelData();
                //await FileIO.WriteBytesAsync("path/file.ext", pD);
                // System.IO.File.WriteAllBytes("path/file.ext", pD);
                // byteData.Text = $"{pD}";
                

                // Get the SoftwareBitmap representation of the file in BGRA8 format
                softwareBitmap = await decoder.GetSoftwareBitmapAsync();
                softwareBitmap = SoftwareBitmap.Convert(softwareBitmap, BitmapPixelFormat.Bgra8, BitmapAlphaMode.Ignore);
            }
            
            var streamD = await file.OpenReadAsync();
            var imageSource = new BitmapImage();
            await imageSource.SetSourceAsync(streamD);

            selectedImage.Source = imageSource;

            // Display the image
            //SoftwareBitmapSource imageSource = new SoftwareBitmapSource();
            //await imageSource.SetBitmapAsync(softwareBitmap);
            //selectedImage.Source = imageSource;

            // Encapsulate the image within a VideoFrame to be bound and evaluated
            VideoFrame inputWoodImage = VideoFrame.CreateWithSoftwareBitmap(softwareBitmap);

            await EvaluateVideoFrameAsync(inputWoodImage);

Thanks
Krishna

@LPBourret
Contributor

Hi Krishna,
You can access and edit the pixels of a SoftwareBitmap via IMemoryBufferByteAccess; see here for more documentation.
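
For reference, a minimal sketch of that pattern, adapted from the SoftwareBitmap documentation (it assumes the softwareBitmap variable from your snippet, a project with unsafe code enabled, and using System.Runtime.InteropServices):

    [ComImport]
    [Guid("5B0D3235-4DBA-4D44-865E-8F1D0E4FD04D")]
    [InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
    unsafe interface IMemoryBufferByteAccess
    {
        void GetBuffer(out byte* buffer, out uint capacity);
    }

    // Lock the bitmap and get a raw pointer to its BGRA8 pixel data.
    using (BitmapBuffer buffer = softwareBitmap.LockBuffer(BitmapBufferAccessMode.Write))
    using (var reference = buffer.CreateReference())
    {
        unsafe
        {
            byte* data;
            uint capacity;
            ((IMemoryBufferByteAccess)reference).GetBuffer(out data, out capacity);
            // data[0 .. capacity) is the raw interleaved B, G, R, A byte stream;
            // read or edit the pixel values in place here.
        }
    }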

@walrusmcd
Contributor

walrusmcd commented Aug 14, 2018

Do you have more details on what your model expects? Like, what image formats and dimensions?

I think you are saying that your model expects an image with dimensions similar to SqueezeNet's, (1, 3, 224, 224), and that you also want the pixel data normalized to [0, 1] instead of [0, 255]. Right?

For downscaling to 224x224 you can use the normal VideoFrame.CopyToAsync(). For normalizing the tensor yourself, you would then need to tensorize that video frame into a TensorFloat object. The code would look something like this:

{
	// Resize the input frame to 224x224 by copying it into a 224x224 BGRA8 buffer.
	SoftwareBitmap bitmapBuffer = new SoftwareBitmap(BitmapPixelFormat.Bgra8, 224, 224, BitmapAlphaMode.Ignore);
	VideoFrame buffer = VideoFrame.CreateWithSoftwareBitmap(bitmapBuffer);
	await inputFrame.CopyToAsync(buffer);
	SoftwareBitmap resizedBitmap = buffer.SoftwareBitmap;

	// Read the resized pixels back as packed 32-bit BGRA values.
	WriteableBitmap innerBitmap = new WriteableBitmap(resizedBitmap.PixelWidth, resizedBitmap.PixelHeight);
	resizedBitmap.CopyToBuffer(innerBitmap.PixelBuffer);
	int[] pixels = innerBitmap.GetBitmapContext().Pixels;
	float[] normalized = NormalizeImage(pixels);
}

private float[] NormalizeImage(int[] src)
{
	// Unpack each 32-bit BGRA pixel and scale B, G, R down to [0, 1].
	var normalized = new float[src.Length * 3];
	for (int i = 0; i < src.Length; i++)
	{
		var val = src[i];
		normalized[i * 3 + 0] = (float)(val & 0xFF) / 255f;         // B
		normalized[i * 3 + 1] = (float)((val >> 8) & 0xFF) / 255f;  // G
		normalized[i * 3 + 2] = (float)((val >> 16) & 0xFF) / 255f; // R
	}
	return normalized;
}
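
One caveat: as sketched, NormalizeImage returns interleaved per-pixel B, G, R floats; a PyTorch-trained model typically expects planar NCHW RGB input, so you may need to reorder the values when you copy them into the TensorFloat.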

@kchaitanyabandi
Author

kchaitanyabandi commented Aug 14, 2018

@LPBourret and @walrusmcd - Thanks for your responses

I was able to access the pixel values. In PyTorch, the model was trained on image pixel data where each pixel value was initially in the range [0, 255] and is converted to [0, 1] when the image is loaded as a tensor (essentially, every pixel value is divided by 255). I was able to perform this in the UWP app using a WriteableBitmap (inspiration: from this blog).

But I observed that these pixel values, now in the range [0, 1], can't be further normalized with a given standard deviation and mean, because that would make some pixel values negative, and the byte storage format of pixels in UWP doesn't allow that.

Is there any way to normalize the pixel values and store them in a byte array?

@kchaitanyabandi
Author

Hi @walrusmcd

I have a better understanding of the problem now. These are the steps I followed:

  1. I trained a basic CNN in PyTorch, where an image's pixel data is stored as a tensor by converting each channel value (red, green and blue) from the integer range [0, 255] to the float range [0, 1], i.e. dividing by 255. For example, if the red-channel value at position [0, 0] is 200, the corresponding tensor value is 200/255 = 0.7843. So the training data, and hence the expected input for the model, is a tensor of shape (N: 1, C: 3, H, W).

  2. Now I'm trying to build a UWP app in C# that uses this model to make predictions, and I'm facing a problem supplying an input to it. The model expects a tensor of shape [1, 3, H, W] with continuous values in the range [0, 1]. But the SoftwareBitmap class stores the pixel data in a byte array that can only hold integers between [0, 255]. When I divide a pixel value by 255 and convert it to the byte datatype, it becomes either a 0 or a 1, which is not what I want.

  3. Can't the pixel data in a SoftwareBitmap be stored as float values instead of a byte array?

Can you please help me with passing the modified float values (i.e., continuous values between 0 and 1) to the model?

Is this not possible with Windows.AI.MachineLearning.Preview? I see that it is deprecated now. Is it possible with the pre-release version of Windows.AI.MachineLearning?

Thanks for your time.

                using (IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.Read))
                {
                    // Create the decoder from the stream 
                    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);

                    Iwidth = Convert.ToInt32(decoder.PixelWidth);
                    Iheight = Convert.ToInt32(decoder.PixelHeight);
                    Ibmp = new WriteableBitmap(Iwidth, Iheight);

                    stream.Seek(0);
                    await Ibmp.SetSourceAsync(stream);

                    var srcPixelStream = Ibmp.PixelBuffer.AsStream();
                    byte[] srcPixels = new byte[4 * Iwidth * Iheight];
                    int length = srcPixelStream.Read(srcPixels, 0, 4 * Iwidth * Iheight);

                    var random = await decoder.GetPixelDataAsync();
                    var bytes = random.DetachPixelData();

                    byte b, g, r, a;
                    float sb, sg, sr, sa;
                    byte[] destPixels = new byte[4 * Iwidth * Iheight];
                    int pos;
                    // Convert pixel data to [0,1] - I want continuous values between 0 and 1 here but just getting either a 0 or 1
                    for (int y = 0; y < Iheight; y++)
                    {
                        for (int x = 0; x < Iwidth; x++)
                        {
                            pos = (x + y * Iwidth) * 4;
                            b = bytes[pos];
                            g = bytes[pos + 1];
                            r = bytes[pos + 2];
                            a = bytes[pos + 3];

                            sb = (float)b / 255;
                            sg = (float)g / 255;
                            sr = (float)r / 255;
                            sa = (float)a / 255;
                            
                            // sr here is a float value but when I convert it to a byte it becomes either a 0 or 1
                            if (x == 0 && y == 2)
                            {
                                pixelBox.Text = $"Pixel Data : {x}, {y}, {r}, {g}, {b}, {a}, {Convert.ToByte(sr)}, {sg}, {sb}, {sa}, {Byte.MinValue}, {Byte.MaxValue}"; 
                            }
                            
                            //destPixels[pos] = Convert.ToByte(sb); // B
                            //destPixels[pos + 1] = Convert.ToByte(sg); // G
                            //destPixels[pos + 2] = Convert.ToByte(sr); // R
                            //destPixels[pos + 3] = Convert.ToByte(sa); // A

                            destPixels[pos] = (byte)(sb); // B
                            destPixels[pos + 1] = (byte)(sg); // G
                            destPixels[pos + 2] = (byte)(sr); // R
                            destPixels[pos + 3] = (byte)(sa); // A

                        }
                    }

                    // Write modified pixel values back to WriteableBitmap
                    srcPixelStream.Seek(0, SeekOrigin.Begin);
                    srcPixelStream.Write(destPixels, 0, length);

                    // Get the SoftwareBitmap representation of the file in BGRA8 format
                    IsoftwareBitmap = SoftwareBitmap.CreateCopyFromBuffer(Ibmp.PixelBuffer, BitmapPixelFormat.Bgra8, Ibmp.PixelWidth, Ibmp.PixelHeight);
                    IsoftwareBitmap = SoftwareBitmap.Convert(IsoftwareBitmap, BitmapPixelFormat.Bgra8, BitmapAlphaMode.Ignore);
                }

@LPBourret
Contributor

Using the Windows.AI.MachineLearning API (not Preview), you would create a float array of the same shape as your model's input requirement, populate it with normalized values from your SoftwareBitmap using the code you posted, then create a TensorFloat from that array and bind it before evaluation (see TensorFloat.CreateFromArray()).
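
A minimal sketch of that flow, assuming the 224x224 normalized float array from the earlier snippet (already laid out in the [1, 3, 224, 224] channel order the model expects) and an existing LearningModelSession named session:

    // Shape must match the model input: batch 1, 3 channels, 224x224 pixels.
    long[] shape = { 1, 3, 224, 224 };
    TensorFloat inputTensor = TensorFloat.CreateFromArray(shape, normalized);

    // Bind the tensor to the model's first input feature and evaluate.
    LearningModelBinding binding = new LearningModelBinding(session);
    binding.Bind(session.Model.InputFeatures[0].Name, inputTensor);
    LearningModelEvaluationResult result = await session.EvaluateAsync(binding, "run");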

@zhangxiang1993
Member

Hi @kchaitanyabandi
I think I have a way to solve your question in C++ that avoids converting back to bytes.
Instead of binding a SoftwareBitmap, I use TensorFloat as the binding object value (similar to @LPBourret's solution):

    // Lock the SoftwareBitmap and get a raw pointer to its BGRA8 pixel data.
    BYTE* pData = nullptr;
    UINT32 size = 0;
    winrt::Windows::Graphics::Imaging::BitmapBuffer spBitmapBuffer(softwareBitmap.LockBuffer(winrt::Windows::Graphics::Imaging::BitmapBufferAccessMode::Read));
    winrt::Windows::Foundation::IMemoryBufferReference reference = spBitmapBuffer.CreateReference();
    auto spByteAccess = reference.as<::Windows::Foundation::IMemoryBufferByteAccess>();
    spByteAccess->GetBuffer(&pData, &size);

    // Create an NCHW float tensor and get write access to its CPU buffer.
    std::vector<int64_t> shape = { 1, 3, softwareBitmap.PixelHeight(), softwareBitmap.PixelWidth() };
    float* pCPUTensor;
    uint32_t uCapacity;
    TensorFloat tf = TensorFloat::Create(shape);
    com_ptr<ITensorNative> itn = tf.as<ITensorNative>();
    itn->GetBuffer(reinterpret_cast<BYTE**>(&pCPUTensor), &uCapacity);

    // Deinterleave the BGRA bytes into planar channels (alpha is dropped).
    uint32_t height = softwareBitmap.PixelHeight();
    uint32_t width = softwareBitmap.PixelWidth();
    for (UINT32 i = 0; i < size; i += 4)
    {
        UINT32 pixelInd = i / 4;
        pCPUTensor[pixelInd] = (float)pData[i];
        pCPUTensor[(height * width) + pixelInd] = (float)pData[i + 1];
        pCPUTensor[(height * width * 2) + pixelInd] = (float)pData[i + 2];
    }

    // Bind the tensor to the model's first input feature.
    binding.Bind(model.InputFeatures().First().Current().Name(), tf);
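
Note that this copies the raw byte values straight into the tensor, so they stay in [0, 255]; dividing each value by 255.0f inside the loop would give the [0, 1] range your model expects.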

@kumraj
Contributor

kumraj commented Jan 24, 2019

Reopen if the issue is not resolved.

@zhangxiang1993
Member

Update: the upcoming ONNX Runtime 1.7 release adds support for pixel data normalized to [0, 1] and [-1, 1].
The model's inputs and outputs must be specified as IMAGE instead of TENSOR, and its metadata properties must set nominalPixelRange to normalized_0_1 or normalized_1_1; otherwise [0, 255] is assumed by default.
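
With that metadata in place, the frame can be bound directly as an image and the runtime performs the pixel-range conversion during tensorization. A minimal C# sketch (modelFile and inputFrame are placeholders for your own storage file and video frame):

    // Load a model whose input is declared as IMAGE with
    // nominalPixelRange set to normalized_0_1 in its metadata.
    LearningModel model = await LearningModel.LoadFromStorageFileAsync(modelFile);
    LearningModelSession session = new LearningModelSession(model);
    LearningModelBinding binding = new LearningModelBinding(session);

    // Bind the VideoFrame directly; normalization happens during tensorization.
    ImageFeatureValue imageValue = ImageFeatureValue.CreateFromVideoFrame(inputFrame);
    binding.Bind(model.InputFeatures[0].Name, imageValue);
    LearningModelEvaluationResult result = await session.EvaluateAsync(binding, "run");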
