Input Image format #4

Closed
dgschwend opened this issue Mar 1, 2017 · 6 comments

Comments

@dgschwend
Owner

Submitted by @divyapraneetha:

Hi David,

I am working on a similar project, modifying the network to fit the FPGA resources I have. I would like to know which input image format is used.

I have looked into converting image files to LMDB sets, but the code you have implemented takes a .bin image.

I would like to know how you defined the input format.

Any help with this would be highly appreciated.

Thanks
Divya

@dgschwend
Owner Author

dgschwend commented Mar 1, 2017

Hi Divya
The input image is currently extracted from the Caffe model. Using "classify.py" from DIGITS/Caffe, the network is run forward and then the blob from layer "data" is dumped in binary format. See "classify.py" here https://www.dropbox.com/sh/i0h55cvhhoymc9s/AADDaBbM7YfOKHZCcK3RkqAXa?dl=0
There is surely a more elegant way, if you investigate how Caffe stores the image data internally.
Regards
David
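As a rough illustration of the "dump the blob in binary format" step, here is a minimal NumPy-only sketch. It assumes the blob is already available as a (channels, height, width) array (in pycaffe that would typically be something like `net.blobs['data'].data[0]`), and it assumes the file should use the same element order as the conversion script posted later in this thread: channels vary fastest, then axis 1, then axis 2. The names `dump_blob`, `demo` and `blob_path` are made up for this example and are not part of classify.py.

```python
import os
import tempfile

import numpy as np


def dump_blob(blob, outfile):
    """Write a (channels, dim1, dim2) array as raw native float32 values.

    Element order in the file (an assumption, mirroring the loop in the
    conversion script in this thread): channels vary fastest, then axis 1,
    then axis 2.
    """
    blob = np.asarray(blob, dtype=np.float32)
    if blob.ndim != 3:
        raise ValueError("expected a 3-D (channels, dim1, dim2) array")
    # transpose(2, 1, 0) makes axis 2 the outermost and channels the innermost
    interleaved = np.ascontiguousarray(blob.transpose(2, 1, 0))
    interleaved.tofile(outfile)
    return interleaved.size


# Stand-in for a real network blob: 3 channels, 2 x 2 pixels
demo = np.arange(3 * 2 * 2, dtype=np.float32).reshape(3, 2, 2)
blob_path = os.path.join(tempfile.gettempdir(), "demo_blob.bin")
count = dump_blob(demo, blob_path)
print("wrote %d floats to %s" % (count, blob_path))
```

Whether this order matches what the hardware side expects should be checked against the code that reads the file.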

@dgschwend
Owner Author

PS: Here's some Python code which does the JPG->.bin conversion:

#!/usr/bin/env python3

# need openBLAS
# need libraries: pip install pillow numpy
import struct                       # packs the float values into raw bytes
import sys

import numpy as np
import PIL.Image                    # pillow image library
#import scipy.misc, scipy.ndimage

"""
Load an image from disk and write it back as binary input data.

Arguments:
path -- path to an image on disk
width -- resize dimension
height -- resize dimension
outfile -- path of the binary file to write
"""

def convert_image(path, width, height, outfile):

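    image = PIL.Image.open(path)
    image = image.convert('RGB')
    # Plain bicubic resize to the requested size (aspect ratio is not preserved).
    # This stands in for the commented-out half-crop/half-fill code below.
    image = image.resize((width, height), PIL.Image.BICUBIC)
    image = np.array(image)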

    # Transform to desired size (half-crop, half-fill)
    # height_ratio = float(image.shape[0])/height
    # width_ratio = float(image.shape[1])/width
    # new_ratio = (width_ratio + height_ratio) / 2.0
    # resize_width = int(round(image.shape[1] / new_ratio))
    # resize_height = int(round(image.shape[0] / new_ratio))
    # if width_ratio > height_ratio and (height - resize_height) % 2 == 1:
    #     resize_height += 1
    # elif width_ratio < height_ratio and (width - resize_width) % 2 == 1:
    #     resize_width += 1
    # image = scipy.misc.imresize(image, (resize_height, resize_width), interp='bicubic')
    # if width_ratio > height_ratio:
    #     start = int(round((resize_width-width)/2.0))
    #     image = image[:,start:start+width]
    # else:
    #     start = int(round((resize_height-height)/2.0))
    #     image = image[start:start+height,:]
    #
    # # Fill ends of dimension that is too short with random noise
    # if width_ratio > height_ratio:
    #     padding = (height - resize_height)/2
    #     noise_size = (padding, width, 3)
    #     noise = np.random.randint(0, 255, noise_size).astype('uint8')
    #     image = np.concatenate((noise, image, noise), axis=0)
    # else:
    #     padding = (width - resize_width)/2
    #     noise_size = (height, padding, 3)
    #     noise = np.random.randint(0, 255, noise_size).astype('uint8')
    #     image = np.concatenate((noise, image, noise), axis=1)

    # NumPy image arrays are indexed (row, column), i.e. (height, width)
    processed = np.zeros((3, height, width), np.float32)

    # Transpose from (height, width, channels) to (channels, height, width)
    #processed = processed.transpose((2,0,1))

    # Channel Swap: RGB -> BGR
    #image = image[(2,1,0),:,:]

    # Subtract Mean, Swap Channels RGB -> BGR, Transpose (H,W,CH) to (CH,H,W)
    #mean_rgb = [104,117,123]
    processed[0,:,:] = (image[:,:,2]-104.0)
    processed[1,:,:] = (image[:,:,1]-117.0)
    processed[2,:,:] = (image[:,:,0]-123.0)

    print("Saving input data in binary format")
    data = np.array(processed)

    # data.shape is a tuple, so wrap it to format it as a single value
    print("  shape: %s" % (data.shape,))
    CH = data.shape[0]
    H = data.shape[1]
    W = data.shape[2]
    pixels = []

    # Element order in the file: channels vary fastest, then rows, then columns
    for x in range(W):
        for y in range(H):
            for c in range(CH):
                pixels.append(data[c,y,x])

    # Write Pixels to binary file
    print("  write to file %s" % outfile)
    floatstruct = struct.pack('f'*len(pixels), *pixels)
    with open(outfile, "wb") as f:
        f.write(floatstruct)

if __name__ == "__main__":
    path = ""
    width = 256
    height = 256
    outfile = "image.bin"

    if len(sys.argv) == 2:
        path = sys.argv[1]
    elif len(sys.argv) == 4:
        path = sys.argv[1]
        width = int(sys.argv[2])
        height = int(sys.argv[3])
    elif len(sys.argv) == 5:
        path = sys.argv[1]
        width = int(sys.argv[2])
        height = int(sys.argv[3])
        outfile = sys.argv[4]
    else:
        print("usage: %s <inputfile> [width height [outputfile]]" % sys.argv[0])
        sys.exit(1)

    convert_image(path,width,height,outfile)
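
To sanity-check a generated .bin file, it can be read back into a (channels, dim1, dim2) array. The sketch below assumes the file holds native float32 values with channels varying fastest, then axis 1, then axis 2, which is what the write loop above produces. `load_bin` is a hypothetical helper, not part of the project, and its `shape` argument is the shape the script prints while saving.

```python
import os
import struct
import tempfile

import numpy as np


def load_bin(path, shape):
    """Read a .bin file back into an array of the given (channels, dim1, dim2) shape."""
    channels, dim1, dim2 = shape
    flat = np.fromfile(path, dtype=np.float32)
    if flat.size != channels * dim1 * dim2:
        raise ValueError("file holds %d floats, expected %d"
                         % (flat.size, channels * dim1 * dim2))
    # In the file, axis 2 is outermost and the channel index is innermost
    return flat.reshape(dim2, dim1, channels).transpose(2, 1, 0)


# Round trip on a tiny array, written in the same loop order as the script
original = np.arange(3 * 2 * 4, dtype=np.float32).reshape(3, 2, 4)
values = [float(original[c, x, y])
          for y in range(original.shape[2])
          for x in range(original.shape[1])
          for c in range(original.shape[0])]
demo_path = os.path.join(tempfile.gettempdir(), "roundtrip_demo.bin")
with open(demo_path, "wb") as f:
    f.write(struct.pack("f" * len(values), *values))

restored = load_bin(demo_path, original.shape)
print("round trip ok:", np.array_equal(original, restored))
```

For a real file, pass the shape printed by the conversion script and compare a few values against the source image after mean subtraction.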

@divyapraneetha

Thanks for the source. It worked for resizing different images to 256x256x3. Now I can work with different images.

Divya

@SapnaBhardwaj1

Hi David,

I am downsizing the neural network to fit the FPGA resources I have. I would like to know how to generate the weights.bin file. I have gone through the material available online on extracting weights from a .caffemodel. Should the extracted weights be treated as floats? How do I decide between float and int, and how does accuracy vary with that choice? Also, can you share a reference for generating weights.bin? Thanks in advance for any suggestions.

Thank You

@dgschwend
Owner Author

Can you please open a new issue as this is not related to the input image format?

@dgschwend
Owner Author

For reference, conversion scripts are available under tools/convert_caffemodel.
