Foreground background estimation for TensorFlow version [Question❓] #49

Open
MingtaoGuo opened this issue Aug 23, 2021 · 1 comment
@MingtaoGuo

Hi,

Thank you for your amazing repo. I tried to convert estimate_fg_bg_numpy.py to TensorFlow, but the inference speed is not satisfactory: on a GTX 1080 Ti GPU, the CuPy version takes only about 2 ms, while the TensorFlow version takes about 20 ms at 144x256 resolution. Do you know how to correctly port the NumPy code to TensorFlow? Thank you very much.

import numpy as np
from PIL import Image
import time
import tensorflow as tf


def inv2(mat):
    a = mat[..., 0, 0]
    b = mat[..., 0, 1]
    c = mat[..., 1, 0]
    d = mat[..., 1, 1]

    inv_det = 1 / (a * d - b * c)

    inv00 = inv_det * d
    inv01 = inv_det * -b
    inv10 = inv_det * -c
    inv11 = inv_det * a
    inv00 = inv00[:, tf.newaxis, tf.newaxis]
    inv01 = inv01[:, tf.newaxis, tf.newaxis]
    inv10 = inv10[:, tf.newaxis, tf.newaxis]
    inv11 = inv11[:, tf.newaxis, tf.newaxis]
    inv_temp1 = tf.concat([inv00, inv10], axis=1)
    inv_temp2 = tf.concat([inv01, inv11], axis=1)
    inv = tf.concat([inv_temp1, inv_temp2], axis=2)

    return inv


def pixel_coordinates(w, h, flat=False):
    x, y = tf.meshgrid(np.arange(w), np.arange(h))

    if flat:
        x = tf.reshape(x, [-1])
        y = tf.reshape(y, [-1])

    return x, y


def vec_vec_outer(a, b):
    # Batched outer product: (..., i), (..., j) -> (..., i, j)
    return tf.einsum("...i,...j->...ij", a, b)

def estimate_fb_ml(
        input_image,
        input_alpha,
        min_size=2,
        growth_factor=2,
        regularization=1e-5,
        n_iter_func=2,
        print_info=True,):

    h0, w0 = 144, 256

    # Find initial image size.
    w = int(np.ceil(min_size * w0 / h0))
    h = min_size

    # Generate initial foreground and background from input image
    F = tf.image.resize_nearest_neighbor(input_image[tf.newaxis], [h, w])[0]
    B = F * 1.0
    while True:
        if print_info:
            print("New level of size: %d-by-%d" % (w, h))
        # Resize image and alpha to size of current level
        image = tf.image.resize_nearest_neighbor(input_image[tf.newaxis], [h, w])[0]
        alpha = tf.image.resize_nearest_neighbor(input_alpha[tf.newaxis, :, :, tf.newaxis], [h, w])[0, :, :, 0]
        # Iterate a few times
        n_iter = n_iter_func
        for iteration in range(n_iter):
            x, y = pixel_coordinates(w, h, flat=True) # w: 4, h: 2
            # Make alpha into a vector
            a = tf.reshape(alpha, [-1])
            # Build system of linear equations
            U = tf.stack([a, 1 - a], axis=1)
            A = vec_vec_outer(U, U) # 8 x 2 x 2
            b = vec_vec_outer(U, tf.reshape(image, [w*h, 3])) # 8 x 2 x 3
            # For each neighbor
            for dx, dy in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                x2 = tf.clip_by_value(x + dx, 0, w - 1)
                y2 = tf.clip_by_value(y + dy, 0, h - 1)
                # Vectorized neighbor coordinates
                j = x2 + y2 * w
                # Gradient of alpha
                a_j = tf.nn.embedding_lookup(a, j)
                da = regularization + tf.abs(a - a_j)
                # Update matrix of linear equation system
                A00 = A[:, 0, 0] + da
                A01 = A[:, 0, 1]
                A10 = A[:, 1, 0]
                A11 = A[:, 1, 1] + da
                A00 = A00[:, tf.newaxis, tf.newaxis]
                A01 = A01[:, tf.newaxis, tf.newaxis]
                A10 = A10[:, tf.newaxis, tf.newaxis]
                A11 = A11[:, tf.newaxis, tf.newaxis]
                A_temp1 = tf.concat([A00, A10], axis=1)
                A_temp2 = tf.concat([A01, A11], axis=1)
                A = tf.concat([A_temp1, A_temp2], axis=2)
                # Update rhs of linear equation system
                F_resp = tf.reshape(F, [w * h, 3])
                F_resp_j = tf.nn.embedding_lookup(F_resp, j)
                B_resp = tf.reshape(B, [w * h, 3])
                B_resp_j = tf.nn.embedding_lookup(B_resp, j)
                da_resp = tf.reshape(da, [w * h, 1])
                b0 = b[:, 0, :] + da_resp * F_resp_j
                b1 = b[:, 1, :] + da_resp * B_resp_j
                b = tf.concat([b0[:, tf.newaxis, :], b1[:, tf.newaxis, :]], axis=1)
            # Solve the per-pixel 2x2 linear systems for foreground and background
            fb = tf.clip_by_value(tf.matmul(inv2(A), b), 0, 1)

            F = tf.reshape(fb[:, 0, :], [h, w, 3])
            B = tf.reshape(fb[:, 1, :], [h, w, 3])

        # If original image size is reached, return result
        if w >= w0 and h >= h0:
            return F, B

        # Grow image size to next level
        w = min(w0, int(np.ceil(w * growth_factor)))
        h = min(h0, int(np.ceil(h * growth_factor)))

        F = tf.image.resize_nearest_neighbor(F[tf.newaxis], [h, w])[0]
        B = tf.image.resize_nearest_neighbor(B[tf.newaxis], [h, w])[0]



######################################################################
def estimate_foreground_background_tf():
    image_np = np.array(Image.open("./image.png").resize([256, 144]))[:, :, :3] / 255
    alpha_np = np.array(Image.open("./alpha.png").resize([256, 144])) / 255
    image = tf.placeholder(tf.float32, [144, 256, 3])
    alpha = tf.placeholder(tf.float32, [144, 256])
    foreground, background = estimate_fb_ml(image, alpha, n_iter_func=2)
    sess = tf.Session()
    for i in range(10):
        s = time.time()
        sess.run(foreground, feed_dict={image: image_np, alpha: alpha_np})
        e = time.time()
        print("time: ", e - s)


######################################################################
def main():
    estimate_foreground_background_tf()


if __name__ == "__main__":
    main()
@99991 (Collaborator) commented Sep 6, 2021

TensorFlow creates intermediate copies of all arrays/tensors, which takes some time. This algorithm is mostly bound by memory bandwidth rather than computation, so it is important to keep data in cache instead of copying it from tensor to tensor, which invalidates the cache each time.
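For illustration: each pass of the neighbor loop above rebuilds A and b from many slice/newaxis/concat ops, and every one of those ops materializes a new tensor. A sketch of the same update using broadcasting (reusing the variable names from the code above) needs far fewer intermediate tensors, although it does not remove TensorFlow's copy-per-op behaviour and is not a tested fix:

# Add da to both diagonal entries of every 2x2 block of A in one broadcasted op,
# replacing the eight slice/concat ops in the code above.
A = A + da[:, tf.newaxis, tf.newaxis] * tf.eye(2, dtype=da.dtype)
# Same idea for the right-hand side: stack the looked-up neighbor colors once
# and broadcast da over both rows instead of rebuilding b with concat.
b = b + da[:, tf.newaxis, tf.newaxis] * tf.stack([F_resp_j, B_resp_j], axis=1)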

I do not know of a way to implement this efficiently in TensorFlow. You need some kind of compilation process (as with CuPy). Just-in-time compilation might work, too, but I have not tried it.
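For anyone who wants to try the JIT route with the TF 1.x graph above, XLA can be switched on through the session config. This is only a sketch and untested, as said above, so there is no guarantee it fuses this particular graph well enough to reach the CuPy numbers:

import tensorflow as tf

# Enable XLA just-in-time compilation for the whole session (TF 1.x API).
# XLA can fuse many small element-wise/concat ops into fewer kernels,
# which reduces intermediate tensor traffic.
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
sess = tf.Session(config=config)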
