Example code returns flat image without any segment. #5

Closed
kosuke1701 opened this issue May 3, 2021 · 7 comments

@kosuke1701

First of all, thank you very much for sharing your interesting project!

I tried to set up an environment with Docker, and the following example code ran without error and output three images as expected.

python segment.py ./emilia.jpg

However, the output looks somewhat different from what is shown in the README. (The following image is current_skelton.png, which is output by the sample code.)

Dockerfile

Note: I've changed the numba version according to this issue.

FROM nvidia/cuda:9.0-cudnn7-runtime-ubuntu16.04

RUN apt-get update \
    && apt-get install -y --no-install-recommends \
    wget gcc make zlib1g-dev libssl-dev libopencv-dev \
    && apt-get -y clean \
    && rm -rf /var/lib/apt/lists/*

RUN apt-get remove -y --allow-change-held-packages libcudnn7


RUN apt-get update && apt-get install -y --no-install-recommends \
    libcudnn7=7.0.5.15-1+cuda9.0 \
    && apt-mark hold libcudnn7 && \
    rm -rf /var/lib/apt/lists/*

RUN wget https://www.python.org/ftp/python/3.6.11/Python-3.6.11.tgz \
    && tar zxvf Python-3.6.11.tgz \
    && cd Python-3.6.11 \
    && ./configure \
    && make && make install \
    && ln -s /usr/local/bin/python3.6 /usr/local/bin/python \
    && ln -s /usr/local/bin/pip3.6 /usr/local/bin/pip \
    && cd .. \
    && rm -r Python-3.6.11*

RUN pip install -U pip \
    && pip install tensorflow-gpu==1.5.0 \
    && pip install keras==2.2.4 \
    && pip install opencv-python==3.4.2.17 \
    && pip install numpy==1.15.4 \
    && pip install numba==0.49.0 \
    && pip install scipy==1.1.0 \
    && pip install scikit-image==0.13.0 \
    && pip install scikit-learn==0.22.2 \
    && pip install -U h5py==2.10.0 \
    && pip cache purge
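
One way to confirm that the container really ended up with the versions pinned above is to compare them against `pip freeze`. The following is a minimal sketch, not part of the project: the helper names `parse_freeze`, `find_mismatches`, and `pip_freeze` are hypothetical, and the pins are copied from the Dockerfile.

```python
import subprocess
import sys

# Pins copied from the Dockerfile above.
PINNED = {
    "tensorflow-gpu": "1.5.0",
    "keras": "2.2.4",
    "opencv-python": "3.4.2.17",
    "numpy": "1.15.4",
    "numba": "0.49.0",
    "scipy": "1.1.0",
    "scikit-image": "0.13.0",
    "scikit-learn": "0.22.2",
    "h5py": "2.10.0",
}


def parse_freeze(text):
    """Turn `pip freeze` output into {lowercased-name: version}."""
    installed = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or "==" not in line:
            continue  # skip comments, editable installs, and URL requirements
        name, version = line.split("==", 1)
        installed[name.lower().replace("_", "-")] = version
    return installed


def find_mismatches(pinned, installed):
    """Return {name: (wanted, found)} for every pin that is missing or different."""
    return {
        name: (wanted, installed.get(name))
        for name, wanted in pinned.items()
        if installed.get(name) != wanted
    }


def pip_freeze():
    """Run `pip freeze` with the current interpreter (works on Python 3.6)."""
    result = subprocess.run(
        [sys.executable, "-m", "pip", "freeze"],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return result.stdout
```

Inside the container, `find_mismatches(PINNED, parse_freeze(pip_freeze()))` returns an empty dict when every pin matches. A non-empty result only shows that a package differs from the Dockerfile, not that it causes the blank output.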

What I did

# Working directory is a directory where I created the Dockerfile.
docker build -t sample:0.1 .

# Mount project directory and execute sample code.
sudo docker run -it --rm --gpus all -v /path/to/DanbooRegion:/DanbooRegion --name debug sample:0.1 /bin/bash
$ cd /DanbooRegion/code
$ python segment.py ./emilia.jpg

Environment

  • OS: Ubuntu 20.04 LTS
  • Docker version: 20.10.5, build 55c4c88
  • nvidia-docker2 version: 2.6.0-1
  • GPU: GeForce RTX 3090
  • Nvidia driver version: 460.73.01
@lllyasviel
Owner

Do you mean that you get a blank white output image and cannot get any meaningful outputs? You have uploaded a blank white image, and that image is what you get?

@kosuke1701
Author

Yes. The uploaded blank white image is what I got as the skeleton map. The other two images (current_flatten.png, current_region.png) are also blank images with different colors.

@lllyasviel
Owner

If no Python errors are reported, it is likely that the models are not properly loaded. Have you downloaded the pretrained models and put them in the correct place?
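
A quick sanity check on the downloaded model files could look like the sketch below. It assumes the `.net` files are Keras models stored as HDF5, which this thread does not confirm, and the helper name `describe_model_file` is hypothetical. It only inspects the first bytes of each file to catch two common download mistakes: saving the GitHub web page instead of the raw file, or ending up with a Git LFS pointer.

```python
import os

# First 8 bytes of a standard HDF5 file.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def describe_model_file(path):
    """Classify a file by its leading bytes without loading the model."""
    if not os.path.isfile(path):
        return "missing"
    with open(path, "rb") as handle:
        head = handle.read(512)
    if head.startswith(HDF5_SIGNATURE):
        return "hdf5"
    if head.startswith(b"version https://git-lfs.github.com/spec/"):
        return "git-lfs pointer"
    if head.lstrip().lower().startswith((b"<!doctype html", b"<html")):
        return "html page"
    return "unknown"
```

For example, `describe_model_file("DanbooRegion2020UNet.net")` and `describe_model_file("srcnn.net")`, run from the `code` directory, should both report `"hdf5"` if the assumption about the file format holds. A passing check does not prove the weights are the intended ones, only that the files are not obviously the wrong kind of file.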

@kosuke1701
Author

kosuke1701 commented May 3, 2021

I used the following pretrained models, which are uploaded to the GitHub repo.

https://github.com/lllyasviel/DanbooRegion/blob/master/code/DanbooRegion2020UNet.net
https://github.com/lllyasviel/DanbooRegion/blob/master/code/srcnn.net

Currently, I'm training a new model from scratch using the provided training code to see whether it works.

EDITED:

I tested the new model trained from scratch, but it still returned blank images.

@kosuke1701
Author

It seems that the Ampere architecture of the RTX 3090 is not supported by cuDNN 7, which is required by TensorFlow version 1.

https://docs.nvidia.com/deeplearning/cudnn/support-matrix/index.html

Since this seems to be a lower-level issue related to hardware, I am closing this issue. Thank you for your responses.
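
The mismatch described above can be illustrated with a small lookup of which CUDA toolkit release first generates native code for a given GPU compute capability. This is a sketch under assumptions: the table is a partial summary from NVIDIA's CUDA release notes, the helper name `natively_supported` is hypothetical, and it checks the CUDA toolkit only, not cuDNN.

```python
# Minimum CUDA toolkit release with native code generation for a given
# compute capability (partial table, summarized from CUDA release notes).
MIN_CUDA_FOR_CAPABILITY = {
    (6, 1): (8, 0),   # Pascal, e.g. GTX 1070 Ti
    (7, 0): (9, 0),   # Volta
    (7, 5): (10, 0),  # Turing
    (8, 0): (11, 0),  # Ampere GA100
    (8, 6): (11, 1),  # Ampere GA10x, e.g. RTX 3090
}


def natively_supported(capability, cuda_version):
    """Return True if the toolkit version can target the capability natively."""
    minimum = MIN_CUDA_FOR_CAPABILITY.get(capability)
    if minimum is None:
        raise KeyError("capability %r is not in the table" % (capability,))
    return cuda_version >= minimum


# The Dockerfile above uses CUDA 9.0; the RTX 3090 has compute capability 8.6.
print(natively_supported((8, 6), (9, 0)))   # False
print(natively_supported((8, 6), (11, 1)))  # True
```

A `False` result does not always mean a hard failure. Binaries that embed PTX can sometimes be JIT-compiled by the driver for a newer GPU, which may explain reports of older TensorFlow builds running on 30-series cards, but that path is not guaranteed to produce correct results.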

@lllyasviel
Owner

Thank you for your report! One more thing: I have seen many of my clients with 3090/30XX cards using tf 1.4 without any trouble, although the official documentation does say these versions are not supported. Have you done anything unique to your environment?

@kosuke1701
Author

I think I have followed standard procedures to set up my environment. After a bit of research, I found that there is a way to use TensorFlow 1.15 with a 3090 GPU.

https://www.pugetsystems.com/labs/hpc/How-To-Install-TensorFlow-1-15-for-NVIDIA-RTX30-GPUs-without-docker-or-CUDA-install-2005/

After I set up a new TensorFlow 1.15 environment with the above procedure, I got a different output, as follows, but it still does not seem to be working :( When I return home, I will try the same environment setup on a different PC with a GTX 1070 Ti to see if anything changes.
