A personal solution to the first homework of the Introduction to Image Processing, Computer Vision, and Deep Learning course at NCKU CSIE, 2023.
- Python 3.8
- pip
- Windows 11
- Ubuntu 20.04 WSL
Use
$ git clone https://github.com/dodo920306/2023_opencvdl_hw1_in_ncku.git
$ pip install -r requirements.txt
to clone the repo and install the prerequisites.
Run
$ ldd ~/.local/lib/python3.8/site-packages/PyQt5/Qt5/plugins/platforms/libqxcb.so | grep "not found"
to check whether any dependent shared libraries are missing. It is common for many of them to be missing.
Use
$ sudo apt update && sudo apt install <missing shared libraries> -y
to install them.
For example, if you get
$ ldd ~/.local/lib/python3.8/site-packages/PyQt5/Qt5/plugins/platforms/libqxcb.so | grep "not found"
libxcb-icccm.so.4 => not found
libxcb-image.so.0 => not found
run
$ sudo apt update && sudo apt install libxcb-icccm4 libxcb-image0 -y
in response.
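When the list of missing libraries is long, it can help to collect the names programmatically. The following is a minimal sketch, not part of this repo: the helper name `missing_libraries` is hypothetical, and it only extracts the library names from `ldd` output. Mapping each name to its apt package (for example `libxcb-icccm.so.4` to `libxcb-icccm4`) is still a manual step.

```python
import re


def missing_libraries(ldd_output: str) -> list:
    """Return the library names that ldd reported as 'not found', without duplicates."""
    missing = []
    for line in ldd_output.splitlines():
        # ldd prints lines such as "<tab>libxcb-icccm.so.4 => not found"
        match = re.match(r"\s*(\S+)\s*=>\s*not found", line)
        if match and match.group(1) not in missing:
            missing.append(match.group(1))
    return missing


# Sample text copied from the example above, with one line repeated on purpose.
sample = """\
libxcb-icccm.so.4 => not found
libxcb-image.so.0 => not found
libxcb-icccm.so.4 => not found
"""
print(missing_libraries(sample))  # ['libxcb-icccm.so.4', 'libxcb-image.so.0']
```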
I developed this in WSL on Windows 11. If you are doing the same, run
$ ipconfig
on the Windows host to find the IP address it uses for WSL.
You should be able to ping that IP from WSL. If you cannot, run
$ New-NetFirewallRule -DisplayName "WSL" -Direction Inbound -InterfaceAlias "vEthernet (WSL)" -Action Allow
in PowerShell as an administrator on Windows and try again.
Finally, run
$ python main.py
to start the program. You should see the window pop up on your screen.
You may encounter a segmentation fault when closing the window; this is normal. If you know how to fix it, please send a pull request, because I am not sure how to solve it.
If you can ping Windows from WSL but still cannot run main.py, update WSL, restart it, and try again.
Once main.py runs successfully, the main UI window appears.
The features are divided into 5 main parts: Image Processing, Image Smoothing, Edge Detection, Transforms, and VGG19.
The Image Processing part has 3 buttons, each providing a different feature:
-
Color Separation
Click this button to get 3 popup windows showing the R, G, and B channel images separated from the image uploaded with the load-image-1 button.
-
Color Transformation
Click this button to get 2 popup windows showing grayscale images converted from the image uploaded with the load-image-1 button, one with the perceptually weighted formula and one with the average weighted formula.
-
Color Extraction
Click this button to get 2 popup windows: one showing the yellow/green part of the image uploaded with the load-image-1 button, and one showing the image after that part has been extracted from it.
Run
$ python Image_Processing.py
to launch this part of the UI on its own.
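To make the difference between the two Color Transformation formulas concrete, here is a per-pixel sketch. It is an illustration rather than the code in this repo: the function names are hypothetical, and the perceptual weights (0.299, 0.587, 0.114) are an assumption, namely the standard luma coefficients that OpenCV documents for its RGB-to-gray conversion. Real code would apply the same arithmetic to whole arrays.

```python
def gray_perceptual(r, g, b):
    """Weighted sum that favors green, to which the eye is most sensitive.

    Assumed weights: the standard luma coefficients (0.299, 0.587, 0.114).
    """
    return 0.299 * r + 0.587 * g + 0.114 * b


def gray_average(r, g, b):
    """Plain average of the three channels."""
    return (r + g + b) / 3


# A pure green pixel looks much brighter under the perceptual formula.
pixel = (0, 255, 0)
print(gray_perceptual(*pixel), gray_average(*pixel))
```

For a pure green pixel the perceptual formula gives roughly 150 while the average gives 85, which is why the two popup windows can look noticeably different on colorful images.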
The Image Smoothing part has 3 buttons, each providing a different feature:
-
Gaussian blur
Click this button to get a popup window showing the image uploaded with the load-image-1 button and a trackbar. Move the trackbar to change the radius of the Gaussian kernel, and the image updates accordingly.
-
Bilateral Filter
Click this button to get a popup window showing the image uploaded with the load-image-1 button and a trackbar. Move the trackbar to change the radius of the bilateral filter kernel, and the image updates accordingly.
-
Median Filter
Click this button to get a popup window showing the image uploaded with the load-image-2 button and a trackbar. Move the trackbar to change the radius of the median filter kernel, and the image updates accordingly.
Run
$ python Image_Smoothing.py
to launch this part of the UI on its own.
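The trackbars above control a kernel radius. As a rough illustration of how a radius could translate into Gaussian weights, the sketch below assumes a kernel size of 2 * radius + 1 and, when no sigma is given, the default sigma formula documented for `cv2.getGaussianKernel`. Both are assumptions about this project, the function name is hypothetical, and OpenCV itself may use precomputed kernels for small sizes, so its values can differ slightly.

```python
import math


def gaussian_kernel_1d(radius, sigma=None):
    """Return normalized 1-D Gaussian weights for a kernel of size 2 * radius + 1."""
    ksize = 2 * radius + 1  # assumed relation between radius and kernel size
    if sigma is None:
        # Default sigma formula from the cv2.getGaussianKernel documentation.
        sigma = 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8
    weights = [math.exp(-((i - radius) ** 2) / (2 * sigma ** 2)) for i in range(ksize)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1


kernel = gaussian_kernel_1d(2)
print(len(kernel), [round(w, 4) for w in kernel])
```

A larger radius spreads the weights over more neighbors, which is why the image looks blurrier as the trackbar moves to the right.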
The Edge Detection part has 4 buttons, each providing a different feature:
-
Sobel X
Click this button to get a popup window showing the Sobel X image for the image uploaded with the load-image-1 button.
-
Sobel Y
Click this button to get a popup window showing the Sobel Y image for the image uploaded with the load-image-1 button.
-
Combination and Threshold
Click this button to get 2 popup windows: one showing the combination of the Sobel X and Sobel Y images for the image uploaded with the load-image-1 button, and one showing the result of thresholding that combined image.
-
Gradient Angle
Click this button to get 2 popup windows showing the combination of the Sobel X and Sobel Y images for the image uploaded with the load-image-1 button, restricted to the parts whose gradient angle is between 120 and 180 degrees and between 210 and 310 degrees, respectively.
Run
$ python Edge_Detection.py
to launch this part of the UI on its own.
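The Sobel, combination, and gradient-angle steps can be sketched for a single 3x3 neighborhood as follows. This is a simplified illustration, not the code in this repo: the function names are hypothetical, combining X and Y as the Euclidean magnitude is an assumption, and angles are measured in image coordinates (y pointing down) and mapped to the range 0 to 360 degrees.

```python
import math

# Standard 3x3 Sobel kernels.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def apply_3x3(patch, kernel):
    """Element-wise multiply a 3x3 patch with a 3x3 kernel and sum the result."""
    return sum(patch[r][c] * kernel[r][c] for r in range(3) for c in range(3))


def gradient(patch):
    """Return (magnitude, angle in degrees within [0, 360)) for the patch center."""
    gx = apply_3x3(patch, SOBEL_X)
    gy = apply_3x3(patch, SOBEL_Y)
    magnitude = math.hypot(gx, gy)  # assumed way of combining Sobel X and Sobel Y
    angle = math.degrees(math.atan2(gy, gx)) % 360
    return magnitude, angle


def in_range(angle, low, high):
    """True when the angle lies inside the closed interval [low, high]."""
    return low <= angle <= high


# A patch that is bright toward the bottom-left corner.
patch = [[0, 0, 0], [255, 0, 0], [255, 255, 0]]
magnitude, angle = gradient(patch)
print(round(magnitude, 1), round(angle, 1), in_range(angle, 120, 180))
```

Pixels whose angle falls outside the chosen interval would be set to zero in the corresponding output image.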
Click the button to get a popup window showing the image uploaded with the load-image-1 button, rotated by the input angle and scaled by the input factor (defaults are 0 and 1) about the center (240, 200), and translated by the input values Tx and Ty.
Run
$ python Edge_Detection.py
to launch this part of the UI on its own.
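The rotation, scaling, and translation used by the Transforms part can be written as a single 2x3 affine matrix. The sketch below follows the matrix layout documented for `cv2.getRotationMatrix2D` and then adds the translation to the last column. Folding the translation in this way is an assumption about how the steps are composed, and the function names are hypothetical.

```python
import math


def affine_matrix(angle_deg, scale, tx, ty, center=(240, 200)):
    """Build a 2x3 matrix: rotate and scale about `center`, then shift by (tx, ty)."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy + tx],
        [-beta, alpha, beta * cx + (1 - alpha) * cy + ty],
    ]


def transform_point(matrix, x, y):
    """Apply the 2x3 affine matrix to the point (x, y)."""
    return (
        matrix[0][0] * x + matrix[0][1] * y + matrix[0][2],
        matrix[1][0] * x + matrix[1][1] * y + matrix[1][2],
    )


# With the default inputs (angle 0, scale 1, no translation) points stay in place.
print(transform_point(affine_matrix(0, 1, 0, 0), 10, 20))  # (10.0, 20.0)
```

Whatever the angle and scale, the center (240, 200) only moves by (Tx, Ty), which is a quick way to sanity-check the matrix.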
The VGG19 part has 5 buttons providing 4 different features:
-
Show Augmented Images
Click the button to get a popup window showing the augmented images, labeled with their filenames, from the Q5_image\Q5_1 directory under the project directory, which contains 9 pictures. The augmentation techniques used here are
-
transforms.RandomHorizontalFlip()
-
transforms.RandomVerticalFlip()
-
transforms.RandomRotation(30)
-
Show the Structure of VGG19 with BN
Click the button to show the structure of the VGG19 with BN model that is used by the following features.
-
Show Training/Validating Accuracy and Loss
Click the button to get a popup window with an image showing the training/validation accuracy and loss of the model, if you have trained one.
-
Inference
- Click the "load image" button to load the image you want your model to classify.
- Click the "Inference" button to get the inference result of your model.
Run
$ python VGG19.py
and type 1 to launch this part of the UI on its own, or type 2 to train your own model on the CIFAR-10 dataset.
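As a rough picture of what the Inference step has to do with a model trained on CIFAR-10, the sketch below turns a vector of 10 raw scores into a class name and a confidence. It is an assumption that the model outputs one raw score per class; the function names and the example scores are hypothetical, and only the CIFAR-10 class names are taken from the dataset itself.

```python
import math

# The 10 CIFAR-10 classes in their standard label order.
CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict(scores):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CIFAR10_CLASSES[best], probs[best]


# Hypothetical scores in which index 3 ("cat") is the largest.
example_scores = [0.1, -1.2, 0.3, 4.0, 0.0, 2.5, -0.7, 0.2, -2.0, 0.4]
label, confidence = predict(example_scores)
print(label, round(confidence, 3))
```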