From the image examples you provided, it can be seen that an image can be perfectly restored to the original image through another image (convolutional block).
However, it is currently only used in the field of image super-resolution.
The encoding and decoding of image features is the most widely used field in images.
Therefore, is it possible to use your network to obtain two features during image encoding, and then use these two features to restore the network during final decoding? In this way, the internal processing features of the large model can be supervised. Can you provide such an encoding and decoding network structure?
If possible, the practicality will rise several levels at once.
@cszn