Model Database

alsa edited this page Jul 18, 2019 · 42 revisions

ESRGAN Models

Upscaling - Drawings

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Manga109Attempt | kingdomakrillic | 4 |  | Manga | ? | 4 | ? | ? | 100 | Manga109 | RRDB_PSNR_x4 |
| Falcon Fanart | LyonHrt | 4 |  | Manga | 125,000 | 8 | 128 | ? | 3393 | Falcon Fanart | RRDB_PSNR_x4 |
| Comic Book | LyonHrt | 4 |  | Comic / Drawings | 115,000 | 8 | 128 | 592 | 1548 | Custom (Spiderman) | none |
| ad_test_tf | PRAGMA | 4 |  | Cartoon / Netflix | 5,000 | 16 | 128 | ? | 30000 | Custom (American Dad) | PSNRx4 |
| De-Toon | LyonHrt | 4 |  | Toon Shading / Sprite | 225,000 | 8 | 128 | 525 | 7,117 | Custom Toon-style photos | RRDB_PSNR_x4 |
| Unholy02 | DinJerr | 4 |  | Anime | ? | ? | ? | ? | ? | CG-Painted Anime | Several, see notes |
| Unholy03 | DinJerr | 4 |  | Anime | ? | ? | ? | ? | ? | CG-Painted Anime | Several, see notes |
| WaifuGAN v3 | DinJerr | 4 |  | Anime | 30000 | 2 | 128 | ? | 173 | CG-Painted Anime | Manga109v2 |

Description

Manga109Attempt is slightly blurry but performs well as a general-purpose upscaler. As LyonHrt said:

"I think it just has the right balance, a bit of paper grain some realism and bold colors it’s the closest your going to get to all purpose"

Falcon Fanart tries to improve upon it, with the goal of removing checkerboard patterns and dithering. It produces oil-paint-style shading with sharp lines.

The Comic Book model was trained on stills from the film Spider-Man: Into the Spider-Verse and gives images a comic-book crosshatch shading effect. Sample

The ad_test_tf model was designed for upscaling American Dad NTSC DVD frames (originally at 480p) to match the quality and style of Netflix's equivalent 1080p WEB-DL, which includes a slight desaturation of colors.

De-Toon is a model that does the opposite of tooning an image: it takes toon-style shading and detail and attempts to make them realistic. It is very sensitive and can be used on anything from small sprites to large images. An alternative version, which is less sharp, is also included.

Unholy02 and Unholy03 were created by interpolating a whole bunch of models about 30 times, mainly with DinJerr's own WaifuGAN model and RRDB_ESRGAN. They are intended for upscaling CG-painted anime images with light outlines and produce sharper, cleaner and more aggressive results than Manga109Attempt, but they may produce unnecessary outlines or details when faced with noise, so be wary of JPEGs.

WaifuGAN v3 is DinJerr's third attempt at training on a mostly anime dataset sourced from image boards. It is intended for upscaling CG-painted anime with variable outlines. Only PNGs were used, mainly images with brush strokes and gradients; textured images were avoided as much as possible. If the results are too generative, tone them down by interpolating with a softer model.
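
Several descriptions on this page suggest interpolating one model with another. As a minimal sketch of what that means, assuming interpolation is a per-parameter weighted average of two models with the same architecture: the function name `interpolate_models`, the parameter names and the toy values are hypothetical, and real ESRGAN `.pth` files hold tensors rather than plain Python lists.

```python
def interpolate_models(model_a, model_b, alpha):
    """Blend two models parameter by parameter.

    alpha = 1.0 returns model_a unchanged, alpha = 0.0 returns model_b.
    Both models must have the same parameter names and shapes.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if model_a.keys() != model_b.keys():
        raise ValueError("models must share the same parameter names")

    blended = {}
    for name, weights_a in model_a.items():
        weights_b = model_b[name]
        if len(weights_a) != len(weights_b):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Weighted average of each pair of corresponding weights.
        blended[name] = [
            alpha * a + (1.0 - alpha) * b
            for a, b in zip(weights_a, weights_b)
        ]
    return blended


# Toy stand-ins for a sharp and a soft model (hypothetical values).
sharp = {"conv.weight": [1.0, -2.0], "conv.bias": [0.5]}
soft = {"conv.weight": [0.0, 0.0], "conv.bias": [0.1]}

# Keep 75% of the sharp model and 25% of the soft one.
toned_down = interpolate_models(sharp, soft, alpha=0.75)
print(toned_down)
```

Lowering `alpha` moves the result closer to the softer model, which is the usual way to tone down an overly sharp or overly generative result.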

Upscaling - Realistic (photos, prerendered 3D, etc)

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 4xBox | buildist | 4 | GPLv3 | Realistic | 390,000 | 8 | 192 | 268 | 11,577 | Flickr2K+Div2K+OST | PSNR model from same data |

Description

4xBox was meant to be an improvement on the RRDB_ESRGAN_x4 model (comparison). It's also trained on photos, but with a much larger dataset which was downscaled with linear interpolation (box filter) instead of bicubic.
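
To illustrate what box-filter downscaling does, here is a minimal sketch, assuming a grayscale image stored as nested lists. `box_downscale` is a hypothetical helper, not the script used to build the 4xBox dataset. Each low-resolution pixel is the plain average of one block of high-resolution pixels.

```python
def box_downscale(image, factor):
    """Downscale a grayscale image by averaging factor x factor blocks.

    `image` is a list of rows of numbers; its height and width must be
    multiples of `factor`.
    """
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image size must be a multiple of the factor")

    area = factor * factor
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            # Sum every pixel in the current block, then average.
            block_sum = sum(
                image[y][x]
                for y in range(top, top + factor)
                for x in range(left, left + factor)
            )
            row.append(block_sum / area)
        result.append(row)
    return result


hr = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 100, 100],
    [255, 0, 100, 100],
]
lr = box_downscale(hr, 2)
print(lr)  # [[0.0, 255.0], [127.5, 100.0]]
```

For a 4x model, the low-resolution training images would be produced with `factor=4`.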

Upscaling - Characters and Faces

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Trixie | LyonHrt | 4 |  | Star Wars | 275K | 8 | 192 | 87 | 19,814 | ? | None |
| Face Focus | LyonHrt | 4 |  | Face De-blur | 275K | 8 | 192 | 455 | 4,157 | Custom (Faces) | RRDB_PSNR_x4 |
| Face | Twittman | 4 |  | Face Upscaling | 250K | 10 | 128 | 967 | 3,765 | Custom (Faces) | 4xESRGAN |

Description

Trixie was made to bring balance to the Force... and to upscale character textures for Star Wars games, including heroes, rebels, Sith and Imperials, plus a few main aliens. Why is it called Trixie? Because "Jar Jar's Big Adventure" would be too long a name. It also works well as a general-purpose upscaler for face textures and for basic Star Wars textures.

The Face Focus model was designed for slightly out-of-focus or blurred images of faces. It is aimed at faces and hair, but it can also improve other out-of-focus images; as always, just try it.

Upscaling - Specialized

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Map | LyonHrt | 4 |  | Map / Old Paper with text | 120,000 | 8 | 192 | 361 | 2311 | Custom (Scans) | none |
| Forest | LyonHrt | 4 |  | Wood / Leaves | 160,000 | 8 | 192 | 590 | 2.2K | Custom (?) | none |
| Ground | ZaphodBreeblebox | 4 |  | Ground Textures | 305,000 | ? | 128 | ? | ? | Custom (Ground textures Google) | ? |
| Misc | alsa64 | 4 | GNU GPLv3 | Surface Textures | 220,000 | 32 | 128 | 338 | 20,797 | Custom (Photos) | Manga109Attempt |
| Armory | alsa64 | 4 | GNU GPLv3 | Armor, Clothes and Weapons | 80,000 | 26 | 128 | 2,600 | 800 | Skyrim Mod textures | Manga109Attempt |
| Wood | Laeris | 4 |  | Wood | 75,000 | ? | ? | ? | ? | ? | ? |
| Skyrim Diffuse | Deorder | 4 |  | Skyrim Diffuse Textures | 105,000 | ? | 128 | ? | ? | Skyrim Diffuse Textures | ? |
| Xbrz | LyonHrt | 4 |  | Xbrz style pixel art upscaler | 90,000 | 8 | 128 | 368 | 1897 | custom xbrz up-scaled | RRDB_PSNR_x4 |
| ScaleNX | LyonHrt | 4 |  | Scalenx style pixel art upscaler | 80,000 | 8 | 128 | 599 | 1,070 | custom scalenx up-scaled from retroarch shader | RRDB_PSNR_x4 |
| Xbrz+DD | LyonHrt | 4 |  | Xbrz style pixel art upscaler with de-dithering | 90,000 | 8 | 128 | 470 | 1,523 | custom de-dithered xbrz | xbrz |

Description

The Map model was trained on maps, old documents, papers and various styles of typefaces/fonts. It is based on a dataset contributed by alsa64. Sample

The Forest model is focused on trees, leaves, bark and stone. It can be applied twice (double upscaling) for even more detail. Sample

The Ground model was trained on various pictures of stones, dirt and grass found using Google's image search.

The Misc model was trained on various pictures I shot myself, including bricks, stone, dirt, grass, plants, wood, bark, metal and a few others.

The Armory model was trained on modded textures from Skyrim, including clothing, armor and weapons. Leather, canvas and metal should all work. The results may be too sharp, in which case interpolate with a softer model.

The Wood model was trained for Skyrim by Laeris.

The Skyrim Diffuse model is meant to be used with Skyrim's diffuse textures. It is a bit too sharp, so I recommend interpolating it with the RRDB_ESRGAN_x4 model or the Manga109Attempt model. Deorder's Skyrim Model Google Drive contains an already interpolated version.

Normal Map Upscaling

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Normal Maps | alsa64 | 4 | GNU GPLv3 | Normal Maps | 36,000 | 27 | 128 | ? | ? | Custom (Normal Maps) | Normal Maps - Skyrim artifacted |
| Normal Maps - Skyrim artifacted | Deorder | 4 |  | Skyrim Normal Maps | 145,000 | ? | 128 | ? | ? | Skyrim Normal Maps | ? |

Description

The first model (Normal Maps) is based on the second one; it was trained with a higher learning rate and very high n_workers and batch_size values. It is meant to replace the old Normal Map model from Deorder, but without introducing BC1 compression artifacts into your normal maps.

The second model (Normal Maps - Skyrim artifacted) was trained on Skyrim's normal maps, compression artifacts included, so it will have to be redone.

Grayscale Upscaling

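| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Skyrim Alpha | Deorder | 4 |  | Alpha Channel | 105,000 | ? | 128 | ? | ? | Alpha Channels from Skyrim | ? |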

Description

Skyrim Alpha was trained to upscale grayscale images, such as specular maps or alpha channels.

Artifact Removal

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| BC1 take 1 | alsa64 | 1 | GNU GPLv3 | BC1 Compression | 100,000 | 2 | 128 | 111 | 1,800 | Custom (Photos) | Failed Attempts |
| BC1 take 2 | alsa64 | 1 | GNU GPLv3 | BC1 Compression | 260,850 | 2 | 128 | 106 | 4.7K | Custom (Photos / Manga) | JPG (0-20%) |
| BC1 take 3 Noise Aggressive | alsa64 | 1 | GNU GPLv3 | BC1 Compression | 400,000 | 2 | 128 | 26 | 28,985 | Custom (just about everything) | BC1 take 2 |
| JPG (0-20%) | alsa64 | 1 | GNU GPLv3 | JPG compressed Images | 178,178 | 2 | 128 | 52 | 6230 | Custom (Photos / Manga) | JPG (20-40%) |
| JPG (20-40%) | alsa64 | 1 | GNU GPLv3 | JPG compressed Images | 140,798 | 2 | 128 | 42 | 6230 | Custom (Photos / Manga) | JPG (40-60%) |
| JPG (40-60%) | alsa64 | 1 | GNU GPLv3 | JPG compressed Images | 100,000 | 2 | 128 | 31 | ~6.5K | Custom (Photos / Manga) | JPG (60-80%) |
| JPG (60-80%) | alsa64 | 1 | GNU GPLv3 | JPG compressed Images | 91,000 | 2 | 128 | 27 | ~6.5K | Custom (Photos / Manga) | JPG (80-100%) |
| JPG (80-100%) | alsa64 | 1 | GNU GPLv3 | JPG compressed Images | 162,000 | 2 | 128 | 51 | ~6.5K | Custom (Photos / Manga) | BC1 take 1 |
| JPG PlusULTRA | twittman | 1 |  | JPG compressed Images | 130,000 | 1 | ? | 150 | 937 | Custom (Manga) | Failed Attempts |
| Cinepak | twittman | 1 |  | Cinepak, msvideo1 and Roq | 200,000 | 1 | 128 | 21 | ~8K | Custom (Manga) | none |
| DeDither | alsa64 | 1 | GNU GPLv3 | Dithered Images | 126,900 | 2 | 128 | 53 | 4,700 | Custom (Photos / Manga) | JPG (0-20%) |
| dither_4x_flickr2k_esrgan, dither_4x_flickr2k_psnr | buildist | 4 |  | Ordered dithering | 280,000 | 16 | 128 | ? | 2640, ~8k | Flickr2K, OST dithered with GIMP | none |
| DeSharpen | loinne | 1 |  | Oversharpened Images | 310,000 | 1 | 128 | 48 | ~3K | Custom (?) | Failed Attempts |
| AntiAliasing | twittman | 1 |  | Images with pixelated edges | 200,000 | 1 | 128 | 440 | 656 | Custom (?) | none |

Description

Models to remove compression artifacts.

The BC1 take 2 model is better than my first BC1 model (BC1 take 1). It may also handle edges somewhat better and reduce tone differences between the input and the output. The dataset was based on the JPG dataset, slightly rebalanced to contain fewer manga-styled images. Note that BC1 compression is also used for the RGB channels in BC3. BC1 is also known as DXT1 and BC3 as DXT5. Do not use any of these models on uncompressed textures.

JPG images are compressed with a quality percentage between 0 and 100, so choose the model that matches how heavily your JPEGs are compressed. You can use ImageMagick to estimate the quality percentage, but keep in mind that the estimate might be wrong, since the image might have been resaved.
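
As a rough sketch of that selection step, the helper below maps an estimated quality value to one of the JPG model names from the table. `pick_jpg_model` is a hypothetical function, and sending boundary values such as 20 or 40 to the lower range is an assumption, since the ranges overlap at their edges. ImageMagick's `identify -format "%Q" image.jpg` is one way to obtain the estimate.

```python
# Upper bound of each quality range, paired with the model name from the table.
JPG_MODELS = [
    (20, "JPG (0-20%)"),
    (40, "JPG (20-40%)"),
    (60, "JPG (40-60%)"),
    (80, "JPG (60-80%)"),
    (100, "JPG (80-100%)"),
]


def pick_jpg_model(quality):
    """Return the name of the JPG model whose range contains `quality`.

    Assumption: a value on a shared boundary (20, 40, ...) goes to the
    lower range; the source does not say which model to prefer there.
    """
    if not 0 <= quality <= 100:
        raise ValueError("quality must be between 0 and 100")
    for upper_bound, model_name in JPG_MODELS:
        if quality <= upper_bound:
            return model_name


print(pick_jpg_model(35))  # JPG (20-40%)
print(pick_jpg_model(92))  # JPG (80-100%)
```

Because the estimate can be off for resaved images, it may be worth also trying the neighbouring model.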

The Cinepak model removes compression artifacts produced by older video codecs such as Cinepak, msvideo1 and Roq.

Dithering is an older technique that reduces the number of colors in an image. If your image has few colors or shows banding, try the DeDither model.

Ordered dithering is a less common form of dithering that results in distinctive checkerboard/crosshatch patterns, which are misinterpreted as texture by models not trained on it. It's often used on GIFs because the pattern is stable between frames. For the 4x model, start with the ESRGAN model, and interpolate with the PSNR model if the result is too sharp.
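
To show where the checkerboard pattern comes from, here is a minimal sketch of ordered dithering to black and white with a 2x2 Bayer threshold matrix. It assumes a grayscale image with values between 0.0 and 1.0 stored as nested lists; `ordered_dither` is a hypothetical helper and not necessarily what GIMP does internally.

```python
# 2x2 Bayer threshold matrix (values 0..3).
BAYER_2X2 = [
    [0, 2],
    [3, 1],
]


def ordered_dither(image):
    """Reduce a grayscale image (values 0.0..1.0) to black and white.

    Each pixel is compared against a threshold that depends only on its
    position, so the same pattern repeats across the image and across frames.
    """
    result = []
    for y, row in enumerate(image):
        out_row = []
        for x, value in enumerate(row):
            # Map the matrix entry to a threshold strictly between 0 and 1.
            threshold = (BAYER_2X2[y % 2][x % 2] + 0.5) / 4
            out_row.append(1 if value > threshold else 0)
        result.append(out_row)
    return result


flat_gray = [[0.5] * 4 for _ in range(4)]
for row in ordered_dither(flat_gray):
    print(row)
# [1, 0, 1, 0]
# [0, 1, 0, 1]
# [1, 0, 1, 0]
# [0, 1, 0, 1]
```

A flat mid-gray area turns into a regular checkerboard, which is the kind of pattern a model not trained on dithered input tends to treat as texture.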

The DeSharpen model was made for the rare cases where an image has been ruined by oversharpening noise, e.g. game textures or badly exported photos. If your image is not oversharpened, the model should not harm it and leaves it as is. In theory, the model knows when to activate and when to skip, and it can also remove artifacts when only parts of the image are oversharpened, for example in an image composed of several combined images of which one has sharpening noise. It was made to remove sharpening noise, particularly noise produced by Photoshop's "Sharpen" or "Sharpen More" filters or by ImageMagick's -sharpen option, with Radius and Sigma parameters ranging from a subtle 0.3x0.5 to something as extreme as about 8x2.

AntiAliasing is for smoothing jagged edges in images and textures.

Pretrained models for different scales

| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1xESRGAN | victorca25 | 1 |  | Pretrained model |  | 1 | 128 |  | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 2xESRGAN | victorca25 | 2 |  | Pretrained model |  | 4 | 128 |  | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 4xESRGAN | victorca25 | 4 |  | Pretrained model |  | 8 | 128 |  | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 8xESRGAN | victorca25 | 8 |  | Pretrained model |  | 16 | 128 |  | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |
| 16xESRGAN | victorca25 | 16 |  | Pretrained model |  | 16 | 128 |  | 65k | Combination of DIV2K, Flickr2K and GOPRO | RRDB_ESRGAN_x4.pth |

Description

These models were transformed from the original RRDB_ESRGAN_x4.pth model into the other scales, in order to be used as pretrained models for new models in those scales.

More information can be found here.

Others

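| Name | Author | Scale | License | Purpose | Iterations | Batch Size | HR Size | Epoch | Dataset Size | Dataset | Pretrained Model |
|---|---|---|---|---|---|---|---|---|---|---|---|
| normal generator | LyonHrt | 1 |  | Diffuse to Normal | 215,000 | 1 | 128 | 45 | 4,536 | Custom (?) | none |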

Description

The model was trained on pairs of diffuse textures and normal maps.

Other Sources:

Cartoon Painted Models

License:

GNU GPLv3:

  • You can't sell the model under that license
  • If you modify, interpolate or use the model as a pretrained model for your own model and share results of your resulting model, it will have to be under the same license, meaning that you can't sell it.
  • You have to state that you used the model and its author for your results.
  • You have to state any changes you made to the model.
  • There are other points, but those are the main ones.

In addition to that, all models by:

  • alsa/alsa64

have the following additional restriction:

  • You can't sell results generated with a model using that license.