# Variational Autoencoders for MapleStory Character Generation
Author: Jin Yeom (jinyeom@utexas.edu)

In this notebook, we're going to build a **variational autoencoder** (VAE) for generating [MapleStory](http://maplestory.nexon.net/landing/) characters. There was no particular reason for choosing this data (I collected the dataset myself, thinking it would be cool to generate MapleStory characters); the main goal of this project is to learn how a VAE works as a generative model.

In [6]:
import time
import pathlib
from uuid import uuid4
from urllib.request import Request, urlopen
from urllib.error import HTTPError
from bs4 import BeautifulSoup
from tqdm import tqdm_notebook as tqdm
import numpy as np
import torch
from torch import nn
from torch.nn import functional as F
import torchvision
from torchvision import transforms
from torchvision import datasets
from matplotlib import image as mpimg
from matplotlib import pyplot as plt

## Collecting MapleStory image data

I won't share the images that I collected, as they are not mine, but I seem to have accidentally left behind the script below, which scrapes the same webpages. How careless of me.

In [None]:
# NOTE: do NOT run this script unless it's necessary!

try:
    # mkdir datasets/maple-vae if it doesn't exist yet
    pathlib.Path("datasets/maple-vae").mkdir(parents=True)
except FileExistsError:
    print("datasets/maple-vae already exists, moving on...")

# URL template for scraping MapleStory avatars from the ranking page
url_tmpl = ("http://maplestory.nexon.net/rankings/overall-ranking/"
            "legendary?rebootIndex=0&pageIndex={}#ranking")
for page_index in tqdm(range(1, 70000, 5)):
    url = url_tmpl.format(page_index)
    req = Request(url, headers={"User-Agent": "Mozilla/5.0"}) # pretend you're human!
    soup = BeautifulSoup(urlopen(req), "html.parser")
    for av in soup.find_all("img", {"class": "avatar"}):
        for i in range(5): # try 5 times
            img_req = Request(av["src"], headers={"User-Agent": "Mozilla/5.0"})
            try:
                img = urlopen(img_req).read()
                with open(f"datasets/maple-vae/{uuid4().hex}.png", "wb") as f:
                    f.write(img)
                break
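            except ConnectionError:
                now = time.strftime("%H:%M:%S")
                print(f"[{now}] Connection error, retrying in 3 seconds...", flush=True)
                time.sleep(3)
            except HTTPError:
                now = time.strftime("%H:%M:%S")
                print(f"[{now}] HTTP error, skipping...", flush=True)
                break
        else:
            # all 5 attempts failed with connection errors
            raise RuntimeError(f"failed to download {av['src']} after 5 attempts")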
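    # page_index takes the values 1, 6, 11, ..., so `page_index % 500` is never 0;
    # offset by 1 to pause once every 100 requested pages
    if (page_index - 1) % 500 == 0:
        time.sleep(5)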
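
The retry logic in the scraper (try up to 5 times, wait on connection errors, skip on HTTP errors) can also be pulled out into a small helper. The following is only a sketch: `fetch_with_retry` and its parameters are hypothetical names that do not appear in the original script, and the `sleep` argument exists only so the waiting can be swapped out when experimenting.

```python
import time
from urllib.error import HTTPError


def fetch_with_retry(fetch, retries=5, delay=3.0, sleep=time.sleep):
    """Call `fetch()` up to `retries` times (hypothetical helper).

    Returns whatever `fetch()` returns, or None if the server answers with
    an HTTP error (the "skip" branch of the scraper). Raises RuntimeError
    if every attempt fails with a connection error.
    """
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except HTTPError:
            # the server responded, but with an error status: give up on this item
            return None
        except ConnectionError:
            # transient network problem: wait, then try again
            if attempt < retries:
                sleep(delay)
    raise RuntimeError(f"giving up after {retries} failed attempts")
```

Inside the scraping loop, this could be used roughly as `img = fetch_with_retry(lambda: urlopen(img_req).read())`, writing the file only when `img` is not `None`.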

## Dataset

In [9]:
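def imshow(img):
    # undo normalization: map pixel values from [-1, 1] back to [0, 1]
    img = img / 2 + 0.5
    npimg = img.numpy()
    plt.figure(figsize=(16, 8))
    plt.title("Sampled MapleStory characters")
    # convert channels-first (C, H, W) to channels-last (H, W, C) for matplotlib
    plt.imshow(np.transpose(npimg, (1, 2, 0)))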

In [None]:
images, labels = next(iter(train_loader))
imshow(torchvision.utils.make_grid(images))
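
The `img / 2 + 0.5` step in `imshow` makes sense if the images were normalized with a per-channel mean and standard deviation of 0.5. That preprocessing is an assumption here, since it is not shown in this notebook. Under that assumption, the small NumPy sketch below checks that the arithmetic really is the inverse and shows the axis reordering `plt.imshow` needs. The names `normalize` and `unnormalize` are illustrative only.

```python
import numpy as np

# Assumed preprocessing (not shown in the notebook): pixels scaled to [0, 1],
# then normalized with mean 0.5 and std 0.5.
MEAN, STD = 0.5, 0.5


def normalize(x):
    return (x - MEAN) / STD


def unnormalize(x):
    # same arithmetic as `img / 2 + 0.5` in imshow
    return x * STD + MEAN


pixels = np.array([0.0, 0.25, 0.5, 1.0])
normalized = normalize(pixels)      # [-1.0, -0.5, 0.0, 1.0]
restored = unnormalize(normalized)  # back to the original values

# An image grid is channels-first (C, H, W); matplotlib expects (H, W, C).
grid = np.zeros((3, 32, 64))
hwc = np.transpose(grid, (1, 2, 0))
print(normalized, restored, hwc.shape)
```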

## Model

![VAE](images/VAE.png)