## Azure ML GAN Experiment

To use this notebook, download the `config.json` file from your Azure ML Workspace and place it in this folder. This lets us obtain the workspace reference right away:

In [2]:
dataset = r'C:\Marcus\Important Docs\MLSA\WikiArt'

In [3]:
from azureml.core import Workspace

try:
    ws = Workspace.from_config()
    print(ws.name, ws.location, ws.resource_group, ws.location, sep='\t')
    print('Library configuration succeeded')
except Exception as e:
    # Report why loading failed (for example, a missing or malformed config.json)
    print('Workspace not found:', e)

Performing interactive authentication. Please follow the instructions on the terminal.


The default web browser has been opened at https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize. Please continue the login in the web browser. If no web browser is available or if the web browser fails to open, use device code flow with `az login --use-device-code`.


Interactive authentication successfully completed.
GAN-art-generator	southeastasia	Azure-ML-Workshop	southeastasia
Library configuration succeeded
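`Workspace.from_config()` reads the workspace coordinates from `config.json`. As a minimal sketch, the check below verifies that a config file carries the three keys this file normally contains (`subscription_id`, `resource_group`, `workspace_name`). The helper name `missing_workspace_keys` and the sample values are hypothetical and used for illustration only.

```python
import json
import os
import tempfile

# Keys that the config.json downloaded from the Azure portal normally contains.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")

def missing_workspace_keys(path):
    """Return the required keys that are absent or empty in the given config file."""
    with open(path, encoding="utf-8") as f:
        config = json.load(f)
    return [key for key in REQUIRED_KEYS if not config.get(key)]

# Demonstration with a throwaway file; all values are placeholders.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "config.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump({"subscription_id": "00000000-0000-0000-0000-000000000000",
                   "resource_group": "Azure-ML-Workshop"}, f)
    missing = missing_workspace_keys(sample_path)
    print(missing)
```

Calling `missing_workspace_keys('config.json')` in the notebook folder should return an empty list when the downloaded file is complete.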


Next, make sure the compute cluster exists. If it does not, we create it programmatically:

In [4]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Choose a name for your CPU cluster
cluster_name = "GAN"

# Reuse the cluster if it already exists; otherwise create it
try:
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing cluster, use it.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_D3_V2',
                                                           vm_priority='lowpriority',
                                                           min_nodes=1,
                                                           max_nodes=4)
    cluster = ComputeTarget.create(ws, cluster_name, compute_config)

cluster.wait_for_completion(show_output=True)

Found existing cluster, use it.

Running


Now upload the images dataset into the Azure ML Workspace:

In [6]:
from azureml.core import Dataset
from azureml.data.datapath import DataPath
ds = ws.get_default_datastore()
Dataset.File.upload_directory(src_dir=dataset, target=DataPath(ds,"gan_data"), overwrite = True, show_progress=True)

Validating arguments.
Arguments validated.
'overwrite' is set to True. Any file already present in the target will be overwritten.
Uploading files from 'C:/Marcus/Important Docs/MLSA/WikiArt' to 'gan_data'
Creating new dataset


{
  "source": [
    "('workspaceblobstore', '/gan_data')"
  ],
  "definition": [
    "GetDatastoreFiles"
  ]
}
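Before uploading, it can help to check what the local folder actually contains, so that stray non-image files do not end up in the datastore. The sketch below counts files by extension; the helper name `count_by_extension` is hypothetical, and the temporary folder only stands in for the local dataset directory.

```python
import tempfile
from collections import Counter
from pathlib import Path

def count_by_extension(src_dir):
    """Count files under src_dir (recursively), grouped by lowercase extension."""
    return Counter(p.suffix.lower() for p in Path(src_dir).rglob("*") if p.is_file())

# Demonstration on a temporary folder standing in for the local dataset directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("a.jpg", "b.JPG", "c.png", "notes.txt"):
        (Path(tmp) / name).write_bytes(b"")
    counts = count_by_extension(tmp)
    print(dict(counts))
```

In this notebook the same function could be called as `count_by_extension(dataset)`, using the `dataset` path defined in the first cell.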

Now let us create the training script:

In [7]:
%%writefile train_gan.py
# KeraGAN trainer script

import argparse
import keragan
import keras
import os
import glob
from azureml.core.run import Run
import matplotlib.pyplot as plt

print("KeraGAN Trainer, version {}".format(keragan.__version__))

run = Run.get_context()

parser = argparse.ArgumentParser(description="KeraGAN Trainer")

parser.add_argument("--path",help="Azure ML Datastore and Dataset dir")
parser.add_argument("--size",help="Image size to use", default=512, type=int)
parser.add_argument("--aspect_variance",help="Allowed aspect variance", default=0.5, type=float)
parser.add_argument("--model_path",help="Path to use for saving models", default='models')
parser.add_argument("--samples_path",help="Path to use for saving samples", default='samples')
parser.add_argument("--save_npy_path",help="Filename to save cached dataset for faster loading")
parser.add_argument("--limit",help="Limit # of images to use",type=int,default=None)
parser.add_argument("--batch_size",help="Minibatch size to use",type=int,default=128)
parser.add_argument("--save_interval",help="Epochs between saving models",type=int,default=100)
parser.add_argument("--save_img_interval",help="Epochs between generating image samples",type=int,default=100)
parser.add_argument("--print_interval",help="Epochs between printing",type=int,default=10)
parser.add_argument("--sample_images",help="View image sample",action='store_const',default=False,const=True)
parser.add_argument("--no_samples",help="Number of sample images to generate during training",type=int,default=10)
parser.add_argument("--latent_dim",help="Dimension of latent space",type=int,default=256)
parser.add_argument("--ignore_smaller",help="Ignore images smaller than required size",action='store_const',default=False,const=True)
parser.add_argument("--crop",help="Crop images to desired aspect ratio",action='store_const',default=False,const=True)
parser.add_argument("--epochs",help="Number of epochs to train",type=int,default=100)
parser.add_argument("--lr",help="Learning rate",type=float,default=0.0001)
args = parser.parse_args()

args.height = args.size
args.width = args.size
args.optimizer = None

dcgan_args = {
    'width': args.width,
    'height': args.height,
    'model_path': args.model_path,
    'samples_path': args.samples_path,
    'optimizer': args.optimizer,
    'lr': args.lr,
    'latent_dim': args.latent_dim
}
gan = keragan.DCGAN(**dcgan_args)

image_dataset_args = {
    'path': args.path,
    'height': args.height,
    'width': args.width,
    'aspect_variance': args.aspect_variance,
    'save_npy_path': args.save_npy_path,
    'ignore_smaller': args.ignore_smaller,
    'limit': args.limit,
    'crop': args.crop
}
imsrc = keragan.ImageDataset(**image_dataset_args)
imsrc.load()
print(imsrc.data,imsrc.data.shape[0])
train = keragan.GANTrainer(image_dataset=imsrc,gan=gan,args=args)

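def callbk(tr):
    # Every 20 epochs, log a row of three generated samples to the Azure ML run
    if tr.gan.epoch % 20 == 0:
        res = tr.gan.sample_images(n=3)
        fig,ax = plt.subplots(1,len(res))
        for i,v in enumerate(res):
            ax[i].imshow(v[0])
        run.log_image("Sample",plot=plt)
        plt.close(fig)  # free the figure so figures do not accumulate over a long run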

train.train(callbk)


Writing train_gan.py
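To see how the script's command-line options behave, here is a minimal sketch that copies four of the options from `train_gan.py` into a reduced parser and parses an explicit argument list. The flag options use `action='store_const'` with `const=True`, which acts like `store_true`: the value is `False` unless the flag is present. The argument values passed below are arbitrary examples.

```python
import argparse

# A reduced copy of the parser from train_gan.py (only four of its options).
parser = argparse.ArgumentParser(description="KeraGAN Trainer (subset)")
parser.add_argument("--size", help="Image size to use", default=512, type=int)
parser.add_argument("--limit", help="Limit # of images to use", type=int, default=None)
parser.add_argument("--crop", help="Crop images to desired aspect ratio",
                    action="store_const", default=False, const=True)
parser.add_argument("--lr", help="Learning rate", type=float, default=0.0001)

# Parse an explicit argument list instead of sys.argv.
args = parser.parse_args(["--size", "256", "--crop"])
print(args.size, args.limit, args.crop, args.lr)
```

Options that are not passed keep their defaults, so `--limit` stays `None` and `--lr` stays `0.0001` here.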


In [8]:
from azureml.core import Experiment, ScriptRunConfig, Environment, Dataset, Datastore
from azureml.core.conda_dependencies import CondaDependencies

# Define the environment
environment = Environment(name='keragan-env')
environment.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['keras','tensorflow','tqdm','matplotlib'],
    pip_packages=['azureml-core','azureml-defaults','imutils', 'opencv-python-headless','git+https://github.com/Marcusng01/keragan@506773f62c36c08a6efb7616181bc010526abf43']
)

gan_data = Dataset.File.from_files(path=(ds, 'gan_data'))
script_params = [
    '--path', gan_data.as_mount(),
    '--model_path', './outputs/models',
    '--samples_path', './outputs/samples',
    '--batch_size', 32,
    '--size', 512,
    '--limit', 3000,
    '--epochs', 10000
]

# Create a ScriptRunConfig
script_run_config = ScriptRunConfig(
    source_directory='.',
    script='train_gan.py',
    arguments=script_params,
    compute_target=cluster,
    environment=environment,
)

# Create an experiment
experiment_name = 'KeraGAN'
exp = Experiment(workspace=ws, name=experiment_name)

# Submit the run
run = exp.submit(config=script_run_config)


### Downloading resulting images

After the experiment has completed, you can download the resulting images to your local machine. If the experiment was long-running and the notebook session was lost, you can re-create the `run` object from its run id (available in the Azure portal). Otherwise, reuse the `run` variable from above and skip the cell below.

In [10]:
# This cell can be skipped if the `run` variable from above is still defined
from azureml.core import Run
run = Run(experiment=exp,run_id='KeraGAN_1584048041337') # <-- provide run id from Azure Portal here
run.get_file_names()[:10]

ServiceException: ServiceException:
	Code: 404
	Message: (UserError) Run KeraGAN_1584048041337 was not found

The following command downloads all generated samples (which our script placed in the `outputs/samples` directory) to the local machine. As a result, an `outputs/samples` directory containing all the files appears in the current directory.

In [30]:
run.download_files(prefix='outputs/samples')

## Getting the Model and Inferring New Images

Once we have the generator model, we can easily infer more images. To do that, let's find the name of the latest model (the one with the highest epoch number) and download it:

In [3]:
prefix = 'outputs/models/gen_'
fnames = list(filter(lambda x : x.startswith(prefix), run.get_file_names()))
print(fnames[:5])
# The epoch number sits between the prefix and the file extension
no = max(map(lambda x: int(x[len(prefix):x.find('.')]), fnames))
fname = 'outputs/models/gen_{}.h5'.format(no)
fname_wout_path = fname[fname.rfind('/')+1:]
run.download_file(fname)
print(fname_wout_path)

NameError: name 'run' is not defined
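The selection of the latest generator can be tried out without a live run. The sketch below applies the same idea (take the `gen_<epoch>.h5` file with the highest epoch) to a hypothetical file listing, using a regular expression rather than string slicing. The helper name `latest_generator` and every sample file name other than the `gen_{}.h5` pattern are assumptions made for illustration.

```python
import re

# Matches generator checkpoints such as outputs/models/gen_1200.h5 and captures the epoch.
GEN_PATTERN = re.compile(r"^outputs/models/gen_(\d+)\.h5$")

def latest_generator(file_names):
    """Return (epoch, path) for the generator file with the highest epoch, or None."""
    found = []
    for name in file_names:
        match = GEN_PATTERN.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return max(found) if found else None

# Hypothetical listing, standing in for run.get_file_names().
sample_names = [
    "outputs/models/gen_100.h5",
    "outputs/models/dis_900.h5",
    "outputs/models/gen_900.h5",
    "outputs/samples/900.png",
]
latest = latest_generator(sample_names)
print(latest)
```

Returning `None` for an empty listing avoids the `ValueError` that `max` raises on an empty sequence, which is what would happen if the run produced no generator files yet.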

Now let's load the model in Keras and find the size of the latent noise vector (it is equal to the input size of the network):

In [1]:
import keras
model = keras.models.load_model(fname_wout_path, compile=False)
latent_dim=model.layers[0].input.shape[1]



NameError: name 'fname_wout_path' is not defined

Now let's generate 10 random noise vectors and call the model to produce 10 random images. The output of the network is in the range $[-1,1]$, so we need to scale it linearly to the range $[0,1]$ for `matplotlib` to display it correctly:

In [None]:
import numpy as np
vec = np.random.normal(0,1,(10,latent_dim))
res = model.predict(vec)
res = (res+1.0)/2

In [2]:
import matplotlib.pyplot as plt
fig,ax = plt.subplots(1,10,figsize=(15,10))
for i in range(10):
    ax[i].imshow(res[i])
    ax[i].axis('off')
plt.show()

NameError: name 'res' is not defined
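The rescaling step can be checked without the trained model. In the sketch below, a `tanh` of random noise stands in for the generator output (an assumption: it only guarantees values in $[-1,1]$, it is not the real model), and the array shape is arbitrary. The mapping is the same as in the notebook, $x \mapsto (x+1)/2$.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for model.predict(vec): tanh keeps every value inside [-1, 1].
fake_output = np.tanh(rng.normal(0, 1, (10, 8, 8, 3)))

# Same linear rescaling as in the notebook: -1 maps to 0, and 1 maps to 1.
scaled = (fake_output + 1.0) / 2

print(scaled.min() >= 0.0, scaled.max() <= 1.0)
```

Skipping this step would leave negative values in the array, which `imshow` clips for floating-point RGB data, so the images would come out too dark.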