<a href="https://colab.research.google.com/github/DiGyt/snippets/blob/master/SGAN_BlogPost_group_21.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/new_york_3D_20000ep_300_1000.png">

# Blog Post: Implementing ANNs with TensorFlow
# **Texture Synthesis with Spatial Generative Adversarial Networks**

A replication of Jetchev, Bergmann & Vollgraf (2016) by:

---

Lucas Feldmann

Mail: lufeldmann@uni-osnabrueck.de

Student ID: 983 205

---

Dirk Gütlin

Mail: dguetlin@uni-osnabrueck.de

Student ID: 983 692

---

Josefine Aimée Zerbe

Mail: jzerbe@uni-osnabrueck.de

Student ID: 960 230

# Abstract

Since their introduction in 2014 by Ian J. Goodfellow and colleagues, Generative Adversarial Networks (GANs) have had a strong impact on the landscape of neural networks. Briefly described, the idea of GANs is to boost the performance of a generator network by letting it compete against a discriminator network, thus enabling the generator to bring its output closer to the underlying statistical distribution of the data set.   

In this blog post, we describe our replication of the spatial GAN (SGAN) developed by Jetchev, Bergmann and Vollgraf in 2016, together with the results we obtained. The SGAN is specialized in texture synthesis, generating new compositions of visual patterns from image data. One major advantage of SGANs is the ability to produce images of variable output sizes. 

In our [Colab script](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=YTjV2VpqjBJI), you can investigate our reconstruction and training of the spatial GAN in TensorFlow. 

# Introduction

In recent years, deep-learning-based methods for image classification and image analysis have progressed rapidly. In general, the application of deep neural networks to the statistical classification of a data set can be broadly divided into two main approaches (Jebara, 2004).   
In the first approach, the objective is to discriminate a target from a received input. Such models are called discriminative models and are a popular solution for classification or regression tasks.  
The second approach aims to classify or generate new data from the observed data, thereby trying to model the underlying stochastic distribution of the input as well as possible. Both kinds of networks can be trained to either discriminate or generate visual scenery, auditory patterns, or other forms, symbols, and figures of input data.  

In the last two decades, discriminative models in particular have attracted great interest from researchers in the field of machine learning because of their exceptional success in solving classification and regression problems, reaching results that match or even exceed human performance (Hinton et al., 2012; Krizhevsky et al., 2012). Further, competitions like the ImageNet Challenge (ILSVRC) have spurred the development of ever better networks for discrimination tasks (Russakovsky et al., 2015).   
In comparison, the implementation of generative models has proven to be quite difficult (Goodfellow et al., 2014). Grasping the underlying distribution of a data set and producing a new, previously unseen instance introduces a range of limitations and problems. Thus, generative models have not yet had the same impact as their discriminative counterparts. However, Goodfellow and colleagues (2014) made an important advance. Their paper introduced a new technique for training generative neural networks in which an adversarial approach enhances the learning process of the generative model. The generative model aims to produce new instances of data that can fool a second, adversarial model. In turn, this competing discriminator model tries to distinguish the newly generated data from the example data. Thanks to the feedback provided by the discriminative network, the Generative Adversarial Network (GAN) approach brought about a large improvement in generative modeling in general.   
   
The concept of GANs seems challenging. At the same time, it promises potential for development in the near future. Hence, the objective of this project was to take a closer look at the topic by replicating a specialized form of a Generative Adversarial Network - that is, a Spatial Generative Adversarial Network (SGAN). First introduced by the Zalando research group around Jetchev, Bergmann and Vollgraf (2016), spatial GANs aim at generating new image data of visual patterns in textures. One example used in the original paper and reproduced by our group is a bird's-eye view of the city of Barcelona, where a clear, recurring pattern in the architecture and infrastructure of buildings and streets can be observed. The SGAN tries to find the underlying structure of the pattern and recreates it in new instances.  
For our project, the paper of Jetchev and colleagues served as a blueprint for our network. Their code and additional information on their approach can be found [here](https://github.com/zalandoresearch/spatial_gan).


# Theoretical Background

Before dealing with the actual implementation of a spatial approach to generative modeling, a short theoretical background is given about the general idea behind GANs and SGANs and the problem of texture synthesis.

As described in the introduction, on the most basic level a generative network tries to model the underlying distribution that is assumed to have produced the observed data samples. Once it approximates this distribution, the model can classify received samples or create new samples accordingly.

Independently of generative networks, adversarial approaches have been used in machine learning before. For example, adversarial search (such as minimax search) is often applied in game-playing algorithms, where two players compete against each other (see, for example, the tree search used in AlphaGo (Silver et al., 2016)). In the setting of GANs, the term ‘adversarial’ describes the optimization process of the two networks which are competing against each other, like two players in a minimax game (Goodfellow et al., 2014). 
The adversarial setup of GANs should not be confused with adversarial machine learning, where the aim is to deceive a machine learning system into accepting corrupted or altered input rather than to create new, original instances.

Implementing a spatial GAN for textures requires an understanding of the task of texture synthesis (Wei et al., 2009). In this context, the term ‘texture’ describes a visual pattern that is repeated with a varying degree of randomness on a (natural) surface or image. Hence, texture synthesis aims to generate new instances of the same pattern by learning the underlying structure of the pattern from a finite set of samples. The resulting structure can be viewed as a new sample from the same distribution from which the previous samples were taken. In our replication, different categories of texture samples were fed to our SGAN with varying degrees of success. 




## GANs

Generative Adversarial Networks were first introduced by Ian J. Goodfellow et al. (2014). The idea is to train a generative model G by letting it compete with a rival, discriminative network D. The generator creates new samples and passes them to the discriminator which, in turn, has to discern the generated data from the example data.
An analogy can be found in art forgery, where forgers copy the painting style of a famous artist in ever more sophisticated ways while art experts develop methods to discriminate the fake paintings from real art works.  
The generator receives the discriminator's evaluation of its samples in the form of gradients, which are backpropagated through D and used to update the weights of G. This drives G to produce samples that are closer to the underlying distribution from which the original data was drawn, thereby making the discrimination task harder for D. Notably, the generator never receives any actual data samples as input. Ideally, training continues until the discriminator can no longer discern the generated samples from the original ones, improving the generation ability of G to an extent where it can fool another specialized deep learning model - and occasionally even human observers. The quality of the generated data is usually evaluated by a human who compares it to the original input. There have been many applications of GANs since their introduction to the machine learning landscape, with the most striking successes in the generation of natural image data, the creation of portraits of human faces, and text-to-image translation (Brownlee, 2019).
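
The competition between the two networks is commonly written as the minimax objective from Goodfellow et al. (2014). Here, $p_{\text{data}}$ denotes the distribution of the real samples and $p_z$ the noise distribution from which the generator input $z$ is drawn:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
$$

The discriminator is trained to maximize $V$ by assigning high probabilities to real samples and low probabilities to generated ones, while the generator is trained to minimize it.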



## SGANs
 
Spatial Generative Adversarial Networks are a specialized form of GANs in that they are specifically trained on visual image data of textures in which a recurrent pattern can be found (Jetchev et al., 2016). The generator is not trained for any classification task but solely for the production of original content. Prior to the creation of SGANs, GANs had been further enhanced by implementing deep convolutional layers with fractional stride and using batch normalization (Radford et al., 2015).   
An advantage of the SGAN approach compared to classical GANs is that the size of the generated image is not limited to a certain number of pixels but can vary freely. For example, earlier GAN implementations generated images of fixed sizes such as 64x64 pixels, while with SGANs varying sizes of up to 2048x2048 pixels have been created. This is realized by employing only convolutional layers in the generator model and avoiding any fully connected layers. Its creators particularly emphasise the ability of the SGAN to create images of arbitrary size (Jetchev et al., 2016). In the original paper, the SGAN was applied to 2D image data of varying pixel sizes. More details on the implementation of the SGAN by Jetchev and colleagues will follow in the next section, together with a close description of our replication.
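
To illustrate why a purely convolutional generator is not tied to one output size, here is a minimal one-dimensional sketch in plain Python. The helper `transposed_conv1d` and the toy kernel are our own illustrative assumptions and are not taken from the SGAN code; the point is only that the same small set of weights can be slid over an input of any length, whereas a fully connected layer would need one weight per input position.

```python
def transposed_conv1d(signal, kernel, stride=2):
    """Unpadded 1D transposed convolution (hypothetical helper for illustration)."""
    out = [0.0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for j, weight in enumerate(kernel):
            # every input value "paints" a scaled copy of the kernel into the output
            out[i * stride + j] += value * weight
    return out


kernel = [0.5, 1.0]  # the same two weights are reused for every input length

short = transposed_conv1d([1.0, 2.0, 3.0], kernel)
long = transposed_conv1d([1.0] * 50, kernel)

# With kernel length equal to the stride, the output is exactly twice as long as the input.
print(len(short), len(long))
```

Because the kernel length equals the stride in this toy case, each layer exactly doubles the length of its input, no matter how long that input is.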


# Implementation

The original SGAN architecture was implemented for Theano/Lasagne. Our replication of the project was written using TensorFlow 2, including the built-in version of Keras. The replication was performed entirely in the cloud, with code execution performed in [Google Colab](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=zCQlZdloEPXn) and models, training samples and results stored in a [GitHub repository](https://github.com/DiGyt/iannwtf-fp). With TensorFlow 2 as a flexible and feature-rich toolbox, we were able to adequately recreate the original SGAN implementation.

The main model structure of the SGAN is based on the Deep Convolutional Generative Adversarial Network (DCGAN) which was released by Radford et al. (2015) as an improvement to ordinary GAN architectures. In general, these convolutional GANs consist of a simple CNN architecture as a discriminator and a "reverse" CNN architecture as a generator. Both architectures will be explained in detail in the course of this blog.

Jetchev et al. (2016) define their SGAN as a dynamic network architecture with variable depth. In their paper, they report results on different depths of their models, namely for 4 (SGAN4), 5 (SGAN5) and 6 (SGAN6) convolutional layers. Layer numbers for generator and discriminator are defined symmetrically, such that e.g. for SGAN4, the discriminator consists of 4 convolutional layers while the generator consists of 4 transposed convolutional layers. Other parameters such as the number of filters/kernels or the size of the generator's random input tensor Z are dynamically defined to match the depth of the network. For a visual explanation of the model structure, see the image below.

Although our implementation of the model also includes the possibility to create the SGAN5 and SGAN6 architectures, all our samples were generated using the SGAN4 because the limited computational resources provided by Google Colab made it hard to train the larger networks.

## Generator and Discriminator Architecture

In the following, we describe the generator and discriminator architecture as defined by Jetchev et al. (2016). Our reconstruction of the implementation can be seen [here (generator)](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=IDVIkLho7Zjk) and [here (discriminator)](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=dwkSTt3T7ani). Our definition of additional network parameters can be found [here](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=0Qo3NtH4_NBR).

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/SGAN_overview.png">

_Image: Overview of the SGAN model architecture (Self-generated figure). Layer sizes are variable and symmetric for both networks. The number of filters starts with 64 and is doubled for each additional layer. The dimension constraint is defined in a way that the depth of the SGAN and the size of the image patches are interdependent and should be defined accordingly. This means, for larger patches more layers are required._
<br/>
<br/>

Radford et al. (2015) introduce several guidelines for successful convolutional GAN architectures:
> - Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
> - Use batchnorm in both the generator and the discriminator
> - Remove fully connected hidden layers for deeper architectures.
> - Use ReLU activation in generator for all layers except for the output, which uses Tanh.
> - Use Leaky ReLU activation in the discriminator for all layers.

For their SGAN implementation, Jetchev et al. (2016) mostly adhere to this advice. The SGAN architecture consists mainly of convolutional layers without intermediate pooling layers. Instead, strided convolutions with a stride of 2 are used in the SGAN discriminator and fractional-strided convolutions with a stride of 1/2 are used in the SGAN generator.
Batchnorm was implemented for all layers of the SGAN except for the input layer of the generator and the input and output layer of the discriminator.
Similar to Radford et al. (2015), all weights of the SGAN are initialized from a normal distribution with a mean of 0 and a standard deviation of 0.02.
For the SGAN generator, ReLU activation functions were used in all convolutional layers and a tanh activation function in the output layer. In the SGAN discriminator, leaky ReLUs were used as activation function for all convolutional layers. Additionally, a sigmoid activation function was used in the final layer of the discriminator, mapping the output onto a range between 0 and 1 in order to classify the images as either real or generated.

For the generator input, a 3-dimensional random tensor Z is passed (excluding the additional dimension for the batch size). Of the three tensor dimensions (l, m, d), the l and m dimensions can be flexibly changed to vary the size of the image created by the generator. For our example, we used patches of (l=10, m=10) for training and later created larger textures by passing an input tensor Z with dimensions (l=1000, m=1000). The third dimension d of the input tensor determines the amount of variability passed to the generator. The larger the dimension d, the more independent factors are available as a basis for the randomly generated patterns. In their paper, Jetchev et al. (2016) define the size of dimension d as d=20 for the SGAN4, d=50 for the SGAN5 and d=100 for the SGAN6.

To match the flexible depth of the SGAN, the number of filter kernels was set to 64 for the first layer and doubled for each consecutive layer. The kernel size was kept constant at (5, 5) for all layers. Additionally, zero-padding was applied, so that the size of the feature maps is changed only by the stride of the convolutions.

For our replication, we implemented the generator and discriminator models in a way that allows us to dynamically create an SGANx architecture of variable size. 
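
As a rough sketch of how such a depth-dependent configuration could be derived, consider the following plain-Python function. The name `sgan_config` and the returned dictionary are hypothetical and do not mirror our notebook code. The image shape assumes that every stride-2 layer exactly doubles height and width and that the output has three colour channels; whether the final layer of each network follows the doubling rule for the filters is not modelled here.

```python
def sgan_config(depth, l=10, m=10):
    """Derive SGAN hyperparameters from the network depth (illustrative sketch)."""
    z_depth = {4: 20, 5: 50, 6: 100}[depth]        # d as quoted from Jetchev et al. (2016)
    filters = [64 * 2 ** i for i in range(depth)]  # 64 filters, doubled for each layer
    upscale = 2 ** depth                           # assumption: each layer doubles height and width
    return {
        "z_shape": (l, m, z_depth),
        "filters": filters,
        "kernel_size": (5, 5),
        "image_shape": (l * upscale, m * upscale, 3),  # assumption: RGB output
    }


config = sgan_config(4)
print(config["z_shape"], config["filters"], config["image_shape"])
```

Under these assumptions, the (l=10, m=10) training input of the SGAN4 corresponds to image patches of 160x160 pixels, and changing only `l` and `m` changes the output size without touching any weights.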

## Optimizer, Loss, and Hyperparameters

For an optimal training performance, Jetchev et al. (2016) define the following features for the SGAN.

As the training optimizer, an Adam optimizer with a constant learning rate of 0.0002 (as suggested by Radford et al., 2015) is used. As a loss function, a common GAN-specific implementation of the standard binary cross entropy loss is used. Additionally, the loss is regularized using the L2 norm. Our implementations of the losses and the L2 regularization can be found [here](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=sVGJqeDEWxhk).
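
To make the structure of such a GAN-specific binary cross entropy concrete, here is a minimal sketch in plain Python that operates on lists of discriminator outputs. The function names, the non-saturating form of the generator loss, and the `coeff` parameter of the L2 term are our assumptions for illustration; the linked notebook cell remains the reference for what we actually used.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Cross entropy with target 1 for real samples and target 0 for generated ones."""
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating variant (assumed): the generator wants D to output 1 on its samples."""
    return -sum(math.log(p) for p in d_fake) / len(d_fake)


def l2_penalty(weights, coeff):
    """L2 regularization term added to a loss; coeff is a hypothetical hyperparameter."""
    return coeff * sum(w * w for w in weights)


# An undecided discriminator (all outputs 0.5) yields 2*log(2) and log(2).
d_out = [0.5, 0.5, 0.5, 0.5]
print(discriminator_loss(d_out, d_out), generator_loss(d_out))
```

Note that this unclipped version breaks down as soon as an output is exactly 0 or 1, since the logarithm is then undefined.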

For each training step of the SGAN, a batch of 32 real data samples is compared to an equally sized batch of generated samples. Our definition of additional hyperparameters can be found [here](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=EToWIiN2ZBlB).

# Training

Training the network was comparatively easy due to the flexible implementation of the SGAN. First of all, few data samples were needed to train the model since the SGAN4 trains on comparatively small image patches, which can be created by dividing up larger images of repetitive patterns. For our training, we exclusively relied on single (partly 3D-projected) satellite images taken from Google Earth as well as license-free images of natural or artificial patterns.

Thanks to a flexible [preprocessing function](https://colab.research.google.com/drive/12JwRiCUxADdTnR-tXoKR1ZIjxvgJN92L#scrollTo=rryYbA884dl7), we were not restricted to a specific image shape or size, but were able to train on any .jpg image.
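
The idea of building a training batch from a single large image can be sketched as follows. This is a simplified stand-in for the linked preprocessing function and not a copy of it: `random_patch` and `scale_pixel` are hypothetical helpers, the image is a synthetic grayscale grid instead of a loaded .jpg, the patch size of 160 pixels is an assumption, and scaling the pixels to [-1, 1] is assumed here only because it matches the range of a tanh output.

```python
import random


def random_patch(image, size, rng):
    """Cut a random size x size patch from an image given as nested lists [row][col]."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - size + 1)
    left = rng.randrange(width - size + 1)
    return [row[left:left + size] for row in image[top:top + size]]


def scale_pixel(value):
    """Map an 8-bit pixel value from [0, 255] to [-1, 1] (assumed scaling)."""
    return value / 127.5 - 1.0


# Synthetic 300 x 400 "image"; a real pipeline would load the pixels from a file.
image = [[(r * 7 + c) % 256 for c in range(400)] for r in range(300)]
rng = random.Random(0)

# One training batch: 32 random patches taken from the same source image.
batch = [random_patch(image, 160, rng) for _ in range(32)]
scaled = [[[scale_pixel(v) for v in row] for row in patch] for patch in batch]
print(len(scaled), len(scaled[0]), len(scaled[0][0]))
```

Since every patch is drawn at a random position, a single large image already provides a very large number of distinct training samples.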

The use of TensorBoard allowed us to constantly monitor the training progress and frequently visualize sample patterns created by the model.

However, one major obstacle, which stalled our progress for a long time, was the appearance of inf/NaN values in the loss tensors. These inf/NaN values were then propagated to the gradients and model weights and led to completely black generated images. Due to the relatively frequent occurrence of this phenomenon (~1 in 3000 steps), efficient training was impossible. Clipping the loss values or omitting the gradient updates whenever inf/NaN values appeared did not prove effective, since the model then converged to a certain loss level and stagnated. After more research into this problem, we found that the error was produced in the (self-defined) loss function: whenever the sigmoid-gated output values of the discriminator network reached their extreme values of 0 or 1, they were converted into -inf by either a `log(0)` or a `log(1 - 1)` operation. Our solution was therefore to clip the sigmoid output of the discriminator to the range between 0.0000001 and 0.9999999, so that it could never actually reach 0 or 1 (smaller clipping margins made the problem reappear). This small adaptation solved the problem and finally allowed us to train continuously for longer periods of time.
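
The effect of the clipping can be reproduced in a few lines of plain Python. The helpers `clipped_bce` and `to_float32` are hypothetical names used only for this sketch. The last two lines offer one plausible explanation for why smaller margins failed, under the assumption that the discriminator output is stored as float32: a bound of 1 - 1e-8 cannot be represented in float32 and rounds back to exactly 1.

```python
import math
import struct

EPS = 1e-7  # clipping margin quoted in the text


def to_float32(x):
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def clipped_bce(prob, is_real, eps=EPS):
    """Binary cross entropy for one discriminator output, clipped away from 0 and 1."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -math.log(prob) if is_real else -math.log(1.0 - prob)


print(clipped_bce(0.0, is_real=True))   # finite instead of log(0)
print(clipped_bce(1.0, is_real=False))  # finite instead of log(1 - 1)
print(to_float32(1.0 - 1e-7) < 1.0)     # this upper bound stays below 1 in float32
print(to_float32(1.0 - 1e-8) == 1.0)    # a tighter bound rounds back to exactly 1
```

With the margin of 1e-7, the loss of a single output is bounded by roughly 16.1, which keeps the gradients finite even when the discriminator is very confident.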

Generally, training a model on SGAN4 for 10000 epochs would take between 1.5 and 2.5 hours (on Google Colab's GPU-accelerated hardware). Our longest trained model is the 3D structure model of New York, which was trained for 20000 epochs. However, for most of the structures, the generator/discriminator losses never showed any sign of convergence in any direction, indicating that the quality of the generated images might still be increased by a substantial amount with more training.

# Results

A very interesting aspect of SGANs is that their spatial dimensions are "locally independent" (Jetchev et al., 2016). This means that unlike most other GANs (where input/output sizes are fixed) the size of the generated textures created with the SGAN can be arbitrarily scaled by simply varying a set of dimensions for the input noise vector. As a result, one SGAN model can produce arbitrarily large texture images of the same learned pattern, while still creating independent and non-repetitive local patterns.

<br/>
<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/textures/dried_mud.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/dry_mud_4000ep_1000_1000.png" width="500" height="500">

_Image: The training sample (left) and an example texture (right) for our model trained on an image of dry, cracked-up mud for 4000 epochs._

As seen in the above image, the patterns produced do not only look non-repetitive, but are also seamlessly integrated without any hard breaks or borders between parts of the texture.


However, you can also see that the pattern mainly captures comparatively low-level, local structure, which does not necessarily look realistic from a global view. Another example of this is the following image, trained on a Google Earth aerial photo of Barcelona:

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/earth/2D/barcelona_1.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/barcelona_10000ep_1000_1000.png" width="500" height="500">

_Image: The training sample (left) and an example texture (right) for our model trained on a Google Earth view of Barcelona for 10000 epochs._
<br/>
<br/>

While our 4-layer SGAN creates seemingly realistic, rectangular roofs and wall shapes, it fails to recognize higher-level features such as the rectangular house blocks and the prominent rectangular street grid of Barcelona. Jetchev et al. (2016) show in their paper that this locality is due to the small receptive field of the discriminator (and therefore also the small "projective" area of the generator) in SGAN4 models. Deeper models such as the SGAN6 perform better at generating global structures, with nearly realistic-looking street grids.
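
How quickly the receptive field grows with depth can be estimated with the standard recurrence for stacked convolutions. The sketch below assumes that every layer uses a 5x5 kernel with a stride of 2, as described in the implementation section; the function `receptive_field` is our own helper and the resulting numbers are our estimate, not values reported by Jetchev et al. (2016).

```python
def receptive_field(depth, kernel=5, stride=2):
    """Width in input pixels seen by one unit after `depth` stacked convolutions."""
    size, jump = 1, 1  # jump = distance in input pixels between neighbouring units
    for _ in range(depth):
        size += (kernel - 1) * jump
        jump *= stride
    return size


for depth in (4, 5, 6):
    print(f"SGAN{depth}: {receptive_field(depth)} pixels")
```

Under these assumptions, each additional layer roughly doubles the width of the image region that a single discriminator output can take into account, which fits the observation that deeper models capture more global structure.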

The strength of SGANs on repetitive patterns can easily be shown by comparing the above image of Barcelona with the below image of Venice. While for Barcelona, the mainly rectangular shapes are taken into account by the generator model, the more curved cityscape of Venice (including features like the canals) is harder for the model to grasp. With the SGAN4 only perceiving smaller features, the Venice input data is merged into vague polygonal shapes that barely resemble houses.

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/earth/2D/venedig_1.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/venice_10000ep_1000_1000.png" width="500" height="500">

_Image: The training sample (left) and an example texture (right) for our model trained on a Google Earth view of Venice for 10000 epochs._
<br/>
<br/>

The functionality of the SGAN can also be shown when looking at more abstract and well-defined patterns, such as the below Turing pattern. For this relatively simple pattern, the model produced reasonable results after ~2000 epochs, although you can clearly see that the generated image deviates from the original one by showing grainier and more locally curved patterns.

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/textures/turing_pattern.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/turing_pattern_10000ep_2000_2000.png" width="500" height="500">


_Image: A simple black and white Turing pattern (left) and the resulting texture after training on SGAN4 for 10000 epochs (right)._
<br/>
<br/>



## 3D Textures

As an additional application, we investigated the SGAN's performance on images containing a 3D perspective, such as the below picture of a romanesco cauliflower:

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/textures/romanesco.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/romanesco_8000ep_1000_1000.png" width="500" height="500">


_Image: A real image of a romanesco cauliflower used for training (left) and a generated texture after 8000 epochs of training on the SGAN4 (right)._
<br/>
<br/>

As you can see, the SGAN4 is generally able to capture the depth of the above image to a certain degree. The SGAN mimics the 3-dimensional structure of the original image by realistically applying brighter and darker spots, which even exhibit signs of the romanesco's fractal-like form on a very local level. However, as for the 2D images, the shallow SGAN4 fails to distinguish more global features of the above image, resulting in an overall texture that does not exactly match the original image.

Another interesting aspect of the SGAN's abilities in 3-dimensional space can be seen in the below reconstruction of a 3D perspective of New York:

<br/>
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/samples/earth/3D/ny_1.jpg" width="500" height="500">
<img src="https://raw.githubusercontent.com/DiGyt/iannwtf-fp/master/gallery/new_york_3D_20000ep_1000_1000.png" width="500" height="500">

_Image: A 3D perspective sample of New York taken from Google Earth (left) and the training result after 20000 epochs of training on SGAN4 (right)._
<br/>
<br/>

In the generated pattern, the SGAN catches the general cityscape quite well, even including specific architectural features of generated buildings. However, while the original image includes a vanishing point perspective (where buildings on the left are seen from another direction than buildings on the right), the generated SGAN pattern fails to reconstruct this perspective. Since the SGAN is trained on only small fractions of this image, where each sample might either be taken from the left or from the right half of the image, the SGAN has no information on how to integrate this perspective. If you look closely at the generated sample, you can see that single building textures in the image are a little bit skewed to the left or the right, instead of strictly rising up straight. These "leaning towers of New York" are probably the result of the vanishing point perspective, which could not be integrated by the SGAN4.
<br/>
<br/>

# Retrospective and Outlook

Structures generated by SGANs can have a wide range of applications. These range from general image processing and the creation of non-repetitive textures for computer models to applications in clothing and fashion-related products.

During the replication of the SGAN architecture, the documentation of the original network provided within the paper and the authors' GitHub repository proved very valuable to us. While open science is growing more popular, our experience with papers in the field of neural networks is that they often lack the detail needed to replicate the network. 

Additionally, the TensorFlow 2 API provides a handy tool for the implementation of such networks, especially for handling the backpropagation during the weight updates. This became particularly apparent in direct contrast to the Theano/Lasagne implementation used in the original paper. 

While the integration of TensorFlow and Google Colab allows for quick and effective implementation and usage of neural networks, we found the limited computational resources a challenge for the training of larger SGAN architectures. Effectively, we were unable to train networks larger than SGAN4, which hindered us from achieving better global features in the generated images. The results of the original paper show that we have not yet reached the ceiling of texture synthesis with this method. Therefore, we would have liked to train a larger model for a longer amount of time in order to explore possible limits of the SGAN architecture.

Another interesting enhancement to SGANs was published by Bergmann, Jetchev & Vollgraf (2017). With their Periodic Spatial Generative Adversarial Network (PSGAN), they expanded the SGAN architecture to produce recurring periodic patterns. The implementation of the PSGAN as well as examples of the generated periodic patterns can be found at: https://github.com/zalandoresearch/psgan.









# References

Bergmann, U., Jetchev, N., & Vollgraf, R. (2017). Learning Texture Manifolds with the Periodic Spatial GAN. _arXiv preprint._

Brownlee, J. (2019). 18 Impressive Applications of Generative Adversarial Networks (GANs). _Retrieved April 16, 2020, from https://machinelearningmastery.com/impressive-applications-of-generative-adversarial-networks/_

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., ... & Bengio, Y. (2014). Generative Adversarial Networks. _arXiv preprint._

Hinton, G., Deng, L., Yu, D., Dahl, G. E., Mohamed, A. R., Jaitly, N., ... & Kingsbury, B. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. _IEEE Signal processing magazine, 29(6), 82-97._

Jebara, T. (2004). Machine Learning: Discriminative and Generative. _The Springer International Series in Engineering and Computer Science. Kluwer Academic (Springer). ISBN 978-1-4020-7647-3._

Jetchev, N., Bergmann, U., & Vollgraf, R. (2016). Texture synthesis with Spatial Generative Adversarial Networks. _arXiv preprint._

Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). Imagenet classification with deep convolutional neural networks. _Advances in neural information processing systems (pp. 1097-1105)._

Radford, A., Metz, L., & Chintala, S. (2015). Unsupervised representation learning with deep convolutional generative adversarial networks. _arXiv preprint._

Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S., ... & Berg, A. C. (2015). Imagenet large scale visual recognition challenge. _International journal of computer vision, 115(3), 211-252._

Silver, D., Huang, A., Maddison, C. J., Guez, A., Sifre, L., Van Den Driessche, G., ... & Dieleman, S. (2016). Mastering the game of Go with deep neural networks and tree search. _Nature, 529(7587), 484._

Wei, L. Y., Lefebvre, S., Kwatra, V., & Turk, G. (2009). State of the art in example-based texture synthesis. _HAL preprint._

