This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Commit

fix contrib module warnings
szha committed Dec 4, 2020
1 parent 44f91d9 commit ec35320
Showing 3 changed files with 37 additions and 38 deletions.
1 change: 0 additions & 1 deletion docs/python_docs/python/api/gluon/rnn/index.rst
@@ -25,7 +25,6 @@ Build-in recurrent neural network layers are provided in the following two modul
:nosignatures:

mxnet.gluon.rnn
-mxnet.gluon.contrib.rnn

.. currentmodule:: mxnet.gluon

68 changes: 34 additions & 34 deletions docs/python_docs/python/tutorials/packages/gluon/image/info_gan.md
@@ -18,8 +18,8 @@

# Image similarity search with InfoGAN

-This notebook shows how to implement an InfoGAN based on Gluon. InfoGAN is an extension of GANs, where the generator input is split in 2 parts: random noise and a latent code (see [InfoGAN Paper](https://arxiv.org/pdf/1606.03657.pdf)).
-The codes are made meaningful by maximizing the mutual information between code and generator output. InfoGAN learns a disentangled representation in a completely unsupervised manner. It can be used for many applications such as image similarity search. This notebook uses the DCGAN example from the [Straight Dope Book](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html) and extends it to create an InfoGAN.
+This notebook shows how to implement an InfoGAN based on Gluon. InfoGAN is an extension of GANs, where the generator input is split in 2 parts: random noise and a latent code (see [InfoGAN Paper](https://arxiv.org/pdf/1606.03657.pdf)).
+The codes are made meaningful by maximizing the mutual information between code and generator output. InfoGAN learns a disentangled representation in a completely unsupervised manner. It can be used for many applications such as image similarity search. This notebook uses the DCGAN example from the [Straight Dope Book](https://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html) and extends it to create an InfoGAN.


```{.python .input}
@@ -93,7 +93,7 @@ def get_files(data_dir):
img_arr = transform(img_arr)
images.append(img_arr)
filenames.append(path + "/" + fname)
-return images, filenames
+return images, filenames
```

Load the dataset `lfw_dataset` which contains images of celebrities.
@@ -123,7 +123,7 @@ class Generator(gluon.HybridBlock):
self.prev = nn.HybridSequential()
self.prev.add(nn.Dense(1024, use_bias=False), nn.BatchNorm(), nn.Activation(activation='relu'))
self.G = nn.HybridSequential()
self.G.add(nn.Conv2DTranspose(64 * 8, 4, 1, 0, use_bias=False))
self.G.add(nn.BatchNorm())
self.G.add(nn.Activation('relu'))
@@ -167,7 +167,7 @@ class Discriminator(gluon.HybridBlock):
self.D.add(nn.LeakyReLU(0.2))
self.D.add(nn.Dense(1024, use_bias=False), nn.BatchNorm(), nn.Activation(activation='relu'))
self.prob = nn.Dense(1)
self.feat = nn.HybridSequential()
self.feat.add(nn.Dense(128, use_bias=False), nn.BatchNorm(), nn.Activation(activation='relu'))
@@ -182,14 +182,14 @@ class Discriminator(gluon.HybridBlock):
feat = self.feat(x)
category_prob = self.category_prob(feat)
continuous_mean = self.continuous_mean(feat)
return prob, category_prob, continuous_mean
```

The InfoGAN has the following layout.
<img src="https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/InfoGAN.png" style="width:800px;height:250px;">

-Discriminator and Generator are the same as in the DCGAN example. On top of the Disciminator is the Q model, which is estimating the code `c` for given fake images. The Generator's input is random noise and the latent code `c`.
+Discriminator and Generator are the same as in the DCGAN example. On top of the Disciminator is the Q model, which is estimating the code `c` for given fake images. The Generator's input is random noise and the latent code `c`.

## Training Loop
Initialize Generator and Discriminator and define correspoing trainer function.
@@ -231,13 +231,13 @@ if os.path.isfile('infogan_d_latest.params') and os.path.isfile('infogan_g_lates
There are 2 differences between InfoGAN and DCGAN: the extra latent code and the Q network to estimate the code.
The latent code is part of the Generator input and it contains mutliple variables (continuous, categorical) that can represent different distributions. In order to make sure that the Generator uses the latent code, mutual information is introduced into the GAN loss term. Mutual information measures how much X is known given Y or vice versa. It is defined as:

-![gif](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/entropy.gif)
+![infogan entropy](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/entropy.gif)

The InfoGAN loss is:

-![gif](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/loss.gif)
+![infogan loss](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/loss.gif)

-where `V(D,G)` is the GAN loss and the mutual information `I(c, G(z, c))` goes in as regularization. The goal is to reach high mutual information, in order to learn meaningful codes for the data.
+where `V(D,G)` is the GAN loss and the mutual information `I(c, G(z, c))` goes in as regularization. The goal is to reach high mutual information, in order to learn meaningful codes for the data.
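
Not part of the commit: a minimal sketch of the mutual information quantity the tutorial text in this hunk refers to, using the standard definition I(X; Y) = H(X) - H(X|Y) for discrete variables. The function name `mutual_information` and the two toy joint tables are illustrative assumptions, not code from the tutorial, which estimates the term with the Q network rather than computing it exactly.

```python
import math


def mutual_information(joint):
    """Mutual information I(X; Y) in nats for a discrete joint table.

    joint[i][j] is P(X = i, Y = j); rows index X, columns index Y.
    """
    px = [sum(row) for row in joint]        # marginal P(X)
    py = [sum(col) for col in zip(*joint)]  # marginal P(Y)
    mi = 0.0
    for i, row in enumerate(joint):
        for j, pxy in enumerate(row):
            if pxy > 0:  # zero-probability cells contribute nothing
                mi += pxy * math.log(pxy / (px[i] * py[j]))
    return mi


# X and Y independent: knowing Y says nothing about X.
independent = [[0.25, 0.25], [0.25, 0.25]]
# X fully determined by Y: knowing Y removes all uncertainty about X.
dependent = [[0.5, 0.0], [0.0, 0.5]]

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)
```

For the independent table the result is zero, and for the fully dependent table it equals the entropy of a fair coin, `math.log(2)`. This is the sense in which a high `I(c, G(z, c))` means the generated image retains the information in the code `c`.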


Define the loss functions. `SoftmaxCrossEntropyLoss` for the categorical code, `L2Loss` for the continious code and `SigmoidBinaryCrossEntropyLoss` for the normal GAN loss.
@@ -254,7 +254,7 @@ This function samples `c`, `z`, and concatenates them to create the generator in

```{.python .input}
def create_generator_input():
#create random noise
z = nd.random_normal(0, 1, shape=(batch_size, z_dim), ctx=ctx)
label = nd.array(np.random.randint(n_categories, size=batch_size)).as_in_context(ctx)
@@ -265,7 +265,7 @@ def create_generator_input():
return nd.concat(z, c1, c2, dim=1), label, c2
```

-Define the training loop.
+Define the training loop.
1. The discriminator receives `real_data` and `loss1` measures how many real images have been identified as real
2. The discriminator receives `fake_image` from the Generator and `loss1` measures how many fake images have been identified as fake
3. Update Discriminator. Currently, it is updated every second iteration in order to avoid that the Discriminator becomes too strong. You may want to change that.
@@ -275,38 +275,38 @@ Define the training loop.

```{.python .input}
with SummaryWriter(logdir='./logs/') as sw:
epochs = 1
counter = 0
for epoch in range(epochs):
print("Epoch", epoch)
starttime = time.time()
d_error_epoch = nd.zeros((1,), ctx=ctx)
g_error_epoch = nd.zeros((1,), ctx=ctx)
for idx, data in enumerate(train_dataloader):
#get real data and generator input
-real_data = data.as_in_context(ctx)
+real_data = data.as_in_context(ctx)
g_input, label, c2 = create_generator_input()
#Update discriminator: Input real data and fake data
with autograd.record():
output_real,_,_ = discriminator(real_data)
d_error_real = loss1(output_real, real_label)
# create fake image and input it to discriminator
fake_image = generator(g_input)
output_fake,_,_ = discriminator(fake_image.detach())
d_error_fake = loss1(output_fake, fake_label)
# total discriminator error
d_error = d_error_real + d_error_fake
d_error_epoch += d_error.mean()
#Update D every second iteration
if (counter+1) % 2 == 0:
d_error.backward()
@@ -320,7 +320,7 @@ with SummaryWriter(logdir='./logs/') as sw:
g_error.backward()
g_error_epoch += g_error.mean()
g_trainer.step(batch_size)
q_trainer.step(batch_size)
@@ -332,15 +332,15 @@ with SummaryWriter(logdir='./logs/') as sw:
%(d_error_epoch.asscalar()/count,g_error_epoch.asscalar()/count, count, epoch))
g_input,_,_ = create_generator_input()
# create some fake image for logging in MXBoard
fake_image = generator(g_input)
sw.add_scalar(tag='Loss_D', value={'test':d_error_epoch.asscalar()/count}, global_step=counter)
sw.add_scalar(tag='Loss_G', value={'test':d_error_epoch.asscalar()/count}, global_step=counter)
sw.add_image(tag='data_image', image=((fake_image[0]+ 1.0) * 127.5).astype(np.uint8) , global_step=counter)
sw.flush()
discriminator.save_parameters("infogan_d_latest.params")
generator.save_parameters("infogan_g_latest.params")
```
@@ -384,12 +384,12 @@ Take some images from the test data, obtain its feature vector from `discriminat


```{.python .input}
-feature_size = 8192
+feature_size = 8192
features = nd.zeros((len(test_images), feature_size), ctx=ctx)
for idx, image in enumerate(test_images):
feature = discriminator(nd.array(image, ctx=ctx))
feature = feature.reshape(feature_size,)
features[idx,:] = feature.copyto(ctx)
@@ -407,22 +407,22 @@ for image in test_images[:100]:
plt.subplot(1,10,1)
visualize(image)
-for i in range(2,9):
+for i in range(2,9):
if indices[i-1][1] < 1.5:
plt.subplot(1,10,i)
sim = test_images[indices[i-1][0]].reshape(3,64,64)
visualize(sim)
plt.show()
plt.clf()
```
-![png](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/output.png)<!--notebook-skip-line-->
+![png](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/output.png)<!--notebook-skip-line-->

## How the Generator learns
We trained the Generator for a couple of epochs and stored a couple of fake images per epoch. Check the video.
-![alt text](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/infogan.gif)
+![infogan infogan](https://raw.githubusercontent.com/dmlc/web-data/master/mxnet/doc/tutorials/info_gan/infogan.gif)


-The following function computes the TSNE on the feature matrix and stores the result in a json-file. This file can be loaded with [TSNEViewer](https://ml4a.github.io/guides/ImageTSNEViewer/)
+The following function computes the TSNE on the feature matrix and stores the result in a json-file. This file can be loaded with [TSNEViewer](https://ml4a.github.io/guides/ImageTSNEViewer/)


```{.python .input}
@@ -437,15 +437,15 @@ tsne = TSNE(n_components=2, learning_rate=150, perplexity=30, verbose=2).fit_tra
data = []
counter = 0
for i,f in enumerate(test_filenames):
point = [float((tsne[i,k] - np.min(tsne[:,k]))/(np.max(tsne[:,k]) - np.min(tsne[:,k]))) for k in range(2) ]
data.append({"path": os.path.abspath(os.path.join(os.getcwd(),f)), "point": point})
with open("imagetsne.json", 'w') as outfile:
json.dump(data, outfile)
```

-Load the file with TSNEViewer. You can now inspect whether similiar looking images are grouped nearby or not.
+Load the file with TSNEViewer. You can now inspect whether similiar looking images are grouped nearby or not.

<img src="https://raw.githubusercontent.com/NRauschmayr/web-data/master/mxnet/doc/tutorials/info_gan/tsne.png" style="width:800px;height:600px;">

6 changes: 3 additions & 3 deletions python/mxnet/gluon/rnn/rnn_cell.py
@@ -1432,9 +1432,9 @@ def dynamic_unroll(cell, inputs, begin_state, drop_inputs=0, drop_outputs=0,
>>> state_shape = (batch_size, input_size)
>>> states = [mx.nd.normal(loc=0, scale=1, shape=state_shape) for i in range(2)]
>>> valid_length = mx.nd.array([2, 3])
->>> output, states = mx.gluon.contrib.rnn.rnn_cell.dynamic_unroll(cell, rnn_data, states,
-valid_length=valid_length,
-layout='TNC')
+>>> output, states = mx.gluon.rnn.rnn_cell.dynamic_unroll(cell, rnn_data, states,
+... valid_length=valid_length,
+... layout='TNC')
>>> print(output)
[[[ 0.00767238 0.00023103 0.03973929 -0.00925503 -0.05660512]
[ 0.00881535 0.05428379 -0.02493718 -0.01834097 0.02189514]]
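
Not part of the commit: besides moving the call from `mx.gluon.contrib.rnn` to `mx.gluon.rnn`, the docstring change above adds `...` continuation prompts to the wrapped lines. The sketch below shows, with the standard-library `doctest` parser only, how a multi-line example is read with and without those prompts. The `divmod` example and the helper `is_valid_python` are illustrative stand-ins, and whether doctest parsing was the motivation for this edit is an assumption; the commit message only says "fix contrib module warnings".

```python
import doctest

# A wrapped call written with "..." continuation prompts.
with_prompt = (
    ">>> divmod(7,\n"
    "...        2)\n"
    "(3, 1)\n"
)
# The same call with the continuation line left bare.
without_prompt = (
    ">>> divmod(7,\n"
    "           2)\n"
    "(3, 1)\n"
)


def is_valid_python(source):
    """Return True if the extracted example source compiles on its own."""
    try:
        compile(source, "<doctest>", "single")
        return True
    except SyntaxError:
        return False


parser = doctest.DocTestParser()
good = parser.get_examples(with_prompt)
bad = parser.get_examples(without_prompt)

# With prompts, both lines form the example source.
good_ok = is_valid_python(good[0].source)
# Without prompts, the second line is treated as expected output,
# so the source is the incomplete statement "divmod(7,".
bad_ok = is_valid_python(bad[0].source)
```

In the second case the parser attaches the bare continuation line to the expected output (`bad[0].want`) instead of the source, so the example cannot run as intended.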