
New tutorial/example for qGAN #339

Open · oliverob wants to merge 7 commits into master
Conversation

@oliverob commented Aug 12, 2020

Closes #325

I have written a new tutorial/example Jupyter notebook implementing a qGAN using TFQ and TF-GAN. However, I have been stuck for a while trying to get it to work: the network doesn't seem to be learning correctly, and I am not sure how to fix it. If anyone could give me any pointers, that would be brilliant. Currently, my discriminator loss is growing out of control whilst my generator loss is shrinking. I am worried the problem is a bug in the implementation that I can't find, rather than something like the learning rates, but I am not sure.

@review-notebook-app commented: Check out this pull request on ReviewNB.

@jaeyoo (Member) commented Aug 12, 2020

Thank you for this awesome contribution. First of all, I saw that tensorflow_gan is not installed; could you install it? In a notebook, we can get it with !pip install -U tensorflow_gan. Please allow me about a week to review this notebook, due to limited bandwidth. Thanks!

@jaeyoo (Member) commented on the notebook, Aug 22, 2020:

Some minor typos:

arbitrary

an n-qubit

A qGAN is a quantum version of ...

Also, you said $p^j_\theta$ is related to the probability of the state $j$, but it actually is the probability of the state $j$.



@jaeyoo (Member) commented on the notebook:

Initialize qubits.

qubits = cirq.GridQubit.rect(1, num_qubits)



@jaeyoo (Member) commented on the notebook, Aug 22, 2020:

mu = 1

sigma = 1

Why do you subtract 0.5 from each integer up to 2**num_qubits? Why not just use < rather than <= with the arbitrary number 0.5?

There are many typos.

Please replace all instances of continous with continuous

hardardman --> Hadamard

intialise --> initialize

Lint issues: please keep lines under 80 characters. For example,

# Discretize the remaining samples so the continous distribution can be
# approximated by a discrete distribution

discrete_data = tf.convert_to_tensor(np.digitize(
continous_data, [i - 0.5 for i in range(1, 2**num_qubits)]),
dtype=tf.dtypes.int32)

What is the purpose of this line? It uses the list of bin edges [0.5, 1.5, 2.5] for num_qubits = 2. Why not use [0., 1., 2., 3., 4.] with the right=True option in np.digitize?
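As a quick check with made-up sample values, note the two binnings are not interchangeable: the half-integer edges round each sample to the nearest state index, while integer edges with right=True effectively take the ceiling.

import numpy as np

samples = np.array([0.3, 0.7, 1.5, 2.4, 3.6])

# Half-integer edges: each sample is rounded to the nearest
# integer in {0, 1, 2, 3} (ties round up).
print(np.digitize(samples, [0.5, 1.5, 2.5]))  # [0 1 2 2 3]

# Integer edges with right=True: each positive sample is mapped
# to its ceiling instead.
print(np.digitize(samples, [0., 1., 2., 3., 4.], right=True))  # [1 1 2 3 4]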

# Convert the decimal into binary tensor
discrete_data = tf.cast(tf.math.mod(tf.bitwise.right_shift(
tf.expand_dims(discrete_data, 1), tf.range(num_qubits)), 2),
dtype=tf.float32)

 

Also, we can clean up the last for-loop code.

noise = tfq.convert_to_tensor(
[cirq.Circuit(cirq.H.on_each(qubits)) for _ in range(size)])

 




@jaeyoo (Member) commented on the notebook:

Why does the result have 3 qubits even though num_qubits = 2 and you mentioned a "2-qubit example"?



@jaeyoo (Member) commented on the notebook, Aug 22, 2020:

Use a format string:

theta = sympy.symbols(f'a0:{num_qubits}')

Lint please.

inputs = tf.keras.Input(shape=(), dtype=tf.dtypes.string)

Simplify it:

parameterized_circuit = cirq.Circuit(
    [cirq.ry(t)(q) for t, q in zip(theta, qubits)])

Lint please: keep lines under 80 characters.

# Entangle all the qubits by applying CZ in a circular fashion 
# except when there are only two qubits and then just apply one CZ

Simplify it.

entangle_circuit = cirq.Circuit(
    [cirq.CZ(q1, q2) for q1, q2 in zip(qubits[:-1], qubits[1:])])

if num_qubits > 2:
    # Close the ring with a CZ between the first and last qubits.
    entangle_circuit.append(cirq.CZ(qubits[0], qubits[-1]))

 

Lint please.

# Add this circuit layer to the network, with an output given by
# measurements in the Z basis. Manipulate the output so it maps the
# -1, 1 outcomes to 0, 1, like the binary discrete data generated by
# generate_data.

 

Simplify it. Also, saying "Important to have repetition = 1" is not helpful at all; could you describe the real reason?

observables = [(cirq.Z(q)+1)/2 for q in qubits]

[describe the real reason why it is important to have repetition = 1]

layer = tfq.layers.PQC(layer_circuit, observables, repetitions=1)(inputs)
model = tf.keras.Model(inputs=[inputs], outputs=[layer])
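For reference, one plausible reason (my reading of the TFQ API, not something stated in the notebook): when repetitions is set, tfq.layers.PQC estimates each observable from measurement samples rather than analytically, and with repetitions=1 each estimate comes from a single shot, so (Z+1)/2 evaluates to exactly 0 or 1, matching the binary discrete data. A minimal self-contained sketch:

import cirq
import sympy
import tensorflow as tf
import tensorflow_quantum as tfq

num_qubits = 2
qubits = cirq.GridQubit.rect(1, num_qubits)
theta = sympy.symbols(f'a0:{num_qubits}')
layer_circuit = cirq.Circuit(
    [cirq.ry(t)(q) for t, q in zip(theta, qubits)])
observables = [(cirq.Z(q) + 1) / 2 for q in qubits]

inputs = tf.keras.Input(shape=(), dtype=tf.dtypes.string)
# repetitions=1: each observable is estimated from a single measurement
# shot, so the layer outputs exactly 0.0 or 1.0 per qubit.
layer = tfq.layers.PQC(layer_circuit, observables, repetitions=1)(inputs)
model = tf.keras.Model(inputs=[inputs], outputs=[layer])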




@jaeyoo (Member) commented on the notebook:

This is not a good place to have this function: discriminator_gen_outputs is used before assignment. Could you describe its definition in a new text block for users?

Also, please DO NOT use the Python built-in sum as a local variable name; rename it to loss, for example.

Also, could you use tf.math.reduce_mean? It will help you simplify the code, e.g. you don't have to use 1/m.
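For example, here is a minimal sketch of a standard minimax discriminator loss written with tf.math.reduce_mean (the notebook's exact loss may differ; this assumes sigmoid outputs in (0, 1)):

import tensorflow as tf

def discriminator_loss(discriminator_real_outputs,
                       discriminator_gen_outputs):
    # reduce_mean averages over the batch, so no explicit 1/m factor
    # or running accumulator is needed.
    real_loss = -tf.math.reduce_mean(
        tf.math.log(discriminator_real_outputs))
    gen_loss = -tf.math.reduce_mean(
        tf.math.log(1.0 - discriminator_gen_outputs))
    return real_loss + gen_loss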



@jaeyoo (Member) commented on the notebook:

Hmm, you have an input layer with 2 input nodes, 2 hidden layers with 50 and 20 nodes respectively, and an output layer with 1 output node.
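In Keras, that architecture would look roughly like this (the activations here are my guess; the notebook's choices may differ):

import tensorflow as tf

discriminator = tf.keras.Sequential([
    tf.keras.layers.Dense(50, activation='relu', input_shape=(2,)),
    tf.keras.layers.Dense(20, activation='relu'),
    # Single sigmoid output: the probability that the sample is real.
    tf.keras.layers.Dense(1, activation='sigmoid'),
])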



@jaeyoo (Member) commented on the notebook:

Please remove all the commented-out lines you no longer use.

As you can see, make_discriminator_model doesn't exist, so uncommenting the line causes an error, which does not help users debug.



@jaeyoo (Member) commented on the notebook:

Lint please. Also, use tf.math.reduce_mean, rename sum to loss, and please describe the definition of discriminator_gen_outputs in a new text block.



@jaeyoo (Member) commented on the notebook, Aug 22, 2020:

Again, discriminator_real_outputs and discriminator_gen_outputs are used before their assignments. Please describe them in a text block.

Also, I don't understand why you require the following 4 tensors here.

zeros = tf.zeros(sum_tensor.shape)
ones = tf.ones(sum_tensor.shape)
twos = tf.ones(sum_tensor.shape) * 2
threes = tf.ones(sum_tensor.shape) * 3

Please don't create these tensors inside this function; create and use them directly in the main training loop.

Simplify the code.

sum_tensor = tf.reduce_sum(tf.map_fn(
    lambda t: t * 2 ** tf.range(
        tf.cast(gan_model.generated_data.shape[1], dtype=tf.int64)),
    tf.cast(tf.reverse(tensor=gan_model.generated_data, axis=[1]),
            dtype=tf.int64)), axis=1)

-->

# Convert binary to decimal
def binary_to_decimal(discrete_data):
  # tf.range must be int64 to match the dtype expected by left_shift.
  return tf.cast(tf.math.reduce_sum(tf.bitwise.left_shift(
      tf.cast(discrete_data, dtype=tf.int64),
      tf.range(num_qubits, dtype=tf.int64)), axis=1),
      dtype=tf.float32)

sum_tensor = binary_to_decimal(gan_model.generated_data)
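A quick sanity check of the helper (assuming num_qubits = 2; column 0 is treated as the least significant bit):

print(binary_to_decimal(tf.constant([[1., 0.], [1., 1.], [0., 1.]])))
# tf.Tensor([1. 3. 2.], shape=(3,), dtype=float32)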




@jaeyoo (Member) commented Aug 22, 2020

I left review comments inside ReviewNB; @oliverob, please check them. For the training, could you lower your learning rate? I see you set it to 0.5 for the two Adam optimizers; please lower it to 0.05 or 0.01.
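For example (variable names are illustrative):

generator_optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)
discriminator_optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)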

@oliverob (Author) commented:

Thanks for all the comments, and sorry about all the lint and comment issues. I didn't mean for this to be production code yet; I was hoping to try and figure out why it doesn't work before making the surrounding explanation correct and understandable. I'll make your suggested changes and add a few of my own from work I did last week. I still cannot get it to learn correctly; I think I am approaching batches wrong. I'll let you know when I commit my newest version with my confused attempt at using batches.

@oliverob (Author) commented on the revised notebook:

I don't know if I am generating the correct format of data. I have tried a whole mix of options, from averaging over a number of repetitions to the frequency of occurrence of each number, but I am not sure what is correct. This version seemed to make the most sense to me, as the discriminator sees a sample which is a set of outputs of a circuit with the generator weights, and so should be able to deduce the distribution from that. I thought that averaging might lose information about the distribution.



@oliverob (Author) commented on the revised notebook:

I need to change this to reflect the fact that I removed the layer with 50 nodes. Though I'm not sure if I should add it back, as removing it didn't seem to make a difference.



@oliverob (Author) commented on the revised notebook:

I need to add an explanation of how this all works. However, it currently doesn't work, so I cannot write it yet.



@oliverob (Author) commented on the revised notebook:

If it were learning correctly, these graphs would show it tending towards a lognormal distribution. However, as you can see, it isn't learning correctly. I have tried fiddling with the learning rate etc., but I can't seem to fix it. Also, I don't think the entropy calculation is working correctly, but it isn't necessary; I might just remove it.



@oliverob (Author) commented:

@jaeyoo I have made the changes you suggested and tried to get it working, but it still isn't learning correctly. Please could you have a look at the comments I have left in ReviewNB and suggest anything I could try to make it work?

@oliverob requested a review from jaeyoo, August 30, 2020 08:51
@oliverob (Author) commented:

@jaeyoo Any ideas?

@jaeyoo (Member) commented Aug 31, 2020:

> @jaeyoo Any ideas?

Sorry, I am on a long vacation... please let me check it later. I apologize for the belated reply.
