
Problem with trainer.fit(), operands of different shape #14

Closed · Stephenito opened this issue Jun 6, 2022 · 10 comments
@Stephenito commented Jun 6, 2022

Hi,
I am trying to run the quantum trainer algorithm. When running the following line:

trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

I get the following error:

ValueError                          Traceback (most recent call last)
Input In [17], in <cell line: 1>()
----> 1 trainer.fit(train_dataset, val_dataset, evaluation_step=1, logging_step=100)

File c:\python38\lib\site-packages\lambeq\training\trainer.py:365, in Trainer.fit(self, train_dataset, val_dataset, evaluation_step, logging_step)
    363 step += 1
    364 x, y_label = batch
--> 365 y_hat, loss = self.training_step(batch)
    366 if (self.evaluate_on_train and
    367         self.evaluate_functions is not None):
    368     for metr, func in self.evaluate_functions.items():

File c:\python38\lib\site-packages\lambeq\training\quantum_trainer.py:149, in QuantumTrainer.training_step(self, batch)
    133 def training_step(
    134         self,
    135         batch: tuple[list[Any], np.ndarray]) -> tuple[np.ndarray, float]:
    136     """Perform a training step.
    137 
    138     Parameters
   (...)
    147 
    148     """
--> 149     y_hat, loss = self.optimizer.backward(batch)
    150     self.train_costs.append(loss)
    151     self.optimizer.step()

File c:\python38\lib\site-packages\lambeq\training\spsa_optimizer.py:126, in SPSAOptimizer.backward(self, batch)
    124 self.model.weights = xplus
    125 y0 = self.model(diagrams)
--> 126 loss0 = self.loss_fn(y0, targets)
    127 xminus = self.project(x - self.ck * delta)
    128 self.model.weights = xminus

Input In [13], in <lambda>(y_hat, y)
----> 1 loss = lambda y_hat, y: -np.sum(y * np.log(y_hat)) / len(y)  # binary cross-entropy loss
      3 acc = lambda y_hat, y: np.sum(np.round(y_hat) == y) / len(y) / 2  # half due to double-counting
      4 eval_metrics = {"acc": acc}

ValueError: operands could not be broadcast together with shapes (30,2) (30,)

I have just fixed the .py file in the library following #12. The algorithm raised an error even before that fix; I can't recall exactly, but I don't think it was the same error.

What can I do to solve this?
Thank you for your time.

@Thommy257 (Contributor)

Hi,
it seems that your model's output shape doesn't match the shape of the labels. How do you generate your diagrams?
Also, the loss function from the tutorial notebooks is designed for 2-d outputs, so if your model yields a scalar value per sentence, you need to modify it accordingly.
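
For reference, a minimal sketch of how the tutorial's loss could be adapted for 1-d (scalar per sentence) outputs; the function name and the epsilon guard are illustrative, assuming y_hat is a 1-d array of probabilities and y a 1-d array of 0/1 labels:

import numpy as np

# Binary cross-entropy for 1-d outputs: y_hat holds probabilities in (0, 1),
# y holds 0/1 labels of the same shape.
def bce_1d(y_hat, y):
    eps = 1e-9  # guard against log(0)
    return -np.mean(y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))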

@Stephenito (Author)

Hi,
the labels are 2-d arrays, like in the documentation's example.
I tried changing the labels to a 1-d array (by changing the read_data function), and the program behaves strangely: it gets stuck after the first epoch, with a loss of 0 on both the validation and training datasets.
To rule out a mistake in my own code, I also tried your full-code trainer_quantum example, but I got the same behaviour.
The data is in this format:

1 woman teaches simple categories
1 woman describes simple maths

I think it got parsed correctly, as the dataset arrays match your runs.

I will try to look at it in the next few days. Thanks for your help!

@dimkart (Contributor) commented Jun 9, 2022

@Stephenito Hi -- As @Thommy257 said, the problem is that while your labels are 2-d (as you confirm), the output of the model is 1-d (a scalar per sentence). After getting the output of the model, you need to convert it into 2-d before passing it to the loss function. Hope this helps.
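
As an illustration, one hypothetical way to do this conversion, assuming the model returns a 1-d array of probabilities for the positive class:

import numpy as np

# Expand a 1-d array of probabilities p into the 2-d [p, 1 - p] layout
# that the tutorial's labels and loss function expect.
def to_two_class(y_hat_1d):
    y_hat_1d = np.asarray(y_hat_1d)
    return np.stack([y_hat_1d, 1 - y_hat_1d], axis=1)  # shape (N, 2)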

@Stephenito (Author)

Hi, instead of working with the model I made the labels 1-d. As I said before, the error is gone, but the training behaves strangely.
Each epoch takes about 40 seconds, and the output looks like this:

Epoch 1: train/loss: 0.0000 valid/loss: 0.0000 train/acc: 0.2458 valid/acc: 0.3000
Epoch 2: train/loss: 0.0000 valid/loss: 0.0000 train/acc: 0.2458 valid/acc: 0.3000

Training completed!

I tried with the following sample splits:

  • 120 train, 10 test;
  • 70 train, 30 test;

but the train loss is still 0. What am I missing? Is it a problem with the dataset? Should I have more samples? Or do you think it's still a problem with the procedure?
Thanks again!

@dimkart (Contributor) commented Jun 16, 2022

Have you also adjusted your loss function? Or does it still assume your labels are 2-d?

@Stephenito (Author)

Yes, I adjusted it for scalar values, but the behaviour is the same.
I then tried modifying the ansatz to make the output 2-d (so it works as in the beginning, with 2-d labels), and now it's giving normal values, even though I haven't really understood what an ansatz is or how to design one.
I have one last question: why is it so slow? What should I modify to make it faster?
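
For context, a sketch of the tutorial-style ansatz that yields a 2-d output by leaving a single qubit open on the sentence wire; the hyperparameters are illustrative, and diagrams is assumed to hold the parsed sentence diagrams:

from lambeq import AtomicType, IQPAnsatz

# One qubit per noun and sentence wire; the open sentence-wire qubit
# produces a length-2 output distribution per circuit.
ansatz = IQPAnsatz({AtomicType.NOUN: 1, AtomicType.SENTENCE: 1},
                   n_layers=1, n_single_qubit_params=3)
circuits = [ansatz(diagram) for diagram in diagrams]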

@ACE07-Sev (Contributor)

Hi, I am getting the same error.

@ACE07-Sev (Contributor)

This error arises when one or more diagrams have 2 output wires. One way to resolve it is to check all the diagrams manually and see which sentences end up with 2 output wires instead of a single S output wire. In my experience it's usually the sentences that start with a verb, such as:
"Do not come here", "Learn how to drive", "Kill the traitors", "Love your neighbors", etc. If you have too many instances to check, just make sure they all start with a noun, like "I", "you", "he", "she", "they", "man", "woman", "it", "person", names, etc. (see the sketch below).
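
A minimal sketch of such a check (assuming lambeq's BobcatParser and diagrams that expose their codomain as a discopy rigid type):

from lambeq import BobcatParser
from discopy.rigid import Ty

parser = BobcatParser()
sentences = ['Do not come here', 'woman teaches simple categories']
diagrams = parser.sentences2diagrams(sentences)

# Flag every sentence whose diagram does not reduce to a single S wire.
s = Ty('s')
for sentence, diagram in zip(sentences, diagrams):
    if diagram.cod != s:
        print(f'{sentence!r} has codomain {diagram.cod}, not {s}')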

@y-richie-y (Collaborator)

@ACE07-Sev your issue is different: it arises from Bobcat correctly parsing imperative sentences to the pregroup type n.r @ s. For example:

     Tell      me      what     you    think                                                                                         
─────────────  ──  ───────────  ───  ─────────
n.r·s·n.l·n.l  n   n·n.l.l·s.l   n   n.r·s·n.l
 │  │  │   ╰───╯   │   │    │    ╰────╯  │  │
 │  │  ╰───────────╯   │    ╰────────────╯  │
 │  │                  ╰────────────────────╯

@y-richie-y (Collaborator)

@Stephenito Since the original issue has been resolved, I will close it.

The TketModel is typically used with IBM's Aer simulator, which is much slower than NumpyModel.
If you run into performance problems, please open a new issue.
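
For anyone with the same speed issue, a minimal sketch of the swap; train_circuits and val_circuits are assumed to be the parameterised circuits built as in the tutorials:

from lambeq import NumpyModel

# NumpyModel contracts the circuits exactly as tensor networks instead of
# simulating shots on Aer, which is typically much faster than TketModel.
model = NumpyModel.from_diagrams(train_circuits + val_circuits)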
