Adding a neural topic model baseline #30

Closed
farinamhz opened this issue Mar 24, 2023 · 11 comments

@farinamhz
Member

Here we are going to add the CTM baseline to the pipeline as a neural topic model.

@farinamhz farinamhz self-assigned this Mar 24, 2023
@farinamhz
Member Author

Hi @hosseinfani,
I added the CTM baseline and added the hide-function percentages to the evaluation section.
However, this new model has a problem: its evaluation takes far too long. The paper reports a much shorter time per epoch, but our training takes ~16 minutes.
We can handle the training time, but the evaluation time is unusually long.
For example, we evaluate 15% of 350 reviews, each with ~3 documents (sentences) on average, and inference takes almost 2 minutes per review.
This means that with 5 folds and 11 evaluations (hiding the aspect at 0, 10, 20, ..., 100 percent), it takes almost 4 days in total just to evaluate the results before back-translation!
I am running on a GPU, and at this rate we will not have time to test different values for each parameter.
I suspect something is wrong somewhere that makes it this slow, even though I followed their documentation.
That is the whole problem, and I would appreciate a meeting to talk about it if you have time.
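
For reference, the 4-day figure can be reproduced with a quick back-of-the-envelope calculation. This is only a sketch: it assumes all runs execute sequentially and that every review takes the same ~2 minutes; the function and parameter names are made up for illustration and are not part of the pipeline.

```python
def estimate_eval_days(n_reviews=350, test_fraction=0.15,
                       minutes_per_review=2.0, n_folds=5, n_hide_levels=11):
    """Rough wall-clock estimate for the whole evaluation, in days.

    Assumes sequential runs and a constant inference time per review.
    """
    # Reviews evaluated in a single run (15% of 350 is about 52.5)
    test_reviews = n_reviews * test_fraction
    # Minutes for one fold at one hide percentage
    minutes_per_run = test_reviews * minutes_per_review
    # 5 folds x 11 hide levels (0, 10, ..., 100 percent)
    total_minutes = minutes_per_run * n_folds * n_hide_levels
    return total_minutes / (60 * 24)


days = estimate_eval_days()
print(f"Estimated evaluation time: {days:.1f} days")
```

Under these assumptions the total comes to roughly 5,775 minutes, which is consistent with the "almost 4 days" above.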

@hosseinfani
Member

hosseinfani commented Mar 24, 2023

@farinamhz
We'll talk tomorrow.

Nonetheless, it's time to move your experiments to Compute Canada. We have a doc in General > Files > Library > Compute Canada guide that can help you.

@smh997 did you convert that doc into https://github.com/fani-lab/Library/blob/main/ComputeCanada.md?

@smh997
Member

smh997 commented Mar 24, 2023

@hosseinfani It is still in progress and needs to be finalized. I am adding the GPU part and expect to finish it by tomorrow (at least a first draft). In the meantime, I can share my experience with @farinamhz before I update the repo.

@farinamhz
Member Author

farinamhz commented Mar 25, 2023

Hi @hosseinfani,
The code and results for the CTM model, along with the changes to the evaluation, have been added.
All the results and their aggregation have also been added; you can see them in ../tree/main/output/English/Semeval-2016/25
#31

@farinamhz
Member Author

@hosseinfani
Result for CTM:

image

@farinamhz
Member Author

Fortunately, as with the other baselines, the results are reasonable for the first 5 selections, which is good news!

image

@farinamhz
Member Author

farinamhz commented Mar 25, 2023

However, CTM's results are lower than those of the other baselines.

@farinamhz
Member Author

Hi @hosseinfani,
These are the results for epoch = 10 and epoch = 100.
Unfortunately, when we increase the number of epochs to 100, the success values improve but the results after back-translation decrease!
(10 means epoch=10 and 100 means epoch=100)

image

@hosseinfani
Member

hosseinfani commented Mar 31, 2023

@farinamhz
Interesting! Can you do [10, 100, 200, 300, 400, 500, 1000] epochs and draw the same diagram?
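
A minimal sketch of how such a sweep could be organized, so that every epoch setting produces a pair of values for the diagram. The `run_experiment` callable is hypothetical (it stands in for whatever trains CTM for a given number of epochs and returns the metric before and after back-translation); it is not an existing function in the pipeline.

```python
from typing import Callable, Dict, Iterable, Tuple

# Epoch settings requested above
EPOCHS = [10, 100, 200, 300, 400, 500, 1000]


def sweep_epochs(
    run_experiment: Callable[[int], Tuple[float, float]],
    epochs: Iterable[int] = EPOCHS,
) -> Dict[int, Tuple[float, float]]:
    """Run one experiment per epoch setting and collect the results.

    `run_experiment` is a hypothetical callable that takes the number of
    epochs and returns (metric_before, metric_after) back-translation.
    """
    return {n_epochs: run_experiment(n_epochs) for n_epochs in epochs}


if __name__ == "__main__":
    # Placeholder experiment that returns dummy values, NOT real results
    results = sweep_epochs(lambda n_epochs: (0.0, 0.0))
    for n_epochs, (before, after) in results.items():
        print(n_epochs, before, after)
```

The resulting dictionary keeps the epoch counts in order, so it can be passed directly to whatever plotting code produced the earlier diagram.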

@hosseinfani
Member

We found https://github.com/MIND-Lab/OCTIS, which includes both neural and non-neural topic models.

Installing scikit-learn == 0.24.2 fails when installing on Python 3.10. I downgraded Python to 3.7, and it installed with no issues.

b.py
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for scikit-learn
Failed to build scikit-learn
ERROR: Could not build wheels for scikit-learn, which is required to install pyproject.toml-based projects
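
A likely explanation, stated here as an assumption rather than something verified in this thread, is that scikit-learn 0.24.2 only ships prebuilt wheels for Python 3.6 to 3.9, so on 3.10 pip falls back to building from source and fails. A small sketch of a guard that could be run before installing; the version range and the function name are assumptions for illustration.

```python
import sys

# Assumption: scikit-learn 0.24.2 provides prebuilt wheels for CPython 3.6-3.9 only
MIN_SUPPORTED = (3, 6)
MAX_SUPPORTED = (3, 9)


def has_prebuilt_wheel(version=None):
    """Return True if the given (major, minor) falls in the assumed wheel range.

    Defaults to the interpreter running this script.
    """
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    if not has_prebuilt_wheel():
        print("This Python version is outside the assumed wheel range for "
              "scikit-learn 0.24.2; consider an environment with Python 3.7.")
```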

@hosseinfani
Member

@farinamhz
We can close this issue. Let me know if you think otherwise.
