evaluation part #5

Closed
un-lock-me opened this issue Dec 13, 2017 · 9 comments

Comments

@un-lock-me

Thank you very much, you helped a lot.
I ran the code on the NIPS dataset and it is working.
One question: why does the output show only one level?
For example, part of my output looks like this:
`0.278 cortex stimulus
0.228 firing spike
0.205 mixture expert
0.421 pixel theorem character cluster energy
0.205 speech classifier
0.202 circuit voltage`

Lastly, how can I run the evaluation? You used topic coherence for evaluation; could you give me step-by-step directions for reproducing that result as well?

Thanks for taking the time :)

un-lock-me changed the title from "validation part" to "evaluation part" on Dec 13, 2017
kmpoon assigned and then unassigned kmpoon on Dec 14, 2017
@kmpoon
Owner

kmpoon commented Dec 14, 2017

  1. The output does not really show the topic hierarchy. It only shows the keywords of a topic in each line. You may extract the topic hierarchy using tm.hlta.ExtractTopics and then open the HTML file to see the tree structure of the topic model.

  2. It seems that we have only uploaded the code for computing the log-likelihood of test data. The topic coherence code was run by my colleague Peixian Chen, and I may not have integrated it into the repository yet. You could ask @clairepchen to send you the code for computing topic coherence.
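Until that code is available, topic coherence can be approximated independently. The sketch below is not the code used for the HLTA experiments; it assumes a UMass-style definition of coherence (the sum over keyword pairs of log((D(w_i, w_j) + 1) / D(w_j)), where D counts documents), and the function name `umass_coherence` and the toy documents are hypothetical.

```python
import math
from itertools import combinations


def umass_coherence(topic_words, documents):
    """UMass-style coherence for one topic (assumed definition, not HLTA's code).

    topic_words: keywords of the topic, in the order they are listed.
    documents:   iterable of tokenized documents (lists of words).
    """
    doc_sets = [set(doc) for doc in documents]

    def doc_freq(*words):
        # Number of documents containing all of the given words.
        return sum(1 for d in doc_sets if all(w in d for w in words))

    score = 0.0
    # Every pair (earlier word, later word) in the keyword list.
    for j, i in combinations(range(len(topic_words)), 2):
        earlier, later = topic_words[j], topic_words[i]
        d_earlier = doc_freq(earlier)
        if d_earlier == 0:
            continue  # a word absent from the corpus contributes nothing
        score += math.log((doc_freq(later, earlier) + 1) / d_earlier)
    return score


# Toy corpus, for illustration only.
docs = [
    ["cortex", "stimulus", "spike"],
    ["cortex", "stimulus"],
    ["circuit", "voltage"],
]
print(umass_coherence(["cortex", "stimulus"], docs))  # words that co-occur
print(umass_coherence(["cortex", "voltage"], docs))   # words that never co-occur
```

Higher (less negative) scores indicate keywords that tend to appear in the same documents. The per-topic scores are usually averaged over all topics of a model before comparing models.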

@un-lock-me
Author

As in the last part of your example, I ran tm.hlta.ExtractTopic, so my output is the output of that step.
Do I have to run another file to show the hierarchy?

Yes, I will contact @clairepchen then.
Could you also tell me which file computes the log-likelihood? (There are some files in org.latlab, but I'm not sure whether they are the right ones.)

Thanks again :)

@kmpoon
Owner

kmpoon commented Dec 14, 2017

There should be an HTML file (e.g. sample.html) after running tm.hlta.ExtractTopic. You may open that HTML file in a browser to see the hierarchy.

The step for computing loglikelihood is shown in the Testing section.

@un-lock-me
Author

un-lock-me commented Dec 14, 2017

I have already opened the output folder, which contains two HTML files. Their content is the same as that of the sample HTML file.
I mean like this:
0.379 node tree hidden-unit module
0.136 analog chip
0.242 channel filter source
0.202 circuit voltage
0.228 firing spike
0.255 synaptic fig_ recurrent
0.278 cortex stimulus
0.293 orientation object
This is the same as the earlier output from the output file.
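When such output is copied out of the HTML page, the topics can run together. A minimal sketch for splitting them apart again, assuming each topic starts with a decimal weight followed by its keywords (a format inferred only from the output shown in this thread; `split_topics` is a hypothetical helper, not part of HLTA):

```python
import re

# A topic is assumed to start with a decimal weight such as "0.379".
WEIGHT = re.compile(r"^\d+\.\d+$")


def split_topics(text):
    """Split whitespace-separated tokens into (weight, keywords) pairs."""
    topics = []
    for token in text.split():
        if WEIGHT.match(token):
            # A new weight starts a new topic.
            topics.append((float(token), []))
        elif topics:
            # Any other token is a keyword of the current topic.
            topics[-1][1].append(token)
    return topics


flat = "0.379 node tree hidden-unit module 0.136 analog chip 0.242 channel filter source"
for weight, words in split_topics(flat):
    print(f"{weight:.3f}  {' '.join(words)}")
```

This only recovers a flat list of topics. It cannot recover the hierarchy, because the parent-child links are not present in this text.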

@kmpoon
Owner

kmpoon commented Dec 14, 2017

Try opening the HTML file in the base directory rather than the ones in the topic_output directory. For example, if you run the Quick Example, you will find sample.html in the base directory:

├── extracted
│   ├── Chen2016-Latent\ Tree\ Models\ for\ Hierarchical\ Topic.txt
│   ├── Chen2016-Progressive\ EM\ for\ Latent\ Tree\ Models\ and\ Hierarchical.txt
│   ├── Poon2010-Variable\ Selection\ in\ Model-Based\ Clustering\ To\ Do\ or\ To\ Facilitate.txt
│   ├── Poon2013-Model-Based\ Clustering\ of\ High-Dimensional\ Data\ Variablea.txt
│   ├── Poon2017-Clustering\ with\ Multidimensional\ Mixture\ Modelsa.txt
│   ├── Poon2017-Topic\ Browsing\ System\ for\ Research\ Papersa.txt
│   ├── Poon2018-UC-LTM\ Unidimensional\ Clustering\ Using\ Latenta.txt
│   ├── Zhang2017-Latent\ Tree\ Analysis.txt
│   └── liu-n-ecml14.txt
├── fonts
│   ├── glyphicons-halflings-regular.eot
│   ├── glyphicons-halflings-regular.svg
│   ├── glyphicons-halflings-regular.ttf
│   ├── glyphicons-halflings-regular.woff
│   └── glyphicons-halflings-regular.woff2
├── lib
│   ├── 32px.png
│   ├── 40px.png
│   ├── bootstrap.min.css
│   ├── bootstrap.min.js
│   ├── custom.css
│   ├── custom.js
│   ├── ie10-viewport-bug-workaround.css
│   ├── ie10-viewport-bug-workaround.js
│   ├── jquery.min.js
│   ├── jquery.tablesorter.min.js
│   ├── jquery.tablesorter.widgets.js
│   ├── jstree.min.js
│   ├── magnific-popup.css
│   ├── style.min.css
│   ├── tablesorter.css
│   └── throbber.gif
├── model.beforeGlobalEM.bif
├── model.bif
├── pdfs
│   ├── Chen2016-Latent\ Tree\ Models\ for\ Hierarchical\ Topic.pdf
│   ├── Chen2016-Progressive\ EM\ for\ Latent\ Tree\ Models\ and\ Hierarchical.pdf
│   ├── Poon2010-Variable\ Selection\ in\ Model-Based\ Clustering\ To\ Do\ or\ To\ Facilitate.pdf
│   ├── Poon2013-Model-Based\ Clustering\ of\ High-Dimensional\ Data\ Variablea.pdf
│   ├── Poon2017-Clustering\ with\ Multidimensional\ Mixture\ Modelsa.pdf
│   ├── Poon2017-Topic\ Browsing\ System\ for\ Research\ Papersa.pdf
│   ├── Poon2018-UC-LTM\ Unidimensional\ Clustering\ Using\ Latenta.pdf
│   ├── Zhang2017-Latent\ Tree\ Analysis.pdf
│   └── liu-n-ecml14.pdf
├── sample.arff
├── sample.dict-0.csv
├── sample.dict-1.csv
├── sample.dict-2.csv
├── sample.files.txt
├── sample.html
├── sample.nodes.js
├── sample.sparse.txt
├── sample.txt
├── sample.whole_dict-0.csv
├── sample.whole_dict-1.csv
├── sample.whole_dict-2.csv
└── topic_output
    ├── TopicBase.txt
    ├── TopicsTable-Level-1.html
    └── TopicsTable.html

@un-lock-me
Author

I see your point: you mean the sample.html in the base directory.
My point is that the content of sample.html is the same as the content of TopicsTable.html and TopicsTable-Level-1.html.

Sorry for taking up your time.
If you think my dataset and parameters may be the reason only one level was produced, I will try changing them.

@kmpoon
Owner

kmpoon commented Dec 14, 2017

Yes, it is possibly due to the data set. For example, on the NIPS dataset with 1k to 10k words, Peixian got a hierarchy with 4 to 6 levels (according to the AAAI-2016 paper).

@un-lock-me
Author

Thank you very much, I will go through that paper as well.

I will work on that :)
Many thanks for your time.

@kmpoon
Owner

kmpoon commented Jan 29, 2018

I have uploaded a main method for computing the topic coherence score. See the README for more details.

kmpoon closed this as completed on Feb 1, 2018