Extract text for paper machines not working #22
Comments
Glad to hear it's working now! Out of curiosity, where was Python before?
konini22, can you explain where you relocated to/from? I'm having the same problem. Thanks!
Hi Chris and mebrett,
It ought to have been able to find Python there if you gave it the path explicitly; I'll have to do some testing on Windows to sort that one out.
Classification is currently marked as an "experimental feature"; it's not quite ready for use yet. Currently, it takes documents that are manually organized into subcollections as training data for a Maximum Entropy classifier. This classifier can then be used on another, larger set of documents to give the probability that each one belongs to a given category from the training data. I've been meaning to have it create a new hierarchy of subcollections with the documents reorganized accordingly, but I haven't gotten to it yet. If you're curious what the raw data looks like, you can enable it from "Paper Machines Preferences..." by checking "Enable Experimental Features"; running the trainer and then testing it will give you the raw probabilities.
What did you specifically want to do with it? Were you hoping to do unsupervised clustering, or to do some other kind of classification than the one I describe? I'd be happy to expand Paper Machines to be useful for your application if it seems doable.
P.S. mebrett, can you please tell me:
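For readers unfamiliar with the classification workflow described in the comment above, here is a minimal sketch of the idea: train a Maximum Entropy (multinomial logistic regression) model on documents grouped by subcollection, then ask it for per-category probabilities on a new document. This is a pure-Python stand-in, not Paper Machines' own code; the function names, the bag-of-words features, and the toy subcollections are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def predict_proba(weights, words):
    """Return a {category: probability} dict for one tokenized document."""
    # Score each category by summing its weights for the words present.
    scores = {c: sum(wv.get(w, 0.0) for w in words) for c, wv in weights.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


def train_maxent(docs, labels, epochs=200, lr=0.5):
    """Fit per-category word weights by stochastic gradient ascent."""
    weights = {c: defaultdict(float) for c in sorted(set(labels))}
    for _ in range(epochs):
        for words, gold in zip(docs, labels):
            probs = predict_proba(weights, words)
            for c in weights:
                # Log-likelihood gradient: observed minus expected indicator.
                step = lr * ((1.0 if c == gold else 0.0) - probs[c])
                for w in words:
                    weights[c][w] += step
    return weights


# Hypothetical training data: each document is labeled with its subcollection.
train_docs = [["war", "empire", "treaty"], ["cell", "gene", "protein"]]
train_labels = ["history", "biology"]

model = train_maxent(train_docs, train_labels)
probs = predict_proba(model, ["empire", "war", "gene"])
print(probs)
```

The printed dictionary corresponds to the "raw probabilities" mentioned above: one value per training category, summing to 1.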
I'm running OS X 10.7.5 and Python 2.7.3 (I was running 2.7.1 but recently updated). I'm running Zotero Standalone (version 3.0.8). Since I updated Zotero yesterday, I'm no longer having problems with extract text not working. I did not check what version I was running before, but I think it must have been a very early 3.0.x.
Glad to hear it! Zotero 3.0.9 is soon to come out, as is Firefox 17, so I'll be sure to test with those versions. Please let me know if you run into any other problems.
Thank you, Chris. I'm searching for a collection analysis/development tool that is compatible with Zotero. I'm keeping book requests from academics (sorted into collections by academics and departments). Each entry contains its LC classification as the call number and its LC subject headings as tags (headings and subheadings are tagged separately, e.g. Romanticism, 19th century, History and Criticism, Great Britain, etc.). I was hoping that Paper Machines could create a word cloud with phrases from the tags, but I haven't figured out how to do it (if it does that at all). That would help me take in the collection at a glance. I haven't tried SEARS because of the warning. If I could group LC classification numbers in some way, that would help me target specific classification ranges to build library collections on, but at the moment I have to manually analyse classification numbers. If I could get my Zotero collections to talk to WorldCat or Open Library through tags and call numbers and get suggestions (like Amazon's recommendations) on what else to buy based on my collections, that would streamline my collection development tasks hugely.
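The grouping described in the comment above can be sketched outside Paper Machines. The following is a rough illustration, assuming the call numbers and tags have already been exported from Zotero as plain strings; `lc_class`, `group_by_class`, and `tag_frequencies` are hypothetical helper names, and the sample records are made up. It only reads the leading class letters of each call number (such as `PR` or `QA`), which is a simplification of full LC call number parsing.

```python
import re
from collections import Counter

# One to three leading letters identify the LC class or subclass.
LC_CLASS = re.compile(r"^\s*([A-Z]{1,3})")


def lc_class(call_number):
    """Return the leading LC class letters, or None if there are none."""
    match = LC_CLASS.match(call_number.upper())
    return match.group(1) if match else None


def group_by_class(call_numbers):
    """Count how many call numbers fall under each LC class."""
    counts = Counter()
    for call_number in call_numbers:
        counts[lc_class(call_number) or "unparsed"] += 1
    return counts


def tag_frequencies(tag_lists):
    """Count tag phrases across items; the counts could feed a word cloud."""
    return Counter(tag.strip() for tags in tag_lists for tag in tags if tag.strip())


# Made-up sample data for illustration only.
call_numbers = ["PR4803 .H44 2010", "PR590 .R6", "QA76.73 .P98", "DA530 .T5", ""]
tags = [["Romanticism", "19th century"], ["Romanticism", "Great Britain"]]

class_counts = group_by_class(call_numbers)
tag_counts = tag_frequencies(tags)
print(class_counts.most_common())
print(tag_counts.most_common())
```

Counting by class letters gives a quick view of which ranges dominate the requests; finer ranges would need the numeric part of the call number as well.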
I'm running Zotero 3.0.13 for Firefox (I have Firefox 18.0.2) on a Windows machine. I have a collection with about 900 attached PDFs, but when I right-click on my collection folder and click Extract Text for Paper Machines, nothing happens. The Word Cloud, Phrase Net, etc. menus remain grayed out. Restarting Firefox does not help. Question: where would I find out what version of Python I am running? If I am not running Python, how do I get Python? I'm a librarian and excited about teaching a graduate student workshop on Paper Machines in mid-March, but I need to make it work on my own collection first!
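On the version question above: if Python is installed, one way to check is to start the interpreter and ask it directly. The snippet below is a small sketch using only the standard `sys` module (available in both Python 2.7 and Python 3); `describe_python` is a hypothetical helper name. If the `python` command is not recognized at all, Python is probably not installed or not on the system path.

```python
import sys


def describe_python():
    """Report the running interpreter's version and where it lives on disk."""
    version = "%d.%d.%d" % sys.version_info[:3]
    return {"version": version, "executable": sys.executable}


info = describe_python()
print("Python version: " + info["version"])
print("Interpreter path: " + str(info["executable"]))
```

The interpreter path is the value to supply if a tool asks for the location of Python explicitly.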
So I got help installing Python 2.7.3. Then I removed and reinstalled Paper Machines. This time, I was able to extract the texts, or at least I got a progress bar saying "Searching for files to extract" for a few minutes; then a Firefox tab opened and I got "Extracting [my collection's name]" and a new progress bar, for another few minutes. Then a message appeared in the tab saying "Extracted 878 out of 878 new texts. This window can now be closed." I closed the window. But the Word Cloud, Phrase Net, etc. are still grayed out in the right-click menu when I right-click my collection folder.
Hi there, (sorry for the accidental close of issue)
In general, clicking on another collection (or the trash, etc.) in the left-hand pane, then back on the original collection will make the visualization options appear. Failing this, a restart of the application will usually enable them; sometimes the database doesn't update properly on the first extraction. Please let me know if this doesn't work -- for troubleshooting purposes, it'd be great to see any output in the Error Console that mentions Paper Machines (it is located under the Tools menu -> Web Developer).
Also, given the difficulties of getting the correct version of Python installed, I have been working on a version that requires only Java to run. The release is pending, but you are welcome to try it by installing the following: https://www.dropbox.com/s/rwooxwwlls991w0/papermachines-0.4.0pre2.xpi
There are several other improvements in this version that clarify the user interface for topic modeling in particular. It will receive an "official" release once our new website is launched.
Thanks! The tools were no longer grayed out when I restarted Firefox this morning. I am now happily applying them to the extracted text. I will mention the Java-only download to the grad students during my workshop March 15-16 - do you know how soon the "official" release and new website will launch? Are you thinking days, weeks, or months? Thanks again for your assistance.
Hi Chris,
I'm using the latest version of Zotero for Firefox (Firefox version 16.0.2). Once it finishes extracting text, Firefox opens a blank new tab, but all visualization options remain greyed out. I looked into past issues #5 and #12 and tried to replicate the solutions, but it still doesn't work. I extracted text of My Library and also a collection, but both attempts failed.
I got numerous error messages saying "Error in parsing value for ... Declaration dropped."
Can you help me with this, please?
Sorry, Chris,
The problem's been solved. I just had to relocate the Python file. Cheers.