These two phrase nets did not tell me very much about my texts. Is there a way to avoid this kind of result when working with PDFs that contain a lot of embedded text/metadata?
By adding your own stop words (1 per line) to the file "stopwords.txt" in the Paper Machines data folder, you should be able to get a clearer picture of your data. I will shortly add the ability to add stop words through a comma-separated list in the preferences.
When I search my computer for files called stopwords.txt and open the results from the Paper Machines data folder (stopwords, stopwords_en, stopwords_pt, search_stopwords), I don't see "lines" that would allow me to add one stop word per line. I just see a sort of unbroken stream of stop words that don't even have spaces between them. Should I just go to the end and start typing additional stop words? If so, how will it know where I mean to delimit them? Thanks, and sorry to be ignorant!
Ah, the line endings are in Unix format rather than Windows, so the file shows up for you without line breaks. I've already implemented a preference pane that will allow additional entries, one per line, so you won't have to navigate to the file or anything. That will probably be released tonight, or as soon as I figure out a bug with geodict (it's about 90% there).
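In the meantime, for anyone who would rather script the change than edit the file by hand, here is a minimal sketch of appending stop words while keeping the Unix line endings. It is not part of Paper Machines: the function name `append_stop_words` is made up for illustration, the path you pass in is whatever your own data folder contains, and it assumes the file is UTF-8 with one single-word entry per line.

```python
from pathlib import Path


def append_stop_words(path, new_words):
    """Append words to a one-per-line stop word file using Unix (LF) line endings.

    Hypothetical helper, not part of Paper Machines. Returns the words
    that were actually added (duplicates and blanks are skipped).
    """
    path = Path(path)
    # Read raw bytes so existing line endings are not translated by Python.
    data = path.read_bytes() if path.exists() else b""
    # Assumption: entries are single words, so splitting on whitespace is enough.
    existing = set(data.decode("utf-8").split())

    to_add = []
    for word in new_words:
        word = word.strip()
        if word and word not in existing:
            to_add.append(word)
            existing.add(word)
    if not to_add:
        return []

    # If the last existing entry has no trailing newline, add one first so
    # the new word does not get glued onto it.
    prefix = b"" if (not data or data.endswith(b"\n")) else b"\n"
    payload = "".join(word + "\n" for word in to_add).encode("utf-8")
    with path.open("ab") as handle:
        handle.write(prefix + payload)
    return to_add
```

Calling it with the full path to stopwords.txt in your Paper Machines data folder and a list of words would add each new word on its own line. Writing in binary mode is deliberate: on Windows, text mode would turn each line break into a Windows-style one, leaving the file with mixed line endings.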
Terrific! Meanwhile, I'll try writing in German to the Austrian National Library, which maintains http://europeana-geo.isti.cnr.it/geoparser, to ask if/when they're planning to bring that back online.