This repository has been archived by the owner on May 10, 2022. It is now read-only.

Some crm_pdf/cr_text tests failing #46

Closed
sckott opened this issue May 8, 2020 · 6 comments

@sckott
Contributor

sckott commented May 8, 2020

Probably related to the many recent changes in crm_text and crm_pdf, but I haven't been able to sort out what's going on. The tests seem fine when the vcr usage is commented out, though, so it may have something to do with file caching or writing to disk.

@sckott sckott added this to the v0.4 milestone May 8, 2020
@fangzhou-xie

I wonder if the following example falls into this category? My package version is: [1] crminer_0.3.4.95.

> l <- crm_links("10.1023/a:1009865221699")
> l
$unspecified
<url> http://academic.oup.com/rof/article-pdf/3/3/343/26321080/3-3-343.pdf
> t <- crm_text(l, "pdf", overwrite_unspecified = T, useragent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_4) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1 Safari/605.1.15")
Downloading pdf...
> t
[1] "~/Library/Caches/R/crminer/3-3-343.pdf"

The return value of crm_text is now the path of the cached PDF file, no longer the extracted full text.
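One quick way to spot this symptom in a script is to check whether the result is just a single string ending in .pdf. The sketch below uses a hypothetical helper, looks_like_pdf_path, which is not part of crminer; it assumes only that a correct result would not be a bare .pdf path string.

```r
# Hypothetical helper (not part of crminer): TRUE when `x` is a single
# string ending in ".pdf", i.e. it looks like a file path rather than
# extracted text.
looks_like_pdf_path <- function(x) {
  is.character(x) && length(x) == 1 && grepl("\\.pdf$", x, ignore.case = TRUE)
}

# The value returned in the session above
looks_like_pdf_path("~/Library/Caches/R/crminer/3-3-343.pdf")  # TRUE

# Any non-string object (stand-in for a parsed result) is not flagged
looks_like_pdf_path(list(text = "full text"))                   # FALSE
```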

@sckott
Contributor Author

sckott commented May 10, 2020

Yeah, I think that's related. It's not surprising this is happening, given the complexity of the problem. Thanks for the comment.

@fangzhou-xie

I understand. A lot has changed recently, and we all hope the package becomes more helpful for everyone.

@sckott
Contributor Author

sckott commented Jun 16, 2020

I can no longer replicate the problem you're having above. Let me know if you still have the same problem after installing the latest version from here.

@fangzhou-xie

As mentioned in #49, the problem disappeared once I removed all the cached PDFs and updated the package to the newest development version (though I'm not sure what caused the problem to begin with).

Anyway, thanks a lot! I closed #49 as well.
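
For anyone hitting the same symptom, the cache-clearing step described above can be sketched in base R. This is a minimal sketch, not the package's own method: clear_cached_pdfs is a hypothetical helper, and the default directory is simply the macOS path printed in the session earlier in this thread, so it will differ on other systems. crminer may offer its own cache management; this sketch deliberately relies on base R only.

```r
# Hypothetical helper (not part of crminer): remove cached PDF files from a
# directory. The default path is the one shown in the session above and is
# an assumption for any other machine.
clear_cached_pdfs <- function(cache_dir = "~/Library/Caches/R/crminer",
                              dry_run = TRUE) {
  cache_dir <- path.expand(cache_dir)
  # Collect every file in the directory whose name ends in .pdf
  pdfs <- list.files(cache_dir, pattern = "\\.pdf$",
                     full.names = TRUE, ignore.case = TRUE)
  # Delete only when explicitly asked to
  if (!dry_run && length(pdfs) > 0) {
    unlink(pdfs)
  }
  # Return the matched paths invisibly so callers can inspect them
  invisible(pdfs)
}

# List what would be removed, without deleting anything
print(clear_cached_pdfs(dry_run = TRUE))
```

Calling clear_cached_pdfs(dry_run = FALSE) performs the deletion. The dry-run default is a design choice so that nothing is removed until the listed files have been checked.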

@sckott
Contributor Author

sckott commented Jun 16, 2020

Great, glad it works now

@sckott sckott closed this as completed Jun 16, 2020