Caching Large Objects #572

Closed
yihui opened this Issue Jul 17, 2013 · 10 comments

Comments

4 participants
yihui (Owner) commented Jul 17, 2013

https://groups.google.com/forum/#!topic/knitr/3mPn2neMdrk

I can provide a new option, cache.lazy = TRUE/FALSE, if R core is not going to fix the problem.
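
For context, knitr's cache appears to be built on R's lazy-load databases, and it is the serialization step there that trips the long-vector limit. A minimal sketch of the two strategies such an option would switch between (the environment `e` and file base `'db'` are placeholders, not knitr internals):

```r
# Lazy strategy (knitr's default): write a lazy-load database so objects
# are only deserialized when first touched. The write goes through a
# connection, which is where "long vectors not supported yet" surfaces.
e <- new.env(); e$x <- rnorm(10)     # stand-in for a chunk's objects
tools:::makeLazyLoadDB(e, 'db')      # creates db.rdb / db.rdx
lazyLoad('db', envir = globalenv())  # objects load on demand

# Eager strategy (what cache.lazy = FALSE would use): plain save()/load(),
# which handles large objects but reads everything back up front.
save(list = ls(e), envir = e, file = 'db.RData')
load('db.RData', envir = globalenv())
```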

RussianImperialScott commented Oct 27, 2013

I would love to see this happen! I repeatedly work on very large datasets and things balloon to over 3 GB very easily.

yihui (Owner) commented Oct 27, 2013

Thanks for the interest! I'll certainly fix this issue since I see two real cases now :)

RussianImperialScott commented Oct 27, 2013

My current workaround would have been something like this (not sure if this would invite disaster):

```{r sample-chunk, cache = TRUE, cache.vars = setdiff(ls(), 'large_object'), eval.after = 'cache.vars'}

```

Followed by a `save(large_object)` at the end of the chunk.

Thanks a lot! It would be amazing to not think about this.
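
A filled-in version of that workaround might look like the following; the object, its contents, and the file name are hypothetical, and `cache.vars` (a real chunk option listing the variables to store in the cache) is the piece doing the work. Since `ls()` must run after the chunk has produced its objects, deferring its evaluation would likely require `opts_knit$set(eval.after = 'cache.vars')` in a setup chunk rather than a chunk-header argument:

```{r sample-chunk, cache = TRUE, cache.vars = setdiff(ls(), 'large_object')}
large_object <- matrix(rnorm(4e6), ncol = 4)     # stand-in for a huge object
small_summary <- colMeans(large_object)          # small results go in the cache
save(large_object, file = 'large_object.RData')  # persist the big one manually
```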

yihui (Owner) commented Oct 28, 2013

This should be easy to fix, but I'll be super busy for the next few weeks. I'll try, but no guarantee at the moment. If you can work it out and send me a pull request, that will make progress much faster :)

RussianImperialScott commented Oct 28, 2013

I totally understand. Yeah, I'll give it a try!

Scott


Mattrition commented Nov 24, 2013

Hi,

Just to let you know that I encountered this problem today after trying to cache ~5.9 GB of data, so I'm also eager to see a fix for this! My understanding of "large objects" is that they are vectors whose length exceeds 2^31 - 1, where the length cannot be stored due to integer overflow. My tables are all around 12-16 million rows long, which is big but does not approach that limit, so this error is a bit confusing.

Will be trying the workarounds suggested until this is fixed.

Thanks,
Matt
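
The limit in question is on the serialized byte stream written through a connection, not on row counts, which likely explains the confusion; a quick check in R:

```r
# The 2^31 - 1 limit bites on bytes serialized through a connection,
# not on the number of rows in a table:
.Machine$integer.max  # 2147483647, i.e. ~2 GiB when counted in bytes
5.9 * 1024^3          # ~6.3e9 bytes: a 5.9 GB object far exceeds the limit
16e6 < 2^31 - 1       # TRUE: 16 million rows is nowhere near the limit
```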

yihui added a commit that referenced this issue Nov 26, 2013

yihui closed this in 06e80d6 Nov 26, 2013

yihui (Owner) commented Nov 26, 2013

@ScottSimpkins @Mattrition I added a new chunk option `cache.lazy`; in your cases, you should be able to use `cache.lazy = FALSE` to avoid lazy loading now. Please test the development version if you can. Thanks!
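
A minimal usage sketch (the chunk label and file name are hypothetical; `cache.lazy` itself is the new chunk option):

```{r big-data, cache = TRUE, cache.lazy = FALSE}
# cache.lazy = FALSE makes the cache use save()/load() instead of a
# lazy-load database, sidestepping the long-vector serialization error.
big_table <- read.csv('huge_file.csv')  # stand-in for a multi-GB object
```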

gforge commented Aug 27, 2014

Just want you to know that the fix works. If possible, a more informative error would be nice; it took me a while to find this bug fix.

yihui (Owner) commented Aug 27, 2014

@gforge Thanks! Unfortunately I have no control over the error message `long vectors not supported yet: connections.c`. It is from base R.

gforge commented Aug 27, 2014

I was thinking that you could catch it, augment it with a suggestion, and then throw it again. Thus it would be the same error, but with a useful suggestion of where to start. When working with these large datasets, it is painful to debug, and any help can be useful.
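
A sketch of that catch-augment-rethrow idea in base R (the wrapper name and hint text are hypothetical; this is not what knitr actually implements):

```r
# Hypothetical wrapper around the cache-writing step: catch the base R
# error, append a hint about cache.lazy, and re-throw everything else.
with_cache_hint <- function(expr) {
  tryCatch(expr, error = function(e) {
    msg <- conditionMessage(e)
    if (grepl('long vectors not supported', msg, fixed = TRUE)) {
      stop(msg, ' (hint: for objects over ~2 GB, try the chunk option ',
           'cache.lazy = FALSE)', call. = FALSE)
    }
    stop(e)  # re-throw unrelated errors unchanged
  })
}
```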

petrelharp added a commit to petrelharp/landscape_geometry that referenced this issue Jan 4, 2016

yihui added a commit that referenced this issue Oct 12, 2016
