ordLORgee got killed due to memory issue #1

lzhtom · 2017-10-06T20:59:42Z

I have a dataset with 12 patients; each has one explanatory variable and one response variable, and each patient was measured 2000 times (I know it's huge!). When running ordLORgee the whole R session crashed, presumably because of a memory issue. I was wondering if there's any way to sidestep this memory issue when the number of repeated measurements is this large.
Thanks a lot!

AnestisTouloumis · 2017-10-09T11:47:08Z

It is hard for me to comment on the nature of the presumably memory-related issue without the code/data that cause the R session to crash.

My advice is first to fit the GEE model with the independence working model instead of a more complicated odds ratio structure. If the R session continues to crash, then you should either try a more powerful PC and/or HPC, or use a subset of your data (if there is a reasonable way to do this).
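For reference, here is a minimal sketch of how the independence working model is selected in a call to `ordLORgee`, through the `LORstr` argument. Since the data from this issue were not posted, it uses the `arthritis` dataset shipped with multgee as a stand-in; the formula and variable names belong to that example dataset, not to the reporter's data.

```r
library(multgee)

# Example data shipped with multgee (a stand-in, not the data from this issue).
data(arthritis)

# LORstr = "independence" requests the independence working model
# instead of the default local odds ratio structure.
fit <- ordLORgee(
  formula  = y ~ factor(time) + factor(trt) + factor(baseline),
  data     = arthritis,
  id       = id,
  repeated = time,
  link     = "logit",
  LORstr   = "independence"
)

summary(fit)
```

Whether this is enough to avoid the crash with 2000 repeated measurements per cluster depends on the data and the machine.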

lzhtom · 2017-10-09T12:57:22Z

Hi Anestis,

Thanks a lot for your reply. I feel that our problem is more about the choice of method than the implementation (code etc.). We actually tried a supercomputer with a large amount of memory, but it still couldn't fit the model with 2000 repeated measures under the independence structure.

If I may, please allow me to describe the problem more formally. Here the "patients" are actually drugs. At FDA we are developing a method to predict a drug's safety based on in vitro measurements (testing a drug's effect on a dozen cultured cells) and an in silico cardiac model (a systems-biology-type model that "translates" in vitro effects into in vivo effects).

For the in vitro measurements, even though we typically use only 10-20 cells per drug, that doesn't mean we have only 10-20 "repeats". These measurements go through a Hill equation to estimate the in vitro parameters (IC50 and h), and we use MCMC to estimate the joint distribution of these two parameters. In the process we generate 2000 IC50-h parameter pairs to represent this joint distribution. We then feed these 2000 parameter pairs into the cardiac model, which gives 2000 predicted in vivo measurements per drug.

Finally, we want to fit an ordinal logistic regression model in which the in vivo measurement explains drug safety: each drug has a known safety label (low, intermediate, or high) and 2000 in vivo "measurements". Apparently these 2000 are correlated (actually perfectly correlated, because they all share the same safety class label for the same drug).

I was wondering whether, other than GEE, there is another way to do such a regression (safety class ~ in vivo measurement) that accounts for the within-drug correlation. Or is there no need to account for this correlation, since it is always perfect within each drug? Any suggestion is highly appreciated.

Thanks!
Zhihua

AnestisTouloumis · 2017-10-16T10:01:55Z

Hi Zhihua,

As I mentioned in my previous response, if you think that there is a reasonable way to reduce the data then you might want to try multgee again. However, I trust that you are in a better position than me to judge whether any "data reduction" is feasible.

I will now close this issue, as this is a limitation of R rather than of the multgee package.
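One possible form of the "data reduction" mentioned above is to thin the repeated measurements within each cluster before fitting. The sketch below uses base R only, on a simulated stand-in for the data described in this thread (12 drugs, 2000 draws each). The column names and the choice of keeping every 40th draw are assumptions made for illustration; whether thinning is statistically reasonable is for the analyst to judge.

```r
# Simulated stand-in for the data described in the thread:
# 12 drugs, 2000 draws per drug. All column names are hypothetical.
set.seed(1)
n_drugs <- 12
n_draws <- 2000
dat <- data.frame(
  drug   = rep(seq_len(n_drugs), each = n_draws),
  draw   = rep(seq_len(n_draws), times = n_drugs),
  metric = rnorm(n_drugs * n_draws)
)

# One ordinal safety label per drug, constant across that drug's draws.
labels <- factor(
  rep(c("low", "intermediate", "high"), length.out = n_drugs),
  levels = c("low", "intermediate", "high"),
  ordered = TRUE
)
dat$safety <- labels[dat$drug]

# Keep every 40th draw: 50 repeated measurements per drug instead of 2000.
keep_every <- 40
thinned <- dat[dat$draw %% keep_every == 0, ]

# Renumber the retained draws as 1..50 within each drug.
thinned$draw <- thinned$draw / keep_every

# Number of retained rows per drug.
table(thinned$drug)
```

A data frame reduced this way could then be passed to `ordLORgee` with `id = drug` and `repeated = draw`, which shrinks each cluster from 2000 to 50 observations.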

AnestisTouloumis added the question label Oct 9, 2017

AnestisTouloumis closed this as completed Oct 16, 2017
