Evaluating copula density --> NaN and inaccuracies #230

Closed
mcwachter opened this issue Nov 23, 2020 · 3 comments

Comments

@mcwachter

Hello,
I am having problems evaluating the copula density of vines. Specifically, dvinecop sometimes returns NaN. Here is a small reproducible example:

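library(rvinecopulib)

ncol <- 30  # number of variables (p)
nrow <- 10  # number of observations (n)
set.seed(123)
# independent standard normal columns
data <- matrix(rnorm(nrow * ncol), ncol = ncol)
# fit a vine copula to the pseudo-observations
vine <- vinecop(pseudo_obs(data))
# evaluate the fitted copula density at a single point
res <- dvinecop(rep(0, ncol), vine)
print(res)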

Setting ncol=20 or nrow=20 gives a result of 0 instead of NaN, so this might be a problem specific to large-p/small-n data. The dataset I am using, where this first occurred, is also a large-p/small-n dataset. But a result of exactly 0 is still not satisfactory: it is far from the correct value of 1 (the columns are independent), and it also zeroes out the joint density, since that is the product of the copula density and the marginal densities.
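For reference, the factorization mentioned above can be written out as follows. The notation (f for the joint density, F_j and f_j for the marginal distribution functions and densities, c for the copula density, d for the number of columns) is my own and is not taken from the package documentation.

```latex
% Joint density as copula density times marginal densities (Sklar's theorem)
f(x_1, \dots, x_d) = c\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) \prod_{j=1}^{d} f_j(x_j)

% Independent columns: the copula density is identically 1 on the open unit cube
c(u_1, \dots, u_d) = 1
\quad\Longrightarrow\quad
f(x_1, \dots, x_d) = \prod_{j=1}^{d} f_j(x_j)

% An estimated copula density of 0 at a point forces the estimated joint density to 0 there
\hat{c}\bigl(F_1(x_1), \dots, F_d(x_d)\bigr) = 0
\quad\Longrightarrow\quad
\hat{f}(x_1, \dots, x_d) = 0
```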

Many thanks in advance!

@tnagler
Collaborator

tnagler commented Nov 23, 2020

There are two things at play here. The first is a numerical issue related to the bb7 family that arises when dependence is strong and the evaluation point is near the boundary. That's what caused the NaN, and it should be fixed here. You can install a package version containing all recent fixes with

devtools::install_github("vinecopulib/rvinecopulib@new-vcl")

The second is a statistical issue not really related to the implementation of our library. If you look at your fitted vine, you will find many non-independence copulas. And if you look at pairwise scatter plots (or correlations) of the data, this shouldn't come as a surprise. With so little data, it's basically impossible to identify the independence copula as the correct model.
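A rough way to look at those pairwise correlations is sketched below. It uses base R only and reuses the simulation settings from the example in the report; the variable names and the 0.4 cutoff are arbitrary choices for illustration, not anything the package itself uses.

```r
# Base R only; same simulation settings as the reproducible example.
ncol <- 30
nrow <- 10
set.seed(123)
data <- matrix(rnorm(nrow * ncol), ncol = ncol)

# Pairwise Kendall's tau between all columns (a 30 x 30 matrix).
tau <- cor(data, method = "kendall")

# Keep each pair once (upper triangle, excluding the diagonal).
pair_tau <- tau[upper.tri(tau)]

# Largest absolute sample dependence among the choose(30, 2) = 435 pairs.
max_abs_tau <- max(abs(pair_tau))
print(max_abs_tau)

# Number of pairs whose sample tau exceeds an arbitrary cutoff of 0.4
# (the cutoff is an assumption made for illustration only).
n_strong <- sum(abs(pair_tau) > 0.4)
print(n_strong)
```

Even though the columns are generated independently, some sample values of Kendall's tau will sit well away from zero when there are only 10 rows, and those are the pairs for which a non-independence family is likely to be selected.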

The problem is amplified if you have a vine with d > n, where all implemented model selection criteria will provably fail, no matter how large the data set. This issue is discussed in one of my papers and a more suitable criterion is proposed for moderately large d, but it still fails for d > n. So more research is needed...

@tnagler
Collaborator

tnagler commented Nov 24, 2020

New version is on CRAN, I'll close this.

@tnagler tnagler closed this as completed Nov 24, 2020
@mcwachter
Author

Thank you very much.
