Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Multivariate Distance Covariance metric and the corresponding test of independence #62

Open
wants to merge 23 commits into
base: develop
Choose a base branch
from

Conversation

Palash123-4
Copy link

@Palash123-4 Palash123-4 commented Jan 25, 2024

As we discussed, I have added the code corresponding to the random projection based multivariate distance covariance, as discussed in https://www.frontiersin.org/articles/10.3389/fams.2021.779841/full.
I spent some time regarding how to use the gamma function in the best possible way so that the correct p_value can be calculated. (One need to install the "mpmath" python module for gamma_cdf function, for details: https://mpmath.org/doc/current/functions/expintegrals.html)
Please take a look and let me know whether it is working fine or not.

Copy link
Owner

@vnmabus vnmabus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your effort making this PR! I did not check the math yet (I assume you implemented it well), but I have several code suggestions.

dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/_dcor.py Outdated
np.bool = np.bool_


def gamma_ratio(p): return np.exp(gammaln((p+1)/2) - gammaln(p/2)) # For Calculating C_p and C_q
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • The function body needs to be in its own line.
  • Functions should be type-annotated.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added the type-annotation

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I was not very clear: I meant to use Python type hints, not to add the types in the docstring. Types in a docstring cannot be easily verified, while type hints can.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made the changes

dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/independence.py Outdated Show resolved Hide resolved
dcor/independence.py Outdated Show resolved Hide resolved
dcor/independence.py Outdated Show resolved Hide resolved
dcor/independence.py Show resolved Hide resolved
@Palash123-4
Copy link
Author

Palash123-4 commented Feb 1, 2024

I found some issues regarding the cyclic recalling of rowwise module in _dcor.py. Maybe you can help me to solve it.

dcor/_dcor.py Outdated
return X_new


def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ='mergesort'):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ='mergesort'):
def u_dist_cov_sqr_mv(X, Y, n_projs = 500, method ="mergesort"):

Please, add double quotes everywhere.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made the changes

dcor/_dcor.py Outdated


"""
A Statistically and Numerically Efficient Independence Test Based on
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring needs to be inside the corresponding function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completed the task

dcor/_dcor.py Outdated

References
----------
.. bibliography:: ../refs.bib
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use :footcite: and .. footbibliography instead, as it is easier to maintain.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really appreciate it if you kindly guide me how to do that, I did search this names on the internet, but I am not quite certain how to do it. It would be really nice if you guide me step-by-step process.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just use :footcite: instead of :cite: inside the text to place the footnote, and .. footbibliography:: instead of .. biblioggraphy::. You do not need to specify the .bib file, as it is done in the global config.

You can use this docstring as an example:
https://github.com/GAA-UAM/scikit-fda/blob/3eb8ace81220011559df83be0ffb9154a11fb001/skfda/exploratory/stats/_fisher_rao.py#L66-L99

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I implemented the exact format that you have mentioned

dcor/_dcor.py Outdated Show resolved Hide resolved
dcor/_dcor.py Outdated
np.bool = np.bool_


def gamma_ratio(p): return np.exp(gammaln((p+1)/2) - gammaln(p/2)) # For Calculating C_p and C_q
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I was not very clear: I meant to use Python type hints, not to add the types in the docstring. Types in a docstring cannot be easily verified, while type hints can.

@njit(fastmath=True, parallel=True, cache=True)
def dist_sum(X): # It is nothing but the parallelized version of pdist function-----------------------
'''
Parameters
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You closed the conversation without changing it, please don't do that.

dcor/_dcor.py Outdated
'''

# X_std = multivariate_normal.rvs( np.zeros(dim_X), np.identity(dim_X), size = 1)
X_std = np.random.standard_normal(dim_X)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know what random projections are. What I meant, is that it should have a random_state argument, similar to how distance_covariance_test works. This way, it is still possible to generate random data, but you can also pass a fixed seed to obtain reproducible results.

dcor/_dcor.py Outdated
'''

# X_std = multivariate_normal.rvs( np.zeros(dim_X), np.identity(dim_X), size = 1)
X_std = np.random.standard_normal(dim_X)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, do not resolve conversations until the comments have been addressed.

dcor/_dcor.py Outdated
X_std = np.random.standard_normal(dim_X)
X_norm = np.sum([i**2 for i in X_std])
U_sphere = np.array(X_std / np.sqrt(X_norm))
if dim_X > 1:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but does the general code not work in that case?

@vnmabus
Copy link
Owner

vnmabus commented May 6, 2024

Note that the last commit shows a lot of whitespace changes. Please, make sure that your editor is not changing the Unix newlines (LF, or "\n") to Windows ones (CRLF, or "\r\n").

@Palash123-4
Copy link
Author

Note that the last commit shows a lot of whitespace changes. Please, make sure that your editor is not changing the Unix newlines (LF, or "\n") to Windows ones (CRLF, or "\r\n").

I use actually "spyder", not sure why it did happen. But I can use "autopipe" for better indentation if it's necessary.
Can you please tell me why am I getting this compilation error?

Removed White space
@Palash123-4
Copy link
Author

Note that the last commit shows a lot of whitespace changes. Please, make sure that your editor is not changing the Unix newlines (LF, or "\n") to Windows ones (CRLF, or "\r\n").

I have removed the white space.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants