Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add correlation heatmap functions #67

Merged
merged 16 commits into from
Aug 6, 2021
Merged

Add correlation heatmap functions #67

merged 16 commits into from
Aug 6, 2021

Conversation

simonhkswan
Copy link
Contributor

@simonhkswan simonhkswan commented Aug 3, 2021

  • add boolean Kruskal-Wallis H Test for proxy detection
  • update non-linear metrics to suppress some warnings (so they can be used in docs)
  • fix of compute_correlation_metrics
  • addition of helper function for compute_correlation_metrics to use in list comprehensions
  • addition of multiprocessing to compute_correlation_metrics to increase speed
  • update of two_column_heatmap type signature and docstring
  • scale heatmap size based on input

@simonhkswan simonhkswan force-pushed the heatmap-update branch 3 times, most recently from 8e43b56 to 966199d Compare August 3, 2021 02:53
@bogdansurdu bogdansurdu marked this pull request as ready for review August 5, 2021 13:47
@rob-tay rob-tay self-requested a review August 5, 2021 13:47
Copy link
Contributor

@rob-tay rob-tay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks! Just some tiny docstring formatting to resolve.

Args:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need space after "Args"

Suggested change

@@ -22,8 +22,8 @@ def find_sensitive_correlations(
"""Looks at the columns that are not considered to be immediately sensitive and finds if any is strongly
correlated with a sensitive column, specifying both the sensitive column name and the sensitive category
it is a part of.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a space here before "Args"

@@ -91,8 +91,8 @@ def find_column_correlation(
or the column corresponding to the given name.
If matches are found, a list containing the correlated
column names and its associated sensitive category, respectively, is returned.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need a space here before "Args"

Args:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need space after "Args"

Suggested change

@rob-tay rob-tay self-requested a review August 6, 2021 09:21
@sonarcloud
Copy link

sonarcloud bot commented Aug 6, 2021

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link
Contributor

@rob-tay rob-tay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 LGTM!

@bogdansurdu bogdansurdu merged commit 73de2b9 into main Aug 6, 2021
@bogdansurdu bogdansurdu deleted the heatmap-update branch August 6, 2021 09:46
@rob-tay rob-tay mentioned this pull request Aug 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:feature New API or feature.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants