IGNITE-13386 [ML]: Add new distances (BrayCurtis, Canberra, JensenShannon and etc) #8197

Merged
merged 1 commit into from
Oct 9, 2020

mrk-andreev
Contributor

Add distances:

  • BrayCurtis
  • Canberra
  • JensenShannon
  • WeightedMinkowski

Issue: https://issues.apache.org/jira/browse/IGNITE-13386

Member

@zaleslaw zaleslaw left a comment

The functionality is good, but more comments are needed to avoid problems with merging and TC.

private final Double base;

public JensenShannonDistance() {
base = null;

Member

Could we add a sensible default value here?

Contributor Author

Added Math.E as the default value, because Math.log(Math.E) == 1, so js /= Math.log(Math.E) is equivalent to js /= 1.
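To illustrate why Math.E works as a neutral default, here is a minimal sketch of a Jensen-Shannon distance with an optional logarithm base. It is not the code from this PR: the class name `JensenShannonSketch`, the use of plain `double[]` instead of Ignite vectors, and the normalization of inputs to sum to 1 are all assumptions made for illustration.

```java
/** Hypothetical standalone sketch; not the Ignite implementation. */
final class JensenShannonSketch {
    private JensenShannonSketch() {}

    /** Uses Math.E as the base, so the division by Math.log(base) is a no-op. */
    static double distance(double[] p, double[] q) {
        return distance(p, q, Math.E);
    }

    /** Assumes non-negative inputs with a positive sum and a base > 0 other than 1. */
    static double distance(double[] p, double[] q, double base) {
        if (p.length != q.length)
            throw new IllegalArgumentException("Vectors must have the same length");

        double sumP = 0.0, sumQ = 0.0;
        for (int i = 0; i < p.length; i++) {
            sumP += p[i];
            sumQ += q[i];
        }
        if (sumP <= 0.0 || sumQ <= 0.0)
            throw new IllegalArgumentException("Vectors must have a positive sum");

        double js = 0.0;
        for (int i = 0; i < p.length; i++) {
            double pi = p[i] / sumP;      // normalize to a probability distribution
            double qi = q[i] / sumQ;
            double mi = (pi + qi) / 2.0;  // mixture distribution
            js += klTerm(pi, mi) + klTerm(qi, mi);
        }
        js /= 2.0;
        js /= Math.log(base);             // base == Math.E divides by 1

        // Guard against tiny negative values caused by rounding.
        return Math.sqrt(Math.max(js, 0.0));
    }

    /** One term of the Kullback-Leibler divergence, with 0 * log(0) taken as 0. */
    private static double klTerm(double x, double m) {
        return x > 0.0 ? x * Math.log(x / m) : 0.0;
    }
}
```

With base 2 the distance between two disjoint distributions is 1, which is the usual reason to expose the base at all.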

import org.apache.ignite.ml.math.util.MatrixUtil;

/**
* Calculates the JensenShannonDistance distance between two points.

Member

Could you please add a link to the Wiki or paper or put the formula in pseudocode (will be useful for understanding)?

Contributor Author

Added a link to Wikipedia.

import org.apache.ignite.ml.math.util.MatrixUtil;

/**
* Calculates the Canberra distance between two points.

Member

Could you please add a link to the Wiki or paper or put the formula in pseudocode (will be useful for understanding)?

Contributor Author

Added a link to Wikipedia.
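For readers who want the formula spelled out, the Canberra distance is the sum over coordinates of |a_i - b_i| / (|a_i| + |b_i|). A minimal sketch follows; the class name `CanberraSketch` and the use of `double[]` are hypothetical, and treating a 0/0 term as 0 is an assumed convention, not something confirmed by this PR.

```java
/** Hypothetical standalone sketch; not the Ignite implementation. */
final class CanberraSketch {
    private CanberraSketch() {}

    static double distance(double[] a, double[] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("Vectors must have the same length");

        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double denom = Math.abs(a[i]) + Math.abs(b[i]);
            // Assumed convention: a coordinate where both values are 0 contributes 0.
            if (denom != 0.0)
                sum += Math.abs(a[i] - b[i]) / denom;
        }
        return sum;
    }
}
```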

import org.apache.ignite.ml.math.util.MatrixUtil;

/**
* Calculates the Bray Curtis distance between two points.

Member

Could you please add a link to the Wiki or paper or put the formula in pseudocode (will be useful for understanding)?

Contributor Author

Added a link to Wikipedia.
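As a formula, the Bray-Curtis distance is sum(|a_i - b_i|) / sum(|a_i + b_i|). The sketch below shows one way to write it; `BrayCurtisSketch` is a hypothetical name, `double[]` stands in for Ignite vectors, and returning 0 when the denominator is 0 is an assumption (other implementations return NaN in that case).

```java
/** Hypothetical standalone sketch; not the Ignite implementation. */
final class BrayCurtisSketch {
    private BrayCurtisSketch() {}

    static double distance(double[] a, double[] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("Vectors must have the same length");

        double num = 0.0;
        double denom = 0.0;
        for (int i = 0; i < a.length; i++) {
            num += Math.abs(a[i] - b[i]);
            denom += Math.abs(a[i] + b[i]);
        }
        // Assumed convention: two all-zero vectors are at distance 0.
        return denom == 0.0 ? 0.0 : num / denom;
    }
}
```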

import static org.junit.Assert.assertEquals;

@RunWith(Parameterized.class)
public class BrayCurtisDistanceTest {

Member

Add a comment for this test

Contributor Author

Added a comment like:

/**
 * Evaluate BrayCurtisDistance in multiple test datasets
 */


private final TestData testData;

public CanberraDistanceTest(TestData testData) {

Member

Please comment this.

Contributor Author

Added a comment like:

/** */

new BrayCurtisDistance(),
new CanberraDistance(),
new JensenShannonDistance(),
new WeightedMinkowskiDistance(4, new DenseVector(new double[]{1, 1, 1})),

Member

Maybe these parameters could become the defaults?

Contributor Author

We can set the default value of the p (power) parameter to 2, but the weights cannot have a default because they depend on the dimensionality. To evaluate the distance d(a, b), we need a.size() == b.size() && a.size() == weight.size().
Alternatively, I can use null for the weights and apply them only when weight != null.
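The two options discussed above (p defaulting to 2, and null weights meaning "unweighted") can be sketched as follows. This is an illustration only: `WeightedMinkowskiSketch` is a hypothetical class over `double[]`, and the formula used, (sum of w_i * |a_i - b_i|^p)^(1/p), is one common convention; another places the weight inside the absolute value, and this sketch does not claim to match the PR's choice.

```java
/** Hypothetical standalone sketch; not the Ignite implementation. */
final class WeightedMinkowskiSketch {
    private final double p;
    private final double[] weights; // null means every weight is 1

    /** Defaults: p = 2 and no weights, which reduces to the Euclidean distance. */
    WeightedMinkowskiSketch() {
        this(2.0, null);
    }

    WeightedMinkowskiSketch(double p, double[] weights) {
        if (p <= 0.0)
            throw new IllegalArgumentException("p must be positive");
        this.p = p;
        this.weights = weights;
    }

    double distance(double[] a, double[] b) {
        if (a.length != b.length)
            throw new IllegalArgumentException("Vectors must have the same length");
        // The weights are dimension dependent, so they can only be checked here.
        if (weights != null && weights.length != a.length)
            throw new IllegalArgumentException("Weights must match the vector length");

        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double w = weights == null ? 1.0 : weights[i];
            sum += w * Math.pow(Math.abs(a[i] - b[i]), p);
        }
        return Math.pow(sum, 1.0 / p);
    }
}
```

Deferring the size check to `distance` keeps a no-argument constructor possible, at the cost of reporting a mismatched weight vector only at the first call.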

import static org.junit.Assert.assertEquals;

@RunWith(Parameterized.class)
public class JensenShannonDistanceTest {

Member

Comment this class please

Contributor Author

Added a comment like:

/** */

import static org.junit.Assert.assertEquals;

@RunWith(Parameterized.class)
public class WeightedMinkowskiDistanceTest {

Member

Comment this class please

Contributor Author

Added a comment like:

/** */

}

@Test
public void test() {

Member

Change the test name to test+, because the current name could create problems when migrating to later versions of the JUnit framework.

Contributor Author

I renamed my tests to test<distanceName>.

@mrk-andreev mrk-andreev force-pushed the IGNITE-13386 branch 3 times, most recently from fde8a4b to 7c86935 Compare September 7, 2020 16:47
@zaleslaw zaleslaw changed the title IGNITE-13386: add BrayCurtis,Canberra,JensenShannon,WeightedMinkowski… IGNITE-13386 [ML]: add BrayCurtis,Canberra,JensenShannon,WeightedMinkowski… Oct 6, 2020
@zaleslaw zaleslaw changed the title IGNITE-13386 [ML]: add BrayCurtis,Canberra,JensenShannon,WeightedMinkowski… IGNITE-13386 [ML]: Add new distances (BrayCurtis, Canberra, JensenShannon and etc) Oct 6, 2020
@zaleslaw zaleslaw merged commit e302fdb into apache:master Oct 9, 2020