Cookbook for chi_square distance #4324

Merged
merged 2 commits into from
Jun 12, 2018

Contributor

@FaroukY FaroukY commented Jun 2, 2018

No description provided.

Chi Square Distance
===================

The Chi Square Distance for real valued features x and x' extends the concept of :math:`\chi^{2}` metric to negative values.

Member

Since you are using x and x' downstream, pls put them into (inline) math mode here as well.

Collaborator

We need more clarity here :-) "The distance extends the concept of the metric": what does this mean? I think distance and metric are commonly used interchangeably. Explain specifically what is meant by extending to negative values.

Member

Can we have some updates here pls? :)

Contributor Author

@FaroukY FaroukY Jun 8, 2018

@iglesias @karlnapf The classical chi square distance is defined as:

[Image: screenshot from 2018-06-08 15-30-10]

This definition only covers positive values. The implementation in Shogun, however, handles both positive and negative values because its denominator is (|x_i|+|y_i|), so it extends the chi square distance to negative values too. Is this clearer?
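
To make the point about the denominator concrete, here is a minimal Python sketch. It assumes the classical form uses the signed denominator x_i + y_i (the screenshot is not reproduced here, so this is an assumption), and the function names are hypothetical, not part of Shogun. It only compares the per-coordinate denominators.

```python
def signed_denominators(x, y):
    """Per-coordinate x_i + y_i (assumed classical denominator)."""
    return [a + b for a, b in zip(x, y)]


def absolute_denominators(x, y):
    """Per-coordinate |x_i| + |y_i|, as described for the Shogun implementation."""
    return [abs(a) + abs(b) for a, b in zip(x, y)]


# Mixed-sign inputs: the signed denominator can be zero or negative.
x = [1.0, -2.0, -3.0]
y = [2.0, 2.0, 1.0]

print(signed_denominators(x, y))    # [3.0, 0.0, -2.0]
print(absolute_denominators(x, y))  # [3.0, 4.0, 4.0]
```

For non-negative inputs the two denominators coincide, so the absolute-value form only changes behaviour when negative entries are present.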

Collaborator

To some extent. What reference are you using for this classical definition?

From what I have read after a brief look around, one use case of this chi-square distance is computing the distance between histograms. In that use case the values are positive, of course. I suppose that people found other use cases where the values are not necessarily positive. I suspect that the dichotomy you have observed (definitions with and without the absolute value) may come from there.

It is fine with me if you want to leave the text as it is. I would avoid saying that the distance extends the concept of the metric to negative values, though, because I find it rather vague and potentially misleading.


.. math::

    d(\bf{x},\bf{x'}) = \sum_{i=1}^{n}\frac{(x_{i}-x'_{i})^2}{|x_{i}|+|x'_{i}|} \quad \bf{x},\bf{x'} \in R^{n}
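
The formula above can be sketched in plain Python as follows. This is an illustration only, not Shogun's implementation: the function name is hypothetical, and skipping coordinates where both entries are zero (a 0/0 term) is an assumption, since the thread does not say how that case is handled.

```python
def chi_square_distance(x, y):
    """Sum over i of (x_i - y_i)^2 / (|x_i| + |y_i|) for equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length n")
    total = 0.0
    for a, b in zip(x, y):
        denom = abs(a) + abs(b)
        if denom == 0.0:
            # Assumption: a 0/0 term contributes nothing to the sum.
            continue
        total += (a - b) ** 2 / denom
    return total


# Terms: (1-3)^2/4 = 1, (-2-2)^2/4 = 4, and the 0/0 term is skipped.
print(chi_square_distance([1.0, -2.0, 0.0], [3.0, 2.0, 0.0]))  # 5.0
```

Note that the second coordinate has mixed signs and still yields a well-defined, non-negative term.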

Member

n is undefined

Contributor Author

Should I add "for any natural number n"?

Collaborator

I think it is enough to define n by stating that the features belong to R^n the first time you introduce them. By the way, for extra mathy happiness, use \mathbb with R 🤓
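
One way this suggestion might look, as a sketch only: the membership in the n-dimensional reals is stated where the features are first introduced, and the set is typeset with `\mathbb`. Using `\mathbf` in place of the cookbook's `\bf` is an assumption made here for cleaner LaTeX, not something requested in the review.

```latex
\mathbf{x}, \mathbf{x'} \in \mathbb{R}^{n}, \qquad
d(\mathbf{x},\mathbf{x'}) = \sum_{i=1}^{n}\frac{(x_{i}-x'_{i})^2}{|x_{i}|+|x'_{i}|}
```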

Member

@karlnapf karlnapf left a comment

Thanks for this!
@shubham808 @vinx13 @FaroukY Pls let this be the last "distance" cookbook. We have enough of those already. (meta examples still need to be ported)

Collaborator

iglesias commented Jun 6, 2018

Can `RealMatrix distance_matrix_aa = d.get_distance_matrix()` be ported to something like `d.get('distance_matrix')`? @karlnapf

Member

karlnapf commented Jun 6, 2018

@iglesias No, `get_distance_matrix()` is actually not a getter; it computes something. We will need some hacks from @lisitsyn for registering callback functions as parameters accessible via `get` before that works.

@karlnapf karlnapf merged commit 4d46c05 into shogun-toolbox:develop Jun 12, 2018
@karlnapf
Member

thx

ktiefe pushed a commit to ktiefe/shogun that referenced this pull request Jul 30, 2019
* updated csv_file in chi_square.sg to new api and wrote a cookbook for the chi_square distance
3 participants