Add Type or Category to LevelOfMeasurement #139

Closed
keilw opened this issue Nov 2, 2018 · 13 comments

@keilw
Member

keilw commented Nov 2, 2018

Some LevelOfMeasurement entries are Quantitative, while others are Qualitative, see https://www.thoughtco.com/definition-of-qualitative-data-3126330.

We may want to distinguish those two types without removing the qualitative ones. LevelOfMeasurement, like Unit, can also be used independently of the Quantity type itself (not subtypes like Dimensionless, Mass, etc.).

@keilw
Member Author

keilw commented Nov 2, 2018

To avoid an extra enum (they cannot be nested), a simple isNumerical() flag seems the easiest way to tell them apart.

@keilw keilw closed this as completed Nov 2, 2018
@keilw keilw removed the in progress label Nov 2, 2018
@desruisseaux
Contributor

Maybe a name other than isNumerical(), given that ORDINAL and even NOMINAL scales may use numbers too? Furthermore, isn't ORDINAL in a gray area between quantitative and qualitative?

@keilw
Member Author

keilw commented Nov 3, 2018

Well, https://en.wikipedia.org/wiki/Quantitative also calls it "Numerical data", but https://en.wikipedia.org/wiki/Level_of_measurement#Overview talks about quantitative vs. qualitative. Since it is mostly meta-information, there's no harm in a name like isQuantitative(). Inside an enum, the alternative would require either another separate enum, which would be overkill, or hardcoded strings. The Wikipedia article counts ordinal among the quantitative levels, and ordinal() of the enum itself is also numeric, so it makes sense to include it.

@desruisseaux
Contributor

In my understanding, we have:

| Level of Measurement | Type of information |
|---|---|
| NOMINAL | Qualitative |
| ORDINAL | Gray area - could be one or the other |
| INTERVAL | Quantitative |
| RATIO | Quantitative |

Qualitative information is defined as "qualities that are descriptive, subjective or difficult to measure." One could argue that opinion "measurements" like "completely agree, mostly agree, mostly disagree, completely disagree" fall into this category, and so classify ORDINAL as qualitative. Other people could classify ORDINAL as quantitative.

Given that the classification of ORDINAL seems debatable, do we need to make a decision ourselves? We could leave it to users. Furthermore, if we classify it as quantitative like Wikipedia seems to do, that would leave only NOMINAL with the qualitative classification. Then what would be the benefit of a level.isQuantitative() method compared to level != LevelOfMeasurement.NOMINAL?

But before this discussion, I would like to know, what is the problem that we are trying to solve by providing a qualitative versus quantitative classification?
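
The equivalence questioned here (isQuantitative() versus != NOMINAL) can be checked with a minimal sketch. It assumes the four levels listed above and the Wikipedia-style classification in which ORDINAL counts as quantitative. The enum is an illustrative stand-in, not the project's actual source, and QuantitativeCheck is a hypothetical helper.

```java
// Illustrative stand-in for the enum under discussion; not the project's source.
enum LevelOfMeasurement {
    NOMINAL(false),
    ORDINAL(true),   // assumption: ORDINAL counted as quantitative, as on Wikipedia
    INTERVAL(true),
    RATIO(true);

    private final boolean quantitative;

    LevelOfMeasurement(boolean quantitative) {
        this.quantitative = quantitative;
    }

    boolean isQuantitative() {
        return quantitative;
    }
}

// Hypothetical helper, used only to demonstrate the equivalence.
final class QuantitativeCheck {
    private QuantitativeCheck() {}

    // Returns true if the flag agrees with "level != NOMINAL" for every constant.
    static boolean flagEqualsNotNominal() {
        for (LevelOfMeasurement level : LevelOfMeasurement.values()) {
            if (level.isQuantitative() != (level != LevelOfMeasurement.NOMINAL)) {
                return false;
            }
        }
        return true;
    }
}
```

Under that assumption the flag carries no information beyond the NOMINAL comparison. The two would only diverge if ORDINAL were classified differently or further levels were added.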

@keilw
Member Author

keilw commented Nov 3, 2018

It describes how Stevens's categorization grouped them, as also described in Wikipedia. The LevelOfMeasurement, similar to e.g. Unit, may be used independently of other types like Quantity, so it is meta-information. Since we use ordinal() for the rank, #141 could mean it makes sense to stick to the compareTo() or equals() methods.

@desruisseaux
Contributor

It describes a categorization, yes, but this categorization may not fit all needs (different people may have different interpretations of the ORDINAL category). It would make sense to commit ourselves to a particular interpretation if we have a need for it, but do we have a need for it?

@keilw
Member Author

keilw commented Nov 3, 2018

Renaming it from isNumeric() to isQuantitative() makes the point here. NOMINAL cannot be sorted or ordered, at least not in a statistical or mathematical context (we don't care about sorting by name here). I don't think that was mentioned in the tables you posted in other threads, but it is the major difference and should be taken into consideration here.

@desruisseaux
Contributor

isQuantitative() seems better than isNumeric() to me. But I still wonder why we need a method which does nothing more than != NOMINAL, especially since this method implies that we decided to classify ORDINAL as quantitative despite its "gray area" position. Given that I'm not aware of any problem that this method addresses, why make a decision that users may disagree with?

@keilw
Member Author

keilw commented Nov 4, 2018

Some systems also merge RATIO and INTERVAL into something like SCALE, so you can never make it right for everyone. Especially when mapping and transforming data into other systems it is good to be aware of that restriction and difference. And should there ever be a need to introduce further levels, it is also better to distinguish.
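
As a rough illustration of the mapping concern, here is a sketch of converting the four levels into a three-level scheme that merges INTERVAL and RATIO. MergedLevel and LevelMapping are hypothetical names invented for this example and do not refer to any particular system or to the project's API.

```java
// Illustrative stand-in for the enum under discussion; not the project's source.
enum LevelOfMeasurement { NOMINAL, ORDINAL, INTERVAL, RATIO }

// Hypothetical target classification that merges INTERVAL and RATIO into SCALE.
enum MergedLevel { NOMINAL, ORDINAL, SCALE }

final class LevelMapping {
    private LevelMapping() {}

    // Lossy mapping: INTERVAL and RATIO both become SCALE, so it cannot be inverted.
    static MergedLevel toMerged(LevelOfMeasurement level) {
        switch (level) {
            case NOMINAL:
                return MergedLevel.NOMINAL;
            case ORDINAL:
                return MergedLevel.ORDINAL;
            case INTERVAL:
            case RATIO:
                return MergedLevel.SCALE;
            default:
                throw new IllegalArgumentException("Unknown level: " + level);
        }
    }
}
```

Because two source levels collapse into one target, a round trip through the merged scheme loses the INTERVAL versus RATIO distinction.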

@desruisseaux
Contributor

I agree that we can never make it right for everyone, which is why a safe approach is to not introduce concepts that we don't need. The introduction of LevelOfMeasurement is an attempt to address #95. But the introduction of isQuantitative() is an attempt to address what?

@keilw
Member Author

keilw commented Nov 4, 2018

JSR 363 has already been used a lot in areas like Big Data, Analytics, etc., so this flag (with no overhead in size) helps in deciding whether some data points can be used for certain purposes or not.
https://en.wikipedia.org/wiki/Statistical_data_type was mentioned in discussions about the levels. While we may not define those types directly, if such a collection of data types were backed by the LevelOfMeasurement, the flag would correspond to the column "Scale of relative differences", where !isQuantitative() means incomparable.

@desruisseaux
Contributor

The overhead in size is not my concern. My concern is whether this flag is relevant. I don't think isQuantitative helps with decision making, because we force a decision (ORDINAL classified as quantitative) that may not fit users' needs.

However, isComparable would address my concern. While the classification of ORDINAL as "quantitative" seems controversial to me, a classification as "comparable" does not seem problematic.

@keilw
Member Author

keilw commented Nov 5, 2018

Ok, then let's rename it to avoid misunderstandings. The JavaDoc can and should still explain that the flag is "related" to the category of measurement level, i.e. what it stands for, but if this name is more neutral, sure.
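
A minimal sketch of what the renamed flag could look like, with a JavaDoc along the lines suggested. The wording of the comment and the flag values are assumptions for illustration only, not the code that was actually committed.

```java
// Illustrative sketch of the renamed flag; not the project's actual source.
enum LevelOfMeasurement {
    NOMINAL(false),
    ORDINAL(true),
    INTERVAL(true),
    RATIO(true);

    private final boolean comparable;

    LevelOfMeasurement(boolean comparable) {
        this.comparable = comparable;
    }

    /**
     * Indicates whether values measured at this level can be meaningfully ordered.
     * This is related to the qualitative versus quantitative categories of
     * measurement levels, but it does not classify ORDINAL as either one.
     *
     * @return {@code true} if values at this level are comparable
     */
    boolean isComparable() {
        return comparable;
    }
}
```

With this naming, ORDINAL returning true only asserts that its values can be ranked, which sidesteps the question of whether it is quantitative.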
