Number formatting for automatic dataset names #432
Comments
I'd tag this as |
Interesting and valid proposal. The questions are whether this
Would you just want to exchange (or omit) the number format, or to exchange the string in its entirety (which can also be done via user-level code)? We wanted to refactor this interface anyway, i.e. adding float parameters to the presently int-parameter-only functions. Thus, it's a good time to propose changes/additions. 👍 |
That sounds great! In its current state, for our program it would be sufficient to be able to set the formatting once for all names generated inside the math classes (so per class / class-level). In fact, I've spent considerable time on this exactly to make sure that everything uses the same format, a string conversion that supports scientific notation but also always uses a fixed number of characters, so one could say I'm a bit biased here 😅 This is also why
Given that the math functions do a good bit of work already in describing the operation, the dataset(s) they are derived from and the intervals they operate on, in my opinion with user-readable numbers they are very much suited enough for at least a first version of what we are building. Perhaps later based on feedback it could happen that we special case some representations, but it's way too early for me to say that with any confidence. So I would like to keep the names in general, since also the two options I mentioned that I could think of to reproduce these names with different number substrings feel like considerable effort compared to other places in For other things, like some plugins, I was able to extend them in a relatively straightforward manner to replace the formatting, because it is often encapsulated in one or two methods that use something like |
@domenicquirl sorry for getting back late. Needed to finish some other stuff first and we discussed this also internally since this would be a nice feature and got some interesting ideas. My initial reaction to
Also, I noticed that it's actually However, since the performance of the function call overhead isn't that important (compared to the internal processing), an alternative option for the first two points would be to add an additional variadic argument* to these and similar or related functions. For example:
static DataSet integrateFunction(DataSet, double, double, StringConverter<double>... converter) { /*...*/}
static DataSet integrateFunction(DataSet, double, double, String... converter) { /*...*/}
In case there isn't a variadic argument provided, the format would fall back to the existing format. I'd tentatively prefer the latter option (if there isn't a native existing JDK interface) because this would keep the similarity to our twin projects. However, this may need some more thought as -- while {fmt} is available for C/C++/Python/... -- I couldn't readily find a similar native Java-based implementation for this.
@domenicquirl since you mentioned that you worked on this in detail before: do you have any suggestions or recommendations?
*N.B. why I prefer variadic arguments: there are a lot of very similar function signatures to provide, modify, and ultimately to maintain... function overloading becomes quite a bit of boilerplate in Java. It's a pity that Java still doesn't support default/optional parameter values or generics over primitive types ... which would make this part of the library more intuitive, concise, and thus much simpler for users/maintainers. Hope some Java-gods will pick up one of these JEPs... EDIT: |
No worries, your still timely and always thoughtful responses to issues are very much appreciated from my end!
That's a good point, I didn't pay too much attention to that since inside our project we obviously use javafx anyways, but it is very reasonable to try to keep such dependencies away from core / more back-end projects.
This, too, is something I hadn't thought about and I agree that the propagating effect of a static extension is undesirable.
Here, however, I think in many cases where a user-defined formatting across all functions (of the instance) is applicable, that format will also be invariant over many calls. So a singleton To phrase that differently, your point holds exactly if you expect users to want a different formatting between many pairs of calls, less so if you expect them to re-use few custom formats in many places. For the latter, I could perfectly see this approach as a solution as described above. Now, which of these is true I cannot speak to, or only as far as I have already said that for me one re-useable format would be sufficient. A per-function argument is certainly more flexible, but also a lot more verbose in the cases where one does re-use the same custom formatting a lot. Since there will probably be a default formatter anyways which is used if no custom one is provided, maybe it is possible to do both? I.e. allow instances to override the default formatter member, but then also allow passing a formatter to all methods. Then the "custom default" formatter can be used with a singleton instance of the math classes, but can be overridden per-function still.
I must say that I would strongly prefer if the user could provide a fully custom implementation with a but I would have needed many different So to me it seems preferable to have some interface for this and provide good defaults for it, e.g. maybe the default formatter could be an instance of a utils class Unrelated: better support for primitive types is a dream I would like to see come true... I also work with Rust a good bit, which has primitive generics, and it's wonderful! Java is sadly very discriminatory against non-objects... |
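The "do both" idea from this comment (an instance-wide default formatter that a single call can still override) can be sketched as follows. This is only an illustration under assumptions: `NamingSketch`, `setDefaultFormatter` and `integralName` are hypothetical names, the name layout is made up, and `DoubleFunction<String>` merely stands in for whatever formatter interface the library ends up with.

```java
import java.util.function.DoubleFunction;

/** Hypothetical stand-in for a math class; this is not the chart-fx API. */
final class NamingSketch {
    /** Instance-wide default, used whenever a call passes no formatter. */
    private DoubleFunction<String> defaultFormatter = Double::toString;

    /** Replaces the "custom default" for all later calls on this instance. */
    void setDefaultFormatter(DoubleFunction<String> formatter) {
        this.defaultFormatter = formatter;
    }

    /** Builds only the name; a formatter passed per call wins over the default. */
    @SafeVarargs
    final String integralName(String dataSetName, double xMin, double xMax,
            DoubleFunction<String>... formatter) {
        // varargs arrive as an array, so "not provided" means length 0
        DoubleFunction<String> f = formatter.length > 0 ? formatter[0] : defaultFormatter;
        return "int(" + dataSetName + ")|_" + f.apply(xMin) + "^" + f.apply(xMax);
    }
}
```

With this shape, a singleton instance configured once covers the "one re-usable format" case, while callers that need a one-off format pass it as the trailing argument.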
I can relate to that. What about either exposing the Unless one limits the formatting to the floating-point value to string conversion, it might be hard to define a class-wide formatter that applies to all different function sub-versions and symbols. At least an optional Food for thought/comment ... |
Ah, you are thinking about formatting the entire data set name here? Cause
this conversion is what I originally wanted from this issue. I agree that an interface that can distinguish between all the different math operations would be unwieldy. On the other hand I also still think that all of these string-based formatters are not flexible enough for the formatting of numbers specifically. The scope of configurability seems to grow more and more, so I'll say again that I think a formatting solution for the decimals can already be a good start. If indeed you want to allow formatting / providing the entire data set name, then imo the format string for that would need to be able to reference properties of the operation such as
at least if it is to be avoided that users need to re-write the code that exists for that already. I don't quite know what to think of this, since many of these points are different for the different operations. Some operations operate on one signal, some on many. Some work on an interval, some on a whole dataset. It feels hard to me to do this generically: either the math methods must spend significant effort on parsing the format string, validating it and substituting elements that apply; or the format string cannot contain such references, which then means the user must create and combine the name components themself (at which point they could also format the numbers themself, so I see less advantage here). In particular, it is already possible to rename the data set if an entirely custom name is wanted. In case of the former (format strings that can reference properties of the operation), I continue to favour an interface that allows custom implementations of the decimal formatting. Eg. public interface DataSetNameFormatter {
/**
* Responsible for the decimal to string conversion
*/
String numberToString(Number n);
/**
* Can return a format string for the entire name like
* { @code "${opsym}(${sets[0]})|_{${position[0]}}" } or whatever format.
* Any arguments that are numbers are passed through { @link numberToString }.
* If this returns { @code null }, the default name is used.
*/
String formatString(/* optionally op info */);
} but that might be overkill and is a lot to implement. |
Basically yes. The proposal would be to have the following signature:
static DataSet integrateFunction(DataSet, double, double, String... converter) { /*...*/}
With the default formatter looking something like
public final static String DEFAULT_DATASET_NAME_INTEGRAL = "{0}({1})dyn|_{{2,number}}^{{3,number}}";
with the indices following the order of the function arguments, for example
/*...*/ = DataSetMath.integrateFunction(DataSet, double, double, "{0}({1})dyn|_{{2,number,#.0}}^{{3,number,#.0}}");
/*...*/ = DataSetMath.integrateFunction(DataSet, double, double, "{0}({1})dyn|_{{2,number,0.0E0#}}^{{3,number,0.0E0#}}");
The format strings may be constant and globally defined, of course. Obviously, the caveat is that the standard JDK number formats follow some conventions that are a bit weird in some cases (e.g. not being allowed to force a Defining a specific number formatter interface isn't an issue per se, The advantage I see with using 'MessageFormat' is that it is already a JDK-defined class, that the calling function can provide all the information that is known from the function signature, and that the user can define which arguments to use/drop/order/ ... etc. The underlying question is, of course, always: 'is this extra amount of flexibility worth it in the long run ...' 🤔 @domenicquirl and @wirew0rm what do you think? |
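For readers unfamiliar with `java.text.MessageFormat`, here is a small sketch of how such a name pattern could be evaluated. The class and method names are hypothetical and the pattern is a simplified variant of the ones quoted above. One detail worth knowing: `MessageFormat` treats unquoted braces as argument delimiters, so literal braces in the name are written as `'{'` and `'}'` in this sketch.

```java
import java.text.MessageFormat;
import java.util.Locale;

/** Hypothetical helper; only illustrates MessageFormat-based name patterns. */
final class NamePatternSketch {
    // {0} = operation, {1} = data set name, {2}/{3} = interval bounds.
    // Literal braces are quoted, otherwise they would open a format element.
    static final String INTEGRAL_PATTERN =
            "{0}({1})|_'{'{2,number,#.0}'}'^'{'{3,number,#.0}'}'";

    static String integralName(String pattern, String opName, String dataSetName,
            double xMin, double xMax) {
        // Locale.ROOT keeps the decimal separator stable across machines
        MessageFormat format = new MessageFormat(pattern, Locale.ROOT);
        return format.format(new Object[] { opName, dataSetName, xMin, xMax });
    }
}
```

A caller can drop or reorder arguments simply by supplying a different pattern, e.g. `"{1} [{2,number,#.0}]"`, which is the flexibility argued for above.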
Ok so from what I can tell, available
Depending on whether the range is finite, the index 2 and 3 arguments may or may not be a double. Would you expect users to specify multiple formats for all cases, or only handle formatting one case, or how else do you imagine handling this?
New suggestion: public interface DataSetNameFormatter {
/**
* Responsible for the decimal to string conversion
*/
String numberToString(Number n);
/**
* Can return a user-generated name (not format string)
* for the data set.
* The entries in { @code args } correspond to what you
* suggested for the indices of the format string:
* - [0] is the kind of operator
* - [1..] are the function arguments
*
* If this returns { @code null }, the default name is used
* and formatted with { @link numberToString }.
*/
String getName( Object[] args );
}
public class DataSetMath {
static DataSet integrateFunction(DataSet ds, double xMin, double xMax, DataSetNameFormatter... f) {
Object[] args = new Object[] { "integrateFunction", ds, xMin, xMax};
String newName = f.length > 0 ? f[0].getName(args) : null; // varargs arrive as an array; none given -> default name
if (newName == null) {
// default formatting, but with `numberToString`
}
// computation
}
} This captures any |
@domenicquirl I see your points. An interface it is then ... Will need to think/iterate a bit about the exact interface definition (and opportunities to leverage this also in other parts of the code). One thing that might be useful is -- in addition to a Let's sleep on it but we will make it happen. 👍 |
@domenicquirl and @wirew0rm: I wrote a small prototype interface and demo as a proof-of-concept, open for comments and review. I pushed large parts as 'default' implementation rather than defining another abstract class since most users probably just want to re-implement/specify only the number formatter. If in doubt the public interface Formatter<T extends Number> {
/**
* Converts the number provided into its string form.
*
* @param number the number to convert
* @return a string representation of the Number passed in.
*/
@NotNull
String toString(@NotNull final T number);
/**
* Converts the string provided into a number defined by the specific converter.
*
* @param string the {@code String} to convert
* @param pos a ParsePosition object with index and error index information
* @return a number representation of the string passed in.
* @throws NumberFormatException in case of parsing errors
*/
default T fromString(@NotNull final String string, @NotNull final ParsePosition pos) {
final int end = string.indexOf(' ', pos.getIndex());
if (end == -1) {
return fromString(string.substring(pos.getIndex()));
}
return fromString(string.substring(pos.getIndex(), end));
}
/**
* Converts the string provided into a number defined by the specific converter.
*
* @param string the {@code String} to convert
* @return a number representation of the string passed in.
* @throws NumberFormatException in case of parsing errors
*/
@NotNull
T fromString(@NotNull final String string);
/**
* @param pattern the pattern for this message format
* @param arguments an array of objects to be formatted and substituted - Numbers are formatted by @see #toString
* @return formatted string
*/
@NotNull
default String format(@NotNull final String pattern, @NotNull final Object... arguments) {
final MessageFormat formatter = new MessageFormat(pattern);
final Format numberFormat = new Format() {
@Override
public StringBuffer format(@NotNull final Object obj, @NotNull final StringBuffer toAppendTo, @NotNull final FieldPosition pos) {
assert obj instanceof Number : (" object is a " + obj.getClass());
return toAppendTo.append(Formatter.this.toString((T) obj)); // NOSONAR NOPMD - cannot check this due to type erasure
}
@Override
public Object parseObject(final String source, @NotNull final ParsePosition pos) {
return Formatter.this.fromString(source, pos);
}
};
for (int i = 0; i < arguments.length; i++) {
if (arguments[i] instanceof Number) {
formatter.setFormatByArgumentIndex(i, numberFormat);
}
}
return formatter.format(arguments);
}
} A mini test-example (N.B. overriding public class FormatterTests {
@Test
void basicTests() {
Formatter<Number> testFormatter = new Formatter<>() {
@Override
public @NotNull String toString(@NotNull final Number number) {
return number.toString();
}
@Override
public @NotNull Number fromString(final @NotNull String string) {
// alt: return Double.valueOf(string)
throw new NumberFormatException("not implemented");
}
@Override
public @NotNull String format(@NotNull String pattern, Object... args) {
if (args.length <= 3) {
return Formatter.super.format(pattern, args);
} else {
// override with custom format definition
return "too many arguments: " + args.length;
// alt: return Arrays.toString(args) ....
}
}
};
final String defaultPattern = "{1} with {2}, {0} and {0, number, #.0}";
System.out.println("A: " + testFormatter.format(defaultPattern, 3.141, "testing", 10));
System.out.println("B: " + testFormatter.format(defaultPattern, 1, 2, 3, 4));
}
} Demo output
|
That definitely covers my use case and probably many others! I wonder, though, whether it would hurt to allow even more flexibility by not limiting the interface to In places in As an example, I could supply a |
@domenicquirl well, we/I have been down that road many times and was thinking about the same thing -- albeit initially dropping it because for most non-number/specific types you could also overwrite the The fundamental problem is that Java applies a 'type erasure' to generics. Thus, checks like @Override
public Class<T> getClassInstance() {
return /* insert your 'CustomClassOrInterface.class' here */;
} Using this, the loop inserting the specific type-based formatter can be rewritten as: Class<T> test = getClassInstance();
for (int i = 0; i < arguments.length; i++) {
if (test.isAssignableFrom(arguments[i].getClass())) {
formatter.setFormatByArgumentIndex(i, numberFormat);
}
}
The above would be a viable solution at the cost of the (I presume advanced) user having to provide an additional method implementation (2 -> 3), which makes (even if this is very simple) inlining this in -- e.g. lambdas -- a bit awkward. But this is more of a style question and I do not have a strong opinion on this, as long as this is intuitive enough for most other users (80/20-rule). The latter could be achieved by having a good default implementation in the first place that makes it less likely that someone has to override this, and -- to a lesser extent -- documentation and examples.
@domenicquirl BTW: do you determine the optimal number representation dynamically, and if so, how? What is your use case? This is an ongoing quest for improvement especially within the axis code where the perfect solution has eluded us so far. We work in a mixed physics/engineering/operation domain where the number of significant digits (or exponential form) largely depends on the context. Sometimes I even wonder whether the 'perfect formatter' would need secondary information (e.g. min/max range) to better assess the best suitable option ... food for thought. |
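The class-token workaround for type erasure described above can be tried out in isolation. The sketch below is a stripped-down stand-in, not the proposed interface: `TypedFormatter` and `formatAll` are hypothetical names, and it uses `Class.isInstance` instead of `isAssignableFrom(arg.getClass())`, which has the same effect for non-null arguments but also tolerates `null`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of selecting arguments by a run-time class token. */
final class ClassTokenSketch {
    /** Minimal formatter: a conversion plus the class it is responsible for. */
    interface TypedFormatter<T> {
        String toString(T value);

        Class<T> getClassInstance();
    }

    /** Formats matching arguments with the formatter, everything else with String.valueOf. */
    static <T> List<String> formatAll(TypedFormatter<T> formatter, Object... arguments) {
        Class<T> type = formatter.getClassInstance();
        List<String> out = new ArrayList<>();
        for (Object arg : arguments) {
            // 'arg instanceof T' cannot compile because T is erased; the token can be checked
            if (type.isInstance(arg)) {
                out.add(formatter.toString(type.cast(arg)));
            } else {
                out.add(String.valueOf(arg));
            }
        }
        return out;
    }
}
```

The extra `getClassInstance()` method is exactly the third method mentioned above, which is why such a formatter can no longer be written as a single lambda.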
Apologies, sometimes I forget things don't always work like you would want them to 😅 If the interface is fully generic, going with Somewhat aside, I will suggest having the
The plot I am working on currently is for both simulated and recorded hardware signals in a testing scenario. Some constraints I would like to end up with are
At the moment, I use several For some situations, such as measuring a value, it may be desirable to have more precision. Usually, such values are displayed in their own places in the UI, so there may be exceptions to my guidelines where it makes sense. For many of these, though, I'd use the same formatting logic, just without the magnitude adjustment. This makes the format conditional on the context in which it is used, so I guess that also falls under "secondary information" ^^ |
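The exact rules of the formatter described here are not shown in the thread. As a rough illustration of the general idea (fixed output width, switching to scientific notation by magnitude), a sketch could look like the one below. The width of 10, the three decimals and the magnitude thresholds are all assumptions chosen for the example.

```java
import java.util.Locale;

/** Hypothetical fixed-width number formatter; thresholds and width are assumed. */
final class FixedWidthSketch {
    static final int WIDTH = 10; // target width in characters, sign included

    static String format(double v) {
        double abs = Math.abs(v);
        // plain decimals for "ordinary" magnitudes, scientific notation otherwise
        boolean plain = v == 0.0 || (abs >= 1e-2 && abs < 1e5);
        String pattern = "%" + WIDTH + (plain ? ".3f" : ".3e");
        // the width flag left-pads with spaces; it never truncates, so extreme
        // values (e.g. negative numbers with three-digit exponents) can exceed WIDTH
        return String.format(Locale.ROOT, pattern, v);
    }
}
```

Because every label has the same length for typical values, axis or legend layouts do not jump when the displayed value changes magnitude.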
@domenicquirl thanks for the info. I gather that you are using static definitions as 'secondary infos' rather than adapting them dynamically during run-time (i.e. based on the actual data range). Your usage of fixed-String length avoids dynamic axis dimension changes which -- in some cases -- cause secondary issues we are looking into. Unfortunately, I reckon that we should not impose global limits on label widths due to the many different/undefined ranges of values users are putting into the axes, which are a priori unknown from a generic library point of view ... unless we make the width an external constraint and adapt the number/label format, size accordingly ... one of the Chart-FX conundrums (most of them are
My thought was that one could provide some additional layouting constraints or ranges to the String toString(@NotNull final T value, T... range); with
There wouldn't be a default implementation of If one wants to provide a customised override for lots of types, then the generic public Class<Object> getClassInstance() { return Object.class; } and something like @Override
public @NotNull String toString(@NotNull final Object obj) {
if (obj instanceof DataSet) {
return ((DataSet)obj).getName();
}
// [..] and so on
return obj.toString();
} Overriding Again, I hope that the default implementation that would be provided is useful for at least 80% of the users. |
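To make the `toString(value, range...)` idea from this comment more concrete, here is one possible interpretation, sketched with primitives instead of the generic `T`. The rule used (about three significant digits relative to the span of the range) is an assumption for illustration only, and `RangeAwareSketch.format` is a hypothetical name.

```java
import java.util.Locale;

/** Hypothetical range-aware conversion; the precision rule is an assumption. */
final class RangeAwareSketch {
    /** range is optional: {min, max} of the displayed data. */
    static String format(double value, double... range) {
        // no usable range given: fall back to the plain conversion
        if (range.length < 2 || !(range[1] > range[0])) {
            return Double.toString(value);
        }
        double span = range[1] - range[0];
        // smaller spans need more decimals to tell neighbouring values apart
        int decimals = (int) Math.max(0, Math.min(9, 2 - Math.floor(Math.log10(span))));
        return String.format(Locale.ROOT, "%." + decimals + "f", value);
    }
}
```

The same value is then rendered coarsely on a wide axis and with more decimals on a zoomed-in one, which is the kind of secondary information discussed above.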
I do both: the underlying "consistent formatter" makes run-time decisions based on the value to display, e.g. in regards to whether to use scientific notation or how to adapt to the value's magnitude. Any use sites of this formatter may choose to apply some exceptions (I gave some examples for this already). These "contextual on where it is used" adaptations are then static decisions.
This is more your domain, I think. Off the top of my head I can't come up with additional info that is both useful to the formatter, flexible and consistently available in that format (situations where
That makes a lot more sense to me, there just isn't a type that could reasonably be expected to apply as the default 👍🏻 |
addresses issue #432 and targeted to replace other formatter definitions
@domenicquirl I finally got some time and gave it a shot. If you could have a look... |
I've left some comments on the PR 👍🏻 |
* new Formatter<T> interface definition addresses issue #432 and targeted to replace other formatter definitions
Hi again,
I continue to enjoy building something up with chart-fx. One thing that has bugged me recently are the auto-generated names for datasets created from some computation, in particular things like DataSetMath. They are very informative, but a lot of them directly stringify double values such as interval bounds. Because of this, the names quickly become unwieldy to show to humans: (remaining name is cut off)
I have already written a formatter that is used for axis labels/ticks, indicators, hover info, info about measurements and so on. My trouble is in applying this format to the names of the mentioned datasets. If I am not missing something, at the moment I'd have to either duplicate and adapt the naming logic to change the series' names when they are calculated, or try to parse out any numeric values from the names, re-format them and put the name back together when I display it. Both of these are theoretically possible, but are also time-consuming and don't sound like the best solutions.
Ideally, I would want DataSetMath and MultiDimDataSetMath to use a configurable formatter for numbers in the names they assign. They could hold a default "identity" StringConverter<Number> that matches the current behaviour (or maybe uses a default NumberFormat/DecimalFormat like is used in other places) and expose a method to change this converter to a user-supplied one.
Do you think this would be a reasonable addition?
Cheers,
Domenic