Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Number formatting for automatic dataset names #432

Closed
domenicquirl opened this issue Jun 21, 2021 · 18 comments · Fixed by #444
Closed

Number formatting for automatic dataset names #432

domenicquirl opened this issue Jun 21, 2021 · 18 comments · Fixed by #444
Assignees
Labels
enhancement New feature or request

Comments

@domenicquirl
Copy link
Contributor

Hi again,

I continue to enjoy building something up with chart-fx. One thing that has bugged me recently are the auto-generated names for datasets created from some computation, in particular things like DataSetMath. They are very informative, but a lot of them directly stringify double values such as interval bounds. Because of this, the names quickly become unwieldy to show to humans:

example
(remaining name is cut off)

I have already written a formatter that is used for axis labels/ticks, indicators, hover info, info about measurements and so on. My trouble is in applying this format to the names of the mentioned datasets. If I am not missing something, at the moment I'd have to either duplicate and adapt the naming logic to change the series' names when they are calculated, or try to parse out any numeric values from the names, re-format them and put the name back together when I display it. Both of these are theoretically possible, but are also time-consuming and don't sound like the best solutions.

Ideally, I would want DataSetMath and MultiDimDataSetMath to use a configurable formatter for numbers in the names they assign. They could hold a default "identity" StringConverter< Number > that matches the current behaviour (or maybe uses a default NumberFormat / DecimalFormat like is used in other places) and expose a method to change this converter to a user-supplied one.

Do you think this would be a reasonable addition?
Cheers,
Domenic

@domenicquirl
Copy link
Contributor Author

I'd tag this as enhancement but I don't think I can 🤔

@RalphSteinhagen RalphSteinhagen added the enhancement New feature or request label Jun 21, 2021
@RalphSteinhagen
Copy link
Member

Interesting and valid proposal. The questions are whether this

  • is a by function (<-> doubling of function signature) or by class extension, and
  • specific API (StringConverter sound reasonable).

Would you just want to exchange (or omit) the number format or to exchange the string in its entirity (which can be done also via user-level codes).

We wanted to refactor this interface anyway ie. adding float parameters to the presently int-parameter-only functions. Thus, it's a good time to propose changes/additions. 👍

@domenicquirl
Copy link
Contributor Author

That sounds great! It its current state, for our program it would be sufficient to be able to set the formatting once for all names generated inside the math classes (so per class / class-level). In fact, I've spent considerable time on this exactly to make sure that everything uses the same format, a string conversion that supports scientific notation but also always uses a fixed number of characters, so one could say I'm a bit biased here 😅

This is also why StringConverter came to mind, because it is also used e.g. in the axis label formatter (cf. the corresponding AbstractFormatter). If we stick with class-level, this could be implemented both with subclassing and/or a setNumberFormatter configuration method.

Would you just want to exchange (or omit) the number format or to exchange the string in its entirity (which can be done also via user-level codes).

Given that the math functions do a good bit of work already in describing the operation, the dataset(s) they are derived from and the intervals they operate on, in my opinion with user-readable numbers they are very much suited enough for at least a first version of what we are building. Perhaps later based on feedback it could happen that we special case some representations, but it's way too early for me to say that with any confidence. So I would like to keep the names in general, since also the two options I mentioned that I could think of to reproduce these names with different number substrings feel like considerable effort compared to other places in chart-fx such as the axis labels.

For other things, like some plugins, I was able to extend them in a relatively straightforward manner to replace the formatting, because it is often encapsulated in one or two methods that use something like DecimalFormat(s). A good example of this is setMarkerValue in YWatchValueIndicator. It's just that in the math classes no such method exists that numbers go through to be printed.

@RalphSteinhagen
Copy link
Member

RalphSteinhagen commented Jun 24, 2021

@domenicquirl sorry for getting back late. Needed to finish some other stuff first and we discussed this also internally since this would be a nice feature and got some interesting ideas.

My initial reaction to StringConverter was 'why not -- easy enough' and started looking into implementing this either as

  • a per-static-class extension -- which isn't great because this would affect all DataSetMath-derived DataSets in the UI, or
  • a per-non-static-class instance -- which isn't great either because then the user would need to create/modify an instance for each "simple" DataSetMath operation where presently a simple call to static functions suffices.

Also, I noticed that it's actually javafx.util.StringConverter and would pull in some JavaFX dependencies for this non-UI functionality (not ideal for our and other known use-cases). Am searching/looking for alternatives...

However, since the performance of the function call overhead isn't that important (compared to the internal processing), an alternative option for the first two points would be to add an additional variadic argument* to these

https://github.com/GSI-CS-CO/chart-fx/blob/f72b16aa85f7c50ed17aab3b5c1650a2ffbbcca9/chartfx-math/src/main/java/de/gsi/math/DataSetMath.java#L633

and similar or related functions.

For example:

  1. via StringConverter (to be replaced by a non-UI JDK class):
static DataSet integrateFunction(DataSet, double, double, StringConverter<double>... converter) { /*...*/}
  1. a more generic string-format type interface with a similar syntax to this or perhaps a {fmt}-like native Java library:
static DataSet integrateFunction(DataSet, double, double, String... converter) { /*...*/}

In case there isn't a variadic argument provided, the format would fall back to the existing format. I'd tentatively prefer the latter option (if there isn't a native existing JDK interface) because this would keep the similarity to our twin projects. However, this may need some more thought as -- while {fmt} is available for C/C++/Python/... --- I couldn't readily find a similar native Java-based implementation for this.

@domenicquirl since you mentioned that you worked on this in detail before. Do you have any suggestions or recommendations?

*N.B. why preference variadic arguments: there are a lot of very similar function signatures to provide, modify, and ultimately to maintain... function overloading becomes quite a bit of boiler-plate in Java. It's a pity that Java still doesn't support default/optional parameter values or generics over primitive types ... which would make this part of the library more intuitive, concise, and thus much simpler for users/maintainers. Hope some Java-gods will pick up one of these JEPs...

EDIT:
What about using something like MessageFormat. Haven't worked with it before but seems to be usable ... feedback is welcome.

@domenicquirl
Copy link
Contributor Author

No worries, Your still timely and always thoughtful responses to issues are very much appreciated from my end!

Also, I noticed that it's actually javafx.util.StringConverter and would pull in some JavaFX dependencies for this non-UI functionality (not ideal for our and other known use-cases).

That's a good point, I didn't pay too much attention to that since inside our project we obviously use javafx anyways, but it is very reasonable to try to keep such dependencies away from core / more back-end projects.

a per-static-class extension -- which isn't great because this would affect all DataSetMath-derived DataSets in the UI

This, too, is something I hadn't thought about and I agree that the propagating effect of a static extension is undesirable.

a per-non-static-class instance -- which isn't great either because then the user would need to create/modify an instance for each "simple" DataSetMath operation where presently a simple call to static functions suffices

Here, however, I think in many cases where a user-defined formatting across all functions (of the instance) is applicable, that format will also be invariant over many calls. So a singleton DataSetMath instance that is configured once could then be used for all math operations across one or multiple classes.

To phrase that differently, your point holds exactly if you expect users to want a different formatting between many pairs of calls, less so if you expect them to re-use few custom formats in many places. For the latter, I could perfectly see this approach as a solution as described above.

Now, which of these is true I cannot speak to, or only as far as I have already said that for me one re-useable format would be sufficient. A per-function argument is certainly more flexible, but also a lot more verbose in the cases where one does re-use the same custom formatting a lot. Since there will probably be a default formatter anyways which is used if no custom one is provided, maybe it is possible to do both? I.e. allow instances to override the default formatter member, but then also allow passing a formatter to all methods. Then the "custom default" formatter can be used with a singleton instance of the math classes, but can be overridden per-function still.

since you mentioned that you worked on this in detail before. Do you have any suggestions or recommendations?

I must say that I would strongly prefer if the user could provide a fully custom implementation with a StringConverter-like interface. DecimalFormat or MessageFormat are both something I could see the chart-fx default math formatter use, they are probably reasonable in many cases with a fitting format string. However, I found it hard to achieve the behaviour I needed for our use case, namely a very consistent-length representation, with DecimalFormat. I tried, because axis formatting does this:

https://github.com/GSI-CS-CO/chart-fx/blob/f72b16aa85f7c50ed17aab3b5c1650a2ffbbcca9/chartfx-chart/src/main/java/de/gsi/chart/axes/spi/format/DefaultFormatter.java#L21-L26

but I would have needed many different DecimalFormats plus a good amount of condition checking around them.

So to me it seems preferable to have some interface for this and provide good defaults for it, e.g. maybe the default formatter could be an instance of a utils class MessageFormatFormatter with a fitting format string, such that the class can be re-used by users who want a custom format and are happy to just plug in another format string. If there is no good interface low enough in the dependency hierarchy, the required interface might be sufficiently simple to introduce yourselves? Like chart.utils has NumberFormatter, re-create the two methods of StringConverter somewhere? I am heavily in favour of anything that allows me to implement the thing myself and not have to try to work around format strings or similar.


Unrelated: better support for primitive types is a dream I would like to see come true... I also work with Rust a good bit, which has primitive generics, and it's wonderful! Java is sadly very discriminatory against non-objects...

@RalphSteinhagen
Copy link
Member

I am heavily in favour of anything that allows me to implement the thing myself and not have to try to work around format strings or similar.

I can relate to that.

What about either exposing the MessageFormat format string or the MessageFormat object directly as a variadic argument?

Unless one limits the formatting to the floating-point value to string conversion, it might be hard to defined a class-wide formatter that applies to all different function sub-versions and symbols. At least an optional MessageFormat definition per function type would leave the maximum flexibility to the user (e.g. in case somebody doesn't like the greek, integral or other math UTF-8 symbols).

Food for though/comment ...

@domenicquirl
Copy link
Contributor Author

Ah, you are thinking about formatting the entire data set name here? Cause

Unless one limits the formatting to the floating-point value to string conversion

this conversion is what I originally wanted from this issue. I agree that an interface that can distinguish between all the different math operations would be unwieldy. On the other hand I also still think that all of these string-based formatters are not flexible enough for the formatting of numbers specifically.

The scope of configurability seems to grow more and more, so I'll say again that I think a formatting solution for the decimals can already be a good start. If indeed you want to allow formatting / providing the entire data set name, then imo the format string for that would need to be able to reference properties of the operation such as

  • the kind of operation
  • the name of the signals involved
  • depending on the operator, which signal is which (eg. for non-commutative operations)
  • interval bounds that are operated on

at least if it is to be avoided that users need to re-write the code that exists for that already. I don't quite know what to think of this, since many of these points are different for the different operations. Some operations operate on one signal, some on many. Some work on an interval, some on a whole dataset. It feels hard to me to do this generically: either the math methods must spend significant effort on parsing the format string, validating it and substituting elements that apply; or the format string cannot contain such references, which then means the user must create and combine the name components themself (at which point they could also format the numbers themself, so I see less advantage here). In particular, it is already possible to rename the data set if an entirely custom name is wanted.

In case of the former (format strings that can reference properties of the operation), I continue to favour an interface that allows custom implementations of the decimal formatting. Eg.

public interface DataSetNameFormatter {
   /**
    * Responsible for the decimal to string conversion
	*/
	String numberToString(Number n);

   /**
    * Can return a format string for the entire name like
    * { @code "${opsym}(${sets[0]})|_{${position[0]}}" } or whatever format.
    * Any arguments that are numbers are passed through { @link numberToString }.
    * If this returns { @code null }, the default name is used.
	*/
	String formatString(/* optionally op info */);
}

but that might be overkill and is a lot to implement.

@RalphSteinhagen
Copy link
Member

Ah, you are thinking about formatting the entire data set name here?

Basically yes. The proposal would be to have the following signature:

static DataSet integrateFunction(DataSet, double, double, String... converter) { /*...*/}

With the default formatter looking something like

 public final static String DEFAULT_DATASET_NAME_INTEGRAL = "{0}({1})dyn|_{{2,number}}^{{3,number}}";

with folloing the function argument syntax order as, for example

  • {0} denoting the operation name (e.g. INTEGRAL_SYMBOL),
  • {1} the first DataSet name,
  • {2} the minimum integration bound, and
  • {3} the maximum integration bound,
    which could also include, for example, a Date, other available information (that may be suppressed by default) and be modified with a specific DecimalFormatter format like:
  /*...*/ = DataSetMath.integrateFunction(DataSet, double, double, "{0}({1})dyn|_{{2,number,#.0}}^{{3,number,#.0}}");
  /*...*/ = DataSetMath.integrateFunction(DataSet, double, double, "{0}({1})dyn|_{{2,number,0.0E0#}}^{{3,number,0.0E0#}}");

The format strings may be constant and globally defined, of course. Obviously, the caveat is that the standard JDK number formats follow some conventions that are a bit weird in some cases (e.g. not being allowed to force a + symbol in positive exponents). But this could be also enhanced using one of the proper number format converters that can be supplied via MessageFormat.setFormatByArgumentIndex(..) which accepts any java.text.Format derived formatter. In the latter case, it might be beneficial to directly pass MessageFormat as a variadic parameter.

Defining a specific number formatter interface isn't an issue per se, The advantage I see with using 'MessageFormat' is that it is already a JDK-defined function and that the calling function can provide all the information that is known from the function signature and that the user can define which arguments to use/drop/order/ ... etc.

The underlying question is of course is always - 'is this extra amount of flexibility worth on the long run ...' 🤔

@domenicquirl and @wirew0rm what do you think?

@domenicquirl
Copy link
Contributor Author

Ok so from what I can tell, available Formats are MessageFormat, NumberFormat and DecimalFormat and DateFormat maybe. I'd like to mention two things towards your proposal:

  1. Generally, there are some cases currently where inputs to the math functions don't directly end up in the output and/or are formatted conditionally. Integration, which you mentioned, is a good example of this:

https://github.com/GSI-CS-CO/chart-fx/blob/f72b16aa85f7c50ed17aab3b5c1650a2ffbbcca9/chartfx-math/src/main/java/de/gsi/math/DataSetMath.java#L654-L667

Depending on whether the range is finite, the index 2 and 3 arguments may or may not be a double. Would you expect users to specify multiple formats for all cases, or only handle formatting one case, or how else do you imagine handling this?

  1. Personally, I'd be a bit sad if this ends up introducing big changes, but my original issue is not solved by the result. A single number format to my knowledge cannot reproduce the consistent formatting that we have implemented for the other labels. Multiple could, with I would estimate more implementation effort for dynamically creating format strings than the still brief toString that I have now - mostly, if I have to compute a fitting number format, I could also use the existing code to compute the stringified value and put that in the name instead.

New suggestion:

public interface DataSetNameFormatter {
   /**
    * Responsible for the decimal to string conversion
	*/
	String numberToString(Number n);

   /**
    * Can return a user-generated name (not format string)
    * for the data set.
    * The entries in { @code args } correspond to what you 
    * suggested for the indices of the format string:
    *  - [0] is the kind of operator
    *  - [1..] are the function arguments
    * 
    * If this returns { @code null }, the default name is used 
    * and formatted with { @link numberToString }.
	*/
	String getName( Object[] args );
}
public class DataSetMath {
	static DataSet integrateFunction(DataSet ds, double xMin, double xMax, DataSetNameFormatter... f) { 
		Object[] args = new Object[] { "integrateFunction", ds, xMin, xMax};
		String newName = f.getName(args);
		if (newName == null) {
			// default formatting, but with `numberToString`
		}
		// computation
    }
}

This captures any Format-based implementation, because such an implementation can exists inside getName. It also allows something like my implementation, or generally an implementation that uses more info about the data set ds than it's name. Further, the issue with multiple cases is solved because a case distinction like the existing one in integrateFunction can be written in getName as well. A MessageFormatDataSetNameFormatter (optionally with some shorter name) could exist that implements getName correspondingly.

@RalphSteinhagen
Copy link
Member

@domenicquirl I see your points. An interface it is then ...

Will need to think/iterate a bit about the exact interface definition (and opportunities to leverage this also in other parts of the code). One thing is that might be useful is -- in addition to a toString(...)? -- to also add a fromString(...) method. 🤔

Let's sleep over it but we will make it happen. 👍

@RalphSteinhagen
Copy link
Member

RalphSteinhagen commented Jun 25, 2021

@domenicquirl and @wirew0rm: I wrote a small prototype interface and demo as a proof-of-concept, open for comments and review.

I pushed large parts as 'default' implementation rather than defining another abstract class since most users probably just want to re-implement/specify only the number formatter. If in doubt the format(...) can be overridden by advanced users.

public interface Formatter<T extends Number> {
    /**
     * Converts the number provided into its string form.
     *
     * @param number the number to convert
     * @return a string representation of the Number passed in.
     */
    @NotNull
    String toString(@NotNull final T number);

    /**
     * Converts the string provided into an number defined by the specific converter.
     *
     * @param string the {@code String} to convert
     * @param pos a ParsePosition object with index and error index information
     * @return a number representation of the string passed in.
     * @throws NumberFormatException  in case of parsing errors
     */
    default T fromString(@NotNull final String string, @NotNull final ParsePosition pos) {
        final int end = string.indexOf(' ', pos.getIndex());
        if (end == -1) {
            return fromString(string.substring(pos.getIndex()));
        }
        return fromString(string.substring(pos.getIndex(), end));
    }

    /**
     * Converts the string provided into an number defined by the specific converter.
     *
     * @param string the {@code String} to convert
     * @return a number representation of the string passed in.
     * @throws NumberFormatException in case of parsing errors
     */
    @NotNull
    T fromString(@NotNull final String string);

    /**
     * @param pattern the pattern for this message format
     * @param arguments an array of objects to be formatted and substituted - Numbers are formatted by @see #toString
     * @return formatted string
     */
    @NotNull
    default String format(@NotNull final String pattern, @NotNull final Object... arguments) {
        final MessageFormat formatter = new MessageFormat(pattern);
        final Format numberFormat = new Format() {
            @Override
            public StringBuffer format(@NotNull final Object obj, @NotNull final StringBuffer toAppendTo, @NotNull final FieldPosition pos) {
                assert obj instanceof Number : (" object is a " + obj.getClass());
                return toAppendTo.append(Formatter.this.toString((T) obj)); // NOSONAR NOPMD - cannot check this due to type erasure
            }

            @Override
            public Object parseObject(final String source, @NotNull final ParsePosition pos) {
                return Formatter.this.fromString(source, pos);
            }
        };

        for (int i = 0; i < arguments.length; i++) {
            if (arguments[i] instanceof Number) {
                formatter.setFormatByArgumentIndex(i, numberFormat);
            }
        }
        return formatter.format(arguments);
    }
}

A mini test-example (N.B. overriding format(..) is optional):

public class FormatterTests {
    @Test
    void basicTests() {
        Formatter<Number> testFormatter = new Formatter<>() {
            @Override
            public @NotNull String toString(@NotNull final Number number) {
                return number.toString();
            }

            @Override
            public @NotNull Number fromString(final @NotNull String string) {
                // alt: return Double.valueOf(string)
                throw new NumberFormatException("not implemented");
            }

            @Override
            public @NotNull String format(@NotNull String pattern, Object... args) {
                if (args.length <= 3) {
                    return Formatter.super.format(pattern, args);
                } else {
                    // override with custom format definition
                    return "too many arguments: " + args.length;
                    // alt: return Arrays.toString(args) ....
                }
            }
        };
        final String defaultPattern = "{1} with {2}, {0} and {0, number, #.0}";
        System.out.println("A: " + testFormatter.format(defaultPattern, 3.141, "testing", 10));
        System.out.println("B: " + testFormatter.format(defaultPattern, 1, 2, 3, 4));
    }
}

Demo output

A: testing with 10, 3.141 and 3.141
B: too many arguments: 4

@domenicquirl
Copy link
Contributor Author

That definitely covers my use case and probably many others! I wonder, though, whether it would hurt to allow even more flexibility by not limiting the interface to Number types?

In places in chart-fx where you require that a supplied formatter only messes with numbers, a ? extends Number bound can still exist in the type that is accepted by that field/method. A general Formatter<T> could be used for much more in places where this can work. The loop in format would check something like T.class.isAssignableFrom(arguments[i].class) (unsure about instanceof T for super-/subclasses, but may also work) to decide whether to apply the formatter based on the user-defined toString.

As an example, I could supply a Formatter<Number> to the math classes to change the decimal formatting, or I could supply a Formatter<DataSet> to adapt the representation of the datasets used in the computation. If I want total control over any parameters (while keeping the general format string), I can supply a Formatter<Object> and handle multiple cases myself in toString (if I want the format string changed as well, I override format).

@RalphSteinhagen
Copy link
Member

I wonder, though, whether it would hurt to allow even more flexibility by not limiting the interface to Number types?

@domenicquirl well, we/I have been down that road many times and was thinking about the same thing -- albeit initially dropping it because for most non-number/specific types you could also overwrite the toString() method, for Date this is already handled well enough by the MessageFormatter and/or it requires some less "clean" tricks from an API point-of-view.
But this hasn't stopped us before. 😏

The fundamental problem is that Java applies a 'type erasure' to generics. Thus, checks like T.class.isAssignableFrom(...) and the specific class type(s) are unfortunately not available during run-time and T is always reduced to Object unless the specific type Class<T> is explicitly provided, e.g. via an additional interface method like, for example:

@Override
public Class<T> getClassInstance() {
 return /* insert your 'CustomClassOrInterface.class' here */;
}

Using this, the loop inserting the specific type-based formatter can be rewritten as:

Class<T> test = getClassInstance();
for (int i = 0; i < arguments.length; i++) {
    if (test.isAssignableFrom(arguments[i].getClass())) {
        formatter.setFormatByArgumentIndex(i, numberFormat);
    }
}

The above would be a viable solution at the cost of the (I presume advanced) user having to provide an additional method implementation (2 -> 3) which makes (even if this is very simple) inlining this in -- e.g. lambdas -- a bit awkward.

But this is more of a style question and I do not have a strong opinion on this, as long as this is intuitive enough for most other users (80/20-rule). The latter could be achieved by having a good default implementation in the first place that makes it less likely that someone has to overwrite this, and -- to a lesser extend -- documentation and examples.

@domenicquirl BTW: do you and how do you determine the optimal number representation dynamically? What is your use case?

This is an ongoing quest for improvement especially within the axis code where the perfect solution has eluded us so far. We work in a mixed physics/engineering/operation domain where the number of significant digits (or exponential form) largely depends on the context. Sometimes I even wonder whether the 'perfect formatter' would need secondary information (e.g. min/max range) to better assess the best suitable option ... food for though.

@domenicquirl
Copy link
Contributor Author

Apologies, sometimes I forget things don't always work like you would want them to 😅
I'm a bit torn on this; part of me thinks that the additional method is easy enough to provide that it's worth the extra flexibility and possible uses of the interface. On the other hand, I'm having a hard time with the default implementation of getClassInstance.

If the interface is fully generic, going with Number (which is probably the primary case where this is used) feels off because then the default implementation of the method is more restrictive than the rest of the interface. But if the default implementation is Object then anyone who does not override the method needs to handle all possible types in toString... What were you imagining as the default implementation @RalphSteinhagen ?

Somewhat aside, I will suggest having the getClassInstance method's name include what the class it returns is for, in case you decide it will be included. E.g. classToFormat or customFormatAnySubclassOfThisClass (not really that particular name, but you get the idea).

do you and how do you determine the optimal number representation dynamically? What is your use case?

The plot I am working on currently is for both simulated and recorded hardware signals in a testing scenario. Some constraints I would like to end up with are

  • have the same formatting anywhere in and around the plot. Formatting should agree between axis labels, tooltips, etc. Some values/results computed by plugins may be displayed in a separate, but thematically related, UI component and should also share the same representation.
  • all numbers should have a String representation of the same length. This representation may still be adapted by UI components displaying the numbers, e.g. to remove whitespace if undesired at that position, but the primary String output should be uniform. This can help tremendously with avoiding small misalignments in layout. For example, together with a monospaced font this also automatically keeps chart axes from changing position on zoom, and has them aligned between multiple charts containing signals with different value ranges.
  • scientific formatting should be available for very small or large numbers.

At the moment, I use several DecimalFormats to achieve this that are all set to very strict format strings (e.g. no #, which could produce different lengths). I switch between these formatters based on numeric value and further process the String produced by them to account for global sign, sign of the exponent in scientific notation, and magnitude of the value (so atm larger values get less fractional digits). 0 is an exception because 0.0000 reads weird in a lot of places.

For some situations, such as measuring a value, it may be desirable to have more precision. Usually, such values are displayed in their own places in the UI, so there may be exceptions to my guidelines where it makes sense. For many of these, though, I'd use the same formatting logic, just without the magnitude adjustment. This makes the format conditional on the context in which it is used, so I guess that also falls under "secondary information" ^^

@RalphSteinhagen
Copy link
Member

@domenicquirl thanks for the info. I gather that you are using static definitions as 'secondary infos' rather than adapting them dynamically during run-time (ie. based on the actual data range).

Your usage of fixed-String length avoids dynamic axis dimension changes which -- in some cases -- cause secondary issues we are looking into. Unfortunately, I reckon that we should not impose global limits on label widths due the many different/undefined ranges of values users are putting into the axes and that are a priori unknown from a generic lib-point-of-view to the lib ... unless we make the width an external constraint and adapt the number/label format, size accordingly ... one of the Chart-FX conundrums (most of them are O(n^3) <-> affecting the overall performance). 🤔

Sometimes I even wonder whether the 'perfect formatter' would need secondary information (e.g. min/max range)

My thought was that one could provide some additional layouting constraints or ranges to the toString(...) methods, for example (very crude/specific):

String toString(@NotNull final T value, T... range);

with range not actually being plotted but used to condition the optimal range (e.g. whether to use exponential form or number of significant digits). 🤔 aloud ...

[...] What were you imagining as the default implementation [...]

There wouldn't be a default implementation of public Class<T> getClassInstance()per se (possibly renamed getTypeClassInstance()) but this would be a mandatory interface method the user would need to always provide in addition to toString(...) and fromString(...). The only constraints is that the return type needs to be compatible with the generic type.
For the use in DataSetMath the generic T would happen to be Number.

If one wants to provide a customised override for lost of types, then the generic T would be Object and the user-implementation of the interface

public Class<Object> getClassInstance() { return Object.class; }

and something like

@Override
public @NotNull String toString(@NotNull final Object obj) {
    if (obj instanceof DataSet) {
        return ((DataSet)obj).getName();
    }
   // [..] and so on
    return obj.toString();
}

Overriding format(...) I'd leave for the experts that are not satisfied with the options and/or want something more specific or optimised for performance.

Again, I hope that the default implementation that would be provided is useful for at least 80% of the users.
N.B. it's very hard to get any quantitative feedback on the '80% user' since many of our users do not publish their code using our little lib ...

@domenicquirl
Copy link
Contributor Author

I gather that you are using static definitions as 'secondary infos' rather than adapting them dynamically during run-time (ie. based on the actual data range)

I do both: the underlying "consistent formatter" makes run-time decisions based on the value to display, e.g. in regards to whether to use scientific notation or how to adapt to the value's magnitude. Any use sites of this formatter may choose to apply some exceptions (I gave some examples for this already). These "contextual on where it is used" adaptations are then static decisions.

My thought was that one could provide some additional layouting constraints or ranges to the toString(...) methods

This is more your domain I think. From the top of my head I can't come up with additional info that is both useful to the formatter, flexible and consistently available in that format (situations where chart-fx or even a user would need to provide extra info to toString, but doesn't have that info or has different info that maybe they would even like to use, but can't, because it doesn't fit the format should be avoided imo). But yeah, I'm way less familiar with this library than you, so if you can come up with something good here then I'm surely not gonna say no. Complexity is worth keeping in mind though too I think, the new interface already does quite a lot for being new and "just" doing formatting.

There wouldn't be a default implementation of public Class<T> getClassInstance() per se [...] but this would be a mandatory interface method the user would need to always provide

That makes a lot more sense to me, there just isn't a type that could reasonably be expected to apply as the default 👍🏻

RalphSteinhagen added a commit that referenced this issue Jul 3, 2021
addresses issue #432 and targeted to replace other formatter definitions
@RalphSteinhagen RalphSteinhagen linked a pull request Jul 3, 2021 that will close this issue
@RalphSteinhagen RalphSteinhagen self-assigned this Jul 3, 2021
RalphSteinhagen added a commit that referenced this issue Jul 3, 2021
addresses issue #432 and targeted to replace other formatter definitions
@RalphSteinhagen
Copy link
Member

@domenicquirl I finally got some time and gave it a shot. If you could have a look...

@domenicquirl
Copy link
Contributor Author

I've left some comments on the PR 👍🏻

RalphSteinhagen added a commit that referenced this issue Jul 6, 2021
* new Formatter<T> interface definition

addresses issue #432 and targeted to replace other formatter definitions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants