Data formatting #444

Open
LeaVerou opened this issue Jan 23, 2019 · 23 comments

Comments

Member

@LeaVerou LeaVerou commented Jan 23, 2019

Currently data formatting is built-in, and cannot really be changed. There are a few exceptions, e.g. <time> elements, where the format is specified via an expression in their contents (because their data is in an attribute), but overall there is no flexibility and in some cases in the past this has even caused bugs. E.g. the only way to format a number as a currency is to use CSS or an expression to prepend a currency symbol to it.

It would be nice to have a feature for this that doesn't affect the data but only their presentation. Perhaps something like Vue filters but with a keyword instead of an obscure pipe symbol. I do like the idea of chaining, and of using these filters without parentheses if no arguments are required.
We could also have built-in filters for commonly needed formats.

We need both an attribute-based syntax for properties, as well as a syntax for expressions. Ideally based on the same keyword. mv-format is already taken, any other ideas? We could of course rename mv-format to something else, if there are good ideas about that instead, e.g. mv-filetype? mv-dataformat?

Edit: A detailed writeup of all decisions to be made for this is in #444 (comment)

Member

@DmitrySharabin DmitrySharabin commented Feb 9, 2019

As far as I know, the same functionality has existed in spreadsheets for ages. Most of them (e.g. Microsoft Excel, Google Sheets, OpenOffice Calc) provide a TEXT function that lets you change the way a number appears by applying format codes to it.

What about letting users solve the same problem the way they are used to: name the attribute that formats data mv-text, with a corresponding function for expressions, text(value, format)?

Member Author

@LeaVerou LeaVerou commented Aug 19, 2019

A number of other decisions we need to make before we go about implementing this follow.

Special syntax or just expressions?

Vue filters are just function names, e.g. {{ message | filterA | filterB }}. Extra arguments can be passed to the functions like {{ message | filterA('arg1', arg2) }}. Essentially, these functions are called with the value (message in this case) as their first argument. However, since it's a special syntax, it limits what can be done, which is not a problem in Vue because developers can always define new functions, but in Mavo that requires going outside of Mavo and using JS, which is suboptimal. Therefore, I think they should be normal expressions. However, this means the current value needs to be referenced somehow. Perhaps with $this?
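
To make the calling convention concrete, here is a minimal JavaScript sketch of how a filter chain of this kind could be evaluated, with each filter receiving the current value as its first argument. The `filters` table and the `applyFilters` helper are hypothetical names used only for illustration; they are not part of Vue or Mavo.

```javascript
// Hypothetical filters: plain functions that take the value first.
const filters = {
  round: (value, decimals = 0) => {
    const factor = 10 ** decimals;
    return Math.round(value * factor) / factor;
  },
  prefix: (value, symbol) => `${symbol}${value}`,
};

// Apply a chain such as [["round", 2], ["prefix", "$"]] left to right,
// passing the running value as the first argument of every filter.
function applyFilters(value, chain) {
  return chain.reduce(
    (current, [name, ...args]) => filters[name](current, ...args),
    value
  );
}

const result = applyFilters(3.14159, [["round", 2], ["prefix", "$"]]);
console.log(result); // "$3.14"
```

The limitation discussed above is visible here: anything that is not "call a named function with the value first" has no place in the chain, which is the argument for using normal expressions instead.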

Name bikeshedding

Ideally the attribute name and the expression keyword would be the same, e.g. <span property="number" mv-format="round($this, 2)"> and [number format round($this, 2)] would produce the same result.

There are two ways to go about with naming. I'm gonna use the same example throughout to test readability.

1. We come up with a new name, since mv-format is taken

Potential names:

  • display: <span property="number" mv-display="round($this, 2)"> and [number display round($this, 2)]
  • presentation: <span property="number" mv-presentation="round($this, 2)"> and [number presentation round($this, 2)]
  • dataformat: <span property="number" mv-dataformat="round($this, 2)"> and [number dataformat round($this, 2)]
  • text: <span property="number" mv-text="round($this, 2)"> and [number text round($this, 2)] as suggested by @DmitrySharabin. Not a huge fan, as I don't think it reads very naturally.
  • Any other ideas?

2. We rename mv-format to something else and use that

Potential names for renaming mv-format to (with example use for context):

  • mv-dataformat="csv" The most descriptive but maybe too long?
  • mv-type="csv" Perhaps too generic?
  • mv-filetype="csv" I'd rather avoid this since it's not always coming from a file.
  • Any other ideas?

Do note that this also needs to work as a mv-storage, mv-source, mv-init sub-attribute (as we currently have mv-storage-format, mv-source-format, mv-init-format).

Implementation

Probably the easiest implementation would be to implement the keyword as a new operator that returns an object with two properties: value and presentation (name TBB). Then, Mavo code handles these appropriately. One issue with this implementation is what happens when the operator is not used at the topmost level, e.g. what happens with (number format '$' & round($this, 2)) * 2?

There are two possible solutions:

  1. We can make the format "bubble up", i.e. rewrite it as number * 2 format '$' & round($this, 2) by transforming the AST.
  2. In that case it just returns the formatted string, and the raw value is lost. This can still be useful, e.g. to replace the format parameter of some date functions, such as month()

One issue with 2 is that currently the way to specify these formats is pretty compact (e.g. '00' or 'name') because they are specific to the function, but with month(date) format X, the X is resolved independently of the expression it formats, so how do we maintain this compactness?
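
A rough JavaScript sketch of the two options, under the assumption that the operator produces a small wrapper holding the raw value and a formatting function. The `Formatted` class and its `map` method are hypothetical; nothing like this exists in Mavo today, and `toFixed` stands in for whatever rounding the real format would use.

```javascript
// Hypothetical wrapper: keeps the raw value next to its presentation.
class Formatted {
  constructor(value, formatFn) {
    this.value = value;
    this.formatFn = formatFn;
  }

  // The presentation is always derived from the current raw value.
  get presentation() {
    return this.formatFn(this.value);
  }

  // Option 1: the format "bubbles up" and is re-applied to the new value.
  map(fn) {
    return new Formatted(fn(this.value), this.formatFn);
  }

  // Option 2: below the top level only the formatted string survives.
  toString() {
    return this.presentation;
  }
}

const dollars = (n) => "$" + n.toFixed(2);
const price = new Formatted(4.5, dollars);

const bubbled = price.map((n) => n * 2);
console.log(bubbled.value, bubbled.presentation); // 9 "$9.00"
console.log(String(price)); // "$4.50"
```

With option 1 the raw value stays available for further arithmetic; with option 2 the result is a plain string and anything computed from it works on text.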

Current (hardcoded) formatting rules

The current formatting rules that are hardcoded and need to be expressible with this feature are as follows (From Mavo.Primitive.format() and Mavo.Primitive.formatNumber()):

  • Properties edited via a <select> use this as a map for presentation of properties. This will be the hardest to express with the new syntax, since it's DOM-based.
  • If the value is a number:
    • null is presented as "" (empty string)
    • It's formatted based on the top-most locale (English by default), and by default with 2 decimals and commas as a thousands separator
    • Infinity and -Infinity are formatted as ∞ and -∞ respectively.
    • Numbers in attributes or in <style> or <pre> are not formatted
  • If the value is an array, the values are printed comma-separated.
  • If the value is an object, it's JSON-serialized

The main issues with these are that they depend on the type of value being formatted, which can change from evaluation to evaluation. For example, the last rule can be expressed in MavoScript as json($this), but this will transform non-object values as well. We can solve this in two ways: either wrap them in if() with a type check for the default formatting, or design new functions (or arguments of the existing functions) that only apply conditionally and are meant to be used in formatting.
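
As an illustration of the type-dependent rules listed above (excluding the <select> and attribute/<style>/<pre> cases, which depend on the DOM), here is a sketch in JavaScript. It is not Mavo's actual `Mavo.Primitive.format()` code; `defaultFormat` is a hypothetical name, and the English locale, the two-decimal maximum, and the ", " array separator are assumptions drawn from the description.

```javascript
// Sketch of the default rules described above; not Mavo's real implementation.
const numberFormat = new Intl.NumberFormat("en", { maximumFractionDigits: 2 });

function defaultFormat(value) {
  // Missing values are shown as an empty string.
  if (value === null || value === undefined) return "";

  if (typeof value === "number") {
    if (value === Infinity) return "∞";
    if (value === -Infinity) return "-∞";
    return numberFormat.format(value);
  }

  // Arrays are printed comma-separated, formatting each item in turn.
  if (Array.isArray(value)) return value.map(defaultFormat).join(", ");

  // Any other object is JSON-serialized.
  if (typeof value === "object") return JSON.stringify(value);

  return String(value);
}

console.log(defaultFormat(1 / 3)); // "0.33"
console.log(defaultFormat(1234567.891)); // "1,234,567.89"
console.log(defaultFormat([1, 2.5])); // "1, 2.5"
console.log(defaultFormat({ a: 1 })); // '{"a":1}'
```

Each branch is effectively the "if() with a type check" wrapping mentioned above, which is why a single expression like json($this) cannot replace the whole function.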

Perhaps this should also cover things like Markdown presentation (via the Markdown plugin).

Inheritance

It may be useful if mv-format (or whatever we call it) is an inherited attribute, to avoid having to add it to every property for the default formatting to apply. Also, that is more DRY in case someone wants to apply the same format on multiple expressions and properties.

How to handle multiple formats applying to the same value

Based on our decisions on the previous issues, collisions can appear in a number of ways. I don't think these should be handled the same, since they happen in different ways.

  1. Multiple expression-based formats: [number format X format Y]. These should probably be all applied in order.
  2. If mv-format is inherited do descendant values override or combine? E.g. <span mv-format="X"><span property="foo" mv-format="Y">. Override probably makes sense here.
  3. Expression-based formatting vs attribute: <span mv-format="X">[number format Y]</span>. This could go either way, I wonder what the use cases are?
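
A minimal sketch of cases 1 and 2, assuming formats are plain one-argument JavaScript functions. The helper names `chain` and `resolveFormat` are hypothetical, and the override rule is only the "probably makes sense" option from the list, not a decision.

```javascript
// Formats are plain one-argument functions in this sketch.
const round2 = (n) => Math.round(n * 100) / 100;
const dollars = (v) => "$" + v;

// Case 1: [number format X format Y] applies X first, then Y.
const chain = (...formats) => (value) =>
  formats.reduce((current, format) => format(current), value);

// Case 2: a format on the element itself overrides an inherited one;
// with neither present, fall back to plain string conversion.
function resolveFormat(ownFormat, inheritedFormat) {
  return ownFormat ?? inheritedFormat ?? ((v) => String(v));
}

console.log(chain(round2, dollars)(3.14159)); // "$3.14"
console.log(resolveFormat(dollars, round2)(5)); // "$5"
console.log(resolveFormat(undefined, round2)(1.239)); // 1.24
```

Case 3 is deliberately left out, since which side should win there is still an open question.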

Editing of formatted properties

Currently, when a property is edited all formatting goes away. This is certainly the easiest thing to do here as well: when editing is triggered, it's the raw value that's being edited and then upon exiting edit mode the new value is transformed via the format. The benefit of this is that it only requires one-way transformations (raw value to formatted value), which is what most people are used to specifying.

However, wouldn't it be most convenient if one could edit the formatted value as well? I've often seen people enter numbers like "1,000,000" and be surprised that they were invalid (because they were entering formatted numbers that were invalid as raw values). I wonder if it would make sense to optionally be able to specify a reverse transformation on a property, which would enable editing formatted values as well. However, this can be done after the main implementation, since the syntax is an add-on and doesn't affect the main syntax, so this is fairly low priority.

Interestingly, it appears that spreadsheets use a hybrid of reversible formatting: If I enter "$100" into a cell, it understands the value, formats it as a currency, but when I click again to edit it, it's the "100" that I'm editing.
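
To illustrate what an optional reverse transformation could look like, here is a JavaScript sketch of a two-way format. The `format`/`parse` pair and the object name `grouped` are hypothetical; the parsing rule (strip "$", commas and whitespace) is an assumption that roughly mimics the spreadsheet behavior described above.

```javascript
const numberFormat = new Intl.NumberFormat("en");

// Hypothetical two-way format: `format` is used for display,
// `parse` turns whatever the user typed back into a raw number.
const grouped = {
  format: (n) => numberFormat.format(n),
  // Strip currency symbol, group separators and spaces before converting.
  parse: (text) => Number(text.replace(/[$,\s]/g, "")),
};

console.log(grouped.format(1000000)); // "1,000,000"
console.log(grouped.parse("1,000,000")); // 1000000
console.log(grouped.parse("$100")); // 100
```

The one-way case is just the same object without `parse`, which matches the point that the reverse transformation can be added later without affecting the main syntax.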

Member Author

@LeaVerou LeaVerou commented Aug 19, 2019

What about letting users solve the same problem the way they are used to: name the attribute that formats data mv-text, with a corresponding function for expressions, text(value, format)?

Do note that this should not be implemented as an expression function, as it makes it look like the formatted value is what's passed around in expressions. If you have <span property=foo>[text(value, format)]</span> then you expect [foo] to print out the formatted value, and not the raw value. However, one typically wants the raw value to be passed around, since different formats may be required in different places.

Member

@DmitrySharabin DmitrySharabin commented Aug 19, 2019

However, this means the current value needs to be referenced somehow. Perhaps with $this?

If $this is a mandatory part of a formatting function, why don't we just add it implicitly instead of forcing the end user to write it explicitly? So an end user simply writes <span property="number" mv-format="round(2)"> or [number format round(2)], but we interpret it as <span property="number" mv-format="round($this, 2)"> and [number format round($this, 2)] respectively.

  1. We come up with a new name, since mv-format is taken

Do we have any usage stats for the mv-format attribute by the devs? It might be that it is not a big problem if we rename mv-format because nobody would be harmed. :) I personally like both variants: mv-format and mv-display. The first one is common for spreadsheets, and the latter reflects the aim of the newly added attribute—display raw values in another (user-friendly) format.

Member

@DmitrySharabin DmitrySharabin commented Aug 19, 2019

Multiple expression-based formats: [number format X format Y]. These should probably be all applied in order.

What if they contradict each other? E.g., X is a percentage format and Y is a date/time format. Do we really need to handle chained formats?

Member

@DmitrySharabin DmitrySharabin commented Aug 19, 2019

Interestingly, it appears that spreadsheets use a hybrid of reversible formatting: If I enter "$100" into a cell, it understands the value, formats it as a currency, but when I click again to edit it, it's the "100" that I'm editing.

I can also add that Excel preserves formatting for percentages (if you enter 3% in a cell you will be editing the 3% value) and for dates and times (end users will be editing the raw value only if they change the cell format to General).

Member

@DmitrySharabin DmitrySharabin commented Aug 19, 2019

I also have a question if you don't mind: do you have any idea about the way to describe the format? Will it be separate functions or a formatting string like in Excel?

Collaborator

@karger karger commented Aug 19, 2019

You frame this discussion around input, but in Mavo I don't think you can disentangle that from output. Suppose for example have . That's fine for display, but what happens when I click on the number to edit it? Does the formatting suddenly vanish? In this case it's plausible for mavo to provide editing for a 3-decimal place number, but we can't expect general formatting functions to be "invertible" in ways that also express how to edit the formatted value.

It would be nice if whatever we did for formatting was completely parallel to what we do for editing. So for example just as we now offer an indirect pointer to an editing component, perhaps we should offer an indirect pointer to a formatting component?

Pushing this to an extreme, imagine specifying a "serializer" that transforms the to-be-presented data object into a structure that captures the different elements you might want to format, and a "deserializer" that transforms the format-structure back to the data object. Then you could use mavo to display and edit the format-structure. So for example the weight-number above could be serialized as . Now integer and fractional part are both integers to be displayed/edited as we like. We don't want this everywhere in the mavo, where we'd rather just think of the weight as a number, but we want it where we display/edit the weight.

Member Author

@LeaVerou LeaVerou commented Aug 20, 2019

@karger

I actually discussed this exact thing in the last section of my long comment ("Editing of formatted properties"). I think if we do something for that, it should definitely not get in the way of allowing people to specify one-way formatting functions, but it would be nice as extra functionality.
What are your thoughts on the other issues I mentioned?

@DmitrySharabin:

If $this is a mandatory part of a formatting function, why don't we just add it implicitly instead of forcing the end user to write it explicitly?

Because that restricts the kinds of formatting we do. Perhaps the user wants to use a function that doesn't have $this as its first argument, but such a heuristic prevents this. E.g. As a somewhat contrived example if($this = 0, "nothing", round($this)). You don't want this to become if($this, $this = 0, "nothing", round($this)). Or, as another example, duration($this - day()). Basically, adding $this as the first argument of every function used makes too many assumptions about the formatting expression, and doesn't even cover all use cases. E.g. imagine mv-format="'$' & $this", there is no function to augment, so how would that be written? So we still need a notation to specify the current value anyway.

One heuristic that might work could be to do that only if $this has not been used anywhere in the expression. But I can still see use cases where this would mess up, e.g. if you have property A that depends on B, and you use A in B's mv-format. But that seems pretty rare and easy to override the heuristic by just adding a pointless $this somewhere (e.g. B + $this * 0). It reminds me of how Sass nesting works: if you haven't used & in your nested selector it's prepended, otherwise it just follows what you did.
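
For illustration, the "only if $this has not been used" heuristic could be sketched like this in JavaScript. This is a deliberately naive, string-based version: `parseSingleCall` and `addImplicitThis` are hypothetical helpers, it only augments an expression that is exactly one function call, and it ignores parentheses inside string literals. A real implementation would presumably work on the parsed AST instead.

```javascript
// Returns { name, args } if expr is exactly one call like "round(2)",
// otherwise null. Parentheses inside string literals are not handled.
function parseSingleCall(expr) {
  const match = expr.trim().match(/^([A-Za-z_]\w*)\((.*)\)$/s);
  if (!match) return null;

  let depth = 0;
  for (const ch of match[2]) {
    if (ch === "(") depth++;
    if (ch === ")" && --depth < 0) return null; // outer call closed early
  }
  return depth === 0 ? { name: match[1], args: match[2].trim() } : null;
}

function addImplicitThis(expr) {
  if (expr.includes("$this")) return expr; // the author was explicit
  const call = parseSingleCall(expr);
  if (!call) return expr; // nothing to augment
  return call.args === ""
    ? `${call.name}($this)`
    : `${call.name}($this, ${call.args})`;
}

console.log(addImplicitThis("round(2)")); // "round($this, 2)"
console.log(addImplicitThis("'$' & $this")); // unchanged
```

As with Sass's &, any explicit use of $this switches the heuristic off, so the contrived if() example above is left untouched.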

Do we have any usage stats for the mv-format attribute by the devs?

The quickest way to look is a Github code search for mv-format. It is used a bit, but not so extensively that we can't change it. Let's try to decide independently of backwards compat concerns which name is best, and then we can examine the backwards compat concerns if we actually decide that format is best.

What if they contradict each other? E.g., X is a percentage format and Y is a date/time format. Do we really need to handle chained formats?

That's up to the formatting functions used. If the result is nonsensical, then hopefully the user will realize. You can always do nonsensical things with any sufficiently powerful syntax. But chaining is important and should be allowed, I think, as it is useful in some cases. E.g. [number format round($this, 2) format '$' & $this].

I also have a question if you don't mind: do you have any ideas about the way to describe the format? Will it be separate functions or a formatting string like in Excel?

I'm not a huge fan of formatting strings. They optimize compactness over readability and learnability. It's a whole different microsyntax, whereas users already know how to use functions. Also, even if you look up the cryptic format identifiers and painstakingly write out your format, other people reading your code can't understand what you're doing without also looking up the documentation.

@LeaVerou LeaVerou added the discussion label Aug 20, 2019
Collaborator

@karger karger commented Aug 21, 2019

Like @LeaVerou I would rather not introduce new syntax for formatting, and like @DmitrySharabin I would rather not force explicit use of $this. But going back to something @LeaVerou discarded, I am actually attracted to starting from <span property="foo">[formattingFunction(foo)]</span>. Because mavo already incorporates all we need to do formatting---we can use a single formatting function; we can incorporate html and then use a variety of functions that extract "pieces" of foo to go into different slots in the html, we can use mv-if for conditional formatting, etc.

The problem as @LeaVerou noted is that this collides with the way we want to use the given syntax to define a value for foo. So let's focus on fixing that overload. Define an mv-editable attribute; adding that attribute to a <property=foo> element says "there may be all sorts of rich content and expressions in here, which may or may not be assigned as a value to foo. But if the user clicks in here, what you get is an editing widget for the foo property." That widget could be the default or could be specified with the mv-edit="editorID" attribute.

An alternative I like less is to apply a heuristic that says the property should be editable on click whenever the contents of the property tag don't make sense as specifying a value for foo. Namely, when the content includes html, and/or when the content expression references the property (which would result in a circular dependency if it were used to assign a value to the property).

Member Author

@LeaVerou LeaVerou commented Aug 21, 2019

@karger: I've read your comment several times and I'm not sure I understand what you're proposing. Would mv-editable be a boolean attribute? If not, what values does it accept? How would formattingFunction(foo) be inverted so that the formatted value is editable? Perhaps you could provide an example?

Do note that formatting is already possible in Mavo, just clumsy:

<div property=foo mv-attribute="data-value">[formattingFunction(foo)]</div>
Collaborator

@karger karger commented Aug 21, 2019

My thinking was boolean, as a modifier on the property attribute. But now that you mention it, we could generalize it so that <... mv-editable="foo"> (or maybe mv-edit-for or mv-edits) on any element would mean "whenever I click in this element, ignore whatever's inside and turn it into an editor for the foo property", with the default value being the property on the element.

Contrived example:

<div property="price" mv-editable>
$<span property="dollars">[round(price/100)]</span>.[price-100*dollars]
</div>

If I click on the div, the contents are replaced by an editor for the number value of price (an integer number of cents).

I could even allow nesting; if I have an mv-editable inside an mv-editable, then clicking on the inner gives me an editor for the inner property, while clicking on the outer (outside the inner) gives me an editor for the outer property.
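
The dollars/cents split in the contrived example above can be sketched in plain JavaScript as a pair of functions that take the stored value apart for display and put it back together. The names `splitPrice`, `joinPrice` and `displayPrice` are hypothetical. The sketch assumes price is a non-negative integer number of cents, and it uses Math.floor for the dollar part (an assumption that differs from the round() in the example) so that the remaining cents are never negative.

```javascript
// Assumes price is a non-negative integer number of cents.
function splitPrice(price) {
  const dollars = Math.floor(price / 100);
  const cents = price - 100 * dollars;
  return { dollars, cents };
}

// Inverse of splitPrice: rebuild the stored value from its parts.
function joinPrice({ dollars, cents }) {
  return 100 * dollars + cents;
}

// What the read-mode content of the div would show.
function displayPrice(price) {
  const { dollars, cents } = splitPrice(price);
  return `$${dollars}.${String(cents).padStart(2, "0")}`;
}

console.log(displayPrice(1999)); // "$19.99"
console.log(displayPrice(1905)); // "$19.05"
console.log(joinPrice(splitPrice(1999))); // 1999
```

Clicking the outer element would then edit the single stored number (1999), while the parts exist only for presentation.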

Collaborator

@karger karger commented Aug 21, 2019

Pushing farther, what I'm describing seems related to the :focus pseudoclass---show something different when element has focus. which suggests offering a similar opportunity to change appearance/functionality in hover. or maybe it's more like mv-action---do something when you click an element. e.g. we could just say mv-action="edit" if we want the property to become editable, and mv-action="edit(foo)" if we want an editor for the (closest in scope) foo property.

Member Author

@LeaVerou LeaVerou commented Aug 21, 2019

I'm sorry, I still don't understand what mv-editable is, and I've really tried 😢
Is it for editing parts of a property and storing it in a different property? How does it relate to formatting? What's the value of price in the example above?
Also, with this feature you're proposing, are users editing formatted or unformatted values?
Perhaps you could provide an example in both the syntax we were discussing and what you're proposing, maybe that could help make it more clear?
Though if I'm having so much trouble understanding it, not sure how we'll explain it to our users…

Collaborator

@karger karger commented Aug 21, 2019

Member Author

@LeaVerou LeaVerou commented Aug 21, 2019

Ok, I think I finally understand now! So basically instead of <div property=foo mv-format="[timeFormat(foo)]"></div> you want <div property=foo mv-editable>[timeFormat(foo)]</div>
A few comments on this:

  • Do note that expressions in Mavo currently cannot contain HTML (it's just printed out), partly for security reasons.
  • I think what you envision mv-editable to do is basically mv-mode="auto" (since computed properties have mv-mode="read" by default, whereas auto would allow them to toggle between read and edit modes). I'm not sure if it works already, but enabling it to work is basically a bugfix.
  • I do like placing the expression where it will actually be shown, there's a certain direct-manipulation-ness of sorts in this. This is what we currently do with <time> as well. Do note that that is not always innerHTML, often the property value is in an attribute, and its content is static text.
  • If the format goes where the value would normally be, then where does the value go? Would users need to manually specify an attribute with it if they need it somewhere? (e.g. data-value="[foo]")? In that case, we're just inverting the problem, going from <div property=foo mv-attribute="data-value">[timeFormat(foo)]</div> to <div property=foo data-value="[foo]">[timeFormat(foo)]</div>, which I'm not sure is a significant improvement.
  • There are a few things that this doesn't address, such as inheritance and formatting of expressions that are not also properties. You could argue that in expressions that are not properties, one can just include the format as part of the expression, however in that case how does Mavo know not to apply the default formatting?
  • Sometimes you want to specify the value with an expression and the format with a different expression. E.g. an example from this app: <span property="timeToClose" mv-attribute="content" content="[closed_at - created_at]">[duration(timeToClose)]</span>. It's unclear to me how this would work with your syntax.
Collaborator

@karger karger commented Aug 22, 2019

I agree that having expressions not contain HTML is nice; it avoids any risk of unbounded alternation of syntaxes in nesting. But that's one of the things I like about this approach. If I do want to use some really complicated formatting that (as in my example) shows different "pieces" of the property in different parts of an html blob, it works without having to put html inside an expr---instead you are putting exprs inside html as usual.

I think you're right about mv-mode="auto" being equivalent, which solves the naming problem---unless we decide that's too opaque?

If "the property value is an attribute and its content is static text", I don't see too many cases where we need to format. The values of attributes like href are usually intended for the computer to read, not people. I guess something like placeholder may arise here? But I can't think of a natural way (or reason) to make the placeholder text editable.

We can use mv-value to specify a value if we also want a format. I assume that's what you meant by data-value="[foo]". I don't think it's "just inverting the problem" because when you are specifying a value for the property you don't need all the richness available from putting arbitrary html inside the property's innerHTML, whereas that may be exactly what you need when specifying a format. I think data (a property value) will tend to fit pretty nicely as the value of an attribute, unlike format.

I don't understand your inheritance comment. Why are you looking at expressions that are not also properties?

And I don't understand the last example. Are mv-attribute="content" and the innerHTML both specifying the content of the span? So one wins and the other is ignored?

Member Author

@LeaVerou LeaVerou commented Aug 23, 2019

If "the property value is an attribute and its content is static text", I don't see too many cases where we need to format.

I do. title, value, placeholder, alt, summary or custom data-* attributes that can be intended for whatever.
Also, while data formatting is primarily intended for nicer presentation of values, it can be used for any data transformation that should only be applied to the output and not the value that gets stored or passed around in expressions.

I assume that's what you meant by data-value="[foo]".

Not at all. It was just a custom attribute to hold the value, just like content in the other example.

I don't think it's "just inverting the problem" because when you are specifying a value for the property you don't need all the richness available from putting arbitrary html inside the property's innerHTML, whereas that may be exactly what you need when specifying a format. I think data (a property value) will tend to fit pretty nicely as the value of an attribute, unlike format.

I don't understand. If you put HTML inside a property, it doesn't become the property value, it's just part of the template...

I don't understand your inheritance comment.

In my proposal, mv-format could be specified at an ancestor and apply to all expressions and properties within. Your proposal doesn't allow for that.

Why are you looking at expressions that are not also properties?

Because currently the built-in data formatting applies there too. E.g. if you do [1/3], you will get back 0.33, not 0.3333333333333333 that raw JS would give you.

And I don't understand the last example. Are mv-attribute="content" and the innerHTML both specifying the content of the span? So one wins and the other is ignored?

content is a property.

Member

@DmitrySharabin DmitrySharabin commented Aug 25, 2019

Thinking about the formatting feature in Mavo, I came up with an idea: what if we just implement that feature as a plugin using, for example, this library, rather than adding the feature to the core? From my point of view, in the majority of cases users need to format numeric data, and this library solves that task perfectly.
If somebody doesn't want to use plugins, we still have a workaround for formatting values similar to what Lea used in her Issue Closing app. Also, Mavo already has a bunch of functions (e.g., Dates and Times) that let you format raw values. It needs a bit more investigation to find out if there is a way to implement inheritance with the help of this library. I am sure there is. I would love to try to write such a plugin if you don't mind.

Member Author

@LeaVerou LeaVerou commented Aug 26, 2019

No need for a library for data formatting, JS has this built-in these days: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/NumberFormat

Also, this feature needs to be in the core so it can replace the current hardcoded rules.
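
As a quick illustration of the built-in API referred to above, here is how `Intl.NumberFormat` covers the currency case from the original issue. The locale, currency, and option values are just example choices.

```javascript
// Currency formatting with the built-in Intl API, no library needed.
const currency = new Intl.NumberFormat("en-US", {
  style: "currency",
  currency: "USD",
});

// Percentages, limited to one fraction digit.
const percent = new Intl.NumberFormat("en-US", {
  style: "percent",
  maximumFractionDigits: 1,
});

const formattedPrice = currency.format(1234.5);
const formattedShare = percent.format(0.256);

console.log(formattedPrice); // "$1,234.50"
console.log(formattedShare); // "25.6%"
```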

Member

@DmitrySharabin DmitrySharabin commented Aug 26, 2019

Got it. Thanks!

Collaborator

@karger karger commented Aug 26, 2019

Pushing farther, what I'm describing seems related to the :focus pseudoclass---show something different when element has focus. which suggests offering a similar opportunity to change appearance/functionality in hover. or maybe it's more like mv-action---do something when you click an element. e.g. we could just say mv-action="edit" if we want the property to become editable, and mv-action="edit(foo)" if we want an editor for the (closest in scope) foo property.


@joyously joyously commented Aug 26, 2019

show something different when element has focus

That can get complicated if it's a collection, similar to how not all browsers have implemented :focus-within.

4 participants