Support expression for formatting axis #5122

Closed
domoritz opened this issue Jun 27, 2019 · 33 comments · Fixed by #5260

@domoritz
Member

domoritz commented Jun 27, 2019

We already have format types for number and time. We could introduce a flexible expression formatter. This would resolve vega/vega-lite-api#12.

Related issue: vega/vega#608

cc @curran

@kanitw
Member

kanitw commented Jun 27, 2019

Can you describe in more detail how formatType: 'expression' would work?

I'm confused since expression is more general than format (e.g., it can include multiple fields).

@kanitw kanitw added RFC / Discussion 💬 For discussing proposed changes Need Clarification ❔ Needs clarification before we can proceed. labels Jun 27, 2019
@curran

curran commented Jun 28, 2019

Related vega/vega#1853

@curran

curran commented Jun 28, 2019

Motivating context:

image

At the end of the day, what I'd really like to see is a fork of https://observablehq.com/@mbostock/working-with-wikipedia-data that formats the X ticks of the bar chart using "$90B" rather than "90G".

@arvind
Member

arvind commented Jun 28, 2019

Copying over my comment from the other issue, here's an example of resultant Vega that we'd ideally want to generate:

  "axes": [
    {
      "scale": "x",
      "orient": "top",
      ...
      "format": "s",
      "encode": {
        "labels": {
           "update": {
             "text": {"signal": "replace(datum.label, 'G', 'B')"}
           }
        }
      }
    },

@domoritz
Member Author

domoritz commented Jun 28, 2019

The idea is that you would write a spec like

{
  "$schema": "https://vega.github.io/schema/vega-lite/v3.json",
  "data": {
    "values": [
      {"a": "A","b": 28}, {"a": "B","b": 55}, {"a": "C","b": 43},
      {"a": "D","b": 91}, {"a": "E","b": 81}, {"a": "F","b": 53},
      {"a": "G","b": 19}, {"a": "H","b": 87}, {"a": "I","b": 52}
    ]
  },
  "mark": "bar",
  "encoding": {
    "x": {"field": "a", "type": "ordinal"},
    "y": {
      "field": "b",
      "type": "quantitative",
      "axis": {
        "formatType": "expression",
        "format": "+datum.label * 10"
      }
    }
  }
}

We would then use the provided format in the encode block.

In the implementation, we will need to distinguish format types that vega supports (number and time) from the expression format type.

@kanitw kanitw added P3 Should be fixed at some point and removed Need Clarification ❔ Needs clarification before we can proceed. RFC / Discussion 💬 For discussing proposed changes labels Jun 28, 2019
@arvind
Member

arvind commented Jun 28, 2019

Question -- in your spec example, @domoritz, how would you invoke the regular formatting options? E.g., in @curran's example spec, he wanted to be able to post-process the output of the .3s formatter.

@kanitw
Member

kanitw commented Jun 28, 2019

Thanks for the explanation. I think this makes sense.

One case that expressions in Vega (without extension) wouldn't support is when we want to add a unit to only the topmost label, like this:

$90M
 80M
 70M
 ...

I'll file an issue in Vega to see if we can expose more info in the datum besides label and value.

@kanitw
Member

kanitw commented Jun 28, 2019

Question -- in your spec example, @domoritz, how would you invoke the regular formatting options? E.g., in @curran's example spec, he wanted to be able to post-process the output of the .3s formatter.

Good point. This should just be part of the guide encoding (which is a secret feature right now); then we don't have to introduce this as a conflicting formatType.

@domoritz
Member Author

domoritz commented Jun 28, 2019

how would you invoke the regular formatting options

@arvind You can invoke formatting in expressions. For example replace(format(datum.label, 's'), 'G', 'B') (see https://vega.github.io/vega/docs/expressions/#format-functions)

@kanitw kanitw added RFC / Discussion 💬 For discussing proposed changes and removed P3 Should be fixed at some point labels Jun 28, 2019
@kanitw kanitw changed the title Add formatType: 'expression' Support expression for formatting axis/legend/headers Jun 28, 2019
@kanitw
Member

kanitw commented Jun 28, 2019

replace(format(datum.label, 's'), 'G', 'B')

I think you mean replace(format(datum.value, 's'), 'G', 'B') (value, not label).
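
To make the value/label distinction concrete, here is a minimal sketch in plain JavaScript. It assumes the axis datum carries a raw `value` and an already formatted `label`, as described above. `siFormat` is a hypothetical and heavily simplified stand-in for the 's' format; it does not reproduce d3-format's precision rules, so real labels will look different.

```javascript
// Hypothetical, simplified stand-in for the 's' (SI prefix) format.
// Not d3-format: no precision handling, and only magnitudes >= 1000 are scaled.
const SI_PREFIXES = ['', 'k', 'M', 'G', 'T'];

function siFormat(value) {
  let scaled = Math.abs(value);
  let index = 0;
  while (scaled >= 1000 && index < SI_PREFIXES.length - 1) {
    scaled /= 1000;
    index += 1;
  }
  return `${value < 0 ? '-' : ''}${scaled}${SI_PREFIXES[index]}`;
}

// Assumed shape of the datum that an axis label expression sees.
const datum = {value: 90e9, label: siFormat(90e9)};

// Counterpart of replace(format(datum.value, 's'), 'G', 'B'): format the raw number.
const fromValue = siFormat(datum.value).replace('G', 'B');

// Counterpart of replace(datum.label, 'G', 'B'): reuse the already formatted text.
const fromLabel = datum.label.replace('G', 'B');

console.log(fromValue, fromLabel);
```

Calling a format function on `datum.label` would format a string that has already been formatted, which is why the raw `datum.value` is the right input there.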


One problem with formatType: 'expression' is that it will still be either inconsistent or weird for text channel's "format" property.

If we say, text's formatType is only 'number' | 'time', then it's inconsistent with formatType in axis/legend.

But if text's formatType supports 'expression', it will be weird as we have to refer to datum.<field_name> instead of datum.value or datum.label.

For example, imagine:

encoding: {
  ...
  text: {field: 'a', type: 'quantitative', format: "replace(format(datum.a, 's'), 'G', 'B')", formatType: 'expression'}
}

Basically, the field: "a" part is becoming redundant at this point.


Also, the transition from

format: 's'

to

format: "replace(format(datum.value, 's'), 'G', 'B')"
formatType: 'expression'

is a bit drastic (not incremental).

Perhaps, if we allow title, labels, ticks, domain, and grid to be an encoding object (as we may want to do in #5056) for setting the underlying encode block (#2907), we can do something like:

format: 's'
labels: {
  text: {expr: "replace(datum.label, 'G', 'B')"}
}

which may have a smoother transition as some of the original part is still kept.

However, the con of this approach is that the format would be separate from the label text and labels.text.expr is still a bit obscure.

Alternatively, we could introduce a separate formatExpr:

format: 's',
formatExpr: "replace(datum.label, 'G', 'B')"

which is simpler, but still splits formatExpr from format.

Given they should be together, I wonder if we should eliminate formatType and group everything in a format object that can combine number: string or time: string with expr like:

format: {
 number: 's',
 expr: "replace(datum.label, 'G', 'B')"
}

This seems nicer but still has an inconsistency issue for the text encoding's format.

Perhaps, we can get around that by always replacing datum.value in text encoding's format expression with datum.<field_name> (and similarly replace datum.label with format(datum.<field_name>, ...)).

@kanitw
Member

kanitw commented Jun 28, 2019

FWIW, none of our examples currently use formatType, but it's probably useful for formatting time data that has been cast to the ordinal type.

@kanitw kanitw mentioned this issue Jun 28, 2019
1 task
@kanitw
Member

kanitw commented Jun 28, 2019

Btw, as I think about the need for an axis/legend encode block, I think the secret encoding block that we have is probably overkill and introduces unnecessary complexity. (See #2907 (comment))

@g3o2
Contributor

g3o2 commented Jun 28, 2019

Maybe this could be a use case for vega-label?

For data exploration, the SI system in vega-lite really just works fine. For data explanation, repeating the measurement unit in each axis label instead of mentioning it in the axis title is hardly more readable.

@curran

curran commented Jun 29, 2019

What do you think about passing in a JavaScript function as part of the Vega spec? This would allow arbitrary JavaScript (e.g. post-process the result from the format function using any other function). It also feels simpler than anything I've seen proposed here.

@arvind
Member

arvind commented Jun 29, 2019

What do you think about passing in a JavaScript function as part of the Vega spec?

Hi @curran, that's precisely the purpose of Vega's expression functions: they give us an escape hatch rather than continuously expanding the surface area of the visualization language itself. Importantly, having our own expression parser also allows us to control and sandbox the allowable functions, a necessary feature for deploying Vega/Vega-Lite specifications in security-conscious environments like Wikipedia. If the expression functions do not cover a desired feature, our preference would be to introduce new functions (rather than enable wholly arbitrary JavaScript). Hope that makes sense!

@domoritz
Member Author

Plus, you can use expressions in other languages such as Python with Altair.

@curran

curran commented Jul 1, 2019

Excellent! It's great to know the "escape hatch" is there. It's also great to hear the reasoning behind the desire to introduce new functions to the sandboxed environment, which makes total sense.

However, I still would love to be able to write this, from the Vega-Lite JS API:

const xAxisTickFormat = number =>
  d3.format('.3s')(number)
    .replace('G', 'B');

vl.markBar()
  .data({values: d3.zip(names, totals)})
  .encode(
    vl.y().fieldN("0").sort(null).axis({title: null}),
    vl.x().fieldQ("1").axis({orient: "top", format: xAxisTickFormat, title: "Total revenue (est.)"})
  )
  .width(width)
  .autosize({type: "fit-x", contains: "padding"})
  .render()

Perhaps the JS API could internally rewrite or transform the spec so that the function passed in is invoked via Vega's expression functions. This would make the JS API more usable, as developers could use what they already know, rather than learning an entirely different language that they do not already know. That said, this would only make the JS API more usable and would not improve the core Vega-Lite spec at all, so the audience for this improvement would be limited (it would exclude Altair users, for example). I can understand that it would not be a high priority in the grand scheme of things.

Food for thought! Thanks all for your time here. I really appreciate your efforts.

@g3o2
Contributor

g3o2 commented Jul 1, 2019

Another strategy to handle this specific use case could be to file a feature request with d3.format to allow localising the SI letters via the [d3.formatLocale function](https://github.com/d3/d3-format/blob/master/README.md#formatLocale). This would then apply downstream to the Vega ecosystem.

@kanitw
Member

kanitw commented Jul 2, 2019

this would make the JS API more usable, as developers could use what they already know, rather than learning an entirely different language that they do not already know.

It's worth noting that the Vega expression language is just a subset of JavaScript. Thus, supporting an arbitrary JS format function won't be a high priority for us for now.

@kanitw kanitw added this to the x.x Visual Encoding milestone Jul 6, 2019
@kanitw
Member

kanitw commented Jul 31, 2019

The format expression is simply for labels.

I think a better alternative is to make labels: boolean become

labels: boolean | {expr: ...}

Then we don't have to mess with format (and also can reuse results from format via datum.label).

We can then do:

"axis": {
  "labels": {"expr": "replace(datum.label, 'G', 'B')"}
}

to replace G with B as discussed above.

@kanitw
Member

kanitw commented Jul 31, 2019

Or we can even make it labels: boolean | string and do:

"axis": {
  "labels": "replace(datum.label, 'G', 'B')"
}

though it's a bit less obvious that we have expression support here.

@domoritz
Member Author

I see. Yes, I agree that the expression should be a separate property then. However, I don't think we want to use the existing labels property. It currently means "A boolean flag indicating if labels should be included as part of the axis." If we make it an expression, I would read it as though the expression returns a boolean to either show or hide a particular label.

How about we add a new property value, text, or expr?

@kanitw
Member

kanitw commented Jul 31, 2019

From a Slack conversation, @domoritz and I settled on adding a new property named labelExpr, so we can do:

"axis": {
  "labelExpr": "replace(datum.label, 'G', 'B')"
}
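
For context, a minimal sketch of a complete spec using the agreed property, written as a JavaScript object so it can be serialized to JSON. It assumes a Vega-Lite version that includes #5260; the data values and field names are made up for illustration, and the axis options mirror the motivating bar chart from earlier in the thread.

```javascript
// Sample data and field names are hypothetical.
const labelExprSpec = {
  data: {
    values: [
      {company: 'A', revenue: 90e9},
      {company: 'B', revenue: 45e9}
    ]
  },
  mark: 'bar',
  encoding: {
    y: {field: 'company', type: 'nominal', sort: null, axis: {title: null}},
    x: {
      field: 'revenue',
      type: 'quantitative',
      axis: {
        orient: 'top',
        // The regular number format still applies first...
        format: '.3s',
        // ...and labelExpr post-processes the formatted text in datum.label.
        labelExpr: "replace(datum.label, 'G', 'B')"
      }
    }
  }
};

// Serialize to the JSON form that Vega-Lite consumes.
console.log(JSON.stringify(labelExprSpec, null, 2));
```

Keeping `format` and `labelExpr` side by side preserves the incremental path discussed above: an existing `format: 's'` stays in place and the expression only rewrites its output.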

@kanitw kanitw changed the title Support expression for formatting axis/legend/headers Support expression for formatting axis Aug 4, 2019
@kanitw
Member

kanitw commented Aug 4, 2019

Fixed in #5260

@kanitw kanitw closed this as completed Aug 4, 2019
domoritz pushed a commit that referenced this issue Aug 5, 2019
…als (#5260)

* refactor: VG_AXIS/LEGEND_PROPERTIES =>  AXIS/LEGEND_COMPONENT_PROPERTIES

* feat: support axis `labelExpr` + add example with month initial

Fix #5122
Fix #5249

* feat: support legend `labelExpr`
@a10k

a10k commented May 29, 2020

Will labelExpr also work with text, for formatting the text mark layer? The documentation link on how to use custom format types, https://vega.github.io/vega-lite/usage/config.html#custom-format-type, is broken.

@domoritz
Member Author

Here are the docs: https://vega.github.io/vega-lite/docs/config.html#custom-format-type. I'm fixing the links right now.

@jbleich89

Does this work in the tooltip as well? If not, possible to add?

                "tooltip": [
                    {
                        "field": "cumulative_downloads",
                        "format": ".4s",
                        "type": "quantitative",
                        "labelExpr": "replace(datum.label, 'M', 'A')"
                    },

@domoritz
Member Author

Tooltip is not a guide so the property isn't called labelExpr. You can use a formatExpr.

@jbleich89

Any examples I could reference? Thanks!

@domoritz
Member Author

domoritz commented Nov 13, 2020

Sorry, I misremembered and spoke too soon. There is no formatExpr, and I don't know why labelExpr shows up in the autocomplete since there is no axis.

What you need to do is to either create a custom formatter (https://vega.github.io/vega-lite/docs/config.html#custom-format-type) or derive a new field using the calculate transform and use that field in the tooltip.
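
A minimal sketch of the second option (deriving a field with the calculate transform), written as a JavaScript object that serializes to a Vega-Lite spec. The data values, the `day` field, and the derived field name `downloads_label` are hypothetical; `format` and `replace` are the same Vega expression functions used earlier in the thread.

```javascript
// Sample data and the derived field name are made up for illustration.
const tooltipSpec = {
  data: {
    values: [
      {day: 1, cumulative_downloads: 1234000},
      {day: 2, cumulative_downloads: 2500000}
    ]
  },
  transform: [
    {
      // Build the tooltip text once per row: format the number, then swap the suffix.
      calculate: "replace(format(datum.cumulative_downloads, '.4s'), 'M', 'A')",
      as: 'downloads_label'
    }
  ],
  mark: 'point',
  encoding: {
    x: {field: 'day', type: 'quantitative'},
    y: {field: 'cumulative_downloads', type: 'quantitative'},
    // The derived field is already a string, so it is shown as-is.
    tooltip: [{field: 'downloads_label', type: 'nominal', title: 'Downloads'}]
  }
};

console.log(JSON.stringify(tooltipSpec, null, 2));
```

Because the formatting happens in the transform, the tooltip entry needs no `format` of its own, and the original numeric field stays available for the positional encodings.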

@domoritz
Member Author

Thank you for raising the issue @jbleich89. You found a bug that will be fixed in #7039.

@stenzengel
Contributor

Sorry, I misremembered and spoke too soon. There is no formatExpr, and I don't know why labelExpr shows up in the autocomplete since there is no axis.

What you need to do is to either create a custom formatter (https://vega.github.io/vega-lite/docs/config.html#custom-format-type) or derive a new field using the calculate transform and use that field in the tooltip.

@domoritz
How could I use a custom format expression in tooltip (see #5122 (comment)) and the d3 formatter (see #5122 (comment))?

Actually, my original use case is to add a unit to the end of a formatted number value. A custom format expression could do the job, but seems to be overkill for this simple and frequent use case. The formatExpr you mentioned before would also do the job, but it is not implemented and is still more complicated than necessary. For my use case (only appending a postfix), it would suffice if the number format could contain arbitrary text, as temporal formats can (https://vega.github.io/vega-lite/docs/format.html#temporal-data). But, as far as I know, number formats cannot contain additional text. A simple and generic solution would be to add a new formatType as originally proposed, but to allow the number format specification to contain additional simple text (and not powerful expressions). I'm not sure what the best way would be to distinguish the d3-format specification from the arbitrary text, but if you think it's worth it, I can create a new issue where this could be discussed.

@stenzengel
Contributor

stenzengel commented Jan 17, 2023

@domoritz
I have managed to use both a custom format expression and the d3 formatter. It was a problem with importing d3 and naming.

The proposal to create an issue for a new numberAndText (or something similar) value for formatType still stands, to simplify the "append unit to number" use case.

BTW: I get an error for vl2svg when using custom format expressions. (I do not really need vl2svg, so this is not a big problem for me, but this is perhaps an additional advantage of a new formatType.)

Error: Unrecognized function: numberAndText
    at Object.error (...\node_modules\vega-util\build\vega-util.js:39:11)     
    at CallExpression (...\node_modules\vega-expression\build\vega-expression.js:1731:27)
    at visit (...\node_modules\vega-expression\build\vega-expression.js:1688:14)
    at Property (...\node_modules\vega-expression\build\vega-expression.js:1744:26)
    at visit (...\node_modules\vega-expression\build\vega-expression.js:1688:14)
    at Array.map (<anonymous>)
    at ObjectExpression (...\node_modules\vega-expression\build\vega-expression.js:1739:49)
    at visit (...\node_modules\vega-expression\build\vega-expression.js:1688:14)
    at codegen (...\node_modules\vega-expression\build\vega-expression.js:1750:15)
    at Object.parser [as parseExpression] (...\node_modules\vega-functions\build\vega-functions.js:1798:17)
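
For reference, a minimal sketch of the custom format type approach for the "append unit to number" case. It assumes the mechanism described in the linked config documentation (`customFormatTypes` enabled in the config, a `formatType` naming a registered expression function, and the `format` string passed as its second argument). The formatter body, the field names, and the data are hypothetical. The error above is consistent with the function not being registered in the process that vl2svg runs, though the thread does not confirm this.

```javascript
// Hypothetical formatter: receives the field value and the `format` string from the spec.
function numberAndText(value, unit) {
  return `${Number(value).toFixed(1)} ${unit}`;
}

// The function has to be registered in every runtime that parses the spec, e.g.
//   vega.expressionFunction('numberAndText', numberAndText);
// (kept as a comment so this sketch runs without Vega loaded).

// Sample data and field names are made up for illustration.
const unitSpec = {
  config: {customFormatTypes: true},
  data: {values: [{device: 'A', energy: 1234.5}]},
  mark: 'bar',
  encoding: {
    x: {field: 'device', type: 'nominal'},
    y: {field: 'energy', type: 'quantitative'},
    tooltip: {
      field: 'energy',
      type: 'quantitative',
      formatType: 'numberAndText', // name of the registered expression function
      format: 'kWh'                // forwarded to the function as `unit`
    }
  }
};

console.log(numberAndText(1234.5, 'kWh'));
console.log(JSON.stringify(unitSpec, null, 2));
```

A command-line renderer has no access to functions registered in a browser page, so a spec like this can only render where the same registration step has been run first.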