Skip to content

Load tabular data files for chart cells #355

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Mar 3, 2023
Merged

Load tabular data files for chart cells #355

merged 1 commit into from
Mar 3, 2023

Conversation

lileeyuh
Copy link
Contributor

@lileeyuh lileeyuh commented Mar 3, 2023

This resolves an error where chart cells were fetching the table schema for a CSV, TSV, JSON, or SQLite file and receiving no columns back.

@lileeyuh lileeyuh requested review from mbostock and annie March 3, 2023 14:54
@lileeyuh lileeyuh merged commit dc5965e into main Mar 3, 2023
@lileeyuh lileeyuh deleted the lilia/load-chart branch March 3, 2023 16:02
@mbostock

This comment was marked as off-topic.

@lileeyuh
Copy link
Contributor Author

lileeyuh commented Mar 3, 2023

In case this helps: https://www.caniemail.com/search/?s=align-items

@mbostock was this meant for a different github issue?

@mbostock
Copy link
Member

mbostock commented Mar 3, 2023

Oops, yes.

@mbostock
Copy link
Member

mbostock commented Mar 3, 2023

Per the discussion in the other PR, I fear I may have approved this too hastily. I think it is dangerous to repurpose loadTableDataSource for chart cells given that chart cells don’t use this routine to load data when they run—it means that the behavior of how we load data to infer the schema for a chart cell’s data can be inconsistent with how we load data when the chart cell itself runs. I think it would be safer instead to have a dedicated loadChartDataSource routine for chart cells, and I apologize for not realizing the nuances her earlier when I was reviewing.

@lileeyuh
Copy link
Contributor Author

lileeyuh commented Mar 3, 2023

Could you explain your reasoning for it being dangerous for chart & table to share loadTableDataSource? I thought that loadTableDataSource is simply parsing tabular data depending on what the file type is, which I'm assuming we'd want to stay consistent across data table cell and chart cell

@lileeyuh
Copy link
Contributor Author

lileeyuh commented Mar 3, 2023

@mbostock
Copy link
Member

mbostock commented Mar 3, 2023

The reason it is dangerous (in the sense of likely to exhibit a bug or inconsistent behavior, but not in the security/privacy sense) is that loadTableDataSource is used by table cells to load the data, whereas for a chart cell, it’s not. And hence if we use loadTableDataSource for a chart cell to infer the schema, that behavior could be inconsistent with what actually happens at runtime.

Case in point, loadTableDataSource doesn’t set {typed: true} when loading CSV and TSV files, and hence every column will be considered a string by the chart cell, whereas the compiler generates {typed: true} such that types are inferred on a per-value basis. Chart cells should show the correct inferred types of columns in CSV, TSV, and JSON files. (I filed a related bug in the other repo in that chart cell should be using the same inferSchema function that data table cell does.)

Data table cells are compiled to a call to __query which calls loadTableDataSource here:

source = await loadTableDataSource(await source, name);

Whereas for chart cells the compiler passes e.g. await FileAttachment("foo.csv").csv({typed: true}) to Plot.auto directly, and hence doesn’t go through loadTableDataSource. Hence, we should have a separate loadChartDataSource function that matches the behavior of the chart cell compiler so that we infer the schema types consistently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants