detect date fields in Apache Arrow tables #263

Open
wants to merge 3 commits into base: main

Collaborator

@Fil Fil commented May 22, 2024

addresses the Inputs.table part of observablehq/framework#1376

I thought we could extend it to duck-type other field types (numbers vs strings), but there is no real need beyond dates and it would add complexity.

@Fil Fil requested a review from mbostock May 22, 2024 12:58
src/table.js Outdated
@@ -355,6 +355,8 @@ function alignof(base = {}, data, columns) {
}

function type(data, column) {
+  // duck-type Arrow table date fields
+  if (String(data?.schema?.fields?.find?.(d => d?.name === column)).match(/<MILLISECOND>/)) return "date";
Member

@mootari mootari May 22, 2024

Maybe a better fit:

Suggested change
- if (String(data?.schema?.fields?.find?.(d => d?.name === column)).match(/<MILLISECOND>/)) return "date";
+ if (String(data?.schema?.fields?.find?.(d => d?.name === column)).endsWith("<MILLISECOND>")) return "date";

I also wonder if the string representation is part of arrow's public API. Should we perhaps check against the typeId instead?

Edit: sorry, accidentally dropped a )
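
For illustration, a typeId-based check might look roughly like the sketch below. It is only a sketch under assumptions: `arrowDateType` is a hypothetical helper, the constants stand in for the library's `Type.Date` and `Type.Timestamp` (assumed to be 8 and 10, following the 0-17 IPC enum), and the table is a plain mock shaped like an Arrow schema rather than a real Arrow object.

```javascript
// Stand-ins for Arrow's Type enum values (assumption: Date = 8, Timestamp = 10).
// With apache-arrow loaded, one would presumably use Type.Date and Type.Timestamp.
const TYPE_DATE = 8;
const TYPE_TIMESTAMP = 10;

// Hypothetical helper: returns "date" when the named column's field
// carries a date-like typeId, and undefined otherwise.
function arrowDateType(data, column) {
  const field = data?.schema?.fields?.find?.((d) => d?.name === column);
  const typeId = field?.type?.typeId;
  return typeId === TYPE_DATE || typeId === TYPE_TIMESTAMP ? "date" : undefined;
}

// Mock object shaped like an Arrow table's schema (not a real Arrow table).
const mockTable = {
  schema: {
    fields: [
      {name: "when", type: {typeId: TYPE_DATE}},
      {name: "label", type: {typeId: 5}} // 5 = Utf8 in the same enum (assumption)
    ]
  }
};

console.log(arrowDateType(mockTable, "when")); // "date"
console.log(arrowDateType(mockTable, "label")); // undefined
console.log(arrowDateType([{when: 1}], "when")); // undefined (plain array, no schema)
```

The optional chaining keeps the check inert for non-Arrow inputs such as plain arrays, so the existing type inference would still run for them.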

Collaborator Author

I don't know… maybe… I tried to identify something that had a chance to be a bit stable, but it's a moving target.

Member

@mootari mootari May 22, 2024

From what I recall the positive typeIds are public, and negative IDs are meant for internal use.

Edit: sort of

 * **Note**: Only enum values 0-17 (NONE through Map) are written to an Arrow
 * IPC payload.
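
Going only by the quoted note, a guard that separates the ids written to IPC from the internal ones could be sketched as follows. `isIpcTypeId` is a hypothetical name, and the 0-17 bounds are taken from the note above rather than read from the library at runtime.

```javascript
// Bounds taken from the quoted note: only ids 0 through 17 (NONE through Map)
// are written to an Arrow IPC payload.
const IPC_MIN_TYPE_ID = 0;
const IPC_MAX_TYPE_ID = 17;

// Hypothetical guard: true when typeId falls inside the IPC-visible range.
function isIpcTypeId(typeId) {
  return Number.isInteger(typeId) && typeId >= IPC_MIN_TYPE_ID && typeId <= IPC_MAX_TYPE_ID;
}

console.log(isIpcTypeId(8)); // true (inside 0-17)
console.log(isIpcTypeId(-14)); // false (negative ids are treated as internal here)
console.log(isIpcTypeId(undefined)); // false (no typeId available)
```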

Co-authored-by: Fabian Iwand <mootari@users.noreply.github.com>
src/table.js Outdated (resolved)
Fil added a commit to observablehq/plot that referenced this pull request Jun 21, 2024