heads up: Arrow API updates and breaking changes #38

Closed
trxcllnt opened this issue Feb 8, 2018 · 4 comments

@trxcllnt
Contributor

trxcllnt commented Feb 8, 2018

Hey guys, we're getting ready to release Arrow JS 0.3.0 soon and want to give you a heads up on breaking changes. We refactored things a bit to align better with the C++ implementation and to make it easier to implement the Arrow Writer APIs (coming soon).

I've only looked at perspective.js#L281 and haven't searched the rest of the perspective codebase, so there may be more places to update.

  • Table now exposes its Schema

  • Removed the eagerly-allocated columns list from the Table, in favor of lazily allocating columns via getColumnAt(i). You can map the schema's fields to get the columns:

    table.schema.fields.map((field, idx) => table.getColumnAt(idx))
  • Removed the vector.name property; the name is now accessible via the schema. (Vectors can be combined in ways that diverge from the original schema, in which case any tie to the original field metadata isn't necessarily valid anymore.)

  • Added DataType classes, so all the vector type information from the schema is now available and strongly typed at runtime. The vector.type field now refers to the corresponding DataType instance. We also export the DataType classes and enums on the Arrow.type namespace, so you can do enum or instanceof comparisons.

  • Added TypeVisitor and VectorVisitor classes, to make it easier to walk the schema and vector trees:

    import { Table, Vector, visitor, type } from 'apache-arrow';
    // Visitor to convert Vector<Date | Timestamp> to Vector<Int> of epoch ms
    class DateTimeVisitor extends visitor.VectorVisitor {
        visitDateVector(vec: Vector<type.Date_>) {
            return vec.asEpochMilliseconds();
        }
        visitTimestampVector(vec: Vector<type.Timestamp>) {
            return vec.asEpochMilliseconds();
        }
        visitNullVector(vec: Vector<type.Null>) { return vec; }
        visitBoolVector(vec: Vector<type.Bool>) { return vec; }
        visitIntVector(vec: Vector<type.Int>) { return vec; }
        visitFloatVector(vec: Vector<type.Float>) { return vec; }
        visitUtf8Vector(vec: Vector<type.Utf8>) { return vec; }
        visitBinaryVector(vec: Vector<type.Binary>) { return vec; }
        visitFixedSizeBinaryVector(vec: Vector<type.FixedSizeBinary>) { return vec; }
        visitTimeVector(vec: Vector<type.Time>) { return vec; }
        visitDecimalVector(vec: Vector<type.Decimal>) { return vec; }
        visitListVector(vec: Vector<type.List>) { return vec; }
        visitStructVector(vec: Vector<type.Struct>) { return vec; }
        visitUnionVector(vec: Vector<type.Union>) { return vec; }
        visitDictionaryVector(vec: Vector<type.Dictionary>) { return vec; }
        visitIntervalVector(vec: Vector<type.Interval>) { return vec; }
        visitFixedSizeListVector(vec: Vector<type.FixedSizeList>) { return vec; }
        visitMapVector(vec: Vector<type.Map_>) { return vec; }
    }
    const table = Table.from(buf);
    // Named dateTimeVisitor so it doesn't clash with the imported `visitor` namespace
    const dateTimeVisitor = new DateTimeVisitor();
    const cols = table.schema.fields.map((field, idx) => {
        return dateTimeVisitor.visit(table.getColumnAt(idx));
    });
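
For code that relied on the old eagerly-allocated columns list or on vector.name, the schema-based lookup described above could be wrapped in a couple of small helpers. This is only a rough sketch: `FieldLike` and `TableLike` are hypothetical structural stand-ins for the parts of the Table and Schema mentioned in this list (not types exported by Arrow), and `getColumns` / `getColumnsByName` are illustrative names, not Arrow APIs.

```typescript
// Hypothetical stand-ins (assumptions) for the shape of Table/Schema described above.
interface FieldLike { name: string; }
interface TableLike<V> {
    schema: { fields: FieldLike[] };
    getColumnAt(index: number): V;
}

// Rebuild the columns list that the Table no longer allocates eagerly.
function getColumns<V>(table: TableLike<V>): V[] {
    return table.schema.fields.map((_field, idx) => table.getColumnAt(idx));
}

// Pair each column with its field name, since vector.name is no longer available.
function getColumnsByName<V>(table: TableLike<V>): { [name: string]: V } {
    const byName: { [name: string]: V } = {};
    table.schema.fields.forEach((field, idx) => {
        byName[field.name] = table.getColumnAt(idx);
    });
    return byName;
}
```

Each call walks the schema and requests every column again, so callers that need the list repeatedly would probably want to keep the result around themselves.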

That's all I can think of for now. We have a repo where we push new releases for testing and use ahead of the apache release vote, so feel free to link to this in your package.json if you want to test it out: https://github.com/graphistry/arrow

Best,
Paul

@nmichaud
Contributor

nmichaud commented Feb 9, 2018

Thanks! We noticed the changes last week and were wondering when they were targeted to be released.

@trxcllnt
Contributor Author

@nmichaud we're sitting down for a code review this afternoon. Assuming that goes well, we'll call for a release vote shortly after, which would likely mean a release this weekend or early next week. I can update this thread when I have more information if you'd like.

@nmichaud
Contributor

@trxcllnt that would be great! We definitely depend on the eagerly-allocated columns list in our integration, so we'll need to take a deeper look into the changes in the new version.

@texodus
Member

texodus commented Mar 17, 2018

Thanks for the fix!

@texodus texodus closed this as completed Mar 17, 2018
texodus pushed a commit that referenced this issue Feb 26, 2019
Moved the getChartElement helper into a new module