This repository has been archived by the owner. It is now read-only.
  • Past due by about 2 months. Last updated 15 days ago.

    The goal of this milestone is to have DataFusion working well enough to run single-threaded SQL queries against CSV and Parquet data sources, supporting projection, selection, cast, type coercion, sort (in memory) and simple aggregates (in memory).

     
    100% complete
  • No due date. Last updated 15 days ago.

    Implement distributed processing:
      • Implement serialization for RecordBatch and Schema (using Arrow IPC) so that data can be persisted to disk and streamed between nodes
      • Implement basic distributed query planner
      • Implement serialization for query plans
      • Docker packaging for worker nodes
      • Kubernetes to orchestrate cluster

     
    100% complete
  • No due date. Last updated 15 days ago.

    Implement JOIN, ORDER BY, UNION, SUBQUERY

     
    100% complete