TiSpark 1.0 GA

@ilovesoup ilovesoup released this 26 Apr 13:36
· 913 commits to master since this release
6c9136d

TiSpark provides distributed computing of TiDB data using Apache Spark.

  • Provide a gRPC communication framework to read data from TiKV
  • Provide encoding and decoding for TiKV component data and the TiKV communication protocol
  • Provide calculation pushdown, which includes:
    • Aggregate pushdown
    • Predicate pushdown
    • TopN pushdown
    • Limit pushdown
  • Provide index-related support
    • Transform predicate into Region key range or secondary index
    • Optimize Index Only queries
    • Adaptively downgrade an index scan to a table scan per Region
  • Provide cost-based optimization
    • Support statistics
    • Select index
    • Estimate broadcast table cost
  • Provide support for multiple Spark interfaces
    • Support Spark Shell
    • Support ThriftServer/JDBC
    • Support Spark-SQL interaction
    • Support PySpark Shell
    • Support SparkR
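
To make the "Transform predicate into Region key range" item above more concrete, here is a minimal toy sketch in Python. It is not TiSpark's implementation and does not use any TiSpark or TiKV API: it assumes rows are addressed by an integer handle, that each Region owns a half-open handle interval, and that all predicates are joined by AND. The names `Region`, `predicate_to_range`, `intersect`, and `split_by_region` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Half-open interval [start, end) over integer row handles (toy assumption).
KeyRange = Tuple[int, int]


@dataclass(frozen=True)
class Region:
    """Hypothetical stand-in for a Region that owns handles in [start, end)."""
    region_id: int
    start: int
    end: int


def intersect(a: KeyRange, b: KeyRange) -> KeyRange:
    """Intersect two half-open ranges; an empty result has start == end."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, max(start, end))


def predicate_to_range(op: str, value: int, lo: int, hi: int) -> KeyRange:
    """Translate `handle <op> value` into a half-open range clipped to [lo, hi)."""
    if op == "=":
        raw = (value, value + 1)
    elif op == "<":
        raw = (lo, value)
    elif op == "<=":
        raw = (lo, value + 1)
    elif op == ">":
        raw = (value + 1, hi)
    elif op == ">=":
        raw = (value, hi)
    else:
        raise ValueError(f"unsupported operator: {op}")
    return intersect(raw, (lo, hi))


def split_by_region(rng: KeyRange, regions: List[Region]) -> List[Tuple[int, KeyRange]]:
    """Cut one key range into per-Region scan tasks, skipping untouched Regions."""
    tasks = []
    for region in regions:
        part = intersect(rng, (region.start, region.end))
        if part[0] < part[1]:
            tasks.append((region.region_id, part))
    return tasks


# Usage: WHERE handle >= 50 AND handle < 150 over three toy Regions.
regions = [Region(1, 0, 100), Region(2, 100, 200), Region(3, 200, 300)]
combined = intersect(
    predicate_to_range(">=", 50, 0, 300),
    predicate_to_range("<", 150, 0, 300),
)
tasks = split_by_region(combined, regions)
print(tasks)
```

In this toy model only Regions 1 and 2 receive a scan task, which is the general benefit of turning predicates into key ranges: Regions whose keys cannot match are never read.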