[SPARK-3880] HBase as data source to SparkSQL #4084

Closed
wants to merge 106 commits into from

Commits on Jan 7, 2015

  1. 5c7f25f
  2. 990038a
  3. a9d0e6b
  4. 3ec2004
  5. remove the duplicate folder

    bomeng committed Jan 7, 2015
    d223b79

Commits on Jan 8, 2015

  1. fix for JavaAPI 'show tables'

    xinyunh committed Jan 8, 2015
    1d214fd
  2. fix java API issue on 'Describe Table'

    Yan Zhou committed Jan 8, 2015
    7afc8e6
  3. fix a Describe Table issue

    xinyunh committed Jan 8, 2015
    2618294
  4. some code cleanup

    xinyunh committed Jan 8, 2015
    1b4ede8
  5. more cleanup of leftover files

    xinyunh committed Jan 8, 2015
    090b215
  6. Fix for snowy-evening issue #5 "sbt test do not recognize MiniHBaseCluster".
    
    The SBT test framework failed to put the HBase test artifacts on its classpath because of an Ivy bug: sbt/sbt#861.
    
    SBT/IVY can fail to resolve transitive dependencies defined in a 'pom-only' module, and 'hbase-testing-util' is a pom-only module.  (Maven resolves them without any problem.)
    The workaround is to replace the hbase-testing-util dependency definition in spark/sql/hbase/pom.xml with the dependencies listed in that pom-only module, at the cost of developers having to edit this pom file to make it work for unprofiled combinations of HBase/Hadoop releases.
    
    =================================
    
    This works only for hadoop2-compat hbase distributions.  The spark/sql/hbase/pom.xml must be edited to make it work for hadoop1-compat hbase distributions (more on that later).
    
    The fix has been tested against HBase 0.98.5-hadoop2 and 0.98.7-hadoop2, with hadoop 2.2.0, 2.3.0, and 2.4.0.
    
    =================
    SBT test commands
    =================
    
    sbt/sbt -Phive,hbase,yarn,hadoop-2.2 -Dhadoop.version=2.2.0  "hbase/test-only org.apache.spark.sql.hbase.BasicQueriesSuite"
    sbt/sbt -Phive,hbase,yarn,hadoop-2.3 -Dhadoop.version=2.3.0   "hbase/test-only org.apache.spark.sql.hbase.BasicQueriesSuite"
    sbt/sbt -Phive,hbase,yarn,hadoop-2.4 -Dhadoop.version=2.4.0   "hbase/test-only org.apache.spark.sql.hbase.BasicQueriesSuite"
    
    =================
    MVN test commands
    =================
    
    mvn -e -Pyarn,hbase,hadoop-2.2  -Dhadoop.version=2.2.0  -pl sql/hbase clean  test -DwildcardSuites=org.apache.spark.sql.hbase.BasicQueriesSuite
    mvn -e -Pyarn,hbase,hadoop-2.3  -Dhadoop.version=2.3.0  -pl sql/hbase clean  test -DwildcardSuites=org.apache.spark.sql.hbase.BasicQueriesSuite
    mvn -e -Pyarn,hbase,hadoop-2.4  -Dhadoop.version=2.4.0  -pl sql/hbase clean  test -DwildcardSuites=org.apache.spark.sql.hbase.BasicQueriesSuite
    
    ===========================
    
    To build against a hadoop1-compatible HBase release (for example, on Hadoop 1.2.1), you will need to edit the pom.
    Add the appropriate Hadoop profile to import the Hadoop 1.2.1 libraries, and redefine the HBase dependencies to match the module structure of your chosen HBase release.
    
    The SBT libraryDependencies declaration below describes the mvn dependencies you would need to define in spark/sql/hbase/pom.xml to build
    Spark against HBase v0.98.5-hadoop1 over Hadoop v1.2.1.
    
    libraryDependencies ++= Seq(
        "org.scalatest" %% "scalatest" % "2.2.1" % "test",
        "com.novocode" % "junit-interface" % "0.9" % "test",
        "org.apache.hbase" % "hbase-common" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-common" % "0.98.5-hadoop1" % "test" classifier "tests",
        // no hbase-annotations module before v0.98.7
        //"org.apache.hbase" % "hbase-annotations" % "0.98.5-hadoop1",
        //"org.apache.hbase" % "hbase-annotations" % "0.98.5-hadoop1" % "test" classifier "tests",
        "org.apache.hbase" % "hbase-protocol" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-client" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-server" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-server" % "0.98.5-hadoop1" % "test" classifier "tests",
        "org.apache.hbase" % "hbase-hadoop-compat" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-hadoop-compat" % "0.98.5-hadoop1" % "test" classifier "tests",
        "org.apache.hbase" % "hbase-hadoop1-compat" % "0.98.5-hadoop1",
        "org.apache.hbase" % "hbase-hadoop1-compat" % "0.98.5-hadoop1" % "test" classifier "tests",
        "org.slf4j" % "slf4j-log4j12" % "1.6.4",
        "org.apache.hadoop" % "hadoop-core" % "1.2.1",
        "org.apache.hadoop" % "hadoop-client" % "1.2.1",
        "org.apache.hadoop" % "hadoop-minicluster" % "1.2.1",
        "org.apache.hadoop" % "hadoop-test" % "1.2.1",
        "org.apache.hadoop" % "hadoop-tools" % "1.2.1"
    )
    
    Note that there are no 'hbase-annotations' modules in HBase releases prior to v0.98.7.
    sparksburnitt committed Jan 8, 2015
    d328d75
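The SBT and Maven test commands listed in the commit message above differ only in the Hadoop version, so they can be generated rather than typed by hand. The following is a minimal shell sketch under that assumption. The helper names `sbt_test_cmd` and `mvn_test_cmd` are hypothetical and not part of the Spark build, and the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Print the SBT and Maven test commands from the commit message for each
# tested Hadoop release. The helper names are illustrative only.

SUITE="org.apache.spark.sql.hbase.BasicQueriesSuite"

# $1 = full Hadoop version, e.g. 2.3.0.
# The build profile uses only major.minor (hadoop-2.3), so strip the last part.
sbt_test_cmd() {
    profile="hadoop-${1%.*}"
    printf 'sbt/sbt -Phive,hbase,yarn,%s -Dhadoop.version=%s "hbase/test-only %s"\n' \
        "$profile" "$1" "$SUITE"
}

mvn_test_cmd() {
    profile="hadoop-${1%.*}"
    printf 'mvn -e -Pyarn,hbase,%s -Dhadoop.version=%s -pl sql/hbase clean test -DwildcardSuites=%s\n' \
        "$profile" "$1" "$SUITE"
}

# Hadoop releases the commit message reports as tested.
for version in 2.2.0 2.3.0 2.4.0; do
    sbt_test_cmd "$version"
    mvn_test_cmd "$version"
done
```

Piping the output to `sh` would run the commands in sequence; printing first makes it easy to check the profile and version pairing before a long build starts.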
  7. fix the testcase

    bomeng committed Jan 8, 2015
    34f326d
  8. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    sparksburnitt committed Jan 8, 2015
    b4ccd6b
  9. remove println

    bomeng committed Jan 8, 2015
    98109f7
  10. remove println

    bomeng committed Jan 8, 2015
    575c778
  11. remove println

    bomeng committed Jan 8, 2015
    b3ada4a
  12. remove println

    bomeng committed Jan 8, 2015
    b35db4a
  13. remove println

    bomeng committed Jan 8, 2015
    d7737d7
  14. Modify the pom and Remove the useless testcases

    xinyunh committed Jan 8, 2015
    2af3727

Commits on Jan 9, 2015

  1. remove buggy codes

    bomeng committed Jan 9, 2015
    435184e
  2. ca02383
  3. clean up the codes

    bomeng committed Jan 9, 2015
    5dbe2b7
  4. use of SparkHome in BulkLoadIntoTableSuite

    Yan Zhou committed Jan 9, 2015
    ce5ad97
  5. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    Yan Zhou committed Jan 9, 2015
    64e59ca
  6. 4e72f70
  7. 5ccefe7
  8. fix bulkload issues

    bomeng committed Jan 9, 2015
    9976fa6
  9. d4ffa00
  10. Modify the path for BulkLoadSuite

    xinyunh committed Jan 9, 2015
    0d5c3a6
  11. add assert

    bomeng committed Jan 9, 2015
    3087c5c
  12. test code cleanup

    Yan Zhou committed Jan 9, 2015
    d2baa54
  13. merge from origin

    Yan Zhou committed Jan 9, 2015
    caa3edf
  14. fix typo

    sparksburnitt committed Jan 9, 2015
    d1a96e2
  15. enable a HBasePartitioner test case

    Yan Zhou committed Jan 9, 2015
    0cce338
  16. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    Yan Zhou committed Jan 9, 2015
    3338424
  17. fix partitioner test failure

    Yan Zhou committed Jan 9, 2015
    ec0f5d7
  18. add asserts

    sparksburnitt committed Jan 9, 2015
    95858f3
  19. polish on Partition(er) test

    Yan Zhou committed Jan 9, 2015
    baff6e8
  20. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    Yan Zhou committed Jan 9, 2015
    89ca306
  21. add asserts

    sparksburnitt committed Jan 9, 2015
    cd0a6fa
  22. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    sparksburnitt committed Jan 9, 2015
    1008e94
  23. remove unused import

    sparksburnitt committed Jan 9, 2015
    599ebd6
  24. 613c2af
  25. change val name 'd' & add asserts

    sparksburnitt committed Jan 9, 2015
    f876e5c
  26. move test JavaHbaseSuite.scala to right pkg

    sparksburnitt committed Jan 9, 2015
    06709e9
  27. add hbase api to pyspark

    kenmsj committed Jan 9, 2015
    18b307e
  28. fix typo

    sparksburnitt committed Jan 9, 2015
    f91d44e

Commits on Jan 10, 2015

  1. remove log statement

    sparksburnitt committed Jan 10, 2015
    7dbd697
  2. comment out log statements

    sparksburnitt committed Jan 10, 2015
    25d3fb0
  3. comment out & remove some log statements

    sparksburnitt committed Jan 10, 2015
    66da382
  4. 0de92fc
  5. comment out another log statement

    sparksburnitt committed Jan 10, 2015
    a382f11
  6. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    sparksburnitt committed Jan 10, 2015
    7183c8e
  7. fix typo

    sparksburnitt committed Jan 10, 2015
    2f6be5d
  8. 192d2a5

Commits on Jan 12, 2015

  1. 2592451
  2. 381ed2d
  3. remove unnecessary codes

    bomeng committed Jan 12, 2015
    6f91555
  4. fix style issues

    bomeng committed Jan 12, 2015
    e304e7b
  5. fix styles

    bomeng committed Jan 12, 2015
    90c41ad
  6. eb36a2d
  7. Fix some errors in testcases

    xinyunh committed Jan 12, 2015
    1a85c7a
  8. add assert

    sparksburnitt committed Jan 12, 2015
    5e87258
  9. Modify the HBase testcases

    kenmsj committed Jan 12, 2015
    04dca1c
  10. remove jar file

    bomeng committed Jan 12, 2015
    757d805
  11. fix the message

    bomeng committed Jan 12, 2015
    1e59122
  12. change to use variable

    bomeng committed Jan 12, 2015
    b238c04

Commits on Jan 13, 2015

  1. 0cbb0ce
  2. Fix the configuration error in BulkLoad

    xinyunh committed Jan 13, 2015
    402cad3
  3. remove unused imports

    bomeng committed Jan 13, 2015
    ae7ce57
  4. Fix the log error

    xinyunh committed Jan 13, 2015
    db00a26
  5. add configuration to HBaseSQLContext

    Yan Zhou committed Jan 13, 2015
    ce052a5
  6. move Configuration setting from catalog to HBaseSQLContext; handles null value in non-key columns during loading

    Yan Zhou committed Jan 13, 2015
    e8fc610
  7. d961c5d
  8. remove unused import

    sparksburnitt committed Jan 13, 2015
    1ee6a46
  9. fix classname typo

    sparksburnitt committed Jan 13, 2015
    405b371
  10. fix classname typo

    sparksburnitt committed Jan 13, 2015
    40f33ff
  11. e471248
  12. was renamed InsertWrappers

    sparksburnitt committed Jan 13, 2015
    7d85ce2
  13. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    Yan Zhou committed Jan 13, 2015
    c77797f
  14. Merge branch 'HW_HBase' of https://github.com/Huawei-Spark/spark into HW_HBase

    Yan Zhou committed Jan 13, 2015
    3ad6904
  15. 7b6bb0c
  16. support insertion and bulkloading of nullable columns

    sparksburnitt committed Jan 13, 2015
    525bea6
  17. forgot to uncomment old test cases, and use unique htable name for new null col-value load/select test

    sparksburnitt committed Jan 13, 2015
    477eef4

Commits on Jan 14, 2015

  1. 2db8852
  2. Fix bulkload import tests: some test cases were writing HFiles to the same mapreduce hfile output dir during the bulkloader's text->hfile transform, resulting in broken HFile imports after the 1st test. The workaround is to make sure the physical HTable names are unique across test cases in the suite.

    sparksburnitt committed Jan 14, 2015
    e459cf8

Commits on Jan 15, 2015

  1. Reduce the call to getMetadataTable

    xinyunh committed Jan 15, 2015
    83cf5bc
  2. Modify createTable

    xinyunh committed Jan 15, 2015
    59be1cc
  3. c438a8b
  4. 00899e1
  5. reuse val in hadoopreader

    scwf committed Jan 15, 2015
    8b46235
  6. 9022a74
  7. hot fix for compile error

    scwf committed Jan 15, 2015
    6f732c4
  8. style fix

    scwf committed Jan 15, 2015
    8545195
  9. Use set to gather the families

    xinyunh committed Jan 15, 2015
    7f85ac8
  10. Add 'hbase' module to allProjects list to fix sbt's sql/hbase module classloading problem and fix typo: ExludedDependencies -> ExcludedDependencies

    sparksburnitt committed Jan 15, 2015
    2dc1bb3
  11. Change sql 'LOAD DATA LOCAL INPATH' to 'LOAD PARALL DATA LOCAL INPATH' to fix 3 test cases expecting parser to return ParallelizedBulkLoadIntoTableCommand.

    sparksburnitt committed Jan 15, 2015
    4049151
  12. test query containing '... where col is not null'

    sparksburnitt committed Jan 15, 2015
    0e05d30

Commits on Jan 16, 2015

  1. fix test warning

    scwf committed Jan 16, 2015
    1b5f8ee
  2. more refinements

    bomeng committed Jan 16, 2015
    eba0c62
  3. remove unused codes

    bomeng committed Jan 16, 2015
    a3637a7
  4. fix a partial evaluation error in NOT; and the issue that create table on a nonexistent HBase table created an HBase table with only one presplit region

    Yan Zhou committed Jan 16, 2015
    c811717
  5. Fix the error of OR case in ScanPredClassifier

    xinyunh committed Jan 16, 2015
    3bb6ccf

Commits on Jan 17, 2015

  1. Add HTable close before exiting the function

    xinyunh committed Jan 17, 2015
    0c71c55
  2. Modify the htable in HBaseRelation

    xinyunh committed Jan 17, 2015
    f2a6c16
  3. 4e8e72a