Yi Wang edited this page Oct 19, 2018 · 6 revisions

Hive

SQL

https://stackoverflow.com/questions/18049444/hive-select-into

CREATE TABLE people1990 AS SELECT * FROM people WHERE dob_year = 1990;

Python

  1. https://dev.mysql.com/doc/connector-python/en/connector-python-examples.html

    Save the results to a local CSV file and upload it to Hive. This doesn't seem scalable.
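A minimal sketch of the "save results to a local CSV file" half of this approach. It assumes a generic DB-API cursor; `sqlite3` is used here only as a runnable stand-in for a MySQL connection, and the function name `dump_query_to_csv`, the sample table, and the file name are all hypothetical.

```python
import csv
import os
import sqlite3
import tempfile


def dump_query_to_csv(cursor, query, csv_path, header=False):
    """Run `query` on a DB-API cursor and write the rows to `csv_path`.

    Returns the number of data rows written.
    """
    cursor.execute(query)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        if header:
            # Column names come from the cursor's result metadata.
            writer.writerow([col[0] for col in cursor.description])
        row_count = 0
        for row in cursor:
            writer.writerow(row)
            row_count += 1
    return row_count


# Stand-in database (assumption): an in-memory SQLite table instead of MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE people (name TEXT, dob_year INTEGER)")
cur.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Ann", 1990), ("Bob", 1985), ("Cy", 1990)],
)

csv_path = os.path.join(tempfile.mkdtemp(), "people1990.csv")
n = dump_query_to_csv(
    cur,
    "SELECT name, dob_year FROM people WHERE dob_year = 1990 ORDER BY name",
    csv_path,
)
print(n, "rows written to", csv_path)
```

The header row is off by default because a plain Hive text table would load it as a data row. The resulting file could then be uploaded with a Hive statement such as `LOAD DATA LOCAL INPATH ... INTO TABLE ...`; every row still passes through one local file, which is the scalability concern noted above.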

  2. https://stackoverflow.com/a/26777130

    1. Load the data into a relational database such as MySQL.
    2. Import the data from the relational database into HDFS using Apache Sqoop.
    3. Create a Hive table stored in Parquet format.
    4. Load the data from HDFS into the Hive table.
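Steps 2 to 4 above can be sketched as a Sqoop command line followed by two HiveQL statements. The Python below only builds the strings and does not run anything; the helper names, JDBC URL, user name, table, columns, and HDFS path are hypothetical, and whether `LOAD DATA INPATH` handles the Sqoop output directory cleanly depends on the Sqoop and Hive versions in use.

```python
import shlex


def sqoop_import_command(jdbc_url, username, table, target_dir):
    """Build the argument list for a Sqoop import that writes Parquet to HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--username", username,
        "-P",                      # prompt for the password on the console
        "--table", table,
        "--target-dir", target_dir,
        "--as-parquetfile",        # write Parquet instead of delimited text
    ]


def hive_create_parquet_table(table, columns):
    """Build a CREATE TABLE statement for a Parquet-backed Hive table.

    `columns` is a list of (name, hive_type) pairs.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return f"CREATE TABLE {table} ({cols}) STORED AS PARQUET"


def hive_load_from_hdfs(table, hdfs_path):
    """Build a statement that moves files from an HDFS path into the table."""
    return f"LOAD DATA INPATH '{hdfs_path}' INTO TABLE {table}"


# Hypothetical values, for illustration only.
cmd = sqoop_import_command(
    "jdbc:mysql://dbhost/sales", "etl", "people", "/user/etl/people"
)
ddl = hive_create_parquet_table("people", [("name", "STRING"), ("dob_year", "INT")])
load = hive_load_from_hdfs("people", "/user/etl/people")

print(" ".join(shlex.quote(arg) for arg in cmd))
print(ddl + ";")
print(load + ";")
```

Unlike the CSV route, the bulk transfer here is done by Sqoop's parallel mappers, so no single local file has to hold the whole table.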

SparkSQL

SQL

https://docs.databricks.com/spark/latest/spark-sql/language-manual/insert.html

Similar to Hive: Spark SQL accepts the same CREATE TABLE ... AS SELECT form, as well as INSERT INTO ... SELECT.

Python

MySQL

SQL

https://dev.mysql.com/doc/refman/8.0/en/ansi-diff-select-into-table.html

INSERT INTO tbl_temp2 (fld_id)
   SELECT tbl_temp1.fld_order_id
   FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100;

Python

https://dzone.com/articles/write-csv-data-hive-and-python
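The MySQL `INSERT INTO ... SELECT` statement shown earlier can also be issued from Python through any DB-API connection. The sketch below uses an in-memory `sqlite3` database as a runnable stand-in for MySQL (an assumption; this particular statement is valid in both), with made-up sample rows.

```python
import sqlite3

# Stand-in for a MySQL connection (assumption): in-memory SQLite.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source and destination tables, using the names from the SQL example.
cur.execute("CREATE TABLE tbl_temp1 (fld_order_id INTEGER)")
cur.execute("CREATE TABLE tbl_temp2 (fld_id INTEGER)")
cur.executemany("INSERT INTO tbl_temp1 VALUES (?)", [(50,), (150,), (200,)])

# Copy the selected rows entirely inside the database; no rows pass through Python.
cur.execute(
    """
    INSERT INTO tbl_temp2 (fld_id)
       SELECT tbl_temp1.fld_order_id
       FROM tbl_temp1 WHERE tbl_temp1.fld_order_id > 100
    """
)
conn.commit()

copied = [row[0] for row in cur.execute("SELECT fld_id FROM tbl_temp2 ORDER BY fld_id")]
print(copied)
```

With a MySQL driver the same `execute` and `commit` calls would apply; only the connection setup differs.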