## Loading Data into Tables - Local

Let us understand how to load data into Spark Metastore tables. We can load data either from the local file system or from HDFS; here we load from the local file system.

In [5]:
%%HTML
<iframe width="560" height="315" src="https://www.youtube.com/embed/3leee3drHs0?rel=0&amp;controls=1&amp;showinfo=0" frameborder="0" allowfullscreen></iframe>

* The data should match the structure of the Spark Metastore table.
* We need to create the table with the same file format and delimiters as the files, so that the data in the files can be loaded into the Spark Metastore table.
* Our data is in text files; the line delimiter is the newline character and the field delimiter is a comma.
* As our table uses the default file format (text file) and the default line/record delimiter, and the field delimiter is specified as a comma, we should be able to load the data without any issues.
* Here is the script which creates the table and then loads data into it.
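
Before running the script, the delimiter and column-count assumptions in the points above can be sanity-checked against the local files. The following is a minimal sketch in plain Python, not part of the original script: the helper name `find_malformed_lines` is hypothetical, and it assumes 4 comma-separated fields per record (matching the 4 columns of `orders`) and that the notebook process can read the local directory. It splits on every comma and does no quote handling.

```python
from pathlib import Path

# Assumption: orders has 4 columns and the files are comma-delimited text.
EXPECTED_FIELDS = 4


def find_malformed_lines(data_dir, expected_fields=EXPECTED_FIELDS, delimiter=","):
    """Return (file name, line number, field count) for each record whose
    number of fields differs from expected_fields."""
    malformed = []
    for path in sorted(Path(data_dir).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories
        with path.open(encoding="utf-8") as handle:
            for line_number, line in enumerate(handle, start=1):
                # Plain split on the delimiter; quoted commas are not handled.
                fields = line.rstrip("\n").split(delimiter)
                if len(fields) != expected_fields:
                    malformed.append((path.name, line_number, len(fields)))
    return malformed
```

Calling `find_malformed_lines('/data/retail_db/orders')` should return an empty list when every record has exactly 4 fields; any tuples returned point at lines worth inspecting before the load.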

In [None]:
%%sql

USE itversity_retail

In [None]:
%%sql

DROP TABLE IF EXISTS orders

In [None]:
%%sql

CREATE TABLE orders (
  order_id INT COMMENT 'Unique order id',
  order_date STRING COMMENT 'Date on which order is placed',
  order_customer_id INT COMMENT 'Customer id who placed the order',
  order_status STRING COMMENT 'Current status of the order'
) COMMENT 'Table to save order level details'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','

In [None]:
%%sql

LOAD DATA LOCAL INPATH '/data/retail_db/orders' INTO TABLE orders

* The same steps can be run using `spark.sql` with Python or Scala.

In [None]:
spark.sql("USE itversity_retail")

In [None]:
spark.sql("DROP TABLE IF EXISTS orders")

In [None]:
spark.sql("""
CREATE TABLE orders (
  order_id INT COMMENT 'Unique order id',
  order_date STRING COMMENT 'Date on which order is placed',
  order_customer_id INT COMMENT 'Customer id who placed the order',
  order_status STRING COMMENT 'Current status of the order'
) COMMENT 'Table to save order level details'
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
""")

In [None]:
spark.sql("LOAD DATA LOCAL INPATH '/data/retail_db/orders' INTO TABLE orders")

* Once the data is loaded, we can run these queries to preview the data and get the record count.

In [None]:
%%sql

SELECT * FROM orders LIMIT 10

In [None]:
%%sql

SELECT count(1) FROM orders
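
To confirm that every record was loaded, the count returned by the query above can be compared with the number of lines in the source files. This is a minimal sketch in plain Python; the helper name `count_local_records` is hypothetical, and it assumes one record per line and that the notebook process can read the local directory.

```python
from pathlib import Path


def count_local_records(data_dir):
    """Count the lines across all regular files directly under data_dir."""
    total = 0
    for path in sorted(Path(data_dir).iterdir()):
        if path.is_file():
            with path.open(encoding="utf-8") as handle:
                # One record per line, so the line count is the record count.
                total += sum(1 for _ in handle)
    return total
```

In a Python cell, the value of `count_local_records('/data/retail_db/orders')` can then be compared with `spark.sql("SELECT count(1) FROM orders").first()[0]`; the two numbers should match if the table held no rows before the load.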