## Inserting into Existing Tables

Let us understand how we can insert data into existing tables using `insertInto`.

* We can use modes such as `append` and `overwrite` with `insertInto`. The default is `append`.
* When we use `insertInto`, the following happens:
  * If the table does not exist, `insertInto` throws an exception.
  * If the table exists, data is appended by default.
  * We can alter this behavior with the keyword argument `overwrite`. It is `False` by default; passing `True` replaces the existing data.
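
The behavior listed above can be wrapped in a small helper. This is only a minimal sketch, not part of the PySpark API: `insert_into_existing` and its `replace` parameter are hypothetical names, and the function assumes `spark` is an active SparkSession whose current database is the one holding the target table.

```python
def insert_into_existing(spark, df, table_name, replace=False):
    """Insert df into a table that already exists in the current database.

    replace=False appends to the existing rows; replace=True overwrites them.
    """
    # Names of the tables the catalog reports for the current database.
    existing = {t.name.lower() for t in spark.catalog.listTables()}
    if table_name.lower() not in existing:
        # insertInto never creates a table and would fail here anyway;
        # checking first only produces a clearer error message.
        raise ValueError(f"Table '{table_name}' does not exist in the current database")
    # overwrite is the keyword argument described in the list above.
    df.write.insertInto(table_name, overwrite=replace)
```

With a Data Frame `df` whose columns match the table, `insert_into_existing(spark, df, "employees")` would append, while `insert_into_existing(spark, df, "employees", replace=True)` would replace the existing rows.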

Let us start the Spark session for this notebook so that we can execute the code provided.

If you want to use the terminal for practice, here is the command to launch the PySpark shell.

```
pyspark2 \
  --master yarn \
  --name "Spark Metastore" \
  --conf spark.ui.port=0
```

In [None]:
from pyspark.sql import SparkSession

# Build (or reuse) the session used by the rest of this notebook
spark = (
    SparkSession.builder
    .config("spark.ui.port", "0")
    .appName("Spark Metastore")
    .master("yarn")
    .getOrCreate()
)

In [None]:
spark.conf.set("spark.sql.shuffle.partitions", "2")

### Tasks

Let us perform a few tasks to understand how to write a Data Frame into existing tables in the Metastore.

* Make sure the hr_db database (prefixed with the user name in the code below) and the `employees` table in it are created.

In [None]:
spark.catalog.listDatabases()

In [None]:
import getpass  # assumption: the database name is prefixed with the OS user name
spark.catalog.setCurrentDatabase(f"{getpass.getuser()}_hr_db")

In [None]:
spark.catalog.listTables()

* Create the employees Data Frame and insert its data into the `employees` table in the hr_db database. Make sure the existing data is overwritten.

In [None]:
employees = [(1, "Scott", "Tiger", 1000.0, "united states"),
             (2, "Henry", "Ford", 1250.0, "India"),
             (3, "Nick", "Junior", 750.0, "united KINGDOM"),
             (4, "Bill", "Gomes", 1500.0, "AUSTRALIA")
            ]

In [None]:
employeesDF = spark.createDataFrame(employees,
    schema="""employee_id INT, first_name STRING, last_name STRING,
              salary FLOAT, nationality STRING
           """
)

In [None]:
employeesDF.write.insertInto("employees", overwrite=True)

In [None]:
spark.read.table("employees").show()
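
One detail worth keeping in mind for the insert above: `insertInto` matches Data Frame columns to table columns by position and ignores the column names, so the Data Frame's column order has to follow the table definition. The sketch below shows one possible way to guard against a mismatch. `columns_in_table_order` and `insert_by_name` are hypothetical helper names, not PySpark functions, and the code assumes the Data Frame and the table use exactly the same column names.

```python
def columns_in_table_order(df_columns, table_columns):
    """Return the Data Frame's column names in the target table's order.

    Raises ValueError when the two sets of names differ, because a
    positional insert would then write values into the wrong columns.
    """
    if sorted(df_columns) != sorted(table_columns):
        raise ValueError(
            f"Column names differ: {sorted(df_columns)} vs {sorted(table_columns)}"
        )
    # The names match, so the table's own order is the order to select in.
    return list(table_columns)


def insert_by_name(spark, df, table_name, replace=False):
    """Reorder df to the table's column order, then call insertInto."""
    target_columns = spark.read.table(table_name).columns
    ordered = columns_in_table_order(df.columns, target_columns)
    df.select(*ordered).write.insertInto(table_name, overwrite=replace)
```

For the task above, `insert_by_name(spark, employeesDF, "employees", replace=True)` would behave like the `insertInto` call shown earlier, except that it fails early if the column names do not line up with the table.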