### Uploading CSV to Vertica database and downloading data from Vertica to CSV

In this notebook, we show how to load data from CSV files into a Vertica database, and how to download data from Vertica into a CSV file, using [sqlalchemy](https://www.sqlalchemy.org/). We first need to install the Vertica dialect for sqlalchemy:
`pip install sqlalchemy-vertica-python`

We also need to install the native [Python client](https://github.com/vertica/vertica-python) for the Vertica Analytics Database (`pip install vertica-python`).

In [1]:
import pandas as pd
import sqlalchemy as sa
import vertica_python  # not called directly; confirms the client used by the dialect is installed

The following function builds the `dtype` mapping that we pass to `to_sql`: it maps every `object` (text) column to `VARCHAR`, so that these columns are created with the correct type in the database.

In [2]:
def updateType(df_para):
    """Return a column-to-SQL-type mapping that stores object (text) columns as VARCHAR."""
    dtypedict = {}  # create an empty dictionary
    for i, j in zip(df_para.columns, df_para.dtypes):
        if "object" in str(j):
            dtypedict.update({i: sa.types.VARCHAR})

    return dtypedict
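
One caveat with a bare `VARCHAR`: Vertica documents a default length of 80 when no length is given, so longer text values may not fit. A minimal sketch of a length-aware variant follows. `sized_varchar_types` and its `minimum` parameter are hypothetical names that are not part of this notebook, and lengths are measured in UTF-8 bytes on the assumption that this matches how the target column is sized.

```python
import pandas as pd
import sqlalchemy as sa


def sized_varchar_types(df, minimum=1):
    """Map each text column to a VARCHAR sized to its longest value (hypothetical helper)."""
    dtypes = {}
    for column, dtype in zip(df.columns, df.dtypes):
        # Only text columns get an explicit type; numbers and dates are left to pandas.
        if isinstance(dtype, pd.StringDtype) or dtype == object:
            # Length in UTF-8 bytes of every non-null value in the column.
            lengths = df[column].dropna().map(lambda value: len(str(value).encode("utf-8")))
            longest = int(lengths.max()) if not lengths.empty else minimum
            dtypes[column] = sa.types.VARCHAR(length=max(longest, minimum))
    return dtypes


# Illustrative sample data, not taken from the real CSV files.
sample = pd.DataFrame(
    {
        "POLICY": ["V000000", "V000001"],
        "CLAIM_REASON": ["Collision", "Scratch/Dent"],
        "CLAIM_AMOUNT": [276.35, 697.95],
    }
)
sized = sized_varchar_types(sample)
print({name: sql_type.length for name, sql_type in sized.items()})
```

The returned dictionary can be passed to `to_sql` through the `dtype` argument in the same way as the result of `updateType`.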

We create the engine pointing to the Vertica database. The database used here is the Vertica Community Edition.

In [3]:
engine = sa.create_engine(
    "vertica+vertica_python://dbadmin:password@192.168.56.101:5433/VMart"
)
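
Hard-coding credentials in the URL is convenient for a local demo database. As a hedged alternative, the sketch below assembles the same URL with `sqlalchemy.engine.URL.create` (available in SQLAlchemy 1.4 and later) and reads the password from an environment variable. The variable name `VERTICA_PASSWORD` is an assumption made for illustration; the host, port, user and database simply mirror the values used above.

```python
import os

import sqlalchemy as sa

# VERTICA_PASSWORD is a hypothetical variable name; fall back to the demo value.
password = os.environ.get("VERTICA_PASSWORD", "password")

url = sa.engine.URL.create(
    drivername="vertica+vertica_python",
    username="dbadmin",
    password=password,
    host="192.168.56.101",
    port=5433,
    database="VMart",
)

# Rendering with hide_password=True masks the secret, which is safer for logs.
print(url.render_as_string(hide_password=True))

# The URL object can then be passed to sa.create_engine(url) in place of the string.
```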

In [4]:
dbConnection = engine.connect()

### Uploading CSV into Vertica database

We read the CSV files and create the tables necessary for [02_main_vertica_db.ipynb](02_main_vertica_db.ipynb).

In [26]:
policy = pd.read_csv("https://data.atoti.io/notebooks/customer360/Policy_life.csv")

policy["QUOTE_DATE"] = pd.to_datetime(policy["QUOTE_DATE"])
policy["COVER_START"] = pd.to_datetime(policy["COVER_START"])

policy_updatedict = updateType(policy)

policy.to_sql(
    name="policy_life",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=policy_updatedict,
)
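
Because the table is written with `if_exists="append"`, running the cell again adds the same rows a second time. The sketch below illustrates that behaviour on an in-memory SQLite database rather than Vertica, purely so that it runs without a server. The sample rows and the `PREMIUM` column are made up, and the table name is reused only for illustration.

```python
import io
import sqlite3

import pandas as pd

# Two made-up rows standing in for the real CSV file.
csv_text = "POLICY,PREMIUM\nV000000,120.5\nV000001,98.0\n"
policy = pd.read_csv(io.StringIO(csv_text))

con = sqlite3.connect(":memory:")  # stand-in for the Vertica connection


def count_rows(connection):
    """Return the number of rows currently stored in policy_life."""
    result = pd.read_sql_query("SELECT COUNT(*) AS n FROM policy_life", connection)
    return int(result["n"].iloc[0])


# Appending twice stores every row twice.
policy.to_sql("policy_life", con, if_exists="append", index=False)
policy.to_sql("policy_life", con, if_exists="append", index=False)
rows_after_append = count_rows(con)

# Replacing drops and recreates the table, so the load becomes repeatable.
policy.to_sql("policy_life", con, if_exists="replace", index=False)
rows_after_replace = count_rows(con)

con.close()
print(rows_after_append, rows_after_replace)
```

Whether `append` or `replace` is the better choice depends on whether the target table is expected to exist already with its own definition.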

In [None]:
policy = pd.read_csv("https://data.atoti.io/notebooks/customer360/Policy_vehicle.csv")

policy["QUOTE_DATE"] = pd.to_datetime(policy["QUOTE_DATE"])
policy["COVER_START"] = pd.to_datetime(policy["COVER_START"])

policy_updatedict = updateType(policy)

policy.to_sql(
    name="policy_vehicle",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=policy_updatedict,
)

In [12]:
client = pd.read_csv("https://data.atoti.io/notebooks/customer360/customer.csv")

client_updatedict = updateType(client)

client.to_sql(
    name="client",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=client_updatedict,
)

In [13]:
addons = pd.read_csv("https://data.atoti.io/notebooks/customer360/additional_coverage.csv")

addons_updatedict = updateType(addons)

addons.to_sql(
    name="additional_coverage",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=addons_updatedict,
)

In [14]:
coverage = pd.read_csv("https://data.atoti.io/notebooks/customer360/coverage.csv")

coverage_updatedict = updateType(coverage)

coverage.to_sql(
    name="coverage",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=coverage_updatedict,
)
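
The upload cells above all follow the same pattern: read a CSV, optionally convert date columns, and append the rows to a table. A small helper can capture this. The version below is only a sketch: `upload_csv` and its parameters are hypothetical names, and it is exercised against in-memory SQLite with inline CSV text so that it runs without a Vertica server. With the Vertica connection, `updateType` could be passed as `dtype_builder` and `schema="public"` as an extra keyword argument.

```python
import io
import sqlite3

import pandas as pd


def upload_csv(source, table, con, date_columns=(), dtype_builder=None, **to_sql_kwargs):
    """Read a CSV, parse the given date columns, and append the rows to `table`."""
    df = pd.read_csv(source, parse_dates=list(date_columns))
    # dtype_builder is optional: a function returning a column-to-type mapping.
    dtype = dtype_builder(df) if dtype_builder is not None else None
    df.to_sql(name=table, con=con, if_exists="append", index=False, dtype=dtype, **to_sql_kwargs)
    return len(df)


# Inline stand-in for a file such as claims.csv; the values are illustrative.
claims_csv = io.StringIO(
    "POLICY,CLAIM_DATE,CLAIM_AMOUNT\n"
    "V000000,2013-12-01,276.35\n"
    "V000001,2014-04-11,697.95\n"
)

con = sqlite3.connect(":memory:")  # stand-in for the Vertica connection
loaded = upload_csv(claims_csv, "claims", con, date_columns=["CLAIM_DATE"])
stored = pd.read_sql_query("SELECT * FROM claims", con)
con.close()
print(loaded, list(stored.columns))
```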

In [28]:
claims = pd.read_csv("https://data.atoti.io/notebooks/customer360/claims.csv")
claims["CLAIM_DATE"] = pd.to_datetime(claims["CLAIM_DATE"])

In [30]:
claims.head()

|   | POLICY | CLAIM_DATE | CLAIM_AMOUNT | CLAIM_REASON |
|---|---|---|---|---|
| 0 | V000000 | 2013-12-01 | 276.351928 | Collision |
| 1 | V000001 | 2014-04-11 | 697.95359 | Scratch/Dent |
| 2 | V000002 | 2012-07-17 | 1288.743165 | Collision |
| 3 | V000003 | 2011-11-11 | 764.586183 | Collision |
| 4 | V000004 | 2013-06-08 | 281.369258 | Collision |


In [31]:
claims_updatedict = updateType(claims)

claims.to_sql(
    name="claims",
    con=dbConnection,
    schema="public",
    if_exists="append",
    index=False,
    dtype=claims_updatedict,
)

### Downloading data into CSV

In [None]:
# read_sql_query already returns a DataFrame, so no extra conversion is needed
df = pd.read_sql_query(
    "SELECT * FROM ADDITIONAL_COVERAGE",
    engine,
)

df.head()

In [None]:
df.to_csv("additional_coverage.csv", index=False)
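
For tables too large to hold in memory, `pd.read_sql_query` accepts a `chunksize` argument and then yields DataFrames piece by piece. The sketch below streams such chunks into a single CSV file. `export_query_to_csv` is a hypothetical helper name, the `ADDON` column and sample rows are made up, and SQLite again stands in for Vertica so that the example is runnable on its own.

```python
import os
import sqlite3
import tempfile

import pandas as pd


def export_query_to_csv(query, con, path, chunksize=1000):
    """Stream the result of `query` into a CSV file, `chunksize` rows at a time."""
    total = 0
    for i, chunk in enumerate(pd.read_sql_query(query, con, chunksize=chunksize)):
        # Write the header with the first chunk, then append the rest without it.
        chunk.to_csv(path, mode="w" if i == 0 else "a", header=(i == 0), index=False)
        total += len(chunk)
    return total


con = sqlite3.connect(":memory:")  # stand-in for the Vertica engine
sample = pd.DataFrame({"POLICY": [f"V{n:06d}" for n in range(5)], "ADDON": list("abcde")})
sample.to_sql("additional_coverage", con, index=False)

path = os.path.join(tempfile.mkdtemp(), "additional_coverage.csv")
written = export_query_to_csv("SELECT * FROM additional_coverage", con, path, chunksize=2)
con.close()

exported = pd.read_csv(path)
print(written, len(exported))
```

A suitable `chunksize` depends on the row width and available memory, so the default of 1000 here is only a placeholder.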