<div style="text-align: center; line-height: 0; padding-top: 2px;">
  <img src="https://www.quantiaconsulting.com/logos/quantia_logo_orizz.png" alt="Quantia Consulting" style="width: 600px; height: 250px">
</div>

# ![Quantia Tiny Logo](https://www.quantiaconsulting.com/logos/quantia_logo_tiny.png) JDBC Ingestion

**Data Source**
* The data is available on a remote PostgreSQL db:
  * dbms: PostgreSQL
  * db name: `hdp`
  * table: `trucks`
  * username: `qcro`
  * password: `qc-readonly`

**Instructions**
* Create the JDBC connection string using **pyspark**
* Connect and read the data from the table
* Show the number of partitions
* [OPTIONAL] Try to increase the number of partitions and briefly discuss the performance gain (if any)
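
As a minimal sketch of the first instruction, a PostgreSQL JDBC URL has the shape `jdbc:postgresql://host:port/database`. The helper name `build_postgres_jdbc_url` is hypothetical, and the port `5432` is an assumption (it is the PostgreSQL default, which the driver also uses when the port is omitted).

```python
def build_postgres_jdbc_url(host, database, port=5432):
    """Return a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    # port=5432 is assumed here; it is the PostgreSQL default port.
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Host and database name as used in this notebook.
jdbc_url = build_postgres_jdbc_url("54.195.117.194", "hdp")
print(jdbc_url)
```

The user name and password are not part of the URL here; they are passed separately as connection properties.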

## ![Spark Logo Tiny](https://www.quantiaconsulting.com/logos/logo_spark_tiny.png) PySpark

In [None]:
%load_ext autotime

import os
import qcutils
from pyspark.sql import SparkSession

os.environ['PYSPARK_SUBMIT_ARGS'] = '--packages org.postgresql:postgresql:42.2.10,com.amazonaws:aws-java-sdk:1.7.4,org.apache.hadoop:hadoop-aws:2.7.5 pyspark-shell'

spark = (SparkSession.builder
    .master("local[*]")
    .appName("test")
    .getOrCreate()
)
qcutils.init_spark_session(spark)

In [None]:
tableName = "trucks"
jdbcURL = "jdbc:postgresql://54.195.117.194/hdp"

connProperties = {
    "driver": "org.postgresql.Driver",
    "user": "qcro",
    "password": "qc-readonly"
}

In [None]:
df = spark.read.jdbc(
    url=jdbcURL,
    table=tableName,
    properties=connProperties)

print("Partitions: " + str(df.rdd.getNumPartitions()) )

In [None]:
df

In [None]:
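from pyspark.sql import functions as F

# Avoid `import *`: it would shadow Python's built-in min() and max().
# Fetch both bounds of the partitioning column in a single query.
bounds = (df
  .select(F.min("jun13_miles").alias("lo"), F.max("jun13_miles").alias("hi"))
  .first()
)
minimumJun = bounds["lo"]
maximumJun = bounds["hi"]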


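# Partitioned read: Spark issues one query per partition, split on `column`.
# lowerBound/upperBound only set the partition stride; they do not filter rows.
df2 = spark.read.jdbc(
  url=jdbcURL,
  table=tableName,
  column="jun13_miles",
  lowerBound=minimumJun,
  upperBound=maximumJun,
  numPartitions=4,
  properties=connProperties)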

print("Partitions: " + str(df2.rdd.getNumPartitions()) )
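
To reason about the optional task, it helps to see roughly how the range between `lowerBound` and `upperBound` is cut into per-partition `WHERE` clauses. The sketch below is a simplified pure-Python approximation, not Spark's actual implementation: the function name `jdbc_partition_predicates` is hypothetical, it assumes non-negative integer bounds, and the bounds `0` and `1000` in the usage lines are made-up values rather than the real minimum and maximum of `jun13_miles`.

```python
def jdbc_partition_predicates(column, lower_bound, upper_bound, num_partitions):
    """Approximate the WHERE clauses of a range-partitioned JDBC read."""
    if num_partitions <= 1 or lower_bound >= upper_bound:
        return [None]  # a single partition reads the whole table, no predicate
    # Do not create more partitions than there are integer values in the range.
    num_partitions = min(num_partitions, upper_bound - lower_bound)
    stride = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        # The first partition has no lower limit, the last one no upper limit,
        # so rows outside [lower_bound, upper_bound] are still read.
        lower = f"{column} >= {current}" if i > 0 else None
        current += stride
        upper = f"{column} < {current}" if i < num_partitions - 1 else None
        if lower and upper:
            predicates.append(f"{lower} AND {upper}")
        elif upper:
            # NULL values of the partitioning column go to the first partition.
            predicates.append(f"{upper} OR {column} IS NULL")
        else:
            predicates.append(lower)
    return predicates


# Hypothetical bounds, for illustration only.
for predicate in jdbc_partition_predicates("jun13_miles", 0, 1000, 4):
    print(predicate)
```

Because the split is by value range and not by row count, a skewed `jun13_miles` distribution yields unevenly sized partitions. On a small table, the extra connections and queries may also outweigh any gain from reading in parallel.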

##### ![Quantia Tiny Logo](https://www.quantiaconsulting.com/logos/quantia_logo_tiny.png) 2020 Quantia Consulting, srl. All rights reserved.