## Example - Data Engineering

Let us understand how CRUD (create, read, update, delete) operations are performed for Data Engineering using Python. Don't worry if you do not understand the example completely. By the end of the course, you should have gained a fair amount of expertise in Data Engineering.

Here are the high-level steps to connect to a database and perform CRUD operations for Data Engineering applications:
* Make sure to install a database driver such as `psycopg2` using **pip**.
* Import `psycopg2` and connect to the database.
* Insert one record at a time or a list of records at a time. After inserting the records, make sure to commit.
* Get data from the table into a result set.
* Iterate through the result set and print each row.
* Once all the database operations are done, make sure to close the database connection.
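
The steps above can be sketched end to end. So that the sketch runs without a PostgreSQL server, it uses Python's built-in `sqlite3` module as a stand-in for `psycopg2` (an assumption made for illustration, not what this notebook uses). Both drivers follow the Python DB-API, so the calls have the same shape; the main visible difference is that `sqlite3` uses `?` placeholders where `psycopg2` uses `%s`. The function name `run_crud_demo` and the simplified `users` table are hypothetical.

```python
import sqlite3


def run_crud_demo():
    """Connect, insert, commit, update, delete, read, and close."""
    # Stand-in for psycopg2.connect(...): an in-memory SQLite database.
    connection = sqlite3.connect(":memory:")
    cursor = connection.cursor()
    try:
        # Simplified version of the users table (assumption).
        cursor.execute(
            """
            CREATE TABLE users (
                user_id INTEGER PRIMARY KEY,
                user_first_name TEXT NOT NULL,
                user_email_id TEXT NOT NULL,
                user_role TEXT NOT NULL DEFAULT 'U'
            )
            """
        )
        insert_query = (
            "INSERT INTO users (user_first_name, user_email_id) VALUES (?, ?)"
        )

        # Create: one record, then a list of records, then commit.
        cursor.execute(insert_query, ("Gordan", "gbradock0@barnesandnoble.com"))
        cursor.executemany(
            insert_query,
            [
                ("Tobe", "tlyness1@paginegialle.it"),
                ("Addie", "amesias2@twitpic.com"),
            ],
        )
        connection.commit()

        # Update and delete follow the same execute-then-commit pattern.
        cursor.execute(
            "UPDATE users SET user_role = ? WHERE user_first_name = ?",
            ("A", "Gordan"),
        )
        cursor.execute(
            "DELETE FROM users WHERE user_first_name = ?", ("Addie",)
        )
        connection.commit()

        # Read: fetch the remaining rows into a list of tuples.
        cursor.execute(
            "SELECT user_id, user_first_name, user_role "
            "FROM users ORDER BY user_id"
        )
        return cursor.fetchall()
    finally:
        # Close the cursor and connection even if a step above fails.
        cursor.close()
        connection.close()


if __name__ == "__main__":
    for row in run_crud_demo():
        print(row)
```

The rest of this section walks through the insert and read steps one cell at a time against PostgreSQL.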

In [27]:
%load_ext sql

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [71]:
#%env DATABASE_URL=postgresql://itversity_sms_user:itversity!23@localhost:5433/itversity_sms_db

%env DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


env: DATABASE_URL=postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db


In [29]:
%%sql
SELECT * FROM information_schema.tables 
WHERE table_catalog = 'itversity_sms_db' AND table_schema = 'public'
LIMIT 10

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
2 rows affected.


table_catalog,table_schema,table_name,table_type,self_referencing_column_name,reference_generation,user_defined_type_catalog,user_defined_type_schema,user_defined_type_name,is_insertable_into,is_typed,commit_action
itversity_sms_db,public,users,BASE TABLE,,,,,,YES,NO,
itversity_sms_db,public,courses,BASE TABLE,,,,,,YES,NO,


In [45]:
%sql DROP TABLE IF EXISTS users;

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [46]:
%%sql
CREATE TABLE users (
    user_id SERIAL PRIMARY KEY,
    user_first_name VARCHAR(30) NOT NULL,
    user_last_name VARCHAR(30) NOT NULL,
    user_email_id VARCHAR(50) NOT NULL,
    user_email_validated BOOLEAN DEFAULT FALSE,
    user_password VARCHAR(200),
    user_role VARCHAR(1) NOT NULL DEFAULT 'U', --U and A
    is_active BOOLEAN DEFAULT FALSE,
    created_dt TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_updated_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [47]:
%sql TRUNCATE TABLE users

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
Done.


[]

In [48]:
!pip install psycopg2



In [49]:
import psycopg2

In [50]:
psycopg2.connect?

Signature:
psycopg2.connect(
    dsn=None,
    connection_factory=None,
    cursor_factory=None,
    **kwargs,
)
Docstring:
Create a new database connection.

The connection parameters can be specified as a string:

    conn = psycopg2.connect("dbname=test user=postgres password=secret")

or using a set of keyword arguments:

    conn = psycopg2.connect(database="test", user="postgres", password="secret")

Or as a mix of both. The basic connection parameters are:

- *dbname*: the database name
- *database*: the database name (only as keyword argument)
- *user*: user name used to authenticate
- *password*: password used to authentica

In [51]:
user = ('Gordan', 'Bradock', 'gbradock0@barnesandnoble.com', 'A', '2020-01-10')

In [52]:
query = '''
INSERT INTO users 
    (user_first_name, user_last_name, user_email_id, user_role, created_dt) 
VALUES 
    (%s, %s, %s, %s, %s)
'''

In [53]:
connection = psycopg2.connect(
    host='localhost',
    port='5432',
    database='itversity_sms_db',
    user='itversity_sms_user',
    password='sms_password'
)
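
The connection details above repeat what is already in the `DATABASE_URL` environment variable set earlier. As a minimal sketch, assuming the URL always has the `postgresql://user:password@host:port/dbname` shape, the standard library can split it into keyword arguments so the credentials live in one place. The helper name `connection_kwargs_from_url` is hypothetical.

```python
import os
from urllib.parse import unquote, urlparse


def connection_kwargs_from_url(url):
    """Split a postgresql:// URL into keyword arguments for a connect call."""
    parts = urlparse(url)
    return {
        "host": parts.hostname,
        "port": parts.port,  # parsed as an int, or None if absent
        "database": parts.path.lstrip("/"),
        # Undo percent-encoding in case the credentials contain special characters.
        "user": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


if __name__ == "__main__":
    # Falls back to the URL used in this notebook if the variable is not set.
    url = os.environ.get(
        "DATABASE_URL",
        "postgresql://itversity_sms_user:sms_password@localhost:5432/itversity_sms_db",
    )
    kwargs = connection_kwargs_from_url(url)
    # Mask the password before printing.
    print({**kwargs, "password": "***"})
```

With `psycopg2` installed, the result could then be passed along as `psycopg2.connect(**connection_kwargs_from_url(url))`.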

In [54]:
cursor = connection.cursor()

In [55]:
cursor.execute(query, user) # Inserts one record

In [58]:
%sql SELECT * FROM users

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
0 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts


In [59]:
connection.commit() # Commits must be explicit

In [60]:
%sql SELECT * FROM users

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
1 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
1,Gordan,Bradock,gbradock0@barnesandnoble.com,False,,A,False,2020-01-10 00:00:00,2021-08-13 08:40:44.296377
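
The first `SELECT` above returned no rows, most likely because `%sql` queries the database over its own connection, and one connection cannot see rows that another connection has inserted but not yet committed. The sketch below reproduces that behavior with two connections. It uses the built-in `sqlite3` module and a temporary file as a stand-in for PostgreSQL (an assumption, so that it runs without a server); the function names are hypothetical.

```python
import os
import sqlite3
import tempfile


def count_users(connection):
    """Return the number of rows in users as seen by this connection."""
    return connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]


def demo_commit_visibility():
    """Return the row counts a second connection sees before and after commit."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        db_path = os.path.join(tmp_dir, "demo.db")
        # writer plays the role of the psycopg2 connection,
        # reader plays the role of the %sql connection.
        writer = sqlite3.connect(db_path)
        reader = sqlite3.connect(db_path)
        try:
            writer.execute("CREATE TABLE users (user_first_name TEXT)")
            writer.commit()

            # The insert opens a transaction that stays open until commit.
            writer.execute("INSERT INTO users VALUES (?)", ("Gordan",))
            before_commit = count_users(reader)

            writer.commit()
            after_commit = count_users(reader)
            return before_commit, after_commit
        finally:
            reader.close()
            writer.close()


if __name__ == "__main__":
    print(demo_commit_visibility())
```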


In [61]:
users = [
    ('Tobe', 'Lyness', 'tlyness1@paginegialle.it', '2020-02-10'),
    ('Addie', 'Mesias', 'amesias2@twitpic.com', '2020-03-05'),
    ('Corene', 'Kohrsen', 'ckohrsen3@buzzfeed.com', '2020-04-15'),
    ('Darill', 'Halsall', 'dhalsall4@intel.com', '2020-10-10') 
]

In [62]:
query = '''
INSERT INTO users 
    (user_first_name, user_last_name, user_email_id, created_dt) 
VALUES 
    (%s, %s, %s, %s)
'''

In [63]:
cursor.executemany(query, users)

In [64]:
connection.commit()
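
If one record in a batch like the one above is invalid, it is usually preferable to discard the whole batch rather than commit part of it. A minimal sketch of that idea follows, using the connection as a context manager, which commits when the block succeeds and rolls back when it raises. `psycopg2` connections support the same `with connection:` pattern; the sketch uses `sqlite3` as a stand-in (an assumption) so that it runs without a server, and with `psycopg2` the exception to catch would be `psycopg2.IntegrityError`. The function names are hypothetical.

```python
import sqlite3


def insert_users_atomically(connection, users):
    """Insert all users or none; return True if the batch was committed."""
    # sqlite3 uses ? placeholders; with psycopg2 these would be %s.
    query = "INSERT INTO users (user_first_name, user_email_id) VALUES (?, ?)"
    cursor = connection.cursor()
    try:
        # Commits if the block succeeds, rolls back if it raises.
        # It does not close the connection.
        with connection:
            cursor.executemany(query, users)
    except sqlite3.IntegrityError:
        return False
    finally:
        cursor.close()
    return True


def demo_atomic_batch():
    """Try a batch with an invalid row, then a valid batch."""
    # Stand-in database (assumption): in-memory SQLite instead of PostgreSQL.
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE users ("
        "user_first_name TEXT NOT NULL, user_email_id TEXT NOT NULL)"
    )
    bad_batch = [("Tobe", "tlyness1@paginegialle.it"), ("Addie", None)]
    good_batch = [
        ("Tobe", "tlyness1@paginegialle.it"),
        ("Addie", "amesias2@twitpic.com"),
    ]
    results = []
    for batch in (bad_batch, good_batch):
        committed = insert_users_atomically(connection, batch)
        row_count = connection.execute("SELECT COUNT(*) FROM users").fetchone()[0]
        results.append((committed, row_count))
    connection.close()
    return results


if __name__ == "__main__":
    print(demo_atomic_batch())
```

The first batch violates the `NOT NULL` constraint, so nothing from it is kept; the second batch is committed in full.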

In [65]:
%sql SELECT * FROM users

 * postgresql://itversity_sms_user:***@localhost:5432/itversity_sms_db
5 rows affected.


user_id,user_first_name,user_last_name,user_email_id,user_email_validated,user_password,user_role,is_active,created_dt,last_updated_ts
1,Gordan,Bradock,gbradock0@barnesandnoble.com,False,,A,False,2020-01-10 00:00:00,2021-08-13 08:40:44.296377
2,Tobe,Lyness,tlyness1@paginegialle.it,False,,U,False,2020-02-10 00:00:00,2021-08-13 08:41:33.182495
3,Addie,Mesias,amesias2@twitpic.com,False,,U,False,2020-03-05 00:00:00,2021-08-13 08:41:33.182495
4,Corene,Kohrsen,ckohrsen3@buzzfeed.com,False,,U,False,2020-04-15 00:00:00,2021-08-13 08:41:33.182495
5,Darill,Halsall,dhalsall4@intel.com,False,,U,False,2020-10-10 00:00:00,2021-08-13 08:41:33.182495


In [66]:
users_query = 'SELECT * FROM users'

In [67]:
cursor.execute(users_query) # Now we can iterate through cursor to read the data

In [68]:
for user in cursor:
    print(user)

(1, 'Gordan', 'Bradock', 'gbradock0@barnesandnoble.com', False, None, 'A', False, datetime.datetime(2020, 1, 10, 0, 0), datetime.datetime(2021, 8, 13, 8, 40, 44, 296377))
(2, 'Tobe', 'Lyness', 'tlyness1@paginegialle.it', False, None, 'U', False, datetime.datetime(2020, 2, 10, 0, 0), datetime.datetime(2021, 8, 13, 8, 41, 33, 182495))
(3, 'Addie', 'Mesias', 'amesias2@twitpic.com', False, None, 'U', False, datetime.datetime(2020, 3, 5, 0, 0), datetime.datetime(2021, 8, 13, 8, 41, 33, 182495))
(4, 'Corene', 'Kohrsen', 'ckohrsen3@buzzfeed.com', False, None, 'U', False, datetime.datetime(2020, 4, 15, 0, 0), datetime.datetime(2021, 8, 13, 8, 41, 33, 182495))
(5, 'Darill', 'Halsall', 'dhalsall4@intel.com', False, None, 'U', False, datetime.datetime(2020, 10, 10, 0, 0), datetime.datetime(2021, 8, 13, 8, 41, 33, 182495))
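
Each row above comes back as a plain tuple, so fields have to be accessed by position. A minimal sketch of one way to get rows keyed by column name is shown below, using the `description` attribute that DB-API cursors (including `psycopg2` cursors) expose after a query. The helper `rows_as_dicts` is hypothetical, and the demo uses `sqlite3` as a stand-in (an assumption) so that it runs without a server.

```python
import sqlite3


def rows_as_dicts(cursor):
    """Yield each remaining row of an executed cursor as a dict keyed by column name."""
    # Each entry in cursor.description describes one result column;
    # its first item is the column name.
    column_names = [column[0] for column in cursor.description]
    for row in cursor:
        yield dict(zip(column_names, row))


def demo_rows_as_dicts():
    """Run a small query and return its rows as dicts."""
    # Stand-in database (assumption): in-memory SQLite instead of PostgreSQL.
    connection = sqlite3.connect(":memory:")
    cursor = connection.cursor()
    cursor.execute("SELECT 1 AS user_id, 'Gordan' AS user_first_name")
    users = list(rows_as_dicts(cursor))
    cursor.close()
    connection.close()
    return users


if __name__ == "__main__":
    for user in demo_rows_as_dicts():
        print(user["user_id"], user["user_first_name"])
```

`psycopg2` also ships `psycopg2.extras.RealDictCursor`, which can be passed as `cursor_factory` to get dict-like rows directly.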


In [69]:
cursor.close()

In [70]:
connection.close()
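
Closing the cursor and connection by hand works as long as no earlier cell fails. In a script, one option is to let `contextlib.closing` do the cleanup so that both are closed even when a query raises. The sketch below is one way to arrange that; the helper `fetch_one_value` is hypothetical, and the demo passes a `sqlite3` connection as a stand-in (an assumption) so that it runs without a server.

```python
import sqlite3
from contextlib import closing


def fetch_one_value(connection_factory, query):
    """Return the first column of the first row, or None if there are no rows.

    The cursor and connection are closed afterwards, even on error.
    """
    with closing(connection_factory()) as connection:
        with closing(connection.cursor()) as cursor:
            cursor.execute(query)
            row = cursor.fetchone()
            return row[0] if row else None


if __name__ == "__main__":
    # With psycopg2, the factory could be: lambda: psycopg2.connect(...)
    print(fetch_one_value(lambda: sqlite3.connect(":memory:"), "SELECT 42"))
```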