# HOW TO: Insert Data From Python Into PSQL

`--------------------------`
`With Mr Fugu Data Science` `-------------------------`


[Follow me on Github](https://github.com/MrFuguDataScience) | [Check me out on YouTube](https://www.youtube.com/channel/UCbni-TDI-Ub8VlGaP8HLTNw?view_as=subscriber)


**Create Files to interact with Psycopg & Hide Database User credentials**:

+ If you are working in a real world scenario, you will most likely need to hide your credentials (permissions, login & password); that is where these files shine.

[create init and config files](https://towardsdatascience.com/python-and-postgresql-how-to-access-a-postgresql-database-like-a-data-scientist-b5a9c5a0ea43) | [Python_configparser doc](https://docs.python.org/3/library/configparser.html) | [Postgres_tutorial_Python_PSQL](https://www.postgresqltutorial.com/postgresql-python/connect/)

1<sup>st</sup>) : create the initialization file:

+ use your terminal or a plain-text editor such as Notepad, and save the file with the `.ini` extension.
    + on `Mac` I did `touch database.ini`  (the name must match the `filename` passed to `config()` below, which defaults to `database.ini`)
        + `vi database.ini`     *then inside the file type the following lines:*
            + `[postgresql]`
            + `host=localhost`
            + `database=what_database_you_want_to_access`
            + `user=some_user_you_created_in_psql`
            + `password=some_password_you_have_for_this_db`
            
`------------------------------------------------`

2<sup>nd</sup>) : now the `config.py` file (saved as `config_user_dta.py` in this notebook). It reads the `init file` and returns its settings as a `dictionary`.

+ the returned dictionary will look like this, as an **example**:
`{'host': 'localhost', 'database': 'suppliers', 'user': 'postgres', 'password': 'postgres'}`

`HERE IS THE CODE`:

`____________________________________________________`

#!/usr/bin/env python3
# (the interpreter of a virtualenv or conda env works as well)

from configparser import ConfigParser
 
def config(filename='database.ini', section='postgresql'):
    # create a parser
    parser = ConfigParser()
    # read config file (parser.read silently skips a file it cannot find)
    parser.read(filename)
 
    # collect the key/value pairs of the section, default to postgresql
    db = {}
    
    # Check that the section (postgresql) exists in the file
    if parser.has_section(section):
        params = parser.items(section)
        for param in params:
            db[param[0]] = param[1]
         
    # Raise an error if the requested section is not listed in the initialization file
    else:
        raise Exception('Section {0} not found in the {1} file'.format(section, filename))
 
    return db
    
    
**This code was adapted from online material on Postgres website and used in 1<sup>st</sup> & 3<sup>rd</sup> link above**
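To see what the parsing step produces, here is a minimal, self-contained sketch that writes a throwaway `.ini` file and reads it back with the same `ConfigParser` calls. The helper name `read_section`, the temporary file, and the placeholder credentials are assumptions made for illustration; they are not part of the original files.

```python
import os
import tempfile
from configparser import ConfigParser


def read_section(filename, section="postgresql"):
    """Return one section of an .ini file as a plain dict (hypothetical helper)."""
    parser = ConfigParser()
    parser.read(filename)
    if not parser.has_section(section):
        raise KeyError(f"Section {section} not found in {filename}")
    return dict(parser.items(section))


# Placeholder credentials, written to a temporary file for this demo only.
ini_text = """[postgresql]
host=localhost
database=suppliers
user= postgres
password= postgres
"""

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "database.ini")
    with open(path, "w") as fh:
        fh.write(ini_text)
    params = read_section(path)

print(params)
```

Whitespace around `=` is stripped by `ConfigParser`, so `user= postgres` and `user=postgres` parse to the same value.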

`_____________________________________________________________________________________`
# Install Psycopg2:

+ `python -m pip install psycopg2`

+ if this doesn't work, try changing `pip` to `pip3` depending on what version of Python you are using

+ if that doesn't work either try to do: `conda install -c anaconda psycopg2`


**I ran into problems during my installation and needed two steps**: `pip3 install psycopg2` followed by `conda install -c anaconda psycopg2`. The first install reported success but the package did not work; I suspect some of the required dependencies were incomplete. I was also getting a `Python 2.7 hashing error`. `I suggest that you first check which version of Python you are running and its location on your PATH. If I remember correctly, my default Python was the one that anaconda provides.`

+ I thought this came from using the wrong default version of Python, but changing my default did not fix it on its own. Try the steps above and hopefully they work for you.

+ Separate source for installing, depending on your situation: [install on Mac](https://www.youtube.com/watch?v=N4RxnQH2pVY)
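The advice above about checking the Python version and PATH can be done from inside Python itself. This is a small sketch, not a fix for any specific install error: the function name `describe_environment` is hypothetical, and it only reports which interpreter is running and whether `psycopg2` can be found by that interpreter.

```python
import importlib.util
import sys


def describe_environment(module_name="psycopg2"):
    """Report the running interpreter and whether a module is importable from it."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
        "module_found": spec is not None,
        "module_location": getattr(spec, "origin", None),
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

If `module_found` is `False` even though `pip` reported a successful install, `pip` and the notebook are probably using different interpreters; compare `executable` with the path that `python -m pip --version` prints.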

`________________________________`


# Install: Memory_Profiler

+ `pip3 install memory-profiler`  (use `pip`, `pip3` or `pip3.5`, depending on which version of Python you have)  


+ **Anaconda version**: 

`conda config --add channels conda-forge`

`conda install memory_profiler`

[memory_profiler doc](https://pypi.org/project/memory-profiler/)

# `Create New User If Needed To Work Within PSQL`:


`psql postgres`   (if you installed with `Homebrew`)

`CREATE ROLE` **Somename_youlike**` WITH LOGIN PASSWORD` **'Some_password_you_want'**`;`

`ALTER ROLE` **Your_new_user_name** `CREATEDB;`   (*Giving permissions*)

*Now you can check if it worked with `\du`; here is my printout*:

| Role name        | Attributes                                                 | Member of |
|------------------|------------------------------------------------------------|-----------|
| mrfugu           | Create DB                                                  | {}        |
| my_computer_name | Superuser, Create role, Create DB, Replication, Bypass RLS | {}        |
`________________________________________________`

# Create a new DataBase:

`createdb -O UserName DB_NameYouMade`  (run from the shell, outside of `psql`; `-O` sets the owner)

`psql postgres` 

and run the command `\l` to see which databases are available and who owns them


In [2]:
import psycopg2             # python->psql connection
import psycopg2.extras

import pandas as pd         # create dataframes 
import os                   # fetch files

import time                 # timing operations
import memory_profiler      # managing memory usage
from memory_profiler import memory_usage

from functools import wraps # decorator/wrapper

from typing import Iterator, Optional, Dict, Any,List  # Create Iterator for One-By-One Loading 

import io

# Import the 'config' function from the config_user_dta.py file:
from config_user_dta import config

In [None]:
# Verify Psycopg2 Install
!pip3 show psycopg2

+ `Connection`: this class is responsible for handling *Transactions*
    
    * Two `methods` for *terminating a Transaction*: `commit()` and `rollback()`
        
        * `commit()`: makes the changes of the current Transaction `permanent` in the database
        
        * `rollback()`: `discards` the changes of the current Transaction. This is convenient if there is a failure somewhere: you can catch the exception and undo everything instead of keeping partial or corrupted data.  

+ `Cursor`: when you first issue a statement to `PSQL` through a `Cursor` *object*, `psycopg2` opens a Transaction. 
    
    * From this moment all following statements run in the same Transaction, until you call `commit()` or `rollback()`, or *there is a failure.* 
        
       
        
        
[PostgreSQL Tutorial: transactions in Python](https://www.postgresqltutorial.com/postgresql-python/transaction/)
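The commit/rollback behaviour described above can be tried without a running server. The sketch below uses the standard-library `sqlite3` module as a stand-in, on the assumption that it is close enough for illustration because both it and `psycopg2` follow the Python DB-API (`cursor()`, `commit()`, `rollback()`). The table `people`, the function `insert_names`, and the simulated failure are hypothetical; note that `sqlite3` uses `?` placeholders where `psycopg2` uses `%s`.

```python
import sqlite3


def insert_names(connection, names, fail=False):
    """Insert all names in one transaction; undo everything if something fails."""
    cursor = connection.cursor()
    try:
        cursor.executemany("INSERT INTO people (name) VALUES (?)",
                           [(name,) for name in names])
        if fail:
            raise ValueError("simulated failure in the middle of the transaction")
        connection.commit()      # make the inserts permanent
        return True
    except ValueError:
        connection.rollback()    # discard the partial work
        return False
    finally:
        cursor.close()


demo_conn = sqlite3.connect(":memory:")
demo_conn.execute("CREATE TABLE people (name TEXT)")
demo_conn.commit()

committed = insert_names(demo_conn, ["Ann", "Bob"])
rolled_back = insert_names(demo_conn, ["Cid"], fail=True)
row_count = demo_conn.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(committed, rolled_back, row_count)
demo_conn.close()
```

Only the two rows from the successful call remain; the row inserted before the simulated failure is gone after `rollback()`.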

In [27]:
# Establish a connection to the database, then create a cursor object from it

# Get the config params
params_ = config()

# Connect to the Postgres_DB:
conn = psycopg2.connect(**params_)

# Create new_cursor allowing us to write Python to execute PSQL:
cur = conn.cursor()

conn.autocommit = True  # read documentation understanding when to Use & NOT use (TRUE)

In [3]:
def connect():
    """ Connect to the PostgreSQL database server """
    conn = None
    try:
        # read connection parameters
        params = config()
 
        # connect to the PostgreSQL server
        print('Connecting to the PostgreSQL database...')
        conn = psycopg2.connect(**params)
      
        # create a cursor
        cur = conn.cursor()
        
        # execute a statement
        print('PostgreSQL database version:')
        cur.execute('SELECT version()')
 
        # display the PostgreSQL database server version
        db_version = cur.fetchone()
        print(db_version)
       
        # close the communication with the PostgreSQL
        cur.close()
    except (Exception, psycopg2.DatabaseError) as error:
        print(error)
    finally:
        if conn is not None:
            conn.close()
            print('Database connection closed.')
 
 
if __name__ == '__main__':
    connect()

Connecting to the PostgreSQL database...
PostgreSQL database version:
('PostgreSQL 12.2 on x86_64-apple-darwin17.7.0, compiled by Apple LLVM version 10.0.0 (clang-1000.11.45.5), 64-bit',)
Database connection closed.


In [3]:
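# locate a file anywhere under the current working directory:

print(os.getcwd())
def os_dir_search(file):
    t = None
    for p,n,f in os.walk(os.getcwd()):
        
        for a in f:
            if a.endswith(file): # can be an ending such as (.csv) or a full file name like I did
                print(a)
                print(p)
                # read the file that actually matched (a), not the search pattern
                t=pd.read_csv(os.path.join(p,a),names=['row_id','credit_card',
                                                'email','first_name','last_name','primary_phone'],header=0)

    # fail with a clear message instead of an UnboundLocalError when nothing matches
    if t is None:
        raise FileNotFoundError(f'No file ending with {file!r} found under {os.getcwd()}')
    return t

# the match is done on the file name ending, so pass an extension (.csv, .png, etc) or a full name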


addr_df=os_dir_search('fake_users_R.csv')

/Users/zatoichi59/Desktop/Projects
fake_users_R.csv
/Users/zatoichi59/Desktop/Projects


In [5]:
addr_df_=addr_df.iloc[:,1:]
addr_df_.head()

Unnamed: 0,credit_card,email,first_name,last_name,primary_phone
0,5399-3484-4724-7187,gso@qiegan.sqe,Donyell Ann,Ospina,5219459148
1,1630-5261-6108-7631,xnji@gfruaxqnvm.fha,Bishop,Siyed,4164254716
2,4435-3866-1076-3595,dvyco@tkzhsop.zxg,Connor,Powers,3627413915
3,3489-7099-9906-8660,fy@uvfhplatmz.cam,Kylie,Her,3562764561
4,8631-4500-5666-1510,rztkvliou@dkeinhgysf.deo,Anthony,Vo,7345795348


In [6]:
# CREATE TABLE FOR PSQL: staging_fake_ppl

def create_staging_table(cursor) -> None:
    cursor.execute("""
        DROP TABLE IF EXISTS staging_fake_ppl;
        CREATE UNLOGGED TABLE staging_fake_ppl (
            credit_card         TEXT,
            email               TEXT,
            first_name          TEXT,
            last_name           TEXT,
            primary_phone       TEXT
        );""")

# look at the documentation of PSQL (UNLOGGED TABLE vs TEMP)

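# `Cursor` and `Connection` are Context Managers :
+ allowing you to use the `with` statement: `with conn:` makes `psycopg2` commit the Transaction when the block ends without an error and roll it back otherwise (the connection itself stays open), while `with conn.cursor() as cursor:` closes the cursor at the end of the block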

[further reading](https://www.postgresqltutorial.com/postgresql-python/transaction/)

In [7]:
with conn.cursor() as cursor:
    create_staging_table(cursor)

In [8]:
start = time.perf_counter()  # using the highest resolution timer
time.sleep(1) # do work
elapsed = time.perf_counter() - start

In [9]:
# Decorator Function: 
def profile(fn):
    """ will display:  [function] we call
                       [Time] it takes to run operations
                       [Memory] Used
    """
    @wraps(fn)
    def inner(*args, **kwargs):
        fn_kwargs_str = ', '.join(f'{k}={v}' for k, v in kwargs.items())
        print(f'\n{fn.__name__}({fn_kwargs_str})')

        # Measure time
        t = time.perf_counter()
        retval = fn(*args, **kwargs)
        elapsed = time.perf_counter() - t
        print(f'Time   {elapsed:0.4}')

        # Measure memory
        # NOTE: memory_usage calls fn a second time, so any side effects happen twice
        mem, retval = memory_usage((fn, args, kwargs), retval=True, timeout=200, interval=1e-7)

        print(f'Memory {max(mem) - min(mem)}')
        return retval

    return inner
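The decorator above runs the wrapped function twice, once for the timer and once for `memory_usage`. When the function has side effects that should happen only once, a single-run variant may be preferable. The sketch below is one possible alternative, not the approach used in this notebook: it swaps `memory_profiler` for the standard-library `tracemalloc`, which tracks memory allocated by Python objects rather than total process memory, so the numbers are not directly comparable. The names `profile_once` and `build_squares` are hypothetical.

```python
import time
import tracemalloc
from functools import wraps


def profile_once(fn):
    """Print run time and peak Python-level allocations, calling fn only once."""
    @wraps(fn)
    def inner(*args, **kwargs):
        tracemalloc.start()
        start = time.perf_counter()
        try:
            retval = fn(*args, **kwargs)
        finally:
            # Collect the measurements even if fn raised, then stop tracing.
            elapsed = time.perf_counter() - start
            _, peak = tracemalloc.get_traced_memory()
            tracemalloc.stop()
        print(f"{fn.__name__}: time {elapsed:0.4f}s, peak {peak / 1024:0.1f} KiB")
        return retval

    return inner


calls = []  # records every invocation so the single run can be checked


@profile_once
def build_squares(n):
    calls.append(n)
    return [i * i for i in range(n)]


squares = build_squares(1000)
```

Because the function body runs once, a `TRUNCATE` or `INSERT` inside it is also executed once per call.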

In [10]:
s_=addr_df_.to_dict('records')
# s_[:2]

# Send .CSV Python --> PSQL 

In [None]:
# addr_df_.to_csv('address_Python_convertR.csv',index=False)

# sql = "COPY %s FROM STDIN WITH CSV HEADER DELIMITER AS ','"
# file = open('address_Python_convertR.csv', "r")
# table = 'staging_fake_ppl'
# with conn.cursor() as cur:
#     cur.execute("truncate " + table + ";")  #avoiding uploading duplicate data!
#     cur.copy_expert(sql=sql % table, file=file)
#     conn.commit()
#     cur.close()
#     conn.close()

In [28]:
addr_df_.to_csv('address_Python_convertR.csv',index=False)

@profile
def send_csv_to_psql(connection,csv,table_):
    sql = "COPY %s FROM STDIN WITH CSV HEADER DELIMITER AS ','"
    table = table_
    # open the file in a with-block so it is closed even if COPY fails
    with open(csv, "r") as file, connection.cursor() as cur:
        cur.execute("truncate " + table + ";")  #avoiding uploading duplicate data!
        cur.copy_expert(sql=sql % table, file=file)
    # commit on the connection that was passed in, not on the global `conn`
    connection.commit()

send_csv_to_psql(conn,'address_Python_convertR.csv','staging_fake_ppl')


send_csv_to_psql()
Time   0.0584
Memory 0.0
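The `COPY ... FROM STDIN` call only needs a file-like object, so the intermediate `.csv` on disk is optional. Below is a hedged sketch of building that object in memory with `io.StringIO` and the standard `csv` module. The helper names `rows_to_csv_buffer` and `copy_buffer` are hypothetical, the sample row is the first record shown earlier, and `copy_buffer` assumes an open `psycopg2` cursor, so it is defined here but not called.

```python
import csv
import io


def rows_to_csv_buffer(rows, columns):
    """Write a list of dicts into an in-memory CSV file, header included."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    buf.seek(0)  # rewind so the consumer reads from the first byte
    return buf


def copy_buffer(cursor, table, buf):
    """Stream the buffer into `table`; `cursor` is assumed to be a psycopg2 cursor.

    The table name is interpolated directly, so pass trusted names only.
    """
    cursor.copy_expert(f"COPY {table} FROM STDIN WITH CSV HEADER", buf)


columns = ["credit_card", "email", "first_name", "last_name", "primary_phone"]
rows = [{
    "credit_card": "5399-3484-4724-7187",
    "email": "gso@qiegan.sqe",
    "first_name": "Donyell Ann",
    "last_name": "Ospina",
    "primary_phone": "5219459148",
}]

buf = rows_to_csv_buffer(rows, columns)
print(buf.getvalue())
```

With a live connection the call would be `copy_buffer(cur, 'staging_fake_ppl', buf)` followed by a commit.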


In [25]:
# def connect(params_dic):
#     """ Connect to the PostgreSQL database server """
#     conn = None
#     try:
#         # connect to the PostgreSQL server
#         print('Connecting to the PostgreSQL database...')
#         conn = psycopg2.connect(**params_dic)
#     except (Exception, psycopg2.DatabaseError) as error:
#         print(error)
#         sys.exit(1) 
#     return conn

# connect()

TypeError: connect() missing 1 required positional argument: 'params_dic'

# Simple Query with `Psycopg2`:

+ when doing a `SELECT` query use: `fetchone(), fetchall() or fetchmany()` methods



In [29]:
sql_="SELECT COUNT(*) FROM staging_fake_ppl"
cur.execute(sql_)
cur.fetchone()


(5826,)
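The three fetch methods differ only in how many rows they pull from the cursor. The sketch below illustrates this with the standard-library `sqlite3` module as a stand-in, assuming the behaviour carries over because `fetchone()`, `fetchmany()` and `fetchall()` are part of the Python DB-API that `psycopg2` cursors also implement. The table `numbers` and the variable names are made up for the demo.

```python
import sqlite3

demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute("CREATE TABLE numbers (n INTEGER)")
demo_cur.executemany("INSERT INTO numbers (n) VALUES (?)",
                     [(i,) for i in range(1, 6)])

demo_cur.execute("SELECT n FROM numbers ORDER BY n")
first = demo_cur.fetchone()        # one row as a tuple
next_two = demo_cur.fetchmany(2)   # a list with up to 2 rows
rest = demo_cur.fetchall()         # every remaining row
exhausted = demo_cur.fetchone()    # None once the result set is used up

print(first, next_two, rest, exhausted)
demo_conn.close()
```

Each call continues where the previous one stopped, which is why `fetchone()` on a `COUNT(*)` query returns a single one-element tuple such as `(5826,)`.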

# `Alternate` Way to Query:

+ Quick and dirty way to `Query` PSQL and bring data into Python as a DF

In [30]:
import pandas.io.sql as sqlio
# conn = psycopg2.connect("host='{}' port={} dbname='{}' user={} password={}".format(host, port, dbname, username, pwd))
sql = "select count(*) from staging_fake_ppl;"
dat = sqlio.read_sql_query(sql, conn)
conn.close()  # release the connection; a new one is opened in the next section


In [31]:
dat


Unnamed: 0,count
0,5826


# Convert `Df --> List(Dict())` then sending from `Python-->PSQL`:

In [95]:
def create_staging_table_(cursor) -> None:
    cursor.execute("""
        DROP TABLE IF EXISTS staging_fake_ppl02;
        CREATE UNLOGGED TABLE staging_fake_ppl02 (
            credit_card         TEXT,
            email               TEXT,
            first_name          TEXT,
            last_name           TEXT,
            primary_phone       TEXT
        );""")


In [96]:
conn = psycopg2.connect(**params_)

# Create new_cursor allowing us to write Python to execute PSQL:
cur = conn.cursor()

conn.autocommit = True  # read documentation understanding when to Use & NOT use (TRUE)


with conn.cursor() as cursor:
    create_staging_table_(cursor)
# cur.close()
# conn.close()

In [97]:
# import sql  # the patched version (file is named sql.py)
# sql.write_frame(addr_df_, 'staging_fake_ppl02', conn, flavor='postgresql')

@profile
def fcn(df,table,cur):
    # df is the dataframe, cur is an open cursor on the target connection
    if len(df) > 0:
        df_columns = list(df)
        # create (col1,col2,...)
        columns = ",".join(df_columns)

        # create VALUES(%s,%s,...) one %s per column
        values = "VALUES({})".format(",".join(["%s" for _ in df_columns])) 

        # create INSERT INTO table (columns) VALUES(%s,...)
        insert_stmt = "INSERT INTO {} ({}) {}".format(table,columns,values)
        cur.execute("truncate " + table + ";")  #avoiding uploading duplicate data!
        # reuse the cursor that was passed in; tolist() hands psycopg2 plain Python values
        psycopg2.extras.execute_batch(cur, insert_stmt, df.values.tolist())
    # commit through the cursor's own connection instead of the global `conn`
    cur.connection.commit()
   

In [98]:
fcn(addr_df_,'staging_fake_ppl02',cur)


# Close the cursor and connection so the server can allocate
# bandwidth to other requests
cur.close()
conn.close()


fcn()
Time   0.2678
Memory 0.0
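The section title mentions a list of dicts, and the `to_dict('records')` output created earlier can be sent as-is when the statement uses named placeholders. The sketch below only builds that statement; `build_named_insert` is a hypothetical helper, the sample record is the first row shown earlier, and the final `execute_batch` call is left as a comment because it assumes an open `psycopg2` cursor.

```python
def build_named_insert(table, columns):
    """Build an INSERT with %(name)s placeholders, one per column.

    Table and column names are interpolated directly, so pass trusted names only.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join(f"%({column})s" for column in columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})"


# Same shape as one element of addr_df_.to_dict('records').
records = [{
    "credit_card": "5399-3484-4724-7187",
    "email": "gso@qiegan.sqe",
    "first_name": "Donyell Ann",
    "last_name": "Ospina",
    "primary_phone": "5219459148",
}]

stmt = build_named_insert("staging_fake_ppl02", list(records[0]))
print(stmt)

# With an open psycopg2 cursor `cur` (not created here) the call would be:
# psycopg2.extras.execute_batch(cur, stmt, records)
```

Named placeholders match values by key, so the order of keys inside each dict does not matter.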


# UPDATE TABLE:


In [4]:
# ALTER TABLE vendors ADD COLUMN ID SERIAL PRIMARY KEY;

# https://www.tutorialspoint.com/python_data_access/python_postgresql_update_table.htm


# https://www.postgresqltutorial.com/postgresql-python/transaction/  (inserts and updates)

# INSERTS:

In [None]:
# https://www.postgresqltutorial.com/postgresql-python/insert/

# Citations:

https://www.datacamp.com/community/tutorials/tutorial-postgresql-python

https://stackoverflow.com/questions/49264194/import-py-file-in-another-directory-in-jupyter-notebook

https://www.postgresqltutorial.com/postgresql-python/connect/

https://stackoverflow.com/questions/39767810/cant-install-psycopg2-package-through-pip-install-is-this-because-of-sierra

https://opensource.com/article/19/5/python-3-default-mac

https://dev.to/irfnhm/how-to-set-python3-as-a-default-python-version-on-mac-4jjf

https://www.google.com/search?q=ImportError%3A+cannot+import+name+md5&oq=ImportError%3A+cannot+import+name+md5&aqs=chrome..69i57j69i58.436j0j7&sourceid=chrome&ie=UTF-8

https://stackoverflow.com/questions/34617452/how-to-update-xcode-from-command-line

https://www.postgresql.org/docs/12/auth-password.html

https://hakibenita.com/fast-load-data-python-postgresql

https://pypi.org/project/memory-profiler/

https://docs.python.org/3/library/functools.html

https://medium.com/@viviennediegoencarnacion/getting-started-with-postgresql-on-mac-e6a5f48ee399

https://lerner.co.il/2019/05/05/making-your-python-decorators-even-better-with-functool-wraps/ (decorators/wrappers)

https://docs.python.org/3/library/configparser.html

https://pynative.com/python-postgresql-tutorial/

https://hackersandslackers.com/psycopg2-postgres-python/  (create Classes)

https://stackoverflow.com/questions/23103962/how-to-write-dataframe-to-postgres-table  (df_to_psql_table)

https://alvinalexander.com/blog/post/postgresql/log-in-postgresql-database/  (cmd line shortcuts psql)

https://stackoverflow.com/questions/35651586/psycopg2-cursor-already-closed