# Lab 2 Python for Database


## Installing Python

-   Download Python Individually
    [Download Python | Python.org](https://www.python.org/downloads/)

-   Using Anaconda for Python Programming
    [Anaconda | The World's Most Popular Data Science Platform](https://www.anaconda.com/)

-   Python Anaconda Tutorial
    [Python Anaconda Tutorial | Getting Started With Anaconda | Edureka](https://www.edureka.co/blog/python-anaconda-tutorial/)


## Installing psycopg2 and pandas

> If Anaconda is installed on your device, you don't need to install the pandas module separately.

-   Installation Instructions
    -   [psycopg2 · PyPI](https://pypi.org/project/psycopg2/)


In [None]:
%pip install psycopg2

-   [pandas - Python Data Analysis Library (pydata.org)](https://pandas.pydata.org/getting_started.html)


In [None]:
# conda
%conda install pandas

In [None]:
# or PyPI
%pip install pandas

## Connecting to the PostgreSQL database using psycopg2


In [1]:
import psycopg2

conn = psycopg2.connect(host="localhost",
                        port='5434',
                        database="ThirdDB",
                        user='postgres',
                        password='123456')


To connect to a PostgreSQL database (here, `ThirdDB`), use the `connect()` function of the `psycopg2` module.

For a PostgreSQL database, the following connection parameters may be needed:

-   `host`: the database server address, e.g., localhost or an IP address.
-   `port`: the port number, which defaults to 5432 if it is not provided. If several versions of PostgreSQL are installed on your device, each server listens on its own port, so specify the port of the server you want to connect to.
-   `database`: the name of the database that you want to connect to.
-   `user`: the username used to authenticate.
-   `password`: password used to authenticate.
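
The parameters above can also be collected in one place instead of being hard-coded in every cell. The following is a minimal sketch, not part of the lab itself: `build_conn_params` and `connect_from_env` are hypothetical helper names, and the `PGHOST`, `PGPORT`, `PGDATABASE`, `PGUSER`, and `PGPASSWORD` environment variables are assumed to be set by you. The fallback values mirror the ones used in this lab, except that the port falls back to the PostgreSQL default of 5432.

```python
import os


def build_conn_params(env=os.environ):
    """Collect keyword arguments for psycopg2.connect() from a mapping.

    `env` is any dict-like object; by default the process environment.
    """
    return {
        "host": env.get("PGHOST", "localhost"),
        "port": int(env.get("PGPORT", "5432")),   # 5432 is the PostgreSQL default
        "database": env.get("PGDATABASE", "ThirdDB"),
        "user": env.get("PGUSER", "postgres"),
        "password": env.get("PGPASSWORD", ""),
    }


def connect_from_env(env=os.environ):
    """Open a connection using the parameters collected above."""
    import psycopg2  # imported here so build_conn_params works without the driver
    return psycopg2.connect(**build_conn_params(env))
```

With this sketch, `connect_from_env({"PGPORT": "5434", "PGPASSWORD": "123456"})` would request the same server as the cell above, assuming it is running locally.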


## Querying data from the PostgreSQL tables and dumping the results to a pandas DataFrame


In [2]:
cur = conn.cursor()

# cur.execute("select * from student where name like %s", ('%m%',))
cur.execute("select * from student where name like '%ray%'")

print("The number of matching students: ", cur.rowcount)

row = cur.fetchone()
print(row)

# cur.close()
# conn.close()

The number of matching students:  4
('18675', 'Araya', 'Statistics', Decimal('82'))


To query data from one or more PostgreSQL tables, first create a new cursor by calling the `cursor()` method of the `connection` object.

The `cursor` object is used to execute SQL statements. Write the SQL statement just as you would when querying data in pgAdmin, then pass it to the `execute()` method of the `cursor` object.

After that, process the result set returned by the query using the `fetchone()`, `fetchall()`, or `fetchmany()` method.
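
The commented-out line in the query cell above hints at a second form of `execute()`, in which values are passed separately from the SQL text. A minimal sketch of that form follows; `find_students` is a hypothetical helper name, and `conn` is assumed to be an open psycopg2 connection to a database containing the `student` table.

```python
def find_students(conn, fragment):
    """Return all student rows whose name contains `fragment`.

    `conn` is assumed to be an open psycopg2 connection.
    """
    pattern = f"%{fragment}%"       # the wildcards belong to the value, not the SQL
    with conn.cursor() as cur:      # the cursor is closed when the block exits
        # %s is psycopg2's placeholder; the driver quotes the value itself
        cur.execute("select * from student where name like %s", (pattern,))
        return cur.fetchall()
```

Calling `find_students(conn, "ray")` should run the same search as the cell above. Passing the value as a parameter avoids building SQL by string concatenation, which matters once the search text comes from user input.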

In [3]:
rows = cur.fetchmany(size=3)
print(len(rows))
print(rows)

3
[('57083', 'Gray', 'Pol. Sci.', Decimal('107')), ('33759', 'Mowbray', 'Psychology', Decimal('44')), ('3651', 'Narayanan', 'Elec. Eng.', Decimal('14'))]


In [4]:
cur.scroll(value=0, mode='absolute')

rows_all = cur.fetchall()
print(len(rows_all))
print(rows_all)

4
[('18675', 'Araya', 'Statistics', Decimal('82')), ('57083', 'Gray', 'Pol. Sci.', Decimal('107')), ('33759', 'Mowbray', 'Psychology', Decimal('44')), ('3651', 'Narayanan', 'Elec. Eng.', Decimal('14'))]


- `fetchone()` fetches the next row in the result set. It returns a single tuple, or `None` when no more rows are available.
- `fetchmany(size=cursor.arraysize)` fetches the next set of rows, as specified by the `size` parameter. If you omit this parameter, `cursor.arraysize` determines the number of rows to fetch; it defaults to 1, meaning a single row is fetched at a time. `fetchmany()` returns a list of tuples, or an empty list if no more rows are available.
- `fetchall()` fetches all remaining rows in the result set and returns a list of tuples. If there are no rows to fetch, it returns an empty list.
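
The three fetch methods all consume rows from the same position in the result set, which is easy to see when they are called one after another. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL so that it runs without a server; this is a substitution, not the lab's setup. The fetch methods come from the Python DB-API that psycopg2 also implements, but note that `sqlite3` uses `?` placeholders where psycopg2 uses `%s`. The table contents are copied from the outputs above.

```python
import sqlite3

# In-memory SQLite database standing in for the lab's PostgreSQL server
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute(
    "CREATE TABLE student (id TEXT, name TEXT, dept_name TEXT, tot_cred INTEGER)"
)
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        ("18675", "Araya", "Statistics", 82),
        ("57083", "Gray", "Pol. Sci.", 107),
        ("33759", "Mowbray", "Psychology", 44),
        ("3651", "Narayanan", "Elec. Eng.", 14),
    ],
)

cur.execute("SELECT * FROM student WHERE name LIKE ? ORDER BY name", ("%ray%",))

first = cur.fetchone()            # one tuple: the first matching row
next_two = cur.fetchmany(2)       # list with the following two rows
rest = cur.fetchall()             # list with whatever is left (one row here)

after_end_one = cur.fetchone()    # result set exhausted: None
after_end_many = cur.fetchmany(2) # result set exhausted: empty list

print(first, next_two, rest, after_end_one, after_end_many, sep="\n")

cur.close()
conn.close()
```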

> cursor.scroll(): 
> 
> Scroll the cursor in the result set to a new position according to mode.
> 
> If mode is relative (default), value is taken as offset to the current position in the result set, if set to absolute, value states an absolute target position.
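
To make the two scroll modes concrete, the sketch below looks ahead at upcoming rows and then steps back with a relative scroll, and later rewinds with an absolute scroll as in the cell above. `ListCursor` is a hypothetical in-memory stand-in that mimics only `fetchmany()`, `fetchall()`, and `scroll()` so the example runs without a database; it is not part of psycopg2, and its bounds check is an assumption modelled on the DB-API recommendation. `peek` is likewise a hypothetical helper, and it assumes a default (client-side) cursor when used with psycopg2.

```python
class ListCursor:
    """Hypothetical stand-in for a database cursor, backed by a list."""

    def __init__(self, rows):
        self._rows = list(rows)
        self._pos = 0  # index of the next row to be fetched

    def fetchmany(self, size=1):
        chunk = self._rows[self._pos:self._pos + size]
        self._pos += len(chunk)
        return chunk

    def fetchall(self):
        chunk = self._rows[self._pos:]
        self._pos = len(self._rows)
        return chunk

    def scroll(self, value, mode="relative"):
        # relative: offset from the current position; absolute: target position
        target = self._pos + value if mode == "relative" else value
        if not 0 <= target <= len(self._rows):
            raise IndexError("scroll target is outside the result set")
        self._pos = target


def peek(cur, n=1):
    """Return up to n upcoming rows, then move the cursor back over them."""
    rows = cur.fetchmany(n)
    if rows:
        cur.scroll(-len(rows), mode="relative")
    return rows


cur = ListCursor(["Araya", "Gray", "Mowbray", "Narayanan"])
peeked = peek(cur, 2)            # position is unchanged afterwards
consumed = cur.fetchmany(3)      # still starts from the first row
cur.scroll(0, mode="absolute")   # rewind to the start of the result set
everything = cur.fetchall()
print(peeked, consumed, everything, sep="\n")
```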

In [None]:
import pandas as pd

df = pd.read_sql_query("select * from student where name like '%ray%'", con=conn)
df

In [29]:
cur.close()
conn.close()

Finally, close the communication with PostgreSQL by calling the `close()` method of the `cursor` and `connection` objects.
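
Closing by hand is easy to forget when a cell raises an error halfway through. One common way to guarantee the `close()` calls is `contextlib.closing` from the standard library, sketched below. The built-in `sqlite3` module is used here as a stand-in so the example runs without a PostgreSQL server; the same pattern should apply to a `psycopg2.connect(...)` call, since `closing` only requires an object with a `close()` method.

```python
import sqlite3
from contextlib import closing

# closing() calls .close() on exit, even if the body raises an exception
with closing(sqlite3.connect(":memory:")) as conn:
    with closing(conn.cursor()) as cur:
        cur.execute("SELECT 1")
        value = cur.fetchone()[0]

# After the block the connection can no longer be used
try:
    conn.execute("SELECT 1")
    still_open = True
except sqlite3.ProgrammingError:
    still_open = False

print(value, still_open)
```

Note that in psycopg2, using the connection itself in a `with` statement (`with conn:`) commits or rolls back the transaction but does not close the connection, which is why `closing` is used in this sketch.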

In [23]:
import sqlalchemy as sal
from sqlalchemy import create_engine

engine = create_engine("postgresql+psycopg2://postgres:123456@localhost:5434/ThirdDB")
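
The string passed to `create_engine` follows the pattern `dialect+driver://user:password@host:port/database`. If the password contains characters such as `@` or `/`, they need to be percent-encoded before being placed in the URL. A minimal sketch of building the URL from its parts follows; `make_pg_url` is a hypothetical helper name, and the example password `p@ss/word` is made up for illustration.

```python
from urllib.parse import quote_plus


def make_pg_url(user, password, host="localhost", port=5432, database="postgres"):
    """Build a postgresql+psycopg2 URL, percent-encoding the credentials."""
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Same target as the cell above
url = make_pg_url("postgres", "123456", port=5434, database="ThirdDB")

# A made-up password with special characters, to show the encoding
tricky = make_pg_url("postgres", "p@ss/word", port=5434, database="ThirdDB")

print(url)
print(tricky)
```

The resulting string can be passed to `create_engine` in place of the literal URL.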

In [24]:
import pandas as pd

df = pd.read_sql_query(sal.text("select * from student where name like '%ray%'"), con=engine)
df

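|     | id    | name      | dept_name  | tot_cred |
| --- | ----- | --------- | ---------- | -------- |
| 0   | 18675 | Araya     | Statistics | 82.0     |
| 1   | 57083 | Gray      | Pol. Sci.  | 107.0    |
| 2   | 33759 | Mowbray   | Psychology | 44.0     |
| 3   | 3651  | Narayanan | Elec. Eng. | 14.0     |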


In [25]:
conn = engine.connect()
# cur.execute("select * from student where name like %s", ('%m%',))
cur = conn.execute(sal.text("select * from student where name like '%ray%'"))

print("The number of matching students: ", cur.rowcount)

row = cur.fetchone()
print(row)

The number of matching students:  4
('18675', 'Araya', 'Statistics', Decimal('82'))
