# Demo 1: PostgreSQL


### Installing PostgreSQL Database in System (Windows)
1. Download the PostgreSQL installer from https://www.postgresql.org/download/windows
2. Install PostgreSQL
<ul>
    <li> Install in the default location </li>
    <li> Keep the pgAdmin component checked</li>
    <li> Set a password for the postgres user</li> 
    <li> Note the port number (5432 by default) </li>
</ul>
3. Open pgAdmin and walk through the basics of PostgreSQL in the GUI

### Install psycopg2
In Python, we have several modules available to connect and work with PostgreSQL:

* Psycopg2
* py-postgresql
* pg8000
* ocpgdb
* PyGreSQL
* bpgsql

For this tutorial, we stick with Psycopg2. Why?

* Psycopg2 is the most popular Python driver for PostgreSQL.
* Many Python frameworks that talk to PostgreSQL depend on it.
* It is actively maintained. Current releases (2.9 and later) support Python 3 only; Python 2 support ended with the 2.8 series.
* It is thread-safe (threads can share connections) and was designed for heavily multi-threaded applications.

**Psycopg2 can be installed with pip on any major operating system, including Windows, macOS, and Linux. Use the following pip command to install it.**

In [1]:
!pip install psycopg2

Collecting psycopg2
  Downloading psycopg2-2.9.3-cp39-cp39-win_amd64.whl (1.2 MB)
Installing collected packages: psycopg2
Successfully installed psycopg2-2.9.3


#### Import the library 

In [2]:
import psycopg2

### Create a connection to the database
1. Connect to the local instance of PostgreSQL (*127.0.0.1*)


#### Note: Wrapping each call in try/except makes sure errors are caught and reported with a readable message

In [3]:
try: 
    conn = psycopg2.connect("host=127.0.0.1 user=postgres password=ali123ali")
except psycopg2.Error as e: 
    print("Error: Could not make connection to the Postgres database")
    print(e)
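
Hard-coding the password in a notebook is convenient for a demo but easy to leak. A minimal sketch of reading the connection settings from environment variables instead: the helper name `connection_params` is hypothetical, and the variable names `PGHOST`, `PGUSER`, and `PGPASSWORD` follow libpq's naming as an assumption about how your environment is set up.

```python
import os


def connection_params(env=None):
    """Build keyword arguments for psycopg2.connect() from environment variables.

    Hypothetical helper for this tutorial; it is not part of psycopg2.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("PGHOST", "127.0.0.1"),  # default: the local instance
        "user": env.get("PGUSER", "postgres"),   # default: the user used in this demo
        "password": env["PGPASSWORD"],           # no default: fail loudly if unset
    }


# Usage against a running server (not executed here):
#     conn = psycopg2.connect(**connection_params())
example = connection_params({"PGPASSWORD": "secret"})
print(example["host"], example["user"])
```

Passing keyword arguments rather than assembling a connection string avoids having to quote values that contain spaces.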

### Use the connection to get a cursor that can be used to execute queries.

In [4]:
try: 
    cur = conn.cursor()
except psycopg2.Error as e: 
    print("Error: Could not get cursor to the database")
    print(e)

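### Use automatic commit so that each action is committed without having to call conn.commit() after each command. The ability to roll back and commit transactions is a feature of relational databases. Some statements, such as CREATE DATABASE, cannot run inside a transaction block at all, as the error below shows.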

In [5]:
try: 
    cur.execute("create database students")
except psycopg2.Error as e:
    print(e)

CREATE DATABASE cannot run inside a transaction block



In [6]:
conn = psycopg2.connect("host=127.0.0.1 user=postgres password=ali123ali")
cur = conn.cursor()
conn.set_session(autocommit=True)
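
Without autocommit, a connection keeps a transaction open until `conn.commit()` or `conn.rollback()` is called. A sketch of that commit-or-rollback pattern, written against the generic DB-API connection methods that psycopg2 also provides: `run_in_transaction` is a hypothetical helper, and the demo uses Python's built-in `sqlite3` module purely as a stand-in so it runs without a PostgreSQL server.

```python
import sqlite3


def run_in_transaction(conn, statements):
    """Execute all statements; commit on success, roll back everything on any error."""
    cur = conn.cursor()
    try:
        for sql in statements:
            cur.execute(sql)
        conn.commit()
        return True
    except Exception as e:  # with psycopg2 you would catch psycopg2.Error
        conn.rollback()
        print("Rolled back:", e)
        return False
    finally:
        cur.close()


# Stand-in demo with an in-memory SQLite database (assumption: no server at hand).
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE universities (university_id INTEGER PRIMARY KEY, university_name TEXT NOT NULL)"
)
demo_conn.commit()

ok = run_in_transaction(demo_conn, [
    "INSERT INTO universities VALUES (1, 'Information Technology University')",
])
failed = run_in_transaction(demo_conn, [
    "INSERT INTO universities VALUES (2, 'Baha Uddin Zakaria University')",
    "INSERT INTO universities VALUES (2, 'Duplicate key')",  # violates the primary key
])
count = demo_conn.execute("SELECT COUNT(*) FROM universities").fetchone()[0]
print(ok, failed, count)
```

The second batch fails on its last statement, so the rollback also discards the first insert of that batch and only one row remains.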

## Higher Education Commission Students Database
<img src="sources/students.png">
<img src="sources/Universites.png">
<img src="sources/courses.png">


### Creating a database

In [7]:
try: 
    cur.execute("CREATE DATABASE students;")
except psycopg2.Error as e:
    print(e)

database "students" already exists



### Close our connection to the default database, reconnect to the students database, and get a new cursor.

In [7]:
try: 
    conn.close()
except psycopg2.Error as e:
    print(e)

In [8]:
try: 
    conn = psycopg2.connect("host=127.0.0.1 dbname=students user=postgres password=ali123ali")
except psycopg2.Error as e: 
    print("Error: Could not make connection to the Postgres database")
    print(e)

In [10]:
try: 
    cur = conn.cursor()
except psycopg2.Error as e: 
    print("Error: Could not get cursor to the database")
    print(e)

conn.set_session(autocommit=True)

### Now we will create our first table, which contains information about universities. The universities table has the following columns. 
`column 1: university_id
 column 2: university_name 
`
<img src="sources/Universites.png" width=500 height=250>

### Translate this information into a Create Table Statement. 

Review this document on PostgreSQL datatypes: https://www.postgresql.org/docs/9.5/datatype.html


In [8]:
cur.execute('DROP TABLE IF EXISTS universities CASCADE')
query = """CREATE TABLE IF NOT EXISTS universities 
(
    university_id INT, 
    university_name VARCHAR NOT NULL,
    PRIMARY KEY(university_id)
);"""
try: 
    cur.execute(query)
except psycopg2.Error as e: 
    print("Error: Issue creating table")
    print (e)

In [9]:
data = [(1,"Information Technology University"),
        (2,"Baha Uddin Zakaria University"),
        (3,"Lahore University of Management Sciences")]

query = "INSERT INTO universities (university_id,university_name) VALUES (%s,%s)"

try:
    cur.executemany(query,data)
except psycopg2.Error as e: 
    print("Error: Inserting Rows")
    print (e)

In [13]:
cur.execute("SELECT * FROM universities;")
row = cur.fetchone()
while row:
    print(row)
    row = cur.fetchone()

(1, 'Information Technology University')
(2, 'Baha Uddin Zakaria University')
(3, 'Lahore University of Management Sciences')
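
The `fetchone()` loop can be shortened, because DB-API cursors (psycopg2's included) are iterable. A small sketch of that idea: `fetch_rows` is a hypothetical helper, and the demo substitutes Python's built-in `sqlite3` module for PostgreSQL so that it runs without a server (note that `sqlite3` uses `?` placeholders where psycopg2 uses `%s`).

```python
import sqlite3


def fetch_rows(cur, query):
    """Run a query and return every row by iterating over the cursor."""
    cur.execute(query)
    # Iterating yields the same rows as calling fetchone() until it returns None.
    return [row for row in cur]


# Stand-in demo with an in-memory SQLite database.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute(
    "CREATE TABLE universities (university_id INT PRIMARY KEY, university_name VARCHAR NOT NULL)"
)
demo_cur.executemany(
    "INSERT INTO universities (university_id, university_name) VALUES (?, ?)",
    [(1, "Information Technology University"), (2, "Baha Uddin Zakaria University")],
)

rows = fetch_rows(demo_cur, "SELECT * FROM universities ORDER BY university_id")
for row in rows:
    print(row)
```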


### Next, we will create our second table, which contains information about students. The student_info table has the following columns. 
`column 1: student_id
 column 2: student_name 
 column 3: university_id
 column 4: year_of_admission
 column 5: payable_amount
`

<img src="sources/Students.png" width=500 height=250>

### Translate this information into a Create Table Statement. 

Review this document on PostgreSQL datatypes: https://www.postgresql.org/docs/9.5/datatype.html


In [19]:
cur.execute('DROP TABLE IF EXISTS student_info CASCADE')
query = """CREATE TABLE IF NOT EXISTS student_info
(
    student_id VARCHAR,
    student_name VARCHAR,
    university_id INT,
    year_of_admission INT,
    payable_amount INT,
    PRIMARY KEY(student_id),
    CONSTRAINT fkc_universities
        FOREIGN KEY(university_id)
        REFERENCES universities(university_id)
        ON DELETE SET NULL
);"""
try: 
    cur.execute(query)
except psycopg2.Error as e: 
    print("Error: Issue creating table")
    print (e)

### No error was raised, but let's check that our table was created. `select count(*)` should return 0, as no rows have been inserted into the table yet.

In [20]:
try: 
    cur.execute("select count(*) from student_info")
    print(cur.fetchall())
except psycopg2.Error as e: 
    print("Error: Issue counting rows")
    print (e)

[(0,)]


### Insert two rows 

In [21]:
query = "INSERT INTO student_info (student_id, student_name, university_id, year_of_admission, payable_amount) \
                 VALUES (%s, %s, %s, %s, %s)"
try:
    cur.execute(query,("CS1701","Haseeb",1,2017,96000))
except psycopg2.Error as e: 
    print("Error: Inserting Rows")
    print (e)
    
try: 
    cur.execute(query,("BS1902","Kashif",2,2019,14522))
except psycopg2.Error as e: 
    print("Error: Inserting Rows")
    print (e)

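### Validate that your data was inserted into the table. 
The while loop prints the results row by row. When executing queries in the Postgres shell (psql), this is not required.

### Note: Because student_id is the primary key, running the insert code more than once raises a duplicate-key error instead of storing the rows twice. PostgreSQL only allows duplicate rows in tables without a primary key or unique constraint, such as the courses table created later.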

In [22]:
try: 
    cur.execute("SELECT * FROM student_info;")
except psycopg2.Error as e: 
    print("Error: select *")
    print (e)

row = cur.fetchone()
while row:
    print(row)
    row = cur.fetchone()

('CS1701', 'Haseeb', 1, 2017, 96000)
('BS1902', 'Kashif', 2, 2019, 14522)
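
Since `student_id` is declared as the primary key, inserting the same ID a second time is rejected rather than stored twice. With psycopg2 the error arrives as `psycopg2.errors.UniqueViolation`, a subclass of `psycopg2.Error`, so the `try`/`except` blocks used in this notebook would print it. A sketch of the same situation with Python's built-in `sqlite3` module as a stand-in (so it runs without a server), where the counterpart exception is `sqlite3.IntegrityError` and the placeholder is `?` instead of `%s`:

```python
import sqlite3

# Stand-in demo with an in-memory SQLite database.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute(
    "CREATE TABLE student_info (student_id VARCHAR PRIMARY KEY, student_name VARCHAR)"
)

insert = "INSERT INTO student_info (student_id, student_name) VALUES (?, ?)"
demo_cur.execute(insert, ("CS1701", "Haseeb"))

duplicate_rejected = False
try:
    demo_cur.execute(insert, ("CS1701", "Haseeb"))  # same primary key again
except sqlite3.IntegrityError as e:
    duplicate_rejected = True
    print("Error: Inserting Rows")
    print(e)

row_count = demo_cur.execute("SELECT COUNT(*) FROM student_info").fetchone()[0]
print(row_count)
```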


### Insert several rows at once using the `executemany` method

In [22]:
data = [["BS1113","Rashid",1,2011,89000],
        ["CS1704","Hira",3,2017,102500],
        ["BS1105","Faisal",1,2011,36000]]

try:
    cur.executemany(query,data)
except psycopg2.Error as e: 
    print("Error: Inserting Rows")
    print (e)

In [28]:
cur.execute("SELECT * FROM student_info;")
row = cur.fetchone()
while row:
    print(row)
    row = cur.fetchone()

('CS1701', 'Haseeb', 1, 2017, 96000)
('BS1902', 'Kashif', 2, 2019, 14522)
('BS1113', 'Rashid', 1, 2011, 89000)
('CS1704', 'Hira', 3, 2017, 102500)
('BS1105', 'Faisal', 1, 2011, 36000)


### Let's create our third table, which contains information about students' registered courses. The courses table has the following columns. 
`column 1: student_id
 column 2: subject_name 
`
<img src="sources/Courses.png" width=500 height=250>

### Translate this information into a Create Table Statement. 

Review this document on PostgreSQL datatypes: https://www.postgresql.org/docs/9.5/datatype.html


In [29]:
cur.execute('DROP TABLE IF EXISTS courses CASCADE')
query = "CREATE TABLE IF NOT EXISTS courses (student_id varchar, subject_name varchar);"
try: 
    cur.execute(query)
except psycopg2.Error as e: 
    print("Error: Issue creating table")
    print (e)

In [30]:
data = [("CS1701","Computer Architecture"),
("CS1701","Operating Systems"),
("CS1701","D Lab"),
("BS1902","Computing Fundamentals"),
("BS1902","Basic Electronics"),
("BS1113","Human Computer Interface"),
("CS1704","Computer Architecture"),
("CS1704","Operating Systems"),
("BS1105","Software Engineering")]

query = "INSERT INTO courses (student_id,subject_name) VALUES (%s,%s)"

try:
    cur.executemany(query,data)
except psycopg2.Error as e: 
    print("Error: Inserting Rows")
    print (e)

In [31]:
cur.execute("SELECT * FROM courses;")
row = cur.fetchone()
while row:
    print(row)
    row = cur.fetchone()

('CS1701', 'Computer Architecture')
('CS1701', 'Operating Systems')
('CS1701', 'D Lab')
('BS1902', 'Computing Fundamentals')
('BS1902', 'Basic Electronics')
('BS1113', 'Human Computer Interface')
('CS1704', 'Computer Architecture')
('CS1704', 'Operating Systems')
('BS1105', 'Software Engineering')


## Let's try using the WHERE clause

In [32]:
query = "SELECT * from student_info WHERE year_of_admission=2017"

In [33]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 1, 2017, 96000)
('CS1704', 'Hira', 3, 2017, 102500)


### Using the AND, OR, and NOT operators

In [34]:
query = "SELECT * from student_info WHERE year_of_admission=2017 AND university_id=1"

In [35]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 1, 2017, 96000)


In [36]:
query = "SELECT * from student_info WHERE year_of_admission=2017 OR university_id=1"

In [37]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 1, 2017, 96000)
('BS1113', 'Rashid', 1, 2011, 89000)
('CS1704', 'Hira', 3, 2017, 102500)
('BS1105', 'Faisal', 1, 2011, 36000)


In [38]:
query = "SELECT * from student_info WHERE NOT year_of_admission=2017"

In [39]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('BS1902', 'Kashif', 2, 2019, 14522)
('BS1113', 'Rashid', 1, 2011, 89000)
('BS1105', 'Faisal', 1, 2011, 36000)


## Using the LIKE operator
Find all students enrolled in a bachelor's program (student IDs starting with `BS`) using the LIKE operator.

In [40]:
query = "SELECT * from student_info WHERE student_id like 'BS%%'"

In [41]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('BS1902', 'Kashif', 2, 2019, 14522)
('BS1113', 'Rashid', 1, 2011, 89000)
('BS1105', 'Faisal', 1, 2011, 36000)
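
In psycopg2, a literal `%` only has to be doubled when parameters are passed to `execute()`, because a single `%` then marks a placeholder. An alternative that sidesteps the escaping is to pass the whole pattern as a parameter. A sketch of that approach: `students_with_prefix` is a hypothetical helper, and the demo uses Python's built-in `sqlite3` module as a stand-in so it runs without a PostgreSQL server, which is why the placeholder is `?` rather than psycopg2's `%s`.

```python
import sqlite3


def students_with_prefix(cur, prefix):
    """Return students whose ID starts with `prefix`, passing the pattern as a parameter."""
    # With psycopg2 the placeholder below would be %s instead of ?
    cur.execute(
        "SELECT student_id, student_name FROM student_info "
        "WHERE student_id LIKE ? ORDER BY student_id",
        (prefix + "%",),
    )
    return cur.fetchall()


# Stand-in demo with an in-memory SQLite database.
demo_conn = sqlite3.connect(":memory:")
demo_cur = demo_conn.cursor()
demo_cur.execute("CREATE TABLE student_info (student_id VARCHAR PRIMARY KEY, student_name VARCHAR)")
demo_cur.executemany(
    "INSERT INTO student_info VALUES (?, ?)",
    [("CS1701", "Haseeb"), ("BS1902", "Kashif"), ("BS1113", "Rashid")],
)

bachelors = students_with_prefix(demo_cur, "BS")
print(bachelors)
```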


## SELECT DISTINCT Examples
The following SQL statement selects only the distinct values from the "subject_name" column in the "courses" table:

In [42]:
query = "SELECT DISTINCT subject_name from courses;"

In [43]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('Basic Electronics',)
('Computer Architecture',)
('Software Engineering',)
('Operating Systems',)
('Human Computer Interface',)
('D Lab',)
('Computing Fundamentals',)


## Using the COUNT function to count distinct subjects

In [44]:
query = "SELECT count(DISTINCT subject_name) from courses;"

In [45]:
cur.execute(query)  # execute() returns None, so its result is not assigned
count = cur.fetchall()
print(count)

[(7,)]


## Using the ORDER BY keyword

In [46]:
query = "select * from student_info ORDER BY payable_amount"

In [47]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('BS1902', 'Kashif', 2, 2019, 14522)
('BS1105', 'Faisal', 1, 2011, 36000)
('BS1113', 'Rashid', 1, 2011, 89000)
('CS1701', 'Haseeb', 1, 2017, 96000)
('CS1704', 'Hira', 3, 2017, 102500)


In [48]:
query = "select * from student_info ORDER BY payable_amount DESC"

In [49]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1704', 'Hira', 3, 2017, 102500)
('CS1701', 'Haseeb', 1, 2017, 96000)
('BS1113', 'Rashid', 1, 2011, 89000)
('BS1105', 'Faisal', 1, 2011, 36000)
('BS1902', 'Kashif', 2, 2019, 14522)


## Using the UPDATE statement

In [50]:
query = "UPDATE student_info SET student_name = 'Javeria' where student_id = 'CS1704'"
cur.execute(query)

In [51]:
query = "select * from student_info ORDER BY payable_amount DESC"
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1704', 'Javeria', 3, 2017, 102500)
('CS1701', 'Haseeb', 1, 2017, 96000)
('BS1113', 'Rashid', 1, 2011, 89000)
('BS1105', 'Faisal', 1, 2011, 36000)
('BS1902', 'Kashif', 2, 2019, 14522)


## Let's perform JOINs

### Find out the courses each student is taking by JOINING student_info with courses 

In [52]:
query = "select courses.student_id, student_name, subject_name from \
         student_info JOIN courses \
         ON student_info.student_id = courses.student_id;"

In [53]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 'Computer Architecture')
('CS1701', 'Haseeb', 'Operating Systems')
('CS1701', 'Haseeb', 'D Lab')
('BS1902', 'Kashif', 'Computing Fundamentals')
('BS1902', 'Kashif', 'Basic Electronics')
('BS1113', 'Rashid', 'Human Computer Interface')
('CS1704', 'Javeria', 'Computer Architecture')
('CS1704', 'Javeria', 'Operating Systems')
('BS1105', 'Faisal', 'Software Engineering')


### Let's find out which students are enrolled in the 'Computer Architecture' subject

In [54]:
query = "select courses.student_id, student_name, subject_name from \
         student_info JOIN courses \
         ON student_info.student_id = courses.student_id where subject_name='Computer Architecture'"

In [55]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 'Computer Architecture')
('CS1704', 'Javeria', 'Computer Architecture')


### Let's list the students with their respective university names by JOINing the student_info table with the universities table

In [56]:
query = "select student_id, student_name, university_name from \
         student_info JOIN universities \
         ON student_info.university_id = universities.university_id"

In [57]:
cur.execute(query)

rows = cur.fetchall()
for row in rows:
    print(row)

('CS1701', 'Haseeb', 'Information Technology University')
('BS1902', 'Kashif', 'Baha Uddin Zakaria University')
('BS1113', 'Rashid', 'Information Technology University')
('BS1105', 'Faisal', 'Information Technology University')
('CS1704', 'Javeria', 'Lahore University of Management Sciences')


### Pandas DataFrames

In [25]:
import pandas as pd
query = 'SELECT * from student_info;'
students = pd.read_sql_query(query,conn)
display(students)

| | student_id | student_name | university_id | year_of_admission | payable_amount |
|---|---|---|---|---|---|
| 0 | CS1701 | Haseeb | 1 | 2017 | 96000 |
| 1 | BS1902 | Kashif | 2 | 2019 | 14522 |


In [23]:
from sqlalchemy import create_engine
engine = create_engine('postgresql://postgres:ali123ali@localhost:5432/students')

In [26]:
course_info = {'course_id': [1,2,3],
        'course_name': ['Big Data Analytics', 'Deep Learning', 'Machine Learning']}
  
df = pd.DataFrame(course_info)
df.to_sql('course_info', con=engine, if_exists='replace',index=False)

### Deleting rows from a table

In [63]:
query = "DELETE FROM courses WHERE student_id='CS1701' AND subject_name='D Lab';"
cur.execute(query)
display(pd.read_sql_query('select * from courses',conn))

| | student_id | subject_name |
|---|---|---|
| 0 | CS1701 | Computer Architecture |
| 1 | CS1701 | Operating Systems |
| 2 | BS1902 | Computing Fundamentals |
| 3 | BS1902 | Basic Electronics |
| 4 | BS1113 | Human Computer Interface |
| 5 | CS1704 | Computer Architecture |
| 6 | CS1704 | Operating Systems |
| 7 | BS1105 | Software Engineering |


### Drop the tables to clean up and avoid duplicates on the next run

In [64]:
try: 
    cur.execute("DROP table IF EXISTS student_info CASCADE")
    cur.execute("DROP table IF EXISTS universities CASCADE")
    cur.execute("DROP table IF EXISTS courses CASCADE")
    cur.execute("DROP table IF EXISTS course_info CASCADE")
except psycopg2.Error as e: 
    print("Error: Dropping table")
    print (e)

### Close the cursor and connection.

In [65]:
cur.close()
conn.close()
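
Cursors and connections can also be released automatically, even when a query fails partway through. A sketch using `contextlib.closing`, which calls `.close()` on exit and works with any DB-API object, psycopg2's included: `query_all` is a hypothetical helper, and the demo uses Python's built-in `sqlite3` module as a stand-in so it runs without a PostgreSQL server.

```python
import sqlite3
from contextlib import closing


def query_all(conn, query):
    """Run a query with a short-lived cursor that is closed even if an error occurs."""
    with closing(conn.cursor()) as cur:
        cur.execute(query)
        return cur.fetchall()


# Stand-in demo: the connection is closed when the with-block ends.
with closing(sqlite3.connect(":memory:")) as demo_conn:
    demo_conn.execute("CREATE TABLE courses (student_id VARCHAR, subject_name VARCHAR)")
    demo_conn.execute("INSERT INTO courses VALUES ('CS1701', 'Operating Systems')")
    rows = query_all(demo_conn, "SELECT * FROM courses")

print(rows)
```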