So far, we've been using a database engine called **SQLite**. SQLite is one of the most common database engines, and has many advantages:

* The database is stored in a single file, making it portable.
* We can use a SQLite database directly from Python, and don't need a separate program running.
* It implements most SQL commands, enabling us to use most of the statements we're familiar with.

However, particularly when developing larger applications, **SQLite** has a few downsides that make other database engines more attractive:

* Only one process at a time can write to the database. When we have a complex web application, we may have multiple processes updating information in the database at the same time. For example, on Facebook, one process might handle updating user information, and another might handle generating the news feed.
* We can't take advantage of server-side performance features, such as a **cache** shared by every client. Because a **SQLite database** is a single file that each program opens directly, there is no long-running server process to keep frequently used data in memory on behalf of all users. When running a site like `Facebook` that has a ton of traffic, it's important to be able to look up data quickly.
* **SQLite** doesn't have any built-in security. With a production website, it's common to want some people to be able to modify tables in a database (`write`), and others to only be able to make `SELECT` queries to tables in the database (`read`). This is because giving someone **write access** to the database can be a security risk, in that they can update or overwrite data. **SQLite** doesn't allow for restricting access to a database in this way.
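The single-writer limitation can be seen with nothing but the standard library. The sketch below is an illustration under assumptions: it uses two connections in one process to stand in for two processes, a throwaway file with the hypothetical name `demo.db`, and `timeout=0` so that a blocked writer fails immediately instead of waiting.

```python
import os
import sqlite3
import tempfile

# Throwaway database file (hypothetical name) in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None lets us issue BEGIN/COMMIT ourselves;
# timeout=0 makes a blocked writer fail at once instead of waiting.
writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)

writer_a.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")

# Writer A takes the database's write lock and holds it.
writer_a.execute("BEGIN IMMEDIATE")
writer_a.execute("INSERT INTO notes VALUES (1, 'from A')")

# Writer B cannot start its own write while A holds the lock.
try:
    writer_b.execute("BEGIN IMMEDIATE")
    second_writer_blocked = False
except sqlite3.OperationalError as exc:
    second_writer_blocked = True
    print("Writer B was refused:", exc)

# Once A commits, B is free to write.
writer_a.execute("COMMIT")
writer_b.execute("BEGIN IMMEDIATE")
writer_b.execute("INSERT INTO notes VALUES (2, 'from B')")
writer_b.execute("COMMIT")

row_count = writer_b.execute("SELECT COUNT(*) FROM notes").fetchone()[0]
writer_a.close()
writer_b.close()
```

In a real application the second writer would usually wait and retry rather than fail, but the writes still happen one at a time.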

In general, `SQLite` is good in cases where having a small and simple database engine is important. `SQLite` is used extensively in **embedded applications**, such as `Android` and `iOS` applications.

In cases where there will be **multiple users**, or where **performance** is important, [PostgreSQL](https://www.postgresql.org/) is one of the most commonly used database engines. **PostgreSQL** is **open source**, and is **free to download and use**.

At a high level, `PostgreSQL` consists of two pieces, a **server** and **clients**. 

* The server is a program that manages databases and handles queries.
* Clients communicate back and forth to the server. 

Only the **server** ever directly accesses the databases -- the **clients** can only make requests to the **server**. 

One of the advantages of this model is that multiple clients can communicate with the server at the same time. This allows multiple processes to write to a database at the same time.

It's possible to run a PostgreSQL server either remotely or locally.

* If it's remote, we connect to it over a network, such as the internet.
* If it's local, we connect to it on our own machine.

In both cases, we'll be connecting to PostgreSQL via a system port.

One way to think of ports is to think of receiving mail at an apartment building. Let's say 5 people live in an apartment building, but they only have a single address. All incoming mail will come to that address, and then has to be sorted and given to each person.

All incoming mail is merged into a single pile, because the whole apartment building only has one address. Each apartment occupant then has to sort through the pile to find their mail. Not only is this inefficient, it also results in some apartments getting mail that isn't theirs by accident.

We can make life easier for everyone by giving each apartment its own address.

Now, nobody has to sort mail, and it's unlikely that someone will accidentally get a message that isn't theirs.

Every computer runs dozens to hundreds of programs. Many of these programs can accept incoming connections from the internet. For instance, web servers run on computers and accept connections from people all over the world. Once the connections are created, data is sent along the connections.

If every program received data in the same stream, we'd have a similar situation to all of the apartments only having one address. Each program would be responsible for figuring out which messages were for it, and many messages would be sent to the wrong program. It would be impossible to know which program we were communicating with when we connected to the computer.

One way to avoid this is for each program to have its own address. A system port is similar to an apartment number, in that a port on a computer can only be used by one server at a time. For example, web servers conventionally listen on port 80 for HTTP traffic. Any incoming messages on that port are automatically delivered to the program listening on it.
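The "one server per port" rule can be checked with Python's standard `socket` module. This is a small sketch, not part of PostgreSQL itself: it asks the operating system for any free port (by binding to port `0`), then shows that a second socket can't claim the same port while the first one is listening.

```python
import socket

# The first "server" asks the OS for any free port on this machine.
first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
first.bind(("127.0.0.1", 0))
first.listen()
port = first.getsockname()[1]

# A second server tries to claim the same port while the first holds it.
second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    second.bind(("127.0.0.1", port))
    second_bind_succeeded = True
except OSError as exc:
    second_bind_succeeded = False
    print(f"Port {port} is already taken:", exc)
finally:
    second.close()

first.close()
```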

By default, `PostgreSQL` uses **port 5432** to communicate with the outside world. If we start a PostgreSQL server, it will listen for incoming connections on port 5432. Clients will be able to connect to the server using this port. 

If we start a client, we'll have to specify which server to connect to, along with the port to connect to.
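Before starting a client, it can be useful to check whether anything is listening at a given server and port. The helper below is a minimal sketch using only the standard library; the name `is_port_open` is our own, and a `True` result only means that some program accepted the connection, not that the program is PostgreSQL.

```python
import socket

def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: nothing usable is listening.
        return False

# 5432 is PostgreSQL's default port; the result depends on the machine.
print("Something is listening on port 5432:", is_port_open("localhost", 5432))
```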

There are many clients for **PostgreSQL**, including [graphical clients](https://wiki.postgresql.org/wiki/Community_Guide_to_PostgreSQL_GUI_Tools). The most common Python client for PostgreSQL is called [psycopg2](http://initd.org/psycopg/). Connecting to a PostgreSQL database using `psycopg2` is similar to connecting to a **SQLite database** using the `sqlite3` library. `psycopg2` also uses **Connection** and **Cursor** objects.

We'd connect to a database using `psycopg2` like this:

`import psycopg2
conn = psycopg2.connect("dbname=postgres user=postgres")
cur = conn.cursor()`

We have to specify both a database name and a user name. A PostgreSQL server can have multiple databases and multiple users, so we need to specify which user we're connecting as, and which database we're connecting to.

When **PostgreSQL** is first installed, the default user account is called **postgres**, with an associated database called **postgres**.

We may also notice that we didn't specify a server to connect to. By default, `psycopg2` connects to a server on the current computer, using port **5432**.
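If we'd rather spell the defaults out, `psycopg2.connect` also accepts keyword arguments such as `host` and `port`. The sketch below is one possible arrangement, not the only one: the helper names `connection_params` and `connect` are our own, and reading the password from the `PGPASSWORD` environment variable is an assumption about how we'd like to keep it out of the notebook.

```python
import os

def connection_params(dbname="postgres", user="postgres",
                      host="localhost", port=5432):
    """Return keyword arguments for psycopg2.connect with explicit defaults."""
    params = {"dbname": dbname, "user": user, "host": host, "port": port}
    # Read the password from the environment instead of hard-coding it.
    password = os.environ.get("PGPASSWORD")
    if password:
        params["password"] = password
    return params

def connect(**overrides):
    """Open a psycopg2 connection using the parameters above."""
    # Imported here so connection_params() works without the driver installed.
    import psycopg2
    return psycopg2.connect(**connection_params(**overrides))

params = connection_params(dbname="postgres")
print(params["host"], params["port"])
```

With a server running, `conn = connect()` would then behave like the connection string shown earlier, but with the server and port stated explicitly.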

When we're done with a Connection object, we should close it to avoid issues where one connection prevents another from executing a query. We can close a connection like this:

`conn.close()`
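It's easy to forget `close()` when a query raises an error halfway through. One way to guard against that is `contextlib.closing`, which calls `close()` even if the block fails. The helper below is a sketch with a name of our own (`fetch_all`); it takes a function that creates the connection, so it works with any connection object that has `cursor()` and `close()` methods. Note that in `psycopg2`, writing `with conn:` on its own ends the transaction but does not close the connection.

```python
from contextlib import closing

def fetch_all(connect, query):
    """Open a connection, run one query, and always close the connection."""
    # `connect` is any zero-argument callable that returns a connection, e.g.
    # lambda: psycopg2.connect("dbname=postgres user=postgres")
    with closing(connect()) as conn:
        with closing(conn.cursor()) as cur:
            cur.execute(query)
            return cur.fetchall()
```

With a server running, `fetch_all(lambda: psycopg2.connect("dbname=postgres user=postgres"), "SELECT 1;")` would return the rows and leave no open connection behind.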

In [3]:
pip install psycopg2

Collecting psycopg2
  Downloading psycopg2-2.8.6-cp37-cp37m-win_amd64.whl (1.1 MB)
Installing collected packages: psycopg2
Successfully installed psycopg2-2.8.6
Note: you may need to restart the kernel to use updated packages.


In [4]:
import psycopg2
import sqlite3 as sql

In [5]:
import psycopg2
conn = psycopg2.connect("dbname=postgres user=postgres password = waqas1986ali")
cur = conn.cursor() # Initialize a Cursor object from the connection.
print(cur)
conn.close()                   

<cursor object at 0x00000210791BCC88; closed: 0>


In [46]:
#  Write a SQL query that creates a table called notes in the database,
# with the following columns and data types:
#  id -- integer data type, and is a primary key.
#  body -- text data type.
#  title -- text data type.

conn = psycopg2.connect("dbname=postgres user=postgres password = waqas1986ali")
cur = conn.cursor()
cur.execute("""Create Table If not Exists notes(
                id Integer Primary Key,
                body Text,
                title Text)""")
conn.close()

If we checked the `postgres` database now, we would notice that there actually isn't a `notes` table inside it. This isn't a bug -- it's because of a concept called **SQL transactions**. With SQLite, every query we made that modified the data was immediately executed, and immediately changed the database.
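One way to check for the table from Python is to query `information_schema.tables`, a standard view that lists the tables visible to the current user. The helper name `table_exists` is our own, and the sketch assumes `conn` is a new `psycopg2` connection (inside the connection that ran the uncommitted `CREATE TABLE`, the table would still appear to exist).

```python
def table_exists(conn, table_name):
    """Return True if a table with this name is visible to the connection."""
    cur = conn.cursor()
    try:
        # %s is psycopg2's placeholder; the value is passed separately.
        cur.execute(
            "SELECT EXISTS (SELECT 1 FROM information_schema.tables "
            "WHERE table_name = %s);",
            (table_name,),
        )
        return bool(cur.fetchone()[0])
    finally:
        cur.close()
```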

With PostgreSQL, we're dealing with multiple users who could be changing the database at the same time. Let's imagine a simple scenario where we're keeping track of accounts for different customers of a bank. We could write a simple query to create a table for this:

`CREATE TABLE accounts(
   id integer PRIMARY KEY,
   name text,
   balance float
);`

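Now suppose Sue sends 100 dollars to Jim. This takes two queries: one that adds 100 dollars to Jim's balance, and one that removes 100 dollars from Sue's balance. If the first query succeeded but the second one failed (because of an error in the query, a database crash, or a conflicting query from another user), Jim would be credited 100 dollars, but 100 dollars would not be removed from Sue. This would cause the bank to lose money.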

Transactions prevent this type of behavior by ensuring that all the queries in a transaction block are applied together, as a single unit. If any of the queries fail, the whole group fails, and no changes are made to the database at all.
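As a rough sketch of how this looks with `psycopg2`, the function below runs both updates and then either commits them together or rolls both back. The function name `transfer` is our own, and it assumes an `accounts` table like the one above and an open connection passed in as `conn`.

```python
def transfer(conn, sender, recipient, amount):
    """Move `amount` from sender to recipient as a single transaction."""
    cur = conn.cursor()
    try:
        # Credit the recipient.
        cur.execute(
            "UPDATE accounts SET balance = balance + %s WHERE name = %s;",
            (amount, recipient),
        )
        # Debit the sender.
        cur.execute(
            "UPDATE accounts SET balance = balance - %s WHERE name = %s;",
            (amount, sender),
        )
        # Both updates succeeded, so make them permanent together.
        conn.commit()
    except Exception:
        # Something failed: undo whatever part of the transfer already ran.
        conn.rollback()
        raise
    finally:
        cur.close()
```

Calling `transfer(conn, "Sue", "Jim", 100)` either changes both balances or neither of them.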

Whenever we run a query on a Connection in `psycopg2`, a new transaction is automatically started if one isn't already open. All queries run up until the [commit](http://initd.org/psycopg/docs/connection.html#connection.commit) method is called are placed into the same transaction block. When commit is called, the changes made by all of those queries are saved to the database together.

If we don't want to apply the changes in the transaction block, we can call the [rollback](http://initd.org/psycopg/docs/connection.html#connection.rollback) method to discard them. If we call neither commit nor rollback, the transaction stays pending, and closing the connection discards its changes, so they are never applied to the database.
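`psycopg2` connections can also be used in a `with` block, which calls commit for us when the block finishes normally and rollback when it raises an error. The sketch below wraps the `notes` table creation this way; the function name `create_notes_table` is our own, and `conn` is assumed to be an open `psycopg2` connection.

```python
def create_notes_table(conn):
    """Create the notes table and commit, or roll back if anything fails."""
    # `with conn:` commits when the block exits normally and rolls back
    # when it raises. It does not close the connection.
    with conn:
        # `with conn.cursor()` closes the cursor at the end of the block.
        with conn.cursor() as cur:
            cur.execute(
                "CREATE TABLE IF NOT EXISTS notes ("
                "id integer PRIMARY KEY, body text, title text);"
            )
```

We would still call `conn.close()` ourselves once we're done with the connection.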

In [47]:
conn = psycopg2.connect("dbname=postgres user=postgres password=waqas1986ali")
cur = conn.cursor()
cur.execute("""Create Table IF not Exists notes(
                id Integer Primary Key,
                body Text,
                title Text)""")
conn.commit()
conn.close()

There are cases when we won't want to manage a transaction, and we'll instead want changes right away. This is most common when we're making changes to the database that we want to be guaranteed to happen immediately.

Some changes also have such widespread effects that they can't be wrapped inside of a transaction. One example of this is creating a database. When creating a database, we'll need to activate autocommit mode first.

To activate autocommit mode, we'll need to set the [autocommit](http://initd.org/psycopg/docs/connection.html#connection.autocommit) property of the Connection object to `True`.

In [48]:
# Write a SQL query that creates a table called facts in the database, with the following columns and data types:
# id -- integer data type, and is a primary key.
# country -- text data type.
# value -- text data type.
conn = psycopg2.connect("dbname=postgres user=postgres password=waqas1986ali")
conn.autocommit = True
cur = conn.cursor()
cur.execute("""Create Table IF not Exists facts(
                id Integer Primary Key,
                country Text,
                value Text)""")
conn.close()

In [65]:
# Execute a SQL query that inserts a row into the notes table with the following values:
# id -- 1
# body -- 'Do more missions on Dataquest.'
# title -- 'Dataquest reminder'.
# Execute a SQL query that selects all of the rows from the notes table.

conn = psycopg2.connect("dbname=postgres user=postgres password=waqas1986ali")
cur = conn.cursor()

# Remove any existing row with id 1 so the cell can be re-run.
cur.execute("""Delete From notes
               Where id = 1
            """)
# ON CONFLICT ... DO NOTHING behaves like INSERT OR IGNORE in SQLite.
cur.execute("""Insert Into notes
               Values
               (1,'Do more missions on Dataquest.','Dataquest reminder')
               ON Conflict (id)
               Do Nothing""")
conn.commit()
cur.execute("""select * from notes""")

rows = cur.fetchall()
print(rows)
conn.close()

[(1, 'Do more missions on Dataquest.', 'Dataquest reminder')]
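When the values come from variables rather than being typed into the query, it's safer to let `psycopg2` fill them in through `%s` placeholders than to build the SQL string by hand. The helper below is a sketch with a name of our own (`add_note`); it assumes the `notes` table above, an open connection passed in as `conn`, and a PostgreSQL version that supports `ON CONFLICT`.

```python
def add_note(conn, note_id, body, title):
    """Insert one note; return False if a note with this id already exists."""
    cur = conn.cursor()
    try:
        # The values are passed separately from the SQL text, so quotes
        # inside `body` or `title` can't change the meaning of the query.
        cur.execute(
            "INSERT INTO notes (id, body, title) VALUES (%s, %s, %s) "
            "ON CONFLICT (id) DO NOTHING;",
            (note_id, body, title),
        )
        inserted = cur.rowcount == 1  # 0 rows means the id was already taken
        conn.commit()
        return inserted
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

For example, `add_note(conn, 1, 'Do more missions on Dataquest.', 'Dataquest reminder')` would add the same row as the cell above.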


One of the most powerful aspects of PostgreSQL is that it enables us to create multiple databases. Different databases are generally used to hold information about different applications. For instance, if we have the following three datasets and applications:

* An application that enables us to add and remove friends in our neighborhood.
* A dataset on household income worldwide.
* An application that allows us to store and share notes.

We could in theory make different tables for each of these in an existing database. But eventually, we'll reach a point where each application has multiple tables, due to foreign keys and joins. It will get messy to manage all the tables for each application separately. By storing data for a single application in a single database, we encapsulate that application, and make it easier to manage and alter the data for it.

We can create a database using the `CREATE DATABASE` SQL statement:

`CREATE DATABASE dbName;`

Here's a concrete example:

`CREATE DATABASE notes;`

The above SQL command will create a database called **notes**. We can specify the user who will own the database when we create it as well, using the `OWNER` clause:

`CREATE DATABASE notes OWNER postgres;`

The above statement will create a database called **notes** with the default `postgres` user as the **owner**. The owner has full control over the database, including the right to drop it and to grant permissions on it to other users. Superusers are an exception: they can perform any action on any database without being given permission.

In [58]:
# Create a database called income where the owner is the user postgres
conn = psycopg2.connect("""dbname=postgres user=postgres password=waqas1986ali""")
conn.autocommit = True
cur = conn.cursor()
cur.execute("""Drop Database if Exists income""")  # avoids an error when the cell is re-run
cur.execute("""Create Database income Owner postgres;""")
conn.close()
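To confirm that a database was created, we can ask the server for the databases it knows about. PostgreSQL keeps this list in the `pg_database` system catalog. The helper name `list_databases` is our own, and `conn` is assumed to be an open `psycopg2` connection to any database on the server.

```python
def list_databases(conn):
    """Return the names of the non-template databases on the server."""
    cur = conn.cursor()
    try:
        cur.execute(
            "SELECT datname FROM pg_database "
            "WHERE datistemplate = false ORDER BY datname;"
        )
        # Each row is a one-element tuple such as ('income',).
        return [row[0] for row in cur.fetchall()]
    finally:
        cur.close()
```

After the cell above has run, we'd expect `income` to appear in the returned list.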

We can delete a database using the `DROP DATABASE` statement. It immediately removes the database, provided the user executing the query has the right permissions. It should be used with caution when working with real data.

`DROP DATABASE dbName;`

Here's a more concrete example:

`DROP DATABASE income;`

The above statement will remove the database called `income`, along with any tables it contains.

In [60]:
conn = psycopg2.connect("dbname=postgres user=postgres password=waqas1986ali")
conn.autocommit = True
cur = conn.cursor()
cur.execute("DROP DATABASE income;")
conn.close()
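Because dropping a database can't be undone, it can help to wrap the statement in a small function that checks the name first. The sketch below is one cautious way to do this, under some assumptions: the name `drop_database` is our own, `conn` is a new `psycopg2` connection with no transaction in progress, and it is connected to a different database than the one being dropped (the server refuses to drop the database we're currently connected to).

```python
import re

# Only plain names made of letters, digits, and underscores are accepted.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def drop_database(conn, name):
    """Drop a database if it exists, after validating its name."""
    # Database names can't be passed as %s parameters, so the name is
    # checked before it is placed into the SQL text.
    if not _IDENTIFIER.fullmatch(name):
        raise ValueError(f"unsafe database name: {name!r}")
    # DROP DATABASE can't run inside a transaction block.
    conn.autocommit = True
    cur = conn.cursor()
    try:
        cur.execute(f"DROP DATABASE IF EXISTS {name};")
    finally:
        cur.close()
```

With `IF EXISTS`, calling `drop_database(conn, "income")` a second time does nothing instead of raising an error.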