# Welcome to your Database
## We will be Connecting Python code to a SQL Database
In this assignment we will run a web forum and adapt its code to use a database backend. Once the database backend works, we will find and fix the most common bugs that affect database-backed web apps, along with the damage they can cause to your database. Let's get started!

## What is a DB-API?

![](img/dbapi1.PNG)

In the first two notebooks we went through how to get a database to produce output for a given query, and used a black box to hide how a query actually gets its information. If you were following the course on Udacity, the queries you entered were sent to the web servers; now we will go into more depth about how the magic happens.

![](img/dbapi2.PNG)

Behind the Udacity web servers is Python code that connects to an SQLite database using DB-API method calls. The DB-API isn't a library but a standard for libraries that let Python code connect to databases. If you learn the DB-API functions, you can apply them to any database system. Below are some database systems and their corresponding libraries.

![](img/dbapi3.PNG)

In the VM we will be using PostgreSQL.

![](img/dbapi4.PNG)

Above is an example of using Python code with the DB-API, in this case with sqlite3. The general rules can be applied to any database with the DB-API.

## Query Example
1. Import the DB-API library; in the example above we use sqlite3.
2. Connect to your database. You may have to state the host name, username, password, and other information. The connection stays open until we close it.
3. Create a cursor. The cursor is what actually runs queries and fetches results. It is called a cursor because when the database gives you results, you scan through them like you would with a text cursor.
4. Execute a query using the cursor.
5. Fetch all the results.
6. (Optional) If you inserted data into the database, call commit() here, or rollback() if something went wrong.
7. Close the connection once we finish!
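
The steps above can be sketched as one small helper. This is a minimal illustration, not code from the course: the function name `run_query` is hypothetical, and it uses an in-memory SQLite database (`":memory:"`) so it runs without any files.

```python
import sqlite3  # import the DB-API library

def run_query(db_path, query):
    """Hypothetical helper: run one read-only query and return all rows."""
    conn = sqlite3.connect(db_path)   # connect to the database
    try:
        cursor = conn.cursor()        # create a cursor
        cursor.execute(query)         # execute the query
        return cursor.fetchall()      # fetch all the results
    finally:
        conn.close()                  # always close, even if the query fails

rows = run_query(":memory:", "SELECT 1 + 1")
print(rows)  # [(2,)]
```

Putting `close()` in a `finally` block is one way to make sure the connection is released even when the query raises an error.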

### Code to Create Table

In [1]:
import sqlite3
conn = sqlite3.connect('people.db')
c = conn.cursor()
c.execute("CREATE TABLE friends (name text, age integer)")
conn.commit()
conn.close()

### Code to Insert into a Table

In [2]:
import sqlite3
conn = sqlite3.connect('people.db')
c = conn.cursor()
c.execute("INSERT INTO friends VALUES ('STEVE', 26)")
conn.commit()
conn.close()

One thing to always remember is that when we run a query using the keyword insert, we must also commit our changes to the database. Imagine we were writing an accounting system and we wanted to pull \\$100 from your account and put \\$100 into Kim's account. We might make changes to two different tables, but the key takeaway is that we want both to take place at the same time, or, if something goes wrong, neither to take effect. Also, if another user were viewing the database, we would never want them to see one change and not the other. When we make changes such as an insert, they go into something called a transaction; when we call commit, the transaction actually takes effect. If we close the connection without committing, our changes will be rolled back, which means that none of our changes will have taken place.
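
The accounting scenario can be sketched with sqlite3. Everything here is an assumption made for illustration: the `accounts` table, its `CHECK` constraint, the starting balances, and the `transfer` function are not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: a balance may never go below zero.
conn.execute(
    "CREATE TABLE accounts ("
    "owner TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("you", 150), ("kim", 0)])
conn.commit()

def transfer(conn, sender, receiver, amount):
    """Move money between accounts; apply both updates or neither."""
    try:
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
            (amount, receiver),
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE owner = ?",
            (amount, sender),
        )
        conn.commit()      # both changes take effect together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # undo the credit that was already applied
        return False

first = transfer(conn, "you", "kim", 100)    # enough money: committed
second = transfer(conn, "you", "kim", 100)   # would overdraw: rolled back
balances = dict(conn.execute("SELECT owner, balance FROM accounts"))
conn.close()
print(first, second, balances)
```

The second transfer credits Kim before the debit fails, and the rollback is what keeps that half-finished change out of the database.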

### Code to Query our Table

In [4]:
conn = sqlite3.connect('people.db')
cursor = conn.cursor()
cursor.execute("SELECT * from friends ")
result = cursor.fetchall()
conn.commit()
conn.close()

In [5]:
# print our results
print(result)

[('STEVE', 26)]


Great, let's get some more people into our database!

In [8]:
conn = sqlite3.connect('people.db')
cursor = conn.cursor()
cursor.execute("INSERT INTO friends VALUES ('TIM', 25), ('LARRY', 26), ('KIM', 25), ('STEPHANIE', 21)")
# An INSERT returns no rows, so there is nothing to fetch; we only need to commit.
conn.commit()
conn.close()

Let's query our table again!

In [9]:
conn = sqlite3.connect('people.db')
cursor = conn.cursor()
cursor.execute("SELECT * from friends ")
result = cursor.fetchall()
conn.commit()
conn.close()
# print our results
print(result)

[('STEVE', 26), ('TIM', 25), ('LARRY', 26), ('KIM', 25), ('STEPHANIE', 21)]


Let's try to sort our group of friends alphabetically by name.

In [12]:
conn = sqlite3.connect('people.db')
c = conn.cursor()
query = "select * from friends ORDER BY name;"
c.execute(query)
results = c.fetchall()
conn.close()
print(results)

[('KIM', 25), ('LARRY', 26), ('STEPHANIE', 21), ('STEVE', 26), ('TIM', 25)]


### Running a Web Forum

For the second part of this notebook we will be using Vagrant to work on our forum.py file.
In our example we will be using Windows with Git Bash.

### Start the Virtual Machine
To start our virtual machine we need to open Git Bash in the relational-db directory and run the command <code>vagrant up</code>.
![](img/vagrant.PNG)

Next we need to log in. To do this we run <code>vagrant ssh</code>.

![](img/vagrant2.PNG)

Inside the VM, change directory to <code>/vagrant</code> and look around with <code>ls</code>.

The files you see here are the same as the ones in the <code>vagrant</code> subdirectory on your computer (where you started Vagrant from). Any file you create in one will be automatically shared to the other. This means that you can edit code in your favorite text editor, and run it inside the VM.

Files in the VM's <code>/vagrant</code> directory are shared with the <code>vagrant</code> folder on your computer. But other data inside the VM is not. For instance, the PostgreSQL database itself lives only inside the VM.

## Logging Out and In
If you type <code>exit</code> (or Ctrl-D) at the shell prompt inside the VM, you will be logged out, and put back into your host computer's shell. To log back in, make sure you're in the same directory and type <code>vagrant ssh</code> again.

If you reboot your computer, you will need to run <code>vagrant up</code> to restart the VM.

If you run into any problems, please review this [link](https://classroom.udacity.com/courses/ud197/lessons/3423258756/concepts/14c72fe3-e3fe-4959-9c4b-467cf5b7c3a0) for troubleshooting tips.

### Starting Our Forum
The first thing we must do is cd into the forum directory. Once there we run the command <code>python forum.py</code>.
![](img/db.PNG)
The forum seems to work as expected: when we submit a post, it stores our message and the time of posting. But when we restart the web server, we see that all our posts are gone! If we look into our forumdb.py, we see that our <code>POSTS</code> are stored as just a plain old variable. Plain old variables aren't a database and will go away once the program ends. Let's work on making some changes to this code.

### Hello PSQL
[PostgreSQL](http://www.postgresql.org/docs/9.4/static/app-psql.html) can be accessed with <code>psql</code> on the command line.

To connect psql to a database running on the same machine (such as your VM), all you need to give it is the database name. For instance, the command psql forum will connect to the forum database.

From within psql, you can run any SQL statement using the tables in the connected database. Make sure to end SQL statements with a semicolon, which is not always required from Python.

You can also use a number of special psql commands to get information about the database and make configuration changes. The \d posts command shown in the video is one example — this displays the columns of the posts table.

Some other things you can do:

<code>\dt</code> — list all the tables in the database.

<code>\dt+</code> — list tables plus additional information (notably, how big each table is on disk).

<code>\H</code> — switch between printing tables in plain text vs. HTML.

### Give that App a Backend
The forum database has already been created for you in the virtual machine that you downloaded. Your code will need to connect to it using psycopg2.connect("dbname=forum") and then perform select and insert operations on the posts table.

The existing get_posts function returns all the entries from a list. So its database version should return all the entries from the posts table.

And likewise, the existing add_post function inserts an entry into a list.

You do not need to provide the time column when you insert a post. The table is set up to already provide a timestamp.

The existing get_posts function puts the posts in order using a Python reversed function. When you implement this function using the database, can you put the posts in order using only SQL?
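
As a hint, the two ideas above (a column that fills in its own timestamp, and ordering done by SQL) can be tried out with sqlite3. This is only a sketch: the table definition is a hypothetical stand-in, not the actual schema of the posts table in the VM, and the explicit timestamps exist only to make the ordering easy to see.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in schema: the time column defaults to "now".
conn.execute(
    "CREATE TABLE posts (content TEXT, time TEXT DEFAULT CURRENT_TIMESTAMP)"
)

# No time supplied: the database fills it in.
conn.execute("INSERT INTO posts (content) VALUES (?)", ("no time given",))
# Explicit older timestamps, only to make the ordering visible.
conn.execute("INSERT INTO posts VALUES (?, ?)", ("older", "2020-01-01 00:00:00"))
conn.execute("INSERT INTO posts VALUES (?, ?)", ("oldest", "2019-01-01 00:00:00"))
conn.commit()

# Let SQL do the sorting instead of Python's reversed().
ordered = [row[0] for row in conn.execute(
    "SELECT content FROM posts ORDER BY time DESC")]
conn.close()
print(ordered)  # ['no time given', 'older', 'oldest']
```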

Initially our forumdb file looked like this:

In [None]:
import datetime

POSTS = [("This is the first post.", datetime.datetime.now())]

def get_posts():
  """Return all posts from the 'database', most recent first."""
  return reversed(POSTS)

def add_post(content):
  """Add a post to the 'database' with the current timestamp."""
  POSTS.append((content, datetime.datetime.now()))


We want a real backend, so the first step is to connect to the database. Once connected, we create a cursor and then execute our query. We would like to get the content and time of all the posts in our database, with the most recent ones first. We store the results from the query in posts and close our connection.

When we add a post we must remember to commit our changes. We insert the content of our post and then commit before disconnecting from the server.
The code below is found in the solutions under forumdb_step_one.

In [None]:
import psycopg2

DBNAME = "forum"  # name of the forum database in the VM

def get_posts():
  """Return all posts from the 'database', most recent first."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("select content, time from posts order by time desc")
  posts = c.fetchall()
  db.close()
  return posts

def add_post(content):
  """Add a post to the 'database' with the current timestamp."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("insert into posts values ('{}')".format(content)) # Almost but not quite.
  db.commit()
  db.close()


Everything looks to be working fine if the input from the user is as expected, but we seem to have missed something. If we enter a message containing <code>'</code>, we get a syntax error in our query. This causes our forum to crash.

![](img/db2.PNG)

Another message we can try is <code>'); delete from posts; --</code>. This exploits a famous security bug, SQL injection, and will delete our messages from the posts table.

![](img/db3.PNG)
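
To see why this input is so destructive, it helps to print the SQL text that string formatting builds from it. The snippet only builds the string and does not touch any database; it reuses the query template from add_post.

```python
# The malicious message typed into the forum.
content = "'); delete from posts; --"

# The same string formatting that add_post uses.
query = "insert into posts values ('{}')".format(content)
print(query)
# insert into posts values (''); delete from posts; --')
```

The quote in the message closes the string literal early, the semicolon ends the insert, and the rest runs as a second statement. The trailing <code>--</code> turns the leftover <code>')</code> into a comment, so the database sees no syntax error.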

Looks like that post deleted all our other posts. What can we do to avoid this kind of bug?

![](img/db4.PNG)


We want to make sure to use query parameters instead of string substitution. Let's change our code and see how this may look. You can find the code in forumdb_step_two.py in the solutions directory.

In [None]:
def get_posts():
  """Return all posts from the 'database', most recent first."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("select content, time from posts order by time desc")
  posts = c.fetchall()
  db.close()
  return posts

def add_post(content):
  """Add a post to the 'database' with the current timestamp."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("insert into posts values (%s)", (content,))  # Better, but ...
  db.commit()
  db.close()

Let's see what our results are now when we submit <code>'); delete from posts; --</code>.

![](img/db5.PNG)

Looks like we may be safe from SQL injection attacks.

What happens when we post the following code?
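
The same behavior can be checked without the VM using sqlite3. This is a sketch with a simplified, hypothetical one-column posts table. Note that sqlite3 writes its placeholder as <code>?</code>, while psycopg2 uses <code>%s</code>; the idea is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (content TEXT)")  # simplified stand-in table
conn.execute("INSERT INTO posts VALUES (?)", ("first post",))

malicious = "'); delete from posts; --"
# The driver passes the value separately from the SQL text,
# so it is stored as plain data and never parsed as SQL.
conn.execute("INSERT INTO posts VALUES (?)", (malicious,))
conn.commit()

stored = [row[0] for row in conn.execute(
    "SELECT content FROM posts ORDER BY rowid")]
conn.close()
print(stored)  # ['first post', "'); delete from posts; --"]
```

Both rows survive: the earlier post is still there, and the attack string is saved as an ordinary message.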

```
<script>
setTimeout(function() {
    var tt = document.getElementById('content');
    tt.value = "<h2 style='color: #FF6699; font-family: Comic Sans MS'>Spam, spam, spam, spam,<br>Wonderful spam, glorious spam!</h2>";
    tt.form.submit();
}, 2500);
</script>
```

![](img/db6.PNG)

Looks like our page is posting spam every time we load it. Why might this be happening?

![](img/db7.PNG)

Our database and our web server are handling the post just fine, but our browser thinks that it is code. This is called a script injection attack. Real web forums don't allow JavaScript code in their comments.

### Stop Spam
[Bleach](https://bleach.readthedocs.io/en/latest/) is an allowed-list-based HTML sanitizing library that escapes or strips markup and attributes. We can use it to clean our data.

Do you think it is better to clean bad stuff out of our post before our code stores it in the database, or to store whatever the user sends and clean the bad stuff out before we display it?

The answer is: it depends. We like input sanitization, where we clean our data before storing it, so we don't have to worry about problems later, even if we want to use the data with a different interface. On the other hand, if we want an accurate record of what users have sent us, we would save exactly what they sent and clean it on output. In either case, we would also like to make sure that bad inputs were not already present in our database before we got started.
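
For comparison, the clean-on-output approach might look like the sketch below. It swaps in the standard library's <code>html.escape</code> instead of Bleach so that it runs without extra packages. The <code>render_post</code> function and its HTML wrapper are hypothetical and not part of forum.py.

```python
import html

def render_post(content):
    """Hypothetical display helper: escape stored text just before showing it."""
    # html.escape turns <, >, & and quotes into HTML entities,
    # so the browser shows the text instead of running it.
    return "<div class='post'>{}</div>".format(html.escape(content))

raw = "<script>alert('spam')</script>"   # stored exactly as the user sent it
rendered = render_post(raw)
print(rendered)
```

With this approach the database keeps the original message, but every place that displays a post has to remember to escape it.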
 
In our code we start with cleaning our input.

In [None]:
import bleach

def get_posts():
  """Return all posts from the 'database', most recent first."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("select content, time from posts order by time desc")
  posts = c.fetchall()
  db.close()
  return posts

def add_post(content):
  """Add a post to the 'database' with the current timestamp."""
  db = psycopg2.connect(database=DBNAME)
  c = db.cursor()
  c.execute("insert into posts values (%s)", (bleach.clean(content),))  # good
  db.commit()
  db.close()
