<a href="https://colab.research.google.com/github/casstrottter/COMP-593/blob/main/WEEK_3_WORKING_WITH_DATABASES.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# LAB 3: WORKING WITH DATABASES


## Introduction

During this lab, we will experiment with a few different libraries and modules to connect to and interact with an SQLite database. After creating and configuring our test database, we will populate it with fake data and run some queries to see what kind of information we can retrieve. For context, we will be creating RSSN, the "Really Simple Social Network".

## Creating a SQLite Database

We will use the `sqlite3` library to create our database file, as it conveniently creates the file automatically if it cannot find an existing one at the path specified.

**Before you run the below code:**
On the left of the notebook, select the file folder icon. Then, after you run the code block, you should be able to observe the creation of the database file. If it doesn't appear, try clicking the "Refresh" button above the list of folders.

In [1]:
import sqlite3

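#When we retrieve a Connection object, a new database will be created for us if it doesn't already exist.
myConnection = sqlite3.connect('social_network.db')
#Print the version of the sqlite3 module itself (the SQLite library version is available as sqlite3.sqlite_version)
print(sqlite3.version)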

2.6.0
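
As a quick check of the behavior described above, the following sketch connects to a database path inside a temporary directory and confirms that the file exists afterwards. The file name `demo.db` and the use of a temporary directory are illustrative assumptions, chosen so that the lab's own `social_network.db` is left untouched.

```python
import os
import sqlite3
import tempfile

# Work in a throwaway directory so the lab's own database file is not affected.
with tempfile.TemporaryDirectory() as tmp_dir:
    db_path = os.path.join(tmp_dir, "demo.db")  # hypothetical file name

    existed_before = os.path.exists(db_path)

    # connect() creates the database file if it does not already exist.
    connection = sqlite3.connect(db_path)
    connection.close()

    existed_after = os.path.exists(db_path)

print("Existed before connect():", existed_before)
print("Existed after connect(): ", existed_after)
```

In Colab, the same effect is what makes `social_network.db` appear in the Files panel after the first cell runs.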


## Creating a Table


We'll use the below code to create a table called `people` within our database. 

In [2]:
import sqlite3

#Retrieve the Connection object
myConnection = sqlite3.connect('social_network.db')

#Once we have a Connection object, we can generate a Cursor object, and use that to run our SQL Queries
myCursor = myConnection.cursor()

#Let's define the SQL Query we will use to create our first table:
createPeopleTable = """ CREATE TABLE IF NOT EXISTS people (
                          id integer PRIMARY KEY,
                          name text NOT NULL,
                          email text NOT NULL,
                          address text NOT NULL,
                          city text NOT NULL,
                          province text NOT NULL,
                          country text NOT NULL,
                          phone text,
                          bio text,
                          dob date NOT NULL,
                          heatmap integer,
                          created_at datetime NOT NULL,
                          updated_at datetime NOT NULL,
                          ipv4 text
                        );"""

#Now that we have the SQL string for creating our table, we can run it.
#Cursor objects have an execute() method which accepts an SQL string and performs the operations described.

myCursor.execute(createPeopleTable)

#We can confirm that our table was created successfully by running the following SQL Query
#pragma_table_info is a built-in SQLite table-valued function that returns information about a table's columns
myCursor.execute("SELECT group_concat(name, ', ') FROM pragma_table_info('people')")
print(myCursor.fetchone())

#We use the commit() method on the database Connection object to persist our changes
myConnection.commit()

#It is always a good idea to close a connection when it will no longer be used
myConnection.close()


('id, name, email, address, city, province, country, phone, bio, dob, heatmap, created_at, updated_at, ipv4',)



If you received a tuple containing the names of the columns, awesome! We have successfully created our database table.
```
('id, name, email, address, city, province, country, phone, bio, dob, heatmap, created_at, updated_at, ipv4',)
```
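
If we want more than the column names, the `PRAGMA table_info` statement returns one row per column, including its declared type and constraints. The sketch below is a minimal illustration under stated assumptions: it uses an in-memory database and a shortened, three-column version of the `people` table so that it does not touch `social_network.db`.

```python
import sqlite3

# An in-memory database keeps this sketch separate from social_network.db.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# A shortened, illustrative version of the people table.
cursor.execute(
    """CREATE TABLE IF NOT EXISTS people (
           id integer PRIMARY KEY,
           name text NOT NULL,
           heatmap integer
       );"""
)

# Each row describes one column: (cid, name, type, notnull, dflt_value, pk).
columns = cursor.execute("PRAGMA table_info(people)").fetchall()
for cid, name, col_type, notnull, default, pk in columns:
    print(name, col_type, "NOT NULL" if notnull else "", "PRIMARY KEY" if pk else "")

connection.close()
```

Running the same `PRAGMA table_info(people)` statement against the lab database should list all fourteen columns defined above.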

Run the below code block to add our first entry.

In [3]:
import sqlite3
from pprint import pprint #Outputs data in a slightly easier to read format
from datetime import datetime #For generating dates and times

#Retrieve the Connection object
myConnection = sqlite3.connect('social_network.db')

#Once we have a Connection object, we can generate a Cursor object, and use that to run our SQL Queries
myCursor = myConnection.cursor()

#Let's define the SQL Query we will use to create our first entry:
addPersonQuery = """INSERT INTO people (name, 
                      email, 
                      address, 
                      city, 
                      province, 
                      country, 
                      phone, 
                      bio,
                      dob,
                      heatmap,
                      created_at, 
                      updated_at, 
                      ipv4)
                  VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);"""

"""
The ?'s are placeholders that we can fill in when we use the execute() method.
This is really handy for code reuse, as we can pass those values as variables in a tuple
instead of hard coding them into the statement.
"""

myPerson = ("John Doe", 
            "johndoe@anon.null", 
            "123 Fake St.", 
            "Fakesville", 
            "Fakesdom", 
            "Fakopolis", 
            None, 
            None,
            "1967-09-12",
            0,
            datetime.now(), 
            datetime.now(),
            None)

myCursor.execute(addPersonQuery, myPerson)

#We can confirm that our entry was added successfully by selecting every row in the table
myCursor.execute("SELECT * FROM people")
pprint(myCursor.fetchall())

# Without the commit() call below, the new row is discarded when the connection closes,
# so re-running this block would always show a single entry.
# With commit() in place, each run persists one more row, and you will begin to see multiple entries.
myConnection.commit()
myConnection.close()

[(1,
  'John Doe',
  'johndoe@anon.null',
  '123 Fake St.',
  'Fakesville',
  'Fakesdom',
  'Fakopolis',
  None,
  None,
  '1967-09-12',
  0,
  '2021-02-11 18:43:23.912379',
  '2021-02-11 18:43:23.912390',
  None)]


# Lab Submission

We're going to build our experience working with libraries and reading documentation by populating our `people` table with data provided by the `Faker` library. `Faker` generates fake data and is very helpful for rapidly populating databases for testing. Run the two blocks below to install Faker and get an idea of what it can do.

In [4]:
!pip install faker

Collecting faker
  Downloading https://files.pythonhosted.org/packages/4e/6a/591bea01ef396a4611b2097af19aa86975ebef06a4bb571a8a25ba36cf9a/Faker-6.1.1-py3-none-any.whl (1.1MB)

In [5]:
from faker import Faker

fake = Faker()

for _ in range(10):
  print('{} || {}'.format( fake.name(), fake.job() ) )

Teresa Gallegos || Estate agent
Maria Johnson || Psychiatrist
Crystal Durham || Stage manager
Michael Shepherd MD || Garment/textile technologist
Christopher Marsh || Civil Service fast streamer
Vincent Parker || Doctor, general practice
Gloria Cochran || Interior and spatial designer
Tina Brown || Solicitor
David Powell || Health and safety inspector
Mr. Brian Williams || Printmaker


Very cool! Faker has tons of `providers` that can all be used to generate fake data. The [list of providers](https://faker.readthedocs.io/en/stable/providers.html) in the Faker documentation will help you fill out the columns of our `people` table.

The goal of this script is to populate the people table with 1000 entries, with the following constraints:

1. The `heatmap` column must contain a random number between `999` and `2500`.
2. The `created_at` and `updated_at` columns must use the `datetime` object (see the examples above).
3. Use `Faker` to generate all other fields.
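
The first two constraints need only the standard library. The sketch below is one possible way to produce those values, assuming that "between `999` and `2500`" includes both endpoints (which is how `randint` behaves). The variable names are illustrative.

```python
from datetime import datetime
from random import randint

# randint() includes both endpoints, so 999 and 2500 are possible values.
heatmap = randint(999, 2500)

# Take one timestamp and reuse it, so a new row starts with matching values.
now = datetime.now()
created_at = now
updated_at = now

print(heatmap, created_at, updated_at)
```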

*Hint: Each of the methods contained in the provider can be called directly from the base Faker object, for example, one can call the `file_name()` method from `faker.providers.file` by calling `Faker().file_name()`*

In [14]:
import sqlite3
from faker import Faker
from datetime import datetime #For generating dates and times
from random import randint #For generating random numbers
from pprint import pprint #Outputs data in a slightly easier to read format

#Retrieve the Connection object
myConnection = sqlite3.connect('social_network.db')

#To generate a Cursor object
myCursor = myConnection.cursor()

#Defining the SQL Query
addPersonQuery = """INSERT INTO people (name, 
                      email, 
                      address, 
                      city, 
                      province, 
                      country, 
                      phone, 
                      bio,
                      dob,
                      heatmap,
                      created_at, 
                      updated_at, 
                      ipv4)
                  VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?);"""

fake = Faker()

#Loop to populate the table with 1000 entries
for _ in range(1000):
  myPerson = (fake.name() , 
            fake.ascii_email() , 
            fake.street_address() , 
            fake.city() , 
            fake.state() , 
            fake.country() , 
            fake.phone_number() , 
            fake.text() ,
            fake.date_of_birth() ,
            randint(999,2500) ,
            datetime.now() , 
            datetime.now() ,
            fake.ipv4() )

  myCursor.execute(addPersonQuery, myPerson)

#To confirm the rows were inserted successfully
myCursor.execute("SELECT * FROM people")
pprint(myCursor.fetchall())

#Commit and close connection
myConnection.commit()
myConnection.close()

Streaming output truncated to the last 5000 lines.
  'greenephilip@martinez-bauer.org',
  '292 Brooks Inlet',
  'South Jenniferville',
  'Georgia',
  'Tajikistan',
  '583.727.2179',
  'Get inside like media. Who east also yes.\n'
  'Research claim drug serious. Great reach us old. Remember morning feel '
  'any.\n'
  'Politics serve bag show a answer class.',
  '1981-12-05',
  1160,
  '2021-02-11 19:35:01.657011',
  '2021-02-11 19:35:01.657012',
  '108.208.121.210'),
 (685,
  'Bailey Baker DDS',
  'anthony90@hotmail.com',
  '45159 Emily Hill Apt. 068',
  'Curtisfurt',
  'Colorado',
  'Albania',
  '935.938.6368x78109',
  'Road message political daughter memory stuff already. Billion but others TV '
  'if whom camera.\n'
  'Gas in tonight push look water their. Standard per history vote available.',
  '2000-06-12',
  1480,
  '2021-02-11 19:35:01.657752',
  '2021-02-11 19:35:01.657753',
  '154.145.79.87'),
 (686,
  'Jenna Evans',
  'friley@cook.net',
  '9364 Burns Dam Suite 
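
The loop above calls `execute()` once per row. As an alternative sketch, `executemany()` accepts a sequence of tuples and runs the same parameterized statement for each one, and a `COUNT(*)` query confirms the result without printing every row. The rows here are hypothetical stand-ins (in the lab, the values would come from Faker), and the three-column in-memory table is a shortened assumption so that `social_network.db` is not modified.

```python
import sqlite3
from random import randint

# An in-memory database keeps this sketch separate from social_network.db.
connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE people (id integer PRIMARY KEY, name text NOT NULL, heatmap integer)"
)

# Hypothetical stand-in rows; in the lab these values would come from Faker.
rows = [("Person {}".format(i), randint(999, 2500)) for i in range(1000)]

# executemany() runs the same parameterized INSERT once per tuple.
connection.executemany("INSERT INTO people (name, heatmap) VALUES (?, ?)", rows)
connection.commit()

# Counting rows is a compact way to confirm the inserts.
row_count = connection.execute("SELECT COUNT(*) FROM people").fetchone()[0]
print(row_count)

connection.close()
```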

Finally, the last step: using the code block below, combined with what you have learned above and the lecture notes, craft a SQL query that will return the `name` of no more than `20` `people` with a `heatmap` greater than `1500`.

In [18]:
import sqlite3
from pprint import pprint
#Retrieve the Connection object
myConnection = sqlite3.connect('social_network.db')

#Once we have a Connection object, we can generate a Cursor object, and use that to run our SQL Queries
myCursor = myConnection.cursor()

selectStatement = """SELECT * FROM people
WHERE heatmap > 1500
LIMIT 20 """

myCursor.execute(selectStatement)
results = myCursor.fetchall()
pprint(results)

[(2,
  'Tiffany Harvey',
  'fwhite@gmail.com',
  '051 Marquez Hill',
  'Port Christopher',
  'Oregon',
  'Taiwan',
  '148.121.3403',
  'Out interesting whatever direction. Trade media according avoid research '
  'serious smile. Finish fill speech record them.',
  '1974-09-08',
  1962,
  '2021-02-11 19:35:01.031086',
  '2021-02-11 19:35:01.031088',
  '139.185.71.15'),
 (3,
  'Brenda Bass',
  'ljackson@gmail.com',
  '609 Mejia Underpass',
  'New Tracyland',
  'Michigan',
  'Vanuatu',
  '001-045-147-9440x991',
  'Everybody very hand enjoy foot.\n'
  'Yet make before mother. Admit represent rather task here necessary anyone.\n'
  'Successful action within raise stop. Foreign figure everyone name appear '
  'through tell.',
  '1996-02-19',
  2261,
  '2021-02-11 19:35:01.032130',
  '2021-02-11 19:35:01.032133',
  '147.90.9.137'),
 (4,
  'Charles Mccullough',
  'brandonbrown@yahoo.com',
  '04500 Tony Expressway',
  'South Lynn',
  'Maine',
  'Mali',
  '835.596.6189',
  'Realize could others 
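
The task statement asks only for the `name` column. As a minimal sketch under that reading, the query below selects `name` alone and binds the threshold and the row limit as `?` placeholders. It runs against a small in-memory table with made-up rows (the `Person N` names and heatmap values are illustrative assumptions, not lab data), so it does not touch `social_network.db`.

```python
import sqlite3

# An in-memory database keeps this sketch separate from social_network.db.
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE people (id integer PRIMARY KEY, name text NOT NULL, heatmap integer)"
)

# 60 illustrative rows with heatmap values 1000, 1025, ..., 2475.
sample_rows = [("Person {}".format(i), 1000 + i * 25) for i in range(60)]
cursor.executemany("INSERT INTO people (name, heatmap) VALUES (?, ?)", sample_rows)

# Select only the name column; the threshold and limit are passed as parameters.
select_names = """SELECT name FROM people
WHERE heatmap > ?
LIMIT ?"""

cursor.execute(select_names, (1500, 20))
names = [row[0] for row in cursor.fetchall()]
print(len(names))
print(names[:3])

connection.close()
```

Without an `ORDER BY` clause, which 20 matching rows come back is up to SQLite, so only the count and the `heatmap` condition are guaranteed.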

Your submission will contain, as usual, a link to your completed Colab notebook. In addition, download a copy of your `social_network.db` file and upload it to D2L. To download the file, right-click it in the Files menu on the left of the notebook.