# Module 8 Assignment


A few things you should keep in mind when working on assignments:

1. Run the first code cell to import modules needed by this assignment before proceeding to problems.
2. Make sure you fill in any place that says `# YOUR CODE HERE`. Do not write your answer anywhere else other than where it says `# YOUR CODE HERE`. Anything you write elsewhere will be removed or overwritten by the autograder.
3. Each problem has an autograder cell below the answer cell. Run the autograder cell to check your answer. If there's anything wrong in your answer, the autograder cell will display error messages.
4. Before you submit your assignment, make sure everything runs as expected. In the menubar, select Kernel, then Restart & Run All. If the notebook runs through the last code cell without an error message, you've answered all problems correctly.
5. Make sure that you save your work (in the menubar, select File → Save and Checkpoint).

-----

# Run Me First!

In [None]:
import sqlite3 as sql
import pandas as pd

from nose.tools import assert_equal, assert_true

-----

## Problem 1: Get database connection

In the code cell below, we declare a function named `get_connection` that takes one function parameter: `file_path`, which is a string that contains the file path for the database file you will create.

To complete this problem, finish writing the function `get_connection`:
- Establish a sqlite3 connection to the database at `file_path`.
- Return the sqlite3 connection.
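As a rough illustration of how a sqlite3 connection behaves (not the solution to this problem), the sketch below opens a temporary in-memory database and inspects a table with `PRAGMA table_info`. The table `Demo` and the names `demo_con` and `demo_cur` are hypothetical and exist only for this example.

```python
import sqlite3

# ':memory:' is a special path that creates a temporary in-memory database;
# passing a file path instead would open (or create) a database file on disk.
demo_con = sqlite3.connect(':memory:')
demo_cur = demo_con.cursor()

# Hypothetical table, used only for this illustration.
demo_cur.execute("CREATE TABLE Demo (id INTEGER, name TEXT)")

# Each row returned by PRAGMA table_info describes one column;
# the column name is stored at index 1 of the row.
demo_cur.execute("PRAGMA table_info('Demo')")
columns = [row[1] for row in demo_cur.fetchall()]
print(columns)

demo_cur.close()
demo_con.close()
```

The autograder cell for this problem relies on the same idea: it reads the name of the first column of an existing table to confirm that the connection points at the expected database.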

-----

In [None]:
def get_connection(file_path):
    '''
    Establish and return a connection to a database.
    
    Parameters
    ----------
    file_path: string containing path to create database
    
    Returns
    -------
    con: sqlite3 connection
    '''
    
    # YOUR CODE HERE
    

In [None]:
con = get_connection('sql_files/m8-high.db')
cur = con.cursor()
cur.execute("PRAGMA table_Info('Checker')")
result = cur.fetchall()

assert_true(result[0][1].upper()=='CNUM', msg='Database connection is not correct.')

-----

## Problem 2: Read in data

In the code cell below, we declare a function named `read_data` that takes one function parameter: `read_path`, which is a string that contains the file path for the **csv** file you will use.

To complete this problem, finish writing the function `read_data`:
- Use the Pandas `read_csv()` function to read data from the csv file located at `read_path`.
- Return the resulting Pandas DataFrame.
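To see what `read_csv()` returns, here is a minimal sketch that reads a small piece of made-up CSV text instead of a file on disk. The content of `csv_text` and the name `demo_df` are assumptions for illustration only; they are not the assignment data.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "dept,course_num\nDEMO,101\nDEMO,202\n"

# read_csv accepts either a file path or a file-like object;
# the first line of the CSV becomes the column names.
demo_df = pd.read_csv(io.StringIO(csv_text))

print(demo_df.columns.tolist())
print(len(demo_df))
```

With a real file, the path string is passed in place of the `io.StringIO` object.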

-----

In [None]:
def read_data(read_path):
    '''
    Read data from csv file "read_path" by using Pandas
    
    Parameters
    ----------
    read_path: string containing the file path from which to read data
    
    Returns
    -------
    Pandas DataFrame
    '''
    
    # YOUR CODE HERE
    

In [None]:
df = read_data('data/ACCY_desc.csv')
assert_equal(set(df.columns.tolist()),
             set(['year', 'semester', 'dept', 'course_num', 'course_name', 'description']),
            msg='Your column names do not match the solutions')
assert_equal(len(df), 36, msg='There are 36 unique ACCY courses offered in Fall 2018 not %s'%(len(df)))

assert_equal(set(df['course_num'].tolist()),
            set([199, 200, 201, 202, 290, 301, 302, 303,
                 304, 312, 321, 398, 405, 410, 415, 451,
                 499, 500, 501, 502, 503, 504, 505, 510,
                 512, 515, 517, 518, 551, 556, 570, 585,
                 592, 593, 594, 599]),
             msg='Your answer does not match the solutions.')

print("First five ACCY Courses offered Fall 2018 at UIUC:")
df.head(5)

-----

## Problem 3: Insert data

In the code cell below, we declare a function named `insert_data` that takes two function parameters: `con`, which is a sqlite3 connection object, and `df`, which is the Pandas DataFrame created in Problem 2.

To complete this problem, finish writing the function `insert_data`:
- Use the DataFrame method `to_sql()` to insert all values from `df` into the `Courses` table.
- **Do not** write the index of `df` as a column in `Courses`.
- Replace table if `Courses` already exists in the database.
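The effect of the `index` and `if_exists` arguments of `to_sql()` can be sketched on a throwaway in-memory database. The table name `Demo` and the data in `demo_df` are hypothetical; this is an illustration of the two options, not the solution.

```python
import sqlite3

import pandas as pd

demo_con = sqlite3.connect(':memory:')
demo_df = pd.DataFrame({'dept': ['DEMO', 'DEMO'], 'course_num': [101, 202]})

# Write the same DataFrame twice. With if_exists='replace' the table is
# dropped and recreated on each call, so the rows are not duplicated.
# index=False keeps the DataFrame index out of the table.
for _ in range(2):
    demo_df.to_sql('Demo', demo_con, index=False, if_exists='replace')

rows = demo_con.execute(
    "SELECT dept, course_num FROM Demo ORDER BY course_num"
).fetchall()
col_names = [row[1] for row in demo_con.execute("PRAGMA table_info('Demo')")]
print(rows)
print(col_names)

demo_con.close()
```

Without `index=False`, the table would gain an extra `index` column; without `if_exists='replace'`, the second call would raise an error because the table already exists.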

-----

In [None]:
def insert_data(con, df):
    '''
    Insert all rows of a DataFrame into the Courses table.
    
    Parameters
    ----------
    con: sqlite3 connection
    df: pandas dataframe
    
    Returns
    -------
    None
    '''
    
    # YOUR CODE HERE
    

In [None]:
insert_data(con, df)
query = "SELECT year, semester, dept FROM Courses"
cur.execute(query)
result = cur.fetchall()
sol = set(result)
ans = set([(2018, 'Fall', 'ACCY')])

assert_equal(len(result), 36, msg="Your answer does not match the solution.")
assert_equal(ans, sol, msg="Your answer does not match the solution.")

-----

## Problem 4: Select 500-level courses 

In the code cell below, we declare a function named `select_500` that takes one function parameter: `con`, which is the database connection.

To complete this problem, finish writing the function `select_500`:
- Use the Pandas `read_sql()` function to load all courses in the `Courses` table with `course_num` greater than or equal to 500 into a DataFrame.
- Return the DataFrame.
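A filtered query with `read_sql()` might look like the following sketch, which uses a hypothetical `Demo` table of scores rather than the `Courses` table. The names `demo_con` and `high` are assumptions made for this example.

```python
import sqlite3

import pandas as pd

demo_con = sqlite3.connect(':memory:')

# Hypothetical table and rows, used only for this illustration.
demo_con.execute("CREATE TABLE Demo (name TEXT, score INTEGER)")
demo_con.executemany(
    "INSERT INTO Demo VALUES (?, ?)",
    [('a', 40), ('b', 75), ('c', 90)],
)

# read_sql runs the query on the connection and returns the result as a
# DataFrame; the WHERE clause keeps only rows that meet the condition.
high = pd.read_sql(
    "SELECT name, score FROM Demo WHERE score >= 70 ORDER BY score",
    demo_con,
)
print(high)

demo_con.close()
```

The number of matching rows is available as `high.shape[0]`, which is the quantity the autograder cell checks for this problem.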

-----

In [None]:
def select_500(con):
    '''
    Select courses that are 500 level.
    
    Parameters
    ----------
    con: sqlite3 connection
    
    Returns
    -------
    a DataFrame that contains the course info
    '''
    
    # YOUR CODE HERE
    

In [None]:
df500 = select_500(con)
assert_equal(df500.shape[0], 19, msg="Your answer does not match the solution")

In [None]:
# Release the cursor and the database connection
cur.close()
con.close()