# Module 8 Assignment


A few things you should keep in mind when working on assignments:

1. Run the first code cell to import modules needed by this assignment before proceeding to problems.
2. Make sure you fill in any place that says `# YOUR CODE HERE`. Do not write your answer anywhere other than where it says `# YOUR CODE HERE`; anything you write elsewhere will be removed or overwritten by the autograder.
3. Each problem has an autograder cell below the answer cell. Run the autograder cell to check your answer. If there's anything wrong in your answer, the autograder cell will display error messages.
4. Before you submit your assignment, make sure everything runs as expected. In the menubar, select Kernel → Restart & Run All. If the notebook runs through the last code cell without any error messages, you've answered all problems correctly.
5. Make sure that you save your work (in the menubar, select File → Save and Checkpoint).

-----

# Run Me First!

In [1]:
import sqlite3 as sql
import pandas as pd

from nose.tools import assert_equal, assert_true

-----

## Problem 1: Creating a Table

In the code cell below, we declare a function named `create` that takes one function parameter: `file_path`, which is a string that contains the file path for the database file you will create.

To complete this problem, finish writing the function `create`:
- establish a sqlite3 connection to the database at `file_path`,
- create a cursor from that connection,
- use this cursor to drop the `Checker` table if it exists,
- use this cursor to create the `Checker` table with the following column:
    - `cNum`, of type `INT`, which must not accept NULL values,
- return the sqlite3 connection and the cursor object.

-----

In [2]:
# connect to a database and create a table
def create(file_path):
    '''
    Connects to the database at file_path (creating the file if needed) and creates the Checker table.
    
    Parameters
    ----------
    file_path: string containing path to create database
    
    Returns
    -------
    con: sqlite3 connection
    cur: sqlite3 cursor
    '''
    
    ###BEGIN SOLUTION###
    con = sql.connect(file_path)
    cur = con.cursor() 
    query = 'DROP TABLE IF EXISTS Checker;'
    cur.execute(query)
    query = '''
    CREATE TABLE Checker (
        cNum INT NOT NULL
    );
    '''
    cur.execute(query)
    return con, cur
    ###END SOLUTION###

In [3]:
con, cur = create('sql_files/m12-high.db')
cur.execute("PRAGMA table_Info('Checker')")
result = cur.fetchall()

assert_true(result[0][1].upper()=='CNUM', msg='No cNum column created')
assert_true(result[0][2].upper()=='INT', msg='cNum should be of type integer')
assert_true(result[0][3]==1, msg='cNum should be not null')
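
The autograder cell above indexes into the rows returned by `PRAGMA table_info`. As a minimal sketch of what those rows look like, the example below uses an in-memory database and a hypothetical table `Demo` with a hypothetical column `dNum` (neither is part of the assignment). Each row has the form `(cid, name, type, notnull, dflt_value, pk)`, which is why the checks read positions 1, 2, and 3.

```python
import sqlite3

# In-memory database, so no file is created (illustration only)
demo_con = sqlite3.connect(':memory:')
demo_cur = demo_con.cursor()

# Hypothetical table and column names, chosen for this sketch
demo_cur.execute('DROP TABLE IF EXISTS Demo;')
demo_cur.execute('CREATE TABLE Demo (dNum INT NOT NULL);')

# One row per column: (cid, name, type, notnull, dflt_value, pk)
demo_cur.execute("PRAGMA table_info('Demo')")
rows = demo_cur.fetchall()
print(rows)  # [(0, 'dNum', 'INT', 1, None, 0)]

demo_con.close()
```

Because `DROP TABLE IF EXISTS` runs first, the cell can be re-executed without raising a "table already exists" error.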

-----

## Problem 2: Reading In Data

In the code cell below, we declare a function named `read` that takes one function parameter: `read_path`, which is a string that contains the file path for the **csv** file you will use.

To complete this problem, finish writing the function `read`:
- use Pandas `read_csv()` to read data from the csv file located at `read_path`,
- return the resulting Pandas DataFrame.
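
To illustrate what `read_csv()` returns, here is a small sketch that reads CSV text from an in-memory buffer instead of a file. The CSV content and the name `demo_df` are made up for the illustration and are unrelated to the assignment data.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a file on disk
csv_text = """dept,course_num,course_name
DEMO,101,Intro to Demo
DEMO,202,Advanced Demo
"""

# read_csv accepts a file path or any file-like object;
# the first line is used as the column names by default
demo_df = pd.read_csv(io.StringIO(csv_text))

print(demo_df.columns.tolist())  # ['dept', 'course_num', 'course_name']
print(len(demo_df))              # 2
```

Note that the numeric column is parsed as integers, which is why the autograder for this problem can compare `course_num` values against plain Python ints.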

-----

In [4]:
def read(read_path):
    '''
    Read data from csv file "read_path" by using Pandas
    
    Parameters
    ----------
    read_path: string containing the file path from which to read data
    
    Returns
    -------
    Pandas DataFrame
    '''
    
    ###BEGIN SOLUTION###
    df = pd.read_csv(read_path)
    return df
    ###END SOLUTION###

In [5]:
df = read('data/ACCY_desc.csv')
assert_equal(set(df.columns.tolist()),
             set(['year', 'semester', 'dept', 'course_num', 'course_name', 'description']),
             msg='Your column names do not match the solutions')
assert_equal(len(df), 36, msg='There are 36 unique ACCY courses offered in Fall 2018 not %s'%(len(df)))

assert_equal(set(df['course_num'].tolist()),
            set([199, 200, 201, 202, 290, 301, 302, 303,
                 304, 312, 321, 398, 405, 410, 415, 451,
                 499, 500, 501, 502, 503, 504, 505, 510,
                 512, 515, 517, 518, 551, 556, 570, 585,
                 592, 593, 594, 599]),
             msg='Your answer does not match the solutions.')

print("First five ACCY Courses offered Fall 2018 at UIUC:")
df.head(5)

First five ACCY Courses offered Fall 2018 at UIUC:


| | year | semester | dept | course_num | course_name | description |
|---|---|---|---|---|---|---|
| 0 | 2018 | Fall | ACCY | 199 | Undergraduate Open Seminar | May be repeated. |
| 1 | 2018 | Fall | ACCY | 200 | Fundamentals of Accounting | Survey course in the principles of accounting ... |
| 2 | 2018 | Fall | ACCY | 201 | Accounting and Accountancy I | Introduction to the role of accounting informa... |
| 3 | 2018 | Fall | ACCY | 202 | Accounting and Accountancy II | Continuation of ACCY 201 with focus on strate... |
| 4 | 2018 | Fall | ACCY | 290 | Prof Internship in Accountancy | Formalized learning experience in combination ... |


-----

## Problem 3: Inserting Data

In the code cell below, we declare a function named `write_inserts` that takes two function parameters: `con`, which is a sqlite3 connection object, and `df`, which is the Pandas DataFrame created in Problem 2.

To complete this problem, finish writing the function `write_inserts`:
- insert all values from `df` into the `Courses` table,
- **do not** write the index of `df` as a column in `Courses`,
- replace the `Courses` table if it already exists in the database.
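
The three requirements above map onto arguments of `DataFrame.to_sql`. A minimal sketch on a toy DataFrame and an in-memory database follows; the table name `Demo` and the variable names are hypothetical and chosen only for this illustration.

```python
import sqlite3

import pandas as pd

demo_df = pd.DataFrame({'dept': ['ACCY', 'ACCY'], 'course_num': [201, 202]})
demo_con = sqlite3.connect(':memory:')

# Write twice: if_exists='replace' drops and recreates the table,
# so the second call does not fail and does not duplicate rows.
# index=False keeps the DataFrame index out of the table.
demo_df.to_sql('Demo', demo_con, index=False, if_exists='replace')
demo_df.to_sql('Demo', demo_con, index=False, if_exists='replace')

demo_cur = demo_con.cursor()
demo_cur.execute('SELECT * FROM Demo')
rows = demo_cur.fetchall()
col_names = [d[0] for d in demo_cur.description]

print(col_names)  # ['dept', 'course_num']
print(rows)       # [('ACCY', 201), ('ACCY', 202)]

demo_con.close()
```

With the default `if_exists='fail'`, the second call would raise a `ValueError` instead; with `index=True`, an extra `index` column would appear in the table.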

-----

In [6]:
def write_inserts(con, df):
    '''
    Writes the DataFrame to the Courses table, replacing the table if it exists.
    
    Parameters
    ----------
    con: sqlite3 connection
    df: pandas dataframe
    
    Returns
    -------
    None
    '''
    
    ###BEGIN SOLUTION###
    df.to_sql('Courses', con, index=False, if_exists='replace')
    ###END SOLUTION###       

In [7]:
write_inserts(con, df)
query = "SELECT year, semester, dept FROM Courses"
cur.execute(query)
sol = set(cur.fetchall())
ans = set([(2018, 'Fall', 'ACCY')])

assert_equal(ans, sol, msg="Your answer does not match the solution")

-----

## Problem 4: Selecting 500-Level ACCY Courses

In the code cell below, we declare a function named `select_500` that takes one function parameter: `con`, which is a sqlite3 connection object.

To complete this problem, finish writing the function `select_500`:
- load all courses in the `Courses` table with `course_num` greater than or equal to 500 into a dataframe,
- return the dataframe.
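
As an illustration of filtering with `WHERE` and loading the result into a dataframe, the sketch below runs against a toy in-memory table. The table name `Demo` and its values are hypothetical. It also passes the threshold through a `?` placeholder, which is an optional alternative to writing the number directly in the SQL string.

```python
import sqlite3

import pandas as pd

demo_con = sqlite3.connect(':memory:')
pd.DataFrame({'course_num': [199, 499, 500, 599]}).to_sql(
    'Demo', demo_con, index=False)

# >= includes the boundary value 500 itself;
# the ? placeholder is filled from the params tuple
query = 'SELECT * FROM Demo WHERE course_num >= ? ORDER BY course_num'
demo_500 = pd.read_sql(query, demo_con, params=(500,))

print(demo_500['course_num'].tolist())  # [500, 599]

demo_con.close()
```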

-----

In [8]:
def select_500(con):
    '''
    Selects ACCY courses that are 500 level
    
    Parameters
    ----------
    con: sqlite3 connection
    
    Returns
    -------
    a dataframe that contains the course info
    '''
    
    ###BEGIN SOLUTION###
    df = pd.read_sql("SELECT * FROM Courses WHERE course_num >= 500", con)
    return df
    ###END SOLUTION###

In [9]:
df500 = select_500(con)

assert_equal(df500.shape[0], 19, msg="Your answer does not match the solution")

-----

## Problem 5: Selecting Course by Descriptions

In the code cell below, we declare a function named `select_by_desc` that takes one function parameter: `con`, which is a sqlite3 connection object.

To complete this problem, finish writing the function `select_by_desc`:
- load every course whose description **contains** 'Analytics' from the `Courses` table into a dataframe,
- return the dataframe.
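
A "contains" test in SQL is usually written with `LIKE` and the `%` wildcard on both sides of the search text. The sketch below shows this on a toy in-memory table; the table name `Demo` and the descriptions are made up for the illustration. In SQLite, `LIKE` is case-insensitive for ASCII letters by default, so the lowercase "analytics" in the first row matches as well.

```python
import sqlite3

import pandas as pd

demo_con = sqlite3.connect(':memory:')
pd.DataFrame({
    'course_num': [1, 2, 3],
    'description': ['Data analytics for accountants',
                    'Introduction to tax law',
                    'Analytics foundations'],
}).to_sql('Demo', demo_con, index=False)

# % matches any run of characters (including none),
# so '%Analytics%' means "contains Analytics"
query = ("SELECT course_num FROM Demo "
         "WHERE description LIKE '%Analytics%' ORDER BY course_num")
matches = pd.read_sql(query, demo_con)

print(matches['course_num'].tolist())  # [1, 3]

demo_con.close()
```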

-----

In [10]:
def select_by_desc(con):
    '''
    Grabs the courses if description contains 'Analytics'
    
    Parameters
    ----------
    con: sqlite3 connection
    
    Returns
    -------
    a dataframe
    '''
    
    ###BEGIN SOLUTION###
    df = pd.read_sql("SELECT * FROM Courses WHERE description LIKE '%Analytics%'", con)
    return df
    ###END SOLUTION###

In [11]:
df_analytics = select_by_desc(con)
assert_true(df_analytics.shape[0]==2, msg="Your answer is wrong.")
assert_true(570 in df_analytics.course_num.values, msg="Your answer is wrong.")
assert_true(512 in df_analytics.course_num.values, msg="Your answer is wrong.")
con.close()