<h1>Instructions</h1>
<ol>
<li>Click the button "Not Trusted" on the menu bar of this notebook (at the top-right), and change the value to "Trusted". 
<li>Click Cell -> Run All. If you skip this step you might get the error "Cell magic `%%ra` not found."
<li>In the cell below each question, write a relational algebra (RA) query for the question in <a href="http://docs.mathjax.org/en/latest/tex.html">MathJax</a> (LaTeX) syntax. 
<li>After you enter a query, press Shift + Enter to run the cell. 
<li>After execution, the system will output the query result and say "CORRECT" if the query works for the sample dataset. Otherwise, it will say "INCORRECT" and also display the expected result for your comparison.
<li>Your submission will be tested on a dataset that is different from, and larger than, the sample. You will receive full credit for a question if your query returns the correct result on the test dataset. 
</ol>

<h2>Notes</h2>
<ul>
<li>Supported relational algebra operators:
<ul>
<li>Selection - <code>\sigma\_{cond}(Table)</code> 
<li>Projection - <code>\pi\_{attr1, attr2, ...}(Table)</code>
<li>Theta-join - <code>Table1 \bowtie\_{cond} Table2</code>
<li>Natural join - <code>Table1 \bowtie Table2</code>
<li>Cross product - <code>Table1 \times Table2</code>
<li>Union - <code>Table1 \cup Table2</code>
<li>Difference - <code>Table1 - Table2</code>
<li>Intersection - <code>Table1 \cap Table2</code>
<li>Rename - <code>\rho\_{attr1, attr2, ...}(Table)</code>. The rename operator does not support renaming of tables, only renaming of attributes.
</ul>
<li>The <code>cond</code> expression can use the boolean operators <code>and, or, not</code> (or <code>\land, \lor, \lnot</code>) and the comparison operators <code>&lt;=, &lt;, =, &gt;, &gt;=, &lt;&gt;</code> (or <code>\leq, \geq, \neq</code>). 
<li>The cross product and theta-join operators do not allow the input relations to have attributes with the same name. It is generally a good idea to use <code>\rho</code> before taking the cross product or theta-join of a relation with itself. 
<li>Do not modify the line beginning with <code>%%ra</code>; without it, the cell contents are treated as Python code. 
</ul>
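
The note about using <code>\rho</code> before a self cross product can be illustrated outside the RA system. The following is a minimal Python sketch, not the course's RA implementation: the helper names (<code>rename</code>, <code>cross</code>, <code>select</code>) and the sample rows are made up for illustration, and the cross product is written to reject duplicate attribute names in the way the notes describe.

```python
# Toy relation as a list of dicts; the rows are hypothetical sample data.
students = [
    {"ID": "s1", "Major": "CS"},
    {"ID": "s2", "Major": "CS"},
    {"ID": "s3", "Major": "Math"},
]

def rename(relation, new_names):
    """Positionally rename every attribute, in the spirit of the rho operator."""
    return [dict(zip(new_names, row.values())) for row in relation]

def cross(left, right):
    """Cross product that refuses inputs sharing an attribute name."""
    if left and right:
        shared = set(left[0]) & set(right[0])
        if shared:
            raise ValueError(f"duplicate attribute names: {sorted(shared)}")
    return [{**l, **r} for l in left for r in right]

def select(relation, predicate):
    """Keep only the rows for which the predicate holds."""
    return [row for row in relation if predicate(row)]

# cross(students, students) would raise ValueError, because both inputs
# have attributes named ID and Major. Renaming one copy avoids the clash.
pairs = cross(students, rename(students, ["ID2", "Major2"]))
same_major = select(pairs, lambda r: r["Major"] == r["Major2"] and r["ID"] > r["ID2"])
print(same_major)
```

The same pattern (rename one copy, take the product, then select) is what a self-join looks like in the RA syntax listed above.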

<h2>Example</h2>

As an example, copy and paste the following RA expression into the input cell for one of the questions, and then press Shift + Enter:

<code>\pi\_{Name, Major}(Students)</code>

If the notebook is working properly, you should see the LaTeX rendering $\pi_{Name, Major}(Students)$, along with a table of the students in the database. If not, please contact the TAs and we will do our best to resolve the issue. 

<h1>Database Schema</h1>

<div>
<h3> Departments </h3>
<table>
<tr> <th> Attribute Name</th> <th>Type</th>
<tr> <td><u>Department</u></td> <td> String </td>
<tr> <td>NumberOfStudents</td> <td> Integer </td>
<tr> <td>YearOfEstablishment</td> <td> Integer </td>
</table>
</div>

<div>
<h3> Professors </h3>
<table>
<tr> <th> Attribute Name</th> <th>Type</th>
<tr> <td><u>ID</u></td> <td> String </td>
<tr> <td>Name</td> <td> String </td>
<tr> <td>Dept</td> <td> String </td>
</table>
</div>

<div>
<h3> Students </h3>
<table>
<tr> <th> Attribute Name</th> <th>Type</th>
<tr> <td><u>ID</u></td> <td> String </td>
<tr> <td>Name</td> <td> String </td>
<tr> <td>Major</td> <td> String </td>
<tr> <td>Birthday</td> <td> Date </td>
<tr> <td>Advisor</td> <td> String </td>
</table>
</div>

<div>
<h3> Courses </h3>
<table>
<tr> <th> Attribute Name</th> <th>Type</th>
<tr> <td><u>Number</u></td> <td> Integer </td>
<tr> <td>Title</td> <td> String </td>
<tr> <td>Credit</td> <td> Integer </td>
<tr> <td>Instructor</td> <td> String </td>
</table>
</div>

<div>
<h3> Enrolls </h3>
<table>
<tr> <th> Attribute Name</th> <th>Type</th>
<tr> <td><u>ID</u></td> <td> String </td>
<tr> <td><u>Number</u></td> <td> Integer </td>
<tr> <td>Term</td> <td> String </td>
<tr> <td>Grade</td> <td> Float </td>
</table>
</div>


In [2]:
from IPython.core.magic import  (
    Magics, magics_class, cell_magic, line_magic
)
from IPython.display import clear_output, display, Math, Markdown, Latex

import IPython.core.display as dis
import requests, json

server_url = "http://forward.cs.illinois.edu:8443"

def execQueryEval(query, query_id):
    # Pass the query through `params` so that requests URL-encodes quotes,
    # spaces, and other special characters in the RA expression.
    r = requests.get(server_url + "/get", params={"hello": query, "foo": query_id})
    return r.text

def execQueryResult(query):
    r = requests.get(server_url + "/query", params={"hello": query})
    return r.text

def execActualQuery(query_id):
    r = requests.get(server_url + "/actual", params={"foo": query_id})
    return r.text

def parseResult(text):
    markdown = ""
    lines = text.splitlines()
    if len(lines) < 3:
        return
    header = lines[0]
    desc, _, schema = header.partition(":")
    if desc != "Output schema":
        return
    schema_list = schema.strip().strip("()").split(", ")
    markdown += "|" + "|".join(schema_list) + "|" + "\n"
    if not lines[1].startswith("---"):
        return
    markdown += "|" + "|".join(["---"] * len(schema_list)) + "|" + "\n"
    for line in lines[2:]:
        if line.startswith("---"):
            break
        markdown += "|" + line + "|" + "\n"
    return markdown

def displayAsMarkdown(text):
    markdown = parseResult(text)
    if markdown is not None:
        display(Markdown(markdown))
    else:
        print(text)

# LaTeX-to-RA symbol replacements, applied in order by plain substring
# replacement. A key that is a prefix of another key (e.g. \leq and \leqslant,
# or \ne and \neg) must come after the longer key; otherwise the longer
# command is mangled (\leqslant would become "<=slant", \neg would become "<>g").
replacement = [ (r"\sigma"    , r"\select"),
                (r"\pi"       , r"\project"),
                (r"\bowtie"   , r"\join"),
                (r"\times"    , r"\cross"),
                (r"\cup"      , r"\union"),
                (r"\cap"      , r"\intersect"),
                (r"\setminus" , r"\diff"),
                (r"-"         , r"\diff"),
                (r"\rho"      , r"\rename"),
                (r"=="        , r"="),
                (r"\nless"    , r">="),
                (r"\leqslant" , r"<="),
                (r"\leq"      , r"<="),
                (r"\nleqslant", r">"),
                (r"\nleq"     , r">"),
                (r"\ngtr"     , r"<="),
                (r"\geqslant" , r">="),
                (r"\geq"      , r">="),
                (r"\ngeqslant", r"<"),
                (r"\ngeq"     , r"<"),
                (r"\neq"      , r"<>"), 
                (r"\neg"      , r" NOT "),
                (r"\lnot"     , r" NOT "),
                (r"\ne"       , r"<>"), 
                (r"\land"     , r" AND "),
                (r"\wedge"    , r" AND "),
                (r"&&"        , r" AND "),
                (r"&"         , r" AND "),
                (r"\lor"      , r" OR "),
                (r"||"        , r" OR "),
                (r"|"         , r" OR "),
                (r"\_"        , r"_"),
                (r"\\"        , r" "),
                (r"\quad"     , r" "),
                (r"\qquad"    , r" "),
                (r"\,"        , r" "),
                (r"\:"        , r" "),
                (r"\;"        , r" "),
                (r"\!"        , r" "),
                (r"\ "        , r" "),
                (r"  "        , r" "),
              ]

@magics_class
class RA(Magics):

    @cell_magic
    def ra(self, params, cell):
        clear_output()
        display(Math(cell))
        print("\n")
        
        # remove comments
        query = ""
        for line in cell.splitlines():
            if not line.startswith("%"):
                query = query + line + "\n"
        query = query.strip()
        
        # add semicolon if it does not already exist
        if not query.endswith(';'):
            query = query + ';'
            
        # replace latex special symbols w/ equivalent RA expression
        for key, val in replacement:
            for pattern in (key.lower(), key.upper(), key.title()):
                for _ in range(1000):
                    if pattern not in query:
                        break
                    query = query.replace(pattern, val, 1)
            
        return self.ra_no_latex(params, query)
       
    @cell_magic
    def ra_no_latex(self, params, cell):
        # get the question number
        n = -1
        if len(params.strip()) > 0:
            try:
                n = int(params.strip()) - 1
            except Exception:
                pass
        if n == -1:
            print("Expected question number after '%%ra '")
            return
        
        # evaluate the query (a lone ";" means the cell had no query)
        if len(cell.strip()) > 1:
            print("Query Result")
            displayAsMarkdown(execQueryResult(cell))
            result = execQueryEval(cell, n)
            print(result)
            # tolerate surrounding whitespace in the server's verdict
            if result.strip() != "CORRECT":
                print("Expected Result")
                displayAsMarkdown(execActualQuery(n))

## use ipython load_ext mechanism here if distributed
get_ipython().register_magics(RA)

# hide this code cell
html = """
<script>
  function code_toggle() {
    if (code_shown){
      $('div.input:eq(0)').hide();
    } else {
      $('div.input:eq(0)').show();
    }
    code_shown = !code_shown;
  }
  
  code_shown=true;
  code_toggle();
</script>
"""
dis.display_html(html, raw=True)
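
One design point in a substring-replacement table like <code>replacement</code>: when one key is a prefix of another (for example <code>\leq</code> and <code>\leqslant</code>), the longer key has to be tried first. The sketch below is a simplified, hypothetical version of the translation loop (the <code>translate</code> helper and the two small tables are not part of the notebook) that shows the difference the ordering makes.

```python
def translate(query, table):
    """Apply each (key, value) pair in order using plain substring replacement."""
    for key, val in table:
        query = query.replace(key, val)
    return query

# Hypothetical two-entry tables; only the order of the entries differs.
short_first = [(r"\leq", "<="), (r"\leqslant", "<=")]
long_first = [(r"\leqslant", "<="), (r"\leq", "<=")]

query = r"\sigma_{Credit \leqslant 3}(Courses)"
print(translate(query, short_first))  # \sigma_{Credit <=slant 3}(Courses)
print(translate(query, long_first))   # \sigma_{Credit <= 3}(Courses)
```

With the shorter key first, the tail of the longer command ("slant") is left behind in the query, which the RA server would presumably reject.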

<h2>Question 1:</h2><br>
Find the name of all professors in the CS department. <br>

<b>Note</b> - we are looking for the department named "CS", not "Crop Sciences", although both are in the database. 

In [3]:
%%ra 1

\pi_{Name}\sigma_{Dept="CS"}(Professors) 

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 2:</h2><br>
Find the name of all students enrolled in the course titled "Database Systems".

In [4]:
%%ra 2

\pi_{Name}\sigma_{Title = "Database Systems"}(Students \bowtie (Courses \bowtie Enrolls))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 3:</h2><br>
Find the name and the size (in terms of recorded number of students) of all departments with fewer than 100 or more than 1000 students.

<b>Note</b> - the ordering of attributes should be the same as the ordering in the question. <br>


In [5]:
%%ra 3

\pi_{Department, NumberOfStudents}\sigma_{NumberOfStudents < 100 or NumberOfStudents > 1000}(Departments)

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 4:</h2><br>
Find the ID of all students taking "Database Systems" and a 4-credit hour course. 

In [6]:
%%ra 4

\pi_{ID}(\sigma_{Title = "Database Systems"}(Enrolls \bowtie Courses)) 
\cap 
\pi_{ID}(\sigma_{Credit = 4}(Enrolls \bowtie Courses))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 5:</h2><br>
Find the title of courses that Professor "K. Chang" is teaching other than CS 411, "Database Systems".

In [7]:
%%ra 5

\pi_{Title}(\sigma_{Name = "K. Chang" and Title <> "Database Systems"}(Professors \bowtie_{ID = Instructor} Courses))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 6:</h2><br>
Find the ID of students taking courses taught by their advisor. 

In [8]:
%%ra 6

\pi_{ID}\sigma_{Instructor == Advisor}(((Courses) \bowtie (Enrolls)) \bowtie (Students))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 7:</h2><br>
Find pairs of students (ID1, ID2) who have the same major but who have never taken the same course in any term. 

<b>Note</b> - For each output pair, it is expected that ID1 > ID2. This way, no two pairs will contain the same information. 

In [9]:
%%ra 7

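%% Pairs of students with the same major, each pair listed once (ID1 > ID2)
\pi_{ID1, ID2}(\sigma_{Major1 = Major2 and ID1 > ID2}(\rho_{ID1, Name1, Major1, Birthday1, Advisor1}(Students) \times \rho_{ID2, Name2, Major2, Birthday2, Advisor2}(Students)))
-
%% Pairs of students who have taken the same course, in any term
\pi_{ID1, ID2}(\sigma_{Number1 = Number2}(\rho_{ID1, Number1, Term1, Grade1}(Enrolls) \times \rho_{ID2, Number2, Term2, Grade2}(Enrolls)))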

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 8:</h2><br>
Find the ID of students who only take courses in the same department as their major. 

<b>Note</b> - you may assume that the name of majors is the same as the name of the department that the major comes from. You may also assume that the department that a course belongs to is the same as the department of the Professor teaching the course. 

In [10]:
%%ra 8
%% Students who are taking at least one class in their own department
\pi_{ID}\sigma_{Major = Dept and ProfID = Instructor and Number = CourseNumber and ID = enrollID}
(Students \times \rho_{enrollID, Number, Term, Grade, ProfID, ProfName, Dept, CourseNumber, Title, Credit, Instructor}
(Enrolls \times \rho_{ProfID, Name, Dept, CourseNumber, Title, Credit, Instructor}(Professors \times Courses)))
-
%% Students who are taking at least one class outside their own department
\pi_{ID}\sigma_{Major <> Dept and ProfID = Instructor and Number = CourseNumber and ID = enrollID}
(Students \times \rho_{enrollID, Number, Term, Grade, ProfID, ProfName, Dept, CourseNumber, Title, Credit, Instructor}
(Enrolls \times \rho_{ProfID, Name, Dept, CourseNumber, Title, Credit, Instructor}(Professors \times Courses)))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request
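
A question phrased with "only" is usually answered in relational algebra with a set difference: take the candidates that satisfy the condition at least once, then remove those that violate it at least once. The following Python sketch shows that pattern on made-up data; the tuples are hypothetical and stand in for the joined Students, Enrolls, Courses, and Professors rows.

```python
# Hypothetical (student ID, major, course department) triples, one per enrollment.
enrollments = [
    ("s1", "CS", "CS"),
    ("s1", "CS", "Math"),
    ("s2", "CS", "CS"),
    ("s3", "Math", "Math"),
    ("s3", "Math", "Math"),
]

# Students with at least one course inside their major's department.
takes_inside = {sid for sid, major, dept in enrollments if major == dept}
# Students with at least one course outside their major's department.
takes_outside = {sid for sid, major, dept in enrollments if major != dept}

# "Only inside" means: inside at least once, and never outside.
only_inside = takes_inside - takes_outside
print(sorted(only_inside))  # ['s2', 's3']
```

Note that both sets have to be built from properly joined rows; filtering an unrestricted cross product would pair each student with courses they never enrolled in.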


<h2>Question 9:</h2><br>
Find the oldest department. <br>

<b>Note</b> - you may assume that YearOfEstablishment is distinct for all departments.

In [11]:
%%ra 9

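%% All departments, minus those established later than some other department
\pi_{Department}(Departments)
-
\pi_{Department}(\sigma_{YearOfEstablishment > Y2}(Departments \times \rho_{D2, N2, Y2}(Departments)))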

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request
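
Because the operators listed in the notes include no aggregate such as MIN, a minimum is typically expressed as "everything, minus whatever is larger than something else". The Python sketch below mirrors that idea with sets; the department rows and years are hypothetical and are not taken from the course database.

```python
# Hypothetical (department, year of establishment) rows with distinct years.
departments = [("CS", 1964), ("Math", 1868), ("History", 1894)]

all_names = {name for name, _ in departments}

# A department is "not oldest" if some other department has an earlier year.
# The nested loops play the role of a self cross product with a selection.
not_oldest = {
    name
    for name, year in departments
    for _, other_year in departments
    if year > other_year
}

oldest = all_names - not_oldest
print(oldest)  # {'Math'}
```

Hard-coding a specific year would only work for one dataset, whereas the difference-based form adapts to whatever rows are present.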


<h2>Question 10:</h2><br>
Find the ID of the student with the highest grade in the course "Database Systems", for the "Fall 2017" term. 

In [12]:
%%ra 10

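%% Students in the Fall 2017 offering, minus those with a lower grade than some other student in it
\pi_{ID}(\sigma_{Title = "Database Systems" and Term = "Fall 2017"}(Enrolls \bowtie Courses))
-
\pi_{ID}(\sigma_{Grade < G2}(
\pi_{ID, Grade}(\sigma_{Title = "Database Systems" and Term = "Fall 2017"}(Enrolls \bowtie Courses))
\times
\rho_{ID2, G2}(\pi_{ID, Grade}(\sigma_{Title = "Database Systems" and Term = "Fall 2017"}(Enrolls \bowtie Courses)))))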

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 11:</h2><br>
Find the ID of the instructors that teach 2 or more courses. 

In [16]:
%%ra 11

\pi_{Instructor}(\sigma_{Number <> N1 and Instructor = I1}(Courses \times \rho_{N1, T1, C1, I1}(Courses)))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request


<h2>Question 12:</h2><br>
Find the ID of the instructors that teach exactly 2 courses. 

In [14]:
%%ra 12
%% Instructors who teach at least 2 courses
\pi_{Instructor}(\sigma_{Instructor = I1 and Instructor = I2 and N1 <> N2}
                (Courses \times \rho_{N1, T1, C1, I1, N2, T2, C2, I2}(Courses \times \rho_{N1, T1, C1, I1}(Courses))))
-
%% Instructors who teach at least 3 courses
\pi_{Instructor}(\sigma_{Instructor = I1 and Instructor = I2 and N1 <> N2 and Number <> N1 and Number <> N2}
                (Courses \times \rho_{N1, T1, C1, I1, N2, T2, C2, I2}(Courses \times \rho_{N1, T1, C1, I1}(Courses))))

<IPython.core.display.Math object>



Query Result
<h1>404 Not Found</h1>No context found for request
<html><body>hello : null<br/>foo : null<br/></body></html>
Expected Result
<h1>404 Not Found</h1>No context found for request
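
"Exactly 2" can be computed as "at least 2" minus "at least 3", where each "at least k" comes from a k-way self-product restricted to pairwise distinct courses. The following Python sketch checks that reasoning on hypothetical (course number, instructor ID) rows; it is an illustration of the counting argument, not the RA system's evaluation strategy.

```python
from itertools import product

# Hypothetical (course number, instructor ID) rows.
courses = [
    (101, "p1"), (102, "p1"),
    (201, "p2"),
    (301, "p3"), (302, "p3"), (303, "p3"),
]

# Instructors appearing with two different course numbers.
at_least_2 = {
    i1
    for (n1, i1), (n2, i2) in product(courses, repeat=2)
    if i1 == i2 and n1 != n2
}

# Instructors appearing with three pairwise different course numbers.
at_least_3 = {
    i1
    for (n1, i1), (n2, i2), (n3, i3) in product(courses, repeat=3)
    if i1 == i2 == i3 and len({n1, n2, n3}) == 3
}

exactly_2 = at_least_2 - at_least_3
print(exactly_2)  # {'p1'}
```

Comparing course numbers rather than titles matters here, since the course number is the key and two different courses could share a title.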
