# Querying hierarchical data with Rabbit
We reproduce the queries from the section on querying hierarchical data using Rabbit syntax.

## The database
We start by loading a sample hierarchical database.  The database is derived from the dataset of all employees of the city of Chicago ([source](https://data.cityofchicago.org/Administration-Finance/Current-Employee-Names-Salaries-and-Position-Title/xzkq-xp2w)).

In [1]:
ENV["LINES"] = 15
push!(LOAD_PATH, "..")
using RBT
include("../citydb.jl")

citydb

Department:
  name: UTF8String {unique}
  employee: Employee {plural}
Employee:
  surname: UTF8String
  name: UTF8String
  department: Department
  position: UTF8String
  salary: Int64

We can execute a query with the `fetch()` function:

In [2]:
fetch(citydb, "6*(3+4)")

42

## Traversing the hierarchy
*Find the names of all departments.*

In [3]:
fetch(citydb, "department.name")

35-element Array{UTF8String,1}:
 "WATER MGMNT"      
 "POLICE"           
 "GENERAL SERVICES" 
 "CITY COUNCIL"     
 "STREETS & SAN"    
 ⋮                  
 "BOARD OF ETHICS"  
 "POLICE BOARD"     
 "BUDGET & MGMT"    
 "ADMIN HEARNG"     
 "LICENSE APPL COMM"

*Find the names of all employees.*

In [4]:
fetch(citydb, "department.employee.name")

32181-element Array{UTF8String,1}:
 "ELVIA"     
 "VICENTE"   
 "MUHAMMAD"  
 "GIRLEY"    
 "DILAN"     
 ⋮           
 "NANCY"     
 "DARCI"     
 "THADDEUS"  
 "RACHENETTE"
 "MICHELLE"  

We are not restricted by the hierarchical structure of the database, so we can query employees directly.

In [5]:
fetch(citydb, "employee.name")

32181-element Array{UTF8String,1}:
 "ELVIA"     
 "VICENTE"   
 "MUHAMMAD"  
 "GIRLEY"    
 "DILAN"     
 ⋮           
 "NANCY"     
 "DARCI"     
 "THADDEUS"  
 "RACHENETTE"
 "MICHELLE"  

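To make the traversal semantics concrete, here is a minimal sketch in plain Julia (1.x) that mimics what the two queries above compute. The `Employee` and `Department` structs and the three sample rows are made up for illustration; this is not how RBT is implemented, only a model of the observable result: following the plural `employee` link yields one flat list.

```julia
# Toy model of the schema; sample data is invented for illustration.
struct Employee
    name::String
    salary::Int
end

struct Department
    name::String
    employee::Vector{Employee}
end

departments = [
    Department("POLICE", [Employee("ANNA", 90_000), Employee("BEN", 120_000)]),
    Department("FIRE", [Employee("CARA", 110_000)]),
]

# Models "department.name": one value per department.
department_names = [d.name for d in departments]

# Models "department.employee.name": follow the plural link, then flatten.
employee_names = [e.name for d in departments for e in d.employee]

println(department_names)
println(employee_names)
```

In this toy model `employee_names` is a flat vector of three names, just as the query output above is a flat array rather than an array of arrays.
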
## Summarizing data
*Find the number of departments.*

In [6]:
fetch(citydb, "count(department)")

35

*Find the number of employees for each department.*

In [7]:
fetch(citydb, "department.count(employee)")

35-element Array{Int64,1}:
  1848
 13570
   924
   397
  2090
     ⋮
     9
     2
    43
    39
     1

*Find the total number of employees.*

In [8]:
fetch(citydb, "count(department.employee)")

32181

Again, we can query `employee` directly.

In [9]:
fetch(citydb, "count(employee)")

32181

*Find the top salary among all employees.*

In [10]:
fetch(citydb, "max(employee.salary)")

Nullable(260004)

*Find the maximum number of employees per department.*

In [11]:
fetch(citydb, "max(department.count(employee))")

Nullable(13570)
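
The difference between counting inside each department and counting across all of them can be sketched in plain Julia (1.x). The sample data is invented, and the use of `nothing` for an empty maximum is an assumption: the `Nullable(...)` wrapper in the output above presumably signals that a maximum over an empty collection has no value.

```julia
# Invented sample data; named tuples stand in for database records.
departments = [
    (name = "POLICE", employee = [(name = "ANNA", salary = 90_000),
                                  (name = "BEN", salary = 120_000)]),
    (name = "FIRE", employee = [(name = "CARA", salary = 110_000)]),
]

# Models count(department).
n_departments = length(departments)

# Models department.count(employee): one count per department.
per_department = [length(d.employee) for d in departments]

# Models count(department.employee): flatten first, then count once.
n_employees = length([e for d in departments for e in d.employee])

# Models max(employee.salary); `nothing` marks the empty case (assumption).
salaries = [e.salary for d in departments for e in d.employee]
top_salary = isempty(salaries) ? nothing : maximum(salaries)

# Models max(department.count(employee)).
largest_department = isempty(per_department) ? nothing : maximum(per_department)

println((n_departments, per_department, n_employees, top_salary, largest_department))
```

The position of the aggregate decides its scope: placed after `department.` it runs once per department, while wrapped around the whole path it runs once over the flattened result.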

## Tabular output
*For each department, find the number of employees.*

In [12]:
fetch(citydb, "department:select(name,count(employee))")

35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",1848)    
 ("POLICE",13570)        
 ("GENERAL SERVICES",924)
 ("CITY COUNCIL",397)    
 ("STREETS & SAN",2090)  
 ⋮                       
 ("BOARD OF ETHICS",9)   
 ("POLICE BOARD",2)      
 ("BUDGET & MGMT",43)    
 ("ADMIN HEARNG",39)     
 ("LICENSE APPL COMM",1) 

The `:select` notation is syntactic sugar for a regular function call in which the first argument is placed before the function name (postfix notation).

In [13]:
fetch(citydb, "select(department,name,count(employee))")

35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",1848)    
 ("POLICE",13570)        
 ("GENERAL SERVICES",924)
 ("CITY COUNCIL",397)    
 ("STREETS & SAN",2090)  
 ⋮                       
 ("BOARD OF ETHICS",9)   
 ("POLICE BOARD",2)      
 ("BUDGET & MGMT",43)    
 ("ADMIN HEARNG",39)     
 ("LICENSE APPL COMM",1) 
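
A rough analogy in plain Julia (1.x) is the pipe operator `|>`, which also moves the first argument in front of the function. The helper `select_rows` and the column functions below are hypothetical names introduced for this sketch; they are not part of RBT.

```julia
# Hypothetical helper: apply every column function to every row,
# producing one tuple per row.
select_rows(rows, columns...) = [map(f -> f(r), columns) for r in rows]

# Invented sample data.
departments = [
    (name = "POLICE", employee = ["ANNA", "BEN"]),
    (name = "FIRE", employee = ["CARA"]),
]

name_of(d) = d.name
headcount(d) = length(d.employee)

# Prefix form, analogous to select(department, name, count(employee)).
prefix_result = select_rows(departments, name_of, headcount)

# Postfix form, analogous to department:select(name, count(employee)).
postfix_result = departments |> rows -> select_rows(rows, name_of, headcount)

println(prefix_result)
println(prefix_result == postfix_result)
```

Both forms build the same list of `(name, count)` tuples; the postfix form simply reads in the order the data flows.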

It is easy to add new columns to the output.  Let us add *the top salary per department.*

In [14]:
fetch(citydb, """
    department
    :select(
        name,
        count(employee),
        max(employee.salary))
""")

35-element Array{Tuple{UTF8String,Int64,Nullable{Int64}},1}:
 ("WATER MGMNT",1848,Nullable(169512))    
 ("POLICE",13570,Nullable(260004))        
 ("GENERAL SERVICES",924,Nullable(157092))
 ("CITY COUNCIL",397,Nullable(160248))    
 ("STREETS & SAN",2090,Nullable(157092))  
 ⋮                                        
 ("BOARD OF ETHICS",9,Nullable(131688))   
 ("POLICE BOARD",2,Nullable(97728))       
 ("BUDGET & MGMT",43,Nullable(169992))    
 ("ADMIN HEARNG",39,Nullable(156420))     
 ("LICENSE APPL COMM",1,Nullable(69888))  

## Filtering data
*Find the employees with salary greater than $200k.*

In [15]:
fetch(citydb, """
    employee
    :filter(salary>200000)
    :select(name,surname,position,salary)
""")

3-element Array{Tuple{UTF8String,UTF8String,UTF8String,Int64},1}:
 ("GARRY","M","SUPERINTENDENT OF POLICE",260004)
 ("JOSE","S","FIRE COMMISSIONER",202728)        
 ("RAHM","E","MAYOR",216210)                    

*Find the number of employees with salary in the range from \$100k to \$200k.*

In [16]:
fetch(citydb, """
    employee
    :filter((salary>100000)&(salary<=200000))
    :count
""")

3916

*Find the departments with more than 1000 employees.*

In [17]:
fetch(citydb, """
    department
    :filter(count(employee)>1000)
    .name
""")


Use "filter(count(employee)>1000)." instead.


7-element Array{UTF8String,1}:
 "WATER MGMNT"  
 "POLICE"       
 "STREETS & SAN"
 "AVIATION"     
 "FIRE"         
 "OEMC"         
 "TRANSPORTN"   

*Find the number of departments with more than 1000 employees.*

In [18]:
fetch(citydb, """
    count(
        department
        :filter(count(employee)>1000))
""")

7

*For each department, find the number of employees with salary higher than $100k.*

In [19]:
fetch(citydb, """
    department
    :select(
        name,
        count(employee:filter(salary>100000)))
""")

35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",179)    
 ("POLICE",1493)        
 ("GENERAL SERVICES",79)
 ("CITY COUNCIL",54)    
 ("STREETS & SAN",39)   
 ⋮                      
 ("BOARD OF ETHICS",2)  
 ("POLICE BOARD",0)     
 ("BUDGET & MGMT",12)   
 ("ADMIN HEARNG",3)     
 ("LICENSE APPL COMM",0)
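
The filtering queries so far can be modelled in plain Julia (1.x) with comprehension guards and `count` with a predicate. The data and thresholds below are invented for a tiny example; the sketch only illustrates where the filter sits relative to the traversal, not RBT's implementation.

```julia
# Invented sample data.
departments = [
    (name = "POLICE", employee = [(name = "ANNA", salary = 90_000),
                                  (name = "BEN", salary = 120_000)]),
    (name = "FIRE", employee = [(name = "CARA", salary = 110_000)]),
]

# Models employee:filter(salary>100000):select(name,salary).
high_paid = [(e.name, e.salary)
             for d in departments for e in d.employee
             if e.salary > 100_000]

# Models department:filter(count(employee)>1) followed by name.
big_departments = [d.name for d in departments if length(d.employee) > 1]

# Models department:select(name, count(employee:filter(salary>100000))):
# the filter runs inside each department, so every department stays in the output.
high_paid_per_department =
    [(d.name, count(e -> e.salary > 100_000, d.employee)) for d in departments]

println(high_paid)
println(big_departments)
println(high_paid_per_department)
```

Filtering the departments themselves shortens the output, whereas filtering the nested `employee` list keeps one row per department and only changes the count, which is why departments with a count of 0 still appear above.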

*For each department with fewer than 1000 employees, find the employees with salary higher than $125k.*

In [20]:
fetch(citydb, """
    department
    :filter(count(employee)<1000)
    :select(
        name,
        employee
            :filter(salary>125000)
            :select(name,surname,position))
""")

28-element Array{Tuple{UTF8String,Array{Tuple{UTF8String,UTF8String,UTF8String},1}},1}:
 ("GENERAL SERVICES",[("DAVID","R","COMMISSIONER OF FLEET & FACILITY MANAGEMENT"),("PHILLIP","S","EQUIPMENT SERVICES COORD")])                                                            
 ("CITY COUNCIL",[("JAMES","C","DEPUTY CHIEF ADMINISTRATIVE OFFICER"),("MARLA","K","CHIEF ADMINISTRATIVE OFFICER")])                                                                      
 ("FAMILY & SUPPORT",[("EVELYN","D","COMMISSIONER OF FAMILY AND SUPPORT SERVICES"),("MARY","G","DEPUTY COMMISSIONER"),("JENNIFER","W","FIRST DEPUTY COMMISSIONER")])                      
 ("IPRA",[("SCOTT","A","CHIEF ADMINISTRATOR"),("STEVEN","H","DEPUTY CHIEF ADMINISTRATOR"),("STEVEN","M","FIRST DEPUTY CHIEF ADMINISTRATOR"),("WILLIAM","W","DEPUTY CHIEF ADMINISTRATOR")])
 ("PUBLIC LIBRARY",[("BRIAN","B","COMMISSIONER OF CHICAGO PUBLIC LIBRARY"),("MICHELLE","F","DIRECTOR OF LIBRARY TECHNOLOGY"),("ANDREA","S","FIRST DEPUTY COMMISSIONE

## Compiling and executing queries
You can compile and execute queries separately.  To compile a query, use the `RBT.compile()` function.

In [21]:
q1 = RBT.compile(citydb, "department.employee.name")

Set(<department>) >> SeqMap(<employee>) >> IsoMap(<name>) : Array{UTF8String,1}

In [22]:
q2 = RBT.compile(citydb, "count(department)")

Count(Set(<department>)) : Int64

In [23]:
q3 = RBT.compile(citydb, "department:select(name,count(employee))")

Set(<department>) >> Tuple(IsoMap(<name>), Count(SeqMap(<employee>))) : Array{Tuple{UTF8String,Int64},1}

In [24]:
q4 = RBT.compile(citydb, "count(employee:filter((salary>100000)&(salary<200000)))")

Count(Set(<employee>) >> Sieve(((IsoMap(<salary>) > Const(100000)) & (IsoMap(<salary>) < Const(200000))))) : Int64

To execute a query, call the compiled query as a function.

In [25]:
q1()

32181-element Array{UTF8String,1}:
 "ELVIA"     
 "VICENTE"   
 "MUHAMMAD"  
 "GIRLEY"    
 "DILAN"     
 ⋮           
 "NANCY"     
 "DARCI"     
 "THADDEUS"  
 "RACHENETTE"
 "MICHELLE"  

In [26]:
q2()

35

In [27]:
q3()

35-element Array{Tuple{UTF8String,Int64},1}:
 ("WATER MGMNT",1848)    
 ("POLICE",13570)        
 ("GENERAL SERVICES",924)
 ("CITY COUNCIL",397)    
 ("STREETS & SAN",2090)  
 ⋮                       
 ("BOARD OF ETHICS",9)   
 ("POLICE BOARD",2)      
 ("BUDGET & MGMT",43)    
 ("ADMIN HEARNG",39)     
 ("LICENSE APPL COMM",1) 

In [28]:
q4()

3916
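
The split between compiling and executing can be pictured as building a closure once and calling it later. The sketch below, in plain Julia (1.x), uses toy combinators whose names echo the printed plans above (`SeqMap`, `IsoMap`); they are hypothetical stand-ins with invented data and say nothing about RBT's actual internals.

```julia
# Invented sample data.
departments = [
    (name = "POLICE", employee = [(name = "ANNA",), (name = "BEN",)]),
    (name = "FIRE", employee = [(name = "CARA",)]),
]

# Toy stage for a plural link: map each input to many outputs and flatten.
seq_map(f) = xs -> [y for x in xs for y in f(x)]

# Toy stage for a singular attribute: exactly one output per input.
iso_map(f) = xs -> [f(x) for x in xs]

# "Compile": chain the stages left to right into a zero-argument closure.
function pipeline(source, stages...)
    return () -> foldl((acc, stage) -> stage(acc), stages; init = source())
end

# Loosely mirrors Set(<department>) >> SeqMap(<employee>) >> IsoMap(<name>).
q1 = pipeline(() -> departments, seq_map(d -> d.employee), iso_map(e -> e.name))

# Loosely mirrors Count(Set(<department>)).
q2 = pipeline(() -> departments, length)

# "Execute": call the compiled query like a function, as often as needed.
println(q1())
println(q2())
```

Because the closure captures the whole chain, the cost of assembling it is paid once, and each later call only runs the stages.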