# Ver Introduction

Ver is a data discovery system that identifies project-join views over large repositories of tables that do not contain join path information. We deploy Ver on [Chicago Open Data](https://data.cityofchicago.org/) and showcase its capacity to assist a school counselor in identifying a view related to school information. Ver enables the counselor to specify the data needs via an example query, searches for relevant views, distills the results, and finally guides them to the right view by asking a series of data-related questions.

<p align="center">
     <img src="../docs/img/architecture.jpeg" width="400">
</p>


# Set Up Ver

Create a Ver instance, passing the path to a local copy of the data repository as `data_path`.

In [None]:
from demo import Ver
ver = Ver(data_path="/Users/yuegong/Desktop/chicago_open_data_all_05_18/")

# View Specification

Anna is a school counselor in Chicago who wants to help parents and students choose which schools to attend. The data she needs is a table that contains information about every public school in Chicago, including the school name, the school type (e.g., Charter, Neighborhood, etc.), and the school day (e.g., full day, half day).

Anna already knows this information for a few schools in Chicago. For example, Ogden International High School has the type Charter and its school day is half day; Hyde Park High School has the type Neighborhood and its school day is full day. She would like the final view to include these examples. The query-by-example (QBE) interface of View-Specification lets Anna enter these example values to illustrate the view she wants.

In [None]:
ver.view_specification()
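
To make the idea of an example query concrete, here is a minimal sketch in plain Python. It is not Ver's internal representation: the `example_query` layout, the column names, and both helper functions are hypothetical, and the sketch assumes that a candidate view lists its columns in the same order as the query.

```python
# Hypothetical example query: one list of example values per desired column.
# The values are the two schools Anna already knows about.
example_query = {
    "school_name": ["Ogden International High School", "Hyde Park High School"],
    "school_type": ["Charter", "Neighborhood"],
    "school_day": ["Half day", "Full day"],
}


def example_rows(query):
    """Turn the column-wise examples into row tuples."""
    return list(zip(*query.values()))


def view_contains_examples(view_rows, query):
    """Return True if every example row appears in the view.

    The comparison ignores case and surrounding whitespace.
    Assumes view rows use the same column order as the query.
    """
    normalized = {
        tuple(str(cell).strip().lower() for cell in row) for row in view_rows
    }
    return all(
        tuple(cell.strip().lower() for cell in row) in normalized
        for row in example_rows(query)
    )


# A hypothetical candidate view that covers both examples.
candidate_view = [
    ("Ogden International High School", "Charter", "half day"),
    ("Hyde Park High School", "Neighborhood", "full day"),
]
print(view_contains_examples(candidate_view, example_query))
```

A view that lacks either of Anna's example rows would fail this check, which is the intuition behind requiring the final view to include the examples.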

# Examine candidate columns and join graphs

Ver provides APIs for users to examine the candidate columns and the join graphs used to assemble the views.

## Show candidate columns

In [None]:
ver.show_candidate_columns()

## Show join graphs

In [None]:
ver.show_join_graphs()
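
As a rough illustration of what a join graph produces, the sketch below assembles a project-join view from two small tables by joining them on a key column and then projecting the requested columns. The table names, the `school_id` and `id` columns, and both helpers are hypothetical; this is not how Ver materializes views internally.

```python
def hash_join(left, right, left_key, right_key):
    """Inner-join two tables (lists of dicts) using a hash index on the right table."""
    index = {}
    for row in right:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for left_row in left:
        for right_row in index.get(left_row[left_key], []):
            # On column-name collisions, the left table's value wins.
            joined.append({**right_row, **left_row})
    return joined


def project(rows, columns):
    """Keep only the requested columns, in the requested order."""
    return [{column: row[column] for column in columns} for row in rows]


# Hypothetical tables; the identifiers are made up for illustration.
school_profiles = [
    {"school_id": 1, "school_name": "Ogden International High School", "school_type": "Charter"},
    {"school_id": 2, "school_name": "Hyde Park High School", "school_type": "Neighborhood"},
]
school_hours = [
    {"id": 1, "school_day": "Half day"},
    {"id": 2, "school_day": "Full day"},
    {"id": 3, "school_day": "Full day"},  # no matching profile, dropped by the inner join
]

view = project(
    hash_join(school_profiles, school_hours, "school_id", "id"),
    ["school_name", "school_type", "school_day"],
)
for row in view:
    print(row)
```

In a repository without join path information, the hard part is discovering that `school_id` and `id` can be joined at all; the sketch only shows the final assembly step once such a path is known.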

# Help users locate the right view

We find more than 100 views! Going through them manually is overwhelming. How can we reduce the view search space and help users find the right view?

## View Distillation

View-Distillation reduces the view search space by first classifying candidate views into four categories and then applying a distillation strategy.

In [None]:
ver.view_distillation()
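
The text above does not name the four categories. The sketch below assumes one plausible scheme for comparing two views with the same schema: compatible (identical rows), contained (one view's rows are a strict subset of the other's), complementary (different rows with no conflicts on a key), and contradictory (the same key maps to different values). The category names, the `classify_pair` helper, and the choice of key column are assumptions for illustration, not Ver's API.

```python
def row_set(view):
    """Represent a view (list of dicts) as a set of hashable rows."""
    return {tuple(sorted(row.items())) for row in view}


def classify_pair(view_a, view_b, key):
    """Classify two same-schema views into one of four assumed categories."""
    rows_a, rows_b = row_set(view_a), row_set(view_b)
    if rows_a == rows_b:
        return "compatible"
    if rows_a < rows_b or rows_b < rows_a:
        return "contained"
    by_key_a = {row[key]: row for row in view_a}
    by_key_b = {row[key]: row for row in view_b}
    shared_keys = by_key_a.keys() & by_key_b.keys()
    if any(by_key_a[k] != by_key_b[k] for k in shared_keys):
        return "contradictory"
    return "complementary"


ogden = {"school_name": "Ogden International High School", "school_type": "Charter"}
hyde_park = {"school_name": "Hyde Park High School", "school_type": "Neighborhood"}
# Hypothetical conflicting row, made up to trigger the "contradictory" case.
ogden_conflict = {"school_name": "Ogden International High School", "school_type": "Magnet"}

print(classify_pair([ogden], [ogden], "school_name"))
print(classify_pair([ogden], [ogden, hyde_park], "school_name"))
print(classify_pair([ogden], [hyde_park], "school_name"))
print(classify_pair([ogden, hyde_park], [ogden_conflict, hyde_park], "school_name"))
```

Under this assumed scheme, a distillation strategy could, for example, keep a single representative of each compatible group and drop views that are contained in a larger one, which shrinks the set of views the user has to inspect.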

## View Presentation

View-Presentation analyzes the views and generates questions whose answers help rank the views and select the most relevant one.

In [None]:
ver.view_presentation()
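
One way to picture question-driven selection is sketched below: pick the (column, value) pair whose presence splits the candidate views most evenly, ask the user whether the desired view should contain it, and keep only the views consistent with the answer. The question form, the `best_question` and `apply_answer` helpers, and the sample views are hypothetical; Ver's actual question generation may work differently.

```python
def best_question(views):
    """Return the (column, value) pair contained in closest to half of the views."""
    counts = {}
    for view in views:
        cells = {(column, value) for row in view for column, value in row.items()}
        for cell in cells:
            counts[cell] = counts.get(cell, 0) + 1
    half = len(views) / 2
    # Sort first so that ties are broken deterministically.
    return min(sorted(counts, key=repr), key=lambda cell: abs(counts[cell] - half))


def apply_answer(views, cell, answer):
    """Keep the views whose containment of `cell` matches the user's yes/no answer."""
    column, value = cell
    return [
        view
        for view in views
        if any(row.get(column) == value for row in view) == answer
    ]


ogden = {"school_name": "Ogden International High School", "school_type": "Charter"}
hyde_park = {"school_name": "Hyde Park High School", "school_type": "Neighborhood"}
# Hypothetical conflicting row, made up for illustration.
ogden_conflict = {"school_name": "Ogden International High School", "school_type": "Magnet"}

candidate_views = [[ogden], [ogden, hyde_park], [hyde_park], [ogden_conflict]]

question = best_question(candidate_views)
print("Should the view contain", question, "?")
remaining = apply_answer(candidate_views, question, True)
print(len(remaining), "views remain")
```

Each answer removes roughly half of the candidates when an evenly splitting question exists, so a short series of questions can narrow more than 100 views down to a handful.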