UCDBG/PEDS

PEDS Overview

PEDS is a framework for estimating the price of data shared between buyers and sellers and for generating explanations of the estimated price. It is an extension of GProM (https://github.com/IITDBGroup/gprom), a system that adds provenance support for complex queries on relational database systems. Provenance is information about how a query's result was produced by a sequence of database operations: for each row returned by a query, it records which rows of the input table the row was derived from and which operations derived it. PEDS builds on GProM's ability to rewrite input queries, using the rewritten queries to carry out more complex actions. It captures where- and how-provenance through annotations and their respective columns, and it calculates a distance metric between two tuples during data integration. As an explanation, PEDS also returns meaningful top-k patterns, which are extracted using several metrics that measure each pattern's contribution to the estimated price.

Simple Demo

To run a simple PEDS scenario, use a command in the following format:

  • to estimate the price
    • ./scripts/eig_run.sh ${log_level} "IG OF (${query});"
    • Example: ./scripts/eig_run.sh 3 "IG OF (select * from owned o FULL OUTER JOIN shared s ON(o.county = s.county AND o.year = s.year));"
  • to compute explanations
    • ./scripts/eig_run.sh ${log_level} "IGEXPL TOP ${k} OF (${query});"
    • Example: ./scripts/eig_run.sh 3 "IGEXPL TOP 10 OF (select * from owned o FULL OUTER JOIN shared s ON(o.county = s.county AND o.year = s.year));"
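The two command formats above differ only in the statement wrapped around the query, so they can be generated from a single query string. The following is a minimal sketch, not part of PEDS: the helper names `ig_statement` and `igexpl_statement` are hypothetical, and only `./scripts/eig_run.sh` and the `IG OF` / `IGEXPL TOP k OF` syntax are taken from this README. It prints the commands instead of running them, so it works without a PEDS build.

```shell
# Sketch only: the helper names below are hypothetical, not part of PEDS.

# Wrap a query in the statement that estimates its price.
ig_statement() {
  printf 'IG OF (%s);' "$1"
}

# Wrap a query in the statement that asks for the top-k explanation patterns.
igexpl_statement() {
  printf 'IGEXPL TOP %s OF (%s);' "$1" "$2"
}

query='select * from owned o FULL OUTER JOIN shared s ON(o.county = s.county AND o.year = s.year)'
log_level=3

# Print the commands rather than executing them.
printf './scripts/eig_run.sh %s "%s"\n' "$log_level" "$(ig_statement "$query")"
printf './scripts/eig_run.sh %s "%s"\n' "$log_level" "$(igexpl_statement 10 "$query")"
```

Keeping the query in one variable ensures that the price estimate and the explanation are computed for the same query.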

Below, we show sample data from a real-world Air Quality Index (AQI) dataset for the example queries above. This demo walks through a simple scenario to familiarize users with two PEDS functions:

  • (i) computing the degree of new information, and
  • (ii) showing the meaningful patterns found after the integration step.

Sample data for owned:

 year | county    | dayswaqi | maqi | 
-------------------------------------
 2021 | Colbert   | 274      | 200  |
 2021 | Jackson   | 366      | 200  |
 2022 | Jefferson | 348      | 271  |
 2022 | Autauga   | 179      | 177  |

Sample data for shared:

 year | county    | gdays | maqi |
----------------------------------
 2021 | Jackson   | 85    | 156  |
 2022 | Colbert   | 66    | 200  |
 2022 | Jefferson | 66    | 221  |
 2021 | Colbert   | 66    | 168  |
 2022 | Autauga   | 122   | 177  |

Output of the first command, which shows IG only:

 year | county    | dayswaqi | maqi | gdays | IG_year | IG_county | IG_dayswaqi | IG_maqi | IG_gdays | Total_IG |
-----------------------------------------------------------------------------------------------------------------
 2021 | Colbert   | 274      | 168  | 66    | 0       | 0         | 0           | 2       | 2        | 4        |
 2021 | Jackson   | 366      | 156  | 85    | 0       | 0         | 0           | 3       | 4        | 7        |
 2022 | Jefferson | 348      | 221  | 66    | 0       | 0         | 0           | 5       | 2        | 7        |
 2022 | Autauga   | 179      | 177  | 122   | 0       | 0         | 0           | 0       | 5        | 5        |
 2022 | Colbert   | null     | 200  | 66    | 0       | 0         | 0           | 3       | 2        | 5        |
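In the sample output above, Total_IG in each row equals the sum of that row's IG_* columns. The awk sketch below re-checks this arithmetic for the five rows shown. It is an observation about the sample, not a description of how PEDS computes the value internally, and the helper name `check_total_ig` is hypothetical.

```shell
# IG columns copied from the sample output:
# IG_year IG_county IG_dayswaqi IG_maqi IG_gdays Total_IG
ig_rows='0 0 0 2 2 4
0 0 0 3 4 7
0 0 0 5 2 7
0 0 0 0 5 5
0 0 0 3 2 5'

# Hypothetical helper: print "ok" for each row whose first five
# fields sum to the sixth, and "mismatch" otherwise.
check_total_ig() {
  awk '{ sum = $1 + $2 + $3 + $4 + $5
         print (sum == $6 ? "ok" : "mismatch") }'
}

printf '%s\n' "$ig_rows" | check_total_ig
```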

Output of the second command, which shows the best patterns and the f_score by which they are ranked:

 year | county    | dayswaqi | maqi | gdays | imp | info | cov | f_score |
--------------------------------------------------------------------------
 2022 | *         | *        | *    | 66    | 12  | 2    | 2   | 11.29   |
 *    | Colbert   | *        | *    | 66    | 9   | 2    | 2   | 10.29   |
 *    | *         | *        | *    | 66    | 16  | 1    | 3   | 9.14    |
 2021 | *         | *        | *    | *     | 17  | 1    | 3   | 5.87    |
 2022 | *         | *        | *    | *     | 11  | 1    | 2   | 4.25    |
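One way to read the pattern table is that `*` matches any value, info counts the attributes a pattern fixes, cov counts the integrated rows the pattern matches, and imp sums Total_IG over those rows. This reading is an assumption inferred from the sample numbers, not a definition taken from PEDS. The sketch below applies it to the rows of the first command's output. The helper name `pattern_stats` is hypothetical, and f_score is left out because its formula is not given here.

```shell
# Integrated rows from the sample IG output:
# year county dayswaqi maqi gdays Total_IG
rows='2021 Colbert 274 168 66 4
2021 Jackson 366 156 85 7
2022 Jefferson 348 221 66 7
2022 Autauga 179 177 122 5
2022 Colbert null 200 66 5'

# Hypothetical helper: takes a pattern of five space-separated values
# ("*" matches anything) and prints info, cov and imp under the
# assumed reading described above.
pattern_stats() {
  printf '%s\n' "$rows" | awk -v pat="$1" '
    BEGIN { n = split(pat, p, " ")
            for (i = 1; i <= n; i++) if (p[i] != "*") info++ }
    { match_row = 1
      for (i = 1; i <= n; i++)
        if (p[i] != "*" && p[i] != $i) match_row = 0
      if (match_row) { cov++; imp += $6 } }
    END { printf "info=%d cov=%d imp=%d\n", info, cov, imp }'
}

pattern_stats '2022 * * * 66'
pattern_stats '* Colbert * * 66'
pattern_stats '* * * * 66'
```

For these three patterns the sketch gives the same imp, info and cov as the first three rows of the table. For the year-only patterns it gives cov=2, imp=11 for 2021 and cov=3, imp=17 for 2022, the reverse of the two rows printed above, so treat the interpretation as unconfirmed.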

Installation

PEDS is installed the same way as GProM. The wiki has detailed installation instructions. Installation follows the standard procedure for GNU build tools: check out the git repository, install all dependencies, and run:

./autogen.sh
./configure
make
sudo make install
