
SQLancer (incomplete)


Overview

SQLancer is a tool that runs metamorphic tests against database management systems. We were not able to finish the work on the SQLancer integration, so this document explains what has been done and what our vision was. The goal was to improve our ability to find correctness issues in NoisePage.

Technologies

  • SQLancer Test Script (Not Started)
    • Python
  • SQLancer
  • Performance Storage Service
    • Django
    • GitHub API
  • GitHub App
  • Database
    • TimescaleDB

Current State

Currently, Jeff is working on getting SQLancer to run against NoisePage. Once this is complete, the plan is to submit a PR to the SQLancer repository to add the NoisePage configuration files, and then to study the types of faults that SQLancer exposes.

Runtime View (Future State)

The SQLancer test script will run in the nightly pipeline. Any faults found will be sent via API call to the performance storage service, and the Python service will store the faults in a database. After all faults from the SQLancer execution have been stored, the Python service will evaluate the priority of all unresolved faults in the database. The priority will initially be determined by a rules engine (the rules have yet to be determined); eventually, we could use simple ML to prioritize the faults. The performance storage service will then create a GitHub issue for each of the top X faults. The goal is to never exceed X open SQLancer issues in GitHub: if X=5, three SQLancer issues are already open, and five new faults are found, then the performance storage service will create only two new issues.
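
As a rough illustration of the capped issue-creation step, here is a minimal Python sketch. The `Fault` fields, the `create_issue` callback, and the value of X are all assumptions; none of this has been designed yet.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

MAX_OPEN_ISSUES = 5  # the "X" in the design; the real value is undecided


@dataclass
class Fault:
    # Hypothetical fields; the real schema is still TBD (see Schema below).
    description: str
    priority: int
    status: str = "open"                 # "open", "resolved", or "wont_resolve"
    github_issue: Optional[int] = None   # issue number once an issue exists
    wont_resolve_reason: Optional[str] = None


def sync_github_issues(faults: List[Fault],
                       open_issue_count: int,
                       create_issue: Callable[[Fault], int]) -> List[Fault]:
    """Create GitHub issues for the highest-priority open faults without
    exceeding MAX_OPEN_ISSUES open SQLancer issues."""
    slots = MAX_OPEN_ISSUES - open_issue_count
    if slots <= 0:
        return []

    # Only open faults with no existing issue are candidates,
    # ordered highest priority first.
    candidates = sorted(
        (f for f in faults if f.status == "open" and f.github_issue is None),
        key=lambda f: f.priority,
        reverse=True,
    )

    created = []
    for fault in candidates[:slots]:
        fault.github_issue = create_issue(fault)
        created.append(fault)
    return created
```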

When a developer resolves a SQLancer-generated issue, the GitHub bot tells the performance storage service to update the database to mark the issue as resolved. The performance storage service will then find the highest-priority fault that does not have a corresponding GitHub issue and generate an issue for that fault.

When a developer marks an issue created by the Python service as "won't resolve", the GitHub bot makes an API call telling the performance storage service to update the database. The service records that the issue won't be resolved, along with the reason provided by the developer. This information is helpful for improving the rules engine and, eventually, the ML model.
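
A minimal sketch of how the service might handle these two callbacks, reusing the hypothetical `Fault` and `sync_github_issues` names from the sketch above; the event names and payload are assumptions, since the API has not been designed.

```python
def handle_issue_update(fault: Fault,
                        event: str,
                        all_faults: List[Fault],
                        open_issue_count: int,
                        create_issue: Callable[[Fault], int],
                        reason: Optional[str] = None) -> None:
    """React to a GitHub bot callback about the issue linked to `fault`."""
    if event == "resolved":
        fault.status = "resolved"
        # The closed issue frees a slot, so the next highest-priority
        # fault without an issue gets a new one.
        sync_github_issues(all_faults, open_issue_count - 1, create_issue)
    elif event == "wont_resolve":
        fault.status = "wont_resolve"
        fault.wont_resolve_reason = reason  # feedback for the rules engine / ML
```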

Schema

TBD. Currently, no schema for storing faults has been designed. The idea would be to capture the information from SQLancer along with contextual information that could help in debugging the fault. We would also want to capture the priority our system assigned to the fault, any developer feedback confirming or contradicting that priority, which GitHub issues the fault is associated with, and whether the fault was resolved.
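
As a starting point for discussion only, a Django model along these lines could hold that information (independent of the in-memory sketch above). Every field name here is a guess at what SQLancer and the pipeline can provide, not a designed schema.

```python
from django.db import models


class Fault(models.Model):
    """One fault reported by a SQLancer run (hypothetical model)."""
    # What SQLancer reported, plus context useful for debugging.
    oracle = models.CharField(max_length=64)           # which SQLancer oracle fired
    statement = models.TextField()                     # offending SQL / reproduction steps
    error_output = models.TextField(blank=True)
    noisepage_commit = models.CharField(max_length=40)
    reported_at = models.DateTimeField(auto_now_add=True)

    # Prioritization and developer feedback.
    priority = models.IntegerField(default=0)          # assigned by the rules engine
    priority_feedback = models.TextField(blank=True)   # developer agrees/disagrees and why

    # GitHub linkage and resolution state.
    github_issue = models.IntegerField(null=True, blank=True)
    status = models.CharField(
        max_length=16,
        choices=[("open", "open"), ("resolved", "resolved"), ("wont_resolve", "won't resolve")],
        default="open",
    )
    wont_resolve_reason = models.TextField(blank=True)
```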

Design Decisions

Top X Faults

The reason we are not creating a GitHub issue for every fault is that doing so would overwhelm the developers, which could result in all SQLancer-generated issues being ignored. By limiting it to X GitHub issues, we keep the work at a manageable level. When one issue is resolved, it is replaced by the next highest-priority fault at that time. This ensures that developers are only working on the most important faults.

Unknowns

  • Database: We didn't decide which database makes the most sense. It probably makes sense to use the existing TimescaleDB database, but as a traditional PostgreSQL database rather than a time-series hypertable.
  • Database Schema: We didn't figure out what data will be available from SQLancer and what should be stored in the database.
  • Rules Engine: We don't yet know what types of faults SQLancer will discover, so we weren't able to determine the rules for assigning fault priority. To start, we could create issues randomly and record the time to resolve; that would help us begin gathering information about which faults are more critical.
  • API: We didn't design the API endpoints. However, the API would need to support at least the following features (one possible shape is sketched after this list):
    • Create a fault (the posted fault could already exist, so the service needs to check for duplicates)
    • Update a fault (to mark it as resolved or won't resolve)
    • Generate GitHub issues (a way to force re-evaluation of the faults and create issues for the top X faults)
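
Since the performance storage service is Django, one possible shape for these endpoints is a URL configuration like the sketch below. The paths and view names are placeholders, not a designed API.

```python
from django.urls import path

from . import views  # hypothetical views implementing the three features

urlpatterns = [
    # Create a fault; the view is expected to deduplicate against existing faults.
    path("sqlancer/faults/", views.create_fault, name="create-fault"),
    # Update a fault's status ("resolved" or "won't resolve", with a reason).
    path("sqlancer/faults/<int:fault_id>/", views.update_fault, name="update-fault"),
    # Force re-evaluation of priorities and issue creation for the top X faults.
    path("sqlancer/faults/generate-issues/", views.generate_issues, name="generate-issues"),
]
```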