Skip to content

xerial/silk

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.

Silk: A framework for managing SQL data flows.

http://xerial.org/silk

Examples

import xerial.silk.core._

import sampledb._

// SELECT count(*) FROM nasdaq
def dataCount = nasdaq.size

// SELECT time, close FROM nasdaq WHERE symbol = 'APPL'
def appleStock = nasdaq.filter(_.symbol is "APPL").select(_.time, _.close)

// You can use a raw SQL statjement as well:
def appleStockSQL = sql"SELECT time, close FROM nasdaq where symbol = 'APPL'"

// SELECT time, close FROM nasdaq WHERE symbol = 'APPL' LIMIT 10
appleStock.limit(10).print

// time-column based filtering
appleStock.between("2015-05-01", "2015-06-01")

for(company <- Seq("YHOO", "GOOG", "MSFT")) yield {
  nasdaq.filter(_.symbol is company).selectAll
}

Milestones

  • Build SQL + local analysis workflows

  • Submit queries to Presto / Treasure Data

  • Run scheduled queries

  • Retry upon failures

  • Cache intermediate results

  • Resume workflow

  • Partial workflow executions

  • Sampling display

    • Interactive mode
  • Split a large query into small ones

    • Differential computation for time-series data
  • Windowing for stream queries

  • Object-oriented workflow

  • Input Source: fluentd/embulk

  • Output Source:

  • Workflow Executor

    • Local-only mode
    • Register SQL part to Treasure Data
    • Run complex analysis on local cache
    • UNIX command executor