scimas/Druid.jl

Druid.jl

Apache Druid querying library.

Installation

pkg> add Druid

Usage

Native Query

See the Druid native queries documentation.

using Druid

client = Client("http://localhost:8888")

timeseries_query = Timeseries(
    dataSource=Table("wikipedia"),
    intervals=[Interval("2015-09-12","2015-09-13")],
    granularity=SimpleGranularity("hour"),
    aggregations=[Count("total_rows"), SingleField("longSum", "added", "documents_added")]
)

println(execute(client, timeseries_query))

SQL Query

See the Druid SQL documentation.

using Druid

client = Client("http://localhost:8888")

sql_query = Sql(query="""
    SELECT FLOOR(__time TO HOUR) AS "timestamp", COUNT(*) AS "total_rows", SUM("added") AS "documents_added"
    FROM wikipedia
    WHERE __time >= TIMESTAMP '2015-09-12' AND __time < TIMESTAMP '2015-09-13'
    GROUP BY FLOOR(__time TO HOUR)
    ORDER BY "timestamp" ASC
""")

println(execute(client, sql_query))

Tables.jl compatibility

Most queries return their response as an object that implements the Tables.jl interface, so the result can be passed directly to any Tables.jl-compatible sink, such as a DataFrame.

using DataFrames

result = execute(client, query)  # `query` is any of the compatible queries
df = DataFrame(result)

Compatible queries: Timeseries, TopN, GroupBy, Scan, Search, Sql.
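Because these results implement Tables.jl, they can also be consumed through the generic Tables.jl accessors without going through a DataFrame. The sketch below is illustrative only: it runs without a Druid server by using a vector of NamedTuples as a stand-in for the value returned by `execute`, and the rows are placeholder values, not real Druid output. The column names mirror the example queries above. With a live client, the same `Tables.rows` and `Tables.columntable` calls should apply to the query result.

```julia
using Tables

# Stand-in for `execute(client, query)`. A vector of NamedTuples is itself a
# valid Tables.jl row table. The values are placeholders, not real Druid output.
result = [
    (timestamp = "2015-09-12T00:00:00.000Z", total_rows = 10, documents_added = 100),
    (timestamp = "2015-09-12T01:00:00.000Z", total_rows = 20, documents_added = 250),
]

# Row-wise access: iterate rows and read columns by name.
for row in Tables.rows(result)
    println(Tables.getcolumn(row, :timestamp), " => ", Tables.getcolumn(row, :total_rows))
end

# Column-wise access: materialize a NamedTuple of column vectors.
cols = Tables.columntable(result)
total_added = sum(cols.documents_added)
println("documents added: ", total_added)
```

Row-wise iteration suits streaming over a result once, while `Tables.columntable` is convenient when whole columns are needed for aggregation or plotting.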

An Sql query returns its result as either a Druid.SqlResult{ResultFormat} or a CSV.File, depending on the resultFormat set on the query. Both types implement the Tables.jl interface.

The TimeBoundary, SegmentMetadata and DatasourceMetadata queries return their results as Dicts instead.