Skip to content

A Python framework for defining and querying BI models in your data warehouse

License

Notifications You must be signed in to change notification settings

hashboard-hq/hashquery

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

9 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Hashquery

Hashquery is a Python framework for defining and querying BI models in your data warehouse.

Hashquery expressions are defined in Python, compiled into SQL, and run directly against your data warehouse. It is capable of expressing complex, multi-layered data queries, way beyond the capabilities of standard SQL. It natively integrates with upstream semantic layers and can be used together with Hashboard as a headless BI interface.

Hashquery is currently in early beta. This means we’ll do our best to keep it stable, but there are significant caveats to keep in mind as you try it out:

  • Backwards compatibility will not be preserved between version upgrades, and you may need to update your package regularly to keep things working.
  • You may encounter dialect-specific SQL syntax errors. If this happens, let us know by reporting an issue.
  • There may be differences in the SQL logic generated by hashquery compared to the Hashboard app.
  • Some errors might be hard to debug on your own (but we're happy to help!)

Project overview and live examples: https://hashquery.dev/

Full documentation: https://hashquery.dev/docs

Getting started

Install hashquery with:

pip install hashquery

You can verify everything's working by running models against the demo project:

>>> from hashquery.demo import demo_project
>>> demo_project.models.products.limit(5).run().df
                     id pizza_size pizza_shape  pizza_type  price
0  14278024243148112051      Large       Round      Custom  13.20
1   9154428932967574098     Medium       Round      Custom  11.00
2   2699960497371169210      Large      Square  Margherita  11.52
3  12910763551309720704     Medium       Round  Margherita   8.00
4   1363512534661457390     Medium      Square      Veggie  13.20

A full guide to querying your own datasources can be found in the documentation.

Examples

from hashquery import attr, column, func
from hashquery.demo import demo_project

# Simple metric
sales = demo_project.models.sales
sales_by_year = (
    sales.aggregate(
        groups=[column("timestamp").by_year],
        measures=[func.count()],
    )
    .sort(column("timestamp"))
)
print(sales_by_year.run().df)
"""
   timestamp  count
0 2018-01-01   1025
1 2019-01-01   1190
2 2020-01-01   2720
3 2021-01-01   4941
4 2022-01-01   4018
5 2023-01-01   3589
6 2024-01-01    642
"""
# Funnel analysis
events = demo_project.models.events_model
funnel = (
  events
	.with_activity_schema(
      group=attr.user_id,
      timestamp=attr.timestamp,
      event_key=attr.event_type
  )
  .funnel(
	  "ad_impression",
	  "add_to_cart",
	  "buy")
)
print(funnel.run().df)
"""
            step  count
0          count    501
1  ad_impression    495
2    add_to_cart    465
3            buy    264
"""

Supported integrations

During the beta, hashquery integrates with Hashboard to compile and execute SQL queries and supports the following databases:

  • BigQuery
  • Snowflake
  • PostgreSQL
  • MotherDuck / DuckDB
  • Redshift
  • Databricks
  • AWS Athena
  • ClickHouse

FAQ

  • How does hashquery fetch data? Hashquery models are compiled into SQL and run directly against your data warehouse. The backend caches results for performance, but there are no intermediate datastores — the computation all happens on your own infrastructure.
  • How does this differ from other analytics DSLs like LookML or Malloy? We built hashquery because we loved the expressiveness and power of data modeling languages like Malloy but wanted to avoid the portability and developer experience drawbacks of custom DSLs. Hashquery is a python package that can run anywhere, and it can be used as a querying language, a semantic layer, or a headless BI API — or all three at once.
  • Do I need to be a Hashboard user to use hashquery? During the beta, the hashquery SQL compiler is not available to run locally, so you do need to define your data connections inside of Hashboard and use its API to execute your queries. But we do plan on making the full hashquery stack available to run locally in the near future.

About

A Python framework for defining and querying BI models in your data warehouse

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages