-
Hi! This is a very interesting idea, and the performance of databend is very impressive according to the report 🚀. As you said, both databend and datafusion have memory models based on Arrow, so there is not much difference between us there. However, I'm not familiar with databend's execution engine, so I don't know how much it differs from datafusion in functionality and data type support. Also, although we have abstracted the execution engine, there are still many places where we are tied to DataFusion, so switching (or adding support for a second engine) would require a lot of work from both parties. Still, I'm very optimistic about this proposal. Let's keep communicating on the details ❤️!
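To make the "abstracted execution engine" point concrete, here is a minimal sketch of what an engine-agnostic boundary could look like. Everything in it is hypothetical: `QueryEngine`, `EngineRegistry`, `EngineA`, and `EngineB` are illustrative names, not APIs of GreptimeDB, DataFusion, or Databend, and summing a column merely stands in for executing a real plan.

```rust
use std::collections::HashMap;

/// Hypothetical engine-agnostic interface (an assumption for illustration,
/// not an API of GreptimeDB, DataFusion, or Databend).
pub trait QueryEngine {
    /// Name used to look the engine up in the registry.
    fn name(&self) -> &str;
    /// Sum a column of i64 values; a stand-in for "execute a query plan".
    fn sum(&self, column: &[i64]) -> i64;
}

/// Stand-in for an engine backed by one library.
pub struct EngineA;
/// Stand-in for an engine backed by a second library.
pub struct EngineB;

impl QueryEngine for EngineA {
    fn name(&self) -> &str {
        "engine_a"
    }
    fn sum(&self, column: &[i64]) -> i64 {
        column.iter().sum()
    }
}

impl QueryEngine for EngineB {
    fn name(&self) -> &str {
        "engine_b"
    }
    fn sum(&self, column: &[i64]) -> i64 {
        // Different internals, same observable result.
        column.iter().fold(0, |acc, v| acc + v)
    }
}

/// Callers depend only on the trait, so adding a second engine
/// does not change their code.
pub struct EngineRegistry {
    engines: HashMap<String, Box<dyn QueryEngine>>,
}

impl EngineRegistry {
    pub fn new() -> Self {
        Self { engines: HashMap::new() }
    }

    /// Register an engine under its own name.
    pub fn register(&mut self, engine: Box<dyn QueryEngine>) {
        self.engines.insert(engine.name().to_string(), engine);
    }

    /// Look up an engine by name; returns None if it was never registered.
    pub fn get(&self, name: &str) -> Option<&dyn QueryEngine> {
        self.engines.get(name).map(|e| e.as_ref())
    }
}
```

The hard part in practice is the code that bypasses such a trait and talks to one engine's types directly; those are the places that would need the most work.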
-
The query engine and pipeline framework in Databend look promising. If we were to adopt them, where should we start? @sundy-li
-
Cool! I'm very excited about this. @sundy-li

GreptimeDB

There are some blockers that greptimedb needs to resolve or work around:

Databend

There are also some points that databend might need to consider:

Share People's Efforts

Recently, I'm going to re-design and refactor our type system, including:

I just realized that there might be some other projects repeating the same work:

Could we implement something like query engine building blocks, such as the type system, expression framework, and execution framework (like velox)? That way we could combine people's efforts and eliminate some repeated work. DataFusion tries to solve similar problems, but it's highly coupled with the official arrow API.

Conclusion

It'd be great if users could have more choices. Maybe extracting databend's type system into an independent library would be a good starting point. I can also help bridge arrow and arrow2.
-
Hello, greptimedb members.
I noticed that greptimedb's query engine is powered by datafusion, which is similar to the approach taken by influxdb-iox.

Databend has a high-performance computing engine based on the Arrow memory model. I saw that greptimedb depends on opendal, and that its type system and expressions were modeled on an older version of databend's.

So I am wondering if we can build the computation layer of greptimedb on top of it.

Advantages:

- We have a high-performance computing engine with expression, function, and aggregate function support (see clickbench for reference), so greptimedb can focus on optimizing for time-series scenarios.
- Greptimedb can reuse the pipeline framework of databend.

Disadvantages: