Refactor to a distributed kv backend? #202

Closed
jh125486 opened this issue Oct 30, 2017 · 8 comments

Comments

@jh125486

Has anyone looked at libkv?
It just abstracts a key value store, be it local (BoltDB) or distributed (Consul, Etcd, Zookeeper).

While not a full solution, it might be a good enough stopgap between maintaining one monolithic binary and distributing the kv store. (libkv is licensed Apache 2.0)
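To make the abstraction idea concrete, here is a minimal sketch of what coding against a key-value interface could look like. The `Store` interface, `MemStore`, and `SaveContent` are hypothetical names made up for illustration; this is not libkv's actual API, and the in-memory backend merely stands in for BoltDB, Consul, Etcd, or Zookeeper.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrKeyNotFound is returned when a key has no value in the store.
var ErrKeyNotFound = errors.New("key not found")

// Store is a hypothetical minimal key-value interface. It is NOT libkv's
// real API, only an illustration of hiding the backend behind an interface.
type Store interface {
	Put(key string, value []byte) error
	Get(key string) ([]byte, error)
	Delete(key string) error
}

// MemStore is an in-memory stand-in for a real local or distributed backend.
type MemStore struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func NewMemStore() *MemStore {
	return &MemStore{data: make(map[string][]byte)}
}

func (m *MemStore) Put(key string, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	// Copy the value so later changes by the caller do not leak into the store.
	m.data[key] = append([]byte(nil), value...)
	return nil
}

func (m *MemStore) Get(key string) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[key]
	if !ok {
		return nil, ErrKeyNotFound
	}
	return append([]byte(nil), v...), nil
}

func (m *MemStore) Delete(key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.data, key)
	return nil
}

// SaveContent depends only on the Store interface, so the backend can be
// swapped without touching the calling code.
func SaveContent(s Store, id string, body []byte) error {
	return s.Put("content:"+id, body)
}

func main() {
	var s Store = NewMemStore()
	if err := SaveContent(s, "1", []byte(`{"title":"hello"}`)); err != nil {
		panic(err)
	}
	v, err := s.Get("content:1")
	fmt.Println(string(v), err)
}
```

With this shape, a single-node user could keep a local backend while a distributed one is opted into by providing another `Store` implementation.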

@nilslice
Contributor

I haven't, but will give it a look. How would you foresee this fitting in to Ponzu?

@jh125486
Author

I can take a look this weekend into refactoring the Bolt reads/writes to libkv instead.
Going forward I think that would alleviate a lot of the concern around a single point of failure for the API (DB node going offline).

@nilslice
Contributor

That would be a huge improvement -- especially if it can be opted into. I see a lot of use cases where a micro-service approach could use a more available data layer, but many Ponzu instances I've created or have been shown are pretty simple, single node applications and I'd like to continue building for that kind of user. Ideally we can support both, but the goal is to make Ponzu simple and usable for Go programmers of all levels.

Thank you for exploring this -- I really like the idea.

@ghost

ghost commented Jan 20, 2018

Part of the reason for using BoltDB is that bleve, the free-text search system that is part of Ponzu, needs BoltDB.

But what if we split the storage DB from the indexing DB? I think that would give a much better architecture:

  • pipelined, so that writes to each DB are eventually consistent
  • opens up the ability to store mutations and rebuild from those
  • use Badger / Dgraph to get multi-master HA, and BoltDB for indexing. Because the first stage of the pipeline would hit the Dgraph / Badger storage layer, which distributes and replicates the data, the second stage of the pipeline (the indexing) would automatically get replicated too. A pretty powerful solution.

I think Ponzu could add a small layer to do the pipelining without going overboard with a control plane like NATS. This is because Ponzu has a central core that everything passes through, so it can do the pipelining into the two databases.
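A rough sketch of that two-stage pipeline, under heavy assumptions: `Core`, `Mutation`, and all method names below are hypothetical, plain maps stand in for the storage and index databases, and the toy word index ignores overwrites and deletes. It only shows the shape: stage one records the mutation and writes to storage, stage two catches the index up later, and the index can be rebuilt by replaying the log.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Mutation is one recorded write; the mutation log is the source of truth.
type Mutation struct{ Key, Value string }

// Core is a hypothetical stand-in for the central point all writes pass through.
type Core struct {
	log     []Mutation                     // append-only mutation log
	storage map[string]string              // stand-in for the storage DB
	index   map[string]map[string]struct{} // word -> set of keys (toy index DB)
	applied int                            // log entries already indexed
}

func NewCore() *Core {
	return &Core{
		storage: make(map[string]string),
		index:   make(map[string]map[string]struct{}),
	}
}

// Write is stage one: record the mutation and update storage only.
func (c *Core) Write(key, value string) {
	c.log = append(c.log, Mutation{Key: key, Value: value})
	c.storage[key] = value
}

// SyncIndex is stage two: apply not-yet-indexed mutations to the index.
// It returns how many mutations were applied.
func (c *Core) SyncIndex() int {
	n := 0
	for ; c.applied < len(c.log); c.applied++ {
		m := c.log[c.applied]
		for _, w := range strings.Fields(strings.ToLower(m.Value)) {
			if c.index[w] == nil {
				c.index[w] = make(map[string]struct{})
			}
			c.index[w][m.Key] = struct{}{}
		}
		n++
	}
	return n
}

// RebuildIndex throws the index away and replays the whole mutation log.
func (c *Core) RebuildIndex() {
	c.index = make(map[string]map[string]struct{})
	c.applied = 0
	c.SyncIndex()
}

// Search returns the sorted keys whose values contained the given word.
func (c *Core) Search(word string) []string {
	var keys []string
	for k := range c.index[strings.ToLower(word)] {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	c := NewCore()
	c.Write("post:1", "Hello Ponzu")
	fmt.Println(c.Search("ponzu")) // index lags behind storage here
	c.SyncIndex()
	fmt.Println(c.Search("ponzu")) // index has caught up
}
```

In a real system the second stage would likely run asynchronously; it is kept as an explicit call here so the lag between the two stores is easy to see.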

This also opens the door to adding other specialised databases and, of course, exposing them through the Ponzu API.
For example, there is an amazingly fast indexer for structured data called Pilosa that is written in Go and could be used by the code generator and the API. Pilosa can run very fast queries on data.

In the end I guess I am describing a semi-CQRS-style architecture where the multi-master Dgraph database is your primary storage layer, which can also hold mutations and so make sure events fired into other DB stores are resilient. Normally people put a message queue with storage, like NATS, on top of many databases / microservices. NATS is amazing but perhaps too heavy.

Also, it would at the very least give Ponzu an HA multi-master DB that can do pretty much anything. The Ponzu concept of references would have to change, because you now get edges and nodes as the way things reference each other. This is more flexible.

Anyway this was a long stream of an idea about changing databases :)

@ghost

ghost commented Jan 20, 2018

Oh, I forgot: Dgraph uses bleve for full-text search and facets, by the way. Bonus.

@nilslice
Contributor

Interesting ideas, @gedw99. I think we'd all benefit from an enhanced DB architecture and to support HA across more nodes. I haven't had a chance to evaluate the projects you mentioned but fully intend to. Thanks for keeping an eye on this!

@ghost

ghost commented Jan 22, 2018

Control plane, pub sub, cqrs
https://github.com/nats-io/go-nats

  • Runs as its own service, by the way, rather than as a lib.
  • Uses a simple text-based protocol over TCP (you can talk to it with telnet) as its core communication layer.
  • Can wrap the API with WebSockets, gRPC, etc. as needed.
  • Read about the difference between orchestration and choreography to really understand the WHY around all this. In a nutshell, you can produce a pipeline using either design pattern. Orchestration is when you code the pipeline yourself (like "do x and then do y"). Choreography is when you get different systems to publish and subscribe to events (type X subscribes to type Y's created-item event, type Z subscribes to type X's deleted event, and turtles all the way down, as they say), so that through everyone subscribing to everyone else a pipeline EMERGES. It's an emergent design pattern, if you want to think about it at the top level.
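The orchestration versus choreography distinction can be sketched in a few lines of Go. Everything here is hypothetical: `Bus` is a tiny synchronous in-process pub/sub standing in for a broker such as NATS (it does not use the NATS client API), and the topic names are made up.

```go
package main

import "fmt"

// Event is a hypothetical message: a topic plus a payload.
type Event struct {
	Topic   string
	Payload string
}

// Bus is a minimal synchronous in-process pub/sub, a stand-in for a broker.
type Bus struct {
	handlers map[string][]func(Event)
}

func NewBus() *Bus {
	return &Bus{handlers: make(map[string][]func(Event))}
}

// Subscribe registers a handler for a topic.
func (b *Bus) Subscribe(topic string, h func(Event)) {
	b.handlers[topic] = append(b.handlers[topic], h)
}

// Publish calls every handler subscribed to the event's topic, in order.
func (b *Bus) Publish(e Event) {
	for _, h := range b.handlers[e.Topic] {
		h(e)
	}
}

// Orchestrated spells the pipeline out in one place: do x, then do y.
func Orchestrated(item string) []string {
	var steps []string
	steps = append(steps, "store "+item)
	steps = append(steps, "index "+item)
	return steps
}

// Choreographed wires up independent subscribers. No function lists the
// steps; the pipeline emerges from who subscribes to which event.
func Choreographed(item string) []string {
	var steps []string
	bus := NewBus()
	bus.Subscribe("item.created", func(e Event) {
		steps = append(steps, "store "+e.Payload)
		bus.Publish(Event{Topic: "item.stored", Payload: e.Payload})
	})
	bus.Subscribe("item.stored", func(e Event) {
		steps = append(steps, "index "+e.Payload)
	})
	bus.Publish(Event{Topic: "item.created", Payload: item})
	return steps
}

func main() {
	fmt.Println(Orchestrated("post:1"))
	fmt.Println(Choreographed("post:1"))
}
```

Both functions perform the same two steps in the same order; the difference is that adding a third step to the choreographed version only needs one more subscriber, not a change to a central function.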

Crazy fast, adaptable indexer for structured data
https://github.com/pilosa

Dgraph code that handles the indexer for unstructured data, covering what is often called "full-text search (FTS)" and "faceted search"
code: https://github.com/dgraph-io/dgraph/blob/master/tok/fts.go

In a nutshell, facets allow you to search across all your data by slicing it. It's like shopping for a TV on Amazon and choosing filters on the left: LED, 50 to 60 inches. Or, applied to big data, you can ask for all users aged between 10 and 20, female, living in Australia, and then as you get a result set the facets on the left CHANGE to show the sub-facets you can choose. This is really the unique thing about facets: you get a result set, and then all the possible facets update on the left of the GUI. It's very intuitive and allows normal users to query very complex and diverse data. It seems very nice for Ponzu, because Ponzu is all about creating types and then using them.
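As a small illustration of how facet counts follow the result set, here is a sketch using plain maps. `Item`, `Filter`, and `Facets` are hypothetical names, and this is not how bleve or Dgraph implement faceting; it only shows that the facets are recomputed from whatever the current filters return.

```go
package main

import "fmt"

// Item is a record described by named attributes, e.g. {"type": "LED"}.
type Item map[string]string

// Filter returns the items whose attributes match every key/value in sel.
func Filter(items []Item, sel map[string]string) []Item {
	var out []Item
	for _, it := range items {
		match := true
		for k, v := range sel {
			if it[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, it)
		}
	}
	return out
}

// Facets counts, per attribute, how many items in the given result set
// carry each value. Calling it after every Filter makes the facets update.
func Facets(items []Item) map[string]map[string]int {
	counts := make(map[string]map[string]int)
	for _, it := range items {
		for k, v := range it {
			if counts[k] == nil {
				counts[k] = make(map[string]int)
			}
			counts[k][v]++
		}
	}
	return counts
}

func main() {
	tvs := []Item{
		{"type": "LED", "size": "50-60"},
		{"type": "LED", "size": "40-50"},
		{"type": "OLED", "size": "50-60"},
	}
	fmt.Println(Facets(tvs)) // facets over everything

	led := Filter(tvs, map[string]string{"type": "LED"})
	fmt.Println(Facets(led)) // facets narrowed to the LED result set
}
```

After the `type = LED` filter, the size facet only reflects the remaining LED sets, which is the "facets change with the result set" behaviour described above.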

@olliephillips
Contributor

Closing this issue. No activity in 12 months. Please feel free to reopen if needed.


3 participants