Implementing Event storage/search with a timeseries database or a lucene indexed database #110

Open
farodin91 opened this Issue Nov 3, 2016 · 9 comments

5 participants
@farodin91
Member

farodin91 commented Nov 3, 2016

Possible Databases

  • Elasticsearch (Lucene)
  • InfluxDB (time series)
  • Cassandra (clustered database)
  • TiKV (key-value store)

I would like to hear your ideas.
As a first step, we could encapsulate event handling a bit more.
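One way to picture that encapsulation is a small storage trait that the rest of the server talks to, so that a different database could be swapped in later. This is only a sketch: `StoredEvent`, `EventStore`, and `MemoryEventStore` are hypothetical names for illustration, not existing Ruma types, and the in-memory implementation stands in for whichever backend is eventually chosen.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical, simplified event record (assumption: real events carry more fields).
#[derive(Clone, Debug, PartialEq)]
pub struct StoredEvent {
    pub room_id: String,
    /// Assumed to be a unique, monotonically increasing stream position.
    pub ordering: u64,
    pub body: String,
}

/// Hypothetical boundary between the server and its event storage.
/// A database-backed store would implement the same three methods.
pub trait EventStore {
    /// Persist a single event.
    fn insert(&mut self, event: StoredEvent);
    /// Return events in `room_id` with an ordering strictly greater than `since`.
    fn events_since(&self, room_id: &str, since: u64) -> Vec<StoredEvent>;
    /// Return events whose body contains `term` (naive substring match).
    fn search(&self, term: &str) -> Vec<StoredEvent>;
}

/// In-memory stand-in, keyed by ordering so range scans stay sorted.
#[derive(Default)]
pub struct MemoryEventStore {
    events: BTreeMap<u64, StoredEvent>,
}

impl EventStore for MemoryEventStore {
    fn insert(&mut self, event: StoredEvent) {
        self.events.insert(event.ordering, event);
    }

    fn events_since(&self, room_id: &str, since: u64) -> Vec<StoredEvent> {
        self.events
            .range((Bound::Excluded(since), Bound::Unbounded))
            .map(|(_, event)| event)
            .filter(|event| event.room_id == room_id)
            .cloned()
            .collect()
    }

    fn search(&self, term: &str) -> Vec<StoredEvent> {
        self.events
            .values()
            .filter(|event| event.body.contains(term))
            .cloned()
            .collect()
    }
}
```

With a boundary like this, the choice between the databases listed above becomes a question of which implementation sits behind the trait, and callers such as the sync handler would not need to change.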

@farodin91 farodin91 changed the title from Implementing Event storage and search with a timeseries database or a lucene indexed database to Implementing Event storage/search with a timeseries database or a lucene indexed database Nov 20, 2016

@sphinxc0re

sphinxc0re commented Nov 20, 2016

Would part of the event lookup then move into ruma/ruma-events?

@farodin91

Member

farodin91 commented Nov 20, 2016

For now, I don't think we should move it into ruma-events, because that repo is used to define structures that can be used by both clients and servers.

@sphinxc0re

sphinxc0re commented Nov 20, 2016

Also, I learned from working with InfluxDB that time-series DBMSs work best when they are fed data at a rate of roughly one data set every 5 seconds to 5 minutes.

@sphinxc0re

sphinxc0re commented Nov 20, 2016

I don't think this is the case with these events, so I find this a little overkill.

@mujx

Contributor

mujx commented Nov 20, 2016

What would be the benefit of this?

This seems like over-engineering to me, at least at this point. Also, adding an extra dependency will create problems with deployment. Synapse works fine without it.

@farodin91

Member

farodin91 commented Nov 20, 2016

I think using Elasticsearch could massively reduce the complexity of sync. It could also improve performance.

@sphinxc0re

sphinxc0re commented Nov 20, 2016

What about making it possible to choose, either at startup or through the config file, whether event processing should be done through Elasticsearch, Redis, or another backend?
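A rough sketch of what that selection could look like, assuming a startup flag takes precedence over the config file and that the server falls back to its built-in storage when neither is set. The `EventBackend` enum, the `resolve_backend` function, and the accepted strings are all made up for illustration; nothing like this exists in Ruma's configuration today.

```rust
use std::str::FromStr;

/// Hypothetical set of event-processing backends.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EventBackend {
    /// Whatever storage the server already uses (assumed default).
    BuiltIn,
    Elasticsearch,
    Redis,
}

impl FromStr for EventBackend {
    type Err = String;

    /// Parse a backend name case-insensitively, ignoring surrounding whitespace.
    fn from_str(value: &str) -> Result<Self, Self::Err> {
        match value.trim().to_ascii_lowercase().as_str() {
            "builtin" => Ok(EventBackend::BuiltIn),
            "elasticsearch" => Ok(EventBackend::Elasticsearch),
            "redis" => Ok(EventBackend::Redis),
            other => Err(format!("unknown event backend: {}", other)),
        }
    }
}

/// Pick the backend: a startup flag wins over the config file value,
/// and the built-in storage is used when neither is given.
pub fn resolve_backend(
    startup_flag: Option<&str>,
    config_value: Option<&str>,
) -> Result<EventBackend, String> {
    match startup_flag.or(config_value) {
        Some(raw) => raw.parse(),
        None => Ok(EventBackend::BuiltIn),
    }
}
```

Keeping the built-in storage as the default would address the deployment concern raised earlier in the thread, since an extra service is only needed when an operator opts in.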

@jimmycuadra

Member

jimmycuadra commented Nov 22, 2016

When I was originally trying to decide on the primary data store for Ruma, I was strongly considering RethinkDB, as its concept of a client subscribing to updates on a query seemed like a great fit for Matrix's /sync endpoint. Since then, the company behind RethinkDB has gone out of business, which is a real shame, but there is a community effort to keep the project going. I'm definitely supportive of the idea of using a data store that better fits the use case for Ruma. I would prioritize homeserver performance over operational/deployment complexity. We can worry about how to make deployment easy for layman users when we start writing docs about deployment. I'm more concerned with Ruma being able to support a homeserver with a huge number of users than I am about making it easy for a layman to deploy it in the simplest case.

@skade

skade commented Dec 12, 2016

How would Elasticsearch reduce the complexity of sync? None of the mentioned products are particularly good at syncing.
