Additional storage backends #638

Open
yurishkuro opened this issue Jan 8, 2018 · 23 comments

Comments

Member

@yurishkuro yurishkuro commented Jan 8, 2018

Opening this issue to keep track of other related issues.

Relevant issue: plugin support #422 (done).


@nbettiol nbettiol commented Jan 9, 2018

Did you remove the Elasticsearch flags from jaeger-collector? I'm testing with the Docker image, whose version is:

{"gitCommit":"dbd5db721fc59431b1e64874cc7d6265d89ec917","GitVersion":"v1.1.0","BuildDate":"2018-01-08T21:56:21Z"}

and I cannot see the elasticsearch flags.

Collaborator

@black-adder black-adder commented Jan 9, 2018

It looks like you're using latest instead of 1.1. We recently moved some of the flags around so that we can better support plugins (#625). With latest, you have to set the env variable SPAN_STORAGE=elasticsearch to get the elasticsearch flags. I'd recommend that you use 1.1, since this change will be a part of 1.2 and will be documented at that time.
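
For readers unfamiliar with this pattern, here is a minimal Go sketch of choosing a storage backend from the SPAN_STORAGE environment variable mentioned above. It is not Jaeger's actual code: the function name `selectBackend`, the list of known backends, and the Cassandra default are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// storageEnvVar is the variable named in the comment above.
const storageEnvVar = "SPAN_STORAGE"

// defaultBackend is an assumption for this sketch, used when the variable is unset.
const defaultBackend = "cassandra"

// knownBackends is a hypothetical registry; a real binary would map each
// name to a factory that registers that backend's command-line flags.
var knownBackends = map[string]bool{
	"cassandra":     true,
	"elasticsearch": true,
}

// selectBackend reads the backend name through getenv, falls back to the
// default when the variable is empty, and rejects unknown names.
func selectBackend(getenv func(string) string) (string, error) {
	name := getenv(storageEnvVar)
	if name == "" {
		name = defaultBackend
	}
	if !knownBackends[name] {
		return "", fmt.Errorf("unknown storage backend %q", name)
	}
	return name, nil
}

func main() {
	backend, err := selectBackend(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("selected storage backend:", backend)
}
```

Passing `getenv` as a parameter keeps the selection logic testable without touching the process environment.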


@nbettiol nbettiol commented Jan 9, 2018

Thanks for the reply. Yes, I was using the latest version. I will use 1.1.


@fzakaria fzakaria commented Jan 16, 2018

I would love to see a SQL option (whichever ANSI SQL subset causes the least vendor lock-in).
Setting up Cassandra / ElasticSearch might be too ambitious for projects that want distributed tracing but honestly don't have the TPS to warrant a distributed datastore.


@ringerc ringerc commented Feb 3, 2018

Since I work with PostgreSQL, I sure wouldn't complain. But honestly I'm not sure a SQL db is an optimal store for largely free-form metrics of this nature. PostgreSQL at least offers the jsonb type for indexable free-form data. If you're trying to do this in a vendor neutral way you'll land up with your own json blobs, or doing EAV, and both of those are terrible. ANSI SQL is a poor fit for variable-structured or key/value form data and you'll need some vendor extensions to get usable performance.
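
To make the two vendor-neutral options concrete, the following Go sketch shows the same span tags laid out as EAV rows and as a single JSON blob. The `eavRow` type, the function names, and the sample tags are hypothetical and only illustrate the shape of the data, not any proposed Jaeger schema.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// eavRow is one row of a hypothetical entity-attribute-value table
// with columns (span_id, key, value).
type eavRow struct {
	SpanID string
	Key    string
	Value  string
}

// toEAV flattens the tags into one row per tag, sorted by key so the
// output is deterministic.
func toEAV(spanID string, tags map[string]string) []eavRow {
	keys := make([]string, 0, len(tags))
	for k := range tags {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	rows := make([]eavRow, 0, len(keys))
	for _, k := range keys {
		rows = append(rows, eavRow{SpanID: spanID, Key: k, Value: tags[k]})
	}
	return rows
}

// toBlob serializes all tags into a single JSON document, as would be
// stored in one text column per span.
func toBlob(tags map[string]string) (string, error) {
	b, err := json.Marshal(tags)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	tags := map[string]string{"http.method": "GET", "http.status_code": "200"}
	for _, r := range toEAV("span-1", tags) {
		fmt.Printf("%s | %s | %s\n", r.SpanID, r.Key, r.Value)
	}
	blob, err := toBlob(tags)
	if err != nil {
		panic(err)
	}
	fmt.Println(blob)
}
```

With EAV, a query on several tags needs one self-join per tag; with an opaque blob, the database cannot index individual tags without a vendor extension such as jsonb.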

But you inevitably land up with someone putting an ORM on top to "abstract" the DB. Then the ORM performs terribly, gobbles memory and everyone says "the SQL backend is slow, use instead".

Member

@pavolloffay pavolloffay commented Feb 5, 2018

Related issue to this one is #551. Upvote if you are interested in it.


@SwarnimRaj SwarnimRaj commented Jun 29, 2018

New related issue:
Files: #894


@wy100101 wy100101 commented Aug 1, 2018

We are looking at using BigQuery as a storage layer. Presumably this could work with a SQL storage option. SQL can be a generic way to deal with columnar data stores. I would complain about a BigQuery-specific solution, but I think there is a place for a generic SQL interface beyond RDBs.

Member Author

@yurishkuro yurishkuro commented Aug 1, 2018

I assume that even if some database can be treated as SQL and accessed via the standard database/sql API, we still need to statically import the actual driver. Granted, this may be less maintenance than a dedicated SpanStorage implementation. However, now that the protobuf model has been merged, nothing is blocking us from moving on the storage plugin dev, e.g. using something like the HashiCorp gRPC plugin framework.
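
The static-import point can be seen with the standard library alone. In this sketch (the helper names `openStore` and `driverRegistered` and the placeholder DSN are assumptions), no driver package is imported, so `sql.Open` fails for "postgres". A real binary would need a blank import of a driver package, for example `_ "github.com/lib/pq"`, compiled in for every database it wants to support.

```go
package main

import (
	"database/sql"
	"fmt"
)

// driverRegistered reports whether a driver with the given name was
// compiled into this binary and registered with database/sql.
func driverRegistered(name string) bool {
	for _, d := range sql.Drivers() {
		if d == name {
			return true
		}
	}
	return false
}

// openStore wraps sql.Open; it returns an error when the named driver
// has not been registered via an import.
func openStore(driverName, dsn string) (*sql.DB, error) {
	db, err := sql.Open(driverName, dsn)
	if err != nil {
		return nil, fmt.Errorf("open %s: %w", driverName, err)
	}
	return db, nil
}

func main() {
	fmt.Println("postgres driver registered:", driverRegistered("postgres"))
	// The DSN is a placeholder; Open fails before it is ever parsed
	// because no driver named "postgres" is linked into this program.
	if _, err := openStore("postgres", "placeholder-dsn"); err != nil {
		fmt.Println("error:", err)
	}
}
```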

Contributor

@isaachier isaachier commented Aug 1, 2018

Our model is sufficiently simple to warrant looking into using an ORM to support a large number of backends. I'll take a look at what's available. Reread above and understand what @yurishkuro means.


@bruth bruth commented Aug 6, 2018

Giving my two cents: an ANSI SQL backend could work for small workloads, so it may be useful for lower-throughput applications that still want to benefit from this tool.

I will also throw out there that Timescale (a Postgres extension) may be a good fit for the required high write throughput.


@mcarbonneaux mcarbonneaux commented May 27, 2019

ClickHouse is a high-performance SQL store that is very efficient for log and trace storage, and it would be a perfect alternative to the original Cassandra backend. It is a true column-oriented database: distributed and compressed.

Its query language is close to CQL (an SQL-like query language); ClickHouse uses an SQL-like language too.

https://clickhouse.yandex/

Contributor

@chvck chvck commented May 31, 2019

I just thought that I'd drop something here to say that there is also support for using Couchbase as a storage backend (via the grpc plugin), currently at https://github.com/chvck/couchbase-jaeger-storage-plugin. Will likely move to the couchbase-labs organisation in time.


@omerlh omerlh commented Jul 18, 2019

Has someone started to work on Azure CosmosDB integration? It has support for Cassandra API, but I couldn't manage to make it work...


@rleiwang rleiwang commented Oct 2, 2020

I just created an issue proposing Chronowave as storage backend. #2534


@DjinNO DjinNO commented Oct 15, 2020

What about ClickHouse? Clickhouse is very cool

Member

@jpkrohling jpkrohling commented Oct 15, 2020

> What about ClickHouse? Clickhouse is very cool

It's already linked in the issue's description, but here's the tracking issue for it: #1438


@robross0606 robross0606 commented Jan 15, 2021

What about Apache Solr?


@robross0606 robross0606 commented Jan 15, 2021

With the recent changes to ElasticSearch licensing, this just became SUPER important.

Member

@jpkrohling jpkrohling commented Jan 19, 2021

The ES changes do look worrying, but not sure this justifies supporting Solr. It does help the case of advancing with some other storage.

Member Author

@yurishkuro yurishkuro commented Jan 22, 2021

AWS announced an Apache-2 licensed fork of ES, logz.io also said something similar (could be the same effort). So I don't think there's reason to panic.

I don't think Elastic changed the terms for the Go driver which we started using in OTEL-based collector, but we'd need to watch for that.

Member

@jpkrohling jpkrohling commented Jan 22, 2021

> So I don't think there's reason to panic.

Absolutely, especially because they can't change the license for something that was released already. So, folks currently using ES don't have a reason to change immediately. If they need to update for some reason, like due to a security problem or general bug fix, then it might become problematic.

While I appreciate the work that logz.io is doing on this front, having multiple sources of ES doesn't help us in supporting our users. On day 1, they'll all be compatible with each other, but each fork will tend to follow its own path over time. Meaning: we'd need to decide which one we'll be "officially" supporting.


@jkowall jkowall commented Jan 22, 2021

We are working with several organizations including AWS on an open source version of ES and Kibana which will be Apache licensed and hopefully part of the ASF. I will know more in the coming days.

I've had a hard time getting RedHat engaged (they have canceled twice now) so if you can help with that @jpkrohling then we'd love it :)

Anyone interested in contributing or taking part in the community is welcome. I am collecting info here to start: https://docs.google.com/forms/d/e/1FAIpQLSfykAk4Bhc-dhjR0AXFP7T2oFmsLUxONbD6NwmgMz4usXSGkw/viewform?usp=sf_link
