Loki: Allow configuring query_store_max_look_back_period when running a filesystem store and boltdb-shipper #2073

slim-bean · 2020-05-14T02:37:32Z

This should allow horizontal scaling of a filesystem store on separate nodes as the ingesters can query the store

Signed-off-by: Ed Welch edward.welch@grafana.com

…esystem store, which allows for boltdb-shipper to horizontally scale filesystem store. Signed-off-by: Ed Welch <edward.welch@grafana.com>

sandeepsukhani

Overall the code looks good to me. Just some minor nits and questions.

docs/operations/storage/boltdb-shipper.md

sandeepsukhani · 2020-05-14T06:35:23Z

docs/operations/storage/boltdb-shipper.md

+
+Scaling up is as easy as adding more loki instances and letting them talk to the same ring.
+
+Scaling down is possible but manual, you would need to shutdown the loki instance and then physically copy the chunks directory and its index files in their entirety to another Loki instance.


I think the other instance would also have index files with same name so it would not be just copying of files. I guess we would have to provide a tool to merge boltdb files. We will have to make it clear in the docs until we have that tool.
What do you think?

Each ingester should be writing files with a unique name such that you should be able to copy ones file to another without conflicts

uploader := fmt.Sprintf("%s-%d", s.cfg.IngesterName, time.Now().Unix())

However, I think I may have found a bug here, this is what I see for my current test ingesters, it looks like the lifecycler ID was empty when they created their files

ed@ed-VirtualBox:/tmp$ ls loki1/chunks/index/index_2628 -1589422114 ed@ed-VirtualBox:/tmp$ ls loki2/chunks/index/index_2628 -1589422113

These still wouldn't collide because the timestamps were different but they should also have the unique lifecycler ID in front

My suspicion here is they created the files before joining the ring? Can you take a look at this?

Looking at how this works, there is a real chicken and egg problem here when initializing the shipper and trying to get the ingester ID to use in the name file.

I'm not sure it's possible to do this, but maybe you have some ideas?

I'm wondering instead if we should try a different strategy and use the hostname or allow for the name to be configured in the yaml directly?

sandeepsukhani · 2020-05-14T06:39:26Z

pkg/ingester/ingester.go

@@ -84,6 +84,7 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) {
 	f.Float64Var(&cfg.SyncMinUtilization, "ingester.sync-min-utilization", 0, "Minimum utilization of chunk when doing synchronization.")
 	f.IntVar(&cfg.MaxReturnedErrors, "ingester.max-ignored-stream-errors", 10, "Maximum number of ignored stream errors to return. 0 to return all errors.")
 	f.DurationVar(&cfg.MaxChunkAge, "ingester.max-chunk-age", time.Hour, "Maximum chunk age before flushing.")
+	f.DurationVar(&cfg.QueryStoreMaxLookBackPeriod, "ingester.query-store-max-look-back-period", 0, "How far back should an ingester be allowed to query the store for data, for use only with boltdb-shipper index and filesystem object store. -1 for infinite.")


I think the requirement should be boltdb(not boltdb_shipper) as index store and filesystem as object store because there is no use of having boltdb_shipper without shared object store otherwise you would be copying files somewhere on the same filesystem which is essentially same as not using boltdb_shipper at all. I might be wrong here or might be missing your point. What do you think?

If my above assumption is true, that the file names are unique, I specified boltdb_shipper so that you could copy the files between ingesters to allow scaling down, otherwise what you said is true that there isn't much reason to use boltdb_shipper for this rather than boltdb.

With the exception of one other reason I could think of, it would be easier to take backups or even move the Loki to another host because the index files can be directly copied with the chunks directory while Loki is still running, I don't think this is true for boltdb I don't think you can safely copy an open boltdb file.

sandeepsukhani · 2020-05-14T06:49:09Z

pkg/ingester/ingester.go

@@ -64,7 +64,7 @@ type Config struct {
 	ingesterClientFactory func(cfg client.Config, addr string) (client.HealthAndIngesterClient, error)

 	QueryStore                  bool          `yaml:"-"`


we can maybe get rid of this because now we mostly rely on QueryStoreMaxLookBackPeriod to enable/disable querying the store from ingesters.

It seems correct what @sandeepsukhani is saying ?

Co-authored-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com>

… just to make sure there are not collisions if for some reason the ID is empty (which can happen if the hostname doesn't resolve for some reason) Signed-off-by: Ed Welch <edward.welch@grafana.com>

docs/operations/storage/boltdb-shipper.md

…abled. Signed-off-by: Ed Welch <edward.welch@grafana.com>

…ore so they don't duplicate query efforts Signed-off-by: Ed Welch <edward.welch@grafana.com>

Signed-off-by: Ed Welch <edward.welch@grafana.com>

cyriltovena · 2020-05-14T18:36:09Z

pkg/loki/modules.go

+	return cfg.Configs[i]
+}
+
+func calculateMaxLookBack(pc chunk.PeriodConfig, maxLookBackConfig, maxChunkAge time.Duration) (time.Duration, error) {


That deserve a test and it seems simple.

lol, yeah, i thought i put in the description or somewhere I wanted to test this but hadn't done it yet

cyriltovena

I also think you should remove QueryStore and add more tests.

But otherwise LGTM

owen-d

haven't finished review yet (dont wait for me), but thanks for the doc.

owen-d · 2020-05-14T18:36:52Z

docs/operations/storage/filesystem.md

+    directory: /tmp/loki/
+```
+
+A folder is created for every tenant all the chunks for one tenant are stored in that directory.


Suggested change

A folder is created for every tenant all the chunks for one tenant are stored in that directory.

A folder is created for every tenant and all the chunks for one tenant are stored in that directory.

owen-d · 2020-05-14T18:37:18Z

docs/operations/storage/filesystem.md

+
+A folder is created for every tenant all the chunks for one tenant are stored in that directory.
+
+If loki is run in single-tenant mode, all the chunks are put in a folder named `fake` which is the synthesized tenant name used for single tenant mode.


Suggested change

If loki is run in single-tenant mode, all the chunks are put in a folder named `fake` which is the synthesized tenant name used for single tenant mode.

If loki is run in single-tenant mode, all the chunks are put in a folder named `fake` which is the placeholder tenant name used for single tenant mode.

Signed-off-by: Ed Welch <edward.welch@grafana.com>

slim-bean · 2020-05-14T19:12:58Z

I will do a follow up to remove QueryStore as well as refactor the Select method to be able to test the logic within. If I don't do this in 30 days please fire me from the project

Allow configuring query_store_max_look_back_period when running a fil…

b59b88f

…esystem store, which allows for boltdb-shipper to horizontally scale filesystem store. Signed-off-by: Ed Welch <edward.welch@grafana.com>

pull-request-size bot added the size/M label May 14, 2020

slim-bean requested a review from sandeepsukhani May 14, 2020 02:37

sandeepsukhani reviewed May 14, 2020

View reviewed changes

slim-bean and others added 2 commits May 14, 2020 07:07

Update docs/operations/storage/boltdb-shipper.md

56fe4e2

Co-authored-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com>

add some more precision to the timestamp used to make the name unique…

820f909

… just to make sure there are not collisions if for some reason the ID is empty (which can happen if the hostname doesn't resolve for some reason) Signed-off-by: Ed Welch <edward.welch@grafana.com>

owen-d reviewed May 14, 2020

View reviewed changes

docs/operations/storage/boltdb-shipper.md Outdated Show resolved Hide resolved

Allow the ingester to query the store for labels if query store is en…

46b495f

…abled. Signed-off-by: Ed Welch <edward.welch@grafana.com>

pull-request-size bot added size/L and removed size/M labels May 14, 2020

slim-bean added 2 commits May 14, 2020 12:11

make queriers aware of the fact that ingesters may be querying the st…

af2ac6d

…ore so they don't duplicate query efforts Signed-off-by: Ed Welch <edward.welch@grafana.com>

improve the docs

e9442ea

Signed-off-by: Ed Welch <edward.welch@grafana.com>

cyriltovena reviewed May 14, 2020

View reviewed changes

cyriltovena approved these changes May 14, 2020

View reviewed changes

owen-d reviewed May 14, 2020

View reviewed changes

add tests to the calculateMaxLookBack

c15f5a9

Signed-off-by: Ed Welch <edward.welch@grafana.com>

slim-bean merged commit 9803eab into master May 14, 2020

slim-bean deleted the boltdb-shipper-separate-filesystem branch May 14, 2020 19:34

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Loki: Allow configuring query_store_max_look_back_period when running a filesystem store and boltdb-shipper #2073

Loki: Allow configuring query_store_max_look_back_period when running a filesystem store and boltdb-shipper #2073

slim-bean commented May 14, 2020

sandeepsukhani left a comment

sandeepsukhani May 14, 2020

slim-bean May 14, 2020

slim-bean May 14, 2020

sandeepsukhani May 14, 2020

slim-bean May 14, 2020

sandeepsukhani May 14, 2020

cyriltovena May 14, 2020

cyriltovena May 14, 2020

slim-bean May 14, 2020

cyriltovena left a comment

owen-d left a comment

owen-d May 14, 2020

owen-d May 14, 2020

slim-bean commented May 14, 2020


		Scaling up is as easy as adding more loki instances and letting them talk to the same ring.

		Scaling down is possible but manual, you would need to shutdown the loki instance and then physically copy the chunks directory and its index files in their entirety to another Loki instance.

		@@ -64,7 +64,7 @@ type Config struct {
		ingesterClientFactory func(cfg client.Config, addr string) (client.HealthAndIngesterClient, error)

		QueryStore bool `yaml:"-"`

	A folder is created for every tenant all the chunks for one tenant are stored in that directory.
	A folder is created for every tenant and all the chunks for one tenant are stored in that directory.


		A folder is created for every tenant all the chunks for one tenant are stored in that directory.

		If loki is run in single-tenant mode, all the chunks are put in a folder named `fake` which is the synthesized tenant name used for single tenant mode.

Loki: Allow configuring query_store_max_look_back_period when running a filesystem store and boltdb-shipper #2073

Loki: Allow configuring query_store_max_look_back_period when running a filesystem store and boltdb-shipper #2073

Conversation

slim-bean commented May 14, 2020

sandeepsukhani left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

cyriltovena left a comment

Choose a reason for hiding this comment

owen-d left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

slim-bean commented May 14, 2020