Expose age of oldest entry in the translog #28189

Closed
scottsom opened this issue Jan 11, 2018 · 10 comments
Labels
:Data Management/Stats Statistics tracking and retrieval APIs :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. good first issue low hanging fruit help wanted adoptme

Comments

@scottsom

As per the faster recovery benefits discussed here, it is desirable to find the right size for your particular setup. However, unless I'm looking in the wrong places, I haven't been able to find a good way to figure out the right translog size from the information that is currently exposed. I think it would be helpful if the translog stats were extended to also expose the age of the oldest entry, to aid with monitoring and sizing the translog (index.translog.retention.size).

For example, suppose I typically expect my nodes to go down for ~30 minutes (i.e. maintenance/upgrades/patches). I would want to make sure that my translog is sized properly to hold operations for at least that long (probably more as buffer).

If this feature were present, I might see that my shards have their translogs sized at 512mb and the oldest entry is 20m old. I might then conclude that doubling my translog size would retain enough entries for 40 minutes, covering my expected downtime.

I realize that for the scenario I have described, you could do something like setting index.translog.retention.age=40m and index.translog.retention.size=$big_number, but it would be nice to take the guesswork out of the equation (what is $big_number?). By adding this stat, you could monitor it over time and alarm on it.
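
For illustration, a rough sketch of the kind of estimate this stat would enable (plain Java, not Elasticsearch code; it assumes a roughly constant write rate, so the retained time window scales linearly with the retention size):

```java
// Back-of-the-envelope sizing estimate; not Elasticsearch code.
// Assumes a roughly constant write rate, so retained time scales linearly with size.
public class TranslogSizingEstimate {
    public static void main(String[] args) {
        long currentRetentionBytes = 512L * 1024 * 1024; // current index.translog.retention.size: 512mb
        double oldestEntryAgeMinutes = 20;               // the stat requested in this issue
        double targetCoverageMinutes = 40;               // expected downtime plus buffer

        long suggestedBytes = (long) (currentRetentionBytes * targetCoverageMinutes / oldestEntryAgeMinutes);
        System.out.println("suggested retention size ~= " + suggestedBytes + " bytes"); // ~1gb
    }
}
```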

@bleskes
Contributor

bleskes commented Jan 12, 2018

it would be nice to take the guess work out of the equation (what is $big_number?)

I think you can set the translog size limitation to a big number and just monitor the size of the translog.

That said, I also think it makes sense to expose this information in the stats. I'm marking this as adoptme and low hanging fruit.

@bleskes bleskes added good first issue low hanging fruit help wanted adoptme :Data Management/Stats Statistics tracking and retrieval APIs :Translog labels Jan 12, 2018
@bleskes
Contributor

bleskes commented Jan 12, 2018

Also, if you want to put up a PR, I can help with guidance where needed.

@scottsom
Author

Thanks, I'll put something together for this

@justinwyer
Contributor

I had a look at this as a starting point for contributing. Looking at how to implement it, it seems it would require:

  • Adding a field to TranslogStats
  • Adding a method to Translog
  • Adding a field to Checkpoint

As this is labelled low hanging fruit, I am wondering whether I have thought about this correctly.

@scottsom
Author

scottsom commented Feb 7, 2018

I did an initial pass at an implementation for this and it is a bit trickier than it appears. The translog itself does not appear to track any timestamps. Instead, it relies on file system metadata (Files.getLastModifiedTime(path)) to figure out whether a translog generation is a candidate for deletion. There is a similar method in Java to get the creation time (Files.readAttributes(path, BasicFileAttributes.class).creationTime()), but not all file systems track creation time.
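
For reference, a minimal sketch of the two standard library calls mentioned above (plain Java; on file systems that do not record a creation time, creationTime() returns an implementation-specific default, often the last-modified time):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;

public class TranslogFileTimes {
    public static void main(String[] args) throws IOException {
        Path path = Paths.get(args[0]); // e.g. a translog generation file

        // What the deletion policy effectively relies on today.
        long lastModifiedMillis = Files.getLastModifiedTime(path).toMillis();

        // The creation-time alternative; not tracked reliably by all file systems.
        BasicFileAttributes attrs = Files.readAttributes(path, BasicFileAttributes.class);
        long creationMillis = attrs.creationTime().toMillis();

        System.out.println("last modified: " + lastModifiedMillis + ", created: " + creationMillis);
    }
}
```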

While it might be nice if we just made operations in the translog track a timestamp, I think that would be overkill for solving the problem at hand.

What I was thinking is that in the BaseTranslogReader constructor we capture the current time and store it as a member variable, which is returned as the creation time for the generation. Since we are not persisting this anywhere, on node restart this value will reset to Files.readAttributes(path, BasicFileAttributes.class).creationTime() (which may not be right on some file systems), but I think that is a reasonable trade-off.

Alternatively, we could continue to rely on the file system metadata and say that this feature only supports file systems that track creation time.
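
A sketch of the first option (a made-up class, not the real BaseTranslogReader; all names here are for illustration only): remember "now" when a generation is freshly created, and fall back to the file metadata after a restart.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch only; class and method names are made up for illustration.
class GenerationCreationTime {
    private final long creationTimeMillis;

    GenerationCreationTime(Path generationFile, boolean freshlyCreated) {
        if (freshlyCreated) {
            // Normal case: this node just created the generation, so "now" is exact.
            this.creationTimeMillis = System.currentTimeMillis();
        } else {
            // After a node restart: best effort from file metadata, which may be
            // inaccurate on file systems that do not track creation time.
            try {
                this.creationTimeMillis = Files
                        .readAttributes(generationFile, BasicFileAttributes.class)
                        .creationTime()
                        .toMillis();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    long creationTimeMillis() {
        return creationTimeMillis;
    }
}
```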

@justinwyer
Contributor

justinwyer commented Feb 8, 2018

Thanks for the insight! Implementing the file creation time with a fallback to System.currentTimeMillis() is pretty straightforward. I am, however, struggling with the test implementation, as the transaction log is created at the start of the test, resulting in a creation time of 0 milliseconds ago. How are similar test cases handled in the elastic code base? I don't want to introduce new constructors for the sake of testing unless that is already an established and accepted pattern.
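
One pattern that keeps the test deterministic (sketched below with made-up names; whether it fits the existing translog code is an open question) is to take the clock as a LongSupplier, so production passes System::currentTimeMillis and a test passes a controllable source:

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical sketch; names are made up for illustration.
class GenerationAge {
    private final LongSupplier nowMillis;
    private final long createdAtMillis;

    GenerationAge(LongSupplier nowMillis) {
        this.nowMillis = nowMillis;
        this.createdAtMillis = nowMillis.getAsLong();
    }

    long ageMillis() {
        return nowMillis.getAsLong() - createdAtMillis;
    }

    public static void main(String[] args) {
        // Production code would pass System::currentTimeMillis; a test can advance a fake clock.
        AtomicLong fakeClock = new AtomicLong(0);
        GenerationAge age = new GenerationAge(fakeClock::get);
        fakeClock.addAndGet(5_000);
        System.out.println(age.ageMillis()); // prints 5000, deterministically
    }
}
```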

@bleskes
Contributor

bleskes commented Feb 8, 2018

@justinwyer thanks for your interest.

Adding a field to TranslogStats

This seems reasonable.

Adding a method to Translog

Why do you think we need something more? We already have a way to get a stats object.

Adding a field to Checkpoint

What were you thinking of doing here? I think we can use the current list of readers. Something like org.elasticsearch.index.translog.Translog#sizeInBytesByMinGen, but using other properties of the readers.

There is a similar method in Java to get the creation time (Files.readAttributes(path, BasicFileAttributes.class).creationTime()

I'd prefer to use exactly the same information that the deletion policy uses. Why do you think it needs to change to creation time? I'm fine with caching things on the readers. We have to figure out what the age stats should say about the writer; I'm ok with just returning now for that one.
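
A rough sketch of that shape, modeled loosely on sizeInBytesByMinGen but over timestamps instead of sizes (the reader interface and its accessor below are placeholders, not the real Elasticsearch classes):

```java
import java.util.List;

// Placeholder types for illustration only; the real code would use the translog's
// existing readers and whatever timestamp the deletion policy already relies on.
interface TranslogReaderLike {
    long lastModifiedMillis();
}

class EarliestAge {
    /** Age of the oldest generation, treating the current writer as "now". */
    static long earliestLastModifiedAgeMillis(List<? extends TranslogReaderLike> readers, long nowMillis) {
        long earliest = nowMillis; // the writer contributes "now", as suggested above
        for (TranslogReaderLike reader : readers) {
            earliest = Math.min(earliest, reader.lastModifiedMillis());
        }
        return nowMillis - earliest;
    }
}
```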

@scottsom
Author

scottsom commented Feb 8, 2018

I'd prefer to use exactly the same information that the deletion policy uses. Why do you think it needs to change to creation time? I'm fine with caching things on the readers. We have to figure out what the age stats should say about the writer; I'm ok with just returning now for that one.

The idea was that the age of the oldest operation would be the current time minus the timestamp of the first operation in the oldest generation. Since there isn't a timestamp per operation, this can be approximated by the creation time of the oldest generation. This would tell you how long the translog has you covered for.

It sounds like I was overcomplicating it though; the last modified time on the oldest generation should be good too.

@justinwyer
Contributor

What were you thinking of doing here? I think we can use the current list of readers. Something like org.elasticsearch.index.translog.Translog#sizeInBytesByMinGen, but using other properties of the readers.

I was thinking something similar; I just wasn't sure where we would get a timestamp from, but the oldest last-modified time of any generation makes sense, thanks!

Do we have a preference for time handling, i.e. Joda's DateTimeUtils.currentTimeMillis() vs System.currentTimeMillis()? If we use Joda, overriding the time provider is not allowed, which means non-deterministic testing.

So my question for the testing: are we happy with testing that we get a number and that it is older than some arbitrary number of milliseconds (at least 1)?

justinwyer added a commit to justinwyer/elasticsearch that referenced this issue Feb 10, 2018
…translog stats item. This uses the last modified time of the earliest generation.
@clintongormley clintongormley added :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. :Distributed/Engine Anything around managing Lucene and the Translog in an open shard. and removed :Translog :Distributed/Distributed A catch all label for anything in the Distributed Area. If you aren't sure, use this one. labels Feb 13, 2018
@scottsom
Author

Excellent, thanks @justinwyer and @bleskes

dnhatn pushed a commit that referenced this issue Feb 16, 2018
Expose the age of translog files in the translog stats. This is useful to reason about your translog retention policy.

Closes #28189