Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introducing ttl and timestamp in state #1915

Merged
merged 1 commit into from Jun 29, 2016
Merged

Conversation

ruflin
Copy link
Member

@ruflin ruflin commented Jun 27, 2016

Each state has a timestamp and a ttl. -1 means ttl is disable, 0 means it should be directly removed.
This moves the logic on what should happen with a state completely to the state itself and makes it possible to use states in different use cases.

The advantage of the ttl is that it does not depend in filebeat on the modification time which can be incorrect. This means, in case a file is rotated, the timestamp of a file is updated and it also counts as a new state. This makes sense as otherwise it could happen that the state of a rotate file is removed and then after rotation the file is picked up again as a completely new file. The downside is that people using filebeat must understand the difference between the state timestamp and modtime. In general timestamp is neweer then the modtime, as filebeat finishes reading later.

On the registrar side, the cleanup happens every time before the registry is written. On the prospector side the state is cleaned up after each scan. It can happen that the prospector state list and registrar state list are not 100% in sync as they don't cleanup the states at the same time. The prospector state is the one that will always overwrite the registrar state. No cleanup is done before shutdown.

It is important, that on startup first a full scan is done to update the states before the state is cleaned up, otherwise still needed states could be removed.

@ruflin ruflin added in progress Pull request is currently in progress. Filebeat Filebeat labels Jun 27, 2016
@ruflin ruflin force-pushed the fb-clean-older branch 3 times, most recently from 61e1178 to 60c4969 Compare June 28, 2016 07:31
@ruflin ruflin mentioned this pull request Jun 28, 2016
s.states = []State{}
} else {
s.states = append(s.states[:i], s.states[i+1:]...)
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filter loop alternative (not as many potential copies on sub-splices when removing multiple elements):

states := s.states[:0]
for _, state := range.stats {
    ttl := state.Ttl * time.Second
    if ttl >= 0 && time.Since(state.Timestamp) <= ttl {
          debugf("State removed for %v because of older: %v", state.Source, ttl)
          continue // drop state
    }
    states = append(states, state) // in-place copy old state
}
s.states = states

When converting ttl into time.Duration it nicely prints the unit being used. time.Duration can be negative too.

Copy link

@urso urso Jun 28, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe get timestamp before removing, so all states are compared against same timestamp.

Are removed states somehow reported to registrar for remove? If so, how can we be sure some state is not updated by some prospector (in case clean_older is too small)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The registrar does the cleanup himself. As all states are reported to prospector and registrar, they will have the same information -> do the same cleanup. The cleanup on the registrar side will always happen with a delay. It can happen in an edge case, that cleanup removes a state on the registrar side that was not removed on the prospector side because a new event already exist. But the new state will be sent to the registrar and reintroduce the state again. Prospector always overwrites registrar state.

We should discuss in a second step on the requirements for variables as some combinations / variables can lead to very strange situations. If clean_older is too small, filebeat will "stateless".

Each state has a timestamp and a ttl. -1 means ttl is disable, 0 means it should be directly removed.
This moves the logic on what should happen with a state completely to the state itself and makes it possible to use states in different use cases.

The advantage of the ttl is that it does not depend in filebeat on the modification time which can be incorrect. This means, in case a file is rotated, the timestamp of a file is updated and it also counts as a new state. This makes sense as otherwise it could happen that the state of a rotate file is removed and then after rotation the file is picked up again as a completely new file. The downside is that people using filebeat must understand the difference between the state timestamp and modtime. In general timestamp is neweer then the modtime, as filebeat finishes reading later.

On the registrar side, the cleanup happens every time before the registry is written. On the prospector side the state is cleaned up after each scan. It can happen that the prospector state list and registrar state list are not 100% in sync as they don't cleanup the states at the same time. The prospector state is the one that will always overwrite the registrar state. No cleanup is done before shutdown.

It is important, that on startup first a full scan is done to update the states before the state is cleaned up, otherwise still needed states could be removed.

This is part of elastic#1600

Additional:
* Fixed offset change for prospector to start harvester for old files.

Note:
* Nice part of this is that registrar does not have to now about expiry and removal of files, state is communication channel.
@ruflin ruflin added review and removed in progress Pull request is currently in progress. labels Jun 29, 2016
}
p.lastScan = time.Now()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be removed

@urso urso merged commit 8afb523 into elastic:master Jun 29, 2016
@ruflin ruflin deleted the fb-clean-older branch July 4, 2016 10:10
#close_eof: false

# Files for the modification data is older then clean_older the state from the registry is removed
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin we need to rephrase this sentence here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree. Will take care of it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants