
The dictionary is never cleared of deleted keys #24

Closed
magnusbaeck opened this issue Mar 27, 2016 · 7 comments

magnusbaeck commented Mar 27, 2016

When a lookup table is loaded from disk, Hash#merge! is used to merge the newly loaded data into the hash used for the lookups. While this overwrites existing keys with the new values, keys that are no longer present in the on-disk file won't be deleted from the in-memory hash, and the plugin will happily continue to use the obsolete keys.
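
A minimal Ruby sketch of why this happens (the names and values are illustrative, not the plugin's actual code):

```ruby
# In-memory lookup table built from an earlier load of the dictionary file.
dictionary = { "old_key" => "old_value", "foo" => "1" }

# The on-disk file has since been edited: "old_key" was deleted, "baz" was added.
reloaded = { "foo" => "1", "baz" => "3" }

dictionary.merge!(reloaded)
puts dictionary.inspect
# => {"old_key"=>"old_value", "foo"=>"1", "baz"=>"3"}
# "old_key" survives the reload, so lookups keep matching the deleted entry.

# Hash#replace would drop the stale keys instead:
dictionary.replace(reloaded)
puts dictionary.inspect
# => {"foo"=>"1", "baz"=>"3"}
```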

@pemontto

👍


jqlh commented Dec 30, 2016

👍


sokratisg commented Feb 26, 2017

There doesn't seem to be much participation on this one.

Do you happen to know of any alternatives to filter-translate that aren't affected by this bug, or of any other workaround? Maybe a really low TTL on records?
We are also affected by this issue and are trying to find viable alternatives.


elvarb commented Mar 16, 2017

Could this also cause a memory leak?

I have a rather large dictionary that is reloaded every minute: if no match is found, a separate Ruby process gets the data through other means, adds it to the event, and appends it to the file. Logstash's memory usage just grows and grows extremely fast until it crashes.

@magnusbaeck
Contributor Author

> I have a rather large dictionary that is reloaded every minute: if no match is found, a separate Ruby process gets the data through other means, adds it to the event, and appends it to the file. Logstash's memory usage just grows and grows extremely fast until it crashes.

That doesn't sound like something caused by this issue. The updates made by your script are, presumably, incremental rather than rewriting the whole table with totally new data. There could definitely be some garbage-collection churn when the objects in the hash's values are overwritten every minute, but I don't think that should lead to an out-of-memory crash. Have you tried changing the reload interval or turning off the translate filter completely?


elvarb commented Apr 18, 2017

> That doesn't sound like something caused by this issue. The updates made by your script are, presumably, incremental rather than rewriting the whole table with totally new data. There could definitely be some garbage-collection churn when the objects in the hash's values are overwritten every minute, but I don't think that should lead to an out-of-memory crash. Have you tried changing the reload interval or turning off the translate filter completely?

The pipeline I had was something along these lines:

1. The translate filter runs; if no match is found, a default value is added to the field instead.
2. If the default value is present, a ruby filter runs a Golang app.
3. That app goes through the dictionary file to search for a match as well.
4. If no match is found, it queries a REST API, adds the results to the dictionary file, and adds them to the event.

I was trying to avoid querying the API as much as possible, which is why the Golang application reads the dictionary file as well.

I tested this in many different ways and always got an out-of-memory crash when I had it reload the dictionary that often. There might be a missing line to clear the old array, or garbage collection may not happen often enough to mitigate the problem.

This is not an issue for me at the moment because I had to abandon that approach; I was never happy with it anyway. I now have a much bigger pre-populated dictionary file, and there are no issues.

@guyboertje

Closing; later versions allow for a replace strategy where old key/value mappings are forgotten.
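
For anyone landing here later, a sketch of that strategy, assuming the refresh_behaviour option available in more recent logstash-filter-translate releases (option names may differ between plugin versions, so check the docs for yours):

```
filter {
  translate {
    field             => "src_field"
    destination       => "dst_field"
    dictionary_path   => "/etc/logstash/dictionary.yml"
    refresh_interval  => 300
    # "replace" rebuilds the dictionary from the file on each reload instead of
    # merging into it, so keys removed from the file are forgotten.
    refresh_behaviour => "replace"
  }
}
```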
