Add a general purpose gzip codec #1817

Open
cdenneen opened this issue Oct 1, 2014 · 14 comments

@cdenneen cdenneen commented Oct 1, 2014

Add gzip codec or file input option for gzipped files

Edit: Add a general purpose gzip codec which can be used in inputs and outputs
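
For illustration only, here is a minimal Python sketch of what a "general purpose" gzip codec could look like: one object that decodes compressed chunks into lines on the input side and encodes lines into gzip bytes on the output side. The class name `GzipLinesCodec` and its methods are hypothetical and are not part of Logstash; the sketch assumes newline-delimited text.

```python
import gzip
import zlib


class GzipLinesCodec:
    """Hypothetical codec: gzip bytes <-> text lines. Not a Logstash API."""

    def __init__(self, encoding="utf-8"):
        self._encoding = encoding
        self._inflater = self._new_inflater()
        self._pending = b""  # decompressed bytes not yet terminated by "\n"

    @staticmethod
    def _new_inflater():
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        return zlib.decompressobj(zlib.MAX_WBITS | 16)

    def decode(self, chunk):
        """Input side: feed one compressed chunk, yield each complete line."""
        data = chunk
        while data:
            self._pending += self._inflater.decompress(data)
            if self._inflater.eof:
                # A gzip member ended; any leftover bytes start the next member.
                data = self._inflater.unused_data
                self._inflater = self._new_inflater()
            else:
                data = b""
        *lines, self._pending = self._pending.split(b"\n")
        for line in lines:
            yield line.decode(self._encoding)

    def flush(self):
        """Yield whatever is left when the stream ends without a final newline."""
        data = self._pending + self._inflater.flush()
        self._pending = b""
        for line in data.splitlines():
            yield line.decode(self._encoding)

    def encode(self, line):
        """Output side: return one line as a standalone gzip member."""
        return gzip.compress((line + "\n").encode(self._encoding))
```

Because concatenated gzip members form a valid gzip stream, bytes produced by repeated `encode` calls can be fed back through `decode` in arbitrarily sized chunks.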

Contributor

@jordansissel jordansissel commented Oct 2, 2014

gzip codec is something we totally should have.

To make it work with file input, we'll have to fix how the file input is implemented. We need to do this improvement anyway, but it is a prerequisite for any gzip codec being usable on the file input.

<3 for the idea

Author

@cdenneen cdenneen commented Oct 3, 2014

@jordansissel thanks! Another thought is possibly a complete flag of some sort.
Let's say I have a directory of logs and I point the Logstash file input at the glob. What would be cool in some cases is "do something when Logstash is done processing them", like:

  • gzip them
  • move them to an archive directory
  • delete
  • send email
  • anything

Basically like a shell exec

Of course you'd have to know that this directory of files is static and has no open file handles, but that's up to the admin to determine.
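
A rough sketch of the "do something when done" idea, written in Python purely as an illustration. The function names `process_then` and `gzip_and_remove` are hypothetical, nothing here is a Logstash option, and the sketch assumes the matched files are static and fully written.

```python
import glob
import gzip
import os
import shutil


def process_then(pattern, handle_line, on_complete):
    """Read every file matching `pattern` to EOF, then run a completion action.

    Assumes the files are static: nobody is still appending to them.
    """
    for path in sorted(glob.glob(pattern)):
        with open(path, "r", encoding="utf-8") as f:
            for line in f:
                handle_line(line.rstrip("\n"))
        # The file is closed and fully consumed at this point.
        on_complete(path)


def gzip_and_remove(path):
    """Example completion action: compress the file, then delete the original."""
    with open(path, "rb") as src, gzip.open(path + ".gz", "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
```

Any callable that takes a path could be passed as `on_complete`, for example one that moves the file to an archive directory or sends a notification.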

Contributor

@jordansissel jordansissel commented Oct 3, 2014

The file input currently has no concept of "done processing them". Files are assumed to be live streams that live forever, and as a result have no end. Reaching EOF on a log file generally means "wait a while and more data will show up".

Unfortunately, this 'files are live streams' assumption means that folks doing archival or backfilling with old, "complete" logs have no way to tell Logstash when to terminate.
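
The difference between the two readings of EOF can be sketched in a few lines of Python. This is an illustration of the concept only, not how the Logstash file input is implemented; the function `read_lines` and its `follow`, `poll_interval`, and `idle_limit` parameters are assumptions made up for the example.

```python
import time


def read_lines(f, follow=True, poll_interval=0.5, idle_limit=None):
    """Yield lines from an open text file object.

    follow=True  treats EOF as "wait a while, more data may show up".
    follow=False treats EOF as "this file is complete, stop here".
    idle_limit   (hypothetical) gives up after that many seconds with no new data.
    Note: in follow mode a partially written last line may be yielded early.
    """
    idle = 0.0
    while True:
        line = f.readline()
        if line:
            idle = 0.0
            yield line.rstrip("\n")
            continue
        if not follow:
            return  # EOF means done
        if idle_limit is not None and idle >= idle_limit:
            return  # nothing new for long enough; assume the file is complete
        time.sleep(poll_interval)
        idle += poll_interval
```

With `follow=True` and no `idle_limit`, the generator never finishes, which is exactly why there is no natural point at which a "done processing" action could fire.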

@suyograo changed the title from "Gzip files" to "Add a general purpose gzip codec" on May 29, 2015
@suyograo added the "new plugin" label on May 29, 2015
Member

@suyograo suyograo commented May 29, 2015

@khornberg gist here: #1895 (comment)

Gzip output:

@yukti-kaura yukti-kaura commented Sep 25, 2015

Hello Everyone,

Has this been implemented?

Member

@suyograo suyograo commented Sep 25, 2015

@Yukti nope, not implemented. PR welcome :)

@tan-tan-kanarek tan-tan-kanarek commented Feb 18, 2016

This is a quick, not nicely implemented, working alternative:
https://github.com/tan-tan-kanarek/logstash-input-gzfile

Contributor

@jordansissel jordansissel commented Aug 18, 2016

Every time you comment +1 on this ticket, 25-75 emails are sent out. Instead, please use GitHub's "reaction" feature to +1 this issue.

I will delete the +1 comments now to dissuade this further. I appreciate y'all's eagerness for this feature.

Contributor

@jordansissel jordansissel commented Aug 18, 2016

I have deleted approximately 15 +1 comments.

@lmpampaletakis lmpampaletakis commented Nov 24, 2016

Is there any news about this? Using PIPE, which is another official alternative, is probably inefficient.

@gaurs21 gaurs21 commented Aug 25, 2017

Hi, how can we use the gzip_lines plugin with Logstash to read .gz files?

@kunisen kunisen commented Mar 23, 2018

Hello Everyone,
Has this been implemented? 😄

Judging from this comment, not yet?
#1817 (comment)

@dwdii dwdii commented May 19, 2018

It's not quite what you all are talking about, but a grassroots codec has recently popped up on RubyGems: https://rubygems.org/gems/logstash-codec-json_gz

It is specific to GZIP'd JSON, but the version I downloaded was working well for me.

@alepuccetti alepuccetti commented Aug 8, 2018

What about bzip2? It would be possible to leverage parallel decompression, and it would probably be easier to track progress and pick up processing where it left off if Logstash failed or stopped.
