
Feature: Promtail, scrape logs from Object store #2270

Closed
wants to merge 20 commits into master from object_storage_scrape

Conversation

@adityacs (Collaborator) commented Jun 28, 2020

What this PR does / why we need it:
Based on the design doc in #2107, this PR adds a new feature to Promtail that lets users scrape logs from an object store. This version supports only AWS S3, and it covers just the items listed for the first iteration.

Which issue(s) this PR fixes:
Fixes #2045

Checklist

  • Documentation added
  • Tests updated

@adityacs changed the title from "Object storage scrape" to "Feature: Promtail, scrape logs from Object store" on Jun 28, 2020
Comment on lines 193 to 212
func (r *objectReader) getReader(reader io.ReadCloser) (*bufio.Reader, error) {
    // sniff the first bytes to identify the file type
    buf := [3]byte{}
    n, err := io.ReadAtLeast(reader, buf[:], len(buf))
    if err != nil {
        return nil, err
    }

    // stitch the sniffed bytes back in front of the remaining stream
    rd := io.MultiReader(bytes.NewReader(buf[:n]), reader)
    if isGzip(buf) {
        // transparently decompress gzip objects
        r, err := gzip.NewReader(rd)
        if err != nil {
            return nil, err
        }
        rd = r
    }

    bufReader := bufio.NewReader(rd)
    return bufReader, nil
}
@adityacs (Collaborator, Author)

@cyriltovena I need some help here. I am not sure If I am doing this in a proper way.
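
For reference, gzip streams start with the magic bytes 0x1f 0x8b; a minimal sketch of what the isGzip helper used above could look like (its actual implementation isn't shown in this excerpt):

// isGzip reports whether buf starts with the gzip magic number
// (0x1f 0x8b) followed by the DEFLATE compression-method byte (0x08).
func isGzip(buf [3]byte) bool {
    return buf[0] == 0x1f && buf[1] == 0x8b && buf[2] == 0x08
}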

@codecov-commenter commented Jun 28, 2020

Codecov Report

Merging #2270 into master will increase coverage by 0.02%.
The diff coverage is 58.82%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #2270      +/-   ##
==========================================
+ Coverage   61.65%   61.67%   +0.02%     
==========================================
  Files         160      163       +3     
  Lines       13565    13836     +271     
==========================================
+ Hits         8363     8533     +170     
- Misses       4579     4658      +79     
- Partials      623      645      +22     
Impacted Files                                         Coverage Δ
pkg/promtail/positions/positions.go                    60.71% <0.00%> (+13.39%) ⬆️
pkg/promtail/scrapeconfig/scrapeconfig.go              40.00% <ø> (ø)
...kg/promtail/targets/objectstore/s3targetmanager.go   0.00% <0.00%> (ø)
pkg/promtail/targets/objectstore/object_target.go      65.95% <65.95%> (ø)
pkg/promtail/targets/objectstore/object_reader.go      76.56% <76.56%> (ø)
pkg/promtail/targets/file/tailer.go                    73.86% <0.00%> (-4.55%) ⬇️
pkg/promtail/targets/file/filetarget.go                68.67% <0.00%> (-1.81%) ⬇️
... and 2 more


// NewS3Target creates a new S3Target.
func NewS3Target(logger log.Logger, handler api.EntryHandler, positions positions.Positions, jobName string, objectClient chunk.ObjectClient, s3Config *scrape.S3Targetconfig) (*S3Target, error) {
    // object store sync period should not be less than 1 minute
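
The comment above implies a minimum sync period; a minimal validation sketch, assuming a SyncPeriod duration field on the config (the field name is a guess, not the PR's actual API):

// hypothetical guard at the top of NewS3Target; SyncPeriod is assumed
if s3Config.SyncPeriod < time.Minute {
    return nil, fmt.Errorf("object store sync period must be at least 1m, got %s", s3Config.SyncPeriod)
}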
Member

Left some code comments :)

@adityacs (Collaborator, Author)

😃 will fix

for {
    select {
    case <-ticker.C:
        objects, _, err := t.objectClient.List(context.Background(), t.prefix)
Member

I'm afraid this will hang on large buckets, but I think that's fine for an initial feature.

@adityacs (Collaborator, Author)

Yep, you are right. We have to add multi-part download support for larger files. Added a note here: https://github.com/grafana/loki/blob/ab130b09eb965c3baa8f94d02c63b15704b1c300/docs/clients/promtail/configuration.md#s3_config
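
One mitigation sometimes used for large buckets (not part of this PR) is to narrow each listing with a time-based key prefix so a sync never walks the whole bucket; a hypothetical sketch reusing the names from the fragment above, assuming keys are laid out by date:

// Hypothetical: scope the List call to today's objects, assuming keys
// like "<prefix>2020/07/13/object.gz"; avoids listing the entire bucket.
prefix := fmt.Sprintf("%s%s/", t.prefix, time.Now().UTC().Format("2006/01/02"))
objects, _, err := t.objectClient.List(context.Background(), prefix)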

}

// skip any object which is already read completely
if modifiedAt == object.ModifiedAt.UnixNano() && pos > 0 && pos == size {
Member

Do you need the pos > 0 check? It seems unnecessary.

@adityacs (Collaborator, Author) commented Jun 30, 2020

There is a chance that Promtail shuts down without reading a single line. On restart, pos and size will both be 0, so checking only pos == size would wrongly skip the object.
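
To make the edge case concrete, a simplified restatement of the check (variable names taken from the snippet above):

// pos and size come from the positions file. After a restart where
// nothing was read, both are 0, so "pos == size" alone would wrongly
// treat an unread object as complete; requiring pos > 0 avoids that.
alreadyRead := modifiedAt == object.ModifiedAt.UnixNano() && pos > 0 && pos == size
if alreadyRead {
    continue // fully read on a previous pass; skip it
}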

)

var (
    s3ReadBytes = promauto.NewGaugeVec(prometheus.GaugeOpts{
Member

I think we should avoid using path for these metrics. We're creating metrics in accordance with our stream cardinality here.

@adityacs (Collaborator, Author)

path here means the object name. What do you suggest? Use just the bucket name instead of each individual object?
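
One way to bound the cardinality, in the spirit of the reviewer's concern, would be to label the metric by bucket rather than by object path; a hedged sketch (metric name and label are illustrative, not the PR's final choice):

var s3ReadBytes = promauto.NewGaugeVec(prometheus.GaugeOpts{
    Namespace: "promtail",
    Name:      "s3_read_bytes",
    Help:      "Bytes read from S3 objects.",
}, []string{"bucket"}) // one series per bucket, not one per object path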

@owen-d (Member) left a comment

Very cool first draft!

I'm worried about the metric cardinality (more in comments) and that promtail would try to list entire s3 buckets per sync period. I suspect this will work as long as users configure low per-bucket retention and don't have lots of throughput in said bucket.

Another way to ease this could be an optional flag for deleting s3 files once read. WDYT?

I'll defer to @cyriltovena's judgment for your compression question.

@adityacs (Collaborator, Author)

@owen-d

> I suspect this will work as long as users configure low per-bucket retention and don't have lots of throughput in said bucket.

Yep, you are correct. This is just a very simple way of fetching and scraping logs. In future versions we want to support SQS integration. With SQS we don't have to list the objects in the bucket; we only fetch the objects we are notified about. This is mentioned in the docs as well.
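
For readers unfamiliar with the SQS approach mentioned above, this is roughly what a notification-driven consumer could look like with aws-sdk-go v1 (the queue URL, region, and event handling are illustrative assumptions, not code from this PR):

package main

import (
    "encoding/json"
    "fmt"
    "log"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/sqs"
)

// s3Event mirrors the subset of the S3 event notification payload we need.
type s3Event struct {
    Records []struct {
        S3 struct {
            Bucket struct {
                Name string `json:"name"`
            } `json:"bucket"`
            Object struct {
                Key string `json:"key"`
            } `json:"object"`
        } `json:"s3"`
    } `json:"Records"`
}

func main() {
    sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
    svc := sqs.New(sess)
    queueURL := "https://sqs.us-east-1.amazonaws.com/123456789012/promtail-s3-events" // hypothetical

    for {
        // Long-poll the queue instead of listing the bucket.
        out, err := svc.ReceiveMessage(&sqs.ReceiveMessageInput{
            QueueUrl:            aws.String(queueURL),
            MaxNumberOfMessages: aws.Int64(10),
            WaitTimeSeconds:     aws.Int64(20),
        })
        if err != nil {
            log.Fatal(err)
        }
        for _, msg := range out.Messages {
            var ev s3Event
            if err := json.Unmarshal([]byte(*msg.Body), &ev); err != nil {
                continue // not an S3 event; ignore
            }
            for _, r := range ev.Records {
                // Only the notified object needs to be fetched and scraped.
                fmt.Println(r.S3.Bucket.Name, r.S3.Object.Key)
            }
            // Delete the message so it is not redelivered.
            if _, err := svc.DeleteMessage(&sqs.DeleteMessageInput{
                QueueUrl:      aws.String(queueURL),
                ReceiptHandle: msg.ReceiptHandle,
            }); err != nil {
                log.Println(err)
            }
        }
    }
}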

@adityacs force-pushed the object_storage_scrape branch 5 times, most recently from fbc1093 to 0a2b793 on July 13, 2020 11:20
Comment on lines 54 to 61
BucketNames: cfg.S3Config.BucketName,
Endpoint: cfg.S3Config.Endpoint,
Region: cfg.S3Config.Region,
AccessKeyID: cfg.S3Config.AccessKeyID,
SecretAccessKey: cfg.S3Config.SecretAccessKey,
Insecure: cfg.S3Config.Insecure,
S3ForcePathStyle: cfg.S3Config.S3ForcePathStyle,
HTTPConfig: awsHTTPConfig,
@adityacs (Collaborator, Author)

Dependent on #2318

@adityacs adityacs marked this pull request as ready for review July 13, 2020 11:27
stale bot commented Aug 21, 2020

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale A stale issue or PR that will automatically be closed. label Aug 21, 2020
@adityacs adityacs added the keepalive An issue or PR that will be kept alive and never marked as stale. label Aug 24, 2020
@stale stale bot removed the stale A stale issue or PR that will automatically be closed. label Aug 24, 2020
@cyriltovena cyriltovena added this to To do in Loki Project Oct 27, 2020
@cyriltovena cyriltovena moved this from To do to Review in progress in Loki Project Oct 27, 2020
@angeloskaltsikis

Hey @adityacs @owen-d,
Is there any way to push for this PR to get reviewed 😄?

@CLAassistant commented Apr 20, 2021

CLA assistant check
All committers have signed the CLA.

@matthiaslee commented May 20, 2021

Any update on this? I'd be happy to contribute if I can be of help.

cyriltovena pushed a commit to cyriltovena/loki that referenced this pull request Jun 11, 2021
…ge (grafana#2270)

* added FIFO cache metrics for current number of entries and memory usage
fixed bug in updating last element of FIFO
updated unit tests

Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>

* fixed "make mod-check"

Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>

* Revert "fixed "make mod-check""

This reverts commit fe00ab880b7d7dfacfb9a7af1f38ebd3fe74e1ba.

Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>

* addressed comments

Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>

* addressed comments

Signed-off-by: Dmitry Shmulevich <dmitry.shmulevich@sysdig.com>
@sushantkumar-amagi

Would love to help out with getting this PR across the line. There is a use case where we need it too.

@cyriltovena (Contributor) left a comment

I'd like to hear more about the deployment model of Promtail in this case.

It doesn't seem that you can scale them, because the positions file is local.

I also think the documentation is a bit light on how to configure SQS; at least a reference on how to enable SQS notifications for S3 updates seems required.

You need some sort of disclaimer like the following, and to explain how:

This target uses an SQS queue in the same region as the S3 bucket. You must set up the SQS queue and S3 event notification before using this plugin.

1. Create a new SQS queue (in the same region as S3)
2. Set proper permissions on the new queue
3. [Configure S3 event notification](http://docs.aws.amazon.com/AmazonS3/latest/dev/NotificationHowTo.html)
4. Start a Promtail instance

I think the code is almost there, but a small effort on the documentation is required.

Lastly, I'd like to see if there's a way to add discovered labels (__XXX) from file attributes/metadata? That could be done in another PR though.

@adityacs (Collaborator, Author)

@cyriltovena Will work on the documentation. Regarding scaling Promtail, pasting my earlier comment again:

To scale out and process more files, running multiple Promtail instances against the same SQS queue would be tricky. One solution would be to create multiple SQS queues based on some pattern, e.g. bucket/a/file.gz goes to queue A and bucket/b/file.gz goes to queue B. Multiple Promtail instances can then be configured with different queues (A, B, etc.).
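
A toy illustration of the queue-sharding idea (purely hypothetical, not part of the PR): hashing the first path segment of each key is one deterministic way to assign prefixes to queues so that each Promtail instance owns exactly one queue.

package main

import (
    "fmt"
    "hash/fnv"
    "strings"
)

// shardQueue picks a queue for an object key based on its first path
// segment, so "a/file.gz" and "a/other.gz" always land on the same
// queue and each Promtail instance can consume exactly one queue.
func shardQueue(key string, queues []string) string {
    h := fnv.New32a()
    h.Write([]byte(strings.SplitN(key, "/", 2)[0]))
    return queues[h.Sum32()%uint32(len(queues))]
}

func main() {
    queues := []string{"queue-A", "queue-B"}
    fmt.Println(shardQueue("a/file.gz", queues)) // routes by prefix "a"
    fmt.Println(shardQueue("b/file.gz", queues)) // routes by prefix "b"
}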

@kavirajk (Collaborator)

Closing this as there has been no activity for a long time. Feel free to send a new PR if anyone wants to revive this work.

@kavirajk kavirajk closed this Mar 18, 2022
Loki Project automation moved this from Review in progress to Done Mar 18, 2022
@sandstrom

This may be relevant for any repo watchers: #5065

Labels: keepalive (An issue or PR that will be kept alive and never marked as stale.), size/XXL
Projects: Loki Project
Development: successfully merging this pull request may close these issues: Object client scrape configs