Streaming of Build Logs to Additional Target #2104

Closed
keithkroeger opened this issue Mar 21, 2018 · 17 comments
Labels
accepted enhancement ops/dayN size/medium An easily manageable amount of work. Well-defined scope, few unknowns.

Comments

@keithkroeger

Feature Request

What challenge are you facing?

We would like to be able to store logs for our builds offline (outside of concourse).
We see that the logs are stored within a PostgreSQL table for subsequent use by fly and/or the UI. But would there be a way to also stream these to some other location?

This would allow us not only to store logs as part of the provenance of a build artifact but also to be able to possibly support audit needs.

A Modest Proposal

We know that we could stream the output of the table and/or the SQL used on the table for build events. But could this information also be streamed to another target, such as Influx, or to a file for Filebeat to pick up?

@jama22
Member

jama22 commented Jun 21, 2018

Sounds like we'd like to stream build output to an external service for all builds, across all pipelines and teams.

@jama22 jama22 added this to Icebox in Runtime via automation Jun 21, 2018
@jama22 jama22 moved this from Icebox to Backlog in Runtime Jun 21, 2018
@vito
Member

vito commented Jul 30, 2018

@jama-pivotal said this doesn't need to be 100% real time streaming, we just need to ensure a build's logs can be sent somewhere other than Postgres.

@vito vito added the size/medium An easily manageable amount of work. Well-defined scope, few unknowns. label Jul 30, 2018
@YoussB YoussB moved this from Backlog to In Flight in Runtime Jul 31, 2018
@topherbullock
Member

Related: #645

@mhuangpivotal
Contributor

We pushed a spike to the syslog branch. In the spike, we created a new runner that sends completed build logs to a configured syslog server. We tested with Papertrail and saw the logs being received. We also wrote a migration that adds a drained column to the builds table to indicate that a build's logs have been drained.

Some questions and problems that remain:

  • Our test for syslog is flaky: it sometimes fails or hangs
  • Need to backfill a test for updating the drained column
  • What metadata do we want to send with the build log? team/pipeline/job/build names?
  • Is there a possibility for the build logs to be reaped before they are drained?
  • What kind of database locks do we need for the operation?
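
A minimal sketch of the drain pass described above, in Python with SQLite standing in for Postgres. Only the drained column comes from the spike; the rest of the table layout, the completed flag, and the function names are assumptions made for illustration. It does not answer the locking question: it simply assumes a single drainer process.

```python
import sqlite3


def drain_completed_builds(conn, send):
    """Send logs of completed, undrained builds, then mark them drained.

    `send` is a hypothetical callable (build_id, log) that forwards the log
    to the syslog server. Returns the ids of the builds that were drained.
    """
    drained_ids = []
    rows = conn.execute(
        "SELECT id, log FROM builds "
        "WHERE completed = 1 AND drained = 0 ORDER BY id"
    ).fetchall()
    for build_id, log in rows:
        try:
            send(build_id, log)
        except OSError:
            # Leave drained = 0 so the next interval retries this build.
            continue
        conn.execute("UPDATE builds SET drained = 1 WHERE id = ?", (build_id,))
        conn.commit()
        drained_ids.append(build_id)
    return drained_ids


def make_demo_db():
    """Create an in-memory table with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE builds (id INTEGER PRIMARY KEY, log TEXT, "
        "completed INTEGER NOT NULL, drained INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO builds (id, log, completed) VALUES (?, ?, ?)",
        [(1, "build 1 output", 1), (2, "build 2 output", 0), (3, "build 3 output", 1)],
    )
    conn.commit()
    return conn
```

Marking a build as drained only after a successful send means a failed send is retried on the next interval, at the cost of a possible duplicate if the process dies between the send and the update.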

@jchesterpivotal
Contributor

I'd suggest making it possible to send everything the API exposes to the UI: the build metadata, resources and plan. Downstream integrations can build on this. I hacked together a resource to do it, but that won't scale for fully automated capture and analysis.

@jama22
Member

jama22 commented Aug 17, 2018

I can't speak for the last two questions, but regarding metadata...could we just emit all of the above?

@jchesterpivotal
Contributor

If you do, I'd suggest providing independent logs for each kind of thing. Plain logs, build-plan logs, build-resources logs, resource-version logs.

Or possibly just plain logs and then look at ways to push the more structured stuff into metrics emission.

@jama22
Member

jama22 commented Aug 27, 2018

To get this unblocked, let's make some reasonable assumptions around tagging, push a v1, and let people provide feedback.

YoussB added a commit that referenced this issue Sep 4, 2018
Story #2104

Signed-off-by: Saman Alvi <salvi@pivotal.io>
YoussB added a commit that referenced this issue Sep 4, 2018
Story #2104

Signed-off-by: Saman Alvi <salvi@pivotal.io>
@pivotal-saman-alvi
Contributor

Feature:

A new syslog drainer was implemented. It lets the atc forward build logs to a syslog server once every drain interval.

Concerns

This functionality will forward build logs to the configured syslog location. If the logs contain private information that is currently hidden because the pipelines or teams are private, that information could become exposed. The operator should be aware of this when configuring syslog.

Configuring via BOSH

atc spec has been modified to include the following syslog properties:

  • syslog_hostname (default: atc-syslog-drainer)
  • syslog_address
  • syslog_transport
  • syslog_drain_interval (default: 30s)

The manifest would need to be updated accordingly.

Configuring via ATC binary

The following flags will allow you to configure syslog:

  • --syslog-hostname (default: atc-syslog-drainer)
  • --syslog-address
  • --syslog-transport
  • --syslog-drain-interval (default: 30s)

YoussB added a commit that referenced this issue Sep 4, 2018
#2104

Submodule src/github.com/concourse/atc b079f50..aa86856:
  > Sending syslog packet information for build events
Submodule src/github.com/concourse/topgun 6213dca..c22ca7d:
  > Topgun tests for syslog drainer

Signed-off-by: Saman Alvi <salvi@pivotal.io>
@cirocosta
Member

cirocosta commented Sep 4, 2018

Hey,

Given that we're providing TLS, maybe we should enable the client to accept --insecure-skip-verify like we do for other options that allow TLS? Maybe configuring ca-cert would be good as well for those who have internal deployments.

Wdyt?

Thx!

@YoussB
Member

YoussB commented Sep 5, 2018

I think that makes sense. As we were developing this first MVP we only had plain TCP and UDP in mind.
Adding the ca-cert is fairly simple.

@vito
Member

vito commented Sep 5, 2018 via email

YoussB pushed a commit that referenced this issue Sep 7, 2018
#2104

Submodule src/github.com/concourse/atc 8e2e856..05e5bd7:
  > support TLS in syslog drainer
Submodule src/github.com/concourse/topgun c22ca7d..1c4f47a:
  > update syslog tests with bosh spec update

Signed-off-by: Bishoy Youssef <byoussef@pivotal.io>
@YoussB
Member

YoussB commented Sep 7, 2018

We added a new flag to the atc, --syslog-ca-cert, for the TLS connection.
We also changed the BOSH properties so that all of the syslog configuration is nested under a parent syslog key, as follows:

syslog:
  hostname:
  address:
  transport:
  ca_cert: 

The ca-cert flag is only used for TLS connections.
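
To illustrate what a CA certificate option typically does on the client side, here is a sketch in Python (the atc itself is written in Go, so this is not its code). The function names and the host:port address format are assumptions; the point is that an optional CA bundle replaces the system trust store while certificate and hostname verification stay enabled.

```python
import socket
import ssl


def make_syslog_tls_context(ca_cert_path=None):
    """Build a client TLS context that verifies the syslog server.

    ca_cert_path is an optional PEM file for an internal CA. When it is
    omitted, the system trust store is used.
    """
    return ssl.create_default_context(
        purpose=ssl.Purpose.SERVER_AUTH, cafile=ca_cert_path
    )


def open_syslog_tls_connection(address, ca_cert_path=None, timeout=5.0):
    """Connect to a "host:port" address (assumed format) over verified TLS."""
    host, port = address.rsplit(":", 1)
    context = make_syslog_tls_context(ca_cert_path)
    raw = socket.create_connection((host, int(port)), timeout=timeout)
    try:
        return context.wrap_socket(raw, server_hostname=host)
    except Exception:
        # Do not leak the TCP socket if the handshake fails.
        raw.close()
        raise
```

Supplying a CA bundle keeps verification on for internal deployments, which is generally preferable to an option that skips verification altogether.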

@YoussB YoussB moved this from In Flight to Done in Runtime Sep 10, 2018
@topherbullock
Member

Looking good. I tested this out by searching for "go syslog server" and using the first one I could get running, Ekanite (http://www.philipotoole.com/tag/ekanite/), and it worked!
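
For a similar quick check without installing a full syslog server, a throwaway UDP receiver is enough to see whether messages arrive. The sketch below uses only the Python standard library; the names are hypothetical, and it covers UDP only, not the TCP or TLS transports.

```python
import socket
import socketserver
import threading


class _SyslogUDPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        self.server.received.append(data.decode("utf-8", errors="replace"))
        self.server.got_message.set()


def start_udp_syslog_receiver(host="127.0.0.1", port=0):
    """Start a background UDP listener that records every datagram.

    Port 0 asks the OS for a free port; read the actual address from
    server.server_address and point the syslog address setting at it.
    """
    server = socketserver.UDPServer((host, port), _SyslogUDPHandler)
    server.received = []
    server.got_message = threading.Event()
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Call server.shutdown() and server.server_close() when done; the received list then holds the raw messages for inspection.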

@keithkroeger
Author

Thank you, all.
Could the target therefore be a syslog storer, such as https://bosh.io/releases/github.com/cloudfoundry/syslog-release?all=1?

@YoussB
Member

YoussB commented Sep 13, 2018

@keithkroeger Yes, it can.

  • The topgun tests are actually run using this release.
  • Just note that, per this release's documentation, the syslog storer can only be used for testing purposes, not for production.

@marco-m
Contributor

marco-m commented Sep 24, 2018

@pivotal-saman-alvi @YoussB does this feature also stream the line-by-line timestamps seen in the Concourse web UI, or is the output timestamp-less, as seen from fly watch? In my experience, timestamps are very useful for detecting unexpected slowdowns and bugs.
