Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configurable access logger with JSON support #1293

Closed
wants to merge 23 commits into from

Conversation

rickb777
Copy link

This is a new access logger derived from (and replacing) the old one. It exists in its own accesslog package under middlewares and has similar-ish net/http filters to the existing solution. Its first concern is to implement the existing extended Common Log Format behaviour, to provide a clean upgrade path for existing users.

But it's now split into pluggable parts, the important part being its formatter (the CLF formatter being the default). A new JSON formatter exists.

The file writing part is termed the log appender and is slightly enhanced: it allows buffered writing, or gzipped writing, and/or asynchronous writing via a channel/goroutine.

The final area of change is to the configuration: new settings are needed to control the new features. This is done with a new AccessLog type in types.go.

Notice that there are now three middlewares:

the first is the 'main' logger that tracks each request, builds log events, and despatches these to the log appender;
the second inserts knowledge about the frontend into the log-data context;
the third inserts knowledge about the backend into the log-data context.
There had previously been an error in tracking the frontend data: it operated on the incorrect assumption that there is a 1:N mapping from frontends to backends.

@rickb777
Copy link
Author

This PR supersedes #1230.

Fixes race condition when concurrently writing to the log file.
Added HTTPRequestLine to the core data available for logging. Fixed some test issues.
@rickb777 rickb777 changed the title Configurable access logger with JSON support WIP: Configurable access logger with JSON support Mar 17, 2017
@rickb777 rickb777 changed the title WIP: Configurable access logger with JSON support Configurable access logger with JSON support Mar 29, 2017
@timoreimann timoreimann self-requested a review April 3, 2017 16:56
Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm about half way through this PR; sending out comments accumulated up until this point to give author a chance to respond.

A few general remarks:

  • This PR is way too big. Could you please consider moving out parts that are not strictly needed for an initial implementation of JSON logging into separate PRs? I'm thinking that both compression and asynchronous writing could easily be handled as follow-up PRs. Even if I made it through it some point, we should make it as simple as possible for the second reviewer.
  • Could you please use snake case instead of camel case for all file names (e.g., log_appender.go instead of logAppender.go?
  • Design-wise, I very much like the changes introduced by this PR, so 👍 .

@@ -5,6 +5,7 @@
*.iml
/traefik
/traefik.toml
vendor/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That folder must not be ignored as we have switched to vendoring a while ago.

Copy link
Author

@rickb777 rickb777 Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls exclude that change to .gitignore.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 1 is the tricky one. I've already had three attempts at writing this PR and in the end it seemed more satisfactory to complete the implementation instead of feeding small parts of it one by one. I'm open to persuasion though.

Copy link
Author

@rickb777 rickb777 Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Point 2 - the existing codebase uses both styles. We should make it consistent as a separate PR I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to include the vendor folder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did point 2 refer to again? Seems like the comment was placed at the wrong spot, or the Github UI is fooling us once again.

@@ -59,13 +59,13 @@ build-no-cache: dist
shell: build ## start a shell inside the build env
$(DOCKER_RUN_TRAEFIK) /bin/bash

image: binary ## build a docker traefik image
image: build ## build a docker traefik image
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This effectively reverts #1329.

Was this a mistake, or do you have a particular reason to return to the old behavior?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls exclude that change.

@@ -82,5 +82,9 @@ lint:

fmt:
gofmt -s -l -w $(SRCS)

clean:
sudo rm -rf dist static site autogen/gen.go traefik
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Makefile ever compile a Traefik binary into ./traefik? If not, then we should maybe not remove it.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./traefik is created by go build ..

There is a structural problem with the Makefile: artefacts built using Docker are owned by root, not by the current user.

It should be possible to make clean. When make clean is executed, all generated artefacts should be deleted. This is an important requirement for developers. As it stands, it may be necessary to sudo make clean, which is an adequate interim compromise. Ideally this problem would be solved more completely.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, I never had a problem with deleting files created by the build container. Not in front of my laptop right now, but is this maybe something happening on some platforms other than Mac OS X only?

I'm okay with creating a clean target (though I personally never needed it). Does our Makefile call go build without specifying an output file somewhere below dist/ though? If not, then I still believe traefik (in the root directory) should not be removed.

docs/toml.md Outdated
# Optional
#
# accessLogsFile = "log/access.log"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't remove it from the docs while the parameter is still available. Better leave a deprecation warning.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

```

The file can be gzipped as it is written if you require; just add a
`.gz` extension:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Easy yet effective way to enable compression. I like this. 👍

wg.Add(n)
for i := 0; i < n; i++ {
go func(j int) {
fmt.Fprintf(lw, "Line %d\n", j)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove the Line prefix to make the assertion below more readable?

scanner := bufio.NewScanner(r)
for i := 0; i < n; i++ {
assert.True(t, scanner.Scan())
line := scanner.Text()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about using ioutil/ReadFile instead and just casting each byte value into a string?

}

for i := 0; i < n; i++ {
assert.Equal(t, 1, counts[i], "Line %d should have count=1 but was %d", i, counts[i])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be able to inline this check into the main test loop body.

"compress/gzip"
"fmt"
"github.com/containous/traefik/log"
"github.com/containous/traefik/types"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import order.

func (l jsonLogFormatter) Write(w io.Writer, logDataTable *LogData) error {

buf := &bytes.Buffer{}
buf.WriteString("{")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that we are producing JSON (braces here, objects below) manually. Can we try to populate our content in some map/struct and call json.Marshal on it instead?

Implementing json.Marshaler on some type might be helpful too.

Copy link
Author

@rickb777 rickb777 Apr 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that and we started with that. But the handling of some fields requires bespoke treatment. We considered copying the data into a map (formatting values at the copy stage) and then marshalling the map. But the decision was taken to do the implementation as you see, handling the top-level of the JSON document explicitly and only using json.Marshal for the string values. It's a similar amount of code and requires less copying / memory footprint.

There is possibly scope for improving this.

Copy link
Member

@emilevauge emilevauge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow @rickb777, what a PR !
Like @timoreimann, I think it is way too big. It's really difficult for a reviewer to validate such a change set.
Would you consider splitting this one into smaller PRs ?

Copy link
Contributor

@timoreimann timoreimann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed two open points.

Prior to moving forward with the review, we definitely need to carve out or drop parts which are not strictly needed for an initial implementation of structured logging, namely the compression and asynchronous features. I realize that will mean more work on your end, but a ~3,000 LOC PR is just not reviewable otherwise. It should be possible to create follow-up PRs and make them base on each predecessor, respectively, in terms of target branches to merge into.

Again, I really like the overall improvement in design, but we need to get it into a digestible shape.

@@ -5,6 +5,7 @@
*.iml
/traefik
/traefik.toml
vendor/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to include the vendor folder.

@@ -82,5 +82,9 @@ lint:

fmt:
gofmt -s -l -w $(SRCS)

clean:
sudo rm -rf dist static site autogen/gen.go traefik
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strange, I never had a problem with deleting files created by the build container. Not in front of my laptop right now, but is this maybe something happening on some platforms other than Mac OS X only?

I'm okay with creating a clean target (though I personally never needed it). Does our Makefile call go build without specifying an output file somewhere below dist/ though? If not, then I still believe traefik (in the root directory) should not be removed.

@@ -5,6 +5,7 @@
*.iml
/traefik
/traefik.toml
vendor/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did point 2 refer to again? Seems like the comment was placed at the wrong spot, or the Github UI is fooling us once again.

* `coreFields` : data gleaned from the HTTP request.
* `requestHeaders` : the set of headers received in the request.
* `originResponseHeaders` : the set of headers received from the origin (a.k.a. upstream) server
* `downstreamResponseHeaders` : the set of headers passed back to the client
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. 👍

@rjshep rjshep mentioned this pull request Apr 10, 2017
@rjshep
Copy link
Contributor

rjshep commented Apr 10, 2017

In order to progress this new access logger I am going to submit a series of smaller PRs. Hopefully these will be easier to review :-)

I have created PR #1408 as the first set of changes.

@timoreimann
Copy link
Contributor

@rjshep shall we close this one in favor of #1408 and other PRs which are yet to come?

@rjshep
Copy link
Contributor

rjshep commented Apr 18, 2017

@timoreimann Yes, please close this PR. I'll follow up with additional PRs with the rest of the functionality once #1408 is accepted.

@timoreimann
Copy link
Contributor

Thanks for the update.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants