Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(new source): Initial docker source implementation #787

Merged
merged 40 commits into from
Oct 8, 2019
Merged

feat(new source): Initial docker source implementation #787

merged 40 commits into from
Oct 8, 2019

Conversation

ktff
Copy link
Contributor

@ktff ktff commented Aug 23, 2019

Closes #613 with exception of ignore_* configuration, more details on that down in open issues.


There are a few details changed or added to the specification. Namely:

  • If docker source is used, vector must have sudo privileges, or docker group membership. This is unfortunately unavoidable (without extra external configuration and performance and security hit), since vector process needs to communicate with docker daemon, and that communication requires before mentioned. Otherwise Permission denied (os error 13) is thrown.
  • Logs are collected from start of docker source. That means, there are no backlogs.
  • include_containers matches on start of names. Most of docker ecosystem, that I encountered, has this behavior. So it makes sense to follow this, but this is worth discussing. It also makes code simplier and more performant.
  • If both include_containers and include_labels are not empty, then they interact as AND. In a sense that container must have one of the names and one of the labels to be logged. This is default docker behavior.
  • include_containers can contain both names and ID-s of containers. This is default docker behavior.
  • There exists a specific point in time after which logs from all applicable containers are captured. With one exception, mentioned bellow as delayed_logs.

Tests of docker source require presence of docker, internet connection to download busybox image, and above mentioned sudo privileges, as such I have putted them behind a feature flag docker-integration-tests, since I am not sure that the build system here can provide all of that.


Currently emitted Events have explicit fields: TIMESTAMP, MESSAGE. I am not sure if TIMESTAMP should be explicit or not, so do check this.


There are still some open issues/questions/extensions to be resolved, hence this draft.
Issues:

  • - ignore_* configuration (EDIT: delayed)

    • After coding this implementation, and after more exposure to docker ecosystem, I am not sure of it's usefulness or idiomaticity.
    • docker naturally supports includes but not excludes.
    • Implementation would require extra request to docker daemon per container start, if ignore_* is used.
    • This also has issues with renaming, removing label, and adding label to a running container. To solve that, scope of received events from docker will need to be greater, and an extra request to docker daemon per such new event.
    • All in all, it would add a lot of complex code for a small, if any, benefit.
  • - requirement_detection (EDIT: this is a part of other issues related of error reporting)

    • To detect when sudo requirements and requirements from specification are not fullfilled, and to report that to the user.
  • - async_docker

    • Currently, implementation uses two crates async_docker and bollard where one crate should be enough, because there are some features missing in both of them. Namely bollard is missing one feature, if not for that async_docker would not be needed. And out of them, I consider bollard as of more quality and liveness.
    • An issue could be raised in bollard to implement this feature, in the meanwhile async_docker is used.
  • - delayed_logs (EDIT: not worth mentioning)

    • There exists a special case which happens during docker source initialization, specifically between taking timestamp after which logs are taken, and requesting currently running containers. A container that happens to be running during timestamp, but stops before getting list of currently running containers, will not have it's logs pulled until the container starts running again.
    • This would rarely happen, and even then logs are only delayed if it starts again.
    • This could be fixed, but would require extra code + requests to docker daemon.
    • This is really a corner case, and probably not worth the effort.
  • - docker_restart (EDIT: should be part of a greater system)

    • As it stands, if docker restarts, docker source will finish execution with an error.
    • Is there some mechanism in vector which restarts sources if they fail?
  • - event_source

    • Event could be enriched with information on if log came from stderr or stdout.
    • If added, how should it be named in Event?
  • - event_container_id

    • Event could be enriched with information of container id from which log came.
    • HOST could be used as a name, but I am unsure, hence this.
  • - vector_in_docker (EDIT)

    • When vector is run in a container in docker, it's capable of picking up it's own logs. Which is interesting, like it's self aware or something, but if a positive feedback loop occurs there will be trouble.
    • I think that for now, docker source should detect it self running in container and exclude it.
  • - make_test (EDIT)

    • docker tests should be integrated to test command so that they run with make test.

- [ ] - fussybeaver/bollard#33
- A bug which causes for every received log to be printed to console.
- This needs to be resolved before merger.
(EDIT: not necessary anymore)


EDIT: But even without resolving any of the above at least vector_in_docker should be resolved, this implementation of docker source is solid and ready for use. And of course, after a review.

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Aug 23, 2019

Rust is hallucinating a test in the documentation.

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@binarylogic binarylogic added the needs: docs Needs documentation updates label Aug 23, 2019
@binarylogic
Copy link
Contributor

Thanks @ktff, this looks great. I'm going to have @LucioFranco review. He is out today and Monday, but will be able to review on Tuesday, but I can provide comments in the interim:

If docker source is used, vector must have sudo privileges, or docker group membership.

This is fine, we can document it.

ignore_* configuration

I am fine not doing this, for now, we can wait and see if a user presents a use cases where it would be needed.

requirement_detection

In my opinion, this should be done as part of the healthcheck. @lukesteensen can you confirm?

delayed_logs

This is fine, and let's defer resolving this for now. Perhaps we can open an issue or document the behavior?

Event could be enriched with information on if log came from stderr or stdout.

I'd use the stream field. In general, we want to conform to the default schema here. This should also be documented in the docker source.

I think that for now, docker source should detect it self running in container and exclude it.

Agree. I would like to document this and/or open an issue for an option to enable this.


In general, we need a thorough documentation review before merging this. Once code review completes I'll get in here and help write the docs.

This looks really good though! Thank you for thinking through all of the little details.

event_container_id - HOST could be used as a name, but I am unsure, hence this.

I would prefer that we use container since it is more descriptive of the source. host could be reserved for an actual hostname. If we can get that, I would like to include that also.

resolves event_source

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
resolves event_container_id

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Aug 25, 2019

delayed_logs

This is fine, and let's defer resolving this for now. Perhaps we can open an issue or document the behavior?

I am for this to be a low priority issue, if any. Since it's outshadowed by uncertainty of exact time of starting vector and docker source, so I wouldn't bother users with it.


vector_in_docker

Agree. I would like to document this and/or open an issue for an option to enable this.

I am for documentation and an issue.

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
resolves vector_in_docker

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Aug 25, 2019

vector_in_docker
Implementation excludes self container on best effort basis. It does this only if no include_* is present. That is, if it by default logs all containers. This should also enable users to enable vector to log itself if they wish so. But documentation should be clear on possible problems that can be created with that.

As such, I now think that a separate issue for this is not necessary.

@LucioFranco
Copy link
Contributor

LucioFranco commented Aug 27, 2019

@ktff just curious what is left to make this PR ready for review? I am gonna make a pass at this today.

@ktff
Copy link
Contributor Author

ktff commented Aug 27, 2019

@LucioFranco I guess it is.

I would like to add requirement_detection and docker_restart, but they could be added later separately with the remaining points, so that this isn't stalled.

So I'll undraft this now.

@ktff ktff marked this pull request as ready for review August 27, 2019 19:31
Copy link
Contributor

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is an excellent start! I'd like to see some of main fn here cleaned up so that it is a bit easier to read.

Cargo.toml Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
src/sources/docker.rs Outdated Show resolved Hide resolved
Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@LucioFranco
Copy link
Contributor

@ktff I took a deeper look at the dependencies, why not use shiplift directly? https://github.com/softprops/shiplift I know the author and I am happy to ask any questions.

src/sources/docker.rs Outdated Show resolved Hide resolved
Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Aug 29, 2019

@ktff I took a deeper look at the dependencies, why not use shiplift directly? https://github.com/softprops/shiplift I know the author and I am happy to ask any questions.

You are right! I surveyed that crate before but stopped when I saw for maneuvering docker containers in their description, thought they only provide ways for, well, maneuvering docker containers. My mistake.

By their documentation, everything that is needed for this code is in there. So I'll replace those two crates with this one.

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Aug 29, 2019

@LucioFranco

I managed to decompose docker source into several separate structs.
I am also now leaning to separate docker.rs into a folder.

What do you think?

There are two minor features missing in shiplift, but both are just an optimization. Although one of them is an important one. Could you ask the author if he plans to add since filter to container logs command.

@LucioFranco
Copy link
Contributor

@ktff how hard do you think adding that would be? Would you be open to submitting a PR and I can try to nudge a release?

info!(message = "Listening docker events");

// Starting with logs from now.
// TODO: Is this exception acceptable?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is acceptable, but we should ensure this is documented.

Cargo.toml Outdated
@@ -120,7 +120,7 @@ exitcode = "1.1.2"
snafu = "0.4.3"
url = "1.7"
base64 = "0.10.1"
shiplift = "0.5"
shiplift = { git = "https://github.com/LucioFranco/shiplift", branch = "lucio/fix-encrypted-tcp" }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This uses my fork which is related to this PR softprops/shiplift#193. I know the author so I will but them about getting a release but for now we can work off of my fork.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ed82e03 we now use the timber branch of my fork that contains a feature flag fix and rust 2018 idioms cargo fix pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

We should probably make an issue to remember to revert this to proper shiplift version.

This uses a different forked branch that removes openssl
vendored feature which was causing CI issues and cargo
fixed so that it does not produce warnings on >1.39.

Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
@ktff
Copy link
Contributor Author

ktff commented Sep 14, 2019

@LucioFranco we can finally merge this.

Signed-off-by: Kruno Tomola Fabro <krunotf@gmail.com>
@binarylogic
Copy link
Contributor

@ktff I am working on the docs now. Once that is done let's merge 🚀

Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Copy link
Contributor

@LucioFranco LucioFranco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Only thing left is to add docs then this should be ready to merge.

@LucioFranco LucioFranco removed the needs: docs Needs documentation updates label Oct 7, 2019
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
Signed-off-by: Lucio Franco <luciofranco14@gmail.com>
@LucioFranco
Copy link
Contributor

@binarylogic Ok, I think these should be good go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

New docker source
3 participants