New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streaming support to certain endpoints #13

Closed
denibertovic opened this Issue Aug 4, 2016 · 17 comments

Comments

Projects
None yet
2 participants
@denibertovic
Owner

denibertovic commented Aug 4, 2016

We should look into adding streaming support for certain endpoints. Currently only the logs endpoint and the build endpoint come to mind.

The way reading logs is implemented now is not that great as we consume the entire response at once
rather than chunk by chunk. For large log outputs this can be a problem.

This even more important for the build endpoint as we're sending a tar.gz of the build context which
might end up being very large.

This will likely need a revamp of the HttpHandler type and making use of http-conduit.
I'm not yet sure how to best accomplish this especially since we only need streaming
for a few endpoints (it probably wouldn't work for all endpoints actually).

@denibertovic denibertovic added this to the 0.4.0.0 milestone Aug 4, 2016

@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Sep 7, 2016

Here are two potential solutions Deni and I have been considering.

One option is to modify the DockerT monad transformer to hold a HTTP Manager instead of a HttpHandler. This is less than ideal since users are unable to replace the request handler with a custom implementation if they want.

Another option is to add an additional argument to HttpHandler that accepts a conduit sink. The disadvantage here is that users are constrained to use conduit.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 9, 2016

Another option is to add an additional argument to HttpHandler that accepts a conduit sink. The disadvantage here is that users are constrained to use conduit.

I'm partial to this method. It seems like a simpler solution. I don't think it's that big of a deal to constrain the user to conduit...for now anyway...that is for the next release. If we find out that it's a huge issue we can always release a new major version where we address this.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 9, 2016

It appears that modifying the code to support a streaming request (for the purpose of building images...ie. when sending the build context to the daemon) is pretty straightforward and doesn't even
break API, I don't think. I need to work on it a bit more before pushing but it seems to be working out ok
so far.

Hopefully I'll have time over the weekend to take a crack at the streaming response part of the equation as well.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 12, 2016

@jprider63 I've pushed the changes I've made to get the buildImageFromDockerfile implemented to the streaming branch. It's still pretty rough as I'm not sure if I'm catching everything I need to be catching (IO heavy). But it's definitely streaming everything that needs to be streamed from what I can tell (I've tested with having a very large build context directory 5GB...ie. it doesn't load everything into memory first). I marked some things with "TODOs" as I didn't wanna waste time on those right now. I think that we at least need to add support for "X-Registry-Config" before we can consider this feature done but we can come back to that...

Anyhow, next step is to look at streaming the responses from certain endpoints (build endpoint included), but I haven't had time over the weekend to take a stab at it yet as the build stuff proved more time consuming than I initially anticipated.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 14, 2016

Eh, it seems the process that should ignore some files/folders has a bug int it's implementation. Which is a show stopper mostly. The way it works is that .dockerignore can contain a pattern like ".git" which will not match ".git/description". One would assume that it would given that popular implementations (.gitignore, .dockerignore) work this way but every glob matching library will return False on this one.

So I guess I can't just filter out directories and focus on just files (I wanted to skip empty directories). A workable solution would be to decide to traverse a directory for it's files based on the exclude/include patterns only. This does not have anything to do with streaming per say but it's a bug in the implementation of the build endpoint.

@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Sep 14, 2016

Wow I'm surprised that we don't need to use http-conduit to support upload streaming. That's kind of nice.

Looking at the dockerignore documentation, it looks relatively complicated. Maybe it's worthwhile to implement our own parser.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 15, 2016

@jprider63 I did have to use conduit here: https://github.com/denibertovic/docker-hs/blob/streaming/src/Docker/Client/Internal.hs#L87

I had to construct a chunked request. But from what I can tell that's the only needed change and it didn't change the API of the lib.

Re dockerignore/build stuff: currently it's not really an issue to parse the file .. that was relatively straightforward...the issue is with the logic I wrote once I got the file :) (ie. I messed up somewhere).

@denibertovic denibertovic self-assigned this Sep 15, 2016

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Sep 16, 2016

@jprider63 I think I've fixed the issue I was seeing before. We are excluding the correct files on all of my projects that I've tested with. There is one small issue left (precedence rules) that we ignore for now,
and I've noted this in the commit message and in the comments in the code for future releases. I don't find this terribly important for now mainly because I haven't been using the include/exclude patterns in that way in any of my projects. So, if anyone feels that it's terribly important to them we can revisit later I guess.

I'm moving on to the streaming response parts of this issue next, hopefully I'll make some progress today.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Dec 1, 2016

I've been tied up with work for some time now. I'm probably going to revisit this approach in December again. The priority still stands....implement build, push, pull functionality first and go from there.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Jan 21, 2017

@jprider63 I've been trying on-and-off to wiggle http-conduit into this design for the pull image and logs endpoint but I'm not able to get it to work.

@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Jan 23, 2017

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Jan 26, 2017

@jprider63 I haven't pushed anything new to the streaming branch since I haven't gotten it to work at all...And it feels wrong to push broken stuff especially when it comments out most of the code. I'm kind of rethinking the design at this point....if we should be using conduit or something else (like pipes).

@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Feb 9, 2017

I can look at this a bit today. It might be helpful to see what you've tried. Maybe push to a different, temporary branch?

@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Feb 10, 2017

I took a shot at implementing this. It compiles, but I haven't had a chance to test it much yet. Here are a few comments:

  • I had to change HttpHandler to a newtype wrapper since I had to use existentials to keep it in the ReaderT. I'm still trying to figure out a way to leave it as a type alias, but I'm not sure if this is possible.
  • We probably should use ResourceT in requestHelper'.
  • Could you look over getContainerLogsStream? The follow argument to ContainerLogsEndpoint should be True, right?
@denibertovic

This comment has been minimized.

Owner

denibertovic commented Feb 15, 2017

@jprider63 Thanks! 🍰 It compiles, and yes the second argument to ContainerLogsEndpoint (follow) should be true since this will indicate to the daemon that we want the stream.

I haven't had time to play around with it yet other than trying a few minutes with ghci but this is what I get:

λ: runDockerT (defaultClientOpts, defaultHttpHandler) $ getDockerVersion  

<interactive>:2:32:
Couldn't match kind ‘*’ with ‘* -> *’
When matching types
  m0 :: * -> *
  HttpHandler :: (* -> *) -> *
Expected type: HttpHandler m
  Actual type: m0 (HttpHandler m0)
In the expression: defaultHttpHandler
In the first argument of ‘runDockerT’, namely
  ‘(defaultClientOpts, defaultHttpHandler)’
@jprider63

This comment has been minimized.

Collaborator

jprider63 commented Mar 10, 2017

I've fixed the tests for the streaming branch and merged dev into streaming (I needed some of it's functionality). Think we can release this new version soon? I'll create a pull request soon.

@denibertovic

This comment has been minimized.

Owner

denibertovic commented Mar 18, 2017

Closed via #26 🍰 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment