Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

joeyadams · 2013-03-09T09:15:42Z

This pull request adds a new module, Data.Conduit.Resumable. The conduit package already has ResumableSource; this adds ResumableSink and ResumableConduit, with similar interfaces. ResumableSink is heavily inspired by the [conduit-resumablesink package] by Andrew Miller (@A1kmm).

ResumableSink and ResumableConduit both have tests, though they might not be 100% comprehensive. I marked the Data.Conduit.Resumable module "experimental" in case we find a better interface or discover semantic issues with the existing interface.

I did a couple "clumsy" things in the test code:

Import Data.Conduit, Data.IORef, etc. unqualified.
Use partial do patterns instead of shouldBe.

I know this is inconsistent with the existing code, but I left it this way for a philosophical reason: writing tests should first and foremost be easy. It's better to have some tests that produce ugly failure output than it is to not have those tests at all. If a test fails later and we want to know what the failure output was, we can replace partial patterns with shouldBe as part of the troubleshooting process.

@m

Move ResumableSource implementation to Data.Conduit.Resumable, but re-export it from Data.Conduit. ResumableSink is inspired by the conduit-resumablesink package by Andrew Miller. Changes from the original: * Return completion as Right instead of Left. Right is "after" left, and completion is "after" resumption. On the other hand, a resumable sink usually doesn't complete, and Left is normally used for error values. * Rename -++$$ to -+$$, for consistency with $$+- * Remove connectResumeSink and closeResumableSink, favoring the infix alternatives. However, keep newResumableSink, and add newResumableSource and newResumableConduit. I was originally reluctant to add these, as they might lead to less consistency in application code (you can also say `src $$+ return ()`, which seems just as easy). However, creating an initial ResumableSink this way is particularly inconvenient because of the 'Either' return value in '+$$'. * Use type parameter order "i m r", for consistency with Sink. ConduitM puts the @m@ parameter on the right (to support MonadTrans), while the original ResumableSink had @m@ on the left. * Don't export the data constructor of ResumableSink. If conduit changes finalization semantics in the future, the definition may change.

Test resumable source finalization, too. The ResumableSink tests come from the conduit-resumablesink package by Andrew Miller, with minor modifications. I did a couple "clumsy" things in my tests: * Import Data.Conduit, Data.IORef, etc. unqualified. * Use partial do patterns instead of shouldBe. I know this is inconsistent with the existing code, but I left it this way for a philosophical reason: writing tests should first and foremost be easy. It's better to have some tests that produce ugly failure output than it is to not have those tests at all. If a test fails later and we want to know what the failure output was, we can replace partial patterns with `shouldBe` as part of the troubleshooting process.

snoyberg · 2013-03-11T14:29:10Z

I'm still not fully comfortable about adding resumable sinks and conduits to conduit itself. I feel confident that the use case addressed by resumable sources cannot be addressed by another means, but I don't have a deep enough grasp of resumable conduit/sink to say the same thing yet. For now, I think it would make the most sense to let this module live in a separate package, and in the future we can consider folding it back into conduit itself.

joeyadams · 2013-03-11T16:12:22Z

Fair enough. I'll put together a conduit-resumable package.

However, I still want to add a feature to ResumableSource:

-- | Finalize a 'ResumableSource' by turning it into a 'Source'.
-- The resulting 'Source' may only be used once.
--
-- This may be used instead of '$$+-' to finalize a 'ResumableSource'.
finishResumableSource :: Monad m => ResumableSource m o -> Source m o

This can be useful when you need to supply ResumableSource data to a function requiring a plain Source. I don't think you can do this with the external API as-is, short of an awkward await/yield loop. The finalizer returned by unwrapResumable is invalid once the source produces another output.

finishResumableSource works by making the Source call the finalizer when it finishes, unless it overrides that finalizer with HaveOutput.

snoyberg · 2013-03-11T16:20:50Z

But finishResumableSource has a problem. Consider:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
src $$ return ()

In this case, the finalizer will not be called. Another example would be:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
(yield "foo" >> src) $$ await

joeyadams · 2013-03-12T22:51:22Z

Darn, you're right. The finalizer would only run if the source is prompted for output. Thanks for pointing this out.

It would be nice if the finalizer were built into Pipe or ConduitM, so we could just have:

($$+) :: Monad m => Source m a -> Sink a m b -> m (Source m a, b)

This would reduce API complexity. For example, the http function in http-conduit could return a plain Source instead of a ResumableSource (meaning no need to explain ResumableSource vs Source). It would also enable the intuition that any pipe (Source, Sink, or Conduit) is a state machine that can be stepped or closed. On the other hand, it might constrain future optimizations.

This problem shouldn't hurt my conduit-resumable package too badly, but it will make it harder to glue resumables together.

snoyberg · 2013-03-13T03:43:23Z

If ConduitM had that functionality, there'd be no point for resumables. I've actually considered adding something like that, but (IIRC) it breaks the monad instance and kills performance.

joeyadams · 2013-03-14T05:55:37Z

Thanks for all the insight. I pushed the conduit-resumable package to GitHub, but not to Hackage yet.

snoyberg · 2013-09-08T14:11:35Z

Another possible location for this is the new conduit-extra package. @ppetr already created a Resumable module over there: https://github.com/snoyberg/conduit/blob/master/conduit-extra/Data/Conduit/Extra/Resumable.hs

joeyadams added 3 commits March 8, 2013 20:30

Minor grammar tweaks (that vs which)

d56401e

snoyberg closed this Sep 8, 2013

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

joeyadams commented Mar 9, 2013

snoyberg commented Mar 11, 2013

joeyadams commented Mar 11, 2013

snoyberg commented Mar 11, 2013

joeyadams commented Mar 12, 2013

snoyberg commented Mar 13, 2013

joeyadams commented Mar 14, 2013

snoyberg commented Sep 8, 2013

Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

Conversation

joeyadams commented Mar 9, 2013

snoyberg commented Mar 11, 2013

joeyadams commented Mar 11, 2013

snoyberg commented Mar 11, 2013

joeyadams commented Mar 12, 2013

snoyberg commented Mar 13, 2013

joeyadams commented Mar 14, 2013

snoyberg commented Sep 8, 2013