Data.Conduit.Resumable: resumable sources, sinks, and conduits #92

Closed
wants to merge 3 commits into
from

Conversation

Projects
None yet
2 participants
@joeyadams
Contributor

joeyadams commented Mar 9, 2013

This pull request adds a new module, Data.Conduit.Resumable. The conduit package already has ResumableSource; this adds ResumableSink and ResumableConduit, with similar interfaces. ResumableSink is heavily inspired by the [conduit-resumablesink package] by Andrew Miller (@A1kmm).

ResumableSink and ResumableConduit both have tests, though they might not be 100% comprehensive. I marked the Data.Conduit.Resumable module "experimental" in case we find a better interface or discover semantic issues with the existing interface.

I did a couple "clumsy" things in the test code:

  • Import Data.Conduit, Data.IORef, etc. unqualified.
  • Use partial do patterns instead of shouldBe.

I know this is inconsistent with the existing code, but I left it this way for a philosophical reason: writing tests should first and foremost be easy. It's better to have some tests that produce ugly failure output than it is to not have those tests at all. If a test fails later and we want to know what the failure output was, we can replace partial patterns with shouldBe as part of the troubleshooting process.

joeyadams added some commits Mar 8, 2013

conduit: Add Data.Conduit.Resumable
Move ResumableSource implementation to Data.Conduit.Resumable,
but re-export it from Data.Conduit.

ResumableSink is inspired by the conduit-resumablesink package by
Andrew Miller.  Changes from the original:

 * Return completion as Right instead of Left.  Right is "after" left,
   and completion is "after" resumption.

   On the other hand, a resumable sink usually doesn't complete,
   and Left is normally used for error values.

 * Rename -++$$ to -+$$, for consistency with $$+-

 * Remove connectResumeSink and closeResumableSink, favoring the
   infix alternatives.

   However, keep newResumableSink, and add newResumableSource and
   newResumableConduit.  I was originally reluctant to add these, as
   they might lead to less consistency in application code (you can
   also say `src $$+ return ()`, which seems just as easy).  However,
   creating an initial ResumableSink this way is particularly
   inconvenient because of the 'Either' return value in '+$$'.

 * Use type parameter order "i m r", for consistency with Sink.
   ConduitM puts the @m@ parameter on the right (to support
   MonadTrans), while the original ResumableSink had @m@ on the left.

 * Don't export the data constructor of ResumableSink.  If conduit
   changes finalization semantics in the future, the definition
   may change.
conduit: add tests for Data.Conduit.Resumable
Test resumable source finalization, too.

The ResumableSink tests come from the conduit-resumablesink package by
Andrew Miller, with minor modifications.

I did a couple "clumsy" things in my tests:

 * Import Data.Conduit, Data.IORef, etc. unqualified.

 * Use partial do patterns instead of shouldBe.

I know this is inconsistent with the existing code, but I left it this way
for a philosophical reason: writing tests should first and foremost
be easy.  It's better to have some tests that produce ugly failure output
than it is to not have those tests at all.  If a test fails later and we
want to know what the failure output was, we can replace partial patterns
with `shouldBe` as part of the troubleshooting process.
@snoyberg

This comment has been minimized.

Show comment Hide comment
@snoyberg

snoyberg Mar 11, 2013

Owner

I'm still not fully comfortable about adding resumable sinks and conduits to conduit itself. I feel confident that the use case addressed by resumable sources cannot be addressed by another means, but I don't have a deep enough grasp of resumable conduit/sink to say the same thing yet. For now, I think it would make the most sense to let this module live in a separate package, and in the future we can consider folding it back into conduit itself.

Owner

snoyberg commented Mar 11, 2013

I'm still not fully comfortable about adding resumable sinks and conduits to conduit itself. I feel confident that the use case addressed by resumable sources cannot be addressed by another means, but I don't have a deep enough grasp of resumable conduit/sink to say the same thing yet. For now, I think it would make the most sense to let this module live in a separate package, and in the future we can consider folding it back into conduit itself.

@joeyadams

This comment has been minimized.

Show comment Hide comment
@joeyadams

joeyadams Mar 11, 2013

Contributor

Fair enough. I'll put together a conduit-resumable package.

However, I still want to add a feature to ResumableSource:

-- | Finalize a 'ResumableSource' by turning it into a 'Source'.
-- The resulting 'Source' may only be used once.
--
-- This may be used instead of '$$+-' to finalize a 'ResumableSource'.
finishResumableSource :: Monad m => ResumableSource m o -> Source m o

This can be useful when you need to supply ResumableSource data to a function requiring a plain Source. I don't think you can do this with the external API as-is, short of an awkward await/yield loop. The finalizer returned by unwrapResumable is invalid once the source produces another output.

finishResumableSource works by making the Source call the finalizer when it finishes, unless it overrides that finalizer with HaveOutput.

Contributor

joeyadams commented Mar 11, 2013

Fair enough. I'll put together a conduit-resumable package.

However, I still want to add a feature to ResumableSource:

-- | Finalize a 'ResumableSource' by turning it into a 'Source'.
-- The resulting 'Source' may only be used once.
--
-- This may be used instead of '$$+-' to finalize a 'ResumableSource'.
finishResumableSource :: Monad m => ResumableSource m o -> Source m o

This can be useful when you need to supply ResumableSource data to a function requiring a plain Source. I don't think you can do this with the external API as-is, short of an awkward await/yield loop. The finalizer returned by unwrapResumable is invalid once the source produces another output.

finishResumableSource works by making the Source call the finalizer when it finishes, unless it overrides that finalizer with HaveOutput.

@snoyberg

This comment has been minimized.

Show comment Hide comment
@snoyberg

snoyberg Mar 11, 2013

Owner

But finishResumableSource has a problem. Consider:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
src $$ return ()

In this case, the finalizer will not be called. Another example would be:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
(yield "foo" >> src) $$ await
Owner

snoyberg commented Mar 11, 2013

But finishResumableSource has a problem. Consider:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
src $$ return ()

In this case, the finalizer will not be called. Another example would be:

(rsrc, _) <- sourceFile "foo" $$+ await
src <- finishResumableSource rsrc
(yield "foo" >> src) $$ await
@joeyadams

This comment has been minimized.

Show comment Hide comment
@joeyadams

joeyadams Mar 12, 2013

Contributor

Darn, you're right. The finalizer would only run if the source is prompted for output. Thanks for pointing this out.

It would be nice if the finalizer were built into Pipe or ConduitM, so we could just have:

($$+) :: Monad m => Source m a -> Sink a m b -> m (Source m a, b)

This would reduce API complexity. For example, the http function in http-conduit could return a plain Source instead of a ResumableSource (meaning no need to explain ResumableSource vs Source). It would also enable the intuition that any pipe (Source, Sink, or Conduit) is a state machine that can be stepped or closed. On the other hand, it might constrain future optimizations.

This problem shouldn't hurt my conduit-resumable package too badly, but it will make it harder to glue resumables together.

Contributor

joeyadams commented Mar 12, 2013

Darn, you're right. The finalizer would only run if the source is prompted for output. Thanks for pointing this out.

It would be nice if the finalizer were built into Pipe or ConduitM, so we could just have:

($$+) :: Monad m => Source m a -> Sink a m b -> m (Source m a, b)

This would reduce API complexity. For example, the http function in http-conduit could return a plain Source instead of a ResumableSource (meaning no need to explain ResumableSource vs Source). It would also enable the intuition that any pipe (Source, Sink, or Conduit) is a state machine that can be stepped or closed. On the other hand, it might constrain future optimizations.

This problem shouldn't hurt my conduit-resumable package too badly, but it will make it harder to glue resumables together.

@snoyberg

This comment has been minimized.

Show comment Hide comment
@snoyberg

snoyberg Mar 13, 2013

Owner

If ConduitM had that functionality, there'd be no point for resumables. I've actually considered adding something like that, but (IIRC) it breaks the monad instance and kills performance.

Owner

snoyberg commented Mar 13, 2013

If ConduitM had that functionality, there'd be no point for resumables. I've actually considered adding something like that, but (IIRC) it breaks the monad instance and kills performance.

@joeyadams

This comment has been minimized.

Show comment Hide comment
@joeyadams

joeyadams Mar 14, 2013

Contributor

Thanks for all the insight. I pushed the conduit-resumable package to GitHub, but not to Hackage yet.

Contributor

joeyadams commented Mar 14, 2013

Thanks for all the insight. I pushed the conduit-resumable package to GitHub, but not to Hackage yet.

@snoyberg

This comment has been minimized.

Show comment Hide comment
@snoyberg

snoyberg Sep 8, 2013

Owner

Another possible location for this is the new conduit-extra package. @ppetr already created a Resumable module over there: https://github.com/snoyberg/conduit/blob/master/conduit-extra/Data/Conduit/Extra/Resumable.hs

Owner

snoyberg commented Sep 8, 2013

Another possible location for this is the new conduit-extra package. @ppetr already created a Resumable module over there: https://github.com/snoyberg/conduit/blob/master/conduit-extra/Data/Conduit/Extra/Resumable.hs

@snoyberg snoyberg closed this Sep 8, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment