diff --git a/book/asciidoc/http-conduit.asciidoc b/book/asciidoc/http-conduit.asciidoc index 54c442fe..cc74090f 100644 --- a/book/asciidoc/http-conduit.asciidoc +++ b/book/asciidoc/http-conduit.asciidoc @@ -1,211 +1,6 @@ [appendix] == http-conduit -Most of Yesod is about serving content over HTTP. But that's only half the -story: someone has to receive it. And even when you're writing a web app, -sometimes that someone will be you. If you want to consume content from other -services or interact with RESTful APIs, you'll need to write client code. And -the recommended approach for that is http-conduit. - -This chapter is not directly connected to Yesod, and will be generally useful -for anyone wanting to make HTTP requests. - -=== Synopsis - -[source, haskell] ----- -{-# LANGUAGE OverloadedStrings #-} -import Network.HTTP.Conduit -- the main module - --- The streaming interface uses conduits -import Data.Conduit -import Data.Conduit.Binary (sinkFile) - -import qualified Data.ByteString.Lazy as L -import Control.Monad.IO.Class (liftIO) -import Control.Monad.Trans.Resource (runResourceT) - -main :: IO () -main = do - -- Simplest query: just download the information from the given URL as a - -- lazy ByteString. - simpleHttp "http://www.example.com/foo.txt" >>= L.writeFile "foo.txt" - - -- Use the streaming interface instead. We need to run all of this inside a - -- ResourceT, to ensure that all our connections get properly cleaned up in - -- the case of an exception. - runResourceT $ do - -- We need a Manager, which keeps track of open connections. simpleHttp - -- creates a new manager on each run (i.e., it never reuses - -- connections). - manager <- liftIO $ newManager conduitManagerSettings - - -- A more efficient version of the simpleHttp query above. First we - -- parse the URL to a request. - req <- liftIO $ parseUrl "http://www.example.com/foo.txt" - - -- Now get the response - res <- http req manager - - -- And finally stream the value to a file - responseBody res $$+- sinkFile "foo.txt" - - -- Make it a POST request, don't follow redirects, and accept any - -- status code. - let req2 = req - { method = "POST" - , redirectCount = 0 - , checkStatus = \_ _ _ -> Nothing - } - res2 <- http req2 manager - responseBody res2 $$+- sinkFile "post-foo.txt" ----- - -=== Concepts - -The simplest way to make a request in +http-conduit+ is with the +simpleHttp+ -function. This function takes a +String+ giving a URL and returns a -+ByteString+ with the contents of that URL. But under the surface, there are a -few more steps: - -* A new connection +Manager+ is allocated. - -* The URL is parsed to a +Request+. If the URL is invalid, then an exception is thrown. - -* The HTTP request is made, following any redirects from the server. - -* If the response has a status code outside the 200-range, an exception is thrown. - -* The response body is read into memory and returned. - -* +runResourceT+ is called, which will free up any resources (e.g., the open socket to the server). - -If you want more control of what's going on, then you can configure any of the -steps above (plus a few more) by explicitly creating a +Request+ value, -allocating your +Manager+ manually, and using the +http+ and +httpLbs+ -functions. - -=== Request - -The easiest way to creating a +Request+ is with the +parseUrl+ function. This -function will return a value in any +Failure+ monad, such as +Maybe+ or +IO+. -The last of those is the most commonly used, and results in a runtime exception -whenever an invalid URL is provided. However, you can use a different monad if, -for example, you want to validate user input. - -[source, haskell] ----- -import Network.HTTP.Conduit -import System.Environment (getArgs) -import qualified Data.ByteString.Lazy as L -import Control.Monad.IO.Class (liftIO) - -main :: IO () -main = do - args <- getArgs - case args of - [urlString] -> - case parseUrl urlString of - Nothing -> putStrLn "Sorry, invalid URL" - Just req -> withManager $ \manager -> do - res <- httpLbs req manager - liftIO $ L.putStr $ responseBody res - _ -> putStrLn "Sorry, please provide exactly one URL" ----- - -The +Request+ type is abstract so that +http-conduit+ can add new settings in -the future without breaking the API (see the Settings Type chapter for more -information). In order to make changes to individual records, you use record -notation. For example, a modification to our program that issues +HEAD+ -requests and prints the response headers would be: - -[source, haskell] ----- -{-# LANGUAGE OverloadedStrings #-} -import Network.HTTP.Conduit -import System.Environment (getArgs) -import qualified Data.ByteString.Lazy as L -import Control.Monad.IO.Class (liftIO) - -main :: IO () -main = do - args <- getArgs - case args of - [urlString] -> - case parseUrl urlString of - Nothing -> putStrLn "Sorry, invalid URL" - Just req -> withManager $ \manager -> do - let reqHead = req { method = "HEAD" } - res <- http reqHead manager - liftIO $ do - print $ responseStatus res - mapM_ print $ responseHeaders res - _ -> putStrLn "Sorry, please provide example one URL" ----- - -There are a number of different configuration settings in the API, some noteworthy ones are: - -proxy:: Allows you to pass the request through the given proxy server. - -redirectCount:: Indicate how many redirects to follow. Default is 10. - -checkStatus:: Check the status code of the return value. By default, gives an exception for any non-2XX response. - -requestBody:: The request body to be sent. Be sure to also update the +method+. For the common case of url-encoded data, you can use the +urlEncodedBody+ function. - -=== Manager - -The connection manager allows you to reuse connections. When making multiple -queries to a single server (e.g., accessing Amazon S3), this can be critical -for creating efficient code. A manager will keep track of multiple connections -to a given server (taking into account port and SSL as well), automatically -reaping unused connections as needed. When you make a request, +http-conduit+ -first tries to check out an existing connection. When you're finished with the -connection (if the server allows keep-alive), the connection is returned to the -manager. If anything goes wrong, the connection is closed. - -To keep our code exception-safe, we use the +ResourceT+ monad transformer. All -this means for you is that your code needs to be wrapped inside a call to -+runResourceT+, either implicitly or explicitly, and that code inside that -block will need to +liftIO+ to perform normal IO actions. - -There are two ways you can get ahold of a manager. +newManager+ will return a -manager that will not be automatically closed (you can use +closeManager+ to do -so manually), while +withManager+ will start a new +ResourceT+ block, allow you -to use the manager, and then automatically close the +ResourceT+ when you're -done. If you want to use a +ResourceT+ for an entire application, and have no -need to close it, you should probably use +newManager+. - -One other thing to point out: you obviously don't want to create a new manager -for each and every request; that would defeat the whole purpose. You should -create your +Manager+ early and then share it. - -=== Response - -The +Response+ datatype has three pieces of information: the status code, the -response headers, and the response body. The first two are straight-forward; -let's discuss the body. - -The +Response+ type has a type variable to allow the response body to be of -multiple types. If you want to use ++http-conduit++'s streaming interface, you -want this to be a +Source+. For the simple interface, it will be a lazy -+ByteString+. One thing to note is that, even though we use a lazy -+ByteString+, _the entire response is held in memory_. In other words, we -perform no lazy I/O in this package. - -NOTE: The +conduit+ package does provide a lazy module which would allow you to -read this value in lazily, but like any lazy I/O, it's a bit unsafe, and -definitely non-deterministic. If you need it though, you can use it. - -=== http and httpLbs - -So let's tie it together. The +http+ function gives you access to the streaming -interface (i.e., it returns a +Response+ using a +ResumableSource+) while -+httpLbs+ returns a lazy +ByteString+. Both of these return values in the -+ResourceT+ transformer so that they can access the +Manager+ and have -connections handled properly in the case of exceptions. - -NOTE: If you want to ignore the remainder of a large response body, you can -connect to the +sinkNull+ sink. The underlying connection will automatically be -closed, preventing you from having to read a large response body over the -network. +This content has been superceded by +link:https://github.com/commercialhaskell/jump/blob/master/doc/http-client.md[Jump +content on http-client].