Blob download does not use URL escaping, so slashes in object names cause failures #87

Open
mgsloan opened this issue Dec 7, 2017 · 3 comments

mgsloan commented Dec 7, 2017

Here is a repro:

#!/usr/bin/env stack
-- stack script --resolver lts-9.17
--   --package gogol --package gogol-storage
--   --package lens --package servant

{-# LANGUAGE OverloadedStrings #-}

import Control.Lens
import Control.Monad
import Control.Monad.IO.Class
import Data.ByteString (ByteString)
import Data.Proxy
import Network.Google
import Network.Google.Storage
import Network.HTTP.Conduit (newManager, tlsManagerSettings)
import Servant.API.ContentTypes
import System.IO
import qualified Data.Text as T

type S = Scopes ObjectsInsert

main :: IO ()
main = do
  manager <- liftIO $ newManager tlsManagerSettings
  credentials <- liftIO $ getApplicationDefault manager
  logger <- newLogger Trace stdout
  env <- newEnvWith credentials logger manager
  runResourceT $ runGoogle env $ do
    insertAndGet "gogol-bug:a-key"
    insertAndGet "gogol-bug:a-key/with-slashes"

insertAndGet :: T.Text -> Google S ()
insertAndGet key = do
  let bucket = "a-bucket"
      body = "a-body" :: ByteString
  let ureq = objectsInsert bucket object' & oiName ?~ key
  upload ureq (toBody (Proxy :: Proxy OctetStream) body)
  let dreq = objectsGet bucket key
  void $ download dreq
  liftIO $ putStrLn $ T.unpack key ++ " download successful"

Save it as gogol-bug.hs, set the bucket variable to a bucket in your account, run chmod u+x gogol-bug.hs, and then run ./gogol-bug.hs. This is the output:

[Client Request] {
  host      = www.googleapis.com:443
  secure    = True
  method    = POST
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 10
  path      = /upload/storage/v1/b/a-bucket/o
  query     = ?name=gogol-bug%3Aa-key&alt=json&uploadType=multipart
  headers   = <REDACTED>
  body      =  <msger:253>
}
[Client Response] {
  status  = 200 OK
  headers = <REDACTED>
}
[Client Request] {
  host      = www.googleapis.com:443
  secure    = True
  method    = GET
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 10
  path      = /storage/v1/b/a-bucket/o/gogol-bug:a-key
  query     = ?alt=media
  headers   = <REDACTED>
  body      = 
}
[Client Response] {
  status  = 200 OK
  headers = <REDACTED>
}
gogol-bug:a-key download successful
[Client Request] {
  host      = www.googleapis.com:443
  secure    = True
  method    = POST
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 10
  path      = /upload/storage/v1/b/a-bucket/o
  query     = ?name=gogol-bug%3Aa-key%2Fwith-slashes&alt=json&uploadType=multipart
  headers   = <REDACTED>
  body      =  <msger:253>
}
[Client Response] {
  status  = 200 OK
  headers = <REDACTED>
}
[Client Request] {
  host      = www.googleapis.com:443
  secure    = True
  method    = GET
  timeout   = ResponseTimeoutMicro 70000000
  redirects = 10
  path      = /storage/v1/b/a-bucket/o/gogol-bug:a-key/with-slashes
  query     = ?alt=media
  headers   = <REDACTED>
  body      = 
}
[Client Response] {
  status  = 404 Not Found
  headers = 
}
gogol-bug.hs: ServiceError (ServiceError' {_serviceId = ServiceId "storage:v1", _serviceStatus = Status {statusCode = 404, statusMessage = "Not Found"}, _serviceHeaders = [("X-GUploader-UploadID",<REDACTED>),("Vary","Origin"),("Vary","X-Origin"),("Content-Type","text/html; charset=UTF-8"),("Date","Thu, 07 Dec 2017 14:43:31 GMT"),("Expires","Thu, 07 Dec 2017 14:43:31 GMT"),("Cache-Control","private, max-age=0"),("Content-Length","9"),("Server","UploadServer"),("Alt-Svc",<REDACTED>)], _serviceBody = Just "Not Found"})

In other words, uploading an object with a slash in the key works, but downloading it fails. Note the path of the failing request, /storage/v1/b/a-bucket/o/gogol-bug:a-key/with-slashes. The Google docs specify that path parts must be URL encoded (https://cloud.google.com/storage/docs/json_api/#encoding). However, the object name appears to be substituted into the path verbatim.

Is this a gotcha with servant's Capture for Text?
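
One way to probe that question outside of gogol is to look at what http-api-data (the package behind servant's ToHttpApiData class) does with a Text value. This is only a sketch: it assumes an http-api-data version that exports toEncodedUrlPiece, and it says nothing about which of the two functions gogol actually calls when it builds the request path.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy.Char8 as BL
import Data.Text (Text)
import qualified Data.Text.IO as T
import Web.HttpApiData (toEncodedUrlPiece, toUrlPiece)

-- The object name from the repro that fails to download.
key :: Text
key = "gogol-bug:a-key/with-slashes"

main :: IO ()
main = do
  -- toUrlPiece for Text returns the text unchanged, slash included.
  T.putStrLn (toUrlPiece key)
  -- toEncodedUrlPiece percent-encodes the value as a single path segment,
  -- so the slash should come out as %2F.
  BL.putStrLn (B.toLazyByteString (toEncodedUrlPiece key))
```

If the path is assembled from the unencoded form, the slash would reach the server as a path separator, which would match the 404 above.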

mgsloan commented Dec 8, 2017

Ah, I see now that the docs for the object parameter mention it:

Name of the object. For information about how to URL encode object names to be path safe, see Encoding URI Path Parts.

Shouldn't the API handle this for you? The current behavior could still be offered as a lower-level, raw API.

bergey commented Jan 31, 2019

I just ran into this, and was about to open an issue. The docs on Encoding URI Path Parts say

Note that encoding is typically handled for you by client libraries, so you can pass the raw object name to them.

mgsloan commented Jan 31, 2019

Yeah, I think the escaping ought to be handled by the library. FWIW, here's the function I wrote for doing the escaping, back when I encountered this issue:

import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8, decodeUtf8With)
import Data.Text.Encoding.Error (lenientDecode)
import qualified Network.HTTP.Types as HTTP

urlEncodeKey :: Text -> Text
urlEncodeKey
  = decodeUtf8With lenientDecode
  . HTTP.urlEncode False
  . encodeUtf8

To use it, call Network.Google.Storage.objectsGet bucket (urlEncodeKey key). It should be possible to write a more efficient version that doesn't convert to ByteString and back, but I haven't bothered.
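
For a quick check without touching a bucket, the same urlEncodeKey can be run on the two keys from the repro. This is a standalone sketch that only needs text and http-types; the main wrapper is added here for illustration.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Text.Encoding (decodeUtf8With, encodeUtf8)
import Data.Text.Encoding.Error (lenientDecode)
import qualified Data.Text.IO as T
import qualified Network.HTTP.Types as HTTP

-- Same definition as above: percent-encode the key as a path segment.
urlEncodeKey :: Text -> Text
urlEncodeKey
  = decodeUtf8With lenientDecode
  . HTTP.urlEncode False  -- False selects path-style (not query-style) escaping
  . encodeUtf8

main :: IO ()
main = mapM_ (T.putStrLn . urlEncodeKey)
  [ "gogol-bug:a-key"               -- no slash to escape
  , "gogol-bug:a-key/with-slashes"  -- the '/' becomes %2F
  ]
```

With the slash escaped, the whole key stays in a single path segment after /o/, which is what the Encoding URI Path Parts docs ask for.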
