HTTP Authentication #45

Closed
scott-fleischman opened this Issue May 1, 2017 · 13 comments

scott-fleischman commented May 1, 2017

It would be useful to be able to import Dhall expressions from URLs that require authentication such as retrieving a file from a private GitHub repository.

In the GitHub file case, it would be sufficient to allow one to specify Authorization and Accept headers to access a file using https://api.github.com.

Collaborator

Gabriel439 commented May 2, 2017

I suppose the most general solution is to allow the user to specify the relevant headers directly within the source as a value of type:

List { header : Text, value : Text }

... such as:

http://www.example.com using
[ { header = "Authorization", value = "token ..." }
, { header = "Accept"       , value = "text/plain" }
]

... and then you could protect those values using Dhall's import system. For example, the authorization token could be imported from a protected file:

http://www.example.com using
[ { header = "Authorization", value = ./secret }
, { header = "Accept"       , value = "text/plain" }
]

... or from an environment variable:

http://www.example.com using
[ { header = "Authorization", value = env:TOKEN }
, { header = "Accept"       , value = "text/plain" }
]

The main issue I can see with this is that Dhall could end up with multiple sets of import/typecheck/normalize phases, since the header information could itself be imported from a secured URL:

http://www.example.com using
[ { header = "Authorization"
  , value =
    http://www.example.com/foo using
    [ { header = "Authorization", value = "token ..." }
    , { header = "Accept", value = "text/plain" }
    ]
  }
, { header = "Accept", value = "text/plain" }
]

... which is fine and probably worthwhile to support since it would be very powerful if it worked and was safe

mckeankylej commented May 15, 2017

Super +1 on this idea!

Collaborator

Gabriel439 commented May 30, 2017

Sorry for the delay on this. Implementing the fully general approach is way trickier than I thought it would be and requires very invasive changes to the library API. There are some simpler approaches I've been considering, but in order to decide between them I need a better idea of the use cases that both of you have in mind for this feature.

scott-fleischman commented May 30, 2017

My use case is shared project configuration across multiple private GitHub repositories. In my current idea, the Dhall code will make at least one HTTP request for a file in a private repo, and possibly many such requests.

So for my purpose, I would be happy with specifying an Authorization header from an environment variable or from a local file (not in source control) for passable security. I would also like to use a literal value for Accept header to fine-tune the response content from the GitHub API.

Collaborator

Gabriel439 commented May 30, 2017

Ignoring what I originally proposed, what would be the best approach for provisioning the authorization header securely? For example, I assume a file is more secure than an environment variable because files can restrict access via permissions, but is there an approach that you would prefer to a file?

scott-fleischman commented May 30, 2017

I don't have too much of a preference between environment variables and files. Both seem like common yet simple approaches, and my impression is that neither is ideal from a security perspective. Beyond that, it seems that some kind of external server/vault can be used to access secrets, but that seems out of scope here.

Another thing that would be important is to be able to reuse the authorization to avoid repeating the header information all the time: either being able to bind auth to the header list and pass auth to each URL, or being able to make a binding for a function makeAuthorizedRequest that can take part or all of the URL as an argument.

Collaborator

Gabriel439 commented May 30, 2017

One more question: do you need to use the dhall executable, or would you be comfortable using the Dhall library API directly to customize the import logic to support authorization? In other words, if the API made it easier to supply custom HTTP headers, would you be okay with using that?

scott-fleischman commented May 30, 2017

I would prefer to use a published build of dhall-to-yaml from Hackage and have the Dhall code be able to handle the authentication for HTTP requests. I would be happiest if pure Dhall code could address our needs.

I could have a custom build of dhall-to-yaml or a custom Haskell application that executes Dhall code. But the extra piece just for HTTP authentication is an inconvenience.

Collaborator

Gabriel439 commented Jun 14, 2017

Still working on this very slowly. So another thing I'm realizing while I'm working on this is that if this is implemented within the language the header information should be passed as an import of some kind, like this:

http://example.com using ./headers

... or this:

http://example.com using env:HEADERS

There are two main reasons for this:

First, the header information has to be parsed/loaded/type-checked independently of the surrounding module in a separate phase, but if you provide the header information inline then it can lead to user confusion if they do something like this:

let
  authorization = "token ..."
in
  http://www.example.com using
  [ { header = "Authorization", value = authorization }
  , { header = "Accept"       , value = "text/plain" }
  ]

That would lead to a confusing Unbound variable error for the user because of the authorization variable in the header information. The let-bound authorization wouldn't be in scope when resolving http://www.example.com, which occurs in a separate phase.

Forcing the header information to be stored in a separate import ensures that it's clear to users that the header expression must be closed and avoids this sort of confusion

The second reason for doing this is that it simplifies the implementation significantly

I assume that's okay with you two, but I just wanted to run that requirement by here in case anybody had any objections
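
A minimal sketch of the restructuring described above, assuming the `using` syntax and the `header`/`value` record fields proposed in this thread. The let binding moves into the headers file itself, so the header expression is closed. The file name `./headers` is only illustrative, and the token stays elided as in the example above.

```dhall
-- ./headers: a closed expression, so `authorization` is bound inside this file
let authorization = "token ..."
in  [ { header = "Authorization", value = authorization }
    , { header = "Accept"       , value = "text/plain" }
    ]
```

The importing module would then refer to that file instead of an inline list, so nothing in the header expression depends on the surrounding scope:

```dhall
http://www.example.com using ./headers
```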

mckeankylej commented Jun 15, 2017

Actually I think that is quite a good solution. I think it would be easy to use.

scott-fleischman commented Jun 15, 2017

That works for me. It also addresses reuse of authorization credentials.
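
As a rough illustration of how a shared headers file might cover the private GitHub repository use case, here is a sketch that assumes the `using` syntax proposed in this thread. The file name `./github-headers`, the environment variable `GITHUB_AUTH`, and the `OWNER/REPO` paths are all hypothetical placeholders. The variable is assumed to hold a Dhall Text literal such as `"token ..."`, and the Accept value is GitHub's raw-content media type.

```dhall
-- ./github-headers (hypothetical name; kept out of source control)
[ { header = "Authorization", value = env:GITHUB_AUTH }
, { header = "Accept"       , value = "application/vnd.github.v3.raw" }
]
```

Several imports could then reuse the same credentials without repeating the header list:

```dhall
-- OWNER, REPO and the file names are placeholders, not real paths
{ shared  = https://api.github.com/repos/OWNER/REPO/contents/shared.dhall using ./github-headers
, project = https://api.github.com/repos/OWNER/REPO/contents/project.dhall using ./github-headers
}
```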

Collaborator

Gabriel439 commented Jun 16, 2017

Alright, so I have a very hacky version of this working:

$ cat headers
[ { header = "foo", value = "bar" } ]
$ dhall <<< 'https://httpbin.org/headers using ./headers as Text'
Text

"{\n  \"headers\": {\n    \"Accept-Encoding\": \"gzip\", \n    \"Connection\": \"close\", \n    \"Foo\": \"bar\", \n    \"Host\": \"httpbin.org\"\n  }\n}\n"

It still needs a lot more polish, but it should be ready soon and I will put up a pull request when it is. Restricting the headers to be supplied from a PathType greatly simplified the implementation.

There is one other corner case that I needed to handle along the way, which is what headers to supply for an import relative to a URL. In other words, suppose that you import:

http://example.com using ./someHeaders

... and then http://example.com serves code that contains a relative import of ./foo. The question is whether or not you should reuse the same headers when importing http://example.com/foo. For now I've decided to forward the same headers to all relative imports as well.

My understanding is that this is probably not a security issue, even if you supply a sensitive header to a given host (such as an authorization header). The only way those headers can be forwarded to another import is if the original host (which already had access to the sensitive headers) serves a file with a relative import. If you trust the original URL with the sensitive header, then you should also trust it to not serve code that redirects to a malicious endpoint on the same host. There would be no point in it redirecting to a malicious endpoint since it could just directly steal the sensitive header at the original endpoint.

Also, if you wish to not forward the headers to the same host, you can always reset the headers by just importing the full URL again without any headers instead of using a relative import.
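
To make the forwarding rule concrete, here is a sketch of what the code served by http://example.com might look like, assuming the behaviour described in this comment. The record field names are purely illustrative.

```dhall
-- Hypothetical code served by http://example.com, which was itself imported
-- with `using ./someHeaders`
{ forwarded = ./foo                    -- relative import: the caller's headers are forwarded
, reset     = http://example.com/foo   -- full URL with no `using`: no custom headers are sent
}
```

Both fields point at the same resource. Only the relative form inherits the headers supplied by the original importer.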

Collaborator

Gabriel439 commented Jun 16, 2017

Alright, the pull request is up at #71

I still need to add documentation to it but in the meantime you can play around with it and let me know if that solves your use case
