Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@callback get_host_data to rhyme with get_peer_data? #948

Closed
garthk opened this issue Jun 4, 2020 · 6 comments
Closed

@callback get_host_data to rhyme with get_peer_data? #948

garthk opened this issue Jun 4, 2020 · 6 comments

Comments

@garthk
Copy link

garthk commented Jun 4, 2020

It'd help us fill in the net.host block of these OpenTelemetry general network connection attributes if Plug.Conn.Adapter declared a c:get_host_data/1 callback matching c:get_peer_data/1:

Attribute name Notes and examples
net.transport Transport protocol used. See note below.
net.peer.ip Remote address of the peer (dotted decimal for IPv4 or RFC5952 for IPv6)
net.peer.port Remote port number as an integer. E.g., 80.
net.peer.name Remote hostname or similar, see note below].
net.host.ip Like net.peer.ip but for the host IP. Useful in case of a multi-IP host.
net.host.port Like net.peer.port but for the host port.
net.host.name Local hostname or similar, see note below.

I can see what I need in conn.adapter thanks to Phoenix' call to :telemetry.execute([:phoenix, :router_dispatch, :start], …):

{Plug.Cowboy.Conn,
 %{
   host: "api.example.com",
   host_info: :undefined,
   method: "GET",
   path: "/api",
   path_info: :undefined,
   peer: {{10, 10, 20, 20}, 32972},
   pid: #PID<0.26689.0>,
   port: 4000,
   scheme: "http",
   sock: {{10, 10, 10, 10}, 4000},
   version: :"HTTP/1.1"
 }}

sock would be very handy, though looking at Plug.Cowboy.Conn I've no idea how it got in there:

defmodule Plug.Cowboy.Conn do
  @behaviour Plug.Conn.Adapter
  @moduledoc false

  def conn(req) do
    %{
      path: path,
      host: host,
      port: port,
      method: method,
      headers: headers,
      qs: qs,
      peer: {remote_ip, _}
    } = req

    %Plug.Conn{
      adapter: {__MODULE__, req},
      # ...
@josevalim
Copy link
Member

@garthk if you have the whole conn, you can call Plug.Conn.get_peer_data/1 to get peer data. However, I am afraid the socket is not available, we would need change Plug.Cowboy to also return the socket and any other data you may need. A PR is welcome or open up an issue with the fields you need and I can tackle it. :)

@mtrudel
Copy link
Contributor

mtrudel commented May 5, 2021

In the process of building out https://github.com/mtrudel/bandit I've also implemented a c:get_local_data/1 callback on an adjunct Bandit.Adapter behaviour for exactly this reason (it's needed to obtain port & SSL info to properly fill out a Plug struct). It's a useful bit to have, for sure.

@josevalim in a somewhat related question, is there an example of a caller for the newly introduced Plug.Conn.Adapter.conn/5 function? I ask because building the URI to pass into that function is a notoriously difficult problem to solve completely in a server context, and passing in the constituent parts of the URI would be much easier (indeed, to make use of this function in Bandit I would need to construct a URI struct from the exact components that the above function then extracts from the URI struct). Given that the get_local_data callback under discussion here is directly related to getting this information, I wonder if there isn't a chance to revise both at once.

@josevalim
Copy link
Member

You should be able to build the URI by hand by building the URI struct directly: %URI{host: ...}. WDYT?

Other than that, this is a very separate discussion from get_local_data because we need the URI information when building the connection, its info is used on all requests, while the get_local_data is only needed in certain cases.

@mtrudel
Copy link
Contributor

mtrudel commented May 6, 2021

Yep, sounds good to me. I haven't gotten around to updating bandit to use the the new Plug.Conn.Adapter.conn/5 function yet & I'd incorrectly remembered the URI struct to be a readonly one along the lines of DateTime. Insofar as a URI is a convenient way to bundle up a number of parameters, it makes sense for me to just build a URI struct up directly on the bandit side.

My worries were / are centered around concerns that web servers usually don't have a 100% canonical view of the URI requested, and typically build it based using heuristics against a number of sources. For example, the hostname of a request may be any of:

  • The content of the Host header (only guaranteed to be present in HTTP/1.1 +, and easily spoofable by the client)
  • The SNI hostname from TLS (not sure about spoofability)
  • Other out-of-band signalling (X- headers from a load balancer, for example)
  • Server hostname
  • User configuration

There are similar concerns surrounding canonicalizing requests (Do you include the port number if it's a standard port? Do you URL encode an otherwise invalid URI for the purposes of comparison? Does SNI trump the Host header? What about information that's just straight up not known like hostnames in HTTP 1.0? Can you guess?). Exposing a URI as the primary way for web servers to get that information into Plug glosses over many of these concerns, making it easy for a Plug to have fragile dependencies on the behaviour of a particular server implementation without even knowing it.

Seeing as all of the relevant input data is available via Plug, my thought is that it may be better to have Plug provide a single heuristic to build URIs from constituent parts rather than leaving it up to the discretion of the underlying web server. To the extent that the c:get_host_data call under discussion here could contribute to that (by providing information on the server port), it may be worth considering bringing such concerns into Plug. This would presumably change the shape of the Plug.Conn.Adapter.conn/5 function to explicitly take the constituent parts of a URI, along with a Plug.Conn function(s) to access the canonical URI for the request.

All of this is rather off topic from the content of this PR (which I think makes sense regardless of the above). That being said, if this makes sense to you I'm happy to open an issue for discussion along with an accompanying PR. I'm going to be doing the work to build out bandit's currently very rudimentary request canonicalization anyway; I'd be happy to do it within Plug instead.

@josevalim
Copy link
Member

josevalim commented May 6, 2021

Right now, the only adapter that we use does the heuristic itself, that's why we don't do it. So my recommendation for now would be for you to do it and then in the future we can move it to Plug if there is more need.

About URI, you are correct you are not allowed to build it by hand, but that's something I want to change short term. Just don't worry about the authority field.

@josevalim
Copy link
Member

Closing this. If still relevant, please let me know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants