Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPv6 only hosts support #306

Closed
nekto0n opened this issue Oct 8, 2013 · 24 comments · Fixed by #1178
Closed

IPv6 only hosts support #306

nekto0n opened this issue Oct 8, 2013 · 24 comments · Fixed by #1178

Comments

@nekto0n
Copy link

nekto0n commented Oct 8, 2013

If I'm right host identifier in trace is currently named ipv4 and encoded as int32 and used as integer throughout the system (column types, etc).
This automaticaly disables ipv6 only hosts from being compatible with zipkin. Are there any plans or thoughts how one can add such support or maybe workaround can be found?

@bmdhacks
Copy link
Contributor

bmdhacks commented Oct 8, 2013

It's a pretty big change to work around it, not that I wouldn't welcome a
patch. Note that this is in no way related to whether Twitter offers ipv6
endpoints to our customers.

On Tue, Oct 8, 2013 at 3:07 PM, Nikita Vetoshkin
notifications@github.comwrote:

If I'm right host identifier in trace is currently named ipv4 and encoded
as int32 and used as integer throughout the system (column types, etc).
This automaticaly disables ipv6 only hosts from being compatible with
zipkin. Are there any plans or thoughts how one can add such support or
maybe workaround can be found?


Reply to this email directly or view it on GitHubhttps://github.com//issues/306
.

@nekto0n
Copy link
Author

nekto0n commented Oct 8, 2013

@bmdhacks yes, I understand that :)
Any suggestions/ideas/anything how can this be patched or where to look for help/advice?

@fedor57
Copy link

fedor57 commented Feb 17, 2015

Hi, I have some idea, can you judge it?

  • I assume that IPv4 addresses are just being stored in Zipkin DB without any interpretation, indexing or using as keys, right?
  • make sure instrumentation does not break while working on pure IPv6 hosts. Just to store 0.0.0.0 as IPv4 in this case
  • add extra default annotation like this "ipv6"="1a02:3b8::402:7a31:c2ff:fef2:b61d" for span at server side
    • tweak Web UI to do not show 0.0.0.0 as IP

Another useful step:

  • * add two annotations like "remotehost" = "myservice.mydomain.com" and "localhost" = "myservice-345-23-node.mydomain.com" to see the names outside and inside SLB.

What do you think? Will it significantly increase the amount of data? Do you have any proposal of how to do it more naturally?

@mattbenjamin
Copy link

Is anyone working on this? My preference would be either:
a) not constrain endpoint except to be appropriately unique
b) make endpoint self-describing--just supporting ipv6 just kicks the can down the road, and not that far, for me

Matt

@codefromthecrypt
Copy link
Member

This, issue came up again in armeria from @trustin. I wonder if we want to consider a tactical workaround or not?
line/armeria#195 (comment)

Right now, there's no special indexing in zipkin for the ip address anyway, so it is more or less how we handle the union concern of ipv4, ipv6 or I don't know. This would permeate all sorts of tracers, who probably already came onto this problem.

For example, I know opentracing expresses the ipv6 of a peer as a "tag". However, whatever we come up with needs to be able to work for both the local endpoint (the one making annotations), and also the address binary annotations (which designate client vs server)

cc @kristofa @eirslett @abesto @yurishkuro @jcarres-mdsol @basvanbeek for ideas.

@abesto
Copy link
Member

abesto commented Jul 5, 2016

Agree we need to support ipv6 somehow. How about the naïve approach, just adding a new field "ipv6" to the annotations? The only problem I see is enforcing that at least one of ipv[4,6] is set. Or we could just say that we don't really care whether it's an ipv4 or an ipv6 address, anyone who cares can decide about an address trivially which one it is.

So, idea: let's introduce a new field "ip" or "ipAddress" in the relevant annotations, and deprecate "ipv4". The new field can be either an ipv4 or an ipv6 address.

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 5, 2016 via email

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 5, 2016 via email

@basvanbeek
Copy link
Member

basvanbeek commented Jul 5, 2016

the zipkin-go-opentracing implementation can easily be adjusted without consumer API breakage. So what ever we come up with will be fine

@abesto
Copy link
Member

abesto commented Jul 6, 2016

How about this: since Zipkin itself doesn't semantically use the IP, just displays it (is this true?), we can create a permissive data structure - separate ipv[4,6] fields, and let users decide what they want to put there. Let them come up with their own convention, and make this explicit, maybe providing best practices.

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 6, 2016 via email

@codefromthecrypt
Copy link
Member

ps here's a link about model problems, and behind my desire for this issue to result in a tactical change vs an incompatible one. It is choosing battles..

Ex I have time to help with a tactical change like this, and prefer to punt incompatible changes to something that packs more punch #939

I'm thinking about thrift implications (cassandra), json implications (elasticsearch), and schema implications (another column in mysql), as well how to make this a "add this here" step for instrumentors. It is a lot of steps already, and adding something incompatible or inconsistent makes it less inviting for someone to jump in and do.

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 6, 2016

so here's a summary of compatible options that I think are possible and relatively equivalent work unless called out otherwise. I'm ok with either, actually, and happy to have folks vote for whatever they prefer and implement accordingly.

Add Endpoint.address[String]

This would hold an opaque value and deprecates ipv4. We could say that this overrides any value held in ipv4, and perhaps backfill it at query time (as we do with Span.timestamp).

Thrift

We'd mark ipv4 deprecated and add the following:

struct Endpoint {
--snip--
 /**
  * Holds the text representation of a IPv4 or IPv6 address associated with this endpoint
  * 
  * IPv4 address: Ex. "1.2.3.4" per RFC 2673, section 3.2.
  * IPv6 address: Ex. "2001:0db8:85a3:0000:0000:8a2e:0370:7334" per RFC 2373, section 2.2.
  */
  4: optional string address

Json

I'm not aware of a json-schema type that describes both ipv4 and ipv6.

Add Endpoint.ipv6[bytes]

This would be optional and hold the 128 bits for the ipv6 address and doesn't affect ipv4.

Thrift

We'd the following:

struct Endpoint {
--snip--
 /**
  * IPv6 host address packed into 16 bytes. Ex Inet6Address.getBytes()
  */
  4: optional binary ipv6

Json

Uses text format, just like the "ipv4" currently does. Document "ipv6" as the the json schema type of the same name.

Pros and Cons

address[String] will require a larger thrift field and row, although in both cases it is bounded by largest text form of an IP (50bytes). Some may object to the rows * 50 bytes enlarging of annotation/binary annotation namespace, and can choose not to store this.

ipv6[bytes] is separate from ipv4, which means an endpoint could have both an ipv6 and ipv4 association. It is more consistent in approach, but the current approach of ipv4 in thrift has proven annoying to some. This choice still needs a column even if it only needs to be 16bytes long.

@codefromthecrypt
Copy link
Member

@mosesn if you don't mind.. could you give a "twitter answer" on this topic?

@basvanbeek
Copy link
Member

I'm in favor of Endpoint.ipv6

@eirslett
Copy link
Contributor

eirslett commented Jul 6, 2016

We could base64-encode the ipv6 address, in JSON it's a pain to work with binary data...

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 6, 2016 via email

@mosesn
Copy link
Contributor

mosesn commented Jul 6, 2016

@adriancole let me try to find someone from Twitter who has more background on this.

@abesto
Copy link
Member

abesto commented Jul 6, 2016

You convinced me Endpoint.ipv6 is the way to go (with binary in thrift, string in json).

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 7, 2016

Thinking out loud here...

a related change we could do is leniently parse ipv4 in thrift, defaulting to 0.0.0.0. For example, I've noticed people stuff '0' when they don't have an ipv4. We couldn't return null from Endpoint.ipv4, as that would break people, but we could default it to zero and document the practice. This would keep those with ipv6 endpoints a little smaller as they don't have to encode a fake ipv4 address.

cc @liyichao @trustin on this one

@tstumpf
Copy link

tstumpf commented Jul 7, 2016

Endpoint.ipv6 as a binary seems like the best approach to us (chiming in for Twitter's zipkin team).

Our rational being: the utility of Endpoint.address' is uncertain but comes at a guaranteed cost for ipv6. The world is definitely moving to ipv6, it's practically inevitable, and we can expect ipv6 to be the dominant case in the near future, so let's optimize for it.

FWIW, a default of 0 for ipv4, that seems sane, but I don't have any useful thoughts one way or the other about it.

cc @adriancole @mosesn

@codefromthecrypt
Copy link
Member

codefromthecrypt commented Jul 9, 2016 via email

codefromthecrypt pushed a commit that referenced this issue Jul 11, 2016
This adds `Endpoint.ipv6` as a fixed-width byte array (16 bytes). This
formalizes `Endpoint.ipv4 == 0` implying there's no ipv4 address.

In thrift, this remains a byte array (String) at field 4.
In json, this is normal string formatting.
 * normal java utilities are used for codec (Inet6Address)

In the UI, the ipv6 address is preferred and bracketed when present.
Ex. [2001:db8:0:0:0:0:0:c002]:8080

In MySQL, this is mapped to a `BINARY(16)` field named
`zipkin_annotations.endpoint_ipv6`. When this column is missing a
warning like below is printed:

```
Jul 11, 2016 4:08:53 PM zipkin.storage.mysql.HasIpv6 compute
WARNING: zipkin_annotations.ipv6 doesn't exist, so Endpoint.ipv6 is not supported. Execute: alter table zipkin_annotations add `endpoint_ipv6` BINARY(16)
```

Fixes #306
@codefromthecrypt
Copy link
Member

here's the impl. when merged I can update zipkin-api's repo (to update the thrift and json model definition) #1178

@codefromthecrypt
Copy link
Member

here's the api definition change openzipkin/zipkin-api#20

codefromthecrypt pushed a commit to openzipkin/brave that referenced this issue Sep 22, 2016
This adds Endpoint.ipv6 as a fixed-width byte array (16 bytes). This
formalizes Endpoint.ipv4 == 0 implying there's no ipv4 address. These
are supported in recent versions of Zipkin with the same semantics.

You can use this via the Endpoint constructor of Brave, or via tracer
hooks like ClientTracer.setClientSent or ServerTracer.setServerReceived

This also deprecates Endpoint.create(serviceName,ipv4,port) as it leads
to NullPointerExceptions

See openzipkin/zipkin#306
See openzipkin/zipkin#1309
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.