Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(admin) get a list of active targets per upstream #2230

Merged
merged 1 commit into from
Mar 24, 2017

Conversation

p0pr0ck5
Copy link
Contributor

No description provided.

@p0pr0ck5 p0pr0ck5 requested a review from Tieske March 21, 2017 17:43
@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch from cc4c3a9 to 9a915d8 Compare March 21, 2017 18:11
@thibaultcha
Copy link
Member

(feat(admin))

Copy link
Member

@thibaultcha thibaultcha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (except for the commit scope), but simply suggesting implementing this under the /active/ path segment rather than relying on a qs param maybe?


-- user wanted a list of all active targets for this upstream
local target_history, err = dao_factory.targets:find_all(
{ upstream_id = self.params.upstream_id }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure why this table's brackets are on the same line instead of:

thing({

})

return crud.paginated_set(self, dao_factory.targets)
end

self.params.active = nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we considered implementing this under:

/upstreams/{upstream}/targets/active/

?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought here was that active is a modifier of the resource, not a separate (or sub) resource. This is why I choose to leave it as a query param, not a URI. But my experience defining RESTful routes is very limited :) Let me know what your thoughts are given the above, I will happily defer to your expertise here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My first thought was with @thibaultcha proposal, but the @p0pr0ck5 explanation makes sense as well.

Googled around a bit and it seems the query parameter is the better option

Copy link
Member

@thibaultcha thibaultcha Mar 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Googled around a bit and it seems the query parameter is the better option

Could you share those sources? Curious to see.

The way I see this, those two endpoints do not really give you the same entities, or at least, not in the same context, which is everything here:

  • /upstreams/{upstream}/targets/ gets you a - paginated - list of all the target rows currently in the database.
  • /upstreams/{upstream}/targets/active/ gets you a subset of those above rows, currently in use in the nodes memory, at runtime.

It's all about the context in which those entities are presented here. I think it is significant enough to warrant another endpoint. Additionally, we now have a single endpoint that is paginated by default, but not paginated when the ?active= qs param is given. Overall, I think it's fine not to paginate active targets, but I do think this is another reason why it shouldn't be under the same endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

^ Works for me! I'll fix this up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thibaultcha

here's one: http://stackoverflow.com/questions/30967822/when-do-i-use-path-params-vs-query-params-in-a-restful-api

the 404 related comment in the top answer here seems legit: http://stackoverflow.com/questions/4024271/rest-api-best-practices-where-to-put-parameters

That said from the overall kong api now, the path based solution seems easier to use, though it might be not as restful. So effectively I do not have a preference.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally speaking, I tend to use path parameters when there is an obvious 'hierarchy' in the resource

I this this is most definitely the case here, where there is a hierarchy between the unused and used targets.

@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch from 9a915d8 to 9cdfcf4 Compare March 22, 2017 03:05
@p0pr0ck5 p0pr0ck5 changed the title feat(targets) get a list of active targets per upstream feat(admin) get a list of active targets per upstream Mar 22, 2017
@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch from 9cdfcf4 to 6ca88e4 Compare March 22, 2017 03:07
for _, target in ipairs(target_history) do
target.order = target.target..":"..target.created_at
end
table.sort(target_history, function(a, b) return a.order > b.order end)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sorting is not ok, see https://github.com/Mashape/kong/blob/6ca88e4d5732320d03cf7b4b40e7f79c903392a7/kong/api/routes/upstreams.lua#L99-L101

It has to be deterministic, so the same order must be used everywhere. As the timestamp might have duplicates (edge case, but could happen) and then the sort order is relevant.

target.order = target.target..":"..target.created_at..":"..target.id ought to be fine

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reminder: We should have used ulid to prevent this whole thing from the start

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Afraid I'm having a little troubIe understanding here. Isn't target.id unique across all target objects? Just having trouble understanding how it's helpful to have it in the sort order in this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assume same target, 2 target entries, same timestamp. 1st weight = 100, second weight = 0

Now, is the target active or not? Your sort key is identical, hence order is undetermined.

My sort key includes the id (a unique uuid), hence sort order is deterministic (and the same order as used elsewhere, when constructing the balancer, as well as checking active in the POST code)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Err... you're basing the sort order on a (supposedly randomly generated) UUID? This pretty much the definition of an undeterministic sort :)

Also, if the sort keys are equal, the resulting sorted elements should just be a function of the order of the elements, no? So this feels like a moot point, since we would be sorting based on order if the timestamps where the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If timestamps are identical, my understanding is that, since the sort keys are identical, the sort order will not change. In this case, the order of the table elements remains unchanged. Consider the following:

> t = { { a = 1, foo = "baz" }, { a = 1, foo = "bar" } }
> table.sort(t, function(a, b) return a.a < b.a end)
> print(t[1].foo)
baz
> print(t[2].foo)
bar

When we compare on equal values, the order is unchanged.

If multiple target entities contained the same UUID, I agree that's a valid sorting element. But because (from my understanding, please correct me if I'm wrong), each POST to /upstreams/:name/targets results in a new entry with a unique UUID, sorting on these UUIDs is useless, and the above applies. If I'm misunderstanding please let me know :)

Copy link
Member

@Tieske Tieske Mar 23, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If timestamps are identical, my understanding is that, since the sort keys are identical, the sort order will not change.

is undefined in Lua, hence non-deterministic

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for deterministic sorting, the only option is to make sure that no key ever equals another key.

target.target will have doubles, target.created_at might have doubles (unlikely, but still), hence we must add yet another element (with lowest sort priority) that always makes the key unique, hence we add the target.id which is unique by definition.

Now we can be sure that any sorting always will yield the exact same order.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when you have two sets of two targets with the same create time that look like such:

[
    {
      "weight": 0,
      "id": "38b4ab34-f6da-4f88-9e03-aa53f75ed9f7",
      "target": "1.2.3.4:8000",
      "created_at": 1490116516932,
      "upstream_id": "413ea7ee-2740-445d-9c2e-d042e2c7a798"
    },
    {
      "weight": 20,
      "id": "8676ddb9-cd96-40ed-a537-7cc714faac44",
      "target": "1.2.3.4:8000",
      "created_at": 1490116516932,
      "upstream_id": "413ea7ee-2740-445d-9c2e-d042e2c7a798"
    }
]

See that target and created_at are equal; note also the differences in UUIDs (the first element's id is 'lower'). Now consider the following pair:

[
    {
      "weight": 0,
      "id": "38b4ab34-f6da-4f88-9e03-aa53f75ed9f7",
      "target": "1.2.3.4:8000",
      "created_at": 1490116516932,
      "upstream_id": "413ea7ee-2740-445d-9c2e-d042e2c7a798"
    }
    {
      "weight": 20,
      "id": "196be764-8fbe-4332-a1d4-c192c7cd6bb5",
      "target": "1.2.3.4:8000",
      "created_at": 1490116516932,
      "upstream_id": "413ea7ee-2740-445d-9c2e-d042e2c7a798"
    }
]

So what happens when we try to sort based on target, then created_at, then id? Since it depends solely on id, we get different results based on the randomly-defined ID: https://gist.github.com/p0pr0ck5/675d5062f84cf5e04fd311dbb7f36c3f

Now consider the use case of finding all 'active' targets. In such a case, we're essentially looking for the most recent entry in the db for a given row's target. As we can see above, it's entirely possible that multiple cases, using the same sort key and method, would produce different results, because the sort key is generated from random, non-deterministic data. This is the definition of an non-deterministic sort. While the behavior is reproducible, it is not useful for a sort :)

Having said all this, it's very much an edge case, and really at this point we're kind of abusing the design of target objects (which really aren't objects, per se). So I'm pushing a new version of this patchset with your desired sort key (target.target..":"..target.created_at..":"..target.id) so that we can move on :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thing is, your examples are identical. And there is nothing we can do about it. timestamp == timestamp. We simply do not have any information on which one comes first. Increasing timestamp precision would only make it less likely for a collision to happen, but won't take the possibility away.

That given: all we can do is make sure that whenever we rely on the order of events, we make sure to use the same order. The uuid is created with the record, and never changes afterwards, so it is a perfect way to make the sort order deterministic. So yes, the order is arbitrary in the end, but it will be consistently the same order on every occasion.

@@ -40,7 +42,48 @@ return {
end,

GET = function(self, dao_factory)
crud.paginated_set(self, dao_factory.targets)
-- by default, just get a paginated list
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we're adding code to determine the active ones (similar as when posting), can't this be refactored in a function, that can be called from both methods to prevent code duplication?

@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch 2 times, most recently from ae2538d to b830452 Compare March 23, 2017 16:35
@@ -107,4 +109,54 @@ return {
crud.post(self.params, dao_factory.targets)
end,
},

["/upstreams/:name_or_id/targets/active/"] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to #2252, this will have to be changed to :upstream_name_or_id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tieske, what do you think the chances of #2252 getting in before this? Can you merge this PR in and then rebase your change? Or shall this change wait until #2252 is in, and then we'll rebase?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't get to it until monday, so if #2252 is ok and you got time, then please merge it. Rebase this and merge it also (just approved this one).

assert.equal(2, #json.data)
assert.equal(2, json.total)
assert.equal('api-3:80', json.data[1].target)
assert.equal('api-2:80', json.data[2].target)
Copy link
Member

@thibaultcha thibaultcha Mar 24, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: should use double quotes

it("only shows active targets", function()
local res = assert(client:send {
method = "GET",
path = "/upstreams/"..upstream_name3.."/targets/active/",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: ideally - even if maybe this file is not up to date - this should have spaces around the .. operator

@@ -257,6 +257,65 @@ describe("Admin API", function()
assert.equal([[{"data":[],"total":0}]], body)
end)
end)

describe("active targets", function()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admin API tests should ideally be scoped under the endpoint they are testing, in this case, we should ideally have:

describe("Admin API", function()
  describe("/upstreams/{upstream}/targets/active", function()
    it("only shows active targets", function()

    end)
  end)
end)

Hence this test in busted will be:

Admin API /upstreams/{upstream}/targets/active only shows active targets

entry.order = nil -- dont show our order key to the client
found_n = found_n + 1
found[found_n] = entry
else
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: missing blank line above else

local target_history, err = dao_factory.targets:find_all({
upstream_id = self.params.upstream_id,
})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: no need for a blank line here


--sort and walk based on target and creation time
for _, target in ipairs(target_history) do
target.order = target.target..":"..target.created_at..":"..target.id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: ditto above spaces around ..


local ignored = {}
local found = {}
local found_n = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this is needed in such cold code paths, it just adds cognitive load where one could use table.insert instead imho. But it's fine.

@thibaultcha thibaultcha added pr/changes requested Changes were requested to this PR by a maintainer. Please address them and ping back once done. and removed pr/status/needs review labels Mar 24, 2017
@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch from b830452 to 27f0ced Compare March 24, 2017 21:52
@p0pr0ck5 p0pr0ck5 force-pushed the feat/active-targets branch from 27f0ced to dc8cc25 Compare March 24, 2017 21:53
@thibaultcha thibaultcha added pr/ready (but hold merge) No more concerns, but do not merge yet (probably a conflict of interest with another PR or release) and removed pr/changes requested Changes were requested to this PR by a maintainer. Please address them and ping back once done. labels Mar 24, 2017
@p0pr0ck5 p0pr0ck5 merged commit 3bc8726 into next Mar 24, 2017
@p0pr0ck5 p0pr0ck5 deleted the feat/active-targets branch March 24, 2017 21:57
@Tieske Tieske added this to the 0.10.1 milestone Mar 27, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/ready (but hold merge) No more concerns, but do not merge yet (probably a conflict of interest with another PR or release)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants