This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

We need to be able to restrict the whisper importer utility to a single org #1334

Closed
replay opened this issue May 28, 2019 · 13 comments · Fixed by #1335

@replay (Contributor) commented May 28, 2019

Currently, when we spin up the importer-writer utility, it writes every chunk write request it receives into the configured keyspace. We need to be able to limit which orgs we write to, so that we don't accidentally write into the wrong one.

@woodsaj (Member) commented May 29, 2019

I think this problem needs a bit more thought.
I don't think that we should put restrictions on mt-whisper-importer-writer. The better way to handle this is the same way we handle metric ingestion, i.e. have tsdb-gw authenticate requests and do org-id enforcement.

@replay (Contributor, Author) commented May 29, 2019

That's a good idea. If we support the same authentication mechanism that normal metric ingestion uses, we may not need to deploy the importer-writer every time a customer on the shared instance needs it. The shared instance could always have an importer-writer pod running, and a customer who wants to import data would simply run the importer-reader with the same authentication key that they use for ingest.

@replay (Contributor, Author) commented May 29, 2019

How about this: 0e0a5cd

@Dieterbe (Contributor) commented May 30, 2019

I don't think we should move auth into any of the MT services.
We already have a decent separation of concerns: tsdb-gw does auth and, after auth, forwards/proxies requests to MT. Why can't we do the same for the importer? Just put a tsdb-gw in front which takes care of the auth; that's what it's for. It should only require updating tsdb-gw to support this HTTP path.

@woodsaj (Member) commented May 30, 2019

I agree with @Dieterbe. Authentication should remain out of scope for metrictank.
The importer-writer should handle auth the same way as metrictank does, i.e. just expect the 'x-org-id' header.

In addition to gating access, we also need to ensure that the orgId used in archives matches the orgId of the authenticated user. For metrics, we currently handle this in tsdb-gw. We could also handle it there for the importer, but that would mean the gateway has to decode the archives and validate that the correct orgId is being used (we could also just modify the orgId in archives before forwarding to the importer, as we do for metrics). But that seems like unnecessary overhead for archives. Instead I think we could design it like so:

importer-writer should have a CLI flag for orgId enforcement. If enabled, importer-writer will expect requests to have an 'x-org-id' header. The value of the header is the orgId that should be used to validate archives (I am undecided as to whether we should reject mismatches or just rewrite them).

For GrafanaCloud we then just need to add a proxy route to tsdb-gw for the importer, with the "x-org-id" header being added before proxying the request. This is exactly what we do for graphite/* requests.

@Dieterbe (Contributor) commented May 30, 2019

I see 3 main options:

1. Enforce the proper orgid in tsdb-gw auth (rewriting the orgid based on auth, like ingest).

  • con: extra decoding overhead (and possibly re-encoding, in the rare case of an auth mismatch).

2. Enforce the proper orgid in importer-writer via a CLI flag.

  • con: we can't just keep one importer-writer running to satisfy imports for different / arbitrary customers.

3. Enforce the proper orgid in importer-writer by having tsdb-gw specify it (e.g. via the HTTP x-org-id header).

  • pro: doesn't have the cons of the above approaches.
  • con: this doesn't seem very clean: data on the wire may be inconsistent, and do we want this concern in the mt-importer-writer? It's also inconsistent with our ingestion path.

Note also that the ArchiveRequest not only contains an orgid in its MetricData property; each contained ChunkWriteRequest also has an orgid in its AMKey.
So regardless of which service takes care of auth/validation, either:

  • we should also validate that the orgid in the AMKey matches the orgid of the MetricData (in addition to checking the orgid against the auth), or
  • the ArchiveRequest should use a slightly different type of CWR, one that uses a new key type AKey (archive key, not containing an orgid). It should be trivial to convert this CWR type into the one we already have. This lets us validate much faster (in 1 place per ArchiveRequest instead of N), and in theory it means we only need to decode the beginning of the ArchiveRequest instead of the whole thing (though I'm not sure how practical that is).
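The AKey idea can be sketched as follows. Note that MKey, AMKey, and AKey here are simplified stand-ins for the raintank/schema types under discussion; the real definitions differ in detail.

```go
package main

import "fmt"

// MKey identifies a metric, including its org (simplified stand-in).
type MKey struct {
	Org uint32
	Key [16]byte
}

// AMKey is what CWRs use today: metric key plus rollup archive index.
type AMKey struct {
	MKey    MKey
	Archive uint8
}

// AKey is the proposed org-less key for CWRs inside an ArchiveRequest:
// same shape as AMKey, minus the org.
type AKey struct {
	Key     [16]byte
	Archive uint8
}

// ToAMKey re-attaches the org derived from the authenticated x-org-id,
// so the org is validated once per ArchiveRequest rather than once per CWR.
func (a AKey) ToAMKey(org uint32) AMKey {
	return AMKey{MKey: MKey{Org: org, Key: a.Key}, Archive: a.Archive}
}

func main() {
	a := AKey{Key: [16]byte{0xab}, Archive: 2}
	fmt.Printf("%+v\n", a.ToAMKey(10))
}
```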

@woodsaj (Member) commented May 30, 2019

  • Option 1 is not viable due to the overhead of decoding and re-encoding before forwarding from tsdb-gw to importer-writer.

  • Option 2 is out, as it prevents us from sharing the same importer-writers across multiple users/customers.

  • Option 3 is the clear winner for me. I don't share @Dieterbe's opinions on the cons.

    • How is the solution not clean? Using an x-org-id header to a) confirm a request has been authenticated and b) identify who authenticated is exactly the approach used in metrictank. I feel that is a very clean solution.
    • 'data on the wire may be inconsistent': of course, that is always true regardless of how requests are authenticated, and it is precisely why we need to validate payloads before blindly pushing them to the backend store.
    • 'do we want this concern in the mt-importer-writer?' Definitely yes; it wouldn't make sense to validate payloads anywhere else.
    • 'it's also inconsistent with our ingestion path.' In what way is it inconsistent? The only reason we do validation in tsdb-gw is that metrics are processed asynchronously by metrictank; if metrictank did the validation, we would not be able to notify users that validation was failing. Because processing archives is a synchronous operation, you can't compare it with metric ingestion. You can compare it with graphite render queries, though, which are also HTTP POSTs that simply get proxied to the backend with the x-org-id header.

"the ArchiveRequest should use a slightly different type of CWR"

I do agree with this. The only change we need is to use Archive schema.Archive instead of Key schema.AMKey. When consuming the ArchiveRequest we would then:

  • validate the MetricData (i.e. set orgId to the x-org-id value and re-run SetId() to regenerate the id)
  • for each writeRequest in the archive, create a new ChunkWriteRequest, setting Key to schema.AMKey{MKey: mkey, Archive: writeRequest.Archive}
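The MetricData validation step can be sketched in Go as below. The MetricData type and SetId() here are minimal stand-ins: the real schema.MetricData.SetId() in raintank/schema hashes more fields (unit, mtype, interval, tags), but the point is the same: because the orgId is part of the id, rewriting the org requires regenerating the id.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"strconv"
)

// MetricData is a minimal stand-in for schema.MetricData.
type MetricData struct {
	Id    string
	OrgId int
	Name  string
}

// SetId recomputes the id as "<orgId>.<hash>". The real implementation
// hashes more fields; name alone is enough to illustrate the mechanism.
func (m *MetricData) SetId() {
	sum := md5.Sum([]byte(m.Name))
	m.Id = strconv.Itoa(m.OrgId) + "." + fmt.Sprintf("%x", sum)
}

// applyOrgID enforces the authenticated org on an incoming archive's
// MetricData: overwrite OrgId, then reset the id so it stays consistent.
func applyOrgID(md *MetricData, orgFromHeader int) {
	md.OrgId = orgFromHeader
	md.SetId()
}

func main() {
	md := MetricData{Name: "some.metric", OrgId: 0}
	applyOrgID(&md, 10)
	fmt.Println(md.Id)
}
```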

@replay (Contributor, Author) commented May 30, 2019

I like option 3 as well. If we do that, does the user even need to specify the org id when starting the reader? Since the user has to specify their authentication credentials anyway, we might as well just set the org id in the MetricData to 0; the writer can then take the id from the X-Org-Id header and add it to the MetricData before writing to the store. That way the user has one less option to specify when using the reader, which improves usability: https://github.com/grafana/metrictank/blob/master/cmd/mt-whisper-importer-reader/main.go#L51

@Dieterbe (Contributor)

No point arguing about minor nuances. I'm fine with option 3.
@replay regarding your question: you're probably right, but I'd like @woodsaj to confirm.

@replay (Contributor, Author) commented May 31, 2019

This now involves 3 PRs (and I'll need to create one for raintank/schema too):
#1335
https://github.com/raintank/tsdb-gw/pull/164
https://github.com/raintank/hosted-metrics-api/pull/241
I did a test deploy in my QA instance and so far it looks good: I can authenticate, and once authenticated I can reach the importer. I haven't done a test import yet; I'll do that tomorrow.

@replay (Contributor, Author) commented Jun 3, 2019

I did more testing of this on Friday (and added a bunch of fixes). If you get a chance, these two are still waiting for review:

#1335
https://github.com/raintank/hosted-metrics-api/pull/241

@Dieterbe (Contributor) commented Jun 7, 2019

With all this discussion having happened, I think a synopsis is worthwhile. It all boils down to:

  1. tsdb-gw provides auth and proxies to importer-writer, setting the x-org-id header and passing along the client's ArchiveRequests unaltered. <- merged
  2. Currently ArchiveRequest has an orgid and AMKey in each CWR, which may not match x-org-id. Thus we will use Archive schema.Archive instead of Key schema.AMKey for the CWR. After validating the MetricData orgid, we can then generate the proper key for the CWR.
  3. importer-writer needs no CLI flag for orgId enforcement and can accept whatever is sent to it (once 1 and 2 are implemented).
  4. importer-reader needs no orgid option, as that will be set automatically via auth.

@replay (Contributor, Author) commented Jun 7, 2019

That's correct as you described. The only PR left to merge is the MT one (#1335); the HM-API & tsdb-gw PRs are already merged.
