Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create endpoint for default SSO code #954

Merged
merged 23 commits into from
Feb 4, 2020

Conversation

mheinzel
Copy link
Contributor

No description provided.

(retry x1 . query sel $ params Quorum ())
where
sel :: PrepQuery R () (Identity SAML.IdPId)
sel = "SELECT idp FROM default_idp"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am still thinking about whether this is the schema we want. This query works, but based on what I read about Cassandra so far, it doesn't know which partition the one row is on, so it has to ask all nodes whether they have it. This would not be the case with a primary key with just one possible value.

-- However, the SELECT query will deterministally pick one of them and the
-- others will get removed by TRUNCATE the next time this function is called.
retry x5 . write trunc $ params Quorum ()
retry x5 . write ins $ params Quorum (Identity idpId)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I couldn't put them into the same batch, as there is no ordering within a batch (and I didn't want to play around with USING TIMESTAMP). Would be fixed again by the other schema.

addPrepQuery delIdp (Identity idp)
addPrepQuery delIssuerIdp (Identity issuer)
addPrepQuery delTeamIdp (team, idp)
where
delDefaultIdp :: PrepQuery W (Identity SAML.IdPId) ()
delDefaultIdp = "DELETE FROM default_idp WHERE idp = ?"
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is simple with the chosen schema and would be harder with a different primary key.

@mheinzel mheinzel force-pushed the mheinzel/linear-onboarding/sso branch from cb4ef38 to d69d1d1 Compare February 3, 2020 08:03
Copy link
Contributor

@fisx fisx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

except for upcoming cql changes this looks ready to me.

( idp uuid
, PRIMARY KEY (idp)
) with compaction = {'class': 'LeveledCompactionStrategy'};
|]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

summary of off-line discussion with @mheinzel:

  ( key text
  , idp uuid
  , primary key (key, idp)
  )

the extra key attribute determines the cluster node and avoids having to look at all nodes for the actual uuid.

the extended primary key simplifies idp or team deletion:

delete from default_idp where key = 'default' and idp = '...'

race-condition-free default-code update can be implemented with a batch query:

delete ... where key = 'default'; update key = 'default', idp = '...' ...

other idea: make idp partition key, which is unique over all records with key = 'default'.

services/spar/src/Spar/API.hs Outdated Show resolved Hide resolved
services/spar/src/Spar/Types.hs Outdated Show resolved Hide resolved
services/spar/src/Spar/Types.hs Outdated Show resolved Hide resolved
@@ -147,9 +148,14 @@ type IdpDelete = Capture "id" SAML.IdPId :> DeleteNoContent '[JSON] NoContent
instance MakeCustomError "wai-error" IdPMetadataInfo where
makeCustomError = sparToServantErr . SAML.CustomError . SparNewIdPBadMetadata . cs

type APISSOSettings
= Get '[JSON] SSOSettings
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this type alias necessary? is there there for symmetry with other aliases? i would have expected this to be inlined, but feel free to leave it like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary, just there for symmetry. I think I'll leave it there.

services/spar/test-integration/Test/Spar/APISpec.hs Outdated Show resolved Hide resolved
(withoutRaw <$> (eitherDecode . encode) val) `shouldBe` Right (withoutRaw val)
(withoutRaw <$> (Aeson.eitherDecode . Aeson.encode) val) `shouldBe` Right (withoutRaw val)

describe "SSOSettings JSON instance" $ do
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that validateEveryToJSON doesn't roundtrip-test the aeson instances. (But we should probably fix this there, and not start another list of types for which to roundtrip-test that we need to keep up to date manually.)

Copy link
Contributor

@fisx fisx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some non-critical comments, also let's wait for @jschaul to give another opinion, but i'm happy.

-- It exists so the row is always at a known partition.
void $ schema' [r|
CREATE TABLE if not exists default_idp
( partition_key_always_default text
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure about the name. this restricts us to never give it another value in the future, without adding any strong guarantees (it's just a name, same weight as a comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make this table a more general "settings" key-value store, but the values can only be UUIDs now. I think I'd rather make a separate table for different things.

And this is slightly better than just a comment, since everyone writing a query has to type it out. If you want you could say it's an enforced per-usage-site comment. ;)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good points.

-- there is a race condition here which means there could potentially be more
-- than one entry (violating invariant 2).
-- However, the SELECT query will deterministally pick one of them and the
-- others will get removed by TRUNCATE the next time this function is called.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where does the TRUNCATE happen? or do you mean DELETE?

and is the determinism part of the semantics of cql? reference the docs then? (not sure i'm overdoing the referencing here.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks. I wanted to add an ORDER BY clause to the SELECT query. This will ensure that it picks one deterministically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Better?

migration = Migration 7 "Store default SSO code" $ do

-- partition_key_always_default should always be "default".
-- It exists so the row is always at a known partition.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It exists so the row is always at a known partition.

Isn't the motivation here so that we guarantee to only have a single value?

avoids having to look at all nodes for the actual uuid.

That never happens, cassandra nodes themselves are aware which data resides where, and the coordinating node (the one contacted by the Haskell code) will forward the requests to those nodes concerned by a query. This is independent of the choice of primary key.

Our cassandra client is not token aware (which would be a general improvement one could implement), so a cassandra node (at random more or less) is initially contacted, which knows where to find the information we are looking for, and acts as coordinating node. Not all nodes are contacted.
With or without the partion_key_always_default field, spar will contact some random cassandra node, which will then contact 3 nodes of the cluster who will read or write a value, and this operation is successful once two (a quorum) have confirmed the write or the read.

Please note the current cassandra_spar cluster has 3 nodes only, so, anyway any read, write, or delete contacts all nodes. Writing code which "will not contact all nodes" is thus currently impossible.

Perhaps you could reformulate the comments here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't the motivation here so that we guarantee to only have a single value?

No, that's not the case. I also tried that approach (before commit schema composite primary key/bdc9396), but what's currently here has PRIMARY KEY (partition_key_always_default, idp), so you can theoretically have duplicates.

I had three slightly different schema versions I looked at and I could go back to the one with just PRIMARY KEY (primary_key_always_default), but that makes the query for deleting an IdP more complicated (with an additional SELECT before you can DELETE) and I got errors from Cassandra (java.lang.IndexOutOfBoundsException: readerIndex(207) + length(4) exceeds writerIndex(208)`) that I couldn't quite figure out.

cassandra nodes themselves are aware which data resides where

As far as I understood from the reading I did last week, they only know that if they know the partitioning key, which here is partition_key_always_default (while idp is the clustering key). With the first schema I tried, without that "default" partition key, a query would look like SELECT idp FROM default_idp, so it doesn't specify the partition key, so the data could potentially be on any node.

Please correct me if my understanding is flawed.

So we basically have the three options:

  1. single column idp
  • con: row could be in any partition, which is bad when querying it (the common case), although that apparently is not too bad with just 3 nodes?
  1. additional column as primary key
  • pro: guarantees single row
  • con: requires additional SELECT with conditional DELETE on removing an IdP, comes with a race condition
  • also gave me errors (but I don't think it's a fundamental problem)
  1. additional column as partitioning key, composite primary key
  • what we have now

Not sure if I forgot anything, but yeah, honestly not sure if going back to 2) is a good tradeoff. 🤷‍♂️

If my thoughts on partitioning keys don't make sense, 1) could also be a viable option again.


Just code -> do
wrapMonadClient (Data.getIdPConfig code) >>= \case
Nothing ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 5 levels of indentation in this function, making code harder to read as with fewer levels of indentation. Couldn't the first case split be done at function level (internatlPutSsoSettings (SsoSettings Nothing) = ...)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, thanks.

-- there is a race condition here which means there could potentially be more
-- than one entry (violating invariant 2).
-- However, the SELECT query will deterministally pick one of them and the
-- others will get removed by TRUNCATE the next time this function is called.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears the code was changed but the comments here were not updated. I see no call to truncate anywhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, thanks.

@@ -911,7 +913,84 @@ specAux = do

sequence_ [ check tryowner perms | tryowner <- [minBound..], perms <- [0.. (length permses - 1)] ]

specSSOSettings :: SpecWith TestEnv
specSSOSettings = do
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test which sends an empty json object {} to the PUT endpoint is missing. According to the comments, this should be forbidden.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I check this for the JSON instance in the unit tests (services/spar/test/Test/Spar/APISpec.hs), but I can add it to the integration tests as well.

@mheinzel mheinzel force-pushed the mheinzel/linear-onboarding/sso branch from 12345b2 to 98e500b Compare February 3, 2020 13:53
@@ -444,17 +444,17 @@ getDefaultSSOCode = fmap runIdentity . minimumMay <$>
(retry x1 . query sel $ params Quorum ())
where
sel :: PrepQuery R () (Identity SAML.IdPId)
sel = "SELECT idp FROM default_idp WHERE partition_key_always_default = 'default'"
sel = "SELECT idp FROM default_idp WHERE partition_key_always_default = 'default' ORDER BY idp"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if you make this DESC, it might even always give you the newer entry. Not sure how uuids work, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't assume any meaningful ordering in the UUIDs (and I don't think they have one), I just want that there is some ordering, so running the same query twice is guaranteed to give the same result.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like it's ok to assume an ordering, though:

$ uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid
b93369c6-4697-11ea-ae1a-27c8e8d1ded5
b9339b58-4697-11ea-8ed2-abf4b3b6a750
b933c588-4697-11ea-8060-8f056ba4913b
b933ee50-4697-11ea-96a9-37a65b436789
b9341a1a-4697-11ea-b801-3fb0f1caecfc
b9344a62-4697-11ea-a810-1ff327e8656e
b934797e-4697-11ea-89dc-a3580b24d3b4
b934a7d2-4697-11ea-8022-9b26ab5b572d
b934d6ee-4697-11ea-9799-7fbf7c802f05
b93525e0-4697-11ea-aacf-a32eda12db52

this suggests descending ordering would always give the most --

ah, never mind, i was confusing the time at which an idp is made the default with the time at which it is created.

i would still make DESC since it works better in the more common case that a new idp has just been created an then made the default.

retry x5 $ batch $ do
setType BatchLogged
setConsistency Quorum
addPrepQuery delDefaultIdp ()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
addPrepQuery delDefaultIdp ()
when (currentDefaultIdP == Just idp) $ addPrepQuery delDefaultIdp ()

then you don't need the other if-block...

@mheinzel mheinzel force-pushed the mheinzel/linear-onboarding/sso branch from 723ba29 to e66cf8a Compare February 3, 2020 14:11
mheinzel and others added 19 commits February 3, 2020 16:48
test-integration/Test/Spar/APISpec.hs:978:9:
1) Spar.API, SSO settings endpoint, removes the default SSO code if the IdP gets removed
  predicate failed on: Response {responseStatus = Status {statusCode = 500, statusMessage = "server-error"}, responseVersion = HTTP/1.1, responseHeaders = [("Transfer-Encoding","chunked"),("Date","Mon, 03 Feb 2020 10:33:38 GMT"),("Server","Warp/3.2.25"),("Content-Type","application/json")], responseBody = Just "{\"code\":500,\"message\":\"{\\\"code\\\":500,\\\"message\\\":\\\"ResponseError {reHost = datacenter1:rack1:127.0.0.1:9042, reTrace = Nothing, reWarn = [], reCause = ServerError \\\\\\\"java.lang.IndexOutOfBoundsException: readerIndex(207) + length(4) exceeds writerIndex(208): SlicedAbstractByteBuf(ridx: 207, widx: 208, cap: 208/208, unwrapped: PooledUnsafeDirectByteBuf(ridx: 217, widx: 217, cap: 1024))\\\\\\\"}\\\",\\\"label\\\":\\\"server-error\\\"}\",\"label\":\"server-error\"}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}
@mheinzel mheinzel force-pushed the mheinzel/linear-onboarding/sso branch from e9c8eaf to 4814afd Compare February 3, 2020 18:49
@mheinzel mheinzel merged commit c438f49 into develop Feb 4, 2020
@mheinzel mheinzel deleted the mheinzel/linear-onboarding/sso branch February 4, 2020 07:11
@lucendio lucendio mentioned this pull request Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants