-
Notifications
You must be signed in to change notification settings - Fork 326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create endpoint for default SSO code #954
Conversation
services/spar/src/Spar/Data.hs
Outdated
(retry x1 . query sel $ params Quorum ()) | ||
where | ||
sel :: PrepQuery R () (Identity SAML.IdPId) | ||
sel = "SELECT idp FROM default_idp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am still thinking about whether this is the schema we want. This query works, but based on what I read about Cassandra so far, it doesn't know which partition the one row is on, so it has to ask all nodes whether they have it. This would not be the case with a primary key with just one possible value.
-- However, the SELECT query will deterministally pick one of them and the | ||
-- others will get removed by TRUNCATE the next time this function is called. | ||
retry x5 . write trunc $ params Quorum () | ||
retry x5 . write ins $ params Quorum (Identity idpId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't put them into the same batch, as there is no ordering within a batch (and I didn't want to play around with USING TIMESTAMP
). Would be fixed again by the other schema.
services/spar/src/Spar/Data.hs
Outdated
addPrepQuery delIdp (Identity idp) | ||
addPrepQuery delIssuerIdp (Identity issuer) | ||
addPrepQuery delTeamIdp (team, idp) | ||
where | ||
delDefaultIdp :: PrepQuery W (Identity SAML.IdPId) () | ||
delDefaultIdp = "DELETE FROM default_idp WHERE idp = ?" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is simple with the chosen schema and would be harder with a different primary key.
cb4ef38
to
d69d1d1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except for upcoming cql changes this looks ready to me.
( idp uuid | ||
, PRIMARY KEY (idp) | ||
) with compaction = {'class': 'LeveledCompactionStrategy'}; | ||
|] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
summary of off-line discussion with @mheinzel:
( key text
, idp uuid
, primary key (key, idp)
)
the extra key
attribute determines the cluster node and avoids having to look at all nodes for the actual uuid.
the extended primary key simplifies idp or team deletion:
delete from default_idp where key = 'default' and idp = '...'
race-condition-free default-code update can be implemented with a batch query:
delete ... where key = 'default'; update key = 'default', idp = '...' ...
other idea: make idp partition key, which is unique over all records with key = 'default'
.
services/spar/src/Spar/API/Types.hs
Outdated
@@ -147,9 +148,14 @@ type IdpDelete = Capture "id" SAML.IdPId :> DeleteNoContent '[JSON] NoContent | |||
instance MakeCustomError "wai-error" IdPMetadataInfo where | |||
makeCustomError = sparToServantErr . SAML.CustomError . SparNewIdPBadMetadata . cs | |||
|
|||
type APISSOSettings | |||
= Get '[JSON] SSOSettings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this type alias necessary? is there there for symmetry with other aliases? i would have expected this to be inlined, but feel free to leave it like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessary, just there for symmetry. I think I'll leave it there.
(withoutRaw <$> (eitherDecode . encode) val) `shouldBe` Right (withoutRaw val) | ||
(withoutRaw <$> (Aeson.eitherDecode . Aeson.encode) val) `shouldBe` Right (withoutRaw val) | ||
|
||
describe "SSOSettings JSON instance" $ do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that validateEveryToJSON
doesn't roundtrip-test the aeson instances. (But we should probably fix this there, and not start another list of types for which to roundtrip-test that we need to keep up to date manually.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some non-critical comments, also let's wait for @jschaul to give another opinion, but i'm happy.
-- It exists so the row is always at a known partition. | ||
void $ schema' [r| | ||
CREATE TABLE if not exists default_idp | ||
( partition_key_always_default text |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure about the name. this restricts us to never give it another value in the future, without adding any strong guarantees (it's just a name, same weight as a comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can make this table a more general "settings" key-value store, but the values can only be UUIDs now. I think I'd rather make a separate table for different things.
And this is slightly better than just a comment, since everyone writing a query has to type it out. If you want you could say it's an enforced per-usage-site comment. ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good points.
services/spar/src/Spar/Data.hs
Outdated
-- there is a race condition here which means there could potentially be more | ||
-- than one entry (violating invariant 2). | ||
-- However, the SELECT query will deterministally pick one of them and the | ||
-- others will get removed by TRUNCATE the next time this function is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does the TRUNCATE
happen? or do you mean DELETE
?
and is the determinism part of the semantics of cql? reference the docs then? (not sure i'm overdoing the referencing here.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks. I wanted to add an ORDER BY
clause to the SELECT
query. This will ensure that it picks one deterministically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Better?
migration = Migration 7 "Store default SSO code" $ do | ||
|
||
-- partition_key_always_default should always be "default". | ||
-- It exists so the row is always at a known partition. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It exists so the row is always at a known partition.
Isn't the motivation here so that we guarantee to only have a single value?
avoids having to look at all nodes for the actual uuid.
That never happens, cassandra nodes themselves are aware which data resides where, and the coordinating node (the one contacted by the Haskell code) will forward the requests to those nodes concerned by a query. This is independent of the choice of primary key.
Our cassandra client is not token aware (which would be a general improvement one could implement), so a cassandra node (at random more or less) is initially contacted, which knows where to find the information we are looking for, and acts as coordinating node. Not all nodes are contacted.
With or without the partion_key_always_default
field, spar will contact some random cassandra node, which will then contact 3 nodes of the cluster who will read or write a value, and this operation is successful once two (a quorum) have confirmed the write or the read.
Please note the current cassandra_spar
cluster has 3 nodes only, so, anyway any read, write, or delete contacts all nodes. Writing code which "will not contact all nodes" is thus currently impossible.
Perhaps you could reformulate the comments here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the motivation here so that we guarantee to only have a single value?
No, that's not the case. I also tried that approach (before commit schema composite primary key
/bdc9396
), but what's currently here has PRIMARY KEY (partition_key_always_default, idp)
, so you can theoretically have duplicates.
I had three slightly different schema versions I looked at and I could go back to the one with just PRIMARY KEY (primary_key_always_default), but that makes the query for deleting an IdP more complicated (with an additional SELECT before you can DELETE) and I got errors from Cassandra (
java.lang.IndexOutOfBoundsException: readerIndex(207) + length(4) exceeds writerIndex(208)`) that I couldn't quite figure out.
cassandra nodes themselves are aware which data resides where
As far as I understood from the reading I did last week, they only know that if they know the partitioning key, which here is partition_key_always_default
(while idp
is the clustering key). With the first schema I tried, without that "default" partition key, a query would look like SELECT idp FROM default_idp
, so it doesn't specify the partition key, so the data could potentially be on any node.
Please correct me if my understanding is flawed.
So we basically have the three options:
- single column
idp
- con: row could be in any partition, which is bad when querying it (the common case), although that apparently is not too bad with just 3 nodes?
- additional column as primary key
- pro: guarantees single row
- con: requires additional SELECT with conditional DELETE on removing an IdP, comes with a race condition
- also gave me errors (but I don't think it's a fundamental problem)
- additional column as partitioning key, composite primary key
- what we have now
Not sure if I forgot anything, but yeah, honestly not sure if going back to 2) is a good tradeoff. 🤷♂️
If my thoughts on partitioning keys don't make sense, 1) could also be a viable option again.
services/spar/src/Spar/API.hs
Outdated
|
||
Just code -> do | ||
wrapMonadClient (Data.getIdPConfig code) >>= \case | ||
Nothing -> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are 5 levels of indentation in this function, making code harder to read as with fewer levels of indentation. Couldn't the first case split be done at function level (internatlPutSsoSettings (SsoSettings Nothing) = ...
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done, thanks.
services/spar/src/Spar/Data.hs
Outdated
-- there is a race condition here which means there could potentially be more | ||
-- than one entry (violating invariant 2). | ||
-- However, the SELECT query will deterministally pick one of them and the | ||
-- others will get removed by TRUNCATE the next time this function is called. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It appears the code was changed but the comments here were not updated. I see no call to truncate anywhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, thanks.
@@ -911,7 +913,84 @@ specAux = do | |||
|
|||
sequence_ [ check tryowner perms | tryowner <- [minBound..], perms <- [0.. (length permses - 1)] ] | |||
|
|||
specSSOSettings :: SpecWith TestEnv | |||
specSSOSettings = do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A test which sends an empty json object {}
to the PUT endpoint is missing. According to the comments, this should be forbidden.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I check this for the JSON instance in the unit tests (services/spar/test/Test/Spar/APISpec.hs), but I can add it to the integration tests as well.
12345b2
to
98e500b
Compare
services/spar/src/Spar/Data.hs
Outdated
@@ -444,17 +444,17 @@ getDefaultSSOCode = fmap runIdentity . minimumMay <$> | |||
(retry x1 . query sel $ params Quorum ()) | |||
where | |||
sel :: PrepQuery R () (Identity SAML.IdPId) | |||
sel = "SELECT idp FROM default_idp WHERE partition_key_always_default = 'default'" | |||
sel = "SELECT idp FROM default_idp WHERE partition_key_always_default = 'default' ORDER BY idp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if you make this DESC
, it might even always give you the newer entry. Not sure how uuids work, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't assume any meaningful ordering in the UUIDs (and I don't think they have one), I just want that there is some ordering, so running the same query twice is guaranteed to give the same result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like it's ok to assume an ordering, though:
$ uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid ; uuid
b93369c6-4697-11ea-ae1a-27c8e8d1ded5
b9339b58-4697-11ea-8ed2-abf4b3b6a750
b933c588-4697-11ea-8060-8f056ba4913b
b933ee50-4697-11ea-96a9-37a65b436789
b9341a1a-4697-11ea-b801-3fb0f1caecfc
b9344a62-4697-11ea-a810-1ff327e8656e
b934797e-4697-11ea-89dc-a3580b24d3b4
b934a7d2-4697-11ea-8022-9b26ab5b572d
b934d6ee-4697-11ea-9799-7fbf7c802f05
b93525e0-4697-11ea-aacf-a32eda12db52
this suggests descending ordering would always give the most --
ah, never mind, i was confusing the time at which an idp is made the default with the time at which it is created.
i would still make DESC
since it works better in the more common case that a new idp has just been created an then made the default.
services/spar/src/Spar/Data.hs
Outdated
retry x5 $ batch $ do | ||
setType BatchLogged | ||
setConsistency Quorum | ||
addPrepQuery delDefaultIdp () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
addPrepQuery delDefaultIdp () | |
when (currentDefaultIdP == Just idp) $ addPrepQuery delDefaultIdp () |
then you don't need the other if-block...
723ba29
to
e66cf8a
Compare
Co-Authored-By: fisx <mf@zerobuzz.net>
test-integration/Test/Spar/APISpec.hs:978:9: 1) Spar.API, SSO settings endpoint, removes the default SSO code if the IdP gets removed predicate failed on: Response {responseStatus = Status {statusCode = 500, statusMessage = "server-error"}, responseVersion = HTTP/1.1, responseHeaders = [("Transfer-Encoding","chunked"),("Date","Mon, 03 Feb 2020 10:33:38 GMT"),("Server","Warp/3.2.25"),("Content-Type","application/json")], responseBody = Just "{\"code\":500,\"message\":\"{\\\"code\\\":500,\\\"message\\\":\\\"ResponseError {reHost = datacenter1:rack1:127.0.0.1:9042, reTrace = Nothing, reWarn = [], reCause = ServerError \\\\\\\"java.lang.IndexOutOfBoundsException: readerIndex(207) + length(4) exceeds writerIndex(208): SlicedAbstractByteBuf(ridx: 207, widx: 208, cap: 208/208, unwrapped: PooledUnsafeDirectByteBuf(ridx: 217, widx: 217, cap: 1024))\\\\\\\"}\\\",\\\"label\\\":\\\"server-error\\\"}\",\"label\":\"server-error\"}", responseCookieJar = CJ {expose = []}, responseClose' = ResponseClose}
e9c8eaf
to
4814afd
Compare
No description provided.