Implement and test all channels of WebSocket API #10

ericpashman · 2019-11-14T02:59:07Z

Sorry for the delay, but here's my implementation of all data channels provided by the WebSocket API, issue #9. I consider this mergeable as-is, but there are some outstanding issues:

I haven't yet implemented any of the private fields on messages pertaining to your own orders that show up only when authenticated. These were also unsupported in the limited existing implementation of the "full" channel.
I haven't implemented "activate" messages, which relate to triggered stop orders. The documentation on these looks odd and I'll need to play around with some stop orders on the sandbox server and see what happens. Meanwhile, this won't result in any runtime errors so long as you don't place stop orders.
My Aeson instances and QuickCheck instances need some work. They're all correct, so far as my testing can tell, but the Aeson instances (including the ones that already existed) should be re-written where possible to use the more efficient toEncoding method, and my QuickCheck Arbitrary instances are pure boilerplate that probably should be replaced with generic instances using the functionality in the generic-random package or similar.

I didn't use generic-random mainly because I felt bad about already having introduced a couple of new dependencies in the test target; the new QuickCheck and text dependencies are not really new, because they were already there as sub-dependencies of other packages, but the quickcheck-instances and aeson-qq dependencies are new. The latter one can go away in the future, though, as newer versions of QuickCheck provide all the functionality I'm using from that package. (We just need to update to a recent Stack package set.) And the only instances we need from quickcheck-instances are for the Text and UUID types, I believe, so if you prefer we can just steal those and eliminate that dependency too.

This commit includes a bunch of new QuickCheck tests of the Aeson instances; they're fairly limited in that they only test that we correctly decode the JSON we encode our types into, which doesn't tell us whether that JSON matches Coinbase's, but I've paired them with some unit tests that decode the example messages given in the API documentation. The idea is that we should collect some more messages ourselves from the live server, store them, and test them in the same way; then we're not dependent on getting lots of messages over the network every time for basic regression testing. I plan to implement similar tests for the REST API code in the future.

That said, pulling and testing the example messages from the docs has shown me that the docs leave a lot to be desired. Several message types are undocumented, and many of the examples are not actually what the API produces, and not even valid JSON. I've made a bunch of comments in the code about these issues and others, annotated with NOTE, TODO, or FIXME. If you search for those keywords, you should be able quickly to review most of the gotchas and questionable bits in this PR that we may need to discuss before merging.

And of course I've uncovered some other issues that I'll open in the days ahead.

dimitri-xyz · 2019-11-24T22:50:43Z

I cannot build this pull request. I get a few errors like:

haskell-coinbase-pro/src/Coinbase/Exchange/Types/Socket.hs:279:33: error:
Not in scope: type constructor or class ‘BestPrice’

I cannot find the BestPrice type. Did you forget something to include something? Or did I miss something?

Sorry about the delay on this. This is good work.

ericpashman · 2019-11-25T18:17:26Z

Yikes, looks like I broke this while tidying up my Git history just prior to submitting the PR. Will fix and re-submit.

…" field

ericpashman · 2019-11-25T20:06:03Z

OK, this should work now. Sorry for the history-mangling. Some further notes:

Four tests currently fail. These are all tests of REST API functionality that I haven't touched.
Aside from adding a bunch of property tests of the WebSocket API's Aeson instances, I also added several more channels to the pre-existing tests of the WebSocket API that hit the server. This ends up pulling down a lot more data and typically speeds up the tests considerably.
I included a stack.yaml.lock file in this commit. I hadn't meant to, but it's supposed to be under version control, so there it is.

Sorry again for the mess, and for the size of this PR.

ericpashman · 2019-11-27T06:31:03Z

Oh, I forgot to mention that this also reverts the parsing of the client_oid field of “received” message types to the code you first proposed in your fix #4. My very helpful suggestion of a “more idiomatic” way to write it does not in fact work, as the left-side parse results in an empty value rather than a Nothing, and the (.:!) operator only handles the latter. As far as I can tell, using the applicative operator as you suggested is the only way to get the result we want.

haskell-coinbase-pro.cabal

src/Coinbase/Exchange/Socket.hs

src/Coinbase/Exchange/Types/Socket.hs

dimitri-xyz · 2020-05-12T04:40:49Z

Could we rebase this PR on the latest head (70352fd)? It makes it much easier to review!

ericpashman · 2020-05-12T05:11:57Z

Sure, I don't think this PR touches any files with newer merged changes, so there shouldn't be any conflicts.

If you haven't done it before then, I'll rebase tomorrow when I'm at a computer with a local copy of the repo.

ericpashman · 2020-05-12T05:17:10Z

Oh, you might also want to search for TODO, FIXME and NOTE comments and review them for gotchas that we might want to deal with before committing. That said, though it's been too long for me to recall the details, I considered this PR ready to be merged when I submitted it.

dimitri-xyz · 2020-05-12T17:58:32Z

test/Main.hs

+
+-- TODO: Make this a list of all products
+products = [ ProductId "ETH-BTC"
+           , ProductId "ETH-EUR"
+           , ProductId "ETH-USD"
+           , ProductId "BTC-EUR"
+           , ProductId "BTC-USD"
+           ]


I think we should update this to the latest list, it feels sloppy not to do so before we merge.

I added this for testing purposes, to diversify the test data. Running the tests against every market might makes sense in some circumstances, but not as part of the automated test suite, I don't think. That said, it would be nice to have a list of all products and then maybe select a subset of markets to run the tests against as part of the automated suite, or let users choose what they want to test against.

Attempting to update the list just now, I ran getProducts and it returns only six products, despite there actually being dozens. I don't know what's going on there, but I don't know of another way to get a list of all products. If this is a bug in Coinbase's code, we should make them aware of it, see #11.

Also note that this is the sort of thing that will have to be kept up to date if it's meant to be complete, so we'll have to figure out how we want to do that.

src/Coinbase/Exchange/Types/Socket.hs

dimitri-xyz · 2020-05-12T18:43:09Z

src/Coinbase/Exchange/Types/Socket.hs

+  toJSON L2SnapshotMsg{..} = object
+    [ "type"       .= ("snapshot" :: Text)
+    , "product_id" .= msgProductId
+    , "asks"       .= msgAsks
+    , "bids"       .= msgBids
+    ]
+  toJSON L2UpdateMsg{..} = object
+    [ "type"       .= ("l2update" :: Text)
+    , "time"       .= msgTime
+    , "product_id" .= msgProductId
+    , "changes"    .= msgChanges
+    ]
+  toJSON OpenMsg{..} = object
+    [ "type"       .= ("open" :: Text)
+    , "sequence"   .= msgSequence
+    , "time"       .= msgTime
+    , "product_id" .= msgProductId
+    , "order_id"   .= msgOrderId
+    , "side"       .= msgSide
+    , "remaining_size" .= msgRemainingSize
+    , "price"      .= msgPrice
+    ]


I would like to be able to use deriving via to derive these automatically. This is technical debt. And it seems so close now! (Just change from camelCase to snake_case and remove the prefix) Some future work.

I've never used deriving via, though I know the basic idea. How does that work here?

I am not sure exactly. We would have to write a rule (i.e. a toJSON instance for a newtype wrapper) to derive these and then just apply this one rule to all these types. But I am not sure how to deal with the "type" field in this context. It seems worth exploring now.

The best description I know for deriving via is here:
https://skillsmatter.com/skillscasts/12654-deriving-via

It seems like this should be straightforward using Aeson's generic-deriving facilities and applying an appropriate fieldLabelModifier and constructorTagModifier. However, I'm pretty sure that I would have done that if possible. Perhaps there is some problem in deriving generic representations?

I'll look at it.

OK, I've looked at this and I think it can indeed be done with genericToJSON and a minimum of fiddling. I will re-write and resubmit tomorrow.

I take it back. This can almost be done by deriving generically, needing only to change a bunch of field and constructor tag names, but there are a small handful of cases where the fields on the ExchangeMessage constructors are insufficiently parallel to the JSON messages to make this work.

[I've included a few footnotes below on some digressions from the main point.]

For example, the fact that "received" messages are parsed into two constructors (ReceivedLimitMsg and ReceivedMarketMsg) makes it impossible to derive generically because there's no way to add JSON fields that don't correspond to a Haskell record field (the "order_type" field is absent in generically produces JSON because that information is captured by the distinction between the two constructors themselves).[1]

Another example of insufficiently parallel types and messages: "received" messages for market orders sometimes have a "size" field and sometimes a "funds" fields, but the ReceivedMarketMsg constructor has a combined msgMarketBounds field with type Either Size (Maybe Size, Cost). This is explained by a comment in the code which reads:

market orders have no price and are bounded by either size, funds or both

Evidently this is meant to capture the idea that a market order is specified either with a quantity (Size) to sell or an amount of funds (Cost) to use to buy (evidently with the possibility in the latter case of there also sometimes being a quantity present).[2]

So we can either (a) leave things as they are, with a big hand-coded ToJSON instance; or (b) re-write the data structure to be parallel to the server's JSON and then derive generically. In the above example, that would mean combining the ReceivedLimitMsg and ReceivedMarketMsg constructors and replacing the msgMarketBounds :: Either Size (Maybe Size, Cost) field with two fields, msgMaybePrice :: Maybe Price and msgMaybeFunds :: Maybe Cost.[3]

In my opinion, it makes sense to re-write the data structure to be parallel to the JSON (at least in the long run, if not now, and at least in the case of ExchangeMessage, though maybe not of other types). That would vastly simplify the code and eliminate pretty much all of the pain of debugging the Aeson instances. And I think it makes sense: the point of this library is to implement the Coinbase Pro API, which means giving users a good Haskell representation of what Coinbase's servers spit out. To me, the fact that we have to write fiddly ToJSON and FromJSON instances by hand suggests that our types are trying to do too much.

Now, I can see the value in providing types that express stronger invariants than what the API makes explicit in its message schema, but it doesn't seem to me that the existing types, where they diverge from the schema in ways like I mentioned above, actually help much. Is that Either type for a market order's "bounds" (a concept not present in the API or its documentation) really better than just using a Maybe for the two message fields that don't always show up? Even if the Either correctly encodes the permutations of fields that you might see, who cares? It mostly just seems more complicated.

There's something of a middle ground, which I imagine is similar to what you had in mind when you mentioned deriving via: we can use exactly parallel types for parsing, then translate into non-parallel types like those we have, e.g., splitting the representation of the "received" message type into two constructors. To the extent that we feel non-parallel representations really add value that might make sense.

Thoughts?

Footnotes:

I mean it's impossible to do this generically with the provided facilities for modifications along the way, which only allow for re-naming things. Obviously we could produce some wrong or incomplete JSON generically, then go back and fix it by adding fields or whatever. I mention this possibility towards the end of this comment.

Is this another Coinbase bug, or is this just wrong? A market buy should always specify an amount of funds (aka "cost"), without specifying a quantity (aka "size"), and a market sell should always specify a quantity ("size") without specifying an amount of funds ("cost"). Do you know the origin of this code? If not, this may merit further investigation.

Why Cost and notFunds? This reminds me that Size, Cost, and other types that refer to quantities of some asset would benefit from having the asset (aka "product") name or ID embedded in the type.

Whoa! There's a lot to unpack on your comments. Here are my thoughts.

First and foremost, I am not as concerned about deriving the ToJSON instances as I am about deriving the Arbitrary instances. This is why I said this is maybe future work. I think it is worth improving, but I don't think it (the technical debt) has to be solved now, for these ToJSON instances.

You did a good job at pointing out and documenting the difficulties with deriving them automatically because of the multiple constructors (i.e. ReceivedLimitMsg and ReceivedMarketMsg).

I strongly disagree that there is no value in capturing the API invariants expressed in the type of the field msgMarketBounds :: Either Size (Maybe Size, Cost). This invariant does exist (it just can't be expressed in JSON), This is evidenced by the fact that I don't ever recall a parsing failure because of these (let me know if you see one). Furthermore, once this invariant is established, we don't have to keep checking this value again and again as would be the case with multiple Maybes. I think this is one of those "Parse, don't validate" and "make invalid states unrepresentable" situations. Does this type make it more complicated to parse? For sure! But precisely because it enforces more strict invariants. (I have written other code that depends on this library and it was quite useful to be able to count on these invariants. The alternative is having to do this "type checking" again at a later stage. The invariant does show up either way.)

Although it may seem initially counter intuitive, allowing market orders to be bound by size, funds or both, is not a bug. This is exactly how market orders should be implemented in a "continuous limit orderbook" market. It is implicit that when limited by both, we should use the minimum (most restrictive) bound. This is one situation where Coinbase got it right and almost everybody else got it wrong.

The type Cost was created to distinguish from Price. The Cost of 2 (BTC) at a price of 10K (USD/BTC) is 20K (USD). Note the units. Funds seems to also match this idea, but seems a little bit less precise. I agree that costs could have the units encoded at the type level, for example, as a phantom type. As a requirement, I think that any implementation of this should also take care of the above calculation and changing units from Price to Cost.

Considering that after you looked at it carefully, it looks like a significant amount of work for an uncertain amount of gain. I suggest we leave it as is for now.

test/Coinbase/Exchange/Socket/Test.hs

dimitri-xyz · 2020-05-13T07:26:28Z

test/Coinbase/Exchange/Socket/Test.hs

+instance Arbitrary CurrencyDetails where
+  arbitrary = CurrencyDetails
+    <$> arbitrary
+    <*> arbitrary
+    <*> arbitrary
+    <*> arbitrary
+    <*> arbitrary
+    <*> arbitrary
+
+instance Arbitrary L2BookEntry where
+  arbitrary = L2BookEntry <$> arbitrary <*> arbitrary
+
+instance Arbitrary BookChange where
+  arbitrary = BookChange <$> arbitrary <*> arbitrary <*> arbitrary
+
+instance Arbitrary SendExchangeMessage where
+  arbitrary = do
+    sub <- Subscribe <$> arbitrary <*> arbitrary
+    unsub <- Unsubscribe <$> arbitrary <*> arbitrary
+    elements [sub, unsub]
+
+instance Arbitrary ExchangeMessage where
+  arbitrary = oneof
+    [ ErrorMsg <$> arbitrary
+    , SubscriptionsMsg <$> arbitrary
+    , HeartbeatMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , StatusMsg <$> arbitrary <*> arbitrary
+    , StartTickerMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary
+    , TickerMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , L2SnapshotMsg <$> arbitrary <*> arbitrary <*> arbitrary
+    , L2UpdateMsg <$> arbitrary <*> arbitrary <*> arbitrary
+    , ReceivedLimitMsg <$> arbitrary  <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , ReceivedMarketMsg <$> arbitrary  <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary
+    , OpenMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary
+    , MatchMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , LastMatchMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , DoneMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , ChangeLimitMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+    , ChangeMarketMsg <$> arbitrary <*> arbitrary <*> arbitrary <*> arbitrary
+        <*> arbitrary <*> arbitrary <*> arbitrary
+    ]
+
+--------------------------------------------------------------------------------
+-- `Arbitrary` instances for types defined in `... .MarketData.Types`
+-- FIXME: These should be moved to the module in which these types are tested,
+-- but the types themselves should probably move to `... .Core.Types` or
+-- `... Core`.
+
+instance Arbitrary Size where
+  arbitrary = Size <$> arbitrary
+
+instance Arbitrary Price where
+  arbitrary = Price <$> arbitrary
+
+instance Arbitrary Cost where
+  arbitrary = Cost <$> arbitrary
+
+instance Arbitrary CurrencyId where
+  arbitrary = CurrencyId <$> arbitrary
+
+instance Arbitrary ProductId where
+  arbitrary = ProductId <$> arbitrary
+
+instance Arbitrary Side where
+  arbitrary = elements [Buy, Sell]
+
+instance Arbitrary CoinScientific where
+  arbitrary = CoinScientific <$> arbitrary
+
+instance Arbitrary Sequence where
+  arbitrary = Sequence <$> arbitrary
+
+instance Arbitrary TradeId where
+  arbitrary = TradeId <$> arbitrary
+
+instance Arbitrary OrderId where
+  arbitrary = OrderId <$> arbitrary
+
+instance Arbitrary ClientOrderId where
+  arbitrary = ClientOrderId <$> arbitrary
+
+instance Arbitrary Reason where
+  arbitrary = elements [Filled, Canceled]
+
+--------------------------------------------------------------------------------
+
+-- FIXME: This is a hack. Newer versions of the `quickcheck-instances` package
+-- provide a proper `Arbitrary` instance for `UUID`. Delete this after updating
+-- package dependencies.
+instance Arbitrary UUID where
+  arbitrary = elements [nil]


This is too much extra code! Let's add any necessary dependencies and just derive these.

Yes, it's quite absurd. I believe we can derive all or most of these using generic-random; I'll re-write using that.

dimitri-xyz

This is great! Thanks for bringing the socket interface back from the dead. I do feel it would be wise for us to cut down on the boiler plate code for generating arbitrary instances before merging it, though. (And file/document corresponding bugs) We have a lot of testing code already. Feel free to merge it yourself, once you consider it done.

ericpashman · 2020-05-14T02:10:57Z

Well, it’s not a completely opaque field. The docs say it must be a UUID, and the empty string is not a valid UUID, so that seems to me to be an implementation problem on Coinbase’s part. If the field were meant to be completely opaque, we would just parse it into Text and there would be no problem. Anyway, I think we’re on the same page here. I’m working on making all the other discussed changes and will update the PR, maybe tonight, probably tomorrow.

…

On May 13, 2020, at 20:26, Dimitri DeFigueiredo ***@***.***> wrote: @dimitri-xyz commented on this pull request. In src/Coinbase/Exchange/Types/Socket.hs: > - | Error - { msgMessage :: Text - } I think punching "too big a hole" while parsing client_oid is different from this. It is an opaque field provided by customers themselves and I cannot point to any implementation problem from Coinbase. Having said that, I'll leave the decisions of what to do for you. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

…empty string; see disccusion at dimitri-xyz#10.

ericpashman · 2020-05-16T08:41:54Z

OK, agreed that we leave things as they are. I’ve made all the other changes discussed in this review, except that I still need to re-implement the boilerplate Quickcheck instances using `generic-random`. I’ll do that tomorrow and update the PR. Sorry for dropping all of that on you. :P These lengthy comments are partly notes to self as I reacquaint myself with the codebase after months away, and with Aeson, etc. I’ve experienced some déjà-vu working on this, coming to realizations I’ve had before that, (a) I originally tried to do this generically in the first place but ran into these same roadblocks, and (b) that in my opinion, the right way to implement third-party APIs is almost always to mirror what the API produces as closely as possible in the Haskell types, leaving “improvements” to higher-level consumers of the API. —— Please don't read the rest of this unless you have noting else to do this weekend. —— Apology behind me, let me do a bit of a brain dump while my thoughts are fresh. Please don’t feel the need to respond to this unless you think you can correct some basic misunderstanding on my part; I’ll open specific issues later if I feel this still merits it. There are several reasons why I don’t think it makes sense to “improve” the API in our types by trying to express invariants that the API itself doesn’t explicitly express. The API *does not* explicitly express these invariants. If an invariant isn't expressed in the JSON and isn’t expressed in the documentation, why are we inferring it? Diverging from the API by “improving” it makes the library harder to use by making it different. Even if you know the Coinbase Pro API well, you won’t know what the `msgMarketBounds` field is, because it *does not exist* in the Coinbase Pro API. Not even as a concept. This library just made it up. So now you have two APIs to learn. And as much as I’ve been shitting on Coinbase's documentation, it’s a lot better than this library’s. Writing, maintaining, and testing types and instances that mirror the API’s JSON is clean and easy, while writing, maintaining, and testing types that differ from the JSON rapidly adds lines of code and conceptual complexity. You don’t have to debug generic Aeson instances for types parallel to the JSON, while debugging hand-coded instances is as tedious and time-consuming as debugging in Haskell gets. I’m not convinced that the `msgMarketBounds` record field correctly expresses an API invariant on the permutations of “funds” and “size” fields. You seem to think otherwise, so I am prepared to realized I’m wrong, but I don’t see such an invariant expressed in the documentation, and as you noted it can’t be expressed in JSON. Further, as I wrote in the footnote to my previous comment, the invariant expressed by the `msgMarketBounds` is not what I think it obviously *should* be. I may well be wrong about that, but it seems to me that there should be a one-to-one correspondence between a market order’s “side” and its “bounds” (buy orders should have a “funds” field, sell orders should have a “size” field). Why should a single market order ever have both a “funds” (aka “cost”) and a “size” field? Note that one would not expect a parsing failure if the invariant expressed by `msgMarketBounds` is wrong, because in the case of what I think should happen, the `Maybe Size` inside a `Right` value of `msgMarketBounds` should always be `Nothing`. (That’s what I mean by “wrong”: the invariant is improperly expressed, if the preceding turns out to be correct, by allowing for a permutation that cannot happen.) One exception to (2)(1) that I see is if markets allow for negative prices. In that case, the invariant expressed in `msgMarketBounds` would still be wrong, as both size and funds would *always* be required in the case of a market sell; the case of a market buy would be unchanged. I don’t otherwise understand your point about a market order being limited by both “size” and “funds”. If I want to market sell 1 BTC in the BTC-USD market, how is my dollar balance relevant unless negative prices are allowed? And even if negative prices are allowed, how is my BTC balance relevant if I want to market buy $1000 worth of BTC? Another exception to (2)(1) that I see is if Coinbase calls “market orders” something which is not usually called that. It’s possible to implement an order type that is limited by both size and funds, it’s just not what’s usually called a market order, and it’s not a sensible order type that anybody would want to use. An order to buy $1000 worth of BTC at the market price, but not to buy more than one million BTC for those thousand dollars, even if it’s on offer, say, does nothing for you except to keep you from buying BTC at even cheaper prices than the prices you already bought at in the same order even though you have more cash left to buy with. An opposite example, in which the price is binding, is called a limit order. The only other exception to (2)(1) that I see is related to stop orders (or similar synthetic order types), for which a limitation by both size and funds does make sense. I haven’t looked at Coinbase’s implementation of stop orders, and I don’t know whether they show up with an “order_type” field of “market”. I would think they have a different order type, but Coinbase has surprised me before. I understand the distinction between `Cost` and `Size`, and I recognize that the units of the two quantities differ in the example at issue. That difference in units is why I wrote what I did in the footnote, that the asset of denomination should be embedded in the type. I don’t understand why the type was called `Cost` instead of `Funds`, diverging from the language used in the API and its documentation. In your example, “funds” seems precisely the right term, while “cost” seems like it could possibly mean “price”. In my opinion, the way Coinbase should have implemented this is by specifying a quantity (“size”) and denomination (“product”), i.e., how much of the asset on the other side of the market you’re willing to transact at the best available prices. “Make invalid states unrepresentable”, “parse, don’t validate.” Yes, and I’ve read that blog post and mainly agree with it. But I question its relevance here. We’re implementing bindings for a third-party API, not designing a program around one. Valid data here are the data the API returns, period. Yes, it’s good practice to enforce invariants at the edges of your program, but this is not your program. The types this API provides are not the types you should be using internally in your program; they’re just a Haskell representation of the data the API actually provides. Whatever invariants your program needs to enforce that Coinbase’s API doesn’t actually enforce should be enforced at the boundaries between this library and your program. “Parse, don’t validate” seems to me primarily to be summed up in the motto, “Write functions on the data you wish you had, not on the data you have”. This library deals only with data that you actually have, without reaching the level of what data you wish you had. Functors, applicatives, and monads exist primarily to deal with the problem of “data you wish you had”, so while I like the sentiment of “parse, don’t validate”, it’s not like we don’t have tools to deal with this problem down the line. Much of the above makes it seem like I don’t see value in expressing expressible invariants. My qualm is actually in expressing invariants that are not stated, and that therefore might not really exist or that we might express improperly. Do we really want to be in the business of inferring invariants empirically from the JSON messages we see coming over the wire? Aside from the fact that trying to do this makes the codebase a lot bigger and a lot more complicated, I feel like this sort of guessing game is not worth playing. Hope you didn’t read all of this. Have a good weekend!

…

On May 16, 2020, at 00:24, Dimitri DeFigueiredo ***@***.*** ***@***.***>> wrote: @dimitri-xyz commented on this pull request. In src/Coinbase/Exchange/Types/Socket.hs <#10 (comment)>: > + toJSON L2SnapshotMsg{..} = object + [ "type" .= ("snapshot" :: Text) + , "product_id" .= msgProductId + , "asks" .= msgAsks + , "bids" .= msgBids + ] + toJSON L2UpdateMsg{..} = object + [ "type" .= ("l2update" :: Text) + , "time" .= msgTime + , "product_id" .= msgProductId + , "changes" .= msgChanges + ] + toJSON OpenMsg{..} = object + [ "type" .= ("open" :: Text) + , "sequence" .= msgSequence + , "time" .= msgTime + , "product_id" .= msgProductId + , "order_id" .= msgOrderId + , "side" .= msgSide + , "remaining_size" .= msgRemainingSize + , "price" .= msgPrice + ] Whoa! There's a lot to unpack on your comments. Here are my thoughts. First and foremost, I am not as concerned about deriving the ToJSON instances as I am about deriving the Arbitrary instances. This is why I said this is maybe future work. I think it is worth improving, but I don't think it (the technical debt) has to be solved now, for these ToJSON instances. You did a good job at pointing out and documenting the difficulties with deriving them automatically because of the multiple constructors (i.e. ReceivedLimitMsg and ReceivedMarketMsg). I strongly disagree that there is no value in capturing the API invariants expressed in the type of the field msgMarketBounds :: Either Size (Maybe Size, Cost). This invariant does exist (it just can't be expressed in JSON), This is evidenced by the fact that I don't ever recall a parsing failure because of these (let me know if you see one). Furthermore, once this invariant is established, we don't have to keep checking this value again and again as would be the case with multiple Maybes. I think this is one of those "Parse, don't validate" <https://lexi-lambda.github.io/blog/2019/11/05/parse-don-t-validate/> and "make invalid states unrepresentable" situations. Does this type make it more complicated to parse? For sure! But precisely because it enforces more strict invariants. (I have written other code that depends on this library and it was quite useful to be able to count on these invariants. The alternative is having to do this "type checking" again at a later stage. The invariant does show up either way.) Although it may seem initially counter intuitive, allowing market orders to be bound by size, funds or both, is not a bug. This is exactly how market orders should be implemented in a "continuous limit orderbook" market. It is implicit that when limited by both, we should use the minimum (most restrictive) bound. This is one situation where Coinbase got it right and almost everybody else got it wrong. The type Cost was created to distinguish from Price. The Cost of 2 (BTC) at a price of 10K (USD/BTC) is 20K (USD). Note the units. Funds seems to also match this idea, but seems a little bit less precise. I agree that costs could have the units encoded at the type level, for example, as a phantom type. As a requirement, I think that any implementation of this should also take care of the above calculation and changing units from Price to Cost. Considering that after you looked at it carefully, it looks like a significant amount of work for an uncertain amount of gain. I suggest we leave it as is for now. — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub <#10 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAYGTGNESDLR7F6CMUNJ2PTRRYPSPANCNFSM4JNEZMYA>.

dimitri-xyz · 2020-05-16T14:50:58Z

Just a couple thoughts.

The invariant in msgMarketBounds is documented in the API Spec for order placement. (Look for "market order parameters" there).

Here's an attempt to make the meaning of those parameters clearer.

All buy-side orders (limit or market) have a "funds" (i.e. cost) bound. Consider a market bid (buy order) in the BTC-USD market to buy 3 BTC. Although it is not explicitly specified, the matching engine must ensure the customer has enough funds to pay for the 3 bitcoins. How can this be done? Coinbase puts a hold on all the customers available USD funds and then "estimates" whether those funds will be sufficient to buy the 3 BTC. If the "risk management engine" (that makes the estimate) thinks those funds are insufficient, the order is immediately rejected ("insufficient funds"). If it thinks there's enough money to pay for the 3 BTC, the order goes to the matching engine. Exactly the same process happens for limit orders, but in that case it is easy to calculate how much USD to place on hold. Buying 3 BTC at $10K each will require at most $30K. This is the amount placed on hold. So, market bids will always have two bounds: one specified in "how many BTC to buy" (size) and another in "how much money to place on hold (funds)". Coinbase's API just allows customers to express those explicitly.

ericpashman · 2020-05-18T02:05:47Z

All changes made as discussed. Merging.

dimitri-xyz · 2020-05-18T04:24:17Z

Nice! :-)

ericpashman requested a review from dimitri-xyz November 14, 2019 02:59

This was referenced Nov 14, 2019

Notify Coinbase of issues with docs #11

Open

Incorporate @ericpashman's and @karl-chan's work #1

Closed

Implement and test all WebSocket channels; fix parsing of "client_oid…

5e765ee

…" field

ericpashman force-pushed the channels branch from 1ec7a71 to 5e765ee Compare November 25, 2019 19:48

ericpashman mentioned this pull request May 12, 2020

Failing tests #18

Open