-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Some FromSQLValue instances are very difficult to write as a result of limitations in node-postgres #58
Comments
Note: implementing strategy (2) might take some care in order to make sure that all possible SQL return values are typeable. For instance, one must handle the case of an SQL query returning a set of rows where two or more rows differ in type. This is exemplified by SELECT 'this is text', ('bla', 'bla', 'bla') UNION ALL SELECT 'this is numbers', (100, 101); |
@Quelklef Sorry for my late response. I think that want to takle all the problems decribed by you. I have already encountered and tried to "patch" a specific problematic datetime case but this was just a "hot fix". I'm not sure if I understand your second proposition. On the other hand your first design seems to be really clear and easy to follow so I like it for sure :-) I think that additional possible benefit of this design is that we are possibly moving away from @Quelklef what is your preference - do you prefer your first or second design? Do you think that I should go and dive into the first proposition once again (I have like a computer detox for the next three more months so I'm hardly ever online :-P)? P.S. |
I don't really have one. I lean towards value-based codecs, so I'd prefer not even using |
This turned out to be a very long comment. I have provided a summary below. Alternatively, please feel free to skim or skip this comment. My recommendation is for an implementation that looks like this: class FromSQLValue c a | a -> c where
sqlTypedesc :: SQLTypedesc c -- value that tells how to go from an SQL expression to a c
fromCanonical :: c -> Either String a -- go from a c to an a
-- Secretly they're just parsers
newtype SQLTypedesc c = SQLTypedesc (Parser c)
-- How fromSQLValue would be written with this
fromSQLValue :: forall c a. FromSQLValue c a -> Either String a
fromSQLValue = runParser sqlTypedesc >=> fromCanonical -- or something like that
-- Used like this
newtype Box = Box { width :: Number, height :: Number }
instance FromSQLValue Box where
sqlTypedesc = pgTuple (pgDouble /\ pgDouble) :: SQLTypedesc (Number /\ Number)
fromCanonical (width /\ height) = Box { width, height } I believe this implementation to provide a nice API, reasonable implementation, and likely work well with your goal of backend-agnosticism. Oh, and fully suppressing The rest of this comment provides more details as well as documents how I concluded that this design is good. @paluh No sweat on the time frame! In fact, in the time between first posting this GitHub issue and your response, I have been working on the problem on my own and have more thoughts. "I'm not sure if I understand your second proposition."Oops! I should explain it a bit more. The idea is to have In code, class FromSQLValue a where
exprType :: SQLTypedesc
fromForeign :: Foreign -> Either String a
data SQLTypedesc
= Pg_Int
| Pg_Double
| Pg_Text
| Pg_Array SQLTypedesc
| {- ... -} The newtype Box = Box { width :: Number, height :: Number }
instance FromSQLValue Box where
exprType = Pg_Tuple (Pg_Double /\ Pg_Double) -- expected type is (DOUBLE, DOUBLE)
fromForeign f = {- readArray then readNumber twice -} Because I gave an The idea behind this design is as follows. Design (1) just asks A third design: design (3)An interesting feature of design (2) is that if We can reify this with the typesystem by parameterizing data SQLTypedesc a where
Pg_Int :: SQLTypedesc Int
Pg_Double :: SQLTypedesc Number
Pg_Text :: SQLTypedesc String
Pg_Array :: forall a. SQLTypedesc a -> SQLTypedesc (Array a)
{- ... -}
class FromSQLValue canonical rich where
exprType :: SQLTypedesc canonical -- turn the SQL row into some canonical Purescript form
fromCanonical :: canonical -> Either String rich -- turn that into a rich value
newtype Box = Box { width :: Number, height :: Number }
instance FromSQLValue Box where
exprType = Pg_Tuple (Pg_Double /\ Pg_Double)
fromCanonical (width /\ height) = Box { width, height } Unfortunately, Purescript does not currently support GADTs. Fortunately, GADTs are not fundamentally required for this solution. As long as we have a. an opaque type that's enough. I believe this can be achieved, in theory, in a number of ways. For instance:
Implementation stub: data SQLTypedesc a
= Pg_Int
| Pg_Double
| {- ... -}
pgInt :: SQLTypedesc Int
pgInt = Pg_Int
pgDouble :: SQLTypedesc Number
pgDouble = Pg_Double
{- ... -}
parse :: SQLTypedesc a -> String -> Either String a
parse Pg_Int = parseInt
parse Pg_Double = parseDouble
{- ... -}
Implementation stub: newtype SQLTypedesc a = SQLTypedesc (Parser a)
pgInt :: SQLTypedesc Int
pgInt = SQLTypedesc $ parseInt
{- ... -}
parse :: SQLTypedesc a -> String -> Either String a
parse (SQLTypedesc parser) str = runParser parser str # lmap showErr -- or something like that
where showErr = {- ... -} Again, designs (3.1) and (3.2) are definitely not the only ways to achieve design (3). But they are a couple of reasonable ways. "your first design seems to be really clear [...] I think that additional possible benefit of this design is that we are possibly [...] starting to build a layer of parsers / serializers which are backand agnostic"I don't know exactly what you have in mind for this backend-agnostic future; however, in my mind design (3) would work best for it. The issue with design (1) is that it exposes the fact that parsing SQL strings is happening, meaning that client code might inadvertently write something which is PostgreSQL-only. On the contrary, because in design (3) "what is your preference"I'm personally in favor of design (3.2). I don't like that design (1) allows the client code to think about SQL strings; design (2) makes below-optimal use of the typesystem; and design (3.1) seems comparable to (3.2) but more complex. The only issue I can see with design (3.2) is that it would likely less performant than design (1). Using design (3.2), parsing SQL expressions is a two-step process: first, turn them into a canonical value via In my personal bindings, I have been using design (1) with reasonable success so far. (Which is to say, success writing code; I have yet to run any). "I wonder how hard is to skip unnecessery node-postgres parsing and get String input"Not difficult! As it turns out, it's easy to suppress To disable parsing, use the following: require('pg').types.getTypeParser = () => x => x; Note that this relies on undocumented code details, so upgrading An API-compliant way to do this, and perhaps the more obvious solution, is to set every OID parser to As for disabling printing (js→pg), nothing has to be done. As it turns out, one can pass parameter bindings to "I lean towards value-based codecs"@JordanMartinez What do you mean by this? |
Oh, value-based as in fromSQLValue :: Parser a -> String -> Either Err a instead of fromSqlValue :: FromSQLValue a => String -> Either Err a |
I continued work on my personal bindings. As it turns out, a number of the designs I suggested have not-so-obvious issues. I have landed upon a design which:
The idea is basically as follows. Start with class FromSQLValue a where
fromSQLValue :: String -> Either String a -- parse a PostgreSQL expression
instance FromSQLValue String where {- ... -}
-- ... other instances Because the client code shouldn't be thinking about parsing at all, wrap this in an opaque newtype: class FromSQLValue a where
fromSQLValueImplementation :: Impl a
-- type is exported, constructor is not
newtype Impl a = Impl (String -> Either String a)
instance FromSQLValue String where {- ... -}
-- ... other instances But, because the mkImpl :: forall a r. FromSQLValue a => (a -> Either String r) -> Impl r This allows instances like the following: newtype Box = Box { height :: Number, width :: Number }
instance FromSQLValue Box where
fromSQLValueImplementation = mkImpl $
\(width /\ height) ->
if width < 0.0 || height < 0.0
then Left "Box with negative dimensions"
else Right (Box { width, height }) This implementation uses the built-in Note that The main entrypoint is fromPg :: FromSQLValue a => String -> Either String a Please see my current implementation here. (Note that I name the class |
Hi @Quelklef, Thanks a lot for all the description and links to your reference implementation. It seems that I won't be able to dive into your solution because it seems that it would be in general major change and I just don't have time before the upcomming release of this lib to do this kind of revolution :-( I think that we want to provide some better solution for parsing layer and composite types handling in the future so I'm going to leave this issue opened for now and maybe attach to it some appropriate tag. Once again thanks and sorry for this lame feedback from my side :-( |
Hey @paluh! Completely understandable! It's a pretty large effort ask to finalize a design (especially given my slew of open-ended suggestions), and then implement it. No hard feelings that there aren't the resources to do this right now With any luck, I will continue thinking about this on my own time, and either fork |
Hello!
I am currently writing my own set of ps/pg bindings for a personal project. Mason Mackaman suggested possibly using
postgresql-client
instead. (He also mentioned that @paluh had expressed frustration about yet another set of bindings being written).postgresql-client
and my bindings are very similar in design and both exhibit a particular pitfall. I want to document this pitfall here and also propose a potential solution. Should this pitfall be fixed inpostgresql-client
, I don't really have good reason to continue developing my bindings. Should the pitfall be deemed a WONTFIX or non-problem, I may continue developing my bindings and eventually publish them.Let me detail the pitfall now.
It seems that
postgresql-client
and my bindings have the same method for handling returned SQL rows: we ask destination types to instantiateFromSQLValue
; then, when an SQL row is returned fromnode-postgres
, we usefromSQLValue
to turn it into a Purescript value.While
node-postgres
generally displays pretty reasonable behaviour, atypical cases are not handled so well. For instance, on the querySELECT 'fst', 'snd';
, which has row typeTEXT, TEXT
(a size-two row of text),node-postgres
will produce["fst", "snd"]
. However, for the querySELECT ('fst', 'snd');
, with row type(TEXT, TEXT)
(a size-one row containing a 2-tuple of text),node-postgres
will produce["(fst,snd)"]
[0].This means that instantiating
FromSQLValue
for the row type(TEXT, TEXT)
is of notable complexity. The programmer needs to properly handle SQL string syntax, which includes accounting for both quoted and unquoted strings, escape codes, etc [1].postgresql-client
currently does not provide any tools for dealing with this and other complex cases, essentially throwing the programmer into the deep end.I have two design which I believe to pander better to these complex cases of
FromSQLValue
as well as retain support for existing cases.(1) The first design is as follows.
Forgo the parsing from
node-postgres
entirely. HaveFromSQLValue
parse strings instead, i.e.class FromSQLValue a where fromSQLValue :: String -> Either String a
. Then, provide a module containing SQL parser combinators for use when implementingfromSQLValue
.Many existing
FromSQLValue
instances will be trivial to update: for instance, parsingBOOL
goes fromForeign.readBool
toparseBool
. Others, such as those with container types, will be less trivial:Foreign.readArray >=> traverse Foreign.readInt
will have to become something likeparseArrayOf parseInt
[2]. However, I believe that all existing cases will be accounted for, as well as additionally accounting for the complex cases previously discussed.(2) The second design is as follows.
Ask instances of
FromSQLValue
to provide two things: a description of the expected SQL row type, and then a converterfromSQLValue :: Foreign -> Either String a
. The row type description will be used to turn the returned row into a canonicalForeign
value, and then the converter will be used as it is currently.This design is more complex, but also more reverse-compatible, as most
fromSQLValue
implementations will not need to be changed. Additionally, this second design may work better depending on how regular/irregular the SQL value formatting is. For instance, if a container's syntax changes depending on the values it contains, then this second design will probably handle those cases better.End second design.
The pitfall outlined in this GitHub issue creates a schism wherein some
FromSQLValue
instances are easy to write, and some are very difficult, and the divide is not obvious; this strikes me as a significant issue. However, I recognize that (A) these complex cases likely make up a very, very small portion ofFromSQLValue
instances [3], and (B) this proposal is a fairly notable shift in design. For these reasons, I understand if there is hesitance towards accepting it.Please let me know your thoughts.
Post-script. It's likely worth mentioning that a similar thought process might apply to
ToSQLValue
, as well, but that I have not yet thought about it fully.[0]: The fundamental issue, as far as I can tell, is that
node-postgres
uses type OIDs to determine how to parse the strings that postgres produces. For instance, if postgresql produces the stringt
with type OID 18 (CHAR), thennode-postgres
will parse this to the js value't'
. However, if postgresql produces the stringt
with type OID 16 (BOOL), thennode-postgres
will parse this to the js valuetrue
. It seems, however, that not all types have OIDs, and if a row is returned with no OID thennode-postres
will return it unchanged. This is whySELECT (true, false);
became"(t,f)"
: because(BOOL, BOOL)
has no OID. Or, at least,node-postgres
does not know of any.I would like to admit, quickly, that I am not an expert on Postgres interop / communication, so it's possible that I am mistaken about something with regards to type OIDs.
[1]: Example difficult-to-parse result: the row produced by
is returned by
node-postgres
as[2]: FWIW, the reason that "array of int" needs a nontrivial change is because while
Foreign.readArray
can read the array structure without reading its contents, one cannot parse an array from a string without parsing its contents.[3]: Unfortunately for me, my current use-case is one of these very, very few cases =(
The text was updated successfully, but these errors were encountered: