Arrays #33

Closed
wants to merge 11 commits into
from

Projects

None yet

5 participants

@solidsnack
Contributor

Sorry for such a large patch. This integrates some of Bas's work on Postgres arrays. The parser and pretty-printer are much expanded.

@joeyadams
Contributor

Nice work!

However, I wonder if, instead of hanging the FromField and ToField instances directly on Seq, we should use a newtype wrapper instead, e.g.:

newtype PGArray a = PGArray [a]

instance (FromField a, Typeable a) => FromField (PGArray a) where

Because:

  • Due to PostgreSQL array's multidimensional constraint and other quirks, PostgreSQL arrays don't really have the same semantics as Seq a, [a], or even Array i a.
  • If someone wants to overload Seq to support a more consistent representation of arrays (e.g. JSON), they won't be able to.
  • If we want to support explicit dimensions like '[3:5]={3,4,5}'::int[] in the future, we can extend PGArray, but we can't extend Seq.
  • This avoids an unnecessary conversion to Seq a.
@solidsnack
Contributor

Well, with my approach it is possible to parse [a], [[a]] and so on up to six braces deep. We would need to do something very different for PGArray ... to be able to handle that kind of structure.

I do not think it is unreasonable for Postgres arrays with explicit indices to fail to parse for Seq, Vector or []. An absolutely sound, reliable mapping of Postgres types to Haskell types is too much to ask for -- and would not reduce one's programming effort, in any event (we always have to handle the not-Ok case).

@joeyadams
Contributor

Good point. But can't we just have PGArray fill the same role as Seq currently does here? To support multidimensional arrays, stack PGArray type constructors like we stack Seq. Or is Seq used for its performance characteristics rather than just its name?

In either case, I guess it's not a big deal.

@solidsnack
Contributor

We're using Vector internally, in fact; and I meant to push a commit which used Vector. If we weren't compelled to use Seq by the names/instances issue, then we'd likely have used []; but it certainly is a happy marriage of performant data-type and clean instances. It is probably just as well not to make a new collection type.

@lpsmith
Owner
lpsmith commented Aug 16, 2012

Yeah, I would like to see instances for Vector rather than Seq. What's the name/instances issue that you are referring to?

Also, how hard would it be to handle the non-comma separator issue that Joey Adams pointed out on the mailing list?

I'm also thinking that the BuiltinTypes module should be redesigned, probably renamed to TypeInfo and the interface changed a bit. I'm thinking what we really need is:

  1. A TypeInfo constant for a number of datatypes, accessible directly and not through a function of BuiltinTypes Enumeration -> TypeInfo
  2. A function that maps TypeOids to TypeInfos

I'm not so sure we really need an enumeration type like there currently is.

I'm perfectly willing on doing this overhaul myself, though.

@solidsnack
Contributor

Yeah, I would like to see instances for Vector rather than Seq.

I've pushed these instances to the array branch and they should end up as part of this pull request.

What's the name/instances issue that you are referring to?

That an instance for [a] would overlap with String.

Also, how hard would it be to handle the non-comma separator issue that Joey Adams pointed out on the mailing list?

The parser accepts the parameter as a delimiter.

https://github.com/solidsnack/postgresql-simple/blob/arrays/src/Database/PostgreSQL/Simple/Arrays.hs#L36

For nested arrays, commas are used; but when it comes time to parse values, the delimiter that's passed in is used. In the instance definition, the delimiter is hardcoded to a comma ,; but the type information could be used instead.

https://github.com/solidsnack/postgresql-simple/blob/arrays/src/Database/PostgreSQL/Simple/FromField.hs#L251

Each datatype has a delimiter type.

I think support for arrays of atomic types -- uuid, int4, bytea and many others -- would be fine in a feature release. Getting the GIS types right could be a quite a bit more work, as you suggest.

I wonder if there is some way to make the database handle all this stuff. For example, registering a prepared statement that executes any SQL at all and translates the result to JSON.

@lpsmith
Owner
lpsmith commented Aug 16, 2012

Well, this is going to be a feature release, I'll call it 0.3 and hopefully we can have it ready to go within a few weeks :) Speaking of, @sopvop, if you could fix up the SqlError type as you see fit, that would also be a nice thing to add to 0.3. (And your beginnings of a module to interpret SqlError, too)

Thanks for all the work!

@sopvop
Contributor
sopvop commented Aug 16, 2012

Arrays is the feature I'm looking forward to! (composite types would also be awesome)

I should have time for SqlError type next week.
As on error parsing, I don't know how any other types of errors can be parsed for info useful at runtime, Trigger action and plpgsql exeptions are useful at runtime, but their message bodies are user defined. All other errors are about fixing client code. Maybe naming module 'Errors' was not a good Idea considering it's limited scope.

@lpsmith
Owner
lpsmith commented Aug 17, 2012

Ok, I pushed a re-engineered BuiltinTypes module to my arrays branch, though it isn't hooked into anything else yet.

https://github.com/lpsmith/postgresql-simple/blob/arrays/src/Database/PostgreSQL/Simple/TypeInfo.hs

@lpsmith lpsmith closed this Aug 17, 2012
@lpsmith lpsmith reopened this Aug 17, 2012
@solidsnack
Contributor

Added a comment to the TypeInfo module -- not sure how you receive notifications for that. It stands to reason that TypeInfo should include typarray explicitly, so one can go up as well as down...

@lpsmith
Owner
lpsmith commented Aug 17, 2012

I did get a notification for your comment, actually. When I looked at the pg_type table and thought about going crazy pulling in a lot of information, but then decided to keep to things that we have a specific use case for.

So, out of curiosity, do you have a specific use case in mind for this information?

@lpsmith
Owner
lpsmith commented Aug 17, 2012

Though I didn't think about compatibility across different PostgreSQL versions until this morning. I guess I have two issues:

  1. How static is this data, really? (And what are the consequences of getting this data wrong? Clearly that depends on the column: getting the OID wrong would almost certainly cause problems, though small changes in the typcategory might not matter so much.)
  2. Also the pg_type table has evolved (looks like mostly additions, though). For example, the typcategory looks like it was added in 8.4, while typarray looks like it was added in 8.3.

Unfortunately I want to support versions all the way back to 8.1... Though it's no longer supported by the community, 8.1 is still supported by Red Hat and there are still a number of software packages that depend on it. (Actually, I don't personally care about 8.3 or 8.2, so I'm not going to be testing with these versions, though they are something I'd like to support.)

typarray is an easy enough to emulate, as the information is already available in the type table though not as directly available in older versions. It appears that typcategory is new information though... on the other hand I'm not using any types in 8.1 that aren't in the static type table, so maybe this isn't actually an issue that impacts me personally, and the level of compatibility that this change would currently create would be good enough.

Also, adding typarray would require pulling more types into the static table, which is fine by me but does bring up the issue of more things that can go wrong.

Perhaps we should add a utility function in Internal or somewhere that sanity-checks the static TypeInfo table? Or maybe this belongs as part of the test suite...

@sopvop
Contributor
sopvop commented Sep 21, 2012

@lpsmith how about merging all the pull requests into 0.3 branch, without releasing it on hackage?

@lpsmith
Owner
lpsmith commented Oct 8, 2012

@sopvop; i just did that, sorry about the delay

@lpsmith
Owner
lpsmith commented Jan 8, 2013

@solidsnack, do you have any opinions on whether or not the Database.PostgreSQL.Simple.Arrays module should be part of the public interface? I'm also interested in hearing from other interested parties.

@solidsnack
Contributor

@lpsmith I believe Database.PostgreSQL.Simple.Arrays should indeed be part of the public interface. There's nothing unsafe about it and it will be helpful to people wanting display arrays in diagnostic messages or debug the library should it be misbehaving.

I realize this code is rather old now. The new stuff is here: https://github.com/erudify/postgresql-simple/tree/arrays

I would be interested to know more about your strategy for getting this merged. I guess there are bunch of features planned for 0.3 but isn't this orthogonal to them? It would be of immediate value to most people using Postgres, I expect.

@lpsmith
Owner
lpsmith commented Mar 17, 2013

This pull request has already been merged into the 0.3 branch. I didn't see the updates in erudify's branch until now, but a quick persual suggests that not that much has changed.

Unfortunately the FromField instance in the 0.3 branch is currently broken, as I broke it in a rush to allow FromField instances to perform IO actions and thus stop pre-computing the typnames. I added a few failing test cases to the test suite to document the problem, but this does need to be fixed, and is one of a few small tasks blocking the release of 0.3.

The work I did ~2 months ago did get the 0.3 branch a great deal closer to release-ready state, but I'll also admit this particular release isn't a personal priority at the moment. I did write a comment for Bas's benefit that does describe the major task I'd still like to accomplish before the release here.

@solidsnack
Contributor

Is a feature release -- say 0.2.5 -- for just the arrays out of the question?

@lpsmith
Owner
lpsmith commented Mar 17, 2013

No, but I'm not sure how much work that would take versus completing what's there.

@solidsnack
Contributor

Maybe not much... #56

@solidsnack solidsnack closed this Mar 17, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment