Semantics of aggregation is unclear #22

Open
karamaan opened this issue Jun 6, 2014 · 1 comment


karamaan commented Jun 6, 2014

The semantics of HaskellDB's aggregation operators are very unclear. Let me propose the following example to demonstrate my confusion. It concerns a table whose rows represent people. Each person has a family and an age.

import Database.HaskellDB.PrimQuery
import Database.HaskellDB.Query
import Database.HaskellDB.HDBRec
import Database.HaskellDB.DBLayout

data Person = Person
instance FieldTag Person where
  fieldName _ = "person"

data Family = Family
instance FieldTag Family where
  fieldName _ = "family"

data Age = Age
instance FieldTag Age where
  fieldName _ = "age"

family :: Attr Family String
family = mkAttr Family

person :: Attr Person String
person = mkAttr Person

age :: Attr Age Int
age = mkAttr Age

personTable :: Table (RecCons Person String
                  (RecCons Family String
                   (RecCons Age Int RecNil)))
personTable = Table "mytable" [ ("person", AttrExpr "personcol")
                              , ("family", AttrExpr "familycol")
                              , ("age", AttrExpr "agecol") ]

I might want to calculate the total age of everyone in a family. This I do with agesOfFamilies. It returns a query whose rows (ostensibly) pair the family with the total age of everyone in that family.

agesOfFamilies :: Query (Rel
                         (RecCons Family (Expr String)
                          (RecCons Age (Expr Int) RecNil)))
agesOfFamilies = do
  my <- table personTable
  project (family << my!family # age << _sum (my!age))

I can test it with showSql thus:

*Main> putStrLn $ showSql agesOfFamilies 
SELECT familycol as family,
       SUM(agecol) as age
FROM mytable as T1
GROUP BY familycol

which is exactly what I wanted. What happens if I want to project just the age column from this query?

justAgesOfFamilies :: Query (Rel (RecCons Age (Expr Int) RecNil))
justAgesOfFamilies = do
  agesOfFamilies <- agesOfFamilies
  project (age << agesOfFamilies!age)

It seems that justAgesOfFamilies should return a single-column query with one row per family, containing that family's total age, i.e. the result of the query agesOfFamilies without the family column. However, what I get is completely different:

*Main> putStrLn $ showSql justAgesOfFamilies 
SELECT SUM(agecol) as age
FROM mytable as T1

This kind of behaviour seems to be an enormous impediment to composability of queries in HaskellDB.

@tomjaguarpaw

Just to be clear, the reason that this is undesirable is that it is a violation of referential transparency. ("Referential transparency" here is with respect to the database, not with respect to Haskell, of course!) An expression's value should be unchanged when you replace a subexpression with its value. For example, agesOfFamilies might evaluate to

Family  Age
Smith   75
Jones   85

Replacing agesOfFamilies in the definition justAgesOfFamilies with its value (i.e. this table) would lead to a result of

Age
75
85

Since HaskellDB gives us

Age
160

this is a violation of referential transparency.
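
The substitution argument above can be checked with a small model that does not use HaskellDB at all. The following is only a sketch under stated assumptions: tables are modelled as plain Haskell lists, the sample rows are invented so that the family totals come out as 75 and 85, and the names `people`, `agesOfFamiliesRows`, `expectedJustAges` and `observedJustAges` are hypothetical helpers, not part of any library.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- A row of the person table: (person, family, age).
type PersonRow = (String, String, Int)

-- Hypothetical sample data, chosen so the family totals match the
-- 75 / 85 example in this thread.
people :: [PersonRow]
people =
  [ ("Alice", "Smith", 40)
  , ("Bob",   "Smith", 35)
  , ("Carol", "Jones", 50)
  , ("Dave",  "Jones", 35)
  ]

-- Model of agesOfFamilies: one (family, total age) row per family,
-- i.e. SUM(agecol) ... GROUP BY familycol.
agesOfFamiliesRows :: [PersonRow] -> [(String, Int)]
agesOfFamiliesRows rows =
  [ (fam, sum (map age grp))
  | grp@((_, fam, _) : _) <- groupBy ((==) `on` family) (sortOn family rows)
  ]
  where
    family (_, f, _) = f
    age (_, _, a) = a

-- What substitution says justAgesOfFamilies should mean:
-- take the value of agesOfFamilies and keep only its age column.
expectedJustAges :: [PersonRow] -> [Int]
expectedJustAges = map snd . agesOfFamiliesRows

-- Model of the SQL that was actually generated: SUM(agecol) with no
-- GROUP BY, which yields a single row (for a non-empty table).
observedJustAges :: [PersonRow] -> [Int]
observedJustAges rows = [sum [a | (_, _, a) <- rows]]
```

Loaded into GHCi, `expectedJustAges people` evaluates to `[85,75]` (one row per family, listed in sorted family order, which is immaterial for a relation), whereas `observedJustAges people` evaluates to `[160]`. The two differ exactly as the tables above do.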
