Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor listTheory #98

Open
xrchz opened this issue Oct 30, 2012 · 22 comments
Open

refactor listTheory #98

xrchz opened this issue Oct 30, 2012 · 22 comments

Comments

@xrchz
Copy link
Member

xrchz commented Oct 30, 2012

listTheory is cluttered with stuff that has moved in over time from rich_listTheory, and depends on probably too many things (like pred_setTheory). There is arguably a "core" theory about lists, that could serve as listTheory, and the remainder can be moved elsewhere (e.g. to rich_list, or refactored further into different pieces). This issue comes in part from fallout discussion on the HOL developers mailing list after #52 was first closed.

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@xrchz
Copy link
Member Author

xrchz commented Oct 30, 2012

least impact option might be to have a new core_listTheory, and build a listTheory with roughly the same stuff in it as currently on top (and possibly merge it with rich_list).

@konrad-slind
Copy link
Contributor

A. Yes, rich_list (and related conversions) needs to be revised or retired.
B. I like Magnus' point about having the list theory be "functional
programming" in
nature. It might be worth looking at SML's list structure for what
ought to be in
a "core" list theory:

    http://www.standardml.org/Basis/list.html

Also, I remember that Isabelle/HOL went through a big re-org of their

list
theory many years ago. Might be some hints there

Konrad.

On Tue, Oct 30, 2012 at 12:17 PM, Ramana Kumar notifications@github.comwrote:

least impact option might be to have a new core_listTheory, and build a
listTheory with roughly the same stuff in it as currently on top (and
possibly merge it with rich_list).


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-9914455.

@mn200
Copy link
Member

mn200 commented Oct 30, 2012

It’s not obvious to me that it is overly cluttered (and removing MEM made it less cluttered of course :-). The complete list of constants is

 ALL_DISTINCT :α list -> bool
 APPEND :α list -> α list -> α list
 CONS :α -> α list -> α list
 DROP :num -> α list -> α list
 EL :num -> α list -> α
 EVERY :(α -> bool) -> α list -> bool
 EVERY2 :(α -> β -> bool) -> α list -> β list -> bool
 EVERY2_tupled :α list # β list -> bool
 EXISTS :(α -> bool) -> α list -> bool
 FILTER :(α -> bool) -> α list -> α list
 FLAT :α list list -> α list
 FOLDL :(β -> α -> β) -> β -> α list -> β
 FOLDL2 :(α -> β -> γ -> α) -> α -> β list -> γ list -> α
 FOLDL2_tupled :(α -> β -> γ -> α) # α # β list # γ list -> α
 FOLDR :(α -> β -> β) -> β -> α list -> β
 FRONT :α list -> α list
 GENLIST :(num -> α) -> num -> α list
 GENLIST_AUX :(num -> α) -> num -> α list -> α list
 HD :α list -> α
 LAST :α list -> α
 LEN :α list -> num -> num
 LENGTH :α list -> num
 LIST_REL :(α -> β -> bool) -> α list -> β list -> bool
 LIST_TO_SET :α list -> α -> bool
 LRC :(α -> α -> bool) -> α list -> α -> α -> bool
 LUPDATE :α -> num -> α list -> α list
 MAP :(α -> β) -> α list -> β list
 MAP2 :(β -> γ -> α) -> β list -> γ list -> α list
 MAP2_tupled :(β -> γ -> α) # β list # γ list -> α list
 NIL :α list
 NULL :α list -> bool
 PAD_LEFT :α -> num -> α list -> α list
 PAD_RIGHT :α -> num -> α list -> α list
 REV :α list -> α list -> α list
 REVERSE :α list -> α list
 SET_TO_LIST :(α -> bool) -> α list
 SNOC :α -> α list -> α list
 SUM :num list -> num
 TAKE :num -> α list -> α list
 TL :α list -> α list
 UNZIP :(α, β) alist -> α list # β list
 ZIP :α list # β list -> (α, β) alist
 isPREFIX :α list -> α list -> bool
 listRel :(α -> β -> bool) -> α list -> β list -> bool
 list_case :β -> (α -> α list -> β) -> α list -> β
 list_size :(α -> num) -> α list -> num

If you were going to remove things because they were too obscure, I’d go for all those from “ListPair” (i.e., EVERY2, FOLDL2, MAP2, ZIP and UNZIP), and also LRC. The rest all look pretty reasonable to me. (The fact that we have two copies of LIST_REL (that and listRel) is not good of course...)

SET_TO_LIST is also pretty weird and wild, so I’d feel OK with deleting that.

Finally, I’ve just had a look at the Isabelle library, and I see that they have a ListMem constant, defined inductively, and for which there are no further theorems. Their theorems whose names include the substring “mem” typically have x ∈ set l sub-terms.

@acjf3
Copy link
Contributor

acjf3 commented Oct 30, 2012

Having a separate ListPair theory makes sense to me.

I would actually propose creating a completely new directory under src/ and starting a clean list development from scratch. That way users can keep on using the existing theories for now and wait until the new one stabilises (there's agreement/maturity). In time, the old theories can be abandoned en masse. Of course, the old theories can be raided for proofs -- hopefully with a fair bit of cleaning up. Mucking around with the existing theories in situ is probably going to be too disruptive.

I don't see why we can't simply mimic the Standard ML libraries (almost verbatim style). In particular, things like "GENLIST" should be called "tabulate". An separate theory can be used for set/relational stuff. I find mapPartial to be a useful SML function; zip and unzip are also pretty useful (I wouldn't describe them as being that obscure).

@mn200
Copy link
Member

mn200 commented Oct 30, 2012

I didn't mean to denigrate ZIP; it would just be better in ListPair (as per the Basis).

I also like mapPartial, so would be more than happy to see it in listTheory. I think that setting up tabulate as an alias for GENLIST is a pretty good idea, though SML’s List.tabulate takes a pair and GENLIST is curried. I think I prefer the curried version. Other incompatibilities appear in the type of FOLDL for example. I don’t think the pain of fixing that one would be worth it.

(I can’t help but note that the basis library doesn’t have a MEM function at all.)

Finally, I don’t want to put users into the situation of deciding that they should be doing

open interimListTheory

and then having to later switch back to

open listTheory

and what if they end up including/open-ing both?

We want listTheory to be the go-to place for theorems about lists, with all the standard notions in place. Users get a pretty reasonable version of that right now. They even get a notion of membership. It doesn’t seem at all obvious to me that there is any need for a dramatic reworking of what’s there.

@acjf3
Copy link
Contributor

acjf3 commented Oct 31, 2012

There are a few other differences:

nth, EL
all, EVERY
concat, FLAT

SML has "find" (which is useful) and "getItem" (which I don't use), "SNOC" and "FRONT" don't appear in the Basis either.

I agree that re-implementing something as widely used as the list theory would be a fairly radical move, so there would have to be a clear imperative. A couple of previous bit-vector theories have been deprecated/removed., This approach works best when something isn't being heavily used, or if porting to the new development is very easy or highly beneficial. I think a much cleaner/better version of listTheory could be developed, especially if one isn't shackled by supporting/maintaining the existing version. However, I admit that the benefits would be largely aesthetic. The state of the current version is somewhat embarrassing.

If refactoring work is to be done on listTheory then perhaps it is best if a new branch is created for that purpose?

@mn200
Copy link
Member

mn200 commented Oct 31, 2012

Yes, a features/better-lists branch or something would be the way to go. But what’s the plan of work?

Is it just:

  1. move pair-ish constants to new listPairTheory
  2. add mapPartial
  3. provide “tabulate” alias for GENLIST
  4. provide “concat” alias for FLAT
  5. move LRC somewhere else, maybe a list_relationTheory

These all seem to be sufficiently low-impact that they’d be fine as small commits on master.

@konrad-slind
Copy link
Contributor

  1. Move SNOC somewhere else
  2. Move PAD_{LEFT,RIGHT} somewhere else
  3. Add a useful ordering on lists (lex, colex)?

Speculation: mem not in SML List structure because it has an equality type.
And maybe because the designers wanted to promote using other datastructures
for set-like applications.

On Tue, Oct 30, 2012 at 7:53 PM, Michael Norrish
notifications@github.comwrote:

Yes, a features/better-lists branch or something would be the way to go.
But what’s the plan of work?

Is it just:

  1. move pair-ish constants to new listPairTheory
  2. add mapPartial
  3. provide “tabulate” alias for GENLIST
  4. provide “concat” alias for FLAT
  5. move LRC somewhere else, maybe a list_relationTheory

These all seem to be sufficiently low-impact that they’d be fine as small
commits on master.


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-9928841.

@mn200
Copy link
Member

mn200 commented Oct 31, 2012

Totally agree about providing a LEX relation. What would be a good name for a theory that included PAD_LEFT? And where would SNOC go? ((Half-seriously) We could just make SNOC an alias for _ ++ [_] ...)

mn200 added a commit that referenced this issue Oct 31, 2012
See #98 (clean up listTheory) and #99 (automatically delete unneeded
_tupled constants).
@acjf3
Copy link
Contributor

acjf3 commented Oct 31, 2012

I think it's important to have reasonably well-founded reasons for selecting the locations for each of the constants. In particular, I don't think being "infrequently used" or "not used by me" is a good criteria for pushing something out of the core theory. It's too cheap to have extra constants around and too annoying to have "hidden / less well known" theories around. Frequency of use is too fragile/confusing a criteria.

Distinctions can be made on the basis of dependencies and hopefully some unambiguous taxonomical criteria. So having PairLists and "Equality Type / Set Based" lists seems reasonable. Whereas a dumping ground for SNOC, PAD_{LEFT/RIGHT} seems pointless.

I'd prefer it if SNOC where renamed to something like append1. An overload is an interesting proposal but it could be hard to pull off.

@mn200
Copy link
Member

mn200 commented Oct 31, 2012

How about listSegments or something similar for some of those things in rich_list? I agree with your comments about trying to be a little bit principled, and that constants are cheap. Particularly now that most people (I hope) are able to use heaps, the fact that listTheory is perhaps rather large on disk doesn't really matter. (On Moscow ML, the larger the theory, the slower it will be to load.)

@acjf3
Copy link
Contributor

acjf3 commented Oct 31, 2012

Yes, something for list segments sounds reasonable. Having a consistent naming convention would be good, e.g. something like: listPair, listSet (or ?), listSegment.

A decision would have to be made about what gets loaded with bossLib. As a user of the heaps feature, I have a bias towards the "load everything"' approach but I agree that Mosml users won't be happy with that. (PolyML 5.5 is now pretty good at managing memory.) In any case, a library should be added as a shortcut to loading everything (users can then simply add this to their heap).

@konrad-slind
Copy link
Contributor

I agree. Loading listLib should load everything. Hopefully this
will help assuage the irritation that comes with separating
out different sub-theories like listPair, etc.

Konrad.

On Wed, Oct 31, 2012 at 5:16 AM, acjf3 notifications@github.com wrote:

Yes, something for list segments sounds reasonable. Having a consistent
naming convention would be good, e.g. something like: listPair, listSet (or
?), listSegment.

A decision would have to be made about what gets loaded with bossLib. As a
user of the heaps feature, I have a bias towards the "load everything"'
approach but I agree that Mosml users won't be happy with that. (PolyML 5.5
is now pretty good at managing memory.) In any case, a library should be
added as a shortcut to loading everything (users can then simply add this
to their heap).


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-9939368.

mn200 added a commit that referenced this issue Nov 1, 2012
See github issue #98 for more information.

Essence of work to date is that core_list has the type definition, and
the “basic” functions on that type. Derived theorems and other
constants are left until listScript to be defined.
@mn200
Copy link
Member

mn200 commented Nov 1, 2012

I’ve started branch features/list-refactor for work on this idea. We can give up on the whole idea if it proves too ugly, or for whatever other reason. What I’ve committed so far (0d14a13) creates a minimal core_listTheory (without any dependency on pred_setTheory I’m sure you’ll be pleased to see).

The resulting, gutted listTheory then builds on top of this successfully. At the moment, rich_listTheory fails.

@konrad-slind
Copy link
Contributor

Looks good. I'd vote to get rid of SUM from core_listTheory.
Does it get used anywhere?

I'd be inclined to argue that core_listTheory should contain basic
functional programming on lists (so it's similar to the SML Basis).
Needs a better name than core_list, though.

Konrad.

On Wed, Oct 31, 2012 at 11:06 PM, Michael Norrish
notifications@github.comwrote:

I’ve started branch features/list-refactor for work on this idea. We can
give up on the whole idea if it proves too ugly, or for whatever other
reason. What I’ve committed so far (0d14a130d14a132417f82b4)
creates a minimal core_listTheory (without any dependency on
pred_setTheory I’m sure you’ll be pleased to see).

The resulting, gutted listTheory then builds on top of this successfully.
At the moment, rich_listTheory fails.


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-9970439.

@mn200
Copy link
Member

mn200 commented Nov 1, 2012

Yeah, I agree that SUM could probably go. What do you think needs to be there
to support “functional programming à la Basis”? I guess perhaps GENLIST should
be there. TAKE and DROP are Basis functions that could probably move there.

@xrchz
Copy link
Member Author

xrchz commented Nov 1, 2012

I'm not sure what you meant by "Does it get used anywhere?", but I certainly use SUM in some of my theories. I don't think it's being removed in this proposal, just moved, so that's fine. But I should probably reiterate the question I sent to the dev mailing list about testing external example theories (reply there).

@xrchz
Copy link
Member Author

xrchz commented Mar 30, 2013

LIST_REL and EVERY2 are the same constant.

@homeier
Copy link
Contributor

homeier commented Mar 31, 2013

LIST_REL is an important constant from the quotient library. It is parallel
to OPTION_REL and the infix operators for pairs and sums, ### and +++. All
of these are polytypic examples of operators that take as arguments partial
equivalence relations for the arguments of the type, and return as results
the corresponding partial equivalence relation for the aggregate type.

Peter

On Sat, Mar 30, 2013 at 12:21 PM, Ramana Kumar notifications@github.comwrote:

LIST_REL and EVERY2 are the same constant.


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-15677100
.

"In Your majesty ride prosperously
because of truth, humility, and righteousness;
and Your right hand shall teach You awesome things." (Psalm 45:4)

@xrchz
Copy link
Member Author

xrchz commented Mar 31, 2013

Speaking of which, OPTION_REL is the same as OPTREL in optionTheory. I've found the latter to have a much more useful definition theorem.

I merely suggest that one day we remove one of each pair of duplicated constants [(LIST_REL,EVERY2),(OPTION_REL,OPTREL)], keeping all the unique theorems and probably keeping both names with overloads.

@mn200
Copy link
Member

mn200 commented Mar 31, 2013

Yes. Having two constants is unnecessary. Having two names for the same constant is fine.

mn200 added a commit that referenced this issue Mar 31, 2013
Preserve the EVERY2 theorems, at least for the moment.

This issue was first identified in discussion around github issue #98
mn200 added a commit that referenced this issue Mar 31, 2013
Preserve quotient_optionTheory.OPTTION_REL_def as a theorem derived
from OPTREL’s definition.

This is another issue arising from discussion associated with github
issue #98.
@homeier
Copy link
Contributor

homeier commented Mar 31, 2013

Speaking of these constants like LIST_REL, there is an even more important
constant in the quotient library, that perhaps might be of value defined
earlier in the HOL system.

This is the infix constant ===>, defined in quotientScript.sml as

val FUN_REL =
new_infixr_definition
("FUN_REL",
(--$===> (R1:'a->'a->bool) (R2:'b->'b->bool) f g = !x y. R1 x y ==> R2 (f x) (g y)--),
490);

Given two partial equivalence relations R : 'a -> 'a -> bool and S : 'b ->
'b -> bool, R ===> S is the corresponding partial equivalence relation on
the type 'a -> 'b. This is one of the core notions that makes higher order
quotients possible.

Peter

On Sun, Mar 31, 2013 at 2:56 AM, Michael Norrish
notifications@github.comwrote:

Yes. Having two constants is unnecessary. Having two names for the same
constant is fine.


Reply to this email directly or view it on GitHubhttps://github.com//issues/98#issuecomment-15687906
.

"In Your majesty ride prosperously
because of truth, humility, and righteousness;
and Your right hand shall teach You awesome things." (Psalm 45:4)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants