Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add glossary #51

Merged
merged 2 commits into from
Oct 10, 2017
Merged

Add glossary #51

merged 2 commits into from
Oct 10, 2017

Conversation

fabric-and-ink
Copy link
Contributor

@fabric-and-ink fabric-and-ink commented Jun 21, 2017

This a draft for a glossary for this crate (see #41). I chose terms that I found inside the crate and added some that seemed appropriate.

Now I need feedback, @nikomatsakis, @aturon, @withoutboats :)

  • Which terms are unnecessary? Which terms are still missing from the list? Which terms need more explanation?
  • Some terms are still lacking an description. These are the ones that I would have to guess.
  • Please point out all factual & grammatical errors and wording improvements.
  • More pointers to literature :)

Rendered

GLOSSARY.md Outdated
| 0111 | || | Disjunction; or |
| 1001 | <=> | Equivalence; if and only if; iff |
| 1101 | => | Implication; if ... then |
```
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This table could not be rendered correctly in markdown because of the || operator and I didn't like the workaround :(

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the || operator is not used at all in Chalk, maybe you could just drop it? I think that would avoid some confusion. Or maybe you just want to be exhaustive in a general way and then that's fine.

Copy link
Contributor Author

@fabric-and-ink fabric-and-ink Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generality is not my goal, so I will drop it.

GLOSSARY.md Outdated
`a` is a free variable.
- A sum `\sum_n x_n` binds the index variable `n`.

## Canonical
Copy link
Contributor Author

@fabric-and-ink fabric-and-ink Jun 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(empty)" missing...

Maybe somebody can explain this term?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So Canonical applies to an object which contains variables (e.g., a goal) and means that:

  • we have replaced some variables with their value if any (i.e. we have applied our current "knowledge")
  • we have rebound De Bruijn indices to lower ones

Chalk only deals with goals in canonical form.

For example, when dealing with a goal like: forall<0,1> { 0: Foo && 1: Bar }. Here we have two different indices because there are two different variables, and this goal is in canonical form.

In order to process it we will process each subgoal (namely 0: Foo and 1: Bar) separately, and so when we process 1: Bar, this goal is not in canonical form because it could be written as 0: Bar (lower De Bruijn indices). So we canonicalize this subgoal, process it, and then feed back the answer to the top level goal.

@fabric-and-ink fabric-and-ink mentioned this pull request Jun 21, 2017
1 task
@fabric-and-ink fabric-and-ink changed the title [WIP] Add first draft of glossary [WIP] Add glossary Jun 21, 2017
GLOSSARY.md Outdated

## DeBrujin Index
(empty)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you have filled this section, you may want to move the Canonical section just after this one, since in particular Canonical deals with rebounding variables to lower DeBruijn indices.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to keep the terms in alphabetical order. Can you explain both terms and how they interact?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right it's a glossary!
So I'll start with De Bruijn indices anyway. When you have a formula: forall<T> { exists<U> { T: Foo<Item = U> } }, you don't want to deal with variable names like T and U because these are difficult to handle. So De Bruijn indices transform each variable name into a natural number, for example the previous query would be transformed in forall<0> { exists<1> { 0: Foo<Item = 1> } }.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then these indices aren't really more complicated than what I have read. I wasn't sure if there was some small, magical detail that I was missing :)

Reading the code I noticed that these indices seem to have a close relationship to the UniverseIndex (not sure about the exact name right now). What is an Universe and how is it related to our formulas?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I missed this one! So UniverseIndex indicates the number of universall quantifiers we are within. Examples:

i32: Foo // in universe 0 here
forall<T> {
    // in universe 1 here
    exists<U> {
        forall<V> {
            // in universe 2 here
            ...
        }
    }
}

When we canonicalize, we just recall the universe indices of the re-bound variables.

GLOSSARY.md Outdated
## Datum
(empty)

## DeBrujin Index
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Debruijn :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah... yes :)


There are two notable special cases of clauses. A *Horn clause* has at most one
positive literal. A *Definite clause* has exactly one positive literal.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you should explain how Horn clauses relate to Chalk and logic programming in general (ie they can be understood as A <= B && C && D ... && Z where A, ..., Z are literals)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an important point that I didn't really grasp. Could you elaborate a bit?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In propositional calculus, a Horn clause is a clause of the form A || !B || ... || !Z with at most one positive literal (denoted A here). Taking the implication P => Q as a shortcut for !P || Q, we can rewrite the clause above like B && C && ... && Z => A.

So in logic programming, a Horn clause is understood as this latter form, ie exactly one consequence following from a certain number of conditions. Every logical rule that Chalk knows is of this form, and it will only use this kind of rules in order to give a result to a goal.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Neat!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More questions came up.

  • To me Horn clauses seem to be of a "weaker" form than Definite clauses, at least with respect to the logical equivalence B && C && ... => A. In the case of Horn clauses it is possible that there is no A as a consequence of B && C && .... How is this possibility treated in chalk? I such a case rejected?
  • You mentioned rules in your last sentence. Could you explain what these rules are exactly and give some examples?

Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have been more explicit indeed! So in Chalk we don't really deal with Horn clauses without a consequence (called negative Horn Clauses). Well in some sense goals can be seen as such clauses, but that's it.

By rules, I meant the clauses that Chalk infers from a (pseudo) Rust program and definitely holds for true. For example, given the following program:

struct MyStruct<T> { }
impl<T> Foo for MyStruct<T> where T: Clone, T: Bar { }

Chalk will infer something like this (in form of a Horn clause):
(T: Clone) && (T: Bar) => (MyStruct<T>: Foo)
which in logic programming is traditionally denoted as:
(MyStruct<T>: Foo) :- (T: Clone), (T: Bar)
i.e. "consequence IF(:-) condition1, conditon2, ..."

You'll note that since there is a type parameter T, the latter clause should hold for every type T. So that means that in fact we do not have one clause here, but an infinite family of them (one for each possible value of T).

Also note that there can be no conditions at all, like in:

struct MyStruct { }
impl Foo for MyStruct { }

which will be lowered into (MyStruct: Foo)., i.e. this is a ground fact that Chalk holds for true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see! Thank you for the explanation.

GLOSSARY.md Outdated
## Goal
With a set of given types, traits and impls, a goal specifies a problem where
types need to be found that satisfy the problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add an example of such a goal?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

GLOSSARY.md Outdated

## Projection
Projection is the reference to a field or (in the context of Rust) to a type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might want to give an example of what we call a projection in Chalk.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. I think the Iterator/Item case should be a good example, right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed :)

Copy link
Contributor Author

@fabric-and-ink fabric-and-ink Jul 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit unsure about Projection... I understand what Normalization is and added an example accordingly. Can you provide an example for this one?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So these two are related. A projection is a reference to an associated type, like <T as Iterator>::Item. Normalization is the process of telling what this associated type is (specifying the projection), which is denoted in Chalk by <T as Iterator>::Item ==> i32 for example

GLOSSARY.md Outdated
Unification is the process of solving a formula. That means unification finds
values for all the free variables of the formula that satisfy it. In the context
of chalk the values refer to types.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An example here would be valuable :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok.

A formula with the existential quantifier `exists(x). P(x)` is satisfiable if
and only if there exists at least one value for all possible values of x which
satisfies the subformula `P(x)`.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should add that in the context of Chalk, we actually ask for something stronger than "at least one": we want that there exists exactly one since we are trying to determine what type the user intended. If we have more than one possible values, we consider the result as ambiguous.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this is important :) and did not really know that. I will add this information.

@fabric-and-ink
Copy link
Contributor Author

fabric-and-ink commented Jul 2, 2017

The following descriptions are still missing/incomplete (and I need help with them because I have only a vague or no clue :) )

  • Datum
  • Universe
  • Well-formed
  • Projection

By the way: how do I provide a link to the rendered version of the text? I have seen it before but I don't know how to do it.

@lqd
Copy link
Member

lqd commented Jul 2, 2017

@fabric-and-ink you mean a link to your branch's glossary.md like this ? Rendered

@fabric-and-ink
Copy link
Contributor Author

Exactly, thanks!

@nikomatsakis
Copy link
Contributor

Datum

This is just the singular of "data". It is used to indicate the "data" about an impl or trait, as I recall, in the IR, but has no particular meaning other than being relatively distinctive. In general, I think we're phasing most of those "datums" out anyway, since they were basically used to "cheat" in the trait handling, by looking at the original source program rather than the pure lowered form.

Universe

When you are type-checking something like fn foo<A>(x: A) { ... }, we say that the name A is in a different "universe" from the structs. This is because it is a name that is only valid inside the body of foo, and it can never be named from outside foo. Hence "universe" refers to kind of "the set of things you can talk about at a particular point". Universes are arranged into a tree: things in the root universe can be named from anywhere, because if you are in universe X, you can name all the things in X or any ancestor of X, but not the other universes.

Well-formed

The "well-formed" conditions are basic conditions that are needed to ensure that some request is valid. So, for example, if I have struct HashSet<T: Hash>, then HashSet<i32> is well-formed because i32: Hash. But something like HashSet<NotHash>, where NotHash is a type that does not implement Hash, would not be well-formed. Similarly, if you have trait Foo<T: Hash>, then Bar: Foo<i32> is well-formed (written WF(Bar: Foo<i32>)), because i32: Hash, but Bar: Foo<NotHash> would not be. Note that just because Bar: Foo<i32> is well-formed doesn't mean that it is true -- it basically means that the conditions are correct such that Bar: Foo<i32> could be implemented, but not that it necessarily is.

As a more concrete example, in the Rust stdlib we have trait Eq: PartialEq, which is short for trait Eq where Self: PartialEq. This implies that if you have impl Eq for Foo, you must also have impl PartialEq for Foo. We say then that WF(Foo: Eq) holds if Foo: PartialEq -- that is, those are the conditions in which one is allowed to implement Eq for Foo, even if you do not actually do so.

Projection

A "projection" is a general term meaning to extract some "part" of something. In the case of chalk, it mostly refers to associated types. So when you have <T as Iterator>::Item, this is "projecting" Item "out of" T as Iterator. That is, it is determining what value is given for Item in the impl of Iterator for T.

GLOSSARY.md Outdated
in an unambiguous way in order to work with numbers instead of the names of the
literals. Given the example `forall<T> { exists<U> { T: Foo<Item=U> } }` the
literal names `U` and `T` are replaced with `0` and `1` respectively: `forall<0>
{ exists<1> { 0: Foo<Item=1> } }`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe: "More generally, the debruijn index is the index of the binder where a name was defined, starting from the innermost binder and working out."

## Skolemization
Skolemization is a technique of transferring a logical formula with existential
quantifiers to a statement without them. The resulting statement is in general
not equivalent to the original statement but equisatisfiable.
Copy link
Contributor

@nikomatsakis nikomatsakis Jul 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could make this a bit more intuitive. I would given an example like this:

To get an intuition for skolemization, consider this Rust function:

fn foo<T: Hash>() { 
    ...
}

When we type-check the body of foo(), we don't know yet what the type T will be. So we introduce a "skolemized" type -- meaning, basically, a fresh type that is distinct from all other types in the program. The only thing we know about this skolemized type T is that it implements Hash. Therefore, if we can successfully type-check the function, we know that it should type-check with any other type, so long as that type implements Hash (for this to work, we have to ensure that knowing more about a type never interferes with type-checking).

Skolemization also occurs in other contexts. For example, if you have a "higher-ranked" type like for<'x> fn(&'x u32) -> &'x u32, then 'x is a bound lifetime that will be instantiated each time the function is called. We may however want to check something about the function type in the abstract, without knowing exactly what 'x is. In that context, we could skolemize 'x and check the contents that way.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your thorough explanations! I will work them in during the next days :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great =)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A glossary is a great idea btw

@nikomatsakis
Copy link
Contributor

@fabric-and-ink do you mean rebasing? I'm reluctant to merge with merge commits in the PR itself.

@fabric-and-ink
Copy link
Contributor Author

Sorry! Made a mistake... I want to squash everything when this PR is ready.

@fabric-and-ink
Copy link
Contributor Author

fabric-and-ink commented Jul 26, 2017

(Still working on it :) )
(PS: I ordered more literature. A few things still confuse me 🙂 )

@nikomatsakis
Copy link
Contributor

@fabric-and-ink is there any reason not to merge this? seems like good stuff so far! :)

@fabric-and-ink
Copy link
Contributor Author

Glad to hear! I think for a first iteration it is almost finished. There are still some descriptions missing or without an example. I will add them in a few days.

@fabric-and-ink fabric-and-ink changed the title [WIP] Add glossary Add glossary Oct 10, 2017
@fabric-and-ink
Copy link
Contributor Author

Ok, I think this should be good for now.

@nikomatsakis
Copy link
Contributor

🚀 thanks @fabric-and-ink !

@nikomatsakis nikomatsakis merged commit c702e6b into rust-lang:master Oct 10, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants