add a bunch of random documents
Swift SVN r369
lattner committed Apr 14, 2011
1 parent c5eeb2c commit 3cce562
Showing 6 changed files with 1,002 additions and 0 deletions.
341 changes: 341 additions & 0 deletions docs/3 - Swift Simple User Defined Datatypes (enum:union:struct).rtf
@@ -0,0 +1,341 @@
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350
{\fonttbl\f0\fswiss\fcharset0 Helvetica;\f1\fnil\fcharset0 Monaco;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww16520\viewh17120\viewkind0
\deftab720
\pard\pardeftab720\ql\qnatural

\f0\fs24 \cf0 The basic observation is that it is really important for these datatypes to be efficient, and thus they should be pass-by-value by default and inlined into larger objects. \'a0This is the current behavior of NSRect and it works well. The basic structure I think we should follow comes right from algebraic datatypes (e.g. from the ML world), which combine enum/struct/union all into one data descriptor. \'a0Let's start with enums:\
\
\
\pard\pardeftab720\ql\qnatural

\b \cf0 Enums
\b0 \
\
Enums will be declared something like this:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0oneof DataSearchFlags \{ \'a0 // Example stolen from CFDataSearchFlags
\f0 \

\f1 \'a0\'a0 \'a0Backwards,
\f0 \

\f1 \'a0\'a0 \'a0Anchored
\f0 \

\f1 \'a0\'a0\}
\f0 \
\
A major difference from C is that the elements of a 'oneof' don't get injected into the global scope. \'a0This means that\'a0
\f1 Backwards
\f0 \'a0isn't valid in the global scope; you have to use\'a0
\f1 DataSearchFlags::Backwards
\f0 \'a0or\'a0
\f1 DataSearchFlags.Backwards
\f0 \'a0or something like that. \'a0This is good because you don't have to worry about your enumerators clashing with other stuff in the global scope.\
\
Given this declaration, you could do silly things like this:\
\

\f1 \'a0\'a0var x : DataSearchFlags \'a0 // default initialized.
\f0 \

\f1 \'a0\'a0var y =\'a0DataSearchFlags::Anchored\
\'a0\'a0var z : DataSearchFlags =\'a0DataSearchFlags::Anchored \'a0// redundant type specifier.
\f0 \
\
\
Of course, this is seriously over-verbose, and it is also not typically how enums get used in our APIs. \'a0The solution is to take advantage of the same mechanics already in place for "autoclosurification", which provide context-sensitive type inference. \'a0A new ":" operator defers name lookup + type resolution until the context is resolved, allowing stuff like this:\
\

\f1 \'a0\'a0// Declaration, somewhere not in user code.
\f0 \

\f1 \'a0\'a0func\'a0CFDataFind(...., compareOptions : DataSearchFlags)
\f0 \

\f1 \
\'a0\'a0// Users see this.
\f0 \

\f1 \'a0\'a0CFDataFind(a, b, c, :Anchored)
\f0 \
\
Compare this to the existing CF call, which looks like:\
\

\f1 \'a0\'a0CFDataFind(a, b, c, kCFDataSearchAnchored);
\f0 \
\
I think that the combination of a deferred scoping operator plus context-sensitive name lookup will resolve a lot of over-verbosity issues without sacrificing anything. \'a0Before talking about structs, let's talk about unions.\
\
\pard\pardeftab720\ql\qnatural

\b \cf0 \
Unions
\b0 \
\
Because we want to be type-safe by default, we really care about discriminated unions. \'a0Discriminated unions have a lot of power in ML style languages, allowing truly elegant pattern matching and a lot of other great things. \'a0However, they are so painful/verbose to do right in C that they are almost never used: most uses of unions in Cocoa.h are for "reinterpret some piece of data another way" (e.g. int -> float) not for a proper\'a0
\i discriminated
\i0 \'a0union. Reinterpretations can be done with appropriately named casts.\
\
That said, there are some uses. \'a0Here is one (simplified) example from Cocoa:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0typedef UInt32 scalerStreamTypeFlag;
\f0 \

\f1 \'a0\'a0enum \{
\f0 \

\f1 \'a0\'a0 \'a0downloadStreamAction = 0,
\f0 \

\f1 \'a0\'a0 \'a0fontSizeQueryStreamAction = 1,
\f0 \

\f1 \'a0\'a0 \'a0encodingOnlyStreamAction = 2,
\f0 \

\f1 \'a0\'a0 \'a0prerequisiteQueryStreamAction = 3,
\f0 \

\f1 \'a0\'a0 \'a0prerequisiteItemStreamAction = 4,
\f0 \

\f1 \'a0\'a0 \'a0variationQueryStreamAction = 5
\f0 \

\f1 \'a0\'a0\};
\f0 \

\f1 \
\'a0\'a0struct scalerStream \{
\f0 \

\f1 \'a0\'a0 \'a0scalerStreamTypeFlag types;
\f0 \

\f1 \'a0\'a0 \'a0union \{
\f0 \

\f1 \'a0\'a0 \'a0 \'a0struct \{
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0const unsigned short * encoding;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0SInt32 * glyphBits;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0const char * name;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0\} font;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0struct \{
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0SInt32 size;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0SInt32 list;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0\} prerequisiteQuery;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0SInt32 prerequisiteItem;
\f0 \

\f1 \'a0\'a0 \'a0 \'a0SInt32 variationQueryResult;
\f0 \

\f1 \'a0\'a0 \'a0\} info;
\f0 \

\f1 \'a0\'a0\};
\f0 \
\
Note that the only real difference between a discriminated union and an enum is that a union has types associated with the enumerators. \'a0This is really straightforward, and builds off the existing tuple support we already have. \'a0The example above would be declared like this (the types in [[]] brackets don't exist yet in Swift, use your imagination :-) :\
\

\f1 \'a0\'a0oneof\'a0ScalerStream\'a0\{
\f0 \

\f1 \'a0\'a0 \'a0Download,
\f0 \

\f1 \'a0\'a0 \'a0FontSizeQuery (encoding : [[const unsigned short*]],
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 glyphBits : [[SInt32*]],
\f0 \

\f1 \'a0\'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 \'a0 name : string),
\f0 \

\f1 \'a0\'a0 \'a0EncodingOnly,
\f0 \

\f1 \'a0\'a0 \'a0PrerequisiteQuery (size : int, list : int),
\f0 \

\f1 \'a0\'a0 \'a0PrerequisiteItem int,
\f0 \

\f1 \'a0\'a0 \'a0VariationQuery int
\f0 \

\f1 \'a0\'a0\}
\f0 \

\f1 \
\pard\pardeftab720\ql\qnatural

\f0 \cf0 Building on what we have from the base expression/type definition, each discriminator can have at most one type associated with it. \'a0In the FontSizeQuery and PrerequisiteQuery cases, this type is a tuple with multiple elements. \'a0This gives us access to the existing (and uniform) tuple initialization and processing stuff.\
\
With this declaration, you can use these like this:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0var x1 : ScalerStream = :Download
\f0 \

\f1 \'a0\'a0var x2 =\'a0ScalerStream::Download \'a0 \'a0// same as x1
\f0 \
\

\f1 \'a0\'a0var y = ScalerStream::PrerequisiteItem 42
\f0 \

\f1 \'a0\'a0x1 = :PrerequisiteQuery(.size = 2, .list = 42)
\f0 \

\f1 \'a0\'a0x1 = :PrerequisiteQuery(2, 42)
\f0 \

\f1 \'a0\'a0bar(:FontSizeQuery(.encoding = a, .glyphBits = b, .name = "foo"))
\f0 \

\f1 \'a0\'a0bar(:FontSizeQuery(a, b, "foo"))
\f0 \

\f1 \
\pard\pardeftab720\ql\qnatural

\f0 \cf0 There would also be support for doing a "switch" style pattern matching dispatch to get to the individual elements. \'a0A rough idea is something like this:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0switch (some_stream) \{
\f0 \

\f1 \'a0\'a0case EncodingOnly:
\f0 \

\f1 \'a0\'a0 \'a0...
\f0 \

\f1 \'a0\'a0case PrerequisiteItem x:
\f0 \

\f1 \'a0\'a0 \'a0 handle(x)
\f0 \

\f1 \'a0\'a0 \'a0 ...
\f0 \

\f1 \'a0\'a0case FontSizeQuery(encoding, glyphBits, name):
\f0 \

\f1 \'a0\'a0 \'a0 do_something_with(encoding + glyphBits, name)
\f0 \

\f1 \'a0\'a0 \'a0 ...
\f0 \

\f1 \'a0\'a0\}
\f0 \
\
There should also be an operator to check for discriminators and extract values, etc. \'a0Basically we need a way to poke at the "isa" for the union.\
\
\
\pard\pardeftab720\ql\qnatural

\b \cf0 Structs
\b0 \
\
The last piece of this is the struct case, which is just a special case of a union with exactly one discriminator. \'a0While structs are just a hacky special case :-), they are important, because this is what most people think about. \'a0The following would work:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0oneof CGRect \{
\f0 \

\f1 \'a0\'a0 \'a0\'a0CGRect(origin :\'a0CGPoint,\'a0size : CGSize)
\f0 \

\f1 \'a0\'a0\}
\f0 \

\f1 \
\'a0\'a0var x1 = CGRect::CGRect(myorigin,\'a0CGSize::CGSize(42, 123))
\f0 \

\f1 \'a0\'a0var x2 = CGRect::CGRect(.size =\'a0CGSize::CGSize(.width = 42, .height=123), .origin = myorigin)
\f0 \
\
However, this seems like massive syntactic overkill. \'a0There are a couple ways to handle this, but introducing a real "struct" keyword is probably the simplest. \'a0This would give:\
\

\f1 \'a0\'a0struct CGRect (origin :\'a0CGPoint, size : CGSize)
\f0 \

\f1 \
\'a0\'a0var x1 = CGRect(myorigin, CGSize(42, 123))
\f0 \

\f1 \'a0\'a0var x2 = CGRect(.size = CGSize(.width = 42, .height=123), .origin = myorigin)\
\'a0\'a0var sum = x1.size.width + x1.size.height
\f0 \

\f1 \
\pard\pardeftab720\ql\qnatural

\f0 \cf0 A struct declaration is just like a declaration of a oneof containing a single element, plus it injects the (single) constructor into the global namespace. The injected constructor is why "CGSize" works without requiring CGSize::CGSize or :CGSize in an inferred context. Internal to the compiler, this is just de-sugared and handled uniformly with the more general oneof case, just like 'func' is de-sugared to 'var'.\
\
In addition to injecting the constructor, a struct definition injects definitions of accessor functions for each field into the containing scope. This allows member access ("x1.size") directly on structs through normal dot syntax.\
\
\
\pard\pardeftab720\ql\qnatural

\b \cf0 Other Stuff
\b0 \
\
Following the uniform syntax for variable and func definitions, oneof and struct should allow attributes, e.g.:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0struct [packed] MyPoint ( x : sometype1, y : sometype2 )\
\pard\pardeftab720\ql\qnatural

\f0 \cf0 \'a0\
I don't have any specific plans for attributes here, but they could be useful when we want a struct to exactly match the layout of a C type or a hardware resource. \'a0They also allow us to specify that these are implicitly pass-by-reference if that ever becomes important. \'a0For example, that would allow us to do something like this:\
\
\pard\pardeftab720\ql\qnatural

\f1 \cf0 \'a0\'a0struct [byref] MyList (\
\'a0\'a0 \'a0data : int,\
\'a0\'a0 \'a0next : MyList\
\'a0\'a0)
\f0 \
\
Without "byref" you'd get an error about MyList not being allowed to have infinite size. :-)\
\
This would only be appropriate if, for some reason, you don't want to use an object, which will always be "by-ref".\
\
\
}
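
For comparison, the oneof and struct ideas in the note above can be sketched in present-day Swift. This is only a sketch under assumptions: it uses today's enum/struct syntax rather than the draft's oneof and ":" operator, the pointer types written in [[]] brackets are replaced by arrays, and find, describe, Point, Size and Rect are hypothetical stand-ins for CFDataFind, the switch example, and the CG types.

```swift
// Enumerators are scoped to their type, as the note proposes.
enum DataSearchFlags {
    case backwards
    case anchored
}

// A discriminated union: each case carries at most one (possibly tuple) payload.
// Array types are stand-ins (assumption) for the draft's [[...]] pointer types.
enum ScalerStream {
    case download
    case fontSizeQuery(encoding: [UInt16], glyphBits: [Int32], name: String)
    case encodingOnly
    case prerequisiteQuery(size: Int, list: Int)
    case prerequisiteItem(Int)
    case variationQuery(Int)
}

// Hypothetical stand-ins for CGPoint, CGSize and CGRect.
struct Point { var x: Double; var y: Double }
struct Size { var width: Double; var height: Double }
struct Rect { var origin: Point; var size: Size }

// Hypothetical stand-in for CFDataFind; only the flags parameter matters here.
func find(compareOptions: DataSearchFlags) -> String {
    switch compareOptions {
    case .backwards: return "backwards"
    case .anchored: return "anchored"
    }
}

// Pattern-matching dispatch that extracts the payload of each case.
func describe(_ stream: ScalerStream) -> String {
    switch stream {
    case .download: return "download"
    case .encodingOnly: return "encoding only"
    case .prerequisiteItem(let x): return "item \(x)"
    case .variationQuery(let x): return "variation \(x)"
    case .prerequisiteQuery(let size, let list): return "query \(size) \(list)"
    case .fontSizeQuery(_, _, let name): return "font \(name)"
    }
}

// The leading dot plays the role of the draft's ":" operator: the enumerator
// is looked up once the parameter type is known from context.
let flagName = find(compareOptions: .anchored)
let itemText = describe(.prerequisiteItem(42))
let queryText = describe(.prerequisiteQuery(size: 2, list: 42))

// The struct constructor is available by the type's name alone, and fields
// are reached with ordinary dot syntax.
let rect = Rect(origin: Point(x: 0, y: 0), size: Size(width: 42, height: 123))
let sum = rect.size.width + rect.size.height
print(flagName, itemText, queryText, sum)
```

The switch in describe has no default case, so the compiler checks that every discriminator is handled, which is one way to get the type-safe "isa" check the note asks for.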
