Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Statically-Sized Types #3

Merged
merged 6 commits into from
Jul 10, 2017
Merged

Conversation

DavePearce
Copy link
Member

This proposal introduces the concept of a statically-sized type to Whiley and, similarly, proposes that many existing types be labelled as dynamically sized.

This is the draft of the RFC for introducing statically sized types to
the Whiley language.  The purpose of this is to provide a clear
mechanism for mapping Whiley types onto the machine.
This updates the draft for statically sized types to dictate what the
proposed types are, and to give some examples mapping to C.
This adds some discussion of implicit coercions to the draft.
The comments regarding operator typing are somewhat incomplete and
really need to be fleshed out some more.
@DavePearce DavePearce changed the title RFC/statically sized Statically-Sized Types May 30, 2017
@bambinella
Copy link

bambinella commented May 30, 2017

I would prefer that the size wasn't in bits, but in bounds, so that one could reserve certain values for fast propagation of errors/not-a-number etc.

So rather than uint:8, I would prefer int<0,255> or in the case of a unicode UTF-8 byte: int<1,255> (reserving 0 for termination). Or in another case you might want int<0,127> and use int<-32,-1> for error conditions.

But I guess I have to think a bit more about this, since this goes into the distinction between memory represenation and value and how Whiley express that relationship.

For instance, is Whiley going to support compact storage of int<-50,200> or int<1,256>? Those would fit in a byte.

One note: Ada has int:32, but uint:31. This is to ensure that uint is a proper subtype of int for the same memory size.

@bambinella
Copy link

bambinella commented May 30, 2017

I wonder if the Whiley semantics could be changed to require that expressions have to be computable at runtime with a static integer size?

From a backend / performance perspective I am not interested in limiting function parameters as much as I am interested in limiting code-gen.

Unfortunately that brings optimization and rewriting capabilities into the type system... So that means you would have to do simplification (partial evaluation?) before verification.

I guess it could be an interesting option to think about though.

@bambinella
Copy link

I also think you should forget cstrings, they are not being used much in new programs and can be modelled as a zero-padded array of fixed size with some suitable constraints. C++ is abandoning the concept for zero-terminated strings completely. C++ now have array_views/string_views with start-end indices (new in C++17).

Also, when it comes to calling conventions you should forget about representing it in the core Whiley language. You really need special syntax for each calling convention as calling conventions vary between languages and hardware platforms. So this should be left to the backend and just provide mechanisms for adding custom annotation.

What is needed is a way to interpret bitvectors... You get 512 bits in, what does that mean in terms of Whiley types?

@DavePearce
Copy link
Member Author

Hi Ola,

So rather than uint:8, I would prefer int<0,255> or in the case of a unicode UTF-8 byte: int<1,255> (reserving 0 for termination)

So, I have considered ranges. That was my first preference. But, it is clunky to write by comparison. And, ultimately, I don't think it's that useful. In your example of a UTF-8 byte you a uint:8. The thing is, programming systems have converged on a preset representation of data primarily oriented around types such as C99's int8, int16, etc. Potentially, we could later extend with value ranges if we really felt it was worthwhile, but I don't think it really will be. Not for the foreseeable future at least. Look at the types available in C, C++ and Rust as examples.

@DavePearce
Copy link
Member Author

Hey,

I also think you should forget cstrings

C strings are just one interesting example. The key is that arrays in C do not have a length argument. For example, int* is just a pointer to an unknown number of ints. That's what &int[?] models.

You really need special syntax for each calling convention as calling conventions vary between languages and hardware platforms.

It's not about calling conventions. It's about modelling foreign data and enabling easy interop. For example, one of the reasons Rust is very successful is that we can take an existing C function and rewrite it in Rust. This then links with other code that used the original C version without any problem. That's what I'm trying to recreate.

@bambinella
Copy link

I guess you can extend it later, but I am afraid that if you make u:32 builtin then people will overuse it and not write generic integer code or over-constrain parameters. I think actually a library solution is better, something like this would be ok:

type u16 = int<0,65535> 

(or some other qualifier, could be "symbol" or something)

I understand what you say about cstrings. My gut feeling is that it would be better to only have
a plain bitvector type that models concrete memory and have some way to build types over that which translates the bit patterns into Whiley types. That could work also for floating point.

@DavePearce
Copy link
Member Author

My gut feeling is that it would be better to only have
a plain bitvector type that models concrete memory and have some way to build types over that which translates the bit patterns into Whiley types

You have to think about what this means. Firstly, it means you can't write drop in replacements in Whiley for C functions. Secondly, C programmers would need to convert their data into bitvectors to communicate with Whiley code. Thirdly, this doesn't work for pointers. That is, you can convert the pointer itself into a bitvector, but you won't convert the data it points to into a bitvector (or, at least, doing do would be costly). Overall, this seems frustrating because, actually, Whiley and C are very similar in many ways. We should be able to represent C data directly in Whiley, so why not allow it?

(Note: having a mechanism to turn an arbitrary type in Whiley into a bitvector is something I'm interested in)

@bambinella
Copy link

bambinella commented May 31, 2017

That is, you can convert the pointer itself into a bitvector, but you won't convert the data it points to into a bitvector (or, at least, doing do would be costly).

I only meant the data part since the topic is "statically sized types", not the pointer part. Yes, something clever has to be done for the "graph-like" data-structures, but I think it ideally should go quite a bit beyond the concept of pointer for proofs to be meaningful.

I have a lot of thoughts on that, but if one goes beyond bitvector then there are other things that ought to be possible in the syntax, like adding some temporal knowledge to the memory area:

  • can it mutate
  • can other mutate it
  • is it immutable while we are looking at it
  • is it constant between calls (so backend can cache)

@bambinella
Copy link

I guess there also should be a mechanism for tying foreign data to foreign function calls, to get at least some type safety even though the foreign type system is completely different, but not quite sure what to make of that yet.

@DavePearce DavePearce merged commit 7d21983 into Whiley:master Jul 10, 2017
@DavePearce DavePearce deleted the rfc/statically-sized branch August 3, 2017 01:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants