Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Alternative to DbC #13553

Closed
iam3yal opened this issue Sep 2, 2016 · 39 comments
Closed

Proposal: Alternative to DbC #13553

iam3yal opened this issue Sep 2, 2016 · 39 comments

Comments

@iam3yal
Copy link

iam3yal commented Sep 2, 2016

This is in response to @MadsTorgersen reply on the .NET blog

Nothing on design by contract this time around, and I’m not sure if there will be anytime soon. We’ve looked a lot at contract-style features, and they tend to get quite verbose and complicated. I am not sure that they would carry their weight in added value as language features. But maybe, one day, if we can get the design right.

While people do use CodeContracts for many things, the 90% case seems to be dealing with null. We are eager to tackle the language integration of that in a more type-system-oriented way, allowing you to distinguish between nullable and nonnullable reference types and track correct usage based on that. I hope this can happen in the next major release of C#.

I partly agree with Mads that DbC tends to be verbose and sometimes complicated but this 10% are as useful as solving the null problems.

Problem:

Checking the input and the output of each method can be both pretty nasty and daunting task for few reasons:

  1. API - There is no standard API or way to express it in the language and because of that it's harder to build static analyzers that understand our code in a way that it can prove or disprove reliably the assumptions that we make and give us feedback about it.

  2. Testing - Testing is expensive, people shouldn't write tests that a tool can make it a lot better, in fact, as an example if we take CodeContracts it did many things right beyond providing this "standard" way of writing contracts at the API level, we had control over many things including great static analysis so yeah we traded complexity and verbosity with data validation.

    Back then when the CodeContracts project was under active development I used it heavily and I saved a lot of time by not writing these validation tests, some people will disagree with me on that mostly TDD purist but that's fine.

    We might not cover everything through tests and we might not even write the correct tests and sometimes we might even forget to write tests so it's better to have a feature that will always be there and do it for us.

  3. Visibility - We have no way to know what the method accept and what the method rejects which will probably lead to some failures so we end up looking it up in the documentation, the code itself or in the tests but these things aren't immediate and sometimes you don't even have them available so you have to do some experiments or decompile the code.

People don't add contracts at all because there's too much weight on the programmer so people don't deal with it at all, they don't care what people will pass to their functions and when I ask them "Why they don't do it?" the response I get is "It's too much work" I don't really accept these responses but I know that if there was a way to express it in the language and the compiler was statically analyzing these contracts people would use it a lot because no one likes to do mistakes and waste their time.

Solution:

Now I know that the null problem is already covered so I'm not speaking about it here at all.

This solution aims to solve cases where we don't want to create a custom type but just want to restrict the input.

So the way it works is by annotate parameters that will be evaluated by the compiler or Roslyn at post-compile-time.

Something like this:

void Time(
    [RangeContract(1, 12)] int hour, 
    [RangeContract(1, 60)] int min, 
    [RangeContract(1, 60)] int sec, 
    [RegexContract("^(p|a)m$")] string period) {
}

Now these contract attributes are not really special but they have to derive from ContractAttribute so the compiler would be aware of this and when you do derive from this you have to implement the IsValid method.

abstract class ContractAttribute : Attribute {}

class RangeContractAttribute : ContractAttribute
{
    public RangeContractAttribute(int from, int to)
    {
    }

    public bool IsValid(int value)
    {
        return true;
    }
}

class RegexContractAttribute : ContractAttribute
{
    public RegexContractAttribute(string regex)
    {
    }

    public bool IsValid(string value)
    {
        return true;
    }
}

I'm aware that ContractAttribute lack the IsValid method above but that's because attributes does not support generics so ContractAttribute is purely a marker.

The value that is passed as arguments to the Time method is passed to the IsValid method for validation and is executed by the compiler and the way this will work is I imagine the compiler will have to sometimes compile fragment(s) of the code to get the value.

There should be a message property that returns a message in cases that the contract failed.

We can even go further with this and have the compiler treat contracts like attributes so instead of writing the full name RangeContract we can just write Range.

Maybe, as an option, it's even possible to force these contracts at run-time by allowing the compiler to generate some code that calls the IsValid method.

This approach is fully extensible so we can plug into it.

P.S. Don't be hard one me when it comes to terminologies of some of the things in my post and by all means feel free to correct me, after all I'm not a compiler person (even though I'm learning it) but I do want the tools I use to improve the code I write and the code others write.

Edit: Change StringContract to RegexContract.

@dsaf
Copy link

dsaf commented Sep 2, 2016

Attributes would probably need a better treatment then :) #711.

So how is this different from CodeContracts (apart from not being dead)?
Are those RangeContract attributes going to be evaluated at run-, compile-, post-compile- time?

Personally I would prefer less powerful but language-level contracts #119 (requires, ensures, ).

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@dsaf

Attributes would probably need a better treatment then :)

Indeed, I fully agree that attributes need more love. :)

So how is this different from CodeContracts (apart from not being dead)?

Well, there's no new syntax or calls to specific APIs and it's more extensible because you get to decide how IsValid will function.

The BCL can contain multiple attributes for each type and people can either derive from it or create custom ones.

Are those RangeContract attributes going to be evaluated at run-, compile-, post-compile- time?

It should be executed at post-compile-time in another processes.

Personally I would prefer less powerful but language-level contracts #119 (requires, ensures, ).

Yeah, I know but from what @MadsTorgersen said it's not going to happen or at least it is unlikely so I thought like at least they can give us something that will allow us to mimic contracts.

Edit: I've updated the post to remove the static analysis part because I don't think it was complete enough for people to understand what I meant.

@dsaf
Copy link

dsaf commented Sep 2, 2016

These days I am experimenting with an alternative approach:

void Time(Hour hour, Minute min, Second sec, Period period) {
}

struct Hour
{
    public Hour(uint value)
    {
        if (value > 12)
            throw new ArgumentOutOfRangeException(...);

        Value = value;
    }

    public uint Value { get; }
}

(or just use TimeSpan but that's beside the point of the original sample, I understand)

This way verification naturally happens at compile time and the run time only deals with data (I/O) which is unavoidable. There could be more done to make creating custom domain value types easier, but there seems to be a conspiracy against it :) #4971 #263 #58.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@dsaf I actually always done it this way, coming from heavy object-oriented background so it's natural to me, in fact, I have a clock that I made in C++ many years ago that I do exactly that.

The timer of the clock looks like this:

#pragma once
#include <memory>
#include "event.h"
#include "second.h"
#include "minute.h"
#include "hour.h"

class Timer final {
public:
    Timer();

    void Attach(std::function<void(Event*)>) const;

    Hour GetHour() const;
    Minute GetMinute() const;
    Second GetSecond() const;

    void Tick();
private:
    const std::unique_ptr<Event> _onThick;

    Hour _hour;
    Minute _minute;
    Second _second;

    int _oldSec = -1;
};

This way verification naturally happens at compile time and the run time only deals with data (I/O) which is unavoidable. There could be more done to make creating custom domain value types easier, but there seems to be a conspiracy against it :) #4971 #263 #58.

Funny thing is C++ is getting the dot-operator in C++17 that will allow custom types to forward calls into native types without creating everything from scratch. :)

However, the issue with your example and also with mine is that the contracts of Hour aren't evaluated at compile-time but at run-time. :)

@dsaf
Copy link

dsaf commented Sep 2, 2016

@eyalsk

However, the issue with your example and also with mine is that the contracts of Hour aren't evaluated at compile-time but at run-time. :)

Yeah, that's true. It would be nice to apply RangeContract in the Hour constructor.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@dsaf yeah exactly. :)

@AdamSpeight2008
Copy link
Contributor

The compiler could generate a <ComplilerGenerated> local function to perform the argument checks.

Function CreateTime(
                     <Range(0,23)> Hours As Integer,
                     <Range(0,59)> Minutes As Integer
                     <Range(0,59)> Seconds As Integer
                   ) As Time
  <CompilerGenerated()>
  Function CreateTime_ArgChecks()
    Validate.Range( 0, 23, NameOf(Hours)) 
    Validate.Range( 0, 59, NameOf(Minutes))
    Validate.Range( 0, 59, NameOf(Seconds))
  End Function

End Function

As well as <CompilerGenerated()> XML Documentation for each of the args.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@AdamSpeight2008 Yeah, in cases we opt-in, the compiler can do that for us, so it's pretty neat. :)

@gordanr
Copy link

gordanr commented Sep 2, 2016

I agree with @MadsTorgersen that code contracts in 80-90% case deals with nulls, but I find very useful to cover other 10-20% cases. Potentially introducing nonnullable reference types in C# will be a really HUGE step for C#. I know that in long-term plans for the next major release there are not resources for this topic.
Maybe you can consider language-level contracts #119 ( requires, ensures, ...) for the next+1 major release. I believe that we now have enough experience from other languages like Spec# or project Midori.

@ufcpp
Copy link
Contributor

ufcpp commented Sep 2, 2016

I prefer dedicated type than DbC on parameter. I'd rather like a type alias with conditions, e.g.

type Hour = int where value >= 1 && value <= 12;

which is translated into code like below:

struct Hour
{
    private int value;
    public Hour(int value)
    {
        if (value >= 1 && value <= 12) throw new ArgumentOutOfRangeException(nameof(value);
        this.value = value;
    }
    public static explicit operator int(Hour x) => x.value;
    public static explicit operator Hour(int value) => new Hour(value);
}

@dsaf
Copy link

dsaf commented Sep 2, 2016

@eyalsk

...10% are as useful as solving the null problems...

@gordanr

...code contracts in 80-90% case deals with nulls...

I think it's important to distinguish between Microsoft Code Contracts and the general principle. I struggle to believe that the statement being referred to is correct for the latter.

@dsaf
Copy link

dsaf commented Sep 2, 2016

@ufcpp how would that work for something with more than one field?

@gordanr
Copy link

gordanr commented Sep 2, 2016

@dsaf, your Time example is very similar with Jimmy Bogard's DDD approach from the site.
https://lostechies.com/jimmybogard/2007/06/25/generic-value-object-equality/
All entities (hour, minute, ...) are not simple types but classes (more better records). It is not always easy to write a lot of classes, but in the end it is worth. If the entity is created - it's valid. I have very good experience in real projects with this approach.

Here is one example. This is a nice example where all checks are for nulls, but this is not always case.

public class Institution : EntityObject<Institution>
{
    public InstitutionId InstitutionId { get; }
    public InstitutionName Name { get; }
    public InstitutionCity City { get; }
    public UniversityId UniversityId { get; }

    public Institution(InstitutionId institutionId, InstitutionName name, InstitutionCity city, UniversityId universityId)
    {
        var validationCode = ValidationCode(institutionId, name, city, universityId);
        if (validationCode != "OK")
        {
            throw new Exception(validationCode);
        }
        InstitutionId = institutionId;
        Name = name;
        City = city;
        UniversityId = universityId;
    }

    public static string ValidationCode(InstitutionId institutionId, InstitutionName name, InstitutionCity city, UniversityId universityId)
    {
        if (institutionId == null) return $"{nameof(institutionId)} = null";

        if (name == null) return $"{nameof(name)} = null";

        if (city == null) return $"{nameof(city)} = null";

        if (universityId == null) return $"{nameof(universityId)} = null";

        return "OK";
    }
}

@dsaf
Copy link

dsaf commented Sep 2, 2016

@gordanr

...If the entity is created - it's valid...

Which is nice, but still means runtime, @eyalsk wants even that compile- or post-compile time.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@gordanr Your example becomes something like the following:

public class Institution : EntityObject<Institution>
{
    public InstitutionId InstitutionId { get; }
    public InstitutionName Name { get; }
    public InstitutionCity City { get; }
    public UniversityId UniversityId { get; }

    public Institution(
        [InstitutionIdContract] InstitutionId institutionId, 
        [InstitutionNameContract] InstitutionName name, 
        [InstitutionCityContract] InstitutionCity city, 
        [UniversityIdContract] UniversityId universityId)
    {
        InstitutionId = institutionId;
        Name = name;
        City = city;
        UniversityId = universityId;
    }
}

What you did by introducing Institution is just moving the problem to another place which is fine and reasonable because it reads better but you didn't solve anything.

@dsaf
Copy link

dsaf commented Sep 2, 2016

@eyalsk actually he did solve something - the error will be close to source. Unlike e.g. passing a raw string around (through layers). So it's a good practice.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@dsaf Yeah, sure but it doesn't really solve the fundamental problems I wrote about above, that's what I meant when I said that he didn't solve anything, it was a bit of a stretch, sorry. :)

@gordanr
Copy link

gordanr commented Sep 2, 2016

This implementation of EntityObject fits well both with record types and proposed language-level contracts #119 ( requires, ensures, ...). That means that contracts can also be used in compile-time.
Look, this class is actually proposed record type.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@gordanr They aren't going to add #119... it's no longer on the table this is the reason I wrote this proposal.

@gordanr
Copy link

gordanr commented Sep 2, 2016

They didn't say NEVER but

But maybe, one day, if we can get the design right.

#119 is very nice proposal.

@iam3yal
Copy link
Author

iam3yal commented Sep 2, 2016

@gordanr

But maybe, one day, if we can get the design right.

It's never going to be #119 but you can wait, in fact, we've been waiting for so many years for something like this to be introduced that I don't think Spec# style or CodeContracts or DbC will ever make it into the language and again, this is the reason I wrote this proposal so we would at least have a way to validate the input in a succinct manner that doesn't introduce new syntax or anything like that, not to mention that the concept I proposed here is very simple to grasp, however, there's probably some complexity involved to get the values that are passed to methods in order to validate them but either way it is necessary if we want to validate it at compile- or post-compile time and get this nice and immediate feedback prior run-time.

p.s. This proposal doesn't handle postconditions because I wanted to keep it really simple and from what I've seen in many codebases people don't tend to use asserts or eager to do so, this is the reason I refrained from adding it.

@HaloFour
Copy link

HaloFour commented Sep 2, 2016

I'd think an attribute-driven form of DbC could be largely implemented via analyzers and source generators (to generate code to enforce the contract) without any changes to the language. Could provide a sandbox in which experiments with DbC can be carried out.

To me the most important aspect of any DbC feature would be embedding of the validation requirements within the contract metadata so that the compiler/tooling can appropriately warn when using a contract incorrectly. Everything else, including enforcement of the contract, is secondary. Having the CLR process explode with a fail-fast on something that the compiler didn't bother to whimper over would be a horrific developer experience.

@ufcpp
Copy link
Contributor

ufcpp commented Sep 3, 2016

@dsaf how about combination with tuples?

type FirstOrthant = (int x, int y) where value.x > 0 && value.y > 0;

@alrz
Copy link
Contributor

alrz commented Sep 3, 2016

@ufcpp Why it couldn't be a regular record,

struct FirstOrthant(int x : X, int y : Y) requires x > 0 && y > 0;

@ufcpp
Copy link
Contributor

ufcpp commented Sep 3, 2016

@alrz My code is a example. Maybe your proposal is better to fit C#. Anyway, I prefer constraints on types rather than on parameters.

@dsaf
Copy link

dsaf commented Sep 3, 2016

@ufcpp @alrz looks interesting, reminds of Ada:

  type Car is record
     Number_Wheels  : Positive range 1 .. 10;
     Horse_Power_kW : Float range 0.0 .. 2_000.0;
  end record;

They added contracts in the new version as well:
http://www.drdobbs.com/architecture-and-design/ada-2012-ada-with-contracts/240150569

@dsaf
Copy link

dsaf commented Sep 3, 2016

@ufcpp

...I prefer constraints on types rather than on parameters.

I agree, otherwise method signatures will become very noisy and also have duplication.

@alrz
Copy link
Contributor

alrz commented Sep 3, 2016

By the way, in my example the constraint is on the primary constructor, if you define another ctor the constraint may not be satisfied unless you call it with the this ctor initializer. Overriden methods and interface implementations should be able to inherit constraints to avoid duplication.

@gordanr
Copy link

gordanr commented Sep 3, 2016

We can think of 'constraints on types' as a special case of 'constraints on parameters of type constructor'.

Look at records with contracts.

public class NumberWheels(int NumberWheels) requires NumberWheels >= 1 && NumberWheels <= 10;
public class HorsePowerKW(int HorsePower) requires HorsePower >= 0 && HorsePower <= 2000;
public class Car(NumberWheels NumberWheels, HorsePowerKW HorsePowerKW);

@AdamSpeight2008
Copy link
Contributor

@gordanr You'll also need to consider what occurs when the constraint isn't satisfied. Does the caller or callee handle it?

@iam3yal
Copy link
Author

iam3yal commented Sep 3, 2016

I know that using attributes is pretty noisy but bear in minds that I wrote this proposal as an alternative because they don't want to add this feature to the language and in the future if they will ever want to add it, they can build on top of this proposal.

However, seems like 'constraints on types' fits only to records and simple types, how do you handle more complex types, i.e., types that have custom members? where do you put the contracts in this case? where do you put the contracts in cases of existing types?

@alrz
Copy link
Contributor

alrz commented Sep 3, 2016

@HaloFour

I'd think an attribute-driven form of DbC could be largely implemented via analyzers and source generators (to generate code to enforce the contract) without any changes to the language.

For an example that does require language support, see #11308 (comment).

@HaloFour
Copy link

HaloFour commented Sep 3, 2016

@alrz

You wouldn't need language support for either iterators or async methods. The source generator would simply emit a replacement that isn't an iterator or an async method:

public IEnumerable<int> Range(int start, [NonNegative] int count) {
    for (int i = 0; i < count; i++) {
        yield return start++;
    }
}
public replace IEnumerable<int> Range(int start, int count) {
    // enforces the arguments immediately
    if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
    // calls and returns the iterator
    return original(start, count);
}

@JamesFaix
Copy link

I've started trying to implement DbC using Roslyn's syntax rewriters. https://github.com/JamesFaix/Traction

@jnm2
Copy link
Contributor

jnm2 commented Dec 12, 2016

@dsaf's method of creating custom primitives like Hour is honestly amazing IMO. I've been doing this for ages and it has paid off.

@dsaf
Copy link

dsaf commented Dec 12, 2016

@jnm2 biggest obstacles so far are persuading other people (in my team) it's the way to go and that mapping DTO <-> Domain <-> DAL is worth it. The idea of re-using a single type from ORM to JSON API is quite entrenched unfortunately which throws things like immutability out of the window. Maybe records will help? We shall see. There is also a nice talk from the dark side (Java/Scala) that describes the approach well: http://www.slideshare.net/oxbow_lakes/age-is-not-an-int . I haven't seen other good descriptions of the approach despite the fact that all we do these days as an industry is OOP apparently 😕.

@iam3yal
Copy link
Author

iam3yal commented Dec 12, 2016

@jnm2 Yeah, indeed, it works well and the abstraction centralizes the contracts so you don't check the same thing multiple times throughout the codebase but still this approach has some disadvantages that may or may not impact performance and finally you still don't get compile-/post-compile-time feedback.

p.s. Introducing the time unit such as Hour as an object to the system is completely reasonable and valuable imo with or without contracts support in the language because I think many would agree that as pointed out by @dsaf Age is not an int.

@iam3yal
Copy link
Author

iam3yal commented Dec 27, 2016

Closing this because I think that while it adds some value and maybe covers these 10% I mentioned in my post the solution is ugly and there are probably better ideas out there besides as @HaloFour pointed out above this solution or similar one can be achieved today through analyzers.

@iam3yal iam3yal closed this as completed Dec 27, 2016
@johncrim
Copy link

johncrim commented Jun 1, 2017

dotnet/csharplang#341 (Discussion: Code Generator Catalog) is related. If supported, it would enable many of the scenarios discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests