Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protecting internal state of struct instances (encapsulation). Clock example #2974

Closed
ghost opened this issue Jul 29, 2019 · 9 comments
Closed
Labels
Milestone

Comments

@ghost
Copy link

ghost commented Jul 29, 2019

Just wondering what is the best approach to encapsulate structs, or protect the internal state of structs, in zig code. An example could be a clock struct, where the fields are constrained to certain ranges and where they may impact each other when changed. For example, adding 160 minutes to the clock time will change both minute and hour fields.

const Clock = struct{
  hour: u8,, // in range 0..23
  minute: u8, // in range 0..59
  second: u8, // in range 0..59
  millisecond: u16, // in range 0..999

  fn addHours(hours: u8){
    // update internal state
  }

  fn addMinutes(minutes: u8){
    // update internal state
  }

  //other functions
}

In zig 0.4 you can't make fields private, so you can't protect state the normal OOP way by using setters and getters. Currently, the only way to achieve some kind of encapsulation is by using naming convention like _second instead of second to imply that this field is internal and not meant to be modified from outside the struct.

I see four possible future alternatives for protection of state if something more elaborate than naming convention is desirable:

  1. std.debug.assert a custom isValidmethod each time before using a struct instance is used, or each time after modification
  2. Add possibility to mark fields as private or readonly when accessed from outside its own struct.
  3. Make the whole struct instance immutable, and ensure that when initialized the instance is guaranteed to be valid. This would need some kind of protected "constructor" though. If it's possible to do const myclock = Clock{.hour= 999999, ...} then it doesn't help that the instance is immutable after.
  4. Encapsulate with Java style interface. var myclock: IClock = Clock.init() This way, the fields are hidden when accessing the myclock instance while all necessary getters and setters are available through the interface. It could even be possible to enforce that no instance will have the type of the implementing struct so that only the interface will be exposed. Example:
const Clock = implementation struct{
  // members 
}

const myclock : Clock = Clock.init()
// compile error:
// Instance of type Clock can only be instantiated as an interface implementation
const myclock : IClock = Clock.init() // OK.

Personally, I highly prefer alternative 4, as interfaces can provide other things in addition to just encapsulation, such as polymorphism, clarify package/module APIs, and help IDEs autogenerate code. It can also provide the same benefits as inheritance or struct embedding (#1214), at least if you factor in IDE tools. For example, imagine a generate-code dialog that asks: "implement Interface1 by delegating to a Interface1Impl instance?" which if you accept, creates all that "boilerplate" code for you in the fraction of a second, with no added language complexity to worry about.

Related: #2479 #2059 #569 #2952

@ghost
Copy link

ghost commented Jul 29, 2019

My issue with encapsulation is that I think it encourages people to model the real world.

If you wanted to judge the quality of a program's code without access to its source code, one way you could do that is by seeing the amount of data the program reads from each memory address. Does it need to jump around in memory for just a few bytes at each location? Or can it make use of most of the cache line? This isn't even about performance, that's just a side benefit. This tells you whether the program has grouped data together based on concepts that don't map to the details of what the program actually does, or whether it created the structs based on what the functions actually do which I believe provides an objective measure of code quality far better than changing designs only when things look hairy or feel wrong.

All structs do is group together data that is accessed together. But just because you have some grouped data and a few functions/methods that access them, doesn't mean that you don't occasionally have a function elsewhere that needs to reach in for a little bit of the data. As long as structs aren't thought of as being the vehicle for encapsulation, then reaching in is as natural as anything else you do in a program. Furthermore, I imagine it can sometimes make sense to duplicate some of the data into a different struct to simplify the coding of, and improve the performance of, another set of functions. This is something that I think you'd never see in OOP-like code with private fields, with or without getters.

The key realization for me is that structs aren't necessarily named based on some top-down concepts from the real-world, they are named based on what is in the structs, and what's in the structs is based on what the functions need. Encapsulation is really about bringing in top-down concepts and trying to map structs to those concepts, where really those concepts are much too large to fit onto individual structs and belong rather as 'modules'. The details of what's in structs is much too fine and changing to let any kind of thoughts about encapsulation dictate it.

@ghost
Copy link

ghost commented Jul 29, 2019

I think getters and setters are related to a movement that was all about creating abstractions too soon and too often. Virtual classes and private fields. Replace the classes, replace the fields. For each and every object. This is just going to be noise. Good abstractions are really hard. Trying to do them per struct, and even ahead of time before you understand the problem? Impossible.

It's not that I don't think it could be useful to somehow indicate that some field of a struct is a minefield, I just think the whole practice of getters and setters is a much bigger minefield. They're essentially about warm feelings over real code. They will get in the way. They always do.

@ghost
Copy link
Author

ghost commented Jul 29, 2019

I think getters and setters are related to a movement that was all about creating abstractions too soon and too often. Virtual classes and private fields. Replace the classes, replace the fields. For each and every object. This is just going to be noise. Good abstractions are really hard. Trying to do them per struct, and even ahead of time before you understand the problem? Impossible.

It's not that I don't think it could be useful to somehow indicate that some field of a struct is a minefield, I just think the whole practice of getters and setters is a much bigger minefield. They're essentially about warm feelings over real code. They will get in the way. They always do.

Just for the record, with this issue I wanted to discuss encapsulation in general, not only the traditional OOP encapsulation. Hence the clock example (which is a bit contrived), but you can easily imagine scenarios where enforcing valid state at all times can be important.

I largely agree with you that accessibility modifiers may "force" abstractions too early, add some boilerplate, and make zig syntax more complex. That's why I brought up the other alternatives to achieve encapsulation as well. What is your opinion on leveraging interfaces for that?

@andrewrk andrewrk added this to the 0.6.0 milestone Jul 29, 2019
@ghost
Copy link
Author

ghost commented Jul 31, 2019

Just to provide one more scenario where protecting internal state would be desirable:

const CarTransmission = struct{
  gear: u2,
  clutch: bool,

  fn safeChangeGear(self: CarTransmission, newGear : u2){
    const clutchPrevState = self.clutch;
    self.clutch = true;
    self.gear = newGear; // OK
    self.clutch = clutchPrevState;
  }

  fn willDestroyGearbox(self: CarTransmission, newGear : u2){
    if(newGear==self.gear){
      newGear += 1; //let wrap, e.g 3 becomes 0
    }
    const clutchPrevState = self.clutch;
    self.clutch = false;
    self.gear = newGear; // Not OK. Illegal state transition
    self.clutch = clutchPrevState;
  }
}

Again, the example is very artificial, but it is different from the clock example
in that all combinations of states are legal while some state transitions are not.

@BarabasGitHub
Copy link
Contributor

I generally don't like this. What if I want to extent the functionality of a struct/class with some method that doesn't exist in the original?

I'd want access to the 'private' fields so I can do what I want. This is one of the things which annoys me the most in C++. You basically have to re-implement things from scratch, just because you want this one additional piece of functionality which you can't implement with the public member functions.

The most common example I run into is pre-allocating space in a std::vector to pass to a C-style function later. I don't want it to be initialized, but I do want it to be accessible. std::vector only has reserve which doesn't initialize it, but doesn't make it accessible either, or resize which does make it accessible but also initializes all the memory (waste of cpu cycles and memory access).

@m-r-hunt
Copy link
Contributor

m-r-hunt commented Oct 3, 2019

Because Zig is a low level systems language, a sufficiently deranged programmer or sufficiently nasty bug can always change your private or read only field via pointer shenanigans. These kind of features risk lulling the user into a false sense of security.

@Sobeston
Copy link
Sponsor Contributor

I think it would be a good idea to be able to denote which fields are internal, allowing the "end user" of your struct/library to access them using a builtin function (say, @private(struct, []u8) var or something like that).

Some programmers are at the end of the day going to use private/internal fields and functions, but it would be a good idea to have the ability to make clear what the end user is allowed to touch and discourage users making use of these internal things without thinking about it.

@andrewrk
Copy link
Member

Feel free to continue the discussion in any of the community gathering places

@ghost
Copy link
Author

ghost commented Apr 22, 2020

Typedef could be utilized to get encapsulation for the given example. See #5132 for context and explanation of syntax and semantics.

const EncapsulatedClock = typedef(Clock, .Encapsulated{.mode=.MethodsOnly}){

  fn addHours(hours: u8){
    // delegate to Clock.addHours
  }

  fn addMinutes(minutes: u8){
    // delegate to Clock.addMinutes
  }

  // add a new function that is only present in the typedef
  fn getMinutes(self: Clock) u8 { return self.minute;};

  //other functions. delegate to Clock
};

test "encapsulated clock" {
   // auto coercing from base to typedef is defined as ok for encapsulation typedefs
   const encapsulatedClock : EncapsulatedClock = Clock.init();
   _ = encapsulatedClock.minute // compile error
   _ = encapsulatedClock.getMinute(); // ok
}

This typedef feature would require the compiler to "consult" what typedef config is associated with
a variable (ZigValue) each time there's a member access operation to either reject or accept.

It's easy to create a "safer" typedef version of a struct, and then the user has the choice (with @as(,)) between using the safer and more limited typedef, or to stick with the original type.

Often, a typedef config payload set to FieldsReadOnly would suffice, and then it takes only one line of code to get the encapsulated "version", given that there are sufficient mutating methods in the original struct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants