CodeGenerationSimplificationsForAttributesAndAssociations

Kevin Brightwell edited this page Aug 26, 2015 · 1 revision

ISSUE

My work on code generation for attributes and, more importantly, associations focused on semantic correctness, but it resulted in A LOT of minor variations. This makes porting to other languages very complex, because every nuance must be captured to ensure the two important properties: referential integrity and multiplicity constraints.

PROPOSAL PART 1 -- Using Constraints to Simplify Association Multiplicities

Geoffrey, I am very interested in taking my analysis of multiplicity variations for binary / unary / reflexive associations and transforming them into a smaller subset with attached constraints.

e.g.

0..1 X -- m..n Y

would transform (for example) into

[ cardinality(Y) >= m && cardinality(Y) <= n ]
0..1 X -- * Y

Comment by TL: I suggest that cardinality(Y) rather than num(Y) might be better.
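
As a rough illustration of the transformation above, here is a minimal Java sketch in which the m..n end is stored as a plain unbounded list (the "*" form) and the bounds live in a separate constraint check. The class names X and Y come from the example; the method names (cardinalityOfY, satisfiesConstraint, addY, removeY) are assumptions made for illustration and are not what Umple currently generates. The sketch also ignores the 0..1 back-reference from Y to X to stay short.

```java
import java.util.ArrayList;
import java.util.List;

// Y is a placeholder for the class at the m..n end.
class Y {}

class X {
  // Hypothetical bounds taken from the original "m..n" multiplicity.
  private final int m;
  private final int n;
  // Stored as the simplified "*" end: an unbounded list.
  private final List<Y> ys = new ArrayList<>();

  X(int m, int n) {
    if (m < 0 || n < m) throw new IllegalArgumentException("need 0 <= m <= n");
    this.m = m;
    this.n = n;
  }

  int cardinalityOfY() { return ys.size(); }

  // The attached constraint: cardinality(Y) >= m && cardinality(Y) <= n
  boolean satisfiesConstraint() {
    return cardinalityOfY() >= m && cardinalityOfY() <= n;
  }

  // Refuses duplicates and any add that would exceed the upper bound.
  boolean addY(Y y) {
    if (ys.contains(y) || ys.size() >= n) return false;
    return ys.add(y);
  }

  // Refuses any remove that would drop below the lower bound.
  boolean removeY(Y y) {
    if (!ys.contains(y) || ys.size() <= m) return false;
    return ys.remove(y);
  }
}
```

The point of the sketch is that the storage and the add / remove skeleton are the same for every multiplicity; only the constraint expression varies.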

I think this analysis, combined with our updated code generation, would be quite publishable as a means for other code generation tools to get much better multiplicity constraints.

PROPOSAL PART 2 -- Simplifying Referential Integrity

The code that maintains referential integrity through the public API is very confusing. As a refresher, here is an example:

0..1 Student -- 0..1 Mentor

In the above, if the Student is assigned the Mentor, then the Mentor must also be assigned to the Student. In smart code, you can do the assignment from either side:

  s.setMentor(m)
  // OR, not and
  m.setStudent(s)

In the above, the "set" method needs to be smart enough to know whether this is the first call in setting up the relationship (in which case the code needs to configure the other end) or the second (in which case it only sets its own side, as the other end is already set up).

THIS GETS MUCH MORE COMPLICATED when we have to manage constraints. In the above, when assigning a student to a new mentor, the old mentor needs to be removed:

  miguel.setMentor(tim)
  andrew.setStudent(miguel) // Tim should no longer be miguel's mentor
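
To make the two scenarios above concrete, here is a minimal hand-written Java sketch of a 0..1 -- 0..1 association in which either setter may be called first and the old partner is released on reassignment. It is an illustration only, not the code Umple generates; the identity checks used to detect "the other end is already set up" are one possible technique, assumed here for brevity.

```java
class Student {
  private Mentor mentor;

  Mentor getMentor() { return mentor; }

  void setMentor(Mentor newMentor) {
    // Second call of a pair: the other end already did the work.
    if (newMentor == mentor) return;
    Mentor oldMentor = mentor;
    mentor = newMentor;
    // Release the old mentor if it still points at this student.
    if (oldMentor != null && oldMentor.getStudent() == this) {
      oldMentor.setStudent(null);
    }
    // First call of a pair: configure the other end.
    if (newMentor != null && newMentor.getStudent() != this) {
      newMentor.setStudent(this);
    }
  }
}

class Mentor {
  private Student student;

  Student getStudent() { return student; }

  void setStudent(Student newStudent) {
    if (newStudent == student) return;
    Student oldStudent = student;
    student = newStudent;
    // Release the old student if it still points at this mentor.
    if (oldStudent != null && oldStudent.getMentor() == this) {
      oldStudent.setMentor(null);
    }
    // Configure the other end; this also detaches its previous mentor.
    if (newStudent != null && newStudent.getMentor() != this) {
      newStudent.setMentor(this);
    }
  }
}
```

Even for the simplest multiplicity, each setter interleaves "am I first or second?", "who must be released?" and "set my own side", which is the tangle the proposal aims to separate.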

For our approach to code generation, we followed how industry would implement associations (as two variables, one on each side, instead of introducing a separate association object to maintain the reference). I WANT TO KEEP THIS APPROACH, as I don't like generating new classes for our users, and I also really like that our code looks (in my opinion) like it was written by hand.

As an aside, in industry, associations are usually implemented as

  * Student -> 0..1 Mentor
  * Mentor -> 0..1 Student

Where referential integrity IS NOT maintained explicitly, and the multiplicities are typically 0..1 and the associations directed (-->), resulting in very simple code.
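
For contrast, a sketch of that industry style might look like the following. The class names SimpleStudent and SimpleMentor are hypothetical, chosen only to keep this apart from the Student / Mentor example above.

```java
// Two independent one-way references; neither setter touches the other side.
class SimpleMentor {
  private SimpleStudent student;

  SimpleStudent getStudent() { return student; }

  void setStudent(SimpleStudent aStudent) { student = aStudent; }
}

class SimpleStudent {
  private SimpleMentor mentor;

  SimpleMentor getMentor() { return mentor; }

  void setMentor(SimpleMentor aMentor) { mentor = aMentor; }
}
```

After s.setMentor(m), the call m.getStudent() still returns null unless the caller remembers to set it as well; that is the price of the simplicity.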

My proposal for this work, at a high level is to refactor the code as follows

   // 1: Verify that both sides of the association would be valid following the transition
   >> THE CONSTRAINT CHECKING

   // 2: Have both sides prepare for the association to be applied
   >> REFERENTIAL INTEGRITY TO OTHER OBJECTS (i.e. removing / reassigning existing)

   // 3: Set both sides
   >> REFERENTIAL INTEGRITY TO SELF (i.e. student AND mentor both associated)

   // 4: Ensure valid relationship
   >> i.e. resulting relationship should continue to abide by the constraints

The first part would be refactored into our constraints. The second part is really about removing now-obsolete relationships. The third part sets up both sides of the relationship, after which each side confirms that it is still valid (the fourth part).
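
The four steps could be sketched roughly as below. To give step 1 something that can actually fail, the sketch assumes a different multiplicity than the earlier example: 0..1 Mentor -- 0..2 Student. The helper class Associations and the names assign, canAccept and isValid are all hypothetical, and fields are left package-visible only to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.List;

class Mentor {
  static final int MAX_STUDENTS = 2; // assumed upper bound of the 0..2 end
  final List<Student> students = new ArrayList<>();

  boolean canAccept(Student s) {
    return students.contains(s) || students.size() < MAX_STUDENTS;
  }

  boolean isValid() { return students.size() <= MAX_STUDENTS; }
}

class Student {
  Mentor mentor;

  boolean isValid() { return mentor == null || mentor.students.contains(this); }
}

final class Associations {
  private Associations() {}

  static boolean assign(Student student, Mentor mentor) {
    // 1: Verify that both sides would be valid following the transition
    if (student == null || mentor == null || !mentor.canAccept(student)) {
      return false;
    }
    // 2: Prepare both sides: remove the now-obsolete relationship
    Mentor oldMentor = student.mentor;
    if (oldMentor != null && oldMentor != mentor) {
      oldMentor.students.remove(student);
    }
    // 3: Set both sides
    student.mentor = mentor;
    if (!mentor.students.contains(student)) {
      mentor.students.add(student);
    }
    // 4: Ensure the resulting relationship still abides by the constraints
    if (!student.isValid() || !mentor.isValid()) {
      throw new IllegalStateException("association left in an invalid state");
    }
    return true;
  }
}
```

Because nothing is mutated until step 1 has passed, a rejected assignment leaves both objects untouched, and step 4 should only ever fire if the earlier steps contain a bug.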

We could use reflection for #3, but it's much slower and much harder to read. I will be looking at adding public methods that combine both 3 and 4, so that if you cheat and call 3 directly, a runtime exception would be thrown stating "Hey, you should be calling these methods". We could also add warnings if we observe any of these calls in the Action Language (AL) portions of Umple that we currently parse out directly.

APPROACH

For Part 1, the first work would be in the analysis of the transformations (not updating the code). For Part 2, I would jump right in and start working with the code to simplify the way in which referential integrity is supported. Then, as Part 1 is developed, we would refactor the code generation to look as if it were generated using constraints, to help ease integration work as we start to apply the transforms.

The generated projects should remain almost 100% unchanged -- and this will act as the true litmus test that we have not oversimplified things and broken an edge case.

BUT, the syntactic tests will change quite a bit; and here is how I would proceed with that.

1) For attribute / association tests -- these need to be updated

2) For other tests that just happen to have attributes and associations (e.g. tracing), we would do two things:

  * Ensure we have semantic tests covering the intent of the test (i.e. it doesn't care about the code per se but rather what it does)

  * Refactor the syntax of those tests to use new(ish) fragment syntax assertions (i.e. focus the syntax tests on just the syntax the test is actually making assertions about)
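
As one possible reading of "fragment syntax assertions", here is a small Java sketch of a helper that checks only that a fragment appears in the generated code, ignoring whitespace differences. FragmentAssert and its methods are hypothetical names for illustration; they are not the existing Umple test utilities.

```java
final class FragmentAssert {
  private FragmentAssert() {}

  // Collapse all runs of whitespace so layout changes do not break tests.
  static String normalize(String code) {
    return code.trim().replaceAll("\\s+", " ");
  }

  static boolean containsFragment(String generated, String fragment) {
    return normalize(generated).contains(normalize(fragment));
  }

  // Fails only when the fragment under test is missing.
  static void assertContainsFragment(String generated, String fragment) {
    if (!containsFragment(generated, fragment)) {
      throw new AssertionError("fragment not found: " + normalize(fragment));
    }
  }
}
```

A tracing test, for example, would then assert only on its own trace statements and would survive changes to the surrounding attribute and association code.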