Skip to content
This repository has been archived by the owner on Aug 31, 2021. It is now read-only.

🍄 [[feature-objectuuids]] Add native support for object UUIDs to the engine. #1590

Open
wants to merge 14 commits into
base: develop
Choose a base branch
from

Conversation

peter-b
Copy link
Contributor

@peter-b peter-b commented Dec 10, 2014

This branch has four main parts:

  1. e01edf2, 49736d0, 3cf5f8e, 17326ea: These add a UUID field and accessors to MCObject, and some convenience functions for working with UUIDs and altids.
  2. 39dc486, 1211602, 798826d, 3fd1c1c: These give stacks the capability to cache object UUIDs in a hashtable for quick lookup.
  3. 8232a11: Add searching for objects by UUID.
  4. 563c8dc: This adds an ObjectID class that represents various different ways of identifying classes. They can be concrete (permanently associated with one specific MCObject instance) or delayed (they contain the identifying information, and go and find the appropriate object on request).

There were also a couple of bugs found by @runrevfraser during code review.

Add a function for comparison of UUIDs.
Give objects a UUID field.  The UUID is generated pseudorandomly and
lazily (i.e. only the first time that it's requested via the
MCObject::GetUuid() method).

This is a first pass at the object ID / reference problem.

Cherry-picked from commit 606e7d1.
Adds an MCObject::HasUuid() method that returns true until an object
has had a UUID allocated to it, a const variant of the
MCObject::GetUuid() method, and explanatory comments to the header
file.
Create a new template abstract class MCStackCache with most of the
logic of MCStackIdCache, and make MCStackIdCache into a subclass of
MCStackCache<uint32_t>.

This is to facilitate adding a new MCStackUuidCache class later on.
This will allow MCObject::m_in_id_cache to represent the state of both
ID and UUID caches.  It also allows uncacheobjectbyid() and
cacheobjectbyid() to avoid taking the cache lock unless necessary.
This adds four new classes:

* MCObjectId is an abstract base class that represents a reference to
  an object.  It provides accessors for three different types of
  identifying information: UUIDs, legacy IDs, and names.

* MCConcreteObjectId is an MCObjectId that's permanently and directly
  associated with a particular MCObject instance.  The accessors
  operate directly on the associated MCObject.  The lifetime of an
  MCConcreteObjectId instance is tied to its associated MCObject.
  This class is implemented as an inner class of MCObject.

* MCConcreteStackId is similar to MCConcreteObjectId but specifically
  for stacks, and based on the stack's altid instead of its id
  (because, for historical reasons, stacks are identified by altid
  rather than by id).

* MCDelayedObjectId is an MCObjectId that encapsulates identifying
  information without being tied to a specific MCObject.  At any time,
  its MCDelayedObjectId::Resolve() method may be used to attempt to
  obtain an MCConcreteObjectId.
…ference.

MCDelayedObjectId::Get() can return NULL if the object couldn't be
found.
peter-b added a commit to peter-b/livecode that referenced this pull request Dec 18, 2014
…arr.

Conflicts:
	engine/src/object.cpp
	engine/src/object.h
	engine/src/stack.h
Allows instances to be safely deleted.
Instead of the copy constructors, provide pointer copy constructors
because they're more useful.  In general, we're going to pass
MCObjectId instances by reference rather than by value, so the
implicit copy constructor will probably be good enough.
peter-b added a commit to peter-b/livecode that referenced this pull request Dec 18, 2014
@peter-b
Copy link
Contributor Author

peter-b commented Aug 13, 2015

@montegoulding Is this relevant to your interests?

@montegoulding
Copy link
Member

Awesome @peter-b this would be very helpful. I can't say I've understood it all yet. One question I have because I haven't spotted it yet is are you clearing the UUID when an object is copied and pasted, cloned etc so that next read of the UUID generates a new one? That's a critical factor and one it's difficult to reliably script due to lock messages.

@runrevmark
Copy link
Contributor

I don't think that's been done yet - this is currently a low-level internal API which was to be used by the internal implementation of stackdir IIRC. To leverage it from script it will need some syntax hooks, and hooking into the various mutation operations that need to update the UUID.

@montegoulding
Copy link
Member

montegoulding commented Aug 13, 2015 via email

@peter-b
Copy link
Contributor Author

peter-b commented Aug 13, 2015

@montegoulding: To clarify, whenever you get a new MCObject instance -- no matter how -- it starts off with no UUID. See peter-b@3cf5f8e. Mutation operators would only need to know about UUIDs if they are removing an object and replacing it with a new one that is supposed to represent the exact same object.

@montegoulding
Copy link
Member

That sounds perfect @peter-b

@mwieder
Copy link
Contributor

mwieder commented Aug 14, 2015

It'll take me a while to absorb this too, but I like what I see so far. This is destined for the develop branch... will it be backported to other branches besides 8.0? And how would this deal with the situation with existing stacks without uuids already in place? Do they get added on use? And added to the cache at that time?

@peter-b
Copy link
Contributor Author

peter-b commented Aug 14, 2015

  • I expect this will be LiveCode 8-only, if/when it gets merged.
  • An object gets a UUID the first time you ask it for one, or when you explicitly give it one.
  • The cache is an optimisation. Its behaviour should be invisible (and correct). Just use MCObjectId::Resolve() when you have a UUID and want the corresponding object.

@montegoulding
Copy link
Member

One of the things I wasn’t sure how you were planning to handle in the engine based stack dir format was what to do about object references that were stored in custom properties. This was one reason why I ended up deciding that we would always need some level of scriptable hooks to handle given custom properties and translate them to and from UUIDs… that is of course unless UUIDs are exposed as object references to script and then we just need a script to translate them once in a project.

@peter-b
Copy link
Contributor Author

peter-b commented Aug 14, 2015

@montegoulding: Well done, you have identified one of the big sticking points I ran into. I couldn't find a solution to this problem that would be always unambiguously correct and reliable, unfortunately.

@montegoulding
Copy link
Member

@peter-b my solution was a plugin api which incidentally ended up being used for other things like cleaning stacks of the many custom proper is the ide adds. Would it be feasible to implement them in LCB?

@peter-b
Copy link
Contributor Author

peter-b commented Aug 14, 2015

@montegoulding LCB code is at a lower level than the LiveCode engine and has no visibility of LiveCode objects, so no. What about using script-only stacks?

What I really wanted was to make custom properties that reference other objects actually contain a proper object reference value, rather than a number. Then serialisation is quite trivial! It's easy enough to do when constructing a new stack; you just:

set the customImage of button "myButton" to image "myIconImage"

...and then everything would Just Work (in the IDE, the reference would be shown as an ID, and in the on-disk format, the reference would be stored as a UUID). Unfortunately, that plan did not survive contact with pre-existing stacks.

@runrevmark
Copy link
Contributor

@mwieder: If finished, this would end up only going into LC8. LC6.7 will be EOL'd in due course, and all being well there will be little need to support LC7 when LC8 appears, as they are mostly the same - we've been careful to try and add all the LC8 features without perturbing the main engine too much. i.e. If you don't use widgets/LCB then LC8 is LC7.

@montegoulding/@peter-b: To be honest I never really considered it a sticking point. A man goes to a doctor and says 'it hurts when I move my arm like this', response 'don't move your arm like that'. My glibness here is merely to emphasize a very important point: if the goal is to try and support every single LiveCode stackfile currently in existence well then you might as well go and look for the pot of gold at the end of the rainbow. It's very important that the VCS system minimises any fragility for most stacks otherwise people are going to wind up down dead-ends they can't back themselves out of without redoing work (a bit like if an IDE crashes whilst you are editing code and you then lose it - how does that make you feel about the environment?).

Any external UUID references introduce a significant level of fragility in the system which will require the writing of tools to help people solve problems 'when things go wrong'. Therefore, if there is a mode of coding which people are used to using but there is a choice to do that thing a different way which isn't dVCS averse people will just have to change and update their code so that it can be dVCS'd - indeed the effort put into trying to solve such problems in the dVCS system itself would probably be much better off put into writing tools that help people to move their stackfiles forward so they can be dVCS'd.

Of course, if we have the UUID metadata in the engine, and the 'stackarr' method of encoding/decoding objects as LiveCode arrays then it means that lcVCS can sit on those features and choose how to support some things itself. The plugin system you already have works perfectly well so its not a bad solution, its just perhaps not as robust as one would hope (which, admittedly is because external UUID references are not robust) - however, perhaps the balance for older stacks is fine. As long as people understand the pitfalls its at least something they can move forward whilst still using VCS.

(By the way, by 'robust' here I mean that if you make a mistake and something happens to objects on one of your stacks, or there's a merge conflict coming about because a UUID change or something like that then it is very difficult for a human to sort things out. Chances are in any scenarios like that you'd end up having to revert branches and re-apply the changes to get back into a sane state. This is why I think it is very important to ensure that external UUID references are only required in very specific cases and very rarely to ensure that most projects would never encounter the problem. I do think that almost all cases can be covered by: encouraging people to name their objects well and uniquely, ensuring that whenever they do need to reference an object they do so via objectname/cardname/stackname triples, any use of ids is entirely at runtime for the purpose of efficiency).

@montegoulding
Copy link
Member

Sent from my iPhone

@montegoulding LCB code is at a lower level than the LiveCode engine and has no visibility of LiveCode objects, so no. What about using script-only stacks?

Oh I assumed it would eventually support object manipulation. Script only stacks would be fine. I guess the engine could load the stacks from a specific directory then call the hooks on each stack for each object.

What I really wanted was to make custom properties that reference other objects actually contain a proper object reference value, rather than a number.

@runrevmark mentioned the same thing a long time ago. I'm not adverse to putting something better in place and telling folks they need to use that if they want to do vcs

@runrevmark
Copy link
Contributor

@montegoulding: A more accurate statement would be to say that: LCB does not currently have direct 'supported' support for manipulating the LC object tree. However, that is not to say that an introspection API would not be useful to add. In fact, it would be quite helpful. We just don't have one yet. (Beyond 'script object' support in widgets which is to be avoided if at all possible because it can cause horrendous re-entrancy issues if used incorrectly - an introspection API wouldn't suffer from this as it would have to be designed to avoid executing LiveCode script at any point!).

In terms of the 'something better', then the new architecture in 7+ would support such a thing as it stands. It would be a new ValueRef type which encodes the idea of an object ref directly rather than as a string. In the first instance we would probably want to make creation of such things explicit - for precisely the use-case embodied above. i.e. So you can store references to objects in custom properties which the VCS system can then choose how to serialize and unserialize. (Used anywhere in LCS, it would render itself as the string you might expect it to).

@montegoulding
Copy link
Member

@runrevmark I'm liking the sound of this. What happens to the object reference if the object it's pointing to doesn't happen to be loaded in memory and either export wants to get its UUID or script wants to get a rugged or long ID etc.

@runrevmark
Copy link
Contributor

@montegoulding: If you don't want to hurt, don't do that :)

I think the use-cases there need to be analyzed more as I'm not sure code which does such a thing arbitrarily actually makes sense.

You cannot get a long id or a rugged id of an object which is not in memory. So a script attempting to do that on a reference which is not resolvable would get an error now. (i.e. the long id of "control "whichdoesnotexist" is a runtime error). This also suggests that getting such a reference as a string is also a runtime error. (the long id of tControlDoesntExistLongId is also a runtime error).

You can pass the reference around untouched and unmanipulated.

You can make the 'there is a' operator work on an unbound reference. It would return false.

When serializing, the reference obviously internally has the info to 'eventually' resolve it. Obviously it is fine if it is being serialized into a VCS format; if it is being put into a binary stackfile then its not clear to me right now what should happen. Perhaps it should be serialized as a 'defunct' object reference, or (if it makes sense) just serialize the UUID external reference.

Of course, one could argue that unresolved references should be resolved at the point of first use. A bit like if you use a long id which references a stack which is not in memory - the stack will be loaded. This might be another way to tackle things. It reduces the problem of unresolved objects refs to the case where it is not possible to resolve ever; rather than 'just no' which is probably to be considered a runtime error in all cases.

@peter-b
Copy link
Contributor Author

peter-b commented Aug 14, 2015

@runrevmark, @montegoulding : I already have the new custom MCValueRef type in my stackarr branch. It wraps an MCDelayedObjectId -- see commit d6abfe6.

@montegoulding
Copy link
Member

Hmm... when we get the behavior of an object and the behavior object isn't in memory we still get a meaningful result and no error is thrown. It seems if we wanted to be able to that we would need to retain both the UUID for internal use and the ID and stack for script use. Are we working too hard in order to not expose UUIDs as an optional object reference for scripters to use?

@runrevmark
Copy link
Contributor

@montegoulding: Indeed, but in that situation the application is basically in an error state. Remember that we can restrict the need for UUID references by ensuring that better referencing is used in such properties. i.e. name/card/stack triples. Thus the case where you might get a UUID reference which cannot be turned into a string would be quite rare.

Alternatively, the UUID reference also stores the 'last' human-readable resolution. That at least would give the user a chance to work out what's gone away, or what object's managed to screw up its UUID, or indeed what stack has had that object deleted from it.

@montegoulding
Copy link
Member

Hmm.. I guess you could throw an error while saving the stackdir if there is a non-unique object/card/stack name... I know I'd find that tedious unless I make myself a frontScript that assigns uuids as names but that's even worse than exposing UUIDs to script in the first place...

@runrevmark
Copy link
Contributor

@montegoulding: My main problem with UUIDs is that they offer no useful information at all. As soon as you have one which cannot be resolved it is useless. How do you find out what it should be attached to? This is why I am pretty sure that exposing them to script is a mistake - there has to be something human readable in a reference so a human has some chance of sorting out situations where 'things break'. Behaviors suffer this problem already - so it suggests there is something wrong with how we expect references to be expressed.

I don't see the requirement for any checking for uniqueness. It is something that developers need to be aware of - make sure you name things sensibly if you want to reference them from other stacks. I'm not sure why this seems like such an anathema in LC. C source files can be called whatever you want - but I don't seem any projects where you have filenames like 2387189cdef738.cpp.

@runrevmark
Copy link
Contributor

Btw, I should say that as a mechanism UUIDs are fine - but I'm not sure we've entirely got to the bottom of the problem exposing them is meant to solve. Perhaps its not surprising, control references in LC as it stands are somewhat inconsistently implemented and that's perhaps there is something to sort out there first to ensure that some of these issues we keep imagining in VCS don't come up.

(We'll expose them if we absolutely have to - however I'd rather not do so except as a last resort - if they are there people will use them, and I honestly don't think they are a wise thing to use)

@livecodestephen
Copy link
Contributor

I haven't really been following the ins and outs of this discussion, but can I point out that this isn't just a problem where 'things break'. If you're reviewing a diff from, say, a pull request and the developer has a reference from one UUID to another, then I have no idea what they've done, or if it's correct. And, if the UUID is in the filename I won't easily know where a change is.

@runrevmark
Copy link
Contributor

@livecodestephen: There's some discussion on the use-list about that under the BAF thread - it might be worth having a read. However, yes, that is one the main concerns about using UUIDs in the on-disk format - that could potentially be mitigated by appropriate tools though.

@montegoulding
Copy link
Member

I think using object names will work as long as:

  1. we throw an error when saving as stackdir if there is a non-unique stack/{card|sharedGroup}/control name (it's better to inform the developer that they made a mistake that will break their project here rather than overwrite one object with another silently etc
  2. all locations that currently use ID (icon, pattern, behavior, custom properties) start using names instead
  3. both (all) file formats need to retain the names not map to UUIDs or ID and stack name etc then back at runtime so that objects that are not in memory can still be referenced

The latter two conditions are bigger changes to the platform than I have been thinking would be acceptable. 2 particularly would require everyone (not just people that are prepared to make changes for VCS) to name referenced objects uniquely. I guess we could have iconName properties etc where you can set one or the other though?

@montegoulding
Copy link
Member

@peter-b Did you have a plan for dealing with resetting object properties to default values prior to export to avoid unnecessary merge conflicts? In lcVCS I dispatch lcVCSExport to the object being exported so that it can make itself presentable. I guess the engine could do the same thing. Mind you one of the reasons I ended up moving to calling a CLI to do the export was to avoid having to do that to objects you are working on just to make a commit. The CLI is called after a stack save and loads the stackFiles itself to export them so that these lcVCSExport handlers are run outside the IDE where you are working on your stacks.

@peter-b
Copy link
Contributor Author

peter-b commented Aug 14, 2015

@montegoulding This is something that we need to solve more generally in the engine/IDE, because it affects all stacks, not just exported ones!

@montegoulding
Copy link
Member

@livecodestephen There's a lot of very long emails in that thread and most of them are unrelated. The nuts of it is that we could have either (or both) code review tools built into the IDE or a web app that worked as a github service hook to we could do more meaningful code review outside github. I'd lean towards just doing it in the IDE personally. github api means we can list open PRs and check them out via refs/pulls and diff them locally against whichever branch they were against.

@montegoulding
Copy link
Member

@peter-b until now it hasn't really been a problem until deployment which is one of the reasons why I implemented savingStandalone. A scripted solution seems most logical to me because I can't imagine the tedium of setting default properties on objects (particularly on something like a resizable window with a relatively complex UI).

@peter-b peter-b closed this Apr 13, 2017
@peter-b peter-b deleted the feature-objectuuids branch April 13, 2017 12:10
@peter-b peter-b added the WIP label Apr 13, 2017
@peter-b peter-b restored the feature-objectuuids branch April 13, 2017 12:35
@peter-b peter-b reopened this Apr 13, 2017
@peter-b peter-b changed the title [[feature-objectuuids]] Add native support for object UUIDs to the engine. [DEAD][[feature-objectuuids]] Add native support for object UUIDs to the engine. Apr 13, 2017
@peter-b peter-b changed the title [DEAD][[feature-objectuuids]] Add native support for object UUIDs to the engine. 🍄 [[feature-objectuuids]] Add native support for object UUIDs to the engine. Apr 13, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
5 participants