Recording historical data #43

Closed
stuartpb opened this issue Jun 16, 2015 · 21 comments

Comments

@stuartpb
Member

So oDesk rebranded as Upwork and has moved the domain.

Going forward, how should these rebrands work? (What I've been doing, just moving the profile and changing my own uses of it, is clearly not viable going forward.) I'm thinking the old domain should have a migrate field that is like use but signals that data tied to the old domain should be moved to the new one.

For backreferencing, I'm thinking profiles should maybe also have a formerly array (Blotpass would then use this array like use).

Maybe instead of migrate, it should be moved.to, with maybe a moved.on that states the date that the move occurred.
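
As a rough sketch of how the forward link and the backreference could relate, assuming profiles are loaded as plain dictionaries keyed by domain: the moved.to, moved.on, and formerly names come from the proposal above, while the `derive_formerly` helper, the dictionary layout, and the date value are hypothetical placeholders.

```python
# Hypothetical in-memory form of two profiles, keyed by domain.
# Only the field names moved.to / moved.on / formerly come from the proposal;
# the date is a placeholder, not a verified value.
profiles = {
    "odesk.com": {"moved": {"to": "upwork.com", "on": "2015-01-01"}},
    "upwork.com": {},
}


def derive_formerly(profiles):
    """Build each profile's `formerly` list from the `moved.to` forward links."""
    formerly = {}
    for old_domain, profile in profiles.items():
        target = profile.get("moved", {}).get("to")
        if target is not None:
            formerly.setdefault(target, []).append(old_domain)
    return formerly


backrefs = derive_formerly(profiles)
print(backrefs)
```

Deriving the backreferences this way would mean only the forward link is maintained by hand.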

@stuartpb
Member Author

OTOH, I don't like having the profiles include a graveyard of stub and stump profiles for dead domains (not to mention the DRY violation in manually maintaining forward and backward links). Profiles are supposed to be for the current, live state of sites - history is what Git is for.

I'm thinking maybe a "histories" (or "moves" or "renames" although those names are confusing) directory next to "profiles" that contains data for this kind of historical migration info (with documents that pretty much only include moved).

@stuartpb stuartpb added this to the v0.1.0 milestone Jun 16, 2015
@stuartpb
Member Author

This needs to get figured out so I can review/update the oDesk profile.

@stuartpb
Member Author

I really want to call this "migrations", but migrations already has a connotation for schemas, so that's not really a solid plan.

@stuartpb
Member Author

Maybe "tracks" or "trails", ie. ones that you'd follow to find where something had moved?

@stuartpb
Member Author

I think a "moves" directory, listing significant historical domain changes, is pretty concise. Maybe some synonym for "former names" as the files would likely just be arrays of old names, and the (approximate) dates that they moved to the next one.

@stuartpb
Member Author

Don't forget that sometimes sites don't move so much as they just change their provider (ie. the use field).

@stuartpb
Member Author

histories makes sense.

@stuartpb stuartpb changed the title from "Migrating domains" to "Recording historical data" Jan 25, 2016
@stuartpb
Member Author

It'd be kind of cool / useful to record changes in password rules and the other minutiae as well, but these things are usually sparsely-to-not-at-all documented beyond what we write here (in contrast to domain changes, which are high-profile and usually entail at least an announcement). Best to leave details like that to the Git history.

@stuartpb
Member Author

Once we move to more than just profiles, I think it'll do to split docs into "types.md" (documentation for types, as described in #42, for structures like localized-text-arrays ie #12), and "fields/", containing "profiles.md" (the current docs/fields.md) and "histories.md" (for the structure of items in histories, natch). Also "stacks.md" or whatever ends up developing for #5 (if ever).

@stuartpb
Member Author

To recap why it'll be "histories" and not "moves" or something like that, histories will list changes beyond just moves and renames, such as retitlings (on the same domain), provider (use) changes (possibly involving third-party auth transitions, see also #31), and shutdowns (aka the "gravestones" mentioned above).

Examples of provider changes: YouTube moving to Google accounts, TweetDeck moving to Twitter accounts.

@stuartpb
Member Author

stuartpb commented Mar 8, 2016

Migrations are frustrating when they incorporate grandfathered accounts. coderbits.com is one example of this: registrations are now only accepted through topcoder.com, but old accounts set up through coderbits.com are still valid. Does this mean coderbits.com's old password rules etc. should still be exposed somewhere? Should they be removed if, at some point in the future, coderbits.com no longer accepts old accounts?

@stuartpb
Member Author

stuartpb commented Mar 8, 2016

I think "histories" should actually be called "legacies", to reflect that it's only there to reflect changes that affect data in use today (legacy data).

@stuartpb
Member Author

stuartpb commented Apr 24, 2016

Okay, so I'm certain I want to do this as a legacies directory of YAML files with names matching domains in profiles. What's the structure of the content?

I'm thinking each file is a list of objects with an on (or at) field stating the date of the event, and other fields describing the changes that occurred on that date.

There should probably be an "established" event at the start, so the histories can be worked back...

Actually, no: I think "legacies" should store data for domains at their old names (or non-move data at the domain's current name), with the events describing what happened to that name (ie "moved" or "merged" or whatever transformation).

Similarly, other changes will probably be of a form where the object describes what the site, at that moment, was "formerly". (Domain moves are the one exception, due to the site's old domain being an inherent part of the file, and not the new one.)

stuartpb added a commit that referenced this issue May 18, 2016
The legacy (#43) of Behance accounts before the Adobe migration is unknown, but the site doesn't seem to accept legacy accounts today.
stuartpb added a commit that referenced this issue May 18, 2016
CondoInternet (now Wave G) uses an entirely different account system today, presumably the same one as Wave Broadband. The legacy (#43) of accounts on condointernet.net is essentially nonapplicable (accounts were created using email addresses as usernames, and passwords require a reset).

The new account system will be profiled separately.
This was referenced Feb 7, 2017
@stuartpb
Member Author

stuartpb commented Feb 7, 2017

Tying in versioning (#7): I guess legacies have to be in lockstep as a part of a whole suite of schemas that get versioned together, since they'll probably be part of the API, which is versioned by schema.

@stuartpb stuartpb reopened this Feb 19, 2017
@stuartpb
Member Author

This has me kind of blocked on #19; I feel weird moving oDesk's profile without a legacy record. That said, I'll just have to keep it in issues for now.

@stuartpb
Member Author

Ugh, I also don't know what to do with compose.io, which has moved, at least in marketing, to compose.com, while the old domain is still where a lot of the nuts and bolts seem to be operating. Should compose.com say it's a "use" for compose.io right now, flipping them around after Legacies are worked out enough to handle the switchover?

@stuartpb
Member Author

Or should compose.com just be used as the profile's canonical domain now, before there'd have to be any legacy transitioning anyway? (ie, does v0.1.0 want to ship with / get blocked on this issue hanging over it?)

@stuartpb
Member Author

To make remembering this easier (and to allow for a little more leeway), each item will have a circa field instead of on or at: that way, even if the date isn't precise, the field name still conveys what the value means.

@stuartpb
Member Author

moved will still get logged as moved.to, so that any other future details of the move can be kept under other children of moved.
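
A minimal sketch of how a consumer might follow these records, assuming the legacies files have already been parsed into a dictionary keyed by old domain. The circa and moved.to fields are the ones described above; the `current_domain` helper, the in-memory layout, the oldest-first ordering, and the date value are assumptions made for illustration.

```python
# Hypothetical parsed content of the legacies directory: one list of events
# per old domain. `circa` and `moved.to` are the fields described above;
# the date value is a placeholder.
legacies = {
    "odesk.com": [
        {"circa": "2015", "moved": {"to": "upwork.com"}},
    ],
}


def current_domain(domain, legacies):
    """Follow moved.to links until reaching a domain with no recorded move."""
    seen = set()
    while domain not in seen:
        seen.add(domain)
        moves = [e["moved"]["to"] for e in legacies.get(domain, []) if "moved" in e]
        if not moves:
            return domain
        domain = moves[-1]  # assumes events are listed oldest first
    raise ValueError(f"cycle in legacy moves at {domain}")


print(current_domain("odesk.com", legacies))
```

Keeping the target under moved.to rather than a bare moved string means the lookup above would not need to change if other children of moved are added later.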

@stuartpb
Member Author

stuartpb commented Feb 20, 2017

Another thing to keep in mind in the design of legacies, though it doesn't apply right now: care is needed to distinguish cases where the accounts at a name moved to a different site from the one that currently occupies that name, like when live.com transitioned from an alias for Live365 to the hub for Windows Live.

This is as much a consumer concern as an editor concern, if not more so.
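
One way a consumer might handle a reused name, sketched under the assumption that it knows roughly when an account was created and that circa values can be compared as dates. The `domain_for_account` helper, the in-memory layout, and the date in the record are hypothetical placeholders, not part of any agreed schema.

```python
from datetime import date

# Hypothetical legacy record for a reused name; the circa value is a
# placeholder, not the real date of the transition.
legacies = {
    "live.com": [
        {"circa": date(2005, 1, 1), "moved": {"to": "live365.com"}},
    ],
}


def domain_for_account(domain, created, legacies):
    """Return where an account created on `domain` at `created` lives now."""
    for event in legacies.get(domain, []):
        # A move only applies to accounts that existed before it happened.
        if "moved" in event and created < event["circa"]:
            # From the move date onward, the account belongs to the target.
            return domain_for_account(event["moved"]["to"], event["circa"], legacies)
    return domain


print(domain_for_account("live.com", date(2003, 6, 1), legacies))
print(domain_for_account("live.com", date(2010, 1, 1), legacies))
```

Under these assumptions, an account that predates the recorded move resolves to the old site's new home, while a later account stays with whatever site currently occupies the name.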

@stuartpb
Member Author

stuartpb commented Feb 20, 2017

Anyway, I'm going to go ahead with this, using the circa and moved.to object lists I just described.

This also means the docs are going to move a bit, not quite #111 though (as #42 and #111 have largely been superseded by #146 and #147).

#111 was about changing the docs and tooling to match. Since #146 changed the plan so that tooling will no longer be tied to docs, #111 is now mostly about a natural structure for the documentation once it is decoupled. For this to move forward with the current tooling implementation, though, there will need to be a slight change to represent that fields.md refers to fields in profiles and not legacies.

Or, actually, nah, I'll just leave it there, and have the documentation for legacies in the README for now (and fly blind in profiling legacies with no testing until proper-schema-validation is introduced).
