Remove batch operations from the specification #76

Closed
acoburn opened this issue Mar 17, 2017 · 24 comments · Fixed by fcrepo/fcrepo-specification-atomic-operations#1

Comments

@acoburn
Contributor

acoburn commented Mar 17, 2017

No description provided.

@ajs6f
Contributor

ajs6f commented Mar 18, 2017 via email

@ajs6f
Contributor

ajs6f commented Mar 18, 2017

@birkland
Contributor

At this point, the spec is ambiguous as to the purpose of batch operations. The use cases on the wiki look out of date, or only minimally related to batch operations. It's hard to tell.

I will note that any use cases for Fedora 4 transactions I've had in the past have all been related to rollback functionality; i.e. if an unrecoverable error is encountered while performing a related sequence of operations, it is convenient to just throw away everything that just occurred and leave the repository in a clean state.

There's a good discussion here from January 2016 on fedora-community and fedora-tech, which I think led to the consensus weakening of "transactions" to "batch atomic operations" in the first place. For some reason, "transactions optional" and "memento for versioning" messages are intermingled with one another in the thread:
https://groups.google.com/forum/#!topic/fedora-community/LAsEn46N5N0

There were some use cases from ICPSR (from @ummerh) and from UVa.

@ruebot
Contributor

ruebot commented Mar 20, 2017

👍 for moving batch operations to MAY

@dannylamb
Contributor

Being one of the people that originally provided a use case for batch operations, I must confess that I've had a change of heart over the course of Islandora development. I've significantly altered my approach to avoid them, and will rely on transactions either at the JMS level or some other Camel tricks to ensure I can restart failed multi-step workflows safely. I lose the ability to roll back (e.g. the half-baked structures will still exist until I fix my problem), but find that trade-off acceptable if it eases scaling.

BUT, I think they are still of merit to others, and am sure that many Fedora users are currently relying on them. So 👍 for MAY, since those using the reference implementation right now will still want them.

@ajs6f
Contributor

ajs6f commented Mar 20, 2017

-1 to using MAY for any entire section of the spec. Either remove it (possibly creating a separate spec for it) or don't. MAY should be reserved to qualify or explicate behavior within a particular section of this spec.

@zimeon
Contributor

zimeon commented Mar 20, 2017

Batch/atomic transactions are certainly a high bar which should be justified by strong use cases (and it seems that isn't the case). Whether made MAY or removed to a separate spec, support is already quite testable: send a request with an Atomic-Start header and see whether you get an Atomic-ID back. If not, then it is not supported.
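
A minimal sketch of that check, for illustration only. The header names `Atomic-Start` and `Atomic-ID` come from the discussion; the use of POST, the header value `true`, the empty body, and the function names are assumptions, not anything the draft spec is known to mandate. Since a server that ignores the header may simply create a resource, this is only suitable against a test repository.

```python
from typing import Mapping
import urllib.error
import urllib.request


def has_atomic_id(headers: Mapping[str, str]) -> bool:
    """Return True if any header name equals Atomic-ID, ignoring case."""
    return any(name.lower() == "atomic-id" for name in headers)


def probe_atomic_support(url: str, timeout: float = 10.0) -> bool:
    """Send one request carrying Atomic-Start and look for Atomic-ID in the reply.

    Assumptions (hypothetical, not taken from the spec): the probe is an
    empty-bodied POST and the header value is "true".
    """
    request = urllib.request.Request(
        url,
        data=b"",
        method="POST",
        headers={"Atomic-Start": "true"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return has_atomic_id(dict(response.headers.items()))
    except urllib.error.HTTPError as error:
        # An error status still carries headers; inspect them the same way.
        return has_atomic_id(dict(error.headers.items()))
```

A compliance tool would call `probe_atomic_support` once per repository root and skip its atomic-operation tests when the result is `False`.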

@bseeger

bseeger commented Mar 20, 2017

Following this and am -1 to using MAY for an entire section of the spec as well.

@ajs6f
Contributor

ajs6f commented Mar 20, 2017

@zimeon The problem is not test-ability. It is replace-ability. If the test is negative, the client is now on the hook to operate in some (non-obvious, case-specific) manner that obviates the need for the facility. If that is the practical upshot, what is the value of an "optional" facility? Clients that expect to act against a generic Fedora API will have to work out their own "atomicity" facilities for their use cases. No assumption is possible.

I'm not disagreeing with your claim about test-ability. I am claiming that "made MAY" and "removed" are not both reasonable choices.

@zimeon
Contributor

zimeon commented Mar 20, 2017

@ajs6f - I was just looking at whether a compliance tool (or client) could tell cleanly whether the facility for atomic operations was supported. I think it is OK, and I think that the same logic applies whether this is MAY (in which case one could consider a stronger requirement that non-supporting implementations give 501 in response to any request with an Atomic-Start header) or a separate spec (in which case it isn't reasonable to demand anything extra from an implementation that implements only "core" and not the "atomic extension"). I agree that a "work around" to provide atomicity over a non-atomic Fedora is no easy thing.
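
To make the difference between the two options concrete, here is a hedged sketch of how a client might classify the response to a request that carried an Atomic-Start header. The enum and function names are hypothetical; the only inputs taken from the discussion are the `Atomic-ID` header and the suggested 501 status.

```python
from enum import Enum
from typing import Mapping


class AtomicSupport(Enum):
    SUPPORTED = "supported"
    EXPLICITLY_UNSUPPORTED = "explicitly unsupported"
    NOT_ADVERTISED = "not advertised"


def classify_atomic_response(status: int, headers: Mapping[str, str]) -> AtomicSupport:
    """Classify the reply to a request that was sent with Atomic-Start."""
    names = {name.lower() for name in headers}
    if "atomic-id" in names:
        # The server handed back an identifier, so it supports the facility.
        return AtomicSupport.SUPPORTED
    if status == 501:
        # Only meaningful under the MAY option with the stronger 501 requirement.
        return AtomicSupport.EXPLICITLY_UNSUPPORTED
    # Under a separate spec, a core-only server may ignore the header entirely.
    return AtomicSupport.NOT_ADVERTISED
```

Under the separate-spec option a client can only ever distinguish `SUPPORTED` from `NOT_ADVERTISED`; the 501 branch exists only if the stronger MAY requirement is adopted.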

IMO, the MAY vs separate spec choice is really a style issue. However, I lean toward removing it to retain a cleaner notion of what complying with the Fedora API means.

@whikloj
Contributor

whikloj commented Mar 20, 2017

I'll shed a tear for the time spent trying to help with this part of the specification, but it seems to me that removing it entirely is the best choice.
So burn it down!!!

@birkland
Contributor

birkland commented Mar 21, 2017

It would be really nice if it were possible to narrow the scope or functionality enough such that batch atomic ops weren't such a burden. From a client perspective, I think atomic batch ops help to reduce the cognitive load of building applications against the repository. Absent sufficient narrowing, removal is the next logical option.

@ajs6f
Contributor

ajs6f commented Mar 21, 2017

The alternatives appear to be keep it or remove it entirely. I have put the question on the next tech meeting agenda.

@awoods
Collaborator

awoods commented Mar 22, 2017

@no-reply? @barmintor? From the perspective of your own Fedora implementations, it would be valuable to get your thoughts on maintaining or removing the Batch Atomic Operations element of the specification.

@barmintor
Contributor

This is why I am more interested in specifying conformant behaviors for messaging, batch, versioning, etc. and their advertising than I am in talking about whether a particular behavior is required or not, which I think is the client's business.

@barmintor
Contributor

For my part, using a Blazegraph backend gives you some tx support that makes this spec look pretty achievable, tho like MODE the binary component is a problem. On the other hand, if it's not being used in Hydra I would not put a high priority on its implementation.

@peichman-umd

At UMD, we have relied on transactions in our batch loading process to make the logic in the client simpler and easier to follow. To that end, I echo what @birkland said above about atomic operations significantly easing the cognitive load of client development, especially long-running batch processes which may be running unattended for hours.

However, I do understand @acoburn's point about transactions being difficult or impossible to do in distributed or horizontally-scaled implementations. I certainly don't want the Fedora spec to stand in the way of these sorts of implementations.

I don't share @ajs6f's strong objection to making atomic operations a MAY level requirement, though I can see the potential for "optionality creep" eventually muddying the spec.

My proposal: If we remove atomic operations from the Fedora spec, move the atomic operations section into its own "microspec" (e.g., "Atomic-LDP"). Or, find a way to model transactions that fits with an existing REST/LDP spec. That way, implementations that want to support atomic operations can have a standard that describes how to do it, but it is not part of the Fedora API so implementations like @acoburn's would not be burdened with implementing it.

(Unfortunately, I cannot be at today's tech call, as I am at the DC FUG right now.)

@barmintor
Contributor

@acoburn has provided a good example of difficulties in specifying merge behavior here.

@ruebot
Contributor

ruebot commented Mar 23, 2017

Based on today's discussion in the Fedora Tech Call, I'll put in a PR later today that pulls out Atomic Batch Operations. @awoods, can you create a new repo for "Atomic-LDP"? I'll do the initial PR there with what was removed, along with respec boilerplate.

ruebot added a commit to yorkulibraries/fcrepo-specification that referenced this issue Mar 23, 2017
@ruebot
Contributor

ruebot commented Mar 23, 2017

PR: #79

@awoods
Collaborator

awoods commented Mar 23, 2017

@ruebot : does it need to be a new repo? or just a new document in:
https://github.com/fcrepo/fcrepo-specification ?

@ruebot
Contributor

ruebot commented Mar 23, 2017

@awoods I think it should be a new repo, since the idea -- at least my interpretation of the meeting and @peichman-umd's above comment -- is that this is a separate specification.

@awoods
Collaborator

awoods commented Mar 23, 2017

@ruebot : here it is: https://github.com/fcrepo/fcrepo-specification-atomic-operations

@ruebot
Contributor

ruebot commented Mar 23, 2017

awoods pushed a commit that referenced this issue Mar 24, 2017
* Remove "Atomic Batch Operations"
* Resolves #76