Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction Support #67

Merged
merged 38 commits into from Nov 3, 2022
Merged

Transaction Support #67

merged 38 commits into from Nov 3, 2022

Conversation

mikejritter
Copy link
Contributor

JIRA Ticket: https://fedora-repository.atlassian.net/browse/FCREPO-3673

What does this Pull Request do?

Adds support for transactions to the client.

What's new?

  • Add startTransaction for creating transactions from the client
  • Ability to retrieve the transaction uri from FcrepoResponse
  • Helper for adding Atomic-Id to requests

How should this be tested?

  • TODO

Additional Notes:

Currently this has a minimal set of changes for supporting transactions. I was considering adding methods for other api operations on transactions (commit, rollback, refresh), but they would be wrappers around get, post, etc, so I'm not sure how much value they would add. Depending on how much support is wanted in the client, more could be added.

Also I'm still in the process of writing integration tests, so for now I'm putting this up as a draft.

Interested parties

@fcrepo/committers

@mikejritter mikejritter marked this pull request as ready for review May 5, 2022 15:02
Copy link
Contributor

@bbpennel bbpennel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be very helpful to add an integration test to verify that the transaction functionality works, and to demonstrate what its usage looks like so we can get a feel for if it seems usable.

In our local usage of transactions at UNC against fcrepo 5, we have a more stateful arrangement where a transaction is started and then stored to the current thread and applied to all fcrepoClient requests in the same thread until the tx ends. I wouldn't want to build this into FcrepoClient since it needs to be more flexible, but I am wondering if it would make sense to provide a way to supply a txUrl to multiple requests without the user needing to call addTransaction on all of them.

A semi-stateful approach could be to provide a short lived TransactionalFcrepoClient that could be initialized with a txUri which it would add to all RequestBuilders it produces. This could have the additional benefit of preventing the client from attempting to start another tx from inside of a tx (for example, if the client had a method which creates a Binary resource with some properties in a tx, which is sometimes called from another method which creates a Container resource wrapping a binary resource all as one tx). A simpler approach could be to add a fcrepoClient.supplyTxUrl(Supplier<String>) method, which could allow users to supply a function that would return a txUrl when appropriate, but it might be less flexible.

Or maybe this is all overkill and should be left up to client to wrap around the fcrepoClient. What do you think?

src/main/java/org/fcrepo/client/FcrepoResponse.java Outdated Show resolved Hide resolved
src/main/java/org/fcrepo/client/RequestBuilder.java Outdated Show resolved Hide resolved
src/main/java/org/fcrepo/client/FcrepoClient.java Outdated Show resolved Hide resolved
@mikejritter
Copy link
Contributor Author

@bbpennel I pushed some changes for the comments you gave that were easy to hit. I think providing the Transaction API endpoints through a transaction method on the client ended up being a nice way to get support for them.

The TransactionalFcrepoClient is something I had been thinking about as well, but never ended up being much more than a thought. My idea was that when a transaction was started, the client would retrieve the Atomic-Id from the response and append the transaction to any request so long as it was still valid. I wasn't sure what error handling would look like in any failure scenarios, and at the time I had a limited understanding of the Transaction API. I think it's something worth investigating further, and at least get support in the client in some form.

@bbpennel
Copy link
Contributor

I'm not sure if TransactionalFcrepoClient would need to handle error cases internally, my guess is the user would probably want to control when transactions get rollback or committed.

@mikejritter
Copy link
Contributor Author

@bbpennel I thought I had pushed up changes the other week for the test updates, but I guess that never happened. When I was looking at the tests last week I also became less satisfied with how the validation of the endpoint is happening and ended up making a few changes so that you could not do things like fcr:tx/uuid/some-extra-info to cause the request to fail. This was mostly just expanding the regex to be similar to what's found in fcrepo.

Looking at this today I still don't really like using the raw URI, and am looking at wrapping the Transaction URI location in its own class which would be returned by FcrepoResponse. I'm also going to try and take some time this week to see if I can get a way to supply the txUrl to the client like you mentioned earlier. As I was moving things around today it felt like something we can do in these changes.

@mikejritter
Copy link
Contributor Author

@bbpennel When looking at adding a way to have a TransactionFcrepoClient, I was thinking we won't want to share the HttpClient in case it is closed by any client. We could instantiate a new instance in the FcrepoClientBuilder, which would be similar to how the client currently is created and an easy pattern to adapt. I'd say we could add the transaction id to the FcrepoClient, but then we either need to have the user remove the transaction id later or handle it ourselves (maybe after commit or rollback are called).

I was trying to understand how the fcrepoClient.supplyTxUrl(Supplier<String>) might work, but I'm not entirely certain how a user could use it other than retrieval of a single transaction id.

@bbpennel
Copy link
Contributor

Makes sense for them to have their own HttpClient. Having a transaction id on the regular FcrepoClient seems difficult to manage, particularly if an application using any shared FcrepoClient instances, so I would advocate for only have transaction state stored in a dedicated TransactionalFcrepoClient.

Took me a minute to remember what I meant with the fcrepoClient.supplyTxUrl(Supplier<String>) thing. What I was thinking was that the user would define a customer method that the fcrepoClient would call any time it was instantiating a request builder, where the return value from the Supplier would be a transaction id or null.

One use case would be similar to my local usage, where we bind transactions to threads. So the it might look something like:

ThreadLocal<String> txIdThread = new ThreadLocal<>();
fcrepoClient.supplyTxUrl(() -> {
    return txIdThread.get()
});

try (final var response = fcrepoClient.transaction(new URI(SERVER_ADDRESS)).start().perform()) {
     txIdThread.set(response.getTransactionUri());
}

// Any requests using this fcrepoClient in the current thread will be in the same transaction

// Commit the tx, clearing the tx from the current thread
try (final var response = client.transaction(new URI(location)).commit().perform()) {
      txIdThread.remove();
}

(note, I'm writing this in my browser so it might not be totally accurate)
This approach requires more from the user, since they have to decide/manage when to supply a transaction id, and manage its lifecycle separately. So I feel like having short lived TransactionalFcrepoClient clients is easier to use, but slightly less flexible. The two could probably coexist, but I don't know if its worth the complexity. i'm also not sure if its more common for people to use shared FcrepoClient instances, or make new ones whenever one is needed.

Copy link
Contributor

@bbpennel bbpennel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry I had these pending comments sitting around for a while

@@ -55,6 +58,20 @@ public void testTransactionCommit() throws Exception {
location = response.getTransactionUri().orElseThrow(() -> new IllegalStateException("No tx found"));
}

final var transactionalClient = FcrepoClient.client()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there might be a way to have the original client produce produce the tx client somehow. That could save having to pass credentials and other details around if they could simply be cloned from the originating client. Although it is also still good to have a way to directly produce a transactional client like this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of storing the credentials in the client so that we could do this. Agreed that it makes sense to have the client create the transactional client, which is what I initially wanted.

Updates the FcrepoClient constructor to take a HttpClientBuilder in order to construct new clients
src/main/java/org/fcrepo/client/DeleteBuilder.java Outdated Show resolved Hide resolved
* @param uri the Transaction to add to each request
* @return a TransactionFcrepoClient
*/
public TransactionalFcrepoClient transactionalClient(final FcrepoResponse.TransactionURI uri) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add a test of this method. I am also wondering if it would make sense to add a startTransactionalClient method to the TransactionBuilder, which would call start and then return a new TransactionalClient with the URI of the new transaction, since it seems like almost every usage of fcrepoClient.transactionalClient would require those same steps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking about a way to create it from the TransactionalBuilder as well. At the moment I'm playing around with the idea of adding the startTxClient inside of the RequestBuilder from txBuilder.start(), though I don't know if adding an additional method of execution is a good idea since perform is already defined. Otherwise maybe the startTxClient could take a Map of headers so they can be set before execution.

@mikejritter
Copy link
Contributor Author

@bbpennel I rolled back a few changes and put the addTransaction back in the RequestBuilder and made a few other changes per the suggestions.

One thing I noticed while writing a test which uses the TransactionalFcrepoClient is that it would be nice if we could call txClient.transaction().commit() from it instead of having to pass the transaction id in again, e.g. txClient.transaction().commit(txId). Maybe just adding a shorthand commit/keepAlive/etc in the TransactionalFcrepoClient which will pass the txId for us?

Copy link
Contributor

@bbpennel bbpennel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, not having to pass the txId back into transaction operations makes sense when it should already be available from the client. Since there aren't really any options needed for building up interactions with transactions, it seems reasonable to shorten the root to performing the transaction operations.

try (final var response = client.transaction().start(new URI(SERVER_ADDRESS)).perform()) {
assertEquals(CREATED.getStatusCode(), response.getStatusCode());

location = response.getTransactionUri().orElseThrow(() -> new IllegalStateException("No tx found"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not totally against the optional, but from a expectations perspective, I think it'd make sense to have getTransactionUri return TransactionURI directly.

With the optional it is shorter to throw an error if there is no txUri. But if the user requested to start a tx and the response didn't have a txUri, it seems like that would indicate a server error of some sort, and should have already thrown an exception. Also, to get the uri right now, the user would have to call response.getTransactionUri().get().get() or similar with orElseThrow, which looks kind of weird.

I'm also trying to figure out what having a TransactionURI class is getting us versus passing back a URI. It only holds a single value, and doesn't seem to have any internal logic aside from calling URI.create.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I was hoping the TransactionURI would provide more value but I can't really say much has emerged. I was actually thinking about rolling it back before initially pushing the commit with it but went with it just to see how it would go. I'm still kind of on the fence about it solely from using it in the TransactionBuilder.

I can see what you're saying wrt the Optional as well - failures in the request would have already occurred by the time we're calling getTransactionUri. Originally I wanted the start to return something like TxAwareFcrepoResponse but didn't want to make another class for that single method.

@mikejritter
Copy link
Contributor Author

@bbpennel After making some of the changes I've been trying to figure out what purpose the TransactionBuilder really serves and how many paths we really want to give for supporting transactions. With commit/status/rollback/keep_alive in the TransactionalClient, it almost feels like we should encourage use of that over the TransactionBuilder.

@bbpennel
Copy link
Contributor

Yes, I agree, it doesn't seem like providing the transaction() method is adding anything. I spent some time working through 4 different pathways to do the same thing currently present in this PR, and it only really seemed like using the full stateful TransactionalClient was a significant improvement in terms of readability over doing direct interactions with the tx API.

Also, for the sake of keeping method names actions, I'd suggest having the method for starting a transaction either be startTransaction or startTransactionalClient. The former makes the intent/outcome clearer, while the latter makes it clearer it's giving back a client.

TransactionalClient only

try (var txClient = fcrepoClient.startTransaction(baseUri)) {
    txClient.put(someUri);
    txClient.commit();
} catch (Exception e) {
    txClient.rollback();
}

Mix of TransactionBuilder and Client (similar to the integration tests)

try (final var response = client.transaction().start(new URI(SERVER_ADDRESS)).perform()) {
    txLocation = response.getTransactionUri();
}

try (var txClient = fcrepoClient.transactionalClient(txLocation)) {
    txClient.put(someUri);
    txClient.commit();
} catch (Exception e) {
    txClient.rollback();
}

Just using the tx builder (no txClient)

try (final var response = client.transaction().start(new URI(SERVER_ADDRESS)).perform()) {
    txLocation = response.getTransactionUri();

    fcrepoClient.put(someUri).addTransaction(txLocation).perform();
    fcrepoClient.transaction().commit(txLocation);
} catch (Exception e) {
    client.transaction().rollback(txLocation);
}

Do it yourself, without tx helpers

try (final var response = client.post(txEndpointUri).perform()) {
    txLocation = response.getTransactionUri();
}

try {
    fcrepoClient.put(someUri).addTransaction(txLocation).perform();
    fcrepoClient.put(txLocation).perform(txLocation);
} catch (Exception e) {
    fcrepoClient.delete(txLocation).perform(txLocation);
}

@mikejritter
Copy link
Contributor Author

@bbpennel I ended up removing TransactionBuilder and moved the logic into the FcrepoClient and TransactionFcrepoClient. I think overall the functionality feels better even if my experience is only in the integration tests. I did want to look at dropping the TransactionURI as well which I'm starting now.

Copy link
Contributor

@bbpennel bbpennel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good, just left one comment related to addHeader for transaction method

public RequestBuilder transactionKeepAlive() {
return transaction().keepAlive(transactionURI);
public RequestBuilder keepAlive() {
return new RequestBuilder(transactionURI.get(), this) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reducing the set of operations available is probably a good thing. It may not be an issue, but I am realizing this does remove the ability to set headers. My understanding is that authz doesn't require any extra headers in fedora, so the only other use case that comes to mind is extensions like API-X (which seems to have become less of a thing), where headers might be used by the extensions for triggering behaviors. Anyways, I don't know if we need to expose addHeader/addLinkHeader or not, just was realizing the option was being removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to remember why I had gone with returning RequestBuilder in the first place - since that originally came from the TransactionBuilder maybe it was something in there. Either way it seems like it we could have these return PostBuilder/GetBuilder/etc which should expose the additional methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's true, although it would expose a lot of other methods too like addTransaction. But maybe its better to be under-perscriptive rather than over.

@bbpennel bbpennel merged commit 3393573 into fcrepo-exts:main Nov 3, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants