Atomic operations / transactions in TransferManagers #64

Closed
binwiederhier opened this Issue Mar 15, 2014 · 70 comments

binwiederhier commented Mar 15, 2014

All operations that perform actions on the remote repository can be interrupted by connection issues. The TransferManagers are dumb in the sense that they cannot ensure transactions, idempotency or atomicity of any kind.

To avoid repository corruption, some sort of transaction concept for remote operations is needed.

Affected:

  • UpOperation
  • DownOperation
  • CleanupOperation

@binwiederhier binwiederhier added this to the Release 0.2 (public beta) milestone Mar 15, 2014

@binwiederhier binwiederhier modified the milestones: Release 0.1.1 (developer alpha 2), Release 0.2.0 and beyond Mar 31, 2014


pimotte commented Apr 18, 2014

I have started working on this at https://github.com/pimotte/syncany/tree/feature/atomic-operations

The basic idea is to have a file listing all the files that are involved in the transaction, and to upload it before starting the transaction. Each file involved is uploaded to a temporary remote file; committing then moves the temporary files to their final names.

Current state: I implemented a skeleton, and the UpOperation makes the right calls. The listing file is not actually filled yet, so it's not functional, but almost all current tests pass, so nothing has been broken (yet).

WIP, but if you see something weird, feel free to let me know.
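A minimal sketch of the temp-upload-then-commit idea described above. The class and method names are hypothetical, not the actual Syncany API, and an in-memory map stands in for the remote repository:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a remote transaction: uploads go to temporary
// remote files first; commit moves them to their final names.
class RemoteTransaction {
    // Stands in for the remote repository: file name -> content.
    private final Map<String, byte[]> remote;
    // Pairs of (temporary name, final name) recorded in the transaction listing.
    private final List<String[]> listing = new ArrayList<>();
    private int tempCounter = 0;

    RemoteTransaction(Map<String, byte[]> remote) {
        this.remote = remote;
    }

    // Upload goes to a temporary remote file; the final name is only
    // recorded in the transaction listing until commit.
    void upload(byte[] content, String finalName) {
        String tempName = "temp-" + (tempCounter++);
        remote.put(tempName, content);
        listing.add(new String[] { tempName, finalName });
    }

    // Committing moves every temporary file to its final name. If the
    // client dies before commit, only temp-* files and the listing remain.
    void commit() {
        for (String[] entry : listing) {
            remote.put(entry[1], remote.remove(entry[0]));
        }
        listing.clear();
    }
}
```

Until commit() runs, other clients listing the repository only see temp-* files, which they can ignore.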


binwiederhier commented Apr 18, 2014

I'll look at this tonight/tomorrow. Very promising start!


binwiederhier commented Apr 19, 2014

Looking at pimotte/syncany@binwiederhier:develop...feature/atomic-operations, this looks very clean to me. Of course this means that the storage has to support the "move" operation, but I think it is safe to assume that all the serious storage backends will have that.

Of course the fun-plugins (Picasa, Flickr) might have issues here.


pimotte commented Apr 19, 2014

Imo the fun-plugins are just that. If we want to integrate those at some point, we can develop a "legacy mode" in which files are uploaded directly, but considering stability, that doesn't seem like a good idea regardless.

Current status: Transaction file is created and uploaded, can be read out and files in it will be ignored upon listing. Caveat: db-A-000000001 is listed in a transaction, but then uploaded again, so the TM should clean up before going to a new transaction. Am I correct in assuming this is not a problem between different clients, since the names between different clients differ anyway?


binwiederhier commented Apr 28, 2014

Continued discussion here: pimotte#1


binwiederhier commented Apr 29, 2014

I just talked to @guitarlum a bit and here are some thoughts about the atomic operations in general.

An atomic CleanupOperation could look like this:

  1. upload tx delete file (contains files to delete)
  2. upload purge database version
  3. move purge database version
  4. delete multichunks
  5. delete tx delete file

After we defined that, we realized "up" and "cleanup" are entirely atomic when you look at the databases (database files). In fact, the only part that is truly dangerous is the "swapping" of files when databases are merged together.

So what we actually need is a process to delete unused multichunks on the remote storage -- so a process that looks at the list of multichunks on the remote storage and deletes all of them that are never referenced. If we had that, would there still be a need for "atomicity"?

Any comments? @pimotte?
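Reduced to its core, the sweep described above is a set difference between the multichunks present on the remote storage and those referenced by any database version. A hypothetical sketch (class and method names are illustrative, not the actual Syncany code):

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of "delete unreferenced multichunks": compare the multichunk
// names present remotely with the names referenced by the databases.
class MultichunkSweeper {
    static Set<String> findUnreferenced(Set<String> remoteMultichunks, Set<String> referenced) {
        Set<String> orphans = new HashSet<>(remoteMultichunks);
        orphans.removeAll(referenced);
        return orphans; // candidates for deletion by cleanup
    }
}
```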


pimotte commented Apr 29, 2014

Mmmm. I agree. Can we rely on the TransferManagers to provide an atomic upload operation (in the sense that it fails if the upload is not transferred correctly)? If yes, we can cut out all the stuff in feature/atomic-operations.

So, with remote cleanup I agree it should be fine.

A bit of a sidetrack: I was convinced we needed the transactions, but that is actually another problem:

FTP connections tend to be a bit wonky, so the first sync up of a 2 GB repo can hardly succeed, because there WILL be a failed multichunk upload which is not reattempted.


binwiederhier commented Apr 30, 2014

Well, I am not sure about the plugin implementations, but the TM upload JavaDoc demands that "The file is either uploaded completely or nothing at all." [sic]. The FTP plugin first uploads to a temp file and then renames to the final filename.

About the removal of the transactions: I am not entirely sure that it's a good idea to go back to what's currently implemented in 'develop'. I think before we do anything, we need to look at the possibilities of what might happen if two/many operations are run in parallel.

Example: If A runs "up" and B runs "cleanup", B might/will delete the "lost" multichunks that A just uploaded. With your TX implementation, this won't happen. I'm still not sure if transactions are the right way to prevent this though.


binwiederhier commented Apr 30, 2014

Let's see what can happen:

  • "down" changes nothing on the remote repo, so running down can't immediately break anything remotely. It can, however, break something locally if stuff is deleted/cleaned up by another client while "down" is running.
  • As said above "up" uploads multichunks and a new database, so it temporarily creates an inconsistent state that might be cleaned up by another client's "cleanup".
  • Obviously "cleanup" adds a purge database version, deletes multichunks (and merges old database versions). It also leaves a temporary inconsistent state, because it first uploads purge database version and then deletes the multichunks.

So I conclude:

  • A.down() may run while B.up() or B.down() is running
  • A.up() may run while B.up() or B.down() is running
  • A.cleanup() may NOT run while B.down(), B.up() or B.cleanup() is running

Correct?


pimotte commented May 1, 2014

So, if we decide superfluous multichunks are not inconsistencies, the problems lie in cleanup deleting multichunks while another client is running down, and in up uploading a reference to a multichunk while cleanup is deleting that particular multichunk.

Just making cleanup atomic does not solve this issue. However, what if we make cleanup atomic and "just" retry down() when fetching a multichunk fails? For the conflict between cleanup and up, could we just ignore orphan multichunks that are younger than 4 hours?
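The 4-hour grace period suggested here boils down to a simple age check: an orphan multichunk is only deletable once it is older than the threshold, so a concurrently running 'up' does not lose freshly uploaded multichunks. A hypothetical sketch (names are illustrative):

```java
// Sketch of the grace-period policy: only orphans older than 4 hours
// are considered safe to delete by cleanup.
class OrphanAgePolicy {
    static final long GRACE_PERIOD_MS = 4L * 60 * 60 * 1000; // 4 hours

    static boolean isDeletable(long orphanUploadTimeMs, long nowMs) {
        return nowMs - orphanUploadTimeMs > GRACE_PERIOD_MS;
    }
}
```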


binwiederhier commented May 1, 2014

So after discussing with Pim on IRC, here's what we agreed upon:

  • UpOperation and DownOperation will upload an ActionRemoteFile to indicate that the operation is running, and will delete it when the operation finishes
  • CleanupOperation will only run if no action files of other clients are present; Action files of the own client will be deleted
  • The pattern for ActionRemoteFile will be: action-<up|down|cleanup>-<client-id>-<timestamp>. TMs should place action files in an actions subfolder.
  • The feature/atomic-operations branch will be discarded
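Under the naming scheme agreed above, generating and parsing action file names could look like this (hypothetical helper, not the actual ActionRemoteFile implementation; assumes client ids contain no dashes):

```java
// Sketch of the agreed pattern: action-<up|down|cleanup>-<client-id>-<timestamp>
class ActionFileName {
    static String format(String operation, String clientId, long timestamp) {
        return "action-" + operation + "-" + clientId + "-" + timestamp;
    }

    // Extracts the client id (third dash-separated field).
    static String clientIdOf(String actionFileName) {
        return actionFileName.split("-")[2];
    }
}
```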

christf commented May 1, 2014

Please make the ActionRemoteFile a directory, as creating a directory is an atomic operation while creating a file is not.

binwiederhier added a commit that referenced this issue May 3, 2014


binwiederhier commented May 3, 2014

This was surprisingly easy. Commit daf2fa2 already works completely for the local plugin. The other plugins should also work, but don't yet have subfolders. The AbstractTransferOperation was not necessary (just some sugar).

pheckel@platop /tmp/action1/a $ sy up -d
  (runs a long time, writes <repo>/actions/action-up-3670fto6ivfqk856-1399156080475)

pheckel@platop /tmp/action1/b $ sy cleanup -k1 
Cannot run cleanup while other clients are performing up/down/cleanup. Try again later.

Comments?
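The gate shown in the console output above boils down to: list the action files and refuse cleanup if any belongs to another client. A hypothetical sketch (assumes the action-<op>-<client-id>-<timestamp> pattern with dash-free client ids; not the actual code):

```java
import java.util.List;

// Sketch of the cleanup gate: cleanup may only run if no action file
// of another client is present on the remote storage.
class CleanupGate {
    static boolean mayRunCleanup(List<String> actionFileNames, String ownClientId) {
        for (String name : actionFileNames) {
            String clientId = name.split("-")[2];
            if (!clientId.equals(ownClientId)) {
                return false; // another client is running up/down/cleanup
            }
        }
        return true; // only our own (possibly stale) action files, if any
    }
}
```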


pimotte commented May 4, 2014

Looks clean :D

On removing your own action files: does the following scenario go right? A user has the daemon active, which is syncing up, and manually runs cleanup on the same client. Is this a scenario we even want to account for?


binwiederhier commented May 4, 2014

Right now this is simply not possible because the HSQL database is locked by the watch process. It throws a very neat little stack trace:

$ sy up -d
debug
4-5-14 11:39:58.459 | DatabaseConnect | main       | INFO : Loading database driver org.hsqldb.jdbcDriver ...
4-5-14 11:40:08.916 | CommandLineClie | main       | SEVE : Command up FAILED. 
java.lang.RuntimeException: Cannot create new connection; database down?
    at org.syncany.database.DatabaseConnectionFactory.createConnection(DatabaseConnectionFactory.java:128)
    at org.syncany.database.DatabaseConnectionFactory.createConnection(DatabaseConnectionFactory.java:82)
    at org.syncany.config.Config.createDatabaseConnection(Config.java:240)
    at org.syncany.database.SqlDatabase.<init>(SqlDatabase.java:71)
    at org.syncany.operations.up.UpOperation.<init>(UpOperation.java:102)
    at org.syncany.Client.up(Client.java:112)
    at org.syncany.Client.up(Client.java:108)
    at org.syncany.cli.UpCommand.execute(UpCommand.java:41)
    at org.syncany.cli.CommandLineClient.runCommand(CommandLineClient.java:252)
    at org.syncany.cli.CommandLineClient.start(CommandLineClient.java:118)
    at org.syncany.Syncany.main(Syncany.java:55)
Caused by: java.sql.SQLException: Database lock acquisition failure: lockFile: org.hsqldb.persist.LockFile@88721731[file =/tmp/b/.syncany/db/local.db.lck, exists=true, locked=false, valid=false, ] method: checkHeartbeat read: 2014-05-04 09:40:07 heartbeat - read: -3560 ms.
    at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
    at org.hsqldb.jdbc.JDBCUtil.sqlException(Unknown Source)
    at org.hsqldb.jdbc.JDBCConnection.<init>(Unknown Source)
    at org.hsqldb.jdbc.JDBCDriver.getConnection(Unknown Source)
    at org.hsqldb.jdbc.JDBCDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:233)
    at org.syncany.database.DatabaseConnectionFactory.createConnection(DatabaseConnectionFactory.java:117)
    ... 10 more
Caused by: org.hsqldb.HsqlException: Database lock acquisition failure: lockFile: org.hsqldb.persist.LockFile@88721731[file =/tmp/b/.syncany/db/local.db.lck, exists=true, locked=false, valid=false, ] method: checkHeartbeat read: 2014-05-04 09:40:07 heartbeat - read: -3560 ms.
    at org.hsqldb.error.Error.error(Unknown Source)
    at org.hsqldb.error.Error.error(Unknown Source)
    at org.hsqldb.persist.LockFile.newLockFileLock(Unknown Source)
    at org.hsqldb.persist.Logger.acquireLock(Unknown Source)
    at org.hsqldb.persist.Logger.open(Unknown Source)
    at org.hsqldb.Database.reopen(Unknown Source)
    at org.hsqldb.Database.open(Unknown Source)
    at org.hsqldb.DatabaseManager.getDatabase(Unknown Source)
    at org.hsqldb.DatabaseManager.newSession(Unknown Source)
    ... 16 more
Error: Cannot create new connection; database down?
       Refer to help page using '--help'.

For the future I don't see the need (yet) to allow running watch and up/down/cleanup simultaneously on the same client. Do you?


binwiederhier commented May 4, 2014

Obviously 'cleanup' still needs to clean up lost multichunks. I'll implement that next.


binwiederhier commented Sep 6, 2014

Looks good so far. I'd suggest to merge develop->atomic-operations to keep the diff small and make the switching between branches less painful -- since we merged modules.

  • Merge develop->atomic-operations

pimotte commented Sep 7, 2014

Current status:

Unreliable test 4_1 fails for me if the whole test suite is run, but not if it is run in isolation, or if just org.syncany.tests.plugins.unreliable_local.UploadInterruptedTest is run.
Unreliable test 4_3 fails for me because of a double action file. I'm not sure how this got broken, and I'm not sure if it is a correct test. @binwiederhier can you check this?

I want to do the merge after all current tests pass; then I'll continue moving the final functionality to TransactionAwareTM, adding things that were not there before, and then write more tests.

@pimotte pimotte self-assigned this Sep 7, 2014


binwiederhier commented Sep 7, 2014

Sure. Will check this tomorrow.


binwiederhier commented Sep 8, 2014

  1. 4_1 failed because of a wrong assumption. Now fixed.
  2. 4_3: Fixed the action file count. They are left over whenever an operation fails. They get cleaned up after a successful operation.
  3. 4_3: This is a valid test. The first 'up' fails while moving the second multichunk. The 2nd 'up' fails while trying to roll back the first 'up'. And the 3rd 'up' finally succeeds. All that's left now is to clean up the temp files.
  4. 4_3: During the rollback, moving/deleting non-existing files (e.g. temp->temp, ..) fails and is retried 3 times, only to be disregarded afterwards (because the original file does not exist). With the real retry-sleep time, this would take 9 seconds: Moving RemoteFile[name=temp-AzSGWyaYaCammqPTnVqJ] to temp. file RemoteFile[name=temp-FjcaNlwmzdMzOQGxrswu] .... There are more files that don't exist and throw exceptions. We should do something about that: 9 s * non-existing files in the TX is a lot.
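One way to address point 4 is to treat "file not found" as a non-retryable outcome, so the rollback skips the retry-and-sleep loop for files that were never moved in the first place. A hypothetical sketch (exception and interface names are illustrative, not the actual Syncany types):

```java
// Sketch of a retry helper that gives up immediately when the failure
// indicates the source file does not exist on the remote storage.
class RetryHelper {
    static class FileNotFoundOnRemoteException extends Exception {}

    interface RemoteOp {
        void run() throws Exception;
    }

    // Returns the number of attempts actually made.
    static int runWithRetries(RemoteOp op, int maxAttempts) throws Exception {
        for (int attempt = 1; ; attempt++) {
            try {
                op.run();
                return attempt;
            }
            catch (FileNotFoundOnRemoteException e) {
                return attempt; // nothing to roll back; don't retry, don't sleep
            }
            catch (Exception e) {
                if (attempt == maxAttempts) {
                    throw e; // transient failure persisted; give up
                }
                // real code would sleep ~3 s here before retrying
            }
        }
    }
}
```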

binwiederhier commented Sep 8, 2014

Also, I re-merged 'develop->atomic-operations', because it still showed some conflicts when I merged it.


pimotte commented Sep 8, 2014

Re: earlier questions:

  • Combining action/transaction files, initial thought: Nice to have. Second thought: Action files prevent cleanup from starting while up is running. If we do this, we have to make extra checks and a client might do chunking for nothing.
  • Names in transaction files: Also nice to have.

However, I would like to postpone both of these until after the current branch has been merged, under the header: premature optimization.


binwiederhier commented Sep 8, 2014

Agreed.


binwiederhier commented Sep 9, 2014

  • Folder for the temp files?

binwiederhier commented Sep 9, 2014

I reordered the TX.commit() method to make it more readable and clean-code-like. There are no logic changes whatsoever. Testing worked fine for me so far; and I'm underway with the final code review. I'll continue tonight.

@binwiederhier

Member

binwiederhier commented Sep 9, 2014

A bit of readability again; no actual logic changes: c3558ba
Code-wise it looks really good. Some more testing is necessary, right?

@pimotte

Member

pimotte commented Sep 9, 2014

Current status: getting ready for merge, imo. I'm starting to update the other plugins. FTP should be done at https://github.com/syncany/syncany-plugin-ftp/tree/feature/atomic-operations

@binwiederhier

Member

binwiederhier commented Sep 10, 2014

Regarding your FTP status: retrieveFile returns false if it's not successful. Maybe you can use that.

@pimotte

Member

pimotte commented Sep 11, 2014

TODO:

  • make actionhandler tx-unaware
@binwiederhier

Member

binwiederhier commented Sep 11, 2014

Congratulations @pimotte on a great piece of art! We finally merged the branch (#224)!
We'll leave this open until we're sure that it works perfectly.

@pimotte

Member

pimotte commented Sep 11, 2014

TODO:
Add move(), StorageMoveException, StorageFileNotFoundException to the following plugins:

  • ftp (not merged)
  • webdav
  • sftp
  • s3 (no StorageFileNotFoundException!)
  • samba
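The work item above is essentially a new rename primitive in the transfer-manager contract. A hedged sketch of what that addition could look like follows; the exception and method names mirror the comment, but the interface shape and the in-memory fake are illustrative, not Syncany's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Assumed existing base exception type of the transfer framework.
class StorageException extends Exception {
    StorageException(String message) { super(message); }
}

// Thrown when a remote rename/move fails (e.g. mid-commit).
class StorageMoveException extends StorageException {
    StorageMoveException(String message) { super(message); }
}

// Thrown when the file a move refers to does not exist remotely.
class StorageFileNotFoundException extends StorageException {
    StorageFileNotFoundException(String message) { super(message); }
}

interface TransferManagerSketch {
    // Atomically rename sourceFile to targetFile on the remote storage.
    void move(String sourceFile, String targetFile)
            throws StorageMoveException, StorageFileNotFoundException;
}

// In-memory stand-in for a real plugin, for illustration only.
class InMemoryTransferManager implements TransferManagerSketch {
    final Map<String, byte[]> files = new HashMap<>();

    @Override
    public void move(String sourceFile, String targetFile)
            throws StorageMoveException, StorageFileNotFoundException {
        byte[] content = files.remove(sourceFile);
        if (content == null) {
            throw new StorageFileNotFoundException("No such file: " + sourceFile);
        }
        files.put(targetFile, content);
    }
}
```

Plugins whose backend cannot cheaply tell a missing source apart (the S3 case noted above) cannot raise StorageFileNotFoundException reliably, which is why it is flagged in the list.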
@binwiederhier

Member

binwiederhier commented Sep 12, 2014

For each plugin:

  • bump the version in build.gradle
  • add move()
  • add StorageMoveException
  • add StorageFileNotFoundException
  • add createTM(..., Config)
  • change MultiChunkRemoteFile
  • add transactionsPath and tempPath
  • add creation of transactionsPath and tempPath to TM.init()

I'm doing S3.
@binwiederhier

Member

binwiederhier commented Sep 12, 2014

And Samba

@binwiederhier

Member

binwiederhier commented Sep 12, 2014

@pimotte The S3 plugin can't detect whether a file exists, at least not without a big performance impact. How bad is this? Does the "non-existing-temp -> somefile" move only happen in rollback cases? If so, that's okay I guess; if not, we might need to think of something else.

@pimotte

Member

pimotte commented Sep 12, 2014

For documentation: the move back only happens if a cleanup fails, is rolled back, and the deletion of files had already started.
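In other words, cleanup "deletes" files by first moving them aside under temporary names, and only a rollback after deletion has started needs the "temp -> somefile" move back. A sketch of that delete-via-move pattern, again with a local filesystem standing in for the remote and all names invented:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative "safe delete" used during a cleanup transaction: files are
// moved aside to temporary names first and only removed for good once the
// whole transaction commits. Rolling back moves them into place again,
// which is the only situation where a "temp -> somefile" move occurs.
class CleanupSketch {
    private final Map<Path, Path> movedAside = new LinkedHashMap<>(); // temp -> original

    void delete(Path file) throws IOException {
        Path temp = file.resolveSibling("deleted-" + file.getFileName());
        Files.move(file, temp, StandardCopyOption.ATOMIC_MOVE);
        movedAside.put(temp, file);
    }

    void commit() throws IOException {
        for (Path temp : movedAside.keySet()) {
            Files.deleteIfExists(temp); // now the deletion is final
        }
        movedAside.clear();
    }

    void rollback() throws IOException {
        for (Map.Entry<Path, Path> e : movedAside.entrySet()) {
            // Move each file back to its original name.
            Files.move(e.getKey(), e.getValue(), StandardCopyOption.ATOMIC_MOVE);
        }
        movedAside.clear();
    }
}
```

Because the move back is rollback-only, a backend that cannot check source existence (like S3 above) hits this path rarely, only when a cleanup fails mid-deletion.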

@cr0

Member

cr0 commented Sep 12, 2014

FYI: I tested the Samba plugin with two connected clients and overlapping actions. Everything works as I expected.

@binwiederhier

Member

binwiederhier commented Sep 12, 2014

And WebDAV

@binwiederhier

Member

binwiederhier commented Sep 12, 2014

All done. Now we test.

@binwiederhier

Member

binwiederhier commented Sep 17, 2014

Since this is in the develop branch, I'm closing the biggest issue of the century. There are already related follow-up issues in #227, though.
