Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow interactive requests to reopen a re-created db instance #3050

Merged
merged 1 commit into from Jul 31, 2020

Conversation

nickva
Copy link
Contributor

@nickva nickva commented Jul 30, 2020

Previously, if a database was re-created on another node, a request with that database might have found the previous db instance in the cache. In that case it would have correctly reopened the db while in a transaction, but, because
the old db instance was deleted it would throw a database_does_not_exist which was not the correct behavior.

To prevent that from happening, introduce an interactive = true|false option when opening a database. User requests may specify that option and then when the db is re-opened, it will allow it to automatically upgrade to the new db instance instead of returning an error.

Background processes will still get a database_doest_not_exist error if they keep a db open which has now been re-created.

The interactive option may also be used in the future to set other transaction parameters like timeouts and retries that might be different for interactive requests vs background tasks.

Previously the UUID was checked in an attempt to prevent successfully re-opening an old database instance, in case it was deleted and re-created. However that is not necessary as the open line above would have already fetched the current db version

@nickva
Copy link
Contributor Author

nickva commented Jul 30, 2020

Saw some unexpected success test failures:

   fabric2_db_crud_tests:40: crud_test_ (recreate_db)...*failed*
in function fabric2_db_crud_tests:'-recreate_db/1-fun-4-'/1 (test/fabric2_db_crud_tests.erl, line 160)
in call from fabric2_db_crud_tests:recreate_db/1 (test/fabric2_db_crud_tests.erl, line 160)
**error:{assertException,
    [{module,fabric2_db_crud_tests},
     {line,160},
     {expression,"fabric2_db : get_db_info ( Db1 )"},
     {pattern,"{ error , database_does_not_exist , [...] }"},
     {unexpected_success,
         {ok,[{cluster,{[{n,0},{q,0},{r,...},{...}]}},

From https://github.com/apache/couchdb/blob/prototype/fdb-layer/src/fabric/test/fabric2_db_crud_tests.erl#L160

So perhaps the existing code wasn't totally wrong. The question is what should happen if we end up finding a previous instance of a db in the cache. With the current PR we'll let the user roll with the old db, and when they perform a transaction operation, we'd silently upgrade it to the new db version. And then they can keep holding on that stale handle and be none the wiser.

In the previous scheme, any old db handle immediately threw a database_does_not_exist error which bubbled up to the client right away. That wasn't correct either, since a user may delete and recreate the db on node1 then make a re-quest for db_info to node2, get the previous db instance, then get a 404 response, which is wrong.

@nickva nickva changed the title Do not check db instance UUID in fabric2_fdb:reopen/1 call Allow interactive requests to reopen a re-created db instance Jul 30, 2020
@nickva
Copy link
Contributor Author

nickva commented Jul 30, 2020

Updated approach: the idea is that we have different behaviors for interactive requests vs background tasks.

Interactive requests should be able to auto-reopen to a new db instance if it was re-created. Background tasks, like indexers, for example, would want to handle the database_doest_not_exist errors, since they don't want to keep holding on to a stale database handle that will continuously be reopen when doing any transactions.

@nickva nickva requested a review from davisp July 30, 2020 23:48
@nickva nickva force-pushed the fix-db-recreation-bug branch 2 times, most recently from 952e757 to 993e2d4 Compare July 31, 2020 00:04
Copy link
Member

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but lets convince ourselves that the security properties changing are enforced correctly in a test. As as adding the inverse of the test you have that shows that interactive=false would throw the error we're expecting.

@nickva
Copy link
Contributor Author

nickva commented Jul 31, 2020

@davisp good call about the security test, I don't think we enforce security check on the new instance.

We do have a test for when we reopen in a non-interactive way:

{ok, Db1} = fabric2_db:open(DbName, []),
?assertEqual(ok, fabric2_db:delete(DbName, [])),
?assertMatch({ok, _}, fabric2_db:create(DbName, [])),
?assertError(database_does_not_exist, fabric2_db:get_db_info(Db1)),

@davisp
Copy link
Member

davisp commented Jul 31, 2020

Ah, good catch on there being a test already. Lets add a comment there to let future us know there's a subtle thing being tested with the new interactive flag.

Previously, if a database was re-created on another node, a request with that
database might have found the previous db instance in the cache. In that case
it would have correctly reopened the db while in a transaction, but, because
the old db instance was deleted it would throw a database_does_not_exist which
was not the correct behavior.

To prevent that from happening, introduce an interactive = true|false option
when opening a database. User requests may specify that option and then when
the db is re-opened, it will allow it to automatically upgrade to the new db
instance instead returning an error.

Background processes will still get a database_doest_not_exist error if they
keep a db open which has now been re-created.

The interactive option may also be used in the future to set other transaction
parameters like timeouts and retries that might be different for interactive
requests vs background tasks.
Copy link
Member

@davisp davisp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@nickva nickva merged commit 8b49b0d into prototype/fdb-layer Jul 31, 2020
@nickva nickva deleted the fix-db-recreation-bug branch July 31, 2020 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants