Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not permanently give connections out when starting the sandbox #96

Closed
josevalim opened this issue Oct 10, 2017 · 16 comments
Closed

Do not permanently give connections out when starting the sandbox #96

josevalim opened this issue Oct 10, 2017 · 16 comments
Milestone

Comments

@josevalim
Copy link
Member

See discussion here: https://elixirforum.com/t/phoenix-testing-with-ecto-2-sandbox-access-from-processes/9174/9

@fishcakez is this something we can do?

@josevalim
Copy link
Member Author

@fishcakez yes, it is something we can do. :) Please see attached PR.

The first commit reclaims connections when we change the mode to shared. The second commit generalizes the logic and reclaims whenever there is a mode change.

Another option is to make the reclaiming explicit and users can do it whenever they change the mode (or the Ecto sandbox could encapsulate it). Although I believe this provides a better behaviour given that the Ecto sandbox already has this:

    # If the mode is set to anything but shared, let's
    # automatically checkin the current connection to
    # force it to act according to the chosen mode.
    if mode in [:auto, :manual] do
      checkin(repo, [])
    end

@josevalim
Copy link
Member Author

josevalim commented Oct 10, 2017

Btw, we have Ecto failures when running on DBConnection master, we may need to make this DBConnection 2.0 (not this particular change but the next DBConnection version altogether).

@fishcakez
Copy link
Member

fishcakez commented Oct 10, 2017

Master should be backwards compatible, I likely missed something 🤔 . This change would make it backwards incompatible though. I am also worried that checkin all will give processes impossible database view.

For example if a GenServer runs a query on init/1 and automatically gets a sandbox. Then shared mode is activated and the connection is checked in. Then the next query the GenServer runs will be in the shared sandbox, which will be different to the initial one. Therefore I don't think we should make this change, and the desired behavior in the forum post feels like a bug to me.

I think we need to look an alternative as both suggestions can hit this issue. Rather than switching the sandbox, I think we would prefer to delay sandbox selection until the test is setup. We could do this by forcing the processes into a queue until shared mode is setup (or a different mode change). Something like starting the repo in mode: :block, processes in the application might make calls but will be blocking, as if in pool queue, then mode change resolves all blocking processes: either they all fail with manual or all get a sandbox with auto/shared.

The manual behavior seems undesirable or awkward. It might be we can add an option to queue implicit checkouts in manual mode. Also any queries in an init/1 would block and prevent the application from starting. The mode couldn't change because the applications must be started before the tests (by default). This could potentially force people to write more resilient code but it is likely too heavy handed.

I think blocking beforehand is preferable to revoking, but the approach needs some more thought to get it to work nicely.

@josevalim
Copy link
Member Author

We cannot block because sometimes we need to do operations before the test starts, such as running migrations, seeding immutable data to the DB, etc.

@fishcakez
Copy link
Member

Can we commence blocking after the setup but before the application is started?

@josevalim
Copy link
Member Author

@fishcakez I tried to explore this today but it is tricky because the repository starts with the application so we would need a mechanism where we configure the repository before it is even running and also set flags such as mix test --no-start. :(

@fishcakez
Copy link
Member

My previous advice in this situation has been to stop required apps, set shared mode and restart them. I think we need to make it more obvious when something like this is required, and what these processes are. Perhaps we can log a warning including all the processes that currently own a sandbox when the mode changes to shared, or more drastically kill them.

@josevalim
Copy link
Member Author

Can you really restart the app? The issue is that we can only set the mode after the app starts and by then the service may already have acquired a connection.

@fishcakez
Copy link
Member

It depends if the repo is in the app or not. If it is in the same app it requires using configuration to set the mode (or defaulting to shared). It would probably be done using an init callback to switch the configuration based on another side effect.

@josevalim
Copy link
Member Author

@fishcakez which is a lot of indirection to get the desired behaviour. I can totally understand not merging the proposed PR but we need better alternatives so we solve 1. the discovery issue and 2. the issue in itself.

@fishcakez
Copy link
Member

If we revoke existing sandboxes on a mode change then the process owning or allowed on the sandbox are in inconsistent or bad state. This means we would need to kill them if we want to continue with a mode change, much like a code load. We can log an error or warning for every sandbox we kill.

Note that this bug exists for processes allowed on a sandbox when the owner checks in...

@josevalim
Copy link
Member Author

@fishcakez would it be possible for us to warn/raise if we are in shared mode and you are using a connection that is not the shared one?

@fishcakez
Copy link
Member

fishcakez commented Oct 12, 2017

@josevalim, do you mean instead of killing the process, delay the error until a process tries to do a connection checkout (checkout the connection from an existing sandbox that is now stale and so no longer backed by a connection)? Yes we can do that. We may want to consider that for processes that do queries on a sandbox via an allow and then lose the sandbox because of owner checkin.

@josevalim
Copy link
Member Author

@fishcakez yeah, I think that is probably the best approach then. I will give it a try later.

@fishcakez fishcakez added this to the v1.2 milestone Oct 22, 2017
@fishcakez
Copy link
Member

I implemented the errors at https://github.com/elixir-ecto/db_connection/tree/jf-shared-stop. I am unsure how worthwhile the change is because we are forced to treat {shared, pid} as an explicit allow. Otherwise it forces an application restart between shared tests because pids previously allowed via a previous shared mode either need explicit allowance (i.e. not shared) or to be restarted. Thats possible without this change, and anyone wanting to ensure correctness would already do it. It would also creates unnecessary burden when testing longer lived stateless processes that don't become inconsistent.

Then only new "revoked" errors occur when in auto mode because an error already occurs in manual mode if not explicit ownership.

@fishcakez
Copy link
Member

Closing as #97 is merged but may need to revisit whether we need to error on revoking connections if leads to bugs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants