Allow SQLAlchemy factories to be committed on create #309

Closed
zyskowsk opened this Issue Jun 22, 2016 · 21 comments

Comments

Projects
None yet
4 participants
@zyskowsk

It seems like the general consensus is that one should only flush the current session during a test case, and roll back on each tear down. Due to the nature of my testing setup it is more convenient to commit instead of flush on create. Have you considered adding a meta flag similar to force_flush that would allow the session to be committed?

for example:

class SQLAlchemyModelFactory(base.Factory):
    """Factory for SQLAlchemy models. """

    _options_class = SQLAlchemyOptions
    class Meta:
        abstract = True

    @classmethod
    def _create(cls, model_class, *args, **kwargs):
        """Create an instance of the model, and save it to the database."""
        session = cls._meta.sqlalchemy_session
        obj = model_class(*args, **kwargs)
        session.add(obj)
+       if cls._meta.force_commit:
+          session.commit()
        elif cls._meta.force_flush:
            session.flush()
        return obj

overriding SQLAlchemyModelFactory._create is my current solution. Any suggestions on how to properly have Factories be committed on creation is welcome! Thanks!

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jun 22, 2016

Member

I think this was solved by #262

I'm going to close this, but if for some reason you're looking for something different leave a comment and we can re-open.

Member

jeffwidman commented Jun 22, 2016

I think this was solved by #262

I'm going to close this, but if for some reason you're looking for something different leave a comment and we can re-open.

@jeffwidman jeffwidman closed this Jun 22, 2016

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jun 23, 2016

Hey @jeffwidman thanks for the quick response! enabling force_flush is not exactly what I am looking for. Let me try and better explain my situation, I may be missing something simple.

The situation I am running into is that I have server-side defaults on all of my models for created_at and updated_at. These defaults are set with db.func.now() in postgres, which is defined by the start time of the last transaction.

This means that if I only ever call session.flush(), even if I issue an INSERT followed by an UPDATE, on the same row, the updated_at value will remain the same until the current transaction is committed.

I am hoping to test that when I update an object through an api, the updated_at value changes. However, when I create a factory with force_flush enabled, and then issue an update on that row late, the updated_at column still reflects the start time of the open transaction.

For this reason I want Factory.create call to issue a commit on the session, not just a flush.

Hey @jeffwidman thanks for the quick response! enabling force_flush is not exactly what I am looking for. Let me try and better explain my situation, I may be missing something simple.

The situation I am running into is that I have server-side defaults on all of my models for created_at and updated_at. These defaults are set with db.func.now() in postgres, which is defined by the start time of the last transaction.

This means that if I only ever call session.flush(), even if I issue an INSERT followed by an UPDATE, on the same row, the updated_at value will remain the same until the current transaction is committed.

I am hoping to test that when I update an object through an api, the updated_at value changes. However, when I create a factory with force_flush enabled, and then issue an update on that row late, the updated_at column still reflects the start time of the open transaction.

For this reason I want Factory.create call to issue a commit on the session, not just a flush.

@jeffwidman jeffwidman reopened this Jun 23, 2016

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jun 23, 2016

Member

Oh my bad and you even mentioned force_flush in your initial comment. Sorry about that.

I'm not opposed to adding a force_commit, and there may be times it's useful, but I'm not sure it's necessary for this particular situation situation. While the SQL standard says now() and current_timestamp() are the start of the transaction, PostgreSQL also supports the non-SQL-standard times clock_timestamp() and statement_timestamp() which do change during a transaction:
https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

So if you call one of those, as long as the statement is flushed to the DB, it should get the actual timestamp as of right now. IIRC, you can call them from SQLAlchemy via db.func.statement_timestamp() and SQLAlchemy should pass the literal function name straight through.

Lastly, this is an aside from the actual issue but I'm curious how you're setting server-side defaults for updated_at? Unlike created_at, SQLAlchemy's implementation of updated_at as a server-side default doesn't actually modify the database, it only tells SQLAlchemy to not set the updated_at on commit because you've setup your DB somehow to handle it. I'm curious what route you went there?

Member

jeffwidman commented Jun 23, 2016

Oh my bad and you even mentioned force_flush in your initial comment. Sorry about that.

I'm not opposed to adding a force_commit, and there may be times it's useful, but I'm not sure it's necessary for this particular situation situation. While the SQL standard says now() and current_timestamp() are the start of the transaction, PostgreSQL also supports the non-SQL-standard times clock_timestamp() and statement_timestamp() which do change during a transaction:
https://www.postgresql.org/docs/current/static/functions-datetime.html#FUNCTIONS-DATETIME-CURRENT

So if you call one of those, as long as the statement is flushed to the DB, it should get the actual timestamp as of right now. IIRC, you can call them from SQLAlchemy via db.func.statement_timestamp() and SQLAlchemy should pass the literal function name straight through.

Lastly, this is an aside from the actual issue but I'm curious how you're setting server-side defaults for updated_at? Unlike created_at, SQLAlchemy's implementation of updated_at as a server-side default doesn't actually modify the database, it only tells SQLAlchemy to not set the updated_at on commit because you've setup your DB somehow to handle it. I'm curious what route you went there?

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jun 24, 2016

Hey! No worries. Thanks for the tip on clock_timestamp() and statement_timestamp(). I think I am still going to commit on factory create in the spirit of keeping my testing env as close to prod as possible. I try and open up a PR today and we can move the conversation over there.

As for my updated_at setup, i misspoke (or spoke lazily) I am simply using the SQLAlchemy standard onupdate=db.func.now().

Hey! No worries. Thanks for the tip on clock_timestamp() and statement_timestamp(). I think I am still going to commit on factory create in the spirit of keeping my testing env as close to prod as possible. I try and open up a PR today and we can move the conversation over there.

As for my updated_at setup, i misspoke (or spoke lazily) I am simply using the SQLAlchemy standard onupdate=db.func.now().

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jun 24, 2016

Member

Sounds fine, I'm certainly open to the PR.

For the implementation, I'd like to deprecate the force_flush() API and instead switch to a forcing param that accepts False/``flush/commit`, as it makes no sense to force a `flush` and a `commit`. It should default to `False`. I'm not sure what a good name is for the param though... thoughts?

At the next major version bump we can drop the force_flush(), which shouldn't impact that many folks as we just added it recently. That'll also be a good time to drop the deprecated fuzzy fixtures that we're instead going to source from faker (#271). I added this to the 3.0 milestone as a reminder to remove deprecated functionality.

Member

jeffwidman commented Jun 24, 2016

Sounds fine, I'm certainly open to the PR.

For the implementation, I'd like to deprecate the force_flush() API and instead switch to a forcing param that accepts False/``flush/commit`, as it makes no sense to force a `flush` and a `commit`. It should default to `False`. I'm not sure what a good name is for the param though... thoughts?

At the next major version bump we can drop the force_flush(), which shouldn't impact that many folks as we just added it recently. That'll also be a good time to drop the deprecated fuzzy fixtures that we're instead going to source from faker (#271). I added this to the 3.0 milestone as a reminder to remove deprecated functionality.

@jeffwidman jeffwidman added this to the 3.0 milestone Jun 24, 2016

@rbarrois

This comment has been minimized.

Show comment
Hide comment
@rbarrois

rbarrois Jun 26, 2016

Member

@jeffwidman indeed, if we want to support multiple post-creation behaviors, a unique parameters seems to be the way to go.

What about sqlalchemy_session_mode = 'donothing' | 'flush' | 'commit'? I'd rather avoid mixed types (False vs 'flush').

Member

rbarrois commented Jun 26, 2016

@jeffwidman indeed, if we want to support multiple post-creation behaviors, a unique parameters seems to be the way to go.

What about sqlalchemy_session_mode = 'donothing' | 'flush' | 'commit'? I'd rather avoid mixed types (False vs 'flush').

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jun 27, 2016

Hey y'all!

I agree, moving to a single flag makes a lot of sense. I propose sqlalchemy_session_action = None | 'flush' | 'commit', very similar to what @rbarroios proposed, except that I don't feel terrible about mixing types if the default type is None. It feels strange to have a default string type that is never used.

Also definitely excited that you plan on deprecating fuzzy fixtures and to rely on Faker!

I just opened a pr at #310. I suggest we move this conversation over there.

zyskowsk commented Jun 27, 2016

Hey y'all!

I agree, moving to a single flag makes a lot of sense. I propose sqlalchemy_session_action = None | 'flush' | 'commit', very similar to what @rbarroios proposed, except that I don't feel terrible about mixing types if the default type is None. It feels strange to have a default string type that is never used.

Also definitely excited that you plan on deprecating fuzzy fixtures and to rely on Faker!

I just opened a pr at #310. I suggest we move this conversation over there.

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jun 27, 2016

Member

What about sqlalchemy_session_persistence = None | 'flush' | 'commit'?

Agree that it's generally weird to mix types, but None feels right for the default... my original suggestion of False is a bad idea.

_persistence seems more descriptive than _mode or _action, although I don't have a strong opinion either way.

FYI: Let's keep the general design discussion here in the issue. Comments on the specific code implementation can go in the PR.

Sometimes in the world of open source PRs get abandoned and since they can't be re-assigned to another user, we as maintainers have to close them eventually... it's annoying when another person comes along to fix later to say "please look at this issue, plus the other notes over in this rejected PR". Better thing is keep the general design discussion in the issue, then everything is in one place for someone else to take a stab at later if necessary.

Member

jeffwidman commented Jun 27, 2016

What about sqlalchemy_session_persistence = None | 'flush' | 'commit'?

Agree that it's generally weird to mix types, but None feels right for the default... my original suggestion of False is a bad idea.

_persistence seems more descriptive than _mode or _action, although I don't have a strong opinion either way.

FYI: Let's keep the general design discussion here in the issue. Comments on the specific code implementation can go in the PR.

Sometimes in the world of open source PRs get abandoned and since they can't be re-assigned to another user, we as maintainers have to close them eventually... it's annoying when another person comes along to fix later to say "please look at this issue, plus the other notes over in this rejected PR". Better thing is keep the general design discussion in the issue, then everything is in one place for someone else to take a stab at later if necessary.

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jun 28, 2016

Good point on keeping the conversation isolated in one location, I like that idea a lot. I also agree that _persistance is more descriptive. My vote goes for

sqlalchemy_session_persistence = None | 'flush' | 'commit'.

@rbarrois do you have a strong opinion between _persistance, _mode, _action ?

Good point on keeping the conversation isolated in one location, I like that idea a lot. I also agree that _persistance is more descriptive. My vote goes for

sqlalchemy_session_persistence = None | 'flush' | 'commit'.

@rbarrois do you have a strong opinion between _persistance, _mode, _action ?

@antoine-lizee

This comment has been minimized.

Show comment
Hide comment
@antoine-lizee

antoine-lizee Jul 4, 2016

I just want to confirm that this will end up changing the behavior of all the MyFactory(...) calls as it is by default a redirection to MyFactory.create(...).

antoine-lizee commented Jul 4, 2016

I just want to confirm that this will end up changing the behavior of all the MyFactory(...) calls as it is by default a redirection to MyFactory.create(...).

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jul 5, 2016

Member

Yes, the default strategy for MyFactory(...) is create(), so whatever you set as the sqlalchemy_session_persistence (or whatever the arg gets named) will affect MyFactory(...) calls. However, the default for sqlalchemy_session_persistence will be to not persist, so effectively the default behavior for MyFactory(...) calls does not change.

Member

jeffwidman commented Jul 5, 2016

Yes, the default strategy for MyFactory(...) is create(), so whatever you set as the sqlalchemy_session_persistence (or whatever the arg gets named) will affect MyFactory(...) calls. However, the default for sqlalchemy_session_persistence will be to not persist, so effectively the default behavior for MyFactory(...) calls does not change.

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jul 5, 2016

Exactly, the default persistence strategy will not be changing, so honestly this doesn't change the behavior of constructor calls. Of course anyone relying on 'MyFactory(...)' to flush the session via 'force_flush' will have to switch to 'sqlalchemy_session_persistence' (or what ever we settle on) but other than that this change shouldn't change any existing behavior

zyskowsk commented Jul 5, 2016

Exactly, the default persistence strategy will not be changing, so honestly this doesn't change the behavior of constructor calls. Of course anyone relying on 'MyFactory(...)' to flush the session via 'force_flush' will have to switch to 'sqlalchemy_session_persistence' (or what ever we settle on) but other than that this change shouldn't change any existing behavior

@antoine-lizee

This comment has been minimized.

Show comment
Hide comment
@antoine-lizee

antoine-lizee Jul 6, 2016

Just wanted to point it out - I'm totally fine with it. We're actually using build() as the default strategy so the behavior is closer to the real object constructor anyway.

Just wanted to point it out - I'm totally fine with it. We're actually using build() as the default strategy so the behavior is closer to the real object constructor anyway.

@rbarrois

This comment has been minimized.

Show comment
Hide comment
@rbarrois

rbarrois Jul 9, 2016

Member

@zyskowsk it seems that persistence provides the best description of this!

Regarding the deprecation path, what do you think about the following plan:

  • Release 2.8.0 contains the new flag, and auto-sets it from force_flush — with a DeprecationWarning
  • Release 3.0.0 drops force_flush

Beyond this, I'm unsure as to the impact on .build() / .create(): if the default is kept at None (i.e "do nothing"), then:

  • Calls to .build() won't be affected
  • Calls to .create() won't change from the current ones
  • Users will have to opt-in to that new behavior.

Final question: I'm not familiar with SQLAlchemy; would users want to set that persistence=commit flag globally instead of a per-factory basis?

Member

rbarrois commented Jul 9, 2016

@zyskowsk it seems that persistence provides the best description of this!

Regarding the deprecation path, what do you think about the following plan:

  • Release 2.8.0 contains the new flag, and auto-sets it from force_flush — with a DeprecationWarning
  • Release 3.0.0 drops force_flush

Beyond this, I'm unsure as to the impact on .build() / .create(): if the default is kept at None (i.e "do nothing"), then:

  • Calls to .build() won't be affected
  • Calls to .create() won't change from the current ones
  • Users will have to opt-in to that new behavior.

Final question: I'm not familiar with SQLAlchemy; would users want to set that persistence=commit flag globally instead of a per-factory basis?

@jeffwidman

This comment has been minimized.

Show comment
Hide comment
@jeffwidman

jeffwidman Jul 10, 2016

Member

would users want to set that persistence=commit flag globally instead of a per-factory basis?

Yes, definitely. There might be some edge cases where you only want to set it on a specific factory, or override the global setting in a specific factory, but most common case you'll want to set it globally.

Interested in hearing other opinions on this.

Member

jeffwidman commented Jul 10, 2016

would users want to set that persistence=commit flag globally instead of a per-factory basis?

Yes, definitely. There might be some edge cases where you only want to set it on a specific factory, or override the global setting in a specific factory, but most common case you'll want to set it globally.

Interested in hearing other opinions on this.

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jul 12, 2016

Hey y'all! Sorry i got side tracked for a bit there. @rbarrois as @jeffwidman mentioned, yes it is very useful to be able to set persistence=commit globally. Personally, I would concerned if i were working in a testing environnent that set session level persistance behavior on a per factory basis, that seems scary (but I agree with @jeffwidman that there may be some edge cases so we should support it). Currently we have a base factory class that has this behavior which all other factories inherit.

I plan on updating the PRq in the next few days so we can get this thing merged.

Hey y'all! Sorry i got side tracked for a bit there. @rbarrois as @jeffwidman mentioned, yes it is very useful to be able to set persistence=commit globally. Personally, I would concerned if i were working in a testing environnent that set session level persistance behavior on a per factory basis, that seems scary (but I agree with @jeffwidman that there may be some edge cases so we should support it). Currently we have a base factory class that has this behavior which all other factories inherit.

I plan on updating the PRq in the next few days so we can get this thing merged.

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jul 14, 2016

Yo, I just pushed an update on #310 that uses force_flush if true and issues a DeprecationWarning. I looked around and it seemed to me that we don't have a well defined pattern for setting global state within an app using factory_boy, so I left out a way to set sqlalchemy_session_persistence globally until we have a conversation about global options setting in general.

Personally I actually don't mind creating a base class for my app and setting global options there, but if we were to start supporting a more standard way of setting global options I would be in favor of having a config file that factory_boy reads from. I am not sure exactly what this would look like though. Are there other option values that would make sense to set globally across multiple factories? I also may be missing something here!

If there are more options that would be helpful to set globally, and there is not a current way of doing this I vote for doing that all at once and making it a new PR.

Yo, I just pushed an update on #310 that uses force_flush if true and issues a DeprecationWarning. I looked around and it seemed to me that we don't have a well defined pattern for setting global state within an app using factory_boy, so I left out a way to set sqlalchemy_session_persistence globally until we have a conversation about global options setting in general.

Personally I actually don't mind creating a base class for my app and setting global options there, but if we were to start supporting a more standard way of setting global options I would be in favor of having a config file that factory_boy reads from. I am not sure exactly what this would look like though. Are there other option values that would make sense to set globally across multiple factories? I also may be missing something here!

If there are more options that would be helpful to set globally, and there is not a current way of doing this I vote for doing that all at once and making it a new PR.

@rbarrois

This comment has been minimized.

Show comment
Hide comment
@rbarrois

rbarrois Jul 27, 2016

Member

@zyskowsk thanks for the update, it looks pretty well!

I don't mind handling the deprecation path myself, it's quite complex to get warnings to show up in the proper place ;)

As for the global option, it would be interesting to have such a mechanism — we already have a custom way to prime Fuzzy's random seed, and more configuration would be required for the auto_factory branch, so that's definitely something we need on some roadmap.

However, it's not strictly required for this feature, although a "nice to have" :)

So, I'm 👍 on merging this, and including it in the next release — let's plan for "in the next few weeks"?

Member

rbarrois commented Jul 27, 2016

@zyskowsk thanks for the update, it looks pretty well!

I don't mind handling the deprecation path myself, it's quite complex to get warnings to show up in the proper place ;)

As for the global option, it would be interesting to have such a mechanism — we already have a custom way to prime Fuzzy's random seed, and more configuration would be required for the auto_factory branch, so that's definitely something we need on some roadmap.

However, it's not strictly required for this feature, although a "nice to have" :)

So, I'm 👍 on merging this, and including it in the next release — let's plan for "in the next few weeks"?

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Jul 28, 2016

Awesome! Thanks @rbarrois, i just fixed a few small things on the branch. Are you going to commit the warning changes to the same branch? I would love to follow along

Awesome! Thanks @rbarrois, i just fixed a few small things on the branch. Are you going to commit the warning changes to the same branch? I would love to follow along

@rbarrois

This comment has been minimized.

Show comment
Hide comment
@rbarrois

rbarrois Aug 6, 2016

Member

@zyskowsk I've added the warnings change; but messed up with github's UI so the changes arrived in zyskowsk#1 ...

I also improved the mocking code, based on https://docs.python.org/3/library/unittest.mock.html#auto-speccing.

Let me know what you think of it :-)

Member

rbarrois commented Aug 6, 2016

@zyskowsk I've added the warnings change; but messed up with github's UI so the changes arrived in zyskowsk#1 ...

I also improved the mocking code, based on https://docs.python.org/3/library/unittest.mock.html#auto-speccing.

Let me know what you think of it :-)

@rbarrois rbarrois removed the DesignDecision label Aug 6, 2016

@zyskowsk

This comment has been minimized.

Show comment
Hide comment
@zyskowsk

zyskowsk Aug 12, 2016

@rbarrios, looks great to me, thanks for the improvements, I merged your changes into my branch

zyskowsk commented Aug 12, 2016

@rbarrios, looks great to me, thanks for the improvements, I merged your changes into my branch

@rbarrois rbarrois closed this in 3e28a95 Aug 13, 2016

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment