New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

first() should set a `LIMIT 1` #959

Closed
keredson opened this Issue Jun 3, 2016 · 19 comments

Comments

Projects
None yet
5 participants
@keredson
Contributor

keredson commented Jun 3, 2016

when looking at #958, i noticed:

class Person(pw.Model):
  name = pw.CharField()
  class Meta:
    database = db

alice = Person.create(name='Alice')
print Person.select().where(Person.name=='Alice').first()

generates this sql:

SELECT "t1"."id", "t1"."name" FROM "person" AS t1 WHERE ("t1"."name" = ?)

why would this not set a LIMIT 1? even if the library code stops the underlying sql lib's iteration it's likely already transferred way more data than that. (psql's default is 1000 rows at a time IIRC)

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

It is a design decision...and documented. Seriously what is wrong with you --you're creating issues for documented design decisions you disagree with. I get it that you may not understand why something is a certian way, but creating an issue telling me to change an API because it's incorrect in your opinion is not a good way to win me over (implied by "first() should set a LIMIT 1").

Docs:

http://docs.peewee-orm.com/en/latest/peewee/api.html#SelectQuery.first

@coleifer coleifer closed this Jun 3, 2016

@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

Seriously what is wrong with you.

i don't see the utility of your personal attacks here.

maybe it was a design decision, maybe it was a post-hoc justification of an unintentional implementation quirk, it's impossible for me to tell. either way "caching the entire result set when the user asked for the first record because they might want more later" is a weak argument for such surprising behavior. is questioning that choice (regardless of how you got there) not legitimate in your mind?

@vkhvst

This comment has been minimized.

vkhvst commented Jun 3, 2016

@keredson makes a reasonable argument for setting limit 1 especially because the result set can be very large.. at the same time, it would be good to know if people expect this caching behavior and use it for that reason

@arikfr

This comment has been minimized.

arikfr commented Jun 3, 2016

To be honest, I find this behavior surprising too. It's true the documentation mentions this, but maybe it's worth putting a clearer message about this. @coleifer would you mind a pull request with this kind of update to the documentation?

@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

just to be clear, i'm not advocating changing this behavior in a point release or anything. but if a major 3.0 rewrite is in the works (which @coleifer has already indicated would be API changing), that seems like a reasonable opportunity. (possibly in v2.9 if there was a clear consensus, or some version of from peewee.__future__ import first, or any number of other obvious options, but all that discussion seems secondary to "is this a good idea at all?")

perhaps a better name for the existing behavior would be peek()?

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

maybe it was a post-hoc justification of an unintentional implementation quirk, it's impossible for me to tell

Well, if you look at the git commit history you'll see that it clearly was that way intentionally from the outset. Run a git blame on the first() method and the first() docs and you'll see.

But furthermore, who are you to even suggest that it's a post-hoc justification of something unintended? That's rude to me, as the primary author of the project, and insinuates that I create bad APIs, then try to justify them by coming up with bogus reasons in the docs. That's not a helpful thing to say, nor is it appropriate.

For those wondering why I'm upset with @keredson, here's the context of my earlier comment... #958 (comment)

I do agree that the documentation should be more clear and would accept a PR or can just type it up real quick either way.

Regarding changing the semantics, I don't feel that is necessary since .get() already has the semantics you're looking for: it will retrieve an object and apply a LIMIT 1. I believe the reason I left .first() as it is, is so that multiple calls to .first() do not necessarily result in multiple queries, as they would with .get().

@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

I believe the reason I left .first() as it is, is so that multiple calls to .first() do not necessarily result in multiple queries

yeah, i just noticed that myself. that should also probably be added to the documentation, as most of the rest of the API is non-caching. (i'd argue that caching the results is something that should be done very sparingly, as the user can easily do it themselves by wrapping w/ list(), etc.)

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

i'd argue that caching the results is something that should be done very sparingly, as the user can easily do it themselves by wrapping w/ list(), etc.

It is done sparingly. And I'd argue it's a bad API to require a user to wrap something in a list when Peewee itself can determine whether or not the results from a previous iteration are still valid.

keredson added a commit to keredson/peewee that referenced this issue Jun 3, 2016

added a `limit 1` to calls to `first()` as discussed in coleifer#959
previous result set caching behavior was moved to a new method `peek()`
@arikfr

This comment has been minimized.

arikfr commented Jun 3, 2016

I don't feel that is necessary since .get() already has the semantics
you're looking for

Not exactly. get() throws an exception if no value is found while first()
returns None.
On Jun 3, 2016 5:05 PM, "Charles Leifer" notifications@github.com wrote:

Context of my earlier comment... #958 (comment)
#958 (comment)

At any rate, I think the documentation should be more clear and would
accept a PR or can just type it up real quick either way.

Regarding changing the semantics, I don't feel that is necessary since
.get() already has the semantics you're looking for. I believe the reason I
left .first() as it is, is so that multiple calls to .first() do not
necessarily result in multiple queries, as they would with .get().


You are receiving this because you commented.
Reply to this email directly, view it on GitHub
#959 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AAEXLEn0S0-nJkdAtV71PjLZDa-QQPmwks5qIDSKgaJpZM4ItJhE
.

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

That's correct -- I should have been more clear -- I just meant that .get() applies a LIMIT and will either return a record or raise an exception. It'd of course be easy to add a wrapper to your own implementation:

class MyModel(Model):
    @classmethod
    def get_or_none(cls, *query):
        try:
            return cls.get(*query)
        except cls.DoesNotExist:
            pass

coleifer pushed a commit that referenced this issue Jun 3, 2016

@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

It'd of course be easy to add a wrapper to your own implementation:

if i did this, does it work generally, or just on the base model class? can i do MyModel.select().where(something).get_or_none() just by defining this method?

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

Um, wow, that's something you can probably type into a python console and try out. -- And regarding the second part of the question, no unfortunately, you would need to modify the select query class in that case.

@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

wow, that's something you can probably type into a python console and try out.

ironically, i meant it as a compliment. i assumed it did, otherwise your suggestion would make no sense. i was expecting a response along the lines of "yep! pretty cool, eh?" but you're right, it can't.

class Person(pw.Model):
  name = pw.CharField()
  class Meta:
    database = db
  @classmethod
  def get_or_none(cls, *query):
      try:
          return cls.get(*query)
      except cls.DoesNotExist:
          pass

print Person.get_or_none(Person.name=="derek")
print Person.select().where(Person.name=="derek").get_or_none()
$ python test2.py
None
Traceback (most recent call last):
  File "test2.py", line 34, in <module>
    print Person.select().where(Person.name=="derek").get_or_none()
AttributeError: 'SelectQuery' object has no attribute 'get_or_none'

that's pretty useless in this context.

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

Well, typically you call .get() from the context of a model class, so adding the classmethod to the model class should cover you.

In [1]: db = SqliteDatabase(':memory:')

In [2]: class Base(Model):
   ...:     class Meta:
   ...:         database = db
   ...:     @classmethod
   ...:     def get_or_none(cls, *query):
   ...:         try:
   ...:             return cls.get(*query)
   ...:         except cls.DoesNotExist:
   ...:             pass
   ...:         

In [3]: class Person(Base):
   ...:     name = CharField()
   ...:     

In [4]: Person.create_table()

In [5]: Person.create(name='charlie')
Out[5]: <__main__.Person at 0x7fa54d378810>

In [6]: Person.get_or_none(Person.name == 'charlie').name
Out[6]: u'charlie'

In [7]: Person.get_or_none(Person.name == 'nobody')
@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

it's much more common to do Class.ALL.join(x).join(y).get(unique_condition). honestly i'm rarely just getting one row by itself.

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

Wait, huh? Peewee doesn't have a Class.ALL. You could instead do this:

Person.join(x).join(y).limit(1).first()
@keredson

This comment has been minimized.

Contributor

keredson commented Jun 3, 2016

Wait, huh?

I mean:

I rarely need to do: Person.get(condition) or .first() or any of those top level terminating methods on the models just by themselves. (Which is the only circumstance under which your get_or_none suggestion is useful.)

Almost always I'm getting an object with some subset of it's related objects through some sort of join API.

This is exceedingly common in my experience for any non-trivial app. Which is why the usability of the join API is so important in an ORM. Even minor (or seemingly trivial) improvements have large scale effects when they're in APIs you're using all day every day, multiplied by the number of developers in your organization. If I seem passionate about what seem to you like minor "that at most just saves a few seconds" style issues this is why. I'm looking at the aggregate.

@coleifer

This comment has been minimized.

Owner

coleifer commented Jun 3, 2016

Well actually first() is only available on the SelectQuery class, while get() is on both.

@harry91

This comment has been minimized.

harry91 commented Jan 3, 2018

@coleifer I really don't think @keredson was rude in the first message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment