New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #18757, #14462, #21565 -- Reworked database-python type conversions #3047

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
5 participants
@mjtamlyn
Member

mjtamlyn commented Aug 12, 2014

Make resolve_columns a permanent fixture, remove convert_values.

Current implementation passes on sqlite at least, but does destroy performance. Need a nicer converter-collection based approach. Also some tests that we can use the new override from_db_value, and that #21565 is fixed in the process.

  • Make the tests pass
  • Refactor to use "convertor-collector" based system
  • Add tests for #21565
  • Add tests for #14462
  • Add documentation for from_db_value and appropriate test suite for it
@mjtamlyn

This comment has been minimized.

Show comment
Hide comment
@mjtamlyn

mjtamlyn Aug 16, 2014

Member

Woo! We have a green light. http://djangoci.com/job/django-pull-requests/738/

Updating the description with the current todo list.

Member

mjtamlyn commented Aug 16, 2014

Woo! We have a green light. http://djangoci.com/job/django-pull-requests/738/

Updating the description with the current todo list.

@mjtamlyn mjtamlyn changed the title from [wip] POC for #18757. to Fixed #18757, #14462, #21565 -- Reworked database-python type conversions Aug 19, 2014

@timgraham

View changes

Show outdated Hide outdated django/db/models/fields/subclassing.py
from django.utils.deprecation import RemovedInDjango20Warning
warnings.warn("SubfieldBase has been deprecated. Use from_db_value instead.",

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Field.from_db_value()

@timgraham

timgraham Aug 19, 2014

Member

Field.from_db_value()

@timgraham

View changes

Show outdated Hide outdated tests/field_subclassing/fields.py
@@ -1,86 +0,0 @@
from __future__ import unicode_literals

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Shouldn't these stay (with warnings silenced) until SubfieldBase is removed? Otherwise, it looks like we have no testing of it (unless I missed something).

@timgraham

timgraham Aug 19, 2014

Member

Shouldn't these stay (with warnings silenced) until SubfieldBase is removed? Otherwise, it looks like we have no testing of it (unless I missed something).

@timgraham

View changes

Show outdated Hide outdated docs/releases/1.8.txt
will be removed in Django 2.0. Historically, it was used to handle fields where
type conversion was needed when loading from the database, but it was not used
in ``.values()`` calls or in aggregates. It has been replaced with
:meth:`~Field.from_db_value`.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Not in a current module, so need the full path to the method for the link to work.
Also please add to deprecation timeline.

@timgraham

timgraham Aug 19, 2014

Member

Not in a current module, so need the full path to the method for the link to work.
Also please add to deprecation timeline.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
Converting database values to Python objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.. versionchanged:: 1.8

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

versionchanged should be below the heading. Also, it's helpful to include a description ofwhat changed (e.g. SubfieldBase was the old way of doing things... now use from_db_value())

@timgraham

timgraham Aug 19, 2014

Member

versionchanged should be below the heading. Also, it's helpful to include a description ofwhat changed (e.g. SubfieldBase was the old way of doing things... now use from_db_value())

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
complex than strings, dates, integers or floats, then you may need to override
:meth:`~Field.to_python` and :meth:`~Field.from_db_value`.
If present for the field subclass, ``from_db_value`` will be called in all

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

If not using :meth: then please include () after method names to improve readability.

@timgraham

timgraham Aug 19, 2014

Member

If not using :meth: then please include () after method names to improve readability.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
If present for the field subclass, ``from_db_value`` will be called in all
circumstances when the data is loaded from the database, including in
aggregates and values calls.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

In the release notes, you used .values() which seems more clear. I guess simply "values" could refer to .values() and .values_list() but perhaps this could be clarified.

@timgraham

timgraham Aug 19, 2014

Member

In the release notes, you used .values() which seems more clear. I guess simply "values" could refer to .values() and .values_list() but perhaps this could be clarified.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
circumstances when the data is loaded from the database, including in
aggregates and values calls.
``to_python`` is called by deserialization and during the :meth:`~Field.clean`

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Field.clean is a broken link right now as that method isn't documented.

@timgraham

timgraham Aug 19, 2014

Member

Field.clean is a broken link right now as that method isn't documented.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
:meth:`~Field.to_python`. As a general rule, the method should deal gracefully
with any of the following arguments:
complex than strings, dates, integers or floats, then you may need to override
:meth:`~Field.to_python` and :meth:`~Field.from_db_value`.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

I'd switch the order here to match the order in which you introduce the methods

@timgraham

timgraham Aug 19, 2014

Member

I'd switch the order here to match the order in which you introduce the methods

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
:meth:`.to_python`::
database, so we need to be able to process strings and ``None`` in the
``from_db_value``. In the ``to_python``, we need to also handle ``Hand``
instances.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Need to use code-block or :: here for syntax highlighting.

@timgraham

timgraham Aug 19, 2014

Member

Need to use code-block or :: here for syntax highlighting.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
if value is None:
return value
p1 = re.compile('.{26}')

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

A description about what's going on here (expected inputs/outputs) would be helpful. The purpose isn't clear to me.

@timgraham

timgraham Aug 19, 2014

Member

A description about what's going on here (expected inputs/outputs) would be helpful. The purpose isn't clear to me.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

(also move to class or module level to avoid repetition?)

@timgraham

timgraham Aug 19, 2014

Member

(also move to class or module level to avoid repetition?)

@timgraham

View changes

Show outdated Hide outdated docs/ref/models/fields.txt
Converts a value as returned by the database (or a serializer) to a
Python object. It is the reverse of :meth:`get_prep_value`.
Converts a value as returned by the database to a Python object. It is
the revere of :meth:`get_prep_value`.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

reverse*

@timgraham

timgraham Aug 19, 2014

Member

reverse*

@timgraham

View changes

Show outdated Hide outdated docs/ref/models/fields.txt
Converts the value into the correct Python object. It acts as the
reverse of :meth:`value_to_string`, and is also called in
:meth:`clean`.

This comment has been minimized.

@timgraham

timgraham Aug 19, 2014

Member

Again, clean isn't documented.

@timgraham

timgraham Aug 19, 2014

Member

Again, clean isn't documented.

@shaib

View changes

Show outdated Hide outdated django/db/backends/oracle/base.py
converters.append(self.convert_datefield_value)
elif internal_type == 'TimeField':
converters.append(self.convert_timefield_value)
converters.append[self.convert_empty_values]

This comment has been minimized.

@shaib

shaib Aug 25, 2014

Member

append(...), not append[...].

@shaib

shaib Aug 25, 2014

Member

append(...), not append[...].

@mjtamlyn

This comment has been minimized.

Show comment
Hide comment
@mjtamlyn

mjtamlyn Aug 25, 2014

Member

Status update:

I've now got djangobench reporting no more than a 1.1x slowdown on any benchmark, with most running the same. There is basically no change in the performance on any standard benchmark. There will be a performance hit when resolve_columns is run, but on mysql at least this will now happen less often than it used to so a net win.

Documentation has been all tidied up as per @timgraham's comments.

I have some issues with some other databases which the CI server is finding which I should hopefully be able to pin down tomorrow morning. Once these are resolved we should be good to merge.

Any comments @akaariai?

Member

mjtamlyn commented Aug 25, 2014

Status update:

I've now got djangobench reporting no more than a 1.1x slowdown on any benchmark, with most running the same. There is basically no change in the performance on any standard benchmark. There will be a performance hit when resolve_columns is run, but on mysql at least this will now happen less often than it used to so a net win.

Documentation has been all tidied up as per @timgraham's comments.

I have some issues with some other databases which the CI server is finding which I should hopefully be able to pin down tomorrow morning. Once these are resolved we should be good to merge.

Any comments @akaariai?

@mjtamlyn

This comment has been minimized.

Show comment
Hide comment
@mjtamlyn

mjtamlyn Aug 26, 2014

Member

The CI is very flaky this morning... The most recent test run has a few configurations which died out due to failed checkouts, and some of the Oracle builds refused connections.

The one oracle build which worked - http://djangoci.com/job/django-pull-requests/814/database=oracle11,python=python2.7/console - seems to have a bunch of errors which I know I fixed in 86c2404, as well as a strange collection of unrelated failures largely seeming to result from a language leak. I guess this might happen where some other test is failing.

I'm going to try to retrigger the build and see if it is any happier second time round.

Member

mjtamlyn commented Aug 26, 2014

The CI is very flaky this morning... The most recent test run has a few configurations which died out due to failed checkouts, and some of the Oracle builds refused connections.

The one oracle build which worked - http://djangoci.com/job/django-pull-requests/814/database=oracle11,python=python2.7/console - seems to have a bunch of errors which I know I fixed in 86c2404, as well as a strange collection of unrelated failures largely seeming to result from a language leak. I guess this might happen where some other test is failing.

I'm going to try to retrigger the build and see if it is any happier second time round.

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
Historically, Django provided a metaclass called ``SubfieldBase`` which
always called :meth:`~Field.to_python` on assignment. This did not play
nicely with custom database transformations, aggregation or values queries,

This comment has been minimized.

@timgraham

timgraham Aug 26, 2014

Member

comma after aggregation
"it has" on next line

@timgraham

timgraham Aug 26, 2014

Member

comma after aggregation
"it has" on next line

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
complex than strings, dates, integers or floats, then you'll need to override
:meth:`~Field.to_python`. As a general rule, the method should deal gracefully
with any of the following arguments:
complex than strings, dates, integers or floats, then you may need to override

This comment has been minimized.

@timgraham

timgraham Aug 26, 2014

Member

comma after integers

@timgraham

timgraham Aug 26, 2014

Member

comma after integers

@timgraham

View changes

Show outdated Hide outdated docs/howto/custom-model-fields.txt
``value`` as an argument, like :meth:`~Field.to_python` and
:meth:`~Field.get_prep_value`, should handle the case when ``value`` is
``None``.
For the ``to_python()``, if anything goes wrong during value conversion, you

This comment has been minimized.

@timgraham

timgraham Aug 26, 2014

Member

chop "the"

@timgraham

timgraham Aug 26, 2014

Member

chop "the"

@timgraham

View changes

Show outdated Hide outdated docs/ref/models/fields.txt
The default implementation returns ``value``, which is the common case
when the database backend already returns the correct Python type.
This method is not used for most Django fields as the database backend

This comment has been minimized.

@timgraham

timgraham Aug 26, 2014

Member

Django -> built-in?

@timgraham

timgraham Aug 26, 2014

Member

Django -> built-in?

converters.append(self.convert_datefield_value)
elif internal_type == 'TimeField':
converters.append(self.convert_timefield_value)
converters.append(self.convert_empty_values)

This comment has been minimized.

@shaib

shaib Aug 26, 2014

Member

Can we avoid this for non-string fields?

@shaib

shaib Aug 26, 2014

Member

Can we avoid this for non-string fields?

This comment has been minimized.

@mjtamlyn

mjtamlyn Sep 3, 2014

Member

Is there a reliable list way of determining which ones count? Several fields are stored as strings...

@mjtamlyn

mjtamlyn Sep 3, 2014

Member

Is there a reliable list way of determining which ones count? Several fields are stored as strings...

@shaib

View changes

Show outdated Hide outdated django/db/backends/oracle/base.py
def convert_binaryfield_value(self, value, field):
if isinstance(value, Database.LOB):
value = value.read()

This comment has been minimized.

@shaib

shaib Aug 26, 2014

Member

call force_bytes() here (parallel to above function)?

@shaib

shaib Aug 26, 2014

Member

call force_bytes() here (parallel to above function)?

@shaib

View changes

Show outdated Hide outdated django/db/backends/oracle/base.py
# Convert 1 or 0 to True or False
elif value in (1, 0) and field and field.get_internal_type() in ('BooleanField', 'NullBooleanField'):
if value is None and field.empty_strings_allowed:
value = ''

This comment has been minimized.

@shaib

shaib Aug 26, 2014

Member

returning b'' for BinaryField (as original code did) should be required under Python 3, IIRC.

@shaib

shaib Aug 26, 2014

Member

returning b'' for BinaryField (as original code did) should be required under Python 3, IIRC.

converters = {}
index_extra_select = len(self.query.extra_select)
for i, field in enumerate(fields):
if field:

This comment has been minimized.

@shaib

shaib Aug 26, 2014

Member

When is field falsey?

@shaib

shaib Aug 26, 2014

Member

When is field falsey?

This comment has been minimized.

@mjtamlyn

mjtamlyn Sep 3, 2014

Member

Under certain circumstances with .extra() and annotation I believe.

@mjtamlyn

mjtamlyn Sep 3, 2014

Member

Under certain circumstances with .extra() and annotation I believe.

@shaib

This comment has been minimized.

Show comment
Hide comment
@shaib

shaib Aug 27, 2014

Member

Great work, overall. The amount of clean-up is impressive.

One optimization proposal: it seems that connection.get_db_converters(self, internal_type) would benefit from memoizing, or even explicitly building a dictionary internal_type->converter_list for all the core internal_types.

Member

shaib commented Aug 27, 2014

Great work, overall. The amount of clean-up is impressive.

One optimization proposal: it seems that connection.get_db_converters(self, internal_type) would benefit from memoizing, or even explicitly building a dictionary internal_type->converter_list for all the core internal_types.

@shaib

View changes

Show outdated Hide outdated django/db/backends/oracle/compiler.py
@@ -48,7 +32,7 @@ def as_sql(self, with_limits=True, with_col_aliases=False):
high_where = ''
if self.query.high_mark is not None:
high_where = 'WHERE ROWNUM <= %d' % (self.query.high_mark,)
sql = 'SELECT * FROM (SELECT ROWNUM AS "_RN", "_SUB".* FROM (%s) "_SUB" %s) WHERE "_RN" > %d' % (sql, high_where, self.query.low_mark)
sql = 'SELECT * FROM (SELECT "_SUB".* FROM (%s) "_SUB", ROWNUM AS "_RN" %s) WHERE "_RN" > %d' % (sql, high_where, self.query.low_mark)

This comment has been minimized.

@shaib

shaib Aug 27, 2014

Member

SELECT * FROM (SELECT "_SUB".*, ROWNUM AS "_RN" FROM (%s) "_SUB" %s) WHERE ...

(ROWNUM AS "_RN" should be part of the SELECT clause, not FROM clause).

@shaib

shaib Aug 27, 2014

Member

SELECT * FROM (SELECT "_SUB".*, ROWNUM AS "_RN" FROM (%s) "_SUB" %s) WHERE ...

(ROWNUM AS "_RN" should be part of the SELECT clause, not FROM clause).

@shaib

This comment has been minimized.

Show comment
Hide comment
@shaib

shaib Aug 27, 2014

Member

It seems all the specific converters are defined as DatabaseOperations methods for no good reason (they don't use self, and aren't referenced outside of get_db_converters()). If the intention is to provide hooks for inheritance, I think it's cleaner to make them classmethods. Other than that, they could probably be left as independent or nested functions, and in many cases even lambdas.

Member

shaib commented Aug 27, 2014

It seems all the specific converters are defined as DatabaseOperations methods for no good reason (they don't use self, and aren't referenced outside of get_db_converters()). If the intention is to provide hooks for inheritance, I think it's cleaner to make them classmethods. Other than that, they could probably be left as independent or nested functions, and in many cases even lambdas.

@akaariai

This comment has been minimized.

Show comment
Hide comment
@akaariai

akaariai Aug 28, 2014

Member

The code looks good to me. We will want to do further cleanup later on (better aggregate handling and date queries), but that can wait for separate patch.

Two docs comments:

  1. Documentation should mention that Field doesn't even implement from_db_value, that is you can't call super().from_db_value(). The docs mention that from_db_value is not used by most core fields, but this doesn't make clear that for example IntegerField does not have attribute from_db_value.
  2. The deprecation documentation should also mention that conversions on assignment do not happen with from_db_value. This could be a significant change for users, so this should be emphasized. (That is, implementing from_db_value might not be enough).
    EDIT: Just to make clear, I mean doing "obj.custom_field = somevalue" calls to_python with SubfieldBase, but not with from_db_value. If this turns out to be a large problem we could implement a generic way for fields to call to_python when assignment happens. But, I don't expect this to be a big problem, so lets wait and see if we get a lot of complaints...
Member

akaariai commented Aug 28, 2014

The code looks good to me. We will want to do further cleanup later on (better aggregate handling and date queries), but that can wait for separate patch.

Two docs comments:

  1. Documentation should mention that Field doesn't even implement from_db_value, that is you can't call super().from_db_value(). The docs mention that from_db_value is not used by most core fields, but this doesn't make clear that for example IntegerField does not have attribute from_db_value.
  2. The deprecation documentation should also mention that conversions on assignment do not happen with from_db_value. This could be a significant change for users, so this should be emphasized. (That is, implementing from_db_value might not be enough).
    EDIT: Just to make clear, I mean doing "obj.custom_field = somevalue" calls to_python with SubfieldBase, but not with from_db_value. If this turns out to be a large problem we could implement a generic way for fields to call to_python when assignment happens. But, I don't expect this to be a big problem, so lets wait and see if we get a lot of complaints...
@manfre

View changes

Show outdated Hide outdated django/db/backends/__init__.py
return int(value)
return value
def get_db_converters(self, internal_type):
return []

This comment has been minimized.

@manfre

manfre Aug 28, 2014

Contributor

A doc string is needed for get_db_converters so any 3rd party backends know exactly what is expected from this method

@manfre

manfre Aug 28, 2014

Contributor

A doc string is needed for get_db_converters so any 3rd party backends know exactly what is expected from this method

Fixed #18757, #14462, #21565 -- Reworked database-python type convers…
…ions

Complete rework of translating data values from database

Deprecation of SubfieldBase, removal of resolve_columns and
convert_values in favour of a more general converter based approach and
public API Field.from_db_value(). Now works seamlessly with aggregation,
.values() and raw queries.

Thanks to akaariai in particular for extensive advice and inspiration,
also to shaib, manfre and timograham for their reviews.
@mjtamlyn

This comment has been minimized.

Show comment
Hide comment
@mjtamlyn

mjtamlyn Sep 3, 2014

Member

Committed in e910340

Member

mjtamlyn commented Sep 3, 2014

Committed in e910340

@mjtamlyn mjtamlyn closed this Sep 3, 2014

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment