Using homogenize() after denormalize() results in some rows without row_names #756

@Kirkman

I recently noticed that if you call table.homogenize() to fill in missing rows after running table.denormalize(), the new filler rows have no row_names, while the original rows do.

This causes errors later when you call methods such as .order_by() on the resulting table.
Because some rows have row_names and others don't, you get this error:

File "/python3.8/site-packages/agate/table/order_by.py", line 46, in <listcomp>
    row_names = [self._row_names[i] for i in indices]
IndexError: tuple index out of range
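
To make the failure mode concrete, here is a minimal pure-Python sketch of the mismatch. It is a simplified stand-in for the table state, not agate's actual code: `rows`, `row_names`, and `reorder_names` are hypothetical names, and only the list comprehension mirrors the line in the traceback.

```python
# Simplified model of the table after denormalize() + homogenize().
# This is NOT agate's implementation, only an illustration of the mismatch.
rows = [
    ("2021-04-01", 700, 700),
    ("2021-04-02", 800, 800),
    ("2021-04-04", 1000, 1000),
    ("2021-04-05", 1100, 1100),
    ("2021-04-03", 0, 0),  # filler row, appended last
]

# Only the four original rows have names; the filler row has none.
row_names = ("2021-04-01", "2021-04-02", "2021-04-04", "2021-04-05")

# Sort row positions by the date column, as an ordering step would.
indices = sorted(range(len(rows)), key=lambda i: rows[i][0])


def reorder_names(names, order):
    """Hypothetical helper with the same shape as the traceback's comprehension."""
    return [names[i] for i in order]


try:
    reorder_names(row_names, indices)
    error = None
except IndexError as exc:
    # Position 4 (the filler row) has no entry in the 4-element tuple.
    error = str(exc)

print(indices)
print(error)
```

The sorted positions include index 4 for the filler row, but the names tuple only covers indices 0 through 3, which is why the lookup fails.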

This error can be reproduced with the following test code:

from decimal import Decimal

import agate

data = [
    {'date': '2021-04-01', 'name': 'England', 'distributed': 1000, 'administered': 700},
    {'date': '2021-04-02', 'name': 'England', 'distributed': 1100, 'administered': 800},
    {'date': '2021-04-04', 'name': 'England', 'distributed': 1300, 'administered': 1000},
    {'date': '2021-04-05', 'name': 'England', 'distributed': 1400, 'administered': 1100},
    {'date': '2021-04-01', 'name': 'Mexico', 'distributed': 1000, 'administered': 700},
    {'date': '2021-04-02', 'name': 'Mexico', 'distributed': 1100, 'administered': 800},
    {'date': '2021-04-04', 'name': 'Mexico', 'distributed': 1300, 'administered': 1000},
    {'date': '2021-04-05', 'name': 'Mexico', 'distributed': 1400, 'administered': 1100},
]

# Keep 'date' as text so the values are not parsed as dates.
table = agate.Table.from_object(
    data,
    column_types=agate.TypeTester(force={
        'date': agate.Text(),
    })
)

# One row per date, one column per country; this also sets row_names.
table = table.denormalize(
    key='date',
    property_column='name',
    value_column='administered',
)

# Add a filler row for the missing date, 2021-04-03.
table = table.homogenize(
    'date',
    ['2021-04-01', '2021-04-02', '2021-04-03', '2021-04-04', '2021-04-05'],
    [Decimal(0), Decimal(0)]
)

# As reported, this call fails with the IndexError shown above.
table = table.order_by('date')

I don't use row_names myself, so I don't know why .denormalize() needs to generate them automatically. I assume that behavior should be preserved, so I guess the fix would be to make .homogenize() generate row_names for the rows it adds as well?
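
As a rough illustration of that idea, the sketch below extends an existing tuple of row names with one name per filler row. It assumes row names simply mirror the key column value; `extend_row_names` and its parameters are hypothetical and not part of agate.

```python
def extend_row_names(row_names, filler_rows, key_index):
    """Return row_names plus one name per filler row.

    Hypothetical helper: assumes each row's name is its key column value.
    Tables without row names stay without them.
    """
    if row_names is None:
        return None
    return tuple(row_names) + tuple(row[key_index] for row in filler_rows)


# Names of the four original rows.
existing_names = ("2021-04-01", "2021-04-02", "2021-04-04", "2021-04-05")

# The row added for the missing date.
filler_rows = [("2021-04-03", 0, 0)]

all_names = extend_row_names(existing_names, filler_rows, key_index=0)
print(all_names)
```

With names and rows back in step, an index-based lookup over all five rows no longer runs past the end of the tuple.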
