Closed
Description
I noticed recently that if you call table.homogenize() to fill in missing rows on a table produced by table.denormalize(), the new filler rows have no row_names, while the original rows still do.
This leads to errors later, when you invoke methods such as .order_by() on the resulting table.
Because the table then holds more rows than row_names, .order_by() fails with this error:
  File "/python3.8/site-packages/agate/table/order_by.py", line 46, in <listcomp>
    row_names = [self._row_names[i] for i in indices]
IndexError: tuple index out of range
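The failure mode can be illustrated without agate. This is only a minimal sketch, assuming (as the traceback suggests) that the table ends up with one more row than it has row names. The variable and function names are hypothetical and merely mimic the list comprehension shown in order_by.py.

```python
# Toy model: five rows after homogenize(), but only four row names
# carried over from denormalize(). All values here are illustrative.
rows = ['2021-04-01', '2021-04-02', '2021-04-04', '2021-04-05', '2021-04-03']
row_names = ('2021-04-01', '2021-04-02', '2021-04-04', '2021-04-05')

# Sorting yields one index per row, including index 4 for the filler row.
indices = sorted(range(len(rows)), key=lambda i: rows[i])


def reorder_names(names, indices):
    """Mimic the comprehension from the traceback (hypothetical helper)."""
    return [names[i] for i in indices]


try:
    reorder_names(row_names, indices)
    error = None
except IndexError as exc:
    # Index 4 has no matching entry in the four-item names tuple.
    error = str(exc)

print(error)
```

The lookup fails as soon as it reaches the index of the filler row, which is consistent with the IndexError above.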
This error can be reproduced with the following test code:
import agate
from decimal import Decimal

data = [
    {'date': '2021-04-01', 'name': 'England', 'distributed': 1000, 'administered': 700},
    {'date': '2021-04-02', 'name': 'England', 'distributed': 1100, 'administered': 800},
    {'date': '2021-04-04', 'name': 'England', 'distributed': 1300, 'administered': 1000},
    {'date': '2021-04-05', 'name': 'England', 'distributed': 1400, 'administered': 1100},
    {'date': '2021-04-01', 'name': 'Mexico', 'distributed': 1000, 'administered': 700},
    {'date': '2021-04-02', 'name': 'Mexico', 'distributed': 1100, 'administered': 800},
    {'date': '2021-04-04', 'name': 'Mexico', 'distributed': 1300, 'administered': 1000},
    {'date': '2021-04-05', 'name': 'Mexico', 'distributed': 1400, 'administered': 1100},
]

table = agate.Table.from_object(
    data,
    column_types=agate.TypeTester(force={
        'date': agate.Text(),
    })
)

table = table.denormalize(
    key='date',
    property_column='name',
    value_column='administered',
)

table = table.homogenize(
    'date',
    ['2021-04-01', '2021-04-02', '2021-04-03', '2021-04-04', '2021-04-05'],
    [Decimal(0), Decimal(0)]
)

table = table.order_by('date')
I don't use row_names myself, so I don't know why .denormalize() needs to generate them automatically. But I assume that preserving them is important, so I guess the fix would be to make .homogenize() generate row_names for the filler rows as well?
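To make the suggestion concrete, here is a rough sketch of the idea in plain Python. It is not agate's implementation: add_filler_rows is a hypothetical helper, rows are modelled as plain tuples whose first item is the key, and using the key as the row name is an assumption based on what .denormalize() appears to do.

```python
def add_filler_rows(rows, row_names, expected_keys, default_values):
    """Append a filler row, plus a matching row name, for each missing key.

    Hypothetical helper for illustration only; rows are tuples whose
    first item is the key value.
    """
    present = {row[0] for row in rows}
    new_rows = list(rows)
    new_names = list(row_names)
    for key in expected_keys:
        if key not in present:
            new_rows.append((key, *default_values))
            new_names.append(key)  # keep the names in step with the rows
    return new_rows, tuple(new_names)


# Illustrative values shaped like the denormalized table (date, England, Mexico).
rows = [
    ('2021-04-01', 700, 700),
    ('2021-04-02', 800, 800),
    ('2021-04-04', 1000, 1000),
    ('2021-04-05', 1100, 1100),
]
row_names = tuple(row[0] for row in rows)
expected = ['2021-04-01', '2021-04-02', '2021-04-03', '2021-04-04', '2021-04-05']

new_rows, new_names = add_filler_rows(rows, row_names, expected, (0, 0))
print(len(new_rows), len(new_names))
```

With the names kept in step with the rows, a later index-based reorder has a name for every row. Dropping the row names entirely whenever rows are added would avoid the mismatch too; which behaviour is preferable is a call for the maintainers.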