You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Right now we keep track of the part IDs in the assignment, so that if you started with nodes assigned to "DISTRICT_1", "DISTRICT_2", etc., thats what you get for every step in the chain. One consequence of that is that updaters generally return dictionaries with those district labels as keys.
We should instead assign districts integer indices 0, ..., n and have our updaters return tuples.
99% of the time we're only interested in the values of the dictionary anyway, so it will remove a step in the output collection process. It also is more appropriate to use an immutable data structure here. And on the theoretical side, the labelings of districts aren't really canonical, especially when using ReCom which completely redraws two districts at each step. So deemphasizing the labels seems like a good move. I think it might also make it cleaner for saving outputs in DataFrames or xarray data structures.
This change will probably break everyone's code.
The text was updated successfully, but these errors were encountered:
maxhully
changed the title
Updaters should return tuples instead of dictionaries
Updaters should return pandas.Series instead of dictionaries
Apr 11, 2019
OK, my new perspective on this is that we should try to make things as interoperable with pandas or numpy as possible. Our users will end up learning pandas anyways (most likely), and on top of leveraging pandas's functionality, I think it's a good idea for us to minimize the number of new weird gerrychain idioms that a person has to learn.
Right now we keep track of the part IDs in the assignment, so that if you started with nodes assigned to "DISTRICT_1", "DISTRICT_2", etc., thats what you get for every step in the chain. One consequence of that is that updaters generally return dictionaries with those district labels as keys.
We should instead assign districts integer indices 0, ..., n and have our updaters return tuples.
99% of the time we're only interested in the values of the dictionary anyway, so it will remove a step in the output collection process. It also is more appropriate to use an immutable data structure here. And on the theoretical side, the labelings of districts aren't really canonical, especially when using ReCom which completely redraws two districts at each step. So deemphasizing the labels seems like a good move. I think it might also make it cleaner for saving outputs in DataFrames or xarray data structures.
This change will probably break everyone's code.
The text was updated successfully, but these errors were encountered: