Cookbook: USA Today diversity index.
onyxfish committed Sep 4, 2015
1 parent b578e3b commit f41b806
Showing 2 changed files with 33 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG
@@ -1,6 +1,7 @@
0.7.0
-----

* Cookbook: USA Today diversity index.
* Cookbook: filter to top x%. (#47)
* Cookbook: fuzzy string search example. (#176)
* Values to coerce to true/false can now be overridden for BooleanType.
32 changes: 32 additions & 0 deletions docs/cookbook/calculations.rst
@@ -138,3 +138,35 @@ This code can now be applied to any :class:`.Table` just as any other :class:`.C
])
The resulting column will contain an integer measuring the edit distance between the value in the column and the comparison string.
USA Today Diversity Index
=========================
The `USA Today Diversity Index <http://www.usatoday.com/story/news/nation/2014/10/21/diversity-index-data-how-we-did-report/17432103/>`_ is a widely cited method for evaluating the racial diversity of a given area. Using a custom :class:`.Computation` makes it simple to calculate.
Assuming that your data has a column for the total population, one column for the population of each race, and a final column for the Hispanic population, you can implement the diversity index like this:
.. code-block:: python

    class USATodayDiversityIndex(agate.Computation):
        def get_computed_column_type(self, table):
            return agate.NumberType()

        def run(self, row):
            # Sum of squared population shares across the race columns.
            race_squares = 0

            for race in ['white', 'black', 'asian', 'american_indian', 'pacific_islander']:
                race_squares += (row[race] / row['population']) ** 2

            # Squared shares of the Hispanic and non-Hispanic populations.
            hispanic_squares = (row['hispanic'] / row['population']) ** 2
            hispanic_squares += (1 - (row['hispanic'] / row['population'])) ** 2

            # Scale the result to a 0-100 index.
            return (1 - (race_squares * hispanic_squares)) * 100
We apply the diversity index like any other computation:
.. code-block:: python

    with_index = table.compute([
        ('diversity_index', USATodayDiversityIndex())
    ])
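To sanity-check the arithmetic without agate, the same formula can be sketched as a plain function over a dictionary. This is only an illustration: ``diversity_index``, ``split_area`` and ``uniform_area`` are hypothetical names, the population figures are invented, and the sketch assumes Python 3, where ``/`` is true division.

```python
RACES = ['white', 'black', 'asian', 'american_indian', 'pacific_islander']


def diversity_index(row):
    """Apply the same formula as the computation above to a plain dict."""
    population = row['population']

    # Sum of squared population shares across the race columns.
    race_squares = sum((row[race] / population) ** 2 for race in RACES)

    # Squared shares of the Hispanic and non-Hispanic populations.
    hispanic_share = row['hispanic'] / population
    hispanic_squares = hispanic_share ** 2 + (1 - hispanic_share) ** 2

    return (1 - race_squares * hispanic_squares) * 100


# Invented example: two races of equal size, half the population Hispanic.
split_area = {
    'population': 100, 'white': 50, 'black': 50, 'asian': 0,
    'american_indian': 0, 'pacific_islander': 0, 'hispanic': 50,
}

# Invented example: a single race and no Hispanic population.
uniform_area = {
    'population': 100, 'white': 100, 'black': 0, 'asian': 0,
    'american_indian': 0, 'pacific_islander': 0, 'hispanic': 0,
}

print(diversity_index(split_area))    # 75.0
print(diversity_index(uniform_area))  # 0.0
```

A completely uniform area scores 0, and the score rises as the population is spread more evenly across groups.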
