PHP Implementation of Jenks Natural Breaks Optimization for Choropleth Mapping
The Jenks natural breaks optimization method groups data points into classes so that the values within each class are as similar as possible and the classes are as different from one another as possible, which makes it well suited to choropleth mapping.
Here's how it works (from Wikipedia):
The method requires an iterative process. That is, calculations must be repeated using different breaks in the dataset to determine which set of breaks has the smallest in-class variance. The process is started by dividing the ordered data into groups. Initial group divisions can be arbitrary. There are four steps that must be repeated:
- Calculate the sum of squared deviations between classes (SDBC).
- Calculate the sum of squared deviations from the array mean (SDAM).
- Subtract the SDBC from the SDAM (SDAM-SDBC). This equals the sum of the squared deviations from the class means.
- After inspecting each of the SDBC, a decision is made to move one unit from the class with the largest SDBC toward the class with the lowest SDBC.
New class deviations are then calculated, and the process is repeated until the sum of the within class deviations reaches a minimal value.
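For a class spanning positions i through j of the ordered values, the sum of squared deviations from the class mean can be written as follows. This is a reconstruction that uses the notation defined below; it is not a formula taken from the original script.

```latex
\mathrm{SSD}_{i..j} = \sum_{n=i}^{j} \left( A[n] - \mathrm{mean}_{i..j} \right)^{2},
\qquad
\mathrm{mean}_{i..j} = \frac{1}{j - i + 1} \sum_{n=i}^{j} A[n]
```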
In the notation used for these calculations:
- A is the set of values that have been ordered from 1 to N.
- 1 ≤ i < j < N
- Mean i..j is the mean of the class bounded by i and j.
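The quantity being minimized can be sketched in a few lines of PHP. This is a minimal illustration, not the code from this repository: the function names `sumSquaredDeviations`, `withinClassDeviation`, and `jenksBreaksSketch` are hypothetical, PHP 7 or later is assumed, and the search uses a simple dynamic-programming pass over every possible split rather than the iterative "move one unit" procedure quoted above.

```php
<?php
// Hypothetical helpers for illustration only; not part of the original script.

/** Sum of squared deviations of $values from their own mean. */
function sumSquaredDeviations(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        return 0.0;
    }
    $mean = array_sum($values) / $n;
    $sum = 0.0;
    foreach ($values as $v) {
        $sum += ($v - $mean) ** 2;
    }
    return $sum;
}

/**
 * Total within-class squared deviation for an already sorted list.
 * $breaks holds the starting index of every class after the first.
 */
function withinClassDeviation(array $sorted, array $breaks): float
{
    $total = 0.0;
    $start = 0;
    $breaks[] = count($sorted); // close the last class
    foreach ($breaks as $end) {
        $total += sumSquaredDeviations(array_slice($sorted, $start, $end - $start));
        $start = $end;
    }
    return $total;
}

/**
 * Find the break indexes that minimize the within-class deviation.
 * Tries every split with dynamic programming; slow but easy to follow.
 */
function jenksBreaksSketch(array $values, int $numClasses): array
{
    sort($values); // sorts the local copy and reindexes from 0
    $n = count($values);
    if ($numClasses < 1 || $numClasses > $n) {
        throw new InvalidArgumentException('numClasses must be between 1 and count(values)');
    }
    // $best[$k][$j]: lowest deviation when the first $j values form $k classes.
    // $cut[$k][$j]: start index of the last class in that best arrangement.
    $best = [];
    $cut = [];
    for ($j = 1; $j <= $n; $j++) {
        $best[1][$j] = sumSquaredDeviations(array_slice($values, 0, $j));
        $cut[1][$j] = 0;
    }
    for ($k = 2; $k <= $numClasses; $k++) {
        for ($j = $k; $j <= $n; $j++) {
            $best[$k][$j] = INF;
            for ($i = $k - 1; $i < $j; $i++) {
                $cost = $best[$k - 1][$i]
                    + sumSquaredDeviations(array_slice($values, $i, $j - $i));
                if ($cost < $best[$k][$j]) {
                    $best[$k][$j] = $cost;
                    $cut[$k][$j] = $i;
                }
            }
        }
    }
    // Walk back through the recorded cuts to recover the break indexes.
    $breaks = [];
    $j = $n;
    for ($k = $numClasses; $k > 1; $k--) {
        $j = $cut[$k][$j];
        array_unshift($breaks, $j);
    }
    return $breaks;
}
```

For the sample values `[1, 2, 3, 10, 11, 12, 20, 21, 22]` and three classes, `jenksBreaksSketch` returns `[3, 6]`, meaning the second class starts at index 3 and the third at index 6. `withinClassDeviation` gives 6.0 for those breaks, compared with 101.25 for the arbitrary breaks `[2, 5]`. The repeated `array_slice` calls keep the sketch short at the cost of speed, so larger datasets would want prefix sums instead.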
I studied up on this and wrote it while working for a startup a few years ago, in 2009. I found that every choropleth mapping solution available at the time split the data poorly when asked to create a map. Unfortunately, very few implementations of Jenks exist outside of professional cartography packages.
When rewriting this, I chose to use the Google Charts API for our maps, so you should be able to use this to output a map directly, assuming you provide all the necessary parameters.
This was loosely based on another script written in French (which I don't speak very well at all). The original script had many bugs and wasn't as flexible as I wanted it to be, so I rewrote it.
This one works well and is flexible in terms of the datasets. It has been tested pretty thoroughly and, as far as I can tell, is correct.
- Need a chance to go over it again and make improvements and/or more comments.
- Bug test.
- Write example usage.
- See if the Google Maps stuff even works anymore.
- Make a blog post / homepage.
Open-source and free for use.
Copyright 2012 David Drake
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.