# Compressing tabular PLSS data with Python 

This notebook is available here: https://github.com/MD-Troyer/blog-notebooks

Generating raw PLSS (Public Land Survey System) data for a polygon is pretty trivial: a simple spatial intersection with a PLSS grid in ArcMap will do the trick. What we typically want, however, is that list of Twn/Rng/Sec/QQ1/QQ2 rows consolidated into completely represented units. For example, if we have all 16 quarter-quarters for a single section, we would like to represent that as a single row [as 'entire section'] rather than listing all 16 QQ rows individually. Similarly, if we have all 4 quarter-quarters for a single quarter section, we would like to represent that as a single row [as 'entire quarter'] rather than listing all 4 QQ rows individually.
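
To make the rule concrete, here is a minimal sketch of the two completeness checks. It is only an illustration: the helper names and the sample data are made up, and it assumes each row carries a (QQ1, QQ2) pair where QQ1 is the quarter-quarter and QQ2 is its parent quarter, both drawn from NE/NW/SE/SW.

```python
from itertools import product

# The four quarter labels (assumed to be the only valid values)
QUARTERS = ('NE', 'NW', 'SE', 'SW')

# Every (QQ1, QQ2) pair a complete section contains: 4 x 4 = 16
FULL_SECTION = set(product(QUARTERS, QUARTERS))


def is_entire_section(pairs):
    """True if the (QQ1, QQ2) pairs cover all 16 quarter-quarters."""
    return set(pairs) == FULL_SECTION


def is_entire_quarter(pairs, quarter):
    """True if all four quarter-quarters of `quarter` (QQ2) are present."""
    present = {qq1 for qq1, qq2 in pairs if qq2 == quarter}
    return present == set(QUARTERS)


# Hypothetical sample: the full NE quarter plus one stray quarter-quarter
sample = [('NE', 'NE'), ('NW', 'NE'), ('SE', 'NE'), ('SW', 'NE'),
          ('NE', 'SW')]

print(is_entire_section(sample))        # False
print(is_entire_quarter(sample, 'NE'))  # True
print(is_entire_quarter(sample, 'SW'))  # False
```

Comparing sets rather than counting rows means a duplicated row cannot make an incomplete section look complete.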

The following illustrates how to compress PLSS data in this way with Python.

In [1]:
from collections import defaultdict
import numpy as np
import pandas as pd
import csv
import copy

In [2]:
csv_path = \
    r'C:\Users\michael\Documents\_projects\compress-plss\compress-plss.csv'

# We will use a default dictionary init with an empty list to sort the rows
plss = defaultdict(list)

# Read the csv row-by-row
reader = csv.reader(open(csv_path, 'r'))
# Skip the header row (next(reader) works in both Python 2.6+ and 3)
next(reader)

source_rows = []

# Source table format:
# | PM | Twn | Rng | Sec | QQ1 | QQ2 |
    
# For each row
for row in reader:
    # Keep the source rows for comparison
    source_rows.append(row)
    try:
        # Get the PM/Twn/Rng/Sec
        pm, twn, rng, sec = row[0], row[1], row[2], row[3]
        # Use the concatenated PM-Twn-Rng-Sec as the defaultdict key and
        # append the [QQ1, QQ2] pair to that key's list; a missing key
        # starts out as an empty list
        plss['-'.join([pm, twn, rng, sec])].append([row[4], row[5]])
    except IndexError:
        pass  # Row is too short to hold all six fields, so skip it
    
print_list = []

# For each PM-Twn-Rng-Sec in defaultdict
for section in plss.keys():
    # Split the key back up to PM, Twn, Rng, Sec as a list
    splits = section.split('-')
    
    # Count the number of (QQ1, QQ2) tuples associated with each PM/Twn/Rng/Sec
    if len(plss[section]) == 16:
        # If 16 we have a whole section
        # add ['entire', 'section'] to our split list 
        # queue for writing to csv in print_list
        splits.extend(['entire', 'section'])
        print_list.append(splits)
        # Move on to next section
        continue
        
    # Create another default dictionary init with an empty list 
    # to sort the quarters individually
    quarters = defaultdict(list)
    
    # Each item in plss[section] is a [QQ1, QQ2] pair
    for quarter in plss[section]:
        # Use QQ2 (the quarter) as the key and append the whole
        # [QQ1, QQ2] pair to its list; a missing key starts out empty
        quarters[quarter[1]].append(quarter)
        
    # Count the [QQ1, QQ2] pairs (q_list) collected for each quarter (QQ2)
    for quarter, q_list in quarters.items():
        # if we have 4 we have an entire quarter
        if len(q_list) == 4:
            splits_ = copy.copy(splits)
            # add ['entire', quarter] to a copy of split list 
            # so we don't step all over our own list (which is iterating above)
            # and queue for writing to csv in print_list
            splits_.extend(['entire', quarter])
            print_list.append(splits_)
        else: 
            # If we don't have all 4 we can't compress - keep them all
            for q in q_list:
                splits_ = copy.copy(splits)
                splits_.extend(q)
                print_list.append(splits_)

In [3]:
print('Source Rows:')
for row in source_rows:
    print(row)
print('')
print('Final Rows:')
for row in print_list:
    print(row)

Source Rows:
['6th', '15S', '71W', '1', 'NE', 'NE']
['6th', '15S', '71W', '1', 'NW', 'NE']
['6th', '15S', '71W', '1', 'SE', 'NE']
['6th', '15S', '71W', '1', 'SW', 'NE']
['6th', '15S', '71W', '1', 'NE', 'NW']
['6th', '15S', '71W', '1', 'NW', 'NW']
['6th', '15S', '71W', '1', 'SE', 'NW']
['6th', '15S', '71W', '1', 'SW', 'NW']
['6th', '15S', '71W', '1', 'NE', 'SE']
['6th', '15S', '71W', '1', 'NW', 'SE']
['6th', '15S', '71W', '1', 'SE', 'SE']
['6th', '15S', '71W', '1', 'SW', 'SE']
['6th', '15S', '71W', '1', 'NE', 'SW']
['6th', '15S', '71W', '1', 'NW', 'SW']
['6th', '15S', '71W', '1', 'SE', 'SW']
['6th', '15S', '71W', '1', 'SW', 'SW']
['6th', '16S', '71W', '2', 'NE', 'NE']
['6th', '16S', '71W', '2', 'NW', 'NE']
['6th', '16S', '71W', '2', 'SE', 'NE']
['6th', '16S', '71W', '2', 'SW', 'NE']
['6th', '16S', '71W', '2', 'NE', 'NW']
['6th', '16S', '71W', '2', 'NW', 'NW']
['6th', '16S', '71W', '2', 'SE', 'NW']
['6th', '16S', '71W', '2', 'SW', 'NW']
['6th', '16S', '71W', '2', 'NE', 'SE']
['6th', '16S

Much better!
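
For reuse outside the notebook, the same logic can be wrapped in a function. The following is a minimal Python 3 sketch, not the notebook's own code: `compress_plss` and `write_plss` are hypothetical names, the sample rows are made up, and it assumes every QQ label is one of NE/NW/SE/SW. It stores the pairs in sets, so duplicate source rows cannot inflate the counts.

```python
import csv
from collections import defaultdict

QUARTERS = ('NE', 'NW', 'SE', 'SW')


def compress_plss(rows):
    """Collapse [PM, Twn, Rng, Sec, QQ1, QQ2] rows into complete units."""
    # Group the distinct (QQ1, QQ2) pairs under each (PM, Twn, Rng, Sec)
    sections = defaultdict(set)
    for row in rows:
        if len(row) < 6:
            continue  # Skip rows too short to hold all six fields
        sections[tuple(row[:4])].add((row[4], row[5]))

    out = []
    for key, pairs in sections.items():
        prefix = list(key)
        if len(pairs) == 16:
            # All 16 quarter-quarters present: one row for the section
            out.append(prefix + ['entire', 'section'])
            continue
        # Group quarter-quarters (QQ1) under their parent quarter (QQ2)
        by_quarter = defaultdict(set)
        for qq1, qq2 in pairs:
            by_quarter[qq2].add(qq1)
        for quarter in sorted(by_quarter):
            qqs = by_quarter[quarter]
            if len(qqs) == 4:
                # All 4 quarter-quarters present: one row for the quarter
                out.append(prefix + ['entire', quarter])
            else:
                # Incomplete quarter: keep each quarter-quarter row
                out.extend(prefix + [qq1, quarter] for qq1 in sorted(qqs))
    return out


def write_plss(path, rows):
    """Write compressed rows to a csv file with a header row."""
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['PM', 'Twn', 'Rng', 'Sec', 'QQ1', 'QQ2'])
        writer.writerows(rows)


# Made-up sample: one complete section, one complete quarter, one stray QQ
rows = [['6th', '15S', '71W', '1', qq1, qq2]
        for qq2 in QUARTERS for qq1 in QUARTERS]
rows += [['6th', '16S', '71W', '2', qq1, 'NE'] for qq1 in QUARTERS]
rows.append(['6th', '16S', '71W', '2', 'NE', 'NW'])

result = compress_plss(rows)
for row in result:
    print(row)
```

With this sample, the 21 input rows collapse to three: one 'entire section' row, one 'entire NE' row, and the single leftover quarter-quarter.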