# This Notebook explores the SCAR GeoMAP dataset released in 2019
## Cox S.C., Smith Lyttle B. and the GeoMAP team (2019). Lower Hutt, New Zealand. GNS Science. Release v.201907.
### [Data Available Here](https://data.gns.cri.nz/ata_geomap/index.html?content=/mapservice/Content/antarctica/www/index.html)
### Notebook by Sam Elkind

Initially, I'll look at the data in terms of polygon counts. This section will be focused on examining the data schema and frequency of values occurring within specific fields. This investigation will focus on finding inconsistencies in the data attribution, but also could stimulate some discussion regarding relationships between columns.

Next, I'll look at the data in terms of polygon area and data attribution. How much surface water has been mapped? How much till has been mapped? How much outcropping rock is of Jurassic age?

### Configure packages, paths, and load data

In [1]:
import os
import geopandas as gpd
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display
import pprint as pp
from tabulate import tabulate

In [2]:
def plot_value_counts(field_name, values_to_plot, counts, counts_norm):
    fig, ax = plt.subplots(2, 1, figsize=(30,15))
    fig.tight_layout(pad=2.0)
    fig.subplots_adjust(top=.94)
    fig.suptitle(f"Frequency of {field_name} values", size=18)

    ax[0].set_title(field_name)
    ax[1].set_title(f"{field_name} normalized")
    for i, v in enumerate(counts[:values_to_plot]):
        ax[0].text(i - .5, v, str(v), color='black', fontweight='bold')
    for i, v in enumerate(counts_norm[:values_to_plot]):
        ax[1].text(i - .5, v, f"{str(v * 100)[:3]}%", color='black', fontweight='bold')
    ax[0].bar(counts.index[:values_to_plot], counts[:values_to_plot])
    ax[1].bar(counts_norm.index[:values_to_plot], counts_norm[:values_to_plot])

In [3]:
geol_path = f"{os.getcwd()}/data/ATA_SCAR_GeoMAP_geology.gdb"
print(geol_path)

/home/sam/geomap/data/ATA_SCAR_GeoMAP_geology.gdb


In [4]:
data = gpd.read_file(geol_path)

## Let's start by looking at the number of unique values for these two fields

In [5]:
display(data[["NAME", "DESCR"]].nunique())

NAME     666
DESCR    757
dtype: int64

## There are more descriptions than names, that kinda seems weird, I would expect to see a 1-1 relationship with these fields. Perhaps complexes with a varied lithology were given the same name value but different, more granular descriptions.

### Let's take a look at the unique pairs of values that occur.

In [6]:
unique_pairs = data[["NAME", "DESCR"]].drop_duplicates()
unique_pairs["pair_id"] = range(len(unique_pairs.index))

In [7]:
display(unique_pairs)

Unnamed: 0,NAME,DESCR,pair_id
0,marine sedimentary and metasedimentary rocks (...,unfossiliferous low grade regional metamorphic...,0
3,intermediate intrusive rocks (early Jurassic t...,intermediate intrusive rocks (early Jurassic t...,1
5,Paleozoic-Triassic metamorphic rock,regionally metamorphosed rocks ranging from Pa...,2
7,sedimentary rocks (Paleozic to mid-Jurassic),inferred sedimentary rocks and low-grade meta...,3
10,Antarctic Peninsula Volcanic Group,"calc-alkaline volcanic suite, lava flows predo...",4
...,...,...,...
94142,Shaw-Clemence Complex,"aluminous gneisses, quartz feldspathic gneisse...",797
94939,,younger till,798
94940,,older till,799
95112,,Orthopyroxene-biotite-quartz-plagioclase gneis...,800


#### Looks like there are a lot of names that have different descriptions. Let's see how many pairs have "None"s in the name column 

In [8]:
null_names = unique_pairs[(unique_pairs["NAME"].isnull()) | (unique_pairs["NAME"] == " ") | (unique_pairs["NAME"] == "")]

In [9]:
display(null_names)

Unnamed: 0,NAME,DESCR,pair_id
222,,regionally metamorphosed rocks ranging from Ar...,20
61780,,"Gabbro-diorite and melamonzogranite, coeval wi...",468
85559,,Orthopyroxene-quartz-feldspar gneiss (tonaliti...,670
85560,,Layerd biotite-garnet-quartz-feldspar gneiss; ...,671
85562,,Hornblende-clinopyroxene-orthopyroxene quartz ...,672
...,...,...,...
93976,,Bt and Hb-Bt granite plutons,795
94939,,younger till,798
94940,,older till,799
95112,,Orthopyroxene-biotite-quartz-plagioclase gneis...,800


In [10]:
print(f"{unique_pairs.shape[0] - null_names.shape[0]} unique pairs have a NAME without a value, but a description with a value")
print(f"{data[data['NAME'].isnull()].shape[0]} polygons have a NAME without a value. Let's get a list of the unique sources for these polygons so we can check them if needed")

737 unique pairs have a NAME without a value, but a description with a value
5119 polygons have a NAME without a value. Let's get a list of the unique sources for these polygons so we can check them if needed


In [11]:
null_name_sources = data[(data['NAME'].isnull()) | (data["NAME"] == " ") | (data["NAME"] == "")][["SOURCECODE", "MAPSYMBOL", "NAME", "SOURCE"]]

In [12]:
display(null_name_sources.drop_duplicates(["SOURCECODE","SOURCE"]))

Unnamed: 0,SOURCECODE,MAPSYMBOL,NAME,SOURCE
222,m,?n,,Thomson & Harris 1979_Southern Graham Land
14398,m,?n,,Thomson et al. 1982 North Palmer Land
20274,m,?n,,Burton-Johnson & Riley 2015
61780,GHgra,EOd,,Pertusati et al. 2012
85559,Pp,Rzn,,Sheraton 1985. Geology of Enderby Land and Wes...
...,...,...,...,...
93976,AR-PPg1,ALg,,"Mikhalsky etal 2001, Prince Charles Mountains"
94939,Ty,Czs,,Ishikawa et al. 2000. Geological map of Mount ...
94940,To,Czs,,Ishikawa et al. 2000. Geological map of Mount ...
95112,Ppp,Rzn,,Sheraton 1985. Geology of Enderby Land and Wes...


### A significant number of polys have no name but have a source code. Do all of these source codes lack a name?

## Let's move on to see which names have multiple descriptions

In [13]:
# We want a sorted list of all the NAME values for which there are multiple descriptions. They are sorted by the number of different descriptions
name_descr_sets = sorted([(i, data[data["NAME"] == i]["DESCR"].unique()) for i in data["NAME"].unique() if len(data[data["NAME"] == i]["DESCR"].unique()) > 1], key=lambda x: len(x[1]), reverse=True)

In [53]:
print(f"{len(name_descr_sets)} names have more than one description")
display(name_descr_sets)

35 names have more than one description


[('Marie Byrd Land Volcanics: basalt',
  array(['Hawaiite', 'Basanite', 'Basanite, hawaiite, tephrite',
         'Basalt, basaltic hyaloclastite, cinder cone, tuff cone',
         'Basaltic hyaloclastite', 'Alkali basalt & hawaiite',
         'Basalt tuff cone', 'Basanite flows and pyroclastics',
         'Basalt flows and basaltic hydrovolcanic rocks',
         'Tephrite, basanite', 'Basanite, tephritoid',
         'Basalt, hawaiite, basaltic hyaloclastite',
         'Basalt, basaltic pyroclastics'], dtype=object)),
 ('Melbourne volcanic province',
  array(['Peralkaline trachyte, quartz-trachyte, peralkaline rhyolite',
         'Phonolite',
         'Variably differentiated alkali volcanics forming major and composite strato-volcanoes and other minor centres; alkali-basanite to trachyte-rhyolite',
         'Alkali basalt, basanite, hawaiite',
         'Mugearite, benmoreite, trachyandesite', 'Trachyte',
         'Trachyte with tristanite and trachyandesite', 'Basanite',
         'Basa

#### Starting from the top...

In [15]:
pp.pp(name_descr_sets[0])
print(len(name_descr_sets[0][1]))

('Marie Byrd Land Volcanics: basalt',
 array(['Hawaiite', 'Basanite', 'Basanite, hawaiite, tephrite',
       'Basalt, basaltic hyaloclastite, cinder cone, tuff cone',
       'Basaltic hyaloclastite', 'Alkali basalt & hawaiite',
       'Basalt tuff cone', 'Basanite flows and pyroclastics',
       'Basalt flows and basaltic hydrovolcanic rocks',
       'Tephrite, basanite', 'Basanite, tephritoid',
       'Basalt, hawaiite, basaltic hyaloclastite',
       'Basalt, basaltic pyroclastics'], dtype=object))
13


Makes sense that this name has multiple descriptions. Let's see how many source codes correspond with this name

In [40]:
mbl_basalt_code_mask = data["NAME"] == name_descr_sets[0][0]
mbl_cols = ["SOURCECODE", "DESCR", "MAPSYMBOL", "NAME", "SOURCE"]
mbl_unique_cols = ["SOURCECODE", "DESCR"]
mbl_basalt_codes = data[mbl_basalt_code_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

In [41]:
print(f"There are {len(mbl_basalt_codes)} unique source codes for the polygons with the name 'Marie Byrd Land Volcanics: basalt'. There are 13 unique descriptions.")
display(mbl_basalt_codes)

There are 17 unique source codes for the polygons with the name 'Marie Byrd Land Volcanics: basalt'. There are 13 unique descriptions.


Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
41977,Pb_LeM8b2,Alkali basalt & hawaiite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.8B)
42233,Pb_LeM2c1,Basalt flows and basaltic hydrovolcanic rocks,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.2C)
42094,Pb_LM84a,Basalt tuff cone,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 1984
41839,Pb_Hart97a,"Basalt, basaltic hyaloclastite, cinder cone, t...",Czb,Marie Byrd Land Volcanics: basalt,Hart et al. 1997
42600,Pb_LMB16B.3a,"Basalt, basaltic pyroclastics",Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16B.3)
42536,Pb_LMB16D.1_ins,"Basalt, hawaiite, basaltic hyaloclastite",Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16D.1)_Inset
41857,Pb_LMB16D.1b,Basaltic hyaloclastite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16D.1)
42479,Pb_Kip14a,Basanite,Czb,Marie Byrd Land Volcanics: basalt,Kipf et al. 2014
42936,Pb_LeM6b1,Basanite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.6B)
41804,Pb_LeM4b1,Basanite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.4B)


#### The is an example where the source codes are likely derived from sample labels in some sort of petrology study. The sources cite specific figures. Perhaps some sort of standardization could be done on the SOURCECODE field from a more generalized mapping source that could be cited alongside these sources. On the other hand, MAPSYMBOL already serves this purpose
#### From another perspective, this is a case that demonstrates the value of the GeoMap project, we have successfully captured multiple levels of granularity of geological classification, captured within the data schema of source code, description, name, and map symbol. An even deeper description could likely be found for each of the source codes within the cited sources.

### Just for curiosity's sake, let's see how many names fall under the MBL volcanics category

In [43]:
mbl_volcanics_mask_string = "Marie Byrd Land Volcanics"
mbl_volcanics_name_mask = data["NAME"].str.contains(mbl_volcanics_mask_string).any(level=0)
mbl_volcanics_names = data[mbl_volcanics_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

In [44]:
print(f"there are {len(mbl_volcanics_names)} unique (sourcecode, description) combinations that have the string '{mbl_volcanics_mask_string}' in the NAME ")
display(mbl_volcanics_names)

there are 51 unique (sourcecode, description) combinations that have the string 'Marie Byrd Land Volcanics' in the NAME 


Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
41977,Pb_LeM8b2,Alkali basalt & hawaiite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.8B)
40816,Pb,"Basalt and basanite as dikes, cinder cones and...",Czv,Marie Byrd Land Volcanics,Siddoway et al. unpublished mapping
42233,Pb_LeM2c1,Basalt flows and basaltic hydrovolcanic rocks,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 2013 (Fig.2C)
42237,Pb_LeM2c1?,Basalt flows and basaltic hydrovolcanic rocks,Czb,Marie Byrd Land Volcanics: basalt inferred,LeMasurier 2013 (Fig.2C)
42094,Pb_LM84a,Basalt tuff cone,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier 1984
41839,Pb_Hart97a,"Basalt, basaltic hyaloclastite, cinder cone, t...",Czb,Marie Byrd Land Volcanics: basalt,Hart et al. 1997
42600,Pb_LMB16B.3a,"Basalt, basaltic pyroclastics",Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16B.3)
42536,Pb_LMB16D.1_ins,"Basalt, hawaiite, basaltic hyaloclastite",Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16D.1)_Inset
41857,Pb_LMB16D.1b,Basaltic hyaloclastite,Czb,Marie Byrd Land Volcanics: basalt,LeMasurier & Thomson 1990 (Fig B.16D.1)
42091,Pb_Pan4a,Basanite,Czb,Marie Byrd Land Volcanics: basalt,Panter et al. 1994


#### Let's skip all further instances of MBL volcanics. This looks like a well studied subject and sets of descriptions for a single name that fall under this category will likely be cases similar to mbl basalts

## Moving on to the next name with multiple descriptions

In [46]:
pp.pp(name_descr_sets[1])
print(len(name_descr_sets[1][1]))

('Melbourne volcanic province',
 array(['Peralkaline trachyte, quartz-trachyte, peralkaline rhyolite',
       'Phonolite',
       'Variably differentiated alkali volcanics forming major and composite strato-volcanoes and other minor centres; alkali-basanite to trachyte-rhyolite',
       'Alkali basalt, basanite, hawaiite',
       'Mugearite, benmoreite, trachyandesite', 'Trachyte',
       'Trachyte with tristanite and trachyandesite', 'Basanite',
       'Basanite, hawaiite', 'Peralkaline rhyolite',
       'Alkali basalt, basanite, tephrite'], dtype=object))
11


#### I suspect that this name will be a similar case to the mbl volcanics

In [47]:
mel_vol_source_name_mask = data["NAME"] == name_descr_sets[1][0]
mel_vol_polys = data[mel_vol_source_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")


In [48]:
display(mel_vol_polys)
print(f"Melbourne volcanic province has {len(mel_vol_polys)} distinct source codes")

Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
61211,Mev_WoF1a,"Alkali basalt, basanite, hawaiite",Czb,Melbourne volcanic province,Wörner et al. 1989 Fig1
73799,Mev_WVF14a,"Alkali basalt, basanite, tephrite",Czb,Melbourne volcanic province,Wörner & Viereck 1989 Fig14
63621,Mev_LMA61c,Basanite,Czb,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
64114,Mev_KF912c,"Basanite, hawaiite",Czb,Melbourne volcanic province,Kyle 1982
62002,Mev_WoF1c,"Mugearite, benmoreite, trachyandesite",Cza,Melbourne volcanic province,Wörner et al. 1989 Fig1
71469,Mev_KF912c,Peralkaline rhyolite,Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
60905,Mev_LMA61b,"Peralkaline trachyte, quartz-trachyte, peralka...",Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
70567,Mev_KF912c,"Peralkaline trachyte, quartz-trachyte, peralka...",Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
61140,Mev_KF912b,Phonolite,Czf,Melbourne volcanic province,Kyle 1982
62049,Mev_WoF1b,Trachyte,Czf,Melbourne volcanic province,Wörner et al. 1989 Fig1


Melbourne volcanic province has 13 distinct source codes


#### This is pretty much an identical case to the MBL basalts

### Next...

In [39]:
pp.pp(name_descr_sets[2])
print(len(name_descr_sets[2][1]))

('late granitoid',
 array(['late-stage unfoliated muscovite-biotite granite and biotite granite',
       'Late-stage granitoids, homogeneous and massive to foliated; may include pegmatites and enclaves; postdate Vanda Dikes',
       'Fine homogeneous equigranular leucocratic biotite granodiorite in dikes stocks and plugs',
       'Hornblende-biotite-alkali feldspar quartz monzonite to granite in small stocks plugs and sills; locally porphyritic',
       'Porphyritic hornblende biotite quartz monzodiorite, monzonite and quartz monzonite forming sills and plugs in Pearse Valley',
       'Fine equigranular hornblende clinopyroxene granodiorite at western Kukri Hills; pre-dates Vanda Dikes'],
      dtype=object))
6


In [49]:
late_granitoid_name_mask = data["NAME"] == name_descr_sets[2][0]
late_granitoid_src_descr_combos = data[late_granitoid_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

In [50]:
display(late_granitoid_src_descr_combos)

Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
57396,ggk,Fine equigranular hornblende clinopyroxene gra...,Eg,late granitoid,"Cox, S.C.; Turnbull, I.M.; Isaac, M.J.; Townse..."
55116,ggr,Fine homogeneous equigranular leucocratic biot...,Eg,late granitoid,"Cox, S.C.; Turnbull, I.M.; Isaac, M.J.; Townse..."
55342,gq,Hornblende-biotite-alkali feldspar quartz monz...,Og,late granitoid,"Cox, S.C.; Turnbull, I.M.; Isaac, M.J.; Townse..."
54573,gg,"Late-stage granitoids, homogeneous and massive...",EOg,late granitoid,"Cox, S.C.; Turnbull, I.M.; Isaac, M.J.; Townse..."
57029,ggm,Porphyritic hornblende biotite quartz monzodio...,Eg,late granitoid,"Cox, S.C.; Turnbull, I.M.; Isaac, M.J.; Townse..."
44248,gg,late-stage unfoliated muscovite-biotite granit...,EOg,late granitoid,Goodge et al. 1993


Nothing to see here really... next

In [51]:
pp.pp(name_descr_sets[3])
print(len(name_descr_sets[3][1]))

('Hallett volcanic province',
 array(['Predominantly basanite, basalt and hawaiite', 'Trachyte',
       'basanite, hawaiite, mugearite',
       'Predominantly mugearite, benmoreite, and trachyte ', 'Mugearite'],
      dtype=object))
5


In [54]:
hallett_name_mask = data["NAME"] == name_descr_sets[3][0]
hallett_src_descr_combos = data[mel_vol_source_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

In [55]:
display(hallett_src_descr_combos)

Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
61211,Mev_WoF1a,"Alkali basalt, basanite, hawaiite",Czb,Melbourne volcanic province,Wörner et al. 1989 Fig1
73799,Mev_WVF14a,"Alkali basalt, basanite, tephrite",Czb,Melbourne volcanic province,Wörner & Viereck 1989 Fig14
63621,Mev_LMA61c,Basanite,Czb,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
64114,Mev_KF912c,"Basanite, hawaiite",Czb,Melbourne volcanic province,Kyle 1982
62002,Mev_WoF1c,"Mugearite, benmoreite, trachyandesite",Cza,Melbourne volcanic province,Wörner et al. 1989 Fig1
71469,Mev_KF912c,Peralkaline rhyolite,Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
60905,Mev_LMA61b,"Peralkaline trachyte, quartz-trachyte, peralka...",Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
70567,Mev_KF912c,"Peralkaline trachyte, quartz-trachyte, peralka...",Czf,Melbourne volcanic province,LeMasurier & Thomson 1990 (Fig A.6.1)
61140,Mev_KF912b,Phonolite,Czf,Melbourne volcanic province,Kyle 1982
62049,Mev_WoF1b,Trachyte,Czf,Melbourne volcanic province,Wörner et al. 1989 Fig1


#### Looks like more of the same situation

In [56]:
pp.pp(name_descr_sets[4])
print(len(name_descr_sets[4][1]))

('older ice sheet margin till',
 array(['Till in moraines on margins of ice sheets, ice shelves, or large glaciers occupying major valleys; commonly degraded or scree covered; multiple advances not always differentiated',
       'Till in moraines on margins of ice sheets, ice shelves, or large glaciers occupying the major valleys: commonly degraded and covered by scree: multiple advances not always differentiated',
       'Bouldery sandy till, locally matrix-rich and water-laid; includes glaciolacustrine and glaciofluvial sediment',
       'Poorly sorted bouldery sandy till, slightly weathered and modified in elevated position away from present ice sheet or glacier'],
      dtype=object))
4


In [59]:
old_sheet_till_name_mask = data["NAME"] == name_descr_sets[4][0]
old_sheet_till_src_descr_combos = data[old_sheet_till_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

In [62]:
display(old_sheet_till_src_descr_combos)

Unnamed: 0,SOURCECODE,DESCR,MAPSYMBOL,NAME,SOURCE
43686,Qti2,"Bouldery sandy till, locally matrix-rich and w...",Qs,older ice sheet margin till,Grindley & Laird 1969
43875,Qti3,"Poorly sorted bouldery sandy till, slightly we...",Qs,older ice sheet margin till,Joy et al. 2014
43885,Qti4,"Poorly sorted bouldery sandy till, slightly we...",Qs,older ice sheet margin till,Storey et al. 2010
40463,Qti,"Till in moraines on margins of ice sheets, ice...",Qs,older ice sheet margin till,GeoMAP
43684,Qti,"Till in moraines on margins of ice sheets, ice...",Qs,older ice sheet margin till,Grindley & Laird 1969


- There is impressive consistency in description contents between sources over multiple years.
-  Grindley & Laird, 1969 seem to be the original source.
- If the GeoMap sourced polys intend to have their description wording based off of G + L, 1969, the phrasing and punctuation have been changed a bit. This should probably be changed to be consistent.

In [56]:
pp.pp(name_descr_sets[4])
print(len(name_descr_sets[4][1]))

('older ice sheet margin till',
 array(['Till in moraines on margins of ice sheets, ice shelves, or large glaciers occupying major valleys; commonly degraded or scree covered; multiple advances not always differentiated',
       'Till in moraines on margins of ice sheets, ice shelves, or large glaciers occupying the major valleys: commonly degraded and covered by scree: multiple advances not always differentiated',
       'Bouldery sandy till, locally matrix-rich and water-laid; includes glaciolacustrine and glaciofluvial sediment',
       'Poorly sorted bouldery sandy till, slightly weathered and modified in elevated position away from present ice sheet or glacier'],
      dtype=object))
4


In [63]:
def create_src_descr_combos_by_name(name, data):
    cols = ["SOURCECODE", "DESCR", "MAPSYMBOL", "NAME", "SOURCE"]
    unique_cols = ["SOURCECODE", "DESCR"]
    sort_cols = ["DESCR"]
    mask = data["NAME"] == name
    return data[mask][display_cols].drop_duplicates(unique_cols).sort_values(sort_cols)
    

In [66]:
print(name_descr_sets[4][0])

older ice sheet margin till


In [65]:
old_sheet_till_combos = create_src_descr_combos_by_name(name_descr_sets[4][0], data)
# old_sheet_till_src_descr_combos = data[old_sheet_till_name_mask][mbl_cols].drop_duplicates(mbl_unique_cols).sort_values("DESCR")

TypeError: string indices must be integers

In [64]:
display(old_sheet_till_combos)

NameError: name 'old_sheet_till_combos' is not defined