## Pulling down mercury-stained sheets from NMNH DWCA

This notebook attempts to rebuild, from scratch, the dataset needed to replicate "Applications of deep convolutional neural networks to digitized natural history collections" (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5680669/).

In [1]:
import tarfile
import pandas as pd
import numpy as np

### Getting barcodes from Figshare

First, we need to download the image bundles from Figshare in order to get their image barcodes. They are posted separately by [stained](https://smithsonian.figshare.com/articles/dataset/Mercury-stained_botany_images_for_deep_learning/5423083) and [unstained](https://smithsonian.figshare.com/articles/dataset/Unstained_botany_images_for_deep_learning/5423098) datasets.

In [2]:
stained_barcodes = []
with tarfile.open("stained.tar.gz", "r:gz") as tar:
    for filename in tar.getnames():
        if filename.endswith('.jpg'):
            barcode = filename.split('/')[1].split('.')[0]
            stained_barcodes.append(barcode)
stained_barcodes[:5]

['00000140', '00000162', '00000185', '00000209', '00000231']

In [3]:
unstained_barcodes = []
with tarfile.open("unstained.tar.gz", "r:gz") as tar:
    for filename in tar.getnames():
        if filename.endswith('.jpg'):
            barcode = filename.split('/')[1].split('.')[0]
            unstained_barcodes.append(barcode)
unstained_barcodes[:5]

['00000001', '00000003', '00000015', '00000020', '00000021']
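The two cells above differ only in the archive name, so the loop could be factored into one helper. This is a minimal sketch, not the code that produced the outputs above: `barcodes_from_tar` is a hypothetical name, and it takes the file stem with `posixpath` instead of `split('/')[1]`, which assumes the barcode is always the image's base filename.

```python
import posixpath
import tarfile


def barcodes_from_tar(tar_path, suffix=".jpg"):
    """Return the file stem of every member whose name ends with `suffix`."""
    barcodes = []
    with tarfile.open(tar_path, "r:gz") as tar:
        for name in tar.getnames():
            if name.endswith(suffix):
                # basename drops the directory prefix, splitext drops '.jpg'
                stem = posixpath.splitext(posixpath.basename(name))[0]
                barcodes.append(stem)
    return barcodes
```

With the Figshare bundles in the working directory, `stained_barcodes = barcodes_from_tar("stained.tar.gz")` would replace the first loop.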

In [4]:
stained_barcode_df = pd.DataFrame(stained_barcodes, columns=['barcode'])
stained_barcode_df['stain_status'] = 'stained'
stained_barcode_df.head()

,barcode,stain_status
0,00000140,stained
1,00000162,stained
2,00000185,stained
3,00000209,stained
4,00000231,stained


In [5]:
unstained_barcode_df = pd.DataFrame(unstained_barcodes, columns=['barcode'])
unstained_barcode_df['stain_status'] = 'unstained'
unstained_barcode_df.head()

,barcode,stain_status
0,00000001,unstained
1,00000003,unstained
2,00000015,unstained
3,00000020,unstained
4,00000021,unstained


In [20]:
stain_status_df = pd.concat([stained_barcode_df, unstained_barcode_df])
stain_status_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 15553 entries, 0 to 7776
Data columns (total 2 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   barcode       15553 non-null  object
 1   stain_status  15553 non-null  object
dtypes: object(2)
memory usage: 364.5+ KB


In [21]:
stain_status_df['stain_status'].value_counts()

unstained    7777
stained      7776
Name: stain_status, dtype: int64

In [22]:
stain_status_df.to_csv('barcodes_from_figshare.tsv', index=False, sep='\t')
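One caveat when reading this file back: the barcodes are zero-padded strings, and `pd.read_csv` will infer an integer column unless told otherwise. The sketch below uses a two-row in-memory stand-in for `barcodes_from_figshare.tsv` (the rows are taken from the outputs above) to show the difference.

```python
import io

import pandas as pd

# In-memory stand-in for barcodes_from_figshare.tsv
tsv_text = "barcode\tstain_status\n00000140\tstained\n00000001\tunstained\n"

# Default type inference turns the barcodes into integers and drops the zeros
naive = pd.read_csv(io.StringIO(tsv_text), sep="\t")

# Forcing a string dtype keeps the 8-character barcodes intact
safe = pd.read_csv(io.StringIO(tsv_text), sep="\t", dtype={"barcode": str})

print(naive["barcode"].tolist())
print(safe["barcode"].tolist())
```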

### Pulling multimedia data from NMNH DarwinCore Archive

Here is the link to the Smithsonian NMNH IPT: https://collections.nmnh.si.edu/ipt/resource?r=nmnh_extant_dwc-a

In [9]:
multimedia_df = pd.read_csv('nmnh_multimedia_1_35.tsv.gz', 
                            dtype={'providerLiteral':'category',
                                   'description':'string'},
                            sep='\t', compression='gzip')
multimedia_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10357547 entries, 0 to 10357546
Data columns (total 19 columns):
 #   Column                     Dtype   
---  ------                     -----   
 0   id                         object  
 1   identifier                 object  
 2   type                       object  
 3   title                      object  
 4   rights                     object  
 5   rights.1                   object  
 6   UsageTerms                 object  
 7   WebStatement               object  
 8   licenseLogoURL             object  
 9   source                     object  
 10  creator                    object  
 11  providerLiteral            category
 12  description                string  
 13  subjectCategoryVocabulary  object  
 14  scientificName             float64 
 15  accessURI                  object  
 16  format                     object  
 17  PixelXDimension            int64   
 18  PixelYDimension            int64   
dtypes: category(1), flo

In [10]:
multimedia_df['providerLiteral'].value_counts()

Smithsonian Institution, NMNH, Botany                   9257341
Smithsonian Institution, NMNH, Mammals                   577050
Smithsonian Institution, NMNH, Invertebrate Zoology      184797
Smithsonian Institution, NMNH, Entomology                167637
Smithsonian Institution, NMNH, Fishes                    134203
Smithsonian Institution, NMNH, Birds                      23401
Smithsonian Institution, NMNH, Amphibians & Reptiles      13118
Name: providerLiteral, dtype: int64

In [11]:
len(multimedia_df[multimedia_df.duplicated(keep='first')])

5942814

**Uh oh: 5,942,814 of the 10,357,547 rows (about 57%) are exact duplicates of an earlier row, so somehow a large portion of the dataset has been duplicated.**

In [12]:
multimedia_df = multimedia_df.drop_duplicates()

In [13]:
multimedia_df['providerLiteral'].value_counts()

Smithsonian Institution, NMNH, Botany                   3314799
Smithsonian Institution, NMNH, Mammals                   577050
Smithsonian Institution, NMNH, Invertebrate Zoology      184706
Smithsonian Institution, NMNH, Entomology                167637
Smithsonian Institution, NMNH, Fishes                    134193
Smithsonian Institution, NMNH, Birds                      23401
Smithsonian Institution, NMNH, Amphibians & Reptiles      12947
Name: providerLiteral, dtype: int64
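As a sanity check, the per-provider counts after deduplication should add up to the original row count minus the number of duplicates. The sketch below only re-does that arithmetic with the figures printed in the cells above; it does not touch the real dataframe.

```python
# Figures copied from the cell outputs in this notebook
total_rows = 10_357_547      # RangeIndex size from multimedia_df.info()
duplicate_rows = 5_942_814   # rows flagged by duplicated(keep='first')

before = {"Botany": 9_257_341, "Mammals": 577_050,
          "Invertebrate Zoology": 184_797, "Entomology": 167_637,
          "Fishes": 134_203, "Birds": 23_401, "Amphibians & Reptiles": 13_118}
after = {"Botany": 3_314_799, "Mammals": 577_050,
         "Invertebrate Zoology": 184_706, "Entomology": 167_637,
         "Fishes": 134_193, "Birds": 23_401, "Amphibians & Reptiles": 12_947}

# Rows that should survive drop_duplicates()
expected_remaining = total_rows - duplicate_rows

# How many rows each provider lost
removed_by_provider = {name: before[name] - after[name] for name in before}

print(expected_remaining == sum(after.values()))
print(removed_by_provider)
```

The totals agree, and Botany accounts for 5,942,542 of the 5,942,814 removed rows, so the duplication is almost entirely in the botany records.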

In [14]:
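is_botany = multimedia_df['providerLiteral'] == 'Smithsonian Institution, NMNH, Botany'
# na=False treats rows with a missing description as non-matches
has_barcode = multimedia_df['description'].str.lower().str.contains('barcode', na=False)
botany_barcodes = multimedia_df[is_botany & has_barcode].copy()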
botany_barcodes.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 3219718 entries, 1091 to 10357542
Data columns (total 19 columns):
 #   Column                     Dtype   
---  ------                     -----   
 0   id                         object  
 1   identifier                 object  
 2   type                       object  
 3   title                      object  
 4   rights                     object  
 5   rights.1                   object  
 6   UsageTerms                 object  
 7   WebStatement               object  
 8   licenseLogoURL             object  
 9   source                     object  
 10  creator                    object  
 11  providerLiteral            category
 12  description                string  
 13  subjectCategoryVocabulary  object  
 14  scientificName             float64 
 15  accessURI                  object  
 16  format                     object  
 17  PixelXDimension            int64   
 18  PixelYDimension            int64   
dtypes: category(1), f

In [15]:
botany_barcodes.sample(5)

Unnamed: 0,id,identifier,type,title,rights,rights.1,UsageTerms,WebStatement,licenseLogoURL,source,creator,providerLiteral,description,subjectCategoryVocabulary,scientificName,accessURI,format,PixelXDimension,PixelYDimension
3122654,http://n2t.net/ark:/65665/37bad2ed7-d9d4-4296-...,http://collections.nmnh.si.edu/media/index.php...,image,03421802.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,https://www.si.edu/sites/default/files/icons/c...,"US National Herbarium, Department of Botany, N...",Conveyor Belt,"Smithsonian Institution, NMNH, Botany",Barcode 03421802,Specimen/Object,,http://n2t.net/ark:/65665/m3fbd8c332-c077-4ebb...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",6770,8929
742898,http://n2t.net/ark:/65665/3c555b42a-d7b1-401d-...,http://collections.nmnh.si.edu/media/index.php...,image,01605638.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,https://www.si.edu/sites/default/files/icons/c...,"US National Herbarium, Department of Botany, N...",Conveyor Belt,"Smithsonian Institution, NMNH, Botany",Barcode 01605638,Specimen/Object,,http://n2t.net/ark:/65665/m3afd5de9c-ed6c-43a2...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",6762,9004
3834870,http://n2t.net/ark:/65665/3331b5601-143a-4a1f-...,http://collections.nmnh.si.edu/media/index.php...,image,00069562.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,https://www.si.edu/sites/default/files/icons/c...,"US National Herbarium, Department of Botany, N...",Sophia Lee,"Smithsonian Institution, NMNH, Botany","Grubb, P. J. 2859, US National Herbarium Sheet...",Specimen/Object,,http://n2t.net/ark:/65665/m318a7d23e-9de7-410d...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3700,5368
1226719,http://n2t.net/ark:/65665/355749c65-b5df-4027-...,http://collections.nmnh.si.edu/media/index.php...,image,01923345.TIF,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,https://www.si.edu/sites/default/files/icons/c...,,"Digitization Interns, Laura Tancredi","Smithsonian Institution, NMNH, Botany","Bartlett, H. H. 13105, US National Herbarium S...",Documentation,,http://n2t.net/ark:/65665/m329e098b9-8e91-471d...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3840,5760
10353045,http://n2t.net/ark:/65665/380447a4d-2b5d-40c6-...,http://collections.nmnh.si.edu/media/index.php...,image,00772564.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,https://www.si.edu/sites/default/files/icons/c...,"US National Herbarium, Department of Botany, N...",Daniel Fernicola,"Smithsonian Institution, NMNH, Botany","Stergios, B. G. 19777, US National Herbarium S...",Specimen/Object,,http://n2t.net/ark:/65665/m343f7c289-46fa-469f...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3744,5616


In [16]:
botany_barcodes.sample(5).to_dict(orient='records')

[{'id': 'http://n2t.net/ark:/65665/39674926d-4f71-4603-adfe-e380a3bdce74',
  'identifier': 'http://collections.nmnh.si.edu/media/index.php?irn=13702418',
  'type': 'image',
  'title': '03349335.tif',
  'rights': 'CC0',
  'rights.1': 'CC0',
  'UsageTerms': 'https://creativecommons.org/publicdomain/zero/1.0/',
  'WebStatement': 'https://naturalhistory.si.edu/research/nmnh-collections/museum-collections-policies',
  'licenseLogoURL': 'https://www.si.edu/sites/default/files/icons/cc0.svg',
  'source': 'US National Herbarium, Department of Botany, NMNH, Smithsonian Institution',
  'creator': 'Conveyor Belt',
  'providerLiteral': 'Smithsonian Institution, NMNH, Botany',
  'description': 'Barcode 03349335',
  'subjectCategoryVocabulary': 'Specimen/Object',
  'scientificName': nan,
  'accessURI': 'http://n2t.net/ark:/65665/m34e0f736d-34ee-4d1f-9e30-181342c9b6d1',
  'format': 'tiff, jpeg, jpeg, jpeg, jpeg, jpeg',
  'PixelXDimension': 6823,
  'PixelYDimension': 9015},
 {'id': 'http://n2t.net/ark

In [17]:
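def extract_barcode(description_text):
    """Return the token that follows the word 'barcode', or NaN if there is none."""
    space_split = description_text.lower().split()
    if 'barcode' not in space_split:
        # e.g. only 'barcode:' or 'barcodes' is present; avoid a ValueError from .index()
        return np.nan
    barcode_idx = space_split.index('barcode')
    if len(space_split) == barcode_idx + 1:
        # 'barcode' is the last word, so no number follows it
        return np.nan
    # drop punctuation around the token, e.g. '00000209.' or '00000209,'
    barcode_number = space_split[barcode_idx + 1].strip('.').strip(',')
    return barcode_number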
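The function above works token by token. A vectorized alternative is sketched below with `Series.str.extract`; it is not what this notebook ran, and it assumes a barcode is always a run of digits separated from the word "barcode" by whitespace, which is stricter than taking whatever token comes next. The first two descriptions are copied from outputs in this notebook; the third is a hypothetical edge case.

```python
import pandas as pd

descriptions = pd.Series([
    "Barcode 03349335",
    "US National Herbarium specimen, barcode 00000209",
    "Herbarium sheet, barcode",  # hypothetical: keyword with no number after it
], dtype="string")

# (?i) makes the match case-insensitive; the group captures only the digits
barcodes = descriptions.str.extract(r"(?i)\bbarcode\s+(\d+)", expand=False)

print(barcodes.tolist())
```

Rows without a match come back as missing values, so the later length check would still flag them.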

In [18]:
botany_barcodes['barcode'] = botany_barcodes['description'].apply(extract_barcode)
botany_barcodes[['description','barcode']].sample(20)

,description,barcode
979875,Barcode 01110601,01110601
10172234,Barcode 03786451,03786451
2230489,Barcode 01818691,01818691
3172440,Barcode 02971914,02971914
3057720,Barcode 03212896,03212896
2470766,Barcode 01772268,01772268
3585884,"Palmer, E. 973, US National Herbarium Sheet 20...",00894369
1377319,Barcode 00288822,00288822
2469909,"Pringle, C. G. 9928, US National Herbarium She...",00817155
2653992,Barcode 02143277,02143277


In [19]:
botany_barcodes['barcode_len'] = botany_barcodes['barcode'].str.len()
botany_barcodes['barcode_len'].value_counts()

8.0     3219629
7.0           3
19.0          1
11.0          1
2.0           1
Name: barcode_len, dtype: int64

In [23]:
mercury_merge = stain_status_df.merge(botany_barcodes, on='barcode', how='left')
mercury_merge.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 18823 entries, 0 to 18822
Data columns (total 22 columns):
 #   Column                     Non-Null Count  Dtype   
---  ------                     --------------  -----   
 0   barcode                    18823 non-null  object  
 1   stain_status               18823 non-null  object  
 2   id                         18661 non-null  object  
 3   identifier                 18661 non-null  object  
 4   type                       18661 non-null  object  
 5   title                      18661 non-null  object  
 6   rights                     18661 non-null  object  
 7   rights.1                   18661 non-null  object  
 8   UsageTerms                 18661 non-null  object  
 9   WebStatement               18661 non-null  object  
 10  licenseLogoURL             16506 non-null  object  
 11  source                     18527 non-null  object  
 12  creator                    18519 non-null  object  
 13  providerLiteral            1866

In [24]:
len(mercury_merge[mercury_merge.duplicated(subset='barcode')])

3272

**Uh oh: even after dropping fully duplicated records, 3272 rows still repeat a barcode that already appears in an earlier row.**
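This kind of surprise can be caught at merge time. The sketch below uses toy frames with hypothetical ids (not the real data) to show how `validate="one_to_one"` makes `merge` raise when a key is repeated, and how the count from `duplicated(subset=...)` relates to the extra rows a one-to-many merge creates.

```python
import pandas as pd
from pandas.errors import MergeError

# Toy stand-ins: one row per sheet, but two specimen rows for one barcode
sheets = pd.DataFrame({"barcode": ["00000140", "00000209"]})
specimens = pd.DataFrame({
    "barcode": ["00000140", "00000209", "00000209"],
    "id": ["spec-a", "spec-b", "spec-c"],  # hypothetical ids
})

try:
    sheets.merge(specimens, on="barcode", how="left", validate="one_to_one")
    merge_is_one_to_one = True
except MergeError:
    # raised because '00000209' appears twice in the right-hand frame
    merge_is_one_to_one = False

# Without validation the merge silently grows by one row per extra match
merged = sheets.merge(specimens, on="barcode", how="left")
extra_rows = int(merged.duplicated(subset="barcode").sum())

print(merge_is_one_to_one, len(merged), extra_rows)
```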

In [25]:
mercury_merge[mercury_merge.duplicated(subset='barcode',keep=False)].sort_values('barcode').head(10).to_dict(orient='records')

[{'barcode': '00000209',
  'stain_status': 'stained',
  'id': 'http://n2t.net/ark:/65665/350435d2c-8228-4f1c-b2ed-99b4bb0ab20d',
  'identifier': 'http://collections.nmnh.si.edu/media/index.php?irn=10142667',
  'type': 'image',
  'title': '00000209.tif',
  'rights': 'CC0',
  'rights.1': 'CC0',
  'UsageTerms': 'https://creativecommons.org/publicdomain/zero/1.0/',
  'WebStatement': 'https://naturalhistory.si.edu/research/nmnh-collections/museum-collections-policies',
  'licenseLogoURL': 'https://www.si.edu/sites/default/files/icons/cc0.svg',
  'source': 'Specimen from Department of Botany, NMNH, Smithsonian Institution',
  'creator': 'Ingrid P. Lin',
  'providerLiteral': 'Smithsonian Institution, NMNH, Botany',
  'description': 'US National Herbarium specimen, barcode 00000209',
  'subjectCategoryVocabulary': 'Specimen/Object',
  'scientificName': nan,
  'accessURI': 'http://n2t.net/ark:/65665/m3325dc959-7428-4973-b804-f4c8273d7cbc',
  'format': 'tiff, jpeg, jpeg, jpeg, jpeg, jpeg',
  'Pi

The first duplicate barcode (00000209) appears to have 2 different specimen IDs:

* http://n2t.net/ark:/65665/3ce233bf5-d0e1-4967-9d59-acc0726d5588
* http://n2t.net/ark:/65665/350435d2c-8228-4f1c-b2ed-99b4bb0ab20d

Both links show the same herbarium sheet, but with slightly different specimen data. This is because 2 different specimens are mounted on the same sheet!
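To see which sheets carry more than one specimen, the specimen ids can be counted per barcode. This is a small sketch on a toy frame: the two ids for barcode 00000209 are the ones listed above, while the id for 00000140 is a hypothetical placeholder.

```python
import pandas as pd

merged = pd.DataFrame({
    "barcode": ["00000140", "00000209", "00000209"],
    "id": [
        "placeholder-id-for-00000140",  # hypothetical
        "http://n2t.net/ark:/65665/3ce233bf5-d0e1-4967-9d59-acc0726d5588",
        "http://n2t.net/ark:/65665/350435d2c-8228-4f1c-b2ed-99b4bb0ab20d",
    ],
})

# Number of distinct specimen ids attached to each sheet barcode
ids_per_barcode = merged.groupby("barcode")["id"].nunique()
multi_specimen = ids_per_barcode[ids_per_barcode > 1].index.tolist()

# keep=False drops every row of a repeated barcode, not just the later copies
single_only = merged.drop_duplicates(subset="barcode", keep=False)

print(multi_specimen)
print(single_only["barcode"].tolist())
```

Note that `keep=False` removes multi-specimen sheets entirely, which is a deliberate simplification: it avoids having to choose one specimen record per image.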

In [26]:
single_sheets = mercury_merge.drop_duplicates(subset='barcode',keep=False)
single_sheets.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 12306 entries, 0 to 18822
Data columns (total 22 columns):
 #   Column                     Non-Null Count  Dtype   
---  ------                     --------------  -----   
 0   barcode                    12306 non-null  object  
 1   stain_status               12306 non-null  object  
 2   id                         12144 non-null  object  
 3   identifier                 12144 non-null  object  
 4   type                       12144 non-null  object  
 5   title                      12144 non-null  object  
 6   rights                     12144 non-null  object  
 7   rights.1                   12144 non-null  object  
 8   UsageTerms                 12144 non-null  object  
 9   WebStatement               12144 non-null  object  
 10  licenseLogoURL             11230 non-null  object  
 11  source                     12020 non-null  object  
 12  creator                    12027 non-null  object  
 13  providerLiteral            1214

In [27]:
single_sheets['rights'].value_counts()

CC0                       11230
Usage Conditions Apply      914
Name: rights, dtype: int64

In [28]:
single_oa = single_sheets[single_sheets['rights'] == 'CC0']
single_oa.head()

,barcode,stain_status,id,identifier,type,title,rights,rights.1,UsageTerms,WebStatement,...,creator,providerLiteral,description,subjectCategoryVocabulary,scientificName,accessURI,format,PixelXDimension,PixelYDimension,barcode_len
0,00000140,stained,http://n2t.net/ark:/65665/3b487a6b2-3b6f-4b94-...,http://collections.nmnh.si.edu/media/index.php...,image,00000140.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,...,Ingrid P. Lin,"Smithsonian Institution, NMNH, Botany","US National Herbarium specimen, barcode 00000140",Specimen/Object,,http://n2t.net/ark:/65665/m36e1bbdd7-8c33-4a87...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",7319.0,10319.0,8.0
1,00000162,stained,http://n2t.net/ark:/65665/344cc518e-410d-47d3-...,http://collections.nmnh.si.edu/media/index.php...,image,00000162.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,...,Ingrid P. Lin,"Smithsonian Institution, NMNH, Botany","US National Herbarium specimen, barcode 00000162",Specimen/Object,,http://n2t.net/ark:/65665/m37df7ca9f-c121-4c3e...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3876.0,4968.0,8.0
2,00000185,stained,http://n2t.net/ark:/65665/3dcdf7935-e858-4580-...,http://collections.nmnh.si.edu/media/index.php...,image,00000185.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,...,Ingrid P. Lin,"Smithsonian Institution, NMNH, Botany","US National Herbarium specimen, barcode 00000185",Specimen/Object,,http://n2t.net/ark:/65665/m3c809b3ca-2e52-48a2...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",7319.0,10319.0,8.0
5,00000231,stained,http://n2t.net/ark:/65665/3d1856a3b-d09a-4ad9-...,http://collections.nmnh.si.edu/media/index.php...,image,00000231.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,...,Ingrid P. Lin,"Smithsonian Institution, NMNH, Botany","US National Herbarium specimen, barcode 00000231",Specimen/Object,,http://n2t.net/ark:/65665/m329973bc2-3529-4f95...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3732.0,4892.0,8.0
6,00000232,stained,http://n2t.net/ark:/65665/3652daf45-abaa-4f26-...,http://collections.nmnh.si.edu/media/index.php...,image,00000232.tif,CC0,CC0,https://creativecommons.org/publicdomain/zero/...,https://naturalhistory.si.edu/research/nmnh-co...,...,Ingrid P. Lin,"Smithsonian Institution, NMNH, Botany","US National Herbarium specimen, barcode 00000232",Specimen/Object,,http://n2t.net/ark:/65665/m387dc9ab2-cba2-4ba6...,"tiff, jpeg, jpeg, jpeg, jpeg, jpeg",3668.0,4868.0,8.0


In [29]:
cols_to_keep = ['barcode','stain_status','id','accessURI']
single_oa = single_oa[cols_to_keep]
single_oa.head()

,barcode,stain_status,id,accessURI
0,00000140,stained,http://n2t.net/ark:/65665/3b487a6b2-3b6f-4b94-...,http://n2t.net/ark:/65665/m36e1bbdd7-8c33-4a87...
1,00000162,stained,http://n2t.net/ark:/65665/344cc518e-410d-47d3-...,http://n2t.net/ark:/65665/m37df7ca9f-c121-4c3e...
2,00000185,stained,http://n2t.net/ark:/65665/3dcdf7935-e858-4580-...,http://n2t.net/ark:/65665/m3c809b3ca-2e52-48a2...
5,00000231,stained,http://n2t.net/ark:/65665/3d1856a3b-d09a-4ad9-...,http://n2t.net/ark:/65665/m329973bc2-3529-4f95...
6,00000232,stained,http://n2t.net/ark:/65665/3652daf45-abaa-4f26-...,http://n2t.net/ark:/65665/m387dc9ab2-cba2-4ba6...


In [40]:
specimen_cols = ['id','institutionCode','catalogNumber',
                 'recordedBy','identifiedBy',
                 'year','month','day','verbatimEventDate',
                 'country','stateProvince','county','locality',
                 'scientificName']
specimen_dtypes = {'institutionCode': 'category',
                   'recordedBy': 'category',
                   'identifiedBy': 'category',
                   'year': 'category',
                   'month': 'category',
                   'day': 'category',
                   'country': 'category',
                   'stateProvince': 'category',
                   'county': 'category',
                   'scientificName': 'category'}
specimen_df = pd.read_csv('nmnh_occurrence_1_35.tsv.gz',
                          usecols=specimen_cols,
                          dtype=specimen_dtypes,
                          sep='\t', compression='gzip')
specimen_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8484939 entries, 0 to 8484938
Data columns (total 14 columns):
 #   Column             Dtype   
---  ------             -----   
 0   id                 object  
 1   institutionCode    category
 2   catalogNumber      object  
 3   recordedBy         category
 4   year               category
 5   month              category
 6   day                category
 7   verbatimEventDate  object  
 8   country            category
 9   stateProvince      category
 10  county             category
 11  locality           object  
 12  identifiedBy       category
 13  scientificName     category
dtypes: category(10), object(4)
memory usage: 463.3+ MB
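Loading the repetitive text columns as `category` is what keeps an 8.5-million-row table manageable in memory. The toy comparison below (hypothetical country values, not the real file) illustrates the effect: a categorical column stores each distinct string once plus a small integer code per row.

```python
import pandas as pd

# Hypothetical low-cardinality column: 3 distinct values repeated 10,000 times
countries = pd.Series(["Mexico", "Brazil", "Peru"] * 10_000)

# deep=True counts the Python string objects, not just the pointers to them
as_object = countries.astype(object).memory_usage(deep=True)
as_category = countries.astype("category").memory_usage(deep=True)

print(as_object, as_category)
```

The saving shrinks as the number of distinct values grows, so a nearly unique column such as `locality` is better left as plain text.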


In [42]:
single_oa_merged = single_oa.merge(specimen_df, on='id', how='left')
single_oa_merged.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 11230 entries, 0 to 11229
Data columns (total 17 columns):
 #   Column             Non-Null Count  Dtype   
---  ------             --------------  -----   
 0   barcode            11230 non-null  object  
 1   stain_status       11230 non-null  object  
 2   id                 11230 non-null  object  
 3   accessURI          11230 non-null  object  
 4   institutionCode    11227 non-null  category
 5   catalogNumber      11181 non-null  object  
 6   recordedBy         11227 non-null  category
 7   year               9866 non-null   category
 8   month              7281 non-null   category
 9   day                4726 non-null   category
 10  verbatimEventDate  10351 non-null  object  
 11  country            11223 non-null  category
 12  stateProvince      8204 non-null   category
 13  county             1729 non-null   category
 14  locality           8964 non-null   object  
 15  identifiedBy       386 non-null    category
 16  scie

Weird: 3 specimen ids from multimedia_df have no match in specimen_df (institutionCode is non-null for only 11227 of the 11230 rows). The ARK ids do resolve to a specimen page, so this is especially strange.

In [45]:
single_oa_merged[pd.isnull(single_oa_merged['institutionCode'])][['barcode','id','accessURI']].to_dict(orient='records')

[{'barcode': '00588658',
  'id': 'http://n2t.net/ark:/65665/3e6279d93-50bd-4e4f-862e-890302d76274',
  'accessURI': 'http://n2t.net/ark:/65665/m374271295-f593-4887-98dc-aa97d14578c5'},
 {'barcode': '00323691',
  'id': 'http://n2t.net/ark:/65665/3129c61ba-c8e0-473a-81d5-01ed98cd2072',
  'accessURI': 'http://n2t.net/ark:/65665/m3c29e056f-5865-41dd-ab4e-15b23e8ef6af'},
 {'barcode': '01101316',
  'id': 'http://n2t.net/ark:/65665/35642686e-0f48-41b8-8cc4-7bf07246d090',
  'accessURI': 'http://n2t.net/ark:/65665/m3330aabad-b4f6-4c22-a57b-7d6eb1bde76a'}]
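Checking for a null `institutionCode` works here, but it would miss an occurrence row that exists and simply has that field empty. A more direct test for unmatched keys is the `indicator` option of `merge`, sketched below on toy frames with hypothetical ids and a hypothetical institution code.

```python
import pandas as pd

# Toy stand-ins for single_oa and specimen_df
media = pd.DataFrame({"id": ["spec-1", "spec-2", "spec-3"]})
occurrences = pd.DataFrame({
    "id": ["spec-1", "spec-3"],
    "institutionCode": ["US", "US"],  # hypothetical values
})

# indicator=True adds a _merge column: 'both', 'left_only' or 'right_only'
checked = media.merge(occurrences, on="id", how="left", indicator=True)
unmatched_ids = checked.loc[checked["_merge"] == "left_only", "id"].tolist()

print(unmatched_ids)
```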

In [46]:
single_oa_merged.to_csv('mercury_specimen_data.tsv', sep='\t', index=False)