## Mineral Deposit Data Analysis

This notebook loads and analyzes the mineral deposit data.

### Load data
This section imports the required packages, downloads the dataset from the website, and lists the file names. In the code below, data_loc is the directory where the dataset is saved. To save storage space, the archive is not extracted to a folder; each file is instead read directly from the zip file when it is needed.
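Reading a CSV straight from the archive can be wrapped in a small helper. The sketch below is a minimal illustration, not part of the original notebook: `read_csv_from_zip` is a hypothetical name, and the in-memory archive with its two rows is made up so the example runs without the real data package.

```python
import io
import zipfile

import pandas as pd


def read_csv_from_zip(zip_path, member, **read_csv_kwargs):
    """Read one CSV member of a zip archive into a DataFrame without extracting it."""
    with zipfile.ZipFile(zip_path, "r") as archive:
        with archive.open(member, "r") as handle:
            return pd.read_csv(handle, **read_csv_kwargs)


# Tiny in-memory archive standing in for the real data package (made-up rows).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("demo/sites.csv", "SITE_NO,NAME\n1,A\n2,B\n")
buffer.seek(0)

demo = read_csv_from_zip(buffer, "demo/sites.csv", encoding="latin1")
print(demo)
```

With the real archive, the first argument would be the path built from data_loc and file_name, and the second one of the member names listed in the output further down.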

In [126]:
# import required packages
import os
import pickle
import sys
from urllib.request import urlopen

import pandas as pd

pd.options.display.width = None
pd.options.display.max_columns = None


# ZipFile class: standard library on Python >= 3.6, zipfile36 backport otherwise
if sys.version_info >= (3, 6):
    from zipfile import ZipFile as zipfile
else:
    from zipfile36 import ZipFile as zipfile

url = "https://unearthed-exploresa.s3-ap-southeast-2.amazonaws.com/Unearthed_5_SARIG_Data_Package.zip"
# enter the directory to save data
data_loc = 'D:/GitFolder/WorkBench/exploreSA-Gawler/data'
file_name = 'Unearthed_5_SARIG_Data_Package.zip'

if os.path.isfile(os.path.join(data_loc, file_name)):
    print("File exist")
else:
    # download the zip file and save it into data_loc
    os.makedirs(data_loc, exist_ok=True)
    with urlopen(url) as response:
        with open(os.path.join(data_loc, file_name), 'wb') as output:    # note the flag: "wb"
            output.write(response.read())

# list the names of the files contained in the archive
with zipfile(os.path.join(data_loc, file_name), 'r') as archive:
    files_in_dataset = archive.namelist()

files_in_dataset



File exist


['SARIG_Data_Package/sarig_dh_core_exp.csv',
 'SARIG_Data_Package/sarig_dh_details_exp.csv',
 'SARIG_Data_Package/sarig_dh_litho_exp.csv',
 'SARIG_Data_Package/sarig_dh_petrophys_exp.csv',
 'SARIG_Data_Package/sarig_dh_reference_exp.csv',
 'SARIG_Data_Package/sarig_dh_strat_exp.csv',
 'SARIG_Data_Package/sarig_fieldobs_exp.csv',
 'SARIG_Data_Package/sarig_fieldobs_litho_exp.csv',
 'SARIG_Data_Package/sarig_fieldobs_note_exp.csv',
 'SARIG_Data_Package/sarig_fieldobs_struct_exp.csv',
 'SARIG_Data_Package/sarig_md_commodity_exp.csv',
 'SARIG_Data_Package/sarig_md_details_exp.csv',
 'SARIG_Data_Package/sarig_md_mineralogy_exp.csv',
 'SARIG_Data_Package/sarig_md_reference_exp.csv',
 'SARIG_Data_Package/sarig_md_zone_hr_lith_exp.csv',
 'SARIG_Data_Package/sarig_md_zone_lith_exp.csv',
 'SARIG_Data_Package/sarig_rs_biostr_analys_exp.csv',
 'SARIG_Data_Package/sarig_rs_biostr_results_exp.csv',
 'SARIG_Data_Package/sarig_rs_chem_exp.csv',
 'SARIG_Data_Package/sarig_rs_chem_isotope_exp.csv',
 'SA

For this part of the data cleaning, we only use the following files:
- 'SARIG_Data_Package/sarig_md_commodity_exp.csv'
- 'SARIG_Data_Package/sarig_md_details_exp.csv'
- 'SARIG_Data_Package/sarig_md_mineralogy_exp.csv'
- 'SARIG_Data_Package/sarig_md_reference_exp.csv'
- 'SARIG_Data_Package/sarig_md_zone_hr_lith_exp.csv'
- 'SARIG_Data_Package/sarig_md_zone_lith_exp.csv'

### Determine the record identifier

In [127]:
# read the reference data
sarig_md_reference_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_reference_exp.csv','r'), 
    sep=',', encoding='latin1')
sarig_md_reference_exp['PUBLICATION_DATE'] = pd.to_datetime(sarig_md_reference_exp['PUBLICATION_DATE'])
sarig_md_reference_exp.head(3)

Unnamed: 0,MINERAL_DEPOSIT_NO,DOC_TYPE_CODE,DOC_TYPE_DESC,DOC_REF_ID,REFERENCE,AUTHOR,PUBLICATION_DATE,TITLE,PUBLICATION,SAMREF_CNO,SAMREF_RECORD_URL,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENV,Envelope,09294,ENV 09294,"Hughes, F.J.;Johnson, P.;Olliver, J.G.;Holden,...",NaT,Data release - as updated [made at SA Director...,Government of South Australia. Department for ...,2036021.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,17,OTH,Other,,,ANONYMOUS,NaT,RECORD OF MINES SUMMARY CARD NO 97,SOUTH AUSTRALIA. DEPARTMENT OF MINES AND ENERG...,,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
2,17,RB,Report Book,63/00058,RB 63/00058,"Leeson, B.",1970-01-01,Geology of the Beltana 1:63 360 map area.,South Australia. Department of Mines. Report Book,2000960.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849


In [128]:
sarig_md_reference_exp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 14680 entries, 0 to 14679
Data columns (total 19 columns):
MINERAL_DEPOSIT_NO    14680 non-null int64
DOC_TYPE_CODE         14642 non-null object
DOC_TYPE_DESC         14642 non-null object
DOC_REF_ID            10919 non-null object
REFERENCE             11119 non-null object
AUTHOR                13981 non-null object
PUBLICATION_DATE      6343 non-null datetime64[ns]
TITLE                 14640 non-null object
PUBLICATION           14638 non-null object
SAMREF_CNO            11069 non-null float64
SAMREF_RECORD_URL     11069 non-null object
SITE_NO               14680 non-null int64
EASTING_GDA2020       14680 non-null float64
NORTHING_GDA2020      14680 non-null float64
ZONE_GDA2020          14680 non-null int64
LONGITUDE_GDA2020     14680 non-null float64
LATITUDE_GDA2020      14680 non-null float64
LONGITUDE_GDA94       14680 non-null float64
LATITUDE_GDA94        14680 non-null float64
dtypes: datetime64[ns](1), float64(7), int64

In [129]:
sarig_md_reference_exp.isnull().any()

MINERAL_DEPOSIT_NO    False
DOC_TYPE_CODE          True
DOC_TYPE_DESC          True
DOC_REF_ID             True
REFERENCE              True
AUTHOR                 True
PUBLICATION_DATE       True
TITLE                  True
PUBLICATION            True
SAMREF_CNO             True
SAMREF_RECORD_URL      True
SITE_NO               False
EASTING_GDA2020       False
NORTHING_GDA2020      False
ZONE_GDA2020          False
LONGITUDE_GDA2020     False
LATITUDE_GDA2020      False
LONGITUDE_GDA94       False
LATITUDE_GDA94        False
dtype: bool

Since the columns "MINERAL_DEPOSIT_NO", "SITE_NO", "LONGITUDE_GDA2020" and "LATITUDE_GDA2020" contain no null values, they are candidate record identifiers for the following analysis.
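The null check can also be reduced to a plain list of candidate columns. This is a minimal sketch on made-up rows; only the column names mirror the reference table, and `candidate_columns` is a name introduced here for illustration.

```python
import pandas as pd

# Made-up rows; only the column names mirror the reference table.
demo = pd.DataFrame({
    "MINERAL_DEPOSIT_NO": [17, 17, 21],
    "SITE_NO": [210457, 210457, 210458],
    "DOC_REF_ID": ["09294", None, "63/00058"],
})

# Columns without any null value are candidates for a record identifier.
candidate_columns = [col for col in demo.columns if demo[col].notnull().all()]
print(candidate_columns)
```

A null-free column is only a candidate: whether it actually identifies a record still depends on the uniqueness checks that follow.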

In [130]:
# check the uniqueness of the candidate identifiers
print(len(sarig_md_reference_exp['MINERAL_DEPOSIT_NO'].unique()), len(sarig_md_reference_exp['SITE_NO'].unique()))
print(len(sarig_md_reference_exp[['LONGITUDE_GDA2020', 'LATITUDE_GDA2020']].drop_duplicates()))


6822 6822
6819


Here, 'MINERAL_DEPOSIT_NO' and 'SITE_NO' have the same number of unique values (6822), but there are only 6819 unique ['LONGITUDE_GDA2020', 'LATITUDE_GDA2020'] pairs. Some distinct 'MINERAL_DEPOSIT_NO' or 'SITE_NO' values may therefore correspond to the same longitude and latitude. This should be investigated.

In [131]:
# remove the duplicates 
site_lon_lat = sarig_md_reference_exp[['MINERAL_DEPOSIT_NO','SITE_NO', 'LONGITUDE_GDA2020', 'LATITUDE_GDA2020']].drop_duplicates()

# count the records corresponding to the same longitude and latitude
count_site = site_lon_lat.groupby(by=['LONGITUDE_GDA2020', 'LATITUDE_GDA2020']).count()

# find the coordinates shared by more than one MINERAL_DEPOSIT_NO
ifentified_lon_lat = count_site[count_site['MINERAL_DEPOSIT_NO']!=1].reset_index()[['LONGITUDE_GDA2020', 'LATITUDE_GDA2020']]

In [132]:
sarig_md_reference_exp.merge(ifentified_lon_lat, how='inner', on=['LONGITUDE_GDA2020', 'LATITUDE_GDA2020']).set_index(['LONGITUDE_GDA2020', 'LATITUDE_GDA2020'])

LONGITUDE_GDA2020,LATITUDE_GDA2020,MINERAL_DEPOSIT_NO,DOC_TYPE_CODE,DOC_TYPE_DESC,DOC_REF_ID,REFERENCE,AUTHOR,PUBLICATION_DATE,TITLE,PUBLICATION,SAMREF_CNO,SAMREF_RECORD_URL,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
138.826976,-34.654329,3612,ENV,Envelope,03336,ENV 03336,"Wells, R.",1978-01-01,History of mines in the areas held under ELs 3...,South Australia. Department of Mines and Energ...,1006218.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,476755,300860.7,6163141.47,54,138.826966,-34.654315
138.826976,-34.654329,3612,OTH,Other,,,ANONYMOUS,NaT,RECORD OF MINES SUMMARY CARD NO 81,SOUTH AUSTRALIA. DEPARTMENT OF MINES AND ENERG...,,,476755,300860.7,6163141.47,54,138.826966,-34.654315
138.826976,-34.654329,3717,MESAJ,MESA Journal,071,MESAJ 071,"Drew, G.",NaT,Barossa Goldfield: a historical snapshot.,MESA Journal,2036243.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,476860,300860.7,6163141.47,54,138.826966,-34.654315
138.826976,-34.654329,3717,OTH,Other,,,ANONYMOUS,NaT,RECORD OF MINES SUMMARY CARD NO 196,SOUTH AUSTRALIA. DEPARTMENT OF MINES AND ENERG...,,,476860,300860.7,6163141.47,54,138.826966,-34.654315
136.939669,-33.482961,7104,RB,Report Book,89/00051,RB 89/00051,"Flint, D.J.;Dubowski, E.A.",NaT,Cowell Jade Province - Detailed geological map...,South Australia. Department of Mines and Energ...,5589.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,480250,680215.71,6293486.53,53,136.939659,-33.482947
136.939669,-33.482961,7105,RB,Report Book,89/00051,RB 89/00051,"Flint, D.J.;Dubowski, E.A.",NaT,Cowell Jade Province - Detailed geological map...,South Australia. Department of Mines and Energ...,5589.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,480251,680215.71,6293486.53,53,136.939659,-33.482947
139.379872,-30.221761,10516,BULL,Bulletin,030,BULL 030,"Dickinson, S.B.;Wade, M.L.;Webb, B.P.",2054-01-01,Geology of the East Painter uranium deposits (...,South Australia. Geological Survey. Bulletin,1007468.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,2006782,344080.71,6655531.48,54,139.379862,-30.221747
139.379872,-30.221761,10517,BULL,Bulletin,030,BULL 030,"Dickinson, S.B.;Wade, M.L.;Webb, B.P.",2054-01-01,Geology of the East Painter uranium deposits (...,South Australia. Geological Survey. Bulletin,1007468.0,https://sarigbasis.pir.sa.gov.au/WebtopEw/ws/s...,2006783,344080.71,6655531.48,54,139.379862,-30.221747


From the output above, there are three cases in which two MINERAL_DEPOSIT_NO values share the same (LONGITUDE, LATITUDE) pair. This may happen when two sites (with different SITE_NO and MINERAL_DEPOSIT_NO) have the same coordinates. The following output supports this interpretation.

In [133]:
# the set of MINERAL_DEPOSIT_NO values that share coordinates
sarig_md_reference_exp.merge(
    ifentified_lon_lat, how='inner', 
    on=['LONGITUDE_GDA2020', 'LATITUDE_GDA2020']).set_index(
    ['LONGITUDE_GDA2020', 'LATITUDE_GDA2020'])['MINERAL_DEPOSIT_NO'].values

array([ 3612,  3612,  3717,  3717,  7104,  7105, 10516, 10517],
      dtype=int64)
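The shared-coordinate check above can also be written more compactly by counting distinct deposits per coordinate pair. This is a sketch on made-up rows, not the notebook's own method; `deposits_per_coord`, `shared` and the `N_DEPOSITS` column are names introduced here.

```python
import pandas as pd

# Made-up rows: deposits 1 and 2 share a coordinate, deposit 3 does not.
demo = pd.DataFrame({
    "MINERAL_DEPOSIT_NO": [1, 1, 2, 3],
    "LONGITUDE_GDA2020": [138.0, 138.0, 138.0, 139.5],
    "LATITUDE_GDA2020": [-34.0, -34.0, -34.0, -30.2],
})

coords = ["LONGITUDE_GDA2020", "LATITUDE_GDA2020"]

# Number of distinct deposits recorded at each coordinate pair.
deposits_per_coord = demo.groupby(coords)["MINERAL_DEPOSIT_NO"].nunique()

# Keep only the coordinate pairs used by more than one deposit.
shared = deposits_per_coord[deposits_per_coord > 1].reset_index(name="N_DEPOSITS")
print(shared)
```

Because `nunique` counts distinct deposits, repeated reference rows for the same deposit do not need to be removed first.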

### Commodities 

This section identifies the set of commodity names and lets users of this code select the commodities for which they want to extract related data.
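Since the selection is typed by hand, a misspelled commodity name would silently return no rows. The sketch below shows one possible guard; `select_commodities` is a hypothetical helper, and the three demo rows reuse SITE_NO and commodity values that appear in the commodity table of this notebook.

```python
import pandas as pd


def select_commodities(commodity_df, names, column="COMMODITY_NAME"):
    """Return the rows whose commodity is in `names`, rejecting unknown names."""
    available = set(commodity_df[column].unique())
    unknown = sorted(set(names) - available)
    if unknown:
        raise ValueError("Unknown commodity names: {}".format(unknown))
    return commodity_df[commodity_df[column].isin(names)]


demo = pd.DataFrame({
    "SITE_NO": [210457, 209556, 209576],
    "COMMODITY_NAME": ["Copper", "Iron", "Rare Earths"],
})
selected = select_commodities(demo, ["Copper"])
print(selected)
```

Raising an error on unknown names is a design choice for this sketch; a warning would work as well if partial selections are acceptable.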

In [134]:
sarig_md_commodity_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_commodity_exp.csv','r'), 
    sep=',', encoding='latin1')
sarig_md_commodity_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,COMMODITY_CODE,COMMODITY_NAME,COMMODITY_CONFIDENCE,SIGNIFICANCE,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENTERPRISE BELTANA,Cu,Copper,,MAJOR,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,21,HARVEYS RETURN,Cu,Copper,,MAJOR,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
2,33,OOLDEA,Fe,Iron,,MAJOR,209556,779511.42,6607901.66,52,131.916017,-30.62887,131.916008,-30.628856
3,34,VICTORY DOWNS,REE,Rare Earths,,MAJOR,209576,300730.84,7120631.49,53,133.008773,-26.019552,133.008764,-26.019538
4,34,VICTORY DOWNS,HMIN,Heavy Minerals,,MINOR,209576,300730.84,7120631.49,53,133.008773,-26.019552,133.008764,-26.019538


In [135]:
sarig_md_commodity_exp.loc[sarig_md_commodity_exp['MINERAL_DEPOSIT_NO'].isin([3612,  3612,  3717,  3717,  7104,  7105, 10516, 10517])]

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,COMMODITY_CODE,COMMODITY_NAME,COMMODITY_CONFIDENCE,SIGNIFICANCE,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
2621,3612,HISSEYS GULLY,Au,Gold,,MAJOR,476755,300860.7,6163141.47,54,138.826976,-34.654329,138.826966,-34.654315
2750,3717,VICTORIA HILL,Au,Gold,,MAJOR,476860,300860.7,6163141.47,54,138.826976,-34.654329,138.826966,-34.654315
5568,7104,COWELL JADE 37,JADE,Jade,,MAJOR,480250,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
5569,7104,COWELL JADE 37,MARB,Marble,,MINOR,480250,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
5570,7104,COWELL JADE 37,TALC,Talc,,MINOR,480250,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
5571,7105,COWELL JADE 38,TALC,Talc,,MINOR,480251,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
5572,7105,COWELL JADE 38,JADE,Jade,,MAJOR,480251,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
5573,7105,COWELL JADE 38,MARB,Marble,,MINOR,480251,680215.71,6293486.53,53,136.939669,-33.482961,136.939659,-33.482947
9575,10516,EAST PAINTER C,U,Uranium,,MAJOR,2006782,344080.71,6655531.48,54,139.379872,-30.221761,139.379862,-30.221747
9576,10517,EAST PAINTER B,U,Uranium,,MAJOR,2006783,344080.71,6655531.48,54,139.379872,-30.221761,139.379862,-30.221747


These records have the same coordinates but different MINERAL_DEPOSIT_NO and different DEPOSIT_NAME values. This supports the interpretation that some sites (SITE_NO, MINERAL_DEPOSIT_NO) share coordinates. It also suggests that SITE_NO or MINERAL_DEPOSIT_NO, rather than the coordinates, should be used as the record identifier.
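Equal unique counts for SITE_NO and MINERAL_DEPOSIT_NO (6822 each) suggest, but do not prove, that the two columns map one-to-one. A direct check could look like the sketch below; `is_one_to_one` is a hypothetical helper, and the demo rows reuse identifier pairs shown in the reference output above.

```python
import pandas as pd


def is_one_to_one(df, left, right):
    """Return True when each `left` value pairs with exactly one `right` value and vice versa."""
    pairs = df[[left, right]].drop_duplicates()
    return bool(pairs[left].is_unique and pairs[right].is_unique)


demo = pd.DataFrame({
    "MINERAL_DEPOSIT_NO": [3612, 3612, 3717, 7104],
    "SITE_NO": [476755, 476755, 476860, 480250],
})
print(is_one_to_one(demo, "MINERAL_DEPOSIT_NO", "SITE_NO"))
```

If the check holds on the full table, either column can serve as the identifier interchangeably.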

In [136]:
sarig_md_commodity_exp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11017 entries, 0 to 11016
Data columns (total 14 columns):
MINERAL_DEPOSIT_NO      11017 non-null int64
DEPOSIT_NAME            11017 non-null object
COMMODITY_CODE          11017 non-null object
COMMODITY_NAME          11017 non-null object
COMMODITY_CONFIDENCE    125 non-null object
SIGNIFICANCE            11016 non-null object
SITE_NO                 11017 non-null int64
EASTING_GDA2020         11017 non-null float64
NORTHING_GDA2020        11017 non-null float64
ZONE_GDA2020            11017 non-null int64
LONGITUDE_GDA2020       11017 non-null float64
LATITUDE_GDA2020        11017 non-null float64
LONGITUDE_GDA94         11017 non-null float64
LATITUDE_GDA94          11017 non-null float64
dtypes: float64(6), int64(3), object(5)
memory usage: 1.2+ MB


In [137]:
'''
use this code to generate the set_commodity_name
 sarig_md_commodity_exp['COMMODITY_NAME'].unique()
set_commodity_name = ['Copper', 'Iron', 'Rare Earths', 'Heavy Minerals', 
                      'Gold', 'Chrysoprase', 'Cobalt', 'Nickel','Corundum', 
                      'Vanadium', 'Ilmenite', 'Chromium', 'Agate', 'Celestite',
                      'Clay', 'Shale', 'Granite', 
                      'Ironstone - construction materials', 'Opal', 'Alunite',
                      'Micaceous Hematite', 'Kaolin', 'Dolomite', 'Limestone',
                      'Gravel', 'Sandstone', 'Quartzite', 'Dolerite', 
                      'Rhyolite', 'Graphite', 'Magnesite', 'Lead', 'Marble', 
                      'Uranium', 'Thorium', 'Asbestos', 'Zinc', 'Talc', 
                      'Manganese', 'Sand', 'Gneiss', 'Gabbro', 'Amphibolite', 
                      'Beryl', 'Uranium Oxide', 'Iron Ore', 'Silver', 'Schist', 
                      'Calcrete', 'Metasiltstone', 'Amazonite', 'Tungsten', 
                      'Molybdenum', 'Gypsum', 'Lime sand', 'Phosphate', 
                      'Diamond', 'Platinoids', 'Salt', 'Aluminium', 'Tin', 
                      'Amethyst', 'Jade', 'Pozzolan (Volcanic Ash)', 
                      'Silica sand', 'Sapphire', 'Slate', 'Basalt', 
                      'Tourmaline', 'Feldspar', 'Silica', 'Barite', 
                      'Calcite', 'Fluorite', 'Kyanite', 'Sulphur', 'Quartz', 
                      'Sillimanite', 'Mica', 'Beryllium', 'Pegmatite', 
                      'Andalusite', 'Bismuth','Carphosiderite', 'Chiastolite',
                      'Radium', 'Wollastonite', 'Arsenic', 'Garnet', 'Ochre',
                      'Coal', 'Rutile', 'Mercury','Palygorskite', 'Turquoise', 
                      'Scholzite', 'Shell grit', 'Topaz','Vermiculite', 
                      'Siltstone', 'Norite', 'Magnesium', 'Antimony',
                      'Epsomite', 'Albite', 'Ruby', 'Trona', 'Potash', 'Peat',
                      'Diatomite', 'Tantalum', 'Oil Shale', 'Nephrite', 
                      'Allanite', 'Monazite', 'Halloysite', 'Titanium', 'Gas',
                      'Evaporites',  'Geothermal Energy', 'Lithium']
'''
# select the commodities of interest from the commodity names above
commodities_interested = ['Copper', 'Gold']
interested_md_commodity_exp = sarig_md_commodity_exp[
    sarig_md_commodity_exp['COMMODITY_NAME'].isin(commodities_interested)]

Here, we use SITE_NO as the record identifier.

In [138]:
# the SITE_NO corresponding to the selected commodities
interested_md_commodity_exp_site = interested_md_commodity_exp['SITE_NO']
interested_md_commodity_exp_site

0         210457
1         210458
5         209575
6         209572
7         209573
          ...   
10992    2097707
10993    2097707
11001    2099334
11002    2099339
11015    2099497
Name: SITE_NO, Length: 3097, dtype: int64

In [139]:
# read the mineral deposit details data
sarig_md_details_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_details_exp.csv','r'), 
    sep=',', encoding='latin1')
sarig_md_details_exp['DISCOVERY_YEAR'] = sarig_md_details_exp['DISCOVERY_YEAR'].astype('Int64')
sarig_md_details_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,COMMODITIES,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD
0,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899,Occurrence,S continuation of the Black Feather vein syste...,Copper,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image
1,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Copper,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image
2,33,OOLDEA,,1987,Prospect,"aeromagnetic anomaly, of ~25km strike length a...",Iron,Magnetite,"Feldspar, Quartz",,N,Y,SH5212 OOLDEA,5236 Moondrah,1,209556,779511.42,6607901.66,52,131.916017,-30.62887,131.916008,-30.628856,50.0,,,DOCM,"Sourced from documents (PLANS, ENV, RB,etc)"
3,34,VICTORY DOWNS,,1991,Occurrence,stream sediment sampling returned heavy minera...,"Rare Earths, Heavy Minerals","Ilmenite, Monazite, Zircon",Magnetite,,N,Y,SG5309 ALBERGA,5545 Alcurra,4,209576,300730.84,7120631.49,53,133.008773,-26.019552,133.008764,-26.019538,50.0,,,UCMAP,Uncontrolled map
4,35,MENGERSONS WELL,GREENTONE,1970,Occurrence,1970s exploration by Australian Aquitaine Petr...,Copper,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image


In [140]:
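# one row per commodity; strip the space left after each comma (e.g. ' Heavy Minerals')
expand_commodities = (
    sarig_md_details_exp.set_index('MINERAL_DEPOSIT_NO')['COMMODITIES']
    .str.split(',', expand=True).stack().str.strip()
    .reset_index().drop('level_1', axis=1))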
expand_commodities.rename(columns={0: "COMMODITY_NAME"}, inplace=True)
expand_commodities.head(3)

Unnamed: 0,MINERAL_DEPOSIT_NO,COMMODITY_NAME
0,17,Copper
1,21,Copper
2,33,Iron
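The same one-row-per-commodity expansion can be sketched with `DataFrame.explode`, assuming pandas 0.25 or later (the version used by this notebook is not stated). The two demo rows reuse values from the details table above; `expanded` is a name introduced here.

```python
import pandas as pd

# Two rows with values taken from the details table above.
demo = pd.DataFrame({
    "MINERAL_DEPOSIT_NO": [17, 34],
    "COMMODITIES": ["Copper", "Rare Earths, Heavy Minerals"],
})

expanded = (
    demo.assign(COMMODITY_NAME=demo["COMMODITIES"].str.split(","))
    .explode("COMMODITY_NAME")       # one row per list element
    .drop(columns="COMMODITIES")
    .reset_index(drop=True)
)
# Remove the space that follows each comma in the source string.
expanded["COMMODITY_NAME"] = expanded["COMMODITY_NAME"].str.strip()
print(expanded)
```

This variant keeps the other columns of the table in place, so no separate merge back onto the details table is needed.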


In [141]:
expand_md_details_exp = sarig_md_details_exp.merge(expand_commodities, how='left', on='MINERAL_DEPOSIT_NO')
expand_md_details_exp.drop('COMMODITIES', axis=1, inplace=True)

In [142]:
interested_md_details_exp = expand_md_details_exp[
    expand_md_details_exp['SITE_NO'].isin(interested_md_commodity_exp_site)]
interested_md_details_exp.head()

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD,COMMODITY_NAME
0,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899.0,Occurrence,S continuation of the Black Feather vein syste...,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image,Copper
1,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899.0,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image,Copper
5,35,MENGERSONS WELL,GREENTONE,1970.0,Occurrence,1970s exploration by Australian Aquitaine Petr...,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image,Copper
6,36,INDULKANA FAULT,,1970.0,Occurrence,"oxidised copper mineralisation, with rare diss...","Azurite, Chalcopyrite, Chrysocolla, Malachite",Quartz,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008,200.0,,,GOEAR,Google Earth image,Copper
7,37,OLD WELL,,,Occurrence,Cu mineralisation adjacent to mafic dyke at fa...,Malachite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,200.0,,,GOEAR,Google Earth image,Copper


In [143]:
extract_mineral_deposit = pd.concat(
    [interested_md_commodity_exp_site, interested_md_details_exp], axis=1, 
    join='inner')
extract_mineral_deposit.head(5)

Unnamed: 0,SITE_NO,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO.1,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD,COMMODITY_NAME
0,210457,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899.0,Occurrence,S continuation of the Black Feather vein syste...,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image,Copper
1,210458,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899.0,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image,Copper
5,209575,35,MENGERSONS WELL,GREENTONE,1970.0,Occurrence,1970s exploration by Australian Aquitaine Petr...,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image,Copper
6,209572,36,INDULKANA FAULT,,1970.0,Occurrence,"oxidised copper mineralisation, with rare diss...","Azurite, Chalcopyrite, Chrysocolla, Malachite",Quartz,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008,200.0,,,GOEAR,Google Earth image,Copper
7,209573,37,OLD WELL,,,Occurrence,Cu mineralisation adjacent to mafic dyke at fa...,Malachite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,200.0,,,GOEAR,Google Earth image,Copper


In [144]:
sarig_md_mineralogy_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_mineralogy_exp.csv','r'), 
    sep=',', encoding='latin1')
sarig_md_mineralogy_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,MINERAL_CODE,MINERAL,MINERAL_TYPE,RELATIVE_ABUNDANCE_CODE,RELATIVE_ABUNDANCE_DESC,WEATHERING_PRODUCT,FORM_DISTRIBUTION,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENTERPRISE BELTANA,BN,Bornite,ORE,TRACE,1-4 % by volume,N,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,17,ENTERPRISE BELTANA,CC,Chalcocite,ORE,TRACE,1-4 % by volume,Y,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
2,17,ENTERPRISE BELTANA,CCP,Chalcopyrite,ORE,RARE,<1 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
3,17,ENTERPRISE BELTANA,MAL,Malachite,ORE,MINOR,5-29 % by volume,Y,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
4,17,ENTERPRISE BELTANA,QZ,Quartz,GANGUE,MAJOR,30- 70 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849


In [145]:
interested_md_mineralogy_exp = sarig_md_mineralogy_exp[sarig_md_mineralogy_exp['SITE_NO'].isin(interested_md_commodity_exp_site)]
interested_md_mineralogy_exp.head()

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,MINERAL_CODE,MINERAL,MINERAL_TYPE,RELATIVE_ABUNDANCE_CODE,RELATIVE_ABUNDANCE_DESC,WEATHERING_PRODUCT,FORM_DISTRIBUTION,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENTERPRISE BELTANA,BN,Bornite,ORE,TRACE,1-4 % by volume,N,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,17,ENTERPRISE BELTANA,CC,Chalcocite,ORE,TRACE,1-4 % by volume,Y,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
2,17,ENTERPRISE BELTANA,CCP,Chalcopyrite,ORE,RARE,<1 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
3,17,ENTERPRISE BELTANA,MAL,Malachite,ORE,MINOR,5-29 % by volume,Y,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
4,17,ENTERPRISE BELTANA,QZ,Quartz,GANGUE,MAJOR,30- 70 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849


In [146]:
extract_mineral_deposit = pd.concat(
    [extract_mineral_deposit, interested_md_mineralogy_exp], axis=1, 
    join='inner')
extract_mineral_deposit.head()

Unnamed: 0,SITE_NO,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO.1,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD,COMMODITY_NAME,MINERAL_DEPOSIT_NO.1,DEPOSIT_NAME.1,MINERAL_CODE,MINERAL,MINERAL_TYPE,RELATIVE_ABUNDANCE_CODE,RELATIVE_ABUNDANCE_DESC,WEATHERING_PRODUCT,FORM_DISTRIBUTION,SITE_NO.2,EASTING_GDA2020.1,NORTHING_GDA2020.1,ZONE_GDA2020.1,LONGITUDE_GDA2020.1,LATITUDE_GDA2020.1,LONGITUDE_GDA94.1,LATITUDE_GDA94.1
0,210457,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899.0,Occurrence,S continuation of the Black Feather vein syste...,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,BN,Bornite,ORE,TRACE,1-4 % by volume,N,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,210458,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899.0,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,CC,Chalcocite,ORE,TRACE,1-4 % by volume,Y,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
5,209575,35,MENGERSONS WELL,GREENTONE,1970.0,Occurrence,1970s exploration by Australian Aquitaine Petr...,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,SD,Siderite,GANGUE,TRACE,1-4 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
6,209572,36,INDULKANA FAULT,,1970.0,Occurrence,"oxidised copper mineralisation, with rare diss...","Azurite, Chalcopyrite, Chrysocolla, Malachite",Quartz,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,BAR,Barite,GANGUE,MINOR,5-29 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
7,209573,37,OLD WELL,,,Occurrence,Cu mineralisation adjacent to mafic dyke at fa...,Malachite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,CAL,Calcite,GANGUE,TRACE,1-4 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
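Note that `pd.concat(..., axis=1)` aligns rows on their index labels, not on SITE_NO. In the output above, the row with index 1 pairs HARVEYS RETURN (SITE_NO 210458) with a mineralogy record whose SITE_NO.2 is 210457. If the intent is to attach mineralogy to the matching site, which is an assumption about the goal here, a key-based merge is one option. The sketch below contrasts the two on miniature tables built from values shown above; `by_position` and `by_key` are names introduced for illustration.

```python
import pandas as pd

# Miniature tables with values taken from the outputs above.
details = pd.DataFrame(
    {"SITE_NO": [210457, 210458],
     "DEPOSIT_NAME": ["ENTERPRISE BELTANA", "HARVEYS RETURN"]},
    index=[0, 1],
)
mineralogy = pd.DataFrame(
    {"SITE_NO": [210457, 210457, 210458],
     "MINERAL": ["Bornite", "Chalcocite", "Barite"]},
    index=[0, 1, 2],
)

# Index-aligned concat: row 1 pairs HARVEYS RETURN with a mineral from site 210457.
by_position = pd.concat([details, mineralogy], axis=1, join="inner")

# Key-based merge: each mineral row is attached to the deposit with the same SITE_NO.
by_key = details.merge(mineralogy, on="SITE_NO", how="inner")
print(by_key)
```

The merge also avoids the duplicated column names (SITE_NO.1, SITE_NO.2 and so on) that the concatenation produces.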


In [147]:
sarig_md_zone_hr_lith_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_zone_hr_lith_exp.csv','r'), 
    sep=',', encoding='latin1')
sarig_md_zone_hr_lith_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,HOST_CATEGORY,HR_LITHOLOGY_CODE,LITHOLOGY,HR_LITHOLOGY_MODIFIER,HR_GIS_CODE,HR_MAP_SYMBOL,HR_STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,HR_STRAT_UNIT_MODIFIER,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,DOMINANT,SHLE,Shale,,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,21,DOMINANT,SHLE,Shale,red,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
2,33,DOMINANT,MSED,Metasediment,MYLONITIC,AL,AL,?,Archaean-Palaeoproterozoic rocks,Undifferentiated Archaean and/or Palaeoprotero...,,209556,779511.42,6607901.66,52,131.916017,-30.62887,131.916008,-30.628856
3,35,DOMINANT,DOLR,Dolerite,,M-------21,M21,,Mesoproterozoic unit 21,"Gabbro and microgabbro dykes, dark grey to bla...",,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934
4,36,DOMINANT,GNSS,Gneiss,,LMb,LMb,,Birksgate Complex,"Gneiss, quartzofeldspathic; orthogneiss, felsi...",,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008


In [148]:
interested_md_zone_hr_lith_exp = sarig_md_zone_hr_lith_exp[sarig_md_zone_hr_lith_exp['SITE_NO'].isin(interested_md_commodity_exp_site)]
interested_md_zone_hr_lith_exp.head()

Unnamed: 0,MINERAL_DEPOSIT_NO,HOST_CATEGORY,HR_LITHOLOGY_CODE,LITHOLOGY,HR_LITHOLOGY_MODIFIER,HR_GIS_CODE,HR_MAP_SYMBOL,HR_STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,HR_STRAT_UNIT_MODIFIER,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,DOMINANT,SHLE,Shale,,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,21,DOMINANT,SHLE,Shale,red,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
3,35,DOMINANT,DOLR,Dolerite,,M-------21,M21,,Mesoproterozoic unit 21,"Gabbro and microgabbro dykes, dark grey to bla...",,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934
4,36,DOMINANT,GNSS,Gneiss,,LMb,LMb,,Birksgate Complex,"Gneiss, quartzofeldspathic; orthogneiss, felsi...",,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008
5,37,DOMINANT,GNSS,Gneiss,,LMb--w,LMbw,,Wataru Gneiss,"Gneiss, granitic, medium-grained, with hornble...",,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204
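
The filter above keeps the original index labels of the surviving rows (0, 1, 3, 4, 5 in the output), which matters for any later step that aligns on the index. A minimal sketch of the same `isin` filter on a toy table follows; the table contents and the name `sites_of_interest` are made up for illustration.

```python
import pandas as pd

# Toy stand-in for a SARIG table (illustrative values only)
table = pd.DataFrame(
    {
        "SITE_NO": [210457, 210458, 209556, 209575],
        "LITHOLOGY": ["Shale", "Shale", "Metasediment", "Dolerite"],
    }
)

# Hypothetical list of site numbers selected in an earlier step
sites_of_interest = [210457, 209575]

# Boolean mask: True where SITE_NO is one of the selected sites
mask = table["SITE_NO"].isin(sites_of_interest)
selected = table[mask]

# The surviving rows keep their original index labels (0 and 3 here)
print(selected)
```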


In [149]:
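# NOTE: with axis=1, pd.concat aligns rows on the index labels, not on SITE_NO,
# so join='inner' keeps only the labels shared by both frames.
extract_mineral_deposit = pd.concat(
    [extract_mineral_deposit, interested_md_zone_hr_lith_exp], axis=1,
    join='inner')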
extract_mineral_deposit.head()

Unnamed: 0,SITE_NO,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO.1,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD,COMMODITY_NAME,MINERAL_DEPOSIT_NO.1,DEPOSIT_NAME.1,MINERAL_CODE,MINERAL,MINERAL_TYPE,RELATIVE_ABUNDANCE_CODE,RELATIVE_ABUNDANCE_DESC,WEATHERING_PRODUCT,FORM_DISTRIBUTION,SITE_NO.2,EASTING_GDA2020.1,NORTHING_GDA2020.1,ZONE_GDA2020.1,LONGITUDE_GDA2020.1,LATITUDE_GDA2020.1,LONGITUDE_GDA94.1,LATITUDE_GDA94.1,MINERAL_DEPOSIT_NO.2,HOST_CATEGORY,HR_LITHOLOGY_CODE,LITHOLOGY,HR_LITHOLOGY_MODIFIER,HR_GIS_CODE,HR_MAP_SYMBOL,HR_STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,HR_STRAT_UNIT_MODIFIER,SITE_NO.3,EASTING_GDA2020.2,NORTHING_GDA2020.2,ZONE_GDA2020.2,LONGITUDE_GDA2020.2,LATITUDE_GDA2020.2,LONGITUDE_GDA94.2,LATITUDE_GDA94.2
0,210457,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899.0,Occurrence,S continuation of the Black Feather vein syste...,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,BN,Bornite,ORE,TRACE,1-4 % by volume,N,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,17,DOMINANT,SHLE,Shale,,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,210458,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899.0,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,CC,Chalcocite,ORE,TRACE,1-4 % by volume,Y,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,21,DOMINANT,SHLE,Shale,red,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
5,209575,35,MENGERSONS WELL,GREENTONE,1970.0,Occurrence,1970s exploration by Australian Aquitaine Petr...,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,SD,Siderite,GANGUE,TRACE,1-4 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,37,DOMINANT,GNSS,Gneiss,,LMb--w,LMbw,,Wataru Gneiss,"Gneiss, granitic, medium-grained, with hornble...",,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204
6,209572,36,INDULKANA FAULT,,1970.0,Occurrence,"oxidised copper mineralisation, with rare diss...","Azurite, Chalcopyrite, Chrysocolla, Malachite",Quartz,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,BAR,Barite,GANGUE,MINOR,5-29 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,38,DOMINANT,GNSS,Gneiss,granitic,LMb--w,LMbw,,Wataru Gneiss,"Gneiss, granitic, medium-grained, with hornble...",,209594,350990.77,7020511.47,53,133.49916,-26.929299,133.499151,-26.929285
7,209573,37,OLD WELL,,,Occurrence,Cu mineralisation adjacent to mafic dyke at fa...,Malachite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,CAL,Calcite,GANGUE,TRACE,1-4 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,39,DOMINANT,LMST,Limestone,,N----r,N-r,?,Rodda beds,"Siltstone, grey-green, khaki, calcareous and d...",,209593,349160.79,7000011.47,53,133.478242,-27.114122,133.478233,-27.114108
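
In the output above, the row with index 5 pairs SITE_NO 209575 with SITE_NO.3 209573, because `pd.concat(..., axis=1)` matches rows by index label rather than by site number. If the intent is to combine records that describe the same site, a key-based merge is one alternative. The sketch below uses toy frames with made-up values; whether a merge on `SITE_NO` is the right join for the real tables is an assumption, since a site with several rows in either table would produce every combination of those rows.

```python
import pandas as pd

# Toy stand-ins for the deposit and host-rock tables (illustrative values only);
# the index labels differ on purpose, as they do after filtering
deposits = pd.DataFrame(
    {"SITE_NO": [210457, 210458, 209575], "DEPOSIT_NAME": ["A", "B", "C"]},
    index=[0, 1, 5],
)
host_rock = pd.DataFrame(
    {"SITE_NO": [210457, 210458, 209575], "LITHOLOGY": ["Shale", "Shale", "Dolerite"]},
    index=[0, 1, 3],
)

# Index-based concat: only index labels present in both frames survive (0 and 1)
by_index = pd.concat([deposits, host_rock], axis=1, join="inner")

# Key-based merge: rows are paired on SITE_NO regardless of their index labels
by_key = deposits.merge(host_rock, on="SITE_NO", how="inner")

print(len(by_index), len(by_key))
```

The merge also yields a single `SITE_NO` column instead of one copy per source table.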


In [150]:
# Read sarig_md_zone_lith_exp.csv directly from the zip archive, without extracting it
sarig_md_zone_lith_exp = pd.read_csv(
    zipfile(os.path.join(data_loc, file_name),'r').open('SARIG_Data_Package/sarig_md_zone_lith_exp.csv','r'),
    sep=',', encoding='latin1')
sarig_md_zone_lith_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,RANK,LITHOLOGY_CODE,LITHOLOGY,LITHOLOGY_MODIFIER,GIS_CODE,MAP_SYMBOL,STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,STRAT_UNIT_MODIFIER,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENTERPRISE BELTANA,DOMINANT,VEIM,Mesothermal Vein,bunches and stockwork of secondary copper mine...,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,cupriferous rather than ferruginous,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,21,HARVEYS RETURN,DOMINANT,VEIN,Vein (Undifferentiated),,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,green carbonates associated with iron,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
2,33,OOLDEA,DOMINANT,QMTU,Quartz-Magnetite Rock (Undiff. Origin),"QUARTZ-MAGNETITE--FELDSPAR-BIOTITE-GARNET, MYL...",M-,M,?,Mesoproterozoic rocks,Undifferentiated Mesoproterozoic rocks.,The sequence resembles Archaean metasediments ...,209556,779511.42,6607901.66,52,131.916017,-30.62887,131.916008,-30.628856
3,34,VICTORY DOWNS,DOMINANT,SAND,Sand,,H---a,Qha,,Holocene alluvial/fluvial sediments,Undifferentiated Holocene alluvial/fluvial sed...,,209576,300730.84,7120631.49,53,133.008773,-26.019552,133.008764,-26.019538
4,35,MENGERSONS WELL,DOMINANT,VEIN,Vein (Undifferentiated),,------qz,qz,,quartz vein,"Quartz veins/bodies, undifferentiated.",,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934


In [151]:
# Keep only the rows whose SITE_NO appears in interested_md_commodity_exp_site
interested_md_zone_lith_exp = sarig_md_zone_lith_exp[sarig_md_zone_lith_exp['SITE_NO'].isin(interested_md_commodity_exp_site)]
interested_md_zone_lith_exp.head(5)

Unnamed: 0,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,RANK,LITHOLOGY_CODE,LITHOLOGY,LITHOLOGY_MODIFIER,GIS_CODE,MAP_SYMBOL,STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,STRAT_UNIT_MODIFIER,SITE_NO,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94
0,17,ENTERPRISE BELTANA,DOMINANT,VEIM,Mesothermal Vein,bunches and stockwork of secondary copper mine...,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,cupriferous rather than ferruginous,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,21,HARVEYS RETURN,DOMINANT,VEIN,Vein (Undifferentiated),,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,green carbonates associated with iron,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
4,35,MENGERSONS WELL,DOMINANT,VEIN,Vein (Undifferentiated),,------qz,qz,,quartz vein,"Quartz veins/bodies, undifferentiated.",,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934
5,36,INDULKANA FAULT,DOMINANT,TCTO,Tectonite,,LMb,LMb,,Birksgate Complex,"Gneiss, quartzofeldspathic; orthogneiss, felsi...",,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008
6,37,OLD WELL,DOMINANT,DOLR,Dolerite,,M-------21,M21,,Mesoproterozoic unit 21,"Gabbro and microgabbro dykes, dark grey to bla...",,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204


In [152]:
# NOTE: rows are again aligned on index labels, not on SITE_NO
extract_mineral_deposit = pd.concat(
    [extract_mineral_deposit, interested_md_zone_lith_exp], axis=1,
    join='inner')
extract_mineral_deposit.head()

Unnamed: 0,SITE_NO,MINERAL_DEPOSIT_NO,DEPOSIT_NAME,DEPOSIT_SYNONYMS,DISCOVERY_YEAR,DEPOSIT_CLASS,DEPOSIT_SUMMARY,MINEROLOGY_ORE,MINEROLOGY_GANGUE,MINERAL_DISTRICT,MINE_SUMMARY_CARD,REFERENCE_FLAG,MAP_250000,MAP_100000,MAP_50000,SITE_NO.1,EASTING_GDA2020,NORTHING_GDA2020,ZONE_GDA2020,LONGITUDE_GDA2020,LATITUDE_GDA2020,LONGITUDE_GDA94,LATITUDE_GDA94,HORIZ_ACCRCY_M,ELEVATION_M,VERT_ACCRCY_M,SURVEY_METHOD_CODE,SURVEY_METHOD,COMMODITY_NAME,MINERAL_DEPOSIT_NO.1,DEPOSIT_NAME.1,MINERAL_CODE,MINERAL,MINERAL_TYPE,RELATIVE_ABUNDANCE_CODE,RELATIVE_ABUNDANCE_DESC,WEATHERING_PRODUCT,FORM_DISTRIBUTION,SITE_NO.2,EASTING_GDA2020.1,NORTHING_GDA2020.1,ZONE_GDA2020.1,LONGITUDE_GDA2020.1,LATITUDE_GDA2020.1,LONGITUDE_GDA94.1,LATITUDE_GDA94.1,MINERAL_DEPOSIT_NO.2,HOST_CATEGORY,HR_LITHOLOGY_CODE,LITHOLOGY,HR_LITHOLOGY_MODIFIER,HR_GIS_CODE,HR_MAP_SYMBOL,HR_STRAT_UNIT_CONF,HR_STRAT_UNIT,STRAT_DESCRIPTION,HR_STRAT_UNIT_MODIFIER,SITE_NO.3,EASTING_GDA2020.2,NORTHING_GDA2020.2,ZONE_GDA2020.2,LONGITUDE_GDA2020.2,LATITUDE_GDA2020.2,LONGITUDE_GDA94.2,LATITUDE_GDA94.2,MINERAL_DEPOSIT_NO.3,DEPOSIT_NAME.2,RANK,LITHOLOGY_CODE,LITHOLOGY.1,LITHOLOGY_MODIFIER,GIS_CODE,MAP_SYMBOL,STRAT_UNIT_CONF,HR_STRAT_UNIT.1,STRAT_DESCRIPTION.1,STRAT_UNIT_MODIFIER,SITE_NO.4,EASTING_GDA2020.3,NORTHING_GDA2020.3,ZONE_GDA2020.3,LONGITUDE_GDA2020.3,LATITUDE_GDA2020.3,LONGITUDE_GDA94.3,LATITUDE_GDA94.3
0,210457,17,ENTERPRISE BELTANA,BELTANA; ENTERPRISE; MC 6903,1899.0,Occurrence,S continuation of the Black Feather vein syste...,"Bornite, Chalcocite, Chalcopyrite, Malachite","Quartz, Siderite",,Y,Y,SH5409 COPLEY,6536 Copley,2,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,BN,Bornite,ORE,TRACE,1-4 % by volume,N,Stockwork; Veinlets,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,17,DOMINANT,SHLE,Shale,,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,17,ENTERPRISE BELTANA,DOMINANT,VEIM,Mesothermal Vein,bunches and stockwork of secondary copper mine...,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,cupriferous rather than ferruginous,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849
1,210458,21,HARVEYS RETURN,FOUR-MILE CLAIM,1899.0,Occurrence,Cu mineralisation in a NS-trending qz-siderite...,Malachite,"Barite, Calcite, Iron oxide (non specific), ...",,Y,Y,SH5409 COPLEY,6536 Copley,2,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,20.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,CC,Chalcocite,ORE,TRACE,1-4 % by volume,Y,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,21,DOMINANT,SHLE,Shale,red,N-hw-b,Nwb,,Bunyeroo Formation,"Siltstone, shale, grey-red to grey-green, part...",,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,21,HARVEYS RETURN,DOMINANT,VEIN,Vein (Undifferentiated),,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,green carbonates associated with iron,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531
5,209575,35,MENGERSONS WELL,GREENTONE,1970.0,Occurrence,1970s exploration by Australian Aquitaine Petr...,Cuprite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209575,324250.84,7019051.49,53,133.229722,-26.939354,133.229713,-26.93934,100.0,,,GOEAR,Google Earth image,Copper,17,ENTERPRISE BELTANA,SD,Siderite,GANGUE,TRACE,1-4 % by volume,N,,210457,251540.85,6581151.56,54,138.401168,-30.876863,138.401157,-30.876849,37,DOMINANT,GNSS,Gneiss,,LMb--w,LMbw,,Wataru Gneiss,"Gneiss, granitic, medium-grained, with hornble...",,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,36,INDULKANA FAULT,DOMINANT,TCTO,Tectonite,,LMb,LMb,,Birksgate Complex,"Gneiss, quartzofeldspathic; orthogneiss, felsi...",,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008
6,209572,36,INDULKANA FAULT,,1970.0,Occurrence,"oxidised copper mineralisation, with rare diss...","Azurite, Chalcopyrite, Chrysocolla, Malachite",Quartz,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209572,314630.82,7035461.48,53,133.135293,-26.790022,133.135284,-26.790008,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,BAR,Barite,GANGUE,MINOR,5-29 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,38,DOMINANT,GNSS,Gneiss,granitic,LMb--w,LMbw,,Wataru Gneiss,"Gneiss, granitic, medium-grained, with hornble...",,209594,350990.77,7020511.47,53,133.49916,-26.929299,133.499151,-26.929285,37,OLD WELL,DOMINANT,DOLR,Dolerite,,M-------21,M21,,Mesoproterozoic unit 21,"Gabbro and microgabbro dykes, dark grey to bla...",,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204
7,209573,37,OLD WELL,,,Occurrence,Cu mineralisation adjacent to mafic dyke at fa...,Malachite,,,N,Y,SG5309 ALBERGA,5544 Agnes Creek,3,209573,310120.86,7037921.58,53,133.090317,-26.767218,133.090308,-26.767204,200.0,,,GOEAR,Google Earth image,Copper,21,HARVEYS RETURN,CAL,Calcite,GANGUE,TRACE,1-4 % by volume,N,,210458,251431.67,6581960.86,54,138.400224,-30.869545,138.400213,-30.869531,39,DOMINANT,LMST,Limestone,,N----r,N-r,?,Rodda beds,"Siltstone, grey-green, khaki, calcareous and d...",,209593,349160.79,7000011.47,53,133.478242,-27.114122,133.478233,-27.114108,38,WELLS GRANITE DOWNS,DOMINANT,VEIN,Vein (Undifferentiated),quartz,------qz04,qz4,,quartz vein unit 4,Undifferentiated ferruginous quartz veins.,,209594,350990.77,7020511.47,53,133.49916,-26.929299,133.499151,-26.929285


### Save the data extracted from the mineral deposit dataset for the selected commodities

In [153]:
# header=True writes the column names; the DataFrame index is written as the first column
extract_mineral_deposit.to_csv("{}_extract_from_mineral_deposit.csv".format('_'.join(commodities_interested)), sep=',', header=True)
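
Because the index is written as an unnamed first column, reading the saved file back with `pd.read_csv` produces an extra `Unnamed: 0` column. A minimal sketch of a round trip that avoids this by passing `index=False` is shown below. The commodity list, the toy frame and the temporary output directory are assumptions for illustration; only the file-name pattern comes from the cell above.

```python
import os
import tempfile

import pandas as pd

# Hypothetical values standing in for the notebook's variables
commodities_interested = ["Copper", "Gold"]
extract = pd.DataFrame(
    {"SITE_NO": [210457, 210458], "COMMODITY_NAME": ["Copper", "Copper"]}
)

# Same file-name pattern as the cell above, written to a temporary directory
out_name = "{}_extract_from_mineral_deposit.csv".format("_".join(commodities_interested))
out_path = os.path.join(tempfile.mkdtemp(), out_name)

# index=False leaves the index out of the file, so no 'Unnamed: 0' column on reload
extract.to_csv(out_path, index=False)
reloaded = pd.read_csv(out_path)

print(reloaded.columns.tolist())
```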