# Core Proportion

* This notebook is used to define the core_prop function.
* The core_prop function runs throught our interpreted well-data and outputs the following: (per data set)
    * "core_prop_dict" containing (units:mean core proportion) values
-----------------------------------------------------------------------------------------------------------------

## Importing Libraries and Data

In [1]:
# Importing libraries
import pandas as pd
import re

# Reading .csv
# data = pd.read_csv(r'Data/CSV/1-AV-001-PR.csv', sep=';', decimal=',') ## Dataset1
data = pd.read_csv(r'Data/CSV/1-CS-002-PR.csv', sep=';', decimal=',') ##Dataset2
data.head()

Unnamed: 0,Well,Unit,Facies,Facies Thickness,Flow thickness,Core Proportion
0,1-CS-002-PR,Paranapanema,Simple lava (basic | massive interior),3.65,,
1,1-CS-002-PR,Paranapanema,Simple lava (basic | L. crust),1.12,,
2,1-CS-002-PR,Paranapanema,Siliciclastics,1.52,,
3,1-CS-002-PR,Paranapanema,Simple lava (basic | U. crust),11.6,37.28,0.650215
4,1-CS-002-PR,Paranapanema,Simple lava (basic | massive interior),24.24,,


## Core proportion of lava flows per geochem group
-----------------------------------------------------------------------------------------------------------------   
* This code runs throught the WellCad data and calculates the following:
    * The mean core proportion per geochemical group.
-----------------------------------------------------------------------------------------------------------------
* Outputs:
    * A core_prop_dict containing (unit: mean core proportion) values

In [47]:
# core_prop function

def core_prop(data, core_prop_dict):
    'Calculates the mean core proportion of lava flows in a given interpreted welldata'
    'data --> interpreted welldata pandas dataframe'
    'core_prop_dict --> empty dict which will be outputted'
    
    # defining variables:
    sum_core_prop = 0 # stores the sum of "core proportion" values.
    n = 0 # number of values per unit. Used for the mean calculation.
    
    # Filtering inputed dataframe:
    # Given the way the datafame is formatted I can just drop any rows which do not have core proportion information.
    filtered = data.dropna(subset=['Core Proportion'])
    filtered.reset_index(inplace=True)
    
    # Algorithm:
    for i in range(len(filtered)):
        if i == 0: # setting up the first iteration
            core_prop_dict[filtered['Unit'].iloc[i]] = 0
            sum_core_prop += filtered['Core Proportion'].iloc[i]
            n += 1
        else: # After first iteration
            if data['Unit'].iloc[i] == filtered['Unit'].iloc[i-1]: # If units remains the same
                sum_core_prop += filtered['Core Proportion'].iloc[i]
                n +=1
            else: # If 'unit' changes:
                if data['Unit'].iloc[i] in core_prop_dict: # If data has already been added to the dict.
                    # Adding previous value to the dict:
                    mean = sum_core_prop/n
                    core_prop_dict[filtered['Unit'].iloc[i-1]] = mean
                    # Reseting values:
                    n = 1
                    sum_core_prop = core_prop_dict[filtered['Core Proportion'].iloc[i]]
                    sum_core_prop += data['Core Proportion'].iloc[i]
                else:
                    # Calculate mean core_prop value for current unit:
                    mean = sum_core_prop/n
                    core_prop_dict[filtered['Unit'].iloc[i-1]] = mean
                    # Reset values:
                    core_prop_dict[filtered['Unit'].iloc[i]] = 0
                    n = 1
                    sum_core_prop = filtered['Core Proportion'].iloc[i]

        

In [46]:
filtered = data.dropna(subset=['Core Proportion'])
filtered.reset_index(inplace=True)
filtered.head(50)

Unnamed: 0,index,Well,Unit,Facies,Facies Thickness,Flow thickness,Core Proportion
0,3,1-CS-002-PR,Paranapanema,Simple lava (basic | U. crust),11.6,37.28,0.650215
1,7,1-CS-002-PR,Pitanga,Simple lava (basic | U. crust),7.12,32.0,0.7175
2,10,1-CS-002-PR,Pitanga,Simple lava (basic | U. crust),4.91,22.8,0.666667
3,13,1-CS-002-PR,Pitanga,Simple lava (basic | rubbly flow top),11.6,39.36,0.622713
4,18,1-CS-002-PR,Pitanga,Simple lava (basic | U. crust),2.56,6.0,0.44
5,23,1-CS-002-PR,Pitanga,Simple lava (basic | U. crust),4.16,21.44,0.753731
6,27,1-CS-002-PR,Pitanga,Simple lava (basic | U. crust),19.57,56.13,0.619989
7,32,1-CS-002-PR,Pitanga,Simple lava (basic | rubbly flow top),14.4,65.44,0.757946
8,36,1-CS-002-PR,Pitanga,Simple lava (basic | rubbly flow top),9.76,34.16,0.583138
9,39,1-CS-002-PR,Pitanga,Simple lava (basic | rubbly flow top),8.08,19.44,0.530864


In [48]:
# Testing
dictionary = {}
core_prop(data, dictionary)
print(dictionary)

<class 'KeyError'>: 0.666666667