# Calculating sums and averages
This workflow describes how to calculate statistics of the elastic properties in a set of intervals / zones, for a set of wells. The result is stored in a RokDoc *Sums and Averages* Excel file, which can be used in RokDoc after converting the result file from *.xlsx* to *.xls*.

## Project table
The Excel sheet *project_table.xlsx*, in the *excels* folder of the install directory, is the central hub that determines which wells, and which well logs, to use. Please see *Introduction to blixt_rp.ipynb* for basic usage.



<img src="images/ProjectTable.png" />

In [1]:
import os
import blixt_utils.io.io as uio
import core.well as cw
from rp_utils.calc_stats import calc_stats2 as calc_stats
from core.well import Project
from plotting import plot_rp

## Create a wells project


In [2]:
wp = Project(name='MyProject', tops_file='test_data/RokDocTops.xlsx', tops_type='rokdoc')

*Project* also takes the keywords:
- *working_dir* - the folder where the project is situated
- *project_table* - full or relative path name of the project table (.xlsx) explained at the top of this notebook
- *log_to_stdout* - if True, logging information is sent to standard output, else to a text file
- *tops_type* - tells the project which kind of tops are in use: *rokdoc*, *petrel* or *npd*

By default, none of these need to be set, and the project uses the install directory, together with the example data and project table found there.

## Load project table


In [3]:
#well_table = uio.project_wells(wp.project_table, wp.working_dir)
wells = wp.load_all_wells()

Load the project templates

In [9]:
templates = wp.load_all_templates()

## Load tops or working intervals
Well tops can be loaded from a file in a format exported from Petrel, npd.no, or RokDoc.

Alternatively, you can define working intervals in the *Working intervals* sheet of the project table.

<img src="images/ProjectTable_working_intervals.png" />

In the above example, the working intervals are defined through
- *Column A*: **Use** 
 - This column is not in use in the current version
- *Column B*: **Given well name**
 - The name of the well in the project, as defined in the *Wells table* sheet.
- *Column C*: **Interval name**
 - name of the working interval
- *Column D & E*: **Top depth** & **Base depth**
 - Depth in meters MD to top and base of the working interval
 
The first five rows of this sheet should not be modified.

Load the working intervals through

In [7]:
wis = wp.load_all_wis()
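To make the sheet layout concrete, the sketch below mimics the well / interval / top / base columns with a plain nested dictionary and picks out the depth samples that fall inside one interval. The dictionary layout, the depth values and the helper `samples_in_interval` are assumptions made for illustration only; the object returned by `load_all_wis()` may be organized differently.

```python
# Hypothetical stand-in for the "Working intervals" sheet:
# {well name: {interval name: (top depth, base depth)}}, depths in meters MD.
# The values are made up for illustration.
working_intervals = {
    'WELL_A': {
        'Sand H': (1500.0, 1520.0),
        'Shale G': (1520.0, 1580.0),
    },
}


def samples_in_interval(depths, well, interval, intervals=working_intervals):
    """Return the indices of the depth samples between top and base (inclusive)."""
    top, base = intervals[well][interval]
    return [i for i, md in enumerate(depths) if top <= md <= base]


# Made-up depth samples (meters MD) for one well
depths = [1490.0, 1500.0, 1510.0, 1520.0, 1530.0]
sand_h_idx = samples_in_interval(depths, 'WELL_A', 'Sand H')
```

Here `sand_h_idx` holds the positions of the samples at 1500, 1510 and 1520 m MD, which are the only samples that statistics for *Sand H* would be based on in this toy example.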

## Calculate RokDoc compatible Sums & Averages
This functionality is useful when you want to analyze the statistics of several formations / intervals across multiple wells (.las files).
The results are saved as an .xlsx spreadsheet with statistics of the rock properties. To be read by RokDoc, the file has to be converted to .xls.

Tell Python where to save the results, where to find the wells, and where to find the tops file.

If the .xlsx file exists, it will be appended to.

If you want to save the results, or load the data from, elsewhere on your file system, please provide the full path name, using "forward slashes".
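If the path was copied from Windows Explorer with backslashes, the standard library can rewrite it with forward slashes. The snippet below is a minimal sketch; the folder `C:\Users\me\results_folder` is a hypothetical location used only for illustration.

```python
from pathlib import PureWindowsPath

# Hypothetical Windows-style path, written as a raw string so that the
# backslashes are not interpreted as escape characters
windows_style = r'C:\Users\me\results_folder\RokDoc_SumsAverages_from_python.xlsx'

# as_posix() returns the same path written with forward slashes
rd_file_example = PureWindowsPath(windows_style).as_posix()
```

`PureWindowsPath` only manipulates the text of the path, so this works on any operating system and does not require the file to exist.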

All QC plots are saved in the folder where the .xlsx file is situated. Their file names end with the 'tag' defined below.

In [10]:
rd_file = 'results_folder/RokDoc_SumsAverages_from_python.xlsx'

tag = 'my'

Modify the project_table.xlsx so that it points to the wells Well_A, Well_B and Well_C.

<img src="images/ProjectTable.png" />


Log names must be specified under each of the log types *P velocity*, *S velocity*, *Density*, *Porosity*, and *Volume*, else the output to RokDoc Sums and Averages will fail.

Multiple log names can be specified under each log type (e.g. *Vp* and *Vp_dry*), but only one log per log type can be used in the statistics.

Therefore we need a table that specifies which log to use under each log type:


In [11]:
log_table = {'P velocity': 'vp_dry', 'S velocity': 'vs_dry', 'Density': 'rho_dry', 'Porosity': 'phie', 'Volume': 'vcl'}
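Since a missing log type makes the RokDoc output fail, it can be worth checking the table before starting the calculation. The helper below is a sketch and not part of blixt_rp; the names `REQUIRED_LOG_TYPES` and `missing_log_types` are hypothetical.

```python
# The five log types named in the text above
REQUIRED_LOG_TYPES = ('P velocity', 'S velocity', 'Density', 'Porosity', 'Volume')


def missing_log_types(log_table, required=REQUIRED_LOG_TYPES):
    """Return the required log types that have no log name assigned."""
    return [log_type for log_type in required if not log_table.get(log_type)]


example_table = {'P velocity': 'vp_dry', 'S velocity': 'vs_dry',
                 'Density': 'rho_dry', 'Porosity': 'phie', 'Volume': 'vcl'}

missing = missing_log_types(example_table)
if missing:
    raise ValueError('No log selected for: {}'.format(', '.join(missing)))
```

With the complete table above, `missing` is an empty list and no error is raised.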

Determine which working intervals you'd like to calculate the statistics for

In [12]:
wi_sands = ['Sand H', 'Sand F', 'Sand E', 'Sand D', 'Sand C']
wi_shales = ['Shale G', 'Shale C']

Define the cut offs that are used to classify the data (e.g. sands or shales).
Within each interval, the statistics are only calculated for the data where the cut offs are valid.
The log names (e.g. *vcl*) corresponding to a log type (e.g. *Volume*) must exist in the .las files.

In [13]:
cutoffs_sands = {'Volume': ['<', 0.5], 'Porosity': ['>', 0.1]}
cutoffs_shales = {'Volume': ['>', 0.5], 'Porosity': ['<', 0.1]}
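One way to picture what such a cut off dictionary does is to turn it into a boolean mask over the log samples. The sketch below is not the implementation used by `calc_stats`: the helper `cutoff_mask`, the set of supported comparison symbols, the assumption that several cut offs are combined with a logical AND, and the sample values are all illustrative.

```python
import operator

# Comparison symbols assumed for this sketch
OPS = {'<': operator.lt, '<=': operator.le, '>': operator.gt, '>=': operator.ge}


def cutoff_mask(logs, cutoffs, log_table):
    """Return one boolean per sample, True where every cut off is satisfied."""
    n_samples = len(next(iter(logs.values())))
    mask = [True] * n_samples
    for log_type, (symbol, limit) in cutoffs.items():
        values = logs[log_table[log_type]]  # look up the log used for this log type
        compare = OPS[symbol]
        mask = [keep and compare(value, limit) for keep, value in zip(mask, values)]
    return mask


# Made-up samples of a clay volume log and an effective porosity log
example_logs = {'vcl': [0.2, 0.7, 0.4, 0.1], 'phie': [0.25, 0.05, 0.08, 0.15]}
example_log_table = {'Volume': 'vcl', 'Porosity': 'phie'}

sand_mask = cutoff_mask(example_logs,
                        {'Volume': ['<', 0.5], 'Porosity': ['>', 0.1]},
                        example_log_table)
```

In this toy data only the first and last samples pass both sand cut offs, so only those two would contribute to the sand statistics.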

### Run the calculation of the statistics
First for the sands

In [14]:
calc_stats(wells, log_table, wis, wi_sands, cutoffs_sands, 
              rokdoc_output=rd_file,
              working_dir=os.path.join(wp.working_dir, 'results_folder'),
              suffix=tag)

Interval: Sand H
Creating new RokDoc Sums and Averages file
Interval: Sand F


  mn = np.nanmean(results_per_well[this_well_name][key])
  keepdims=keepdims)


Appending to existing RokDoc Sums and averages file
Interval: Sand E
Appending to existing RokDoc Sums and averages file
Interval: Sand D
Appending to existing RokDoc Sums and averages file
Interval: Sand C
Appending to existing RokDoc Sums and averages file


Then for the shales

In [15]:
calc_stats(wells, log_table, wis, wi_shales, cutoffs_shales, 
              rokdoc_output=rd_file,
              working_dir=os.path.join(wp.working_dir, 'results_folder'),
              suffix=tag)

Interval: Shale G
Appending to existing RokDoc Sums and averages file
Interval: Shale C
Appending to existing RokDoc Sums and averages file
