# Introduction
This notebook is meant to work with real-time drilling data from Volve dataset converted from WITSML to CSV. Files are available for download at http://www.ux.uis.no/~atunkiel/file_list.html.

## Basic imports
Some imports are necessary for the notebook to work.

In [1]:
%matplotlib qt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from glob import glob
import os
import random

## Constructing a file list
File list is constructed in this cell. It will find all *.csv* files located in the same folder as this notebook. 

In [13]:
filelist = glob(r'C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\*.csv', recursive=False) #search in not recursive by default

print ("Detected logs:\n")

for i in range(len(filelist)):
    print ('[' + str(i) + ']' + " " + filelist[i].split('\\')[-1] + " " + str(os.path.getsize(filelist[i])//1000000) + 'MB')

print ()

Detected logs:

[0] F9ADepth.csv 5MB
[1] F9ATime.csv 427MB



## Searching for attributes

This cell allows you to define a searchphrase (case insensitive) that will be searched for in the dataset. Only the logs that contain something will be shown.

This notebook is optimized for speed by reading just the first line of the CSV file.

In [14]:
searchphrase = "depth"
searchphrase = searchphrase.lower()

for j in filelist:
    
    df = pd.read_csv(j, nrows = 1)
    k = 0
    for i in list(df):
        if (i.lower().find(searchphrase) > -1):
            if k==0:
                print ("")
                print ("Found in: " + j)
                k=1
            print (i)

print()


Found in: C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csv
Measured Depth m
Lag Depth (TVD) m
Total Vertical Depth m
Hole Depth (TVD) m
Bit Depth m
Hole depth (MD) m

Found in: C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csv
Bit Depth (MD) m
Lagged Total Depth m
Total Depth m
SRVDEPTH m
Total Vertical Depth m
Bit Depth m
Bit Depth m.1
Depth chrom sample (meas) m
Hole depth (MD) m
Continuous Survey Depth m
Lag Depth (TVD) m
ESD_DELAY_DEPTH m



## Chart generator

It is difficult to create a method that would estimate usability of certain attribute from the point of completeness or noise. Therefore it is convinient to quickly generate charts of those attributes.

This function will search for a phrase and plot a sample of 10 000 points (for performance reasons) and generate a high resolution PNG file.

In [20]:
def chartgenerator(searchphrase):
    
    for j in filelist:
        print ("Reading " + j)
        n = sum(1 for line in open(j)) - 1
        s = 10000
        if (n - s) < 0 : n = s
        skip = sorted(random.sample(range(1,n+1),n-s))

        df = pd.read_csv(j, skiprows=skip)

        for i in list(df):
            if (i.lower().find(searchphrase) > -1):
                plt.clf()
                plt.figure(figsize=(20,10))
                df[i].dropna().plot(linewidth=1, label=i, title=j.split('\\')[-1])
                plt.legend()
                print(j + i + '.png')
                plt.savefig(j + i + '.png', dpi=300,bbox_inches='tight')


This routine will cycle through a keyword list and feed it into the chart generator.

In [23]:
keywords = [
            'continuous',
            'gamma'
            ]

for k in keywords:
    k = k.lower()
    chartgenerator(k)

Reading C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csv
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csvMWD Continuous Inclination dega.png
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csvMWD Continuous Azimuth dega.png
Reading C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csv
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csvMWD Continuous Inclination dega.png
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csvMWD Continuous Azimuth dega.png
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csvContinuous Survey Depth m.png
Reading C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csv
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\Vol

FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Users\\sarmad\\Google Drive\\GitHub\\Web GUI Development\\VolveDataExploration\\Data\\F9ADepth.csvMWD Raw Gamma Ray 1/s.png'

## Attribute list

This cell will provide a list of all the logs and will count the instances of attributes matching a searchphrase. This can be use 

In [18]:
searchphrase = "continuous"
searchphrase = searchphrase.lower()
print (searchphrase)
for j in filelist:
    print(j, end=": ")
    found=0
    df = pd.read_csv(j, nrows = 1)

    for i in list(df):
        if (i.lower().find(searchphrase) > -1):
            found = found + 1
    print(found)

continuous
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ADepth.csv: 2
C:\Users\sarmad\Google Drive\GitHub\Web GUI Development\VolveDataExploration\Data\F9ATime.csv: 3
