# Introduction
This notebook is meant to work with real-time drilling data from the Volve dataset, converted from WITSML to CSV.

## Basic imports
Some imports are necessary for the notebook to work.

In [None]:
%matplotlib qt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from glob import glob
import os
import random

## Constructing a file list
The file list is constructed in this cell. It finds all *.csv* files located in the same folder as this notebook.

In [None]:
filelist = glob(r'*.csv', recursive=False)  # the search is not recursive by default

print ("Detected logs:\n")

# Print the index, file name and approximate size of every detected log
for i, path in enumerate(filelist):
    size_mb = os.path.getsize(path) // 1000000
    print('[' + str(i) + '] ' + os.path.basename(path) + ' ' + str(size_mb) + 'MB')

print ()

## Searching for attributes

This cell lets you define a search phrase (case-insensitive) that is matched against the attribute (column) names of every log. Only logs with at least one matching attribute are shown.

The cell is fast because it reads only the header and the first data row of each CSV file rather than the whole log.

In [None]:
searchphrase = "depth"
searchphrase = searchphrase.lower()

for j in filelist:
    
    df = pd.read_csv(j, nrows = 1)
    k = 0
    for i in list(df):
        if (i.lower().find(searchphrase) > -1):
            if k==0:
                print ("")
                print ("Found in: " + j)
                k=1
            print (i)

print()
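
The same search can be wrapped in a small reusable function. The sketch below is only an illustration: the name `find_matching_columns` is hypothetical and not part of the original notebook. It assumes that passing `nrows=0` to `pd.read_csv` is acceptable, which parses the header line only and loads no data rows.

```python
import pandas as pd

def find_matching_columns(csv_path, phrase):
    """Return the column names in csv_path that contain phrase (case-insensitive)."""
    # nrows=0 parses only the header line, so no data rows are loaded.
    header = pd.read_csv(csv_path, nrows=0)
    phrase = phrase.lower()
    return [col for col in header.columns if phrase in col.lower()]
```

Calling `find_matching_columns(j, "depth")` for each `j` in `filelist` should give the same matches as the loop above, returned as a list per file instead of printed.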

## Chart generator

It is difficult to devise a method that estimates the usability of an attribute in terms of completeness or noise. It is therefore convenient to quickly generate charts of the attributes and inspect them visually.

This function searches the column names of every log for a phrase, plots each matching attribute from a random sample of up to 10 000 rows per file (for performance reasons), and saves each chart as a high-resolution PNG file.

In [None]:
def chartgenerator(searchphrase):
    
    for j in filelist:
        print ("Reading " + j)
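        # Count the data rows (the header line is excluded)
        with open(j) as f:
            n = sum(1 for line in f) - 1
        s = 10000
        # Skip n - s randomly chosen rows; skip nothing if the file has at most s rows
        skip = sorted(random.sample(range(1, n + 1), max(n - s, 0)))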

        df = pd.read_csv(j, skiprows=skip)

        for i in list(df):
            if (i.lower().find(searchphrase) > -1):
                # Create one figure per attribute and close it after saving to free memory
                plt.figure(figsize=(20,10))
                df[i].dropna().plot(linewidth=1, label=i, title=os.path.basename(j))
                plt.legend()
                print(j + i + '.png')
                plt.savefig(j + i + '.png', dpi=300, bbox_inches='tight')
                plt.close()
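
The row sampling used by the chart generator can be isolated as follows. This is a minimal sketch, not the notebook's own code: the function name `read_csv_sample` and its `seed` parameter are hypothetical additions for illustration. It assumes that every data row occupies exactly one physical line (no quoted fields with embedded newlines), so that line numbers map directly to rows.

```python
import random
import pandas as pd

def read_csv_sample(csv_path, sample_size=10000, seed=None):
    """Read a random sample of at most sample_size data rows from a CSV file."""
    # Count data rows; line 0 is the header and is never skipped.
    with open(csv_path) as f:
        n_rows = sum(1 for _ in f) - 1
    if n_rows <= sample_size:
        # The file is small enough to be read completely.
        return pd.read_csv(csv_path)
    rng = random.Random(seed)
    # Pick the line numbers to drop so that exactly sample_size rows remain.
    skip = sorted(rng.sample(range(1, n_rows + 1), n_rows - sample_size))
    return pd.read_csv(csv_path, skiprows=skip)
```

Because rows are skipped rather than reordered, the sample keeps the original row order, which matters when the plotted attribute is a time or depth series.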


This routine cycles through a keyword list and feeds each keyword into the chart generator.

In [None]:
keywords = [
            'continuous',
            'gamma'
            ]

for k in keywords:
    k = k.lower()
    chartgenerator(k)