# **Determining wedge position status “ON/OFF”**


During drilling operations, the drilling tool is constantly being manipulated with a winch. At any moment the tool is in one of two states: either it hangs on the winch hook (so its weight is carried by the rig's hoisting system), or it is held by the wedges (slips) in the rotary table (so the entire weight of the tool rests on the rotary table and the hook carries no tool weight).

We need to determine, at any given time, whether the tool is held in the wedges or hangs on the hook (“aweight”).


The main input is the reading of the hook weight sensor. It might seem that determining the state only requires comparing the current weight with the weight of the empty hook. In practice it is not that easy: weight readings are subject to short-term and long-term dynamic deviations. Sometimes it is difficult even for a person to figure out what is happening just by looking at the data (although an experienced eye labels 99.9% of cases correctly).

Some labeled examples are attached and can be used for training.

In [None]:
import os
from zipfile import ZipFile
from shutil import copy

In [None]:
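fileName = "../data/wedges.zip"
# Extract the archive into the current working directory; the context manager
# closes the file handle before the archive is deleted
with ZipFile(fileName) as ds:
    ds.extractall()
os.remove(fileName)
print('Zip file is extracted')

os.chdir('wedges')
# Look at what is inside
!ls -lah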

**Definitions:**

The “ON” (“in wedges”) state is characterized by the absence of additional weight on the rig’s hook: the “hook weight” sensor reading is approximately (1) equal to the weight of the hook itself (“empty hook weight”, parameter FHW). This condition occurs when the drilling tool is fixed in the rotary table by the wedges, or when there is no tool in the well (everything is on the surface).

---

*(1) The hook may carry the additional weight of one stand (“candle”, ~0.7–1 t) or one pipe (~0.2–0.3 t) of the drilling tool while the stand is disconnected from the rest of the tool but not yet placed in the rack.*

---
The “OFF” (“aweight”) state is characterized by additional weight on the rig’s hook: the “hook weight” sensor reading is greater than the hook’s own weight (“empty hook weight”).
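The two definitions above suggest a naive baseline: compare the hook weight with the empty hook weight plus some tolerance. The sketch below illustrates only that idea. The function name `naive_slips_state`, the sample weights, the `fhw` value and the 1 t tolerance (chosen here to cover one stand, as in footnote (1)) are all assumptions made for illustration, not values taken from the dataset.

```python
import pandas as pd

def naive_slips_state(weight, empty_hook_weight, tolerance=1.0):
    """Return True ("ON", in wedges) where the hook weight does not exceed
    the empty hook weight by more than `tolerance` (all values in tonnes)."""
    return weight <= empty_hook_weight + tolerance

# Hypothetical hook weight readings in tonnes (not from the dataset)
weights = pd.Series([12.0, 12.4, 55.0, 54.2, 12.9, 13.5])
fhw = 12.0  # assumed empty hook weight

state = naive_slips_state(weights, fhw, tolerance=1.0)
print(state.tolist())
```

A fixed threshold like this ignores sensor drift and short-term dynamics, so it is best read as a starting point to compare other approaches against.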



In [None]:
from IPython.display import Image
Image('desc.png')

**Task:**

The goal is to automatically determine the current state of the wedges (“ON” or “OFF”).
The task is complicated by the fact that the “hook weight” parameter sometimes varies significantly due to various short-term (from seconds to minutes) dynamic effects, such as tool movement, tightening / landing of the drilling tool, vibration of the drilling rig and drilling tool, electromagnetic interference, recalibration of the sensor, as well as long-term (from hours to several days) effects such as temperature drift of the sensor or loosening of the sensor mount.

* SLIPS = ON: in wedges
* SLIPS = OFF: aweight
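One common way to reduce the influence of the short-term effects listed above is to smooth the hook weight before classifying it. The sketch below uses a centered rolling median, which is one possible choice and not necessarily the intended solution. The helper name `smooth_weight`, the window size and the sample readings are hypothetical.

```python
import pandas as pd

def smooth_weight(weight, window=5):
    """Centered rolling median; min_periods=1 keeps values at the series edges."""
    return weight.rolling(window=window, center=True, min_periods=1).median()

# Hypothetical readings in tonnes: an isolated spike at index 2,
# then a real transition to a loaded hook from index 5 onwards
raw = pd.Series([12.0, 12.1, 40.0, 12.2, 12.0, 55.0, 55.3, 54.8, 55.1, 55.0])
smoothed = smooth_weight(raw, window=5)

print(pd.DataFrame({"raw": raw, "smoothed": smoothed}))
```

The median suppresses the isolated spike while keeping the sustained change, at the cost of some blurring around the transition. Long-term effects such as sensor drift are not addressed by this kind of smoothing.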

In [None]:
%matplotlib inline
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import datetime

In [None]:
# Load the five labeled examples and tag each row with its example number
frames = []
for i in range(1, 6):
    frame = pd.read_excel(f'Example_{i}.xlsx')
    frame['example'] = i
    frames.append(frame)

data = pd.concat(frames, sort = False)
data = data.rename(columns = {"Время": "date",
                              "Положение талевого блока, м": "block_position", 
                              "Вес на крюке, т": "weight",
                              "Нагрузка на долото, т": "rotor_load",
                              "Обороты ротора, об/мин": "rotor_turn",
                              "Мех. скорость, м/ч": "rotor_speed",
                              "Давление на входе, кгс/см²": "preasure",
                              
                              "Расход на входе, л/мин": "entry_spend",
                              "Глубина долота, м": "chisels_depth",
                              "Глубина скважины, м": "wells_depth",
                             
                              "Положение талевого блока, м.1": "block_position.1",
                              "Глубина скважины, м.1": "wells_depth.1",
                              "Глубина долота, м.1": "chisels_depth.1",       
                             })

In [None]:
data.head()

In [None]:
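def converter(datestring):
    """Convert '<day> <Russian month abbreviation> <time>' to 'YYYY-MM-DD <time>'."""
    res = datestring.split(' ')

    # Replace the Russian month abbreviation with its two-digit number
    for old, new in [('янв', '01'), ('фев', '02'), ('мар', '03'), ('апр.', '04'), 
                     ('мая', '05'), ('июн', '06'), ('июл', '07'), ('авг', '08'), 
                     ('сен', '09'), ('окт', '10'), ('ноя', '11'), ('дек', '12')]:
        res[1] = res[1].replace(old, new)
    
    # The source strings carry no year, so 2019 is hardcoded
    result = '2019' + "-" + res[1]  + "-" + res[0] + ' ' + res[2]
    return result

data['date'] = data['date'].apply(converter)
data["date"] = pd.to_datetime(data["date"])
# na=False maps missing labels to False; without it NaN would be cast to True
data['SLIPS'] = data['SLIPS'].str.contains('ON', na=False).astype(bool)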


Check:
* Data Types
* Shape of the dataset
* Null values - identify the columns to drop based on the % of missing values
* Distribution of values for each variable
* Check if some variables have identical values
* Drop redundant/useless data from the dataset
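As a rough illustration of the checks listed above, the sketch below runs them on a tiny made-up frame rather than on the real data. The column names `weight_copy` and `mostly_missing`, the values and the 50% missing-value cutoff are assumptions chosen only for the example.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real data (hypothetical values)
toy = pd.DataFrame({
    "weight": [12.0, 55.0, 54.0, 12.5],
    "weight_copy": [12.0, 55.0, 54.0, 12.5],      # identical to "weight"
    "mostly_missing": [np.nan, np.nan, np.nan, 1.0],
    "SLIPS": [True, False, False, True],
})

print(toy.dtypes)      # data types
print(toy.shape)       # shape of the dataset
print(toy.describe())  # distribution of the numeric variables

# Share of missing values per column; drop columns above an assumed 50% cutoff
missing_share = toy.isnull().mean()
to_drop = missing_share[missing_share > 0.5].index.tolist()

# Columns that exactly duplicate an earlier column
duplicates = [b for i, a in enumerate(toy.columns)
              for b in toy.columns[i + 1:] if toy[a].equals(toy[b])]

cleaned = toy.drop(columns=sorted(set(to_drop) | set(duplicates)))
print(cleaned.columns.tolist())
```

On the real data the cutoff and the decision about which of two identical columns to keep depend on what the columns mean, so the same steps would need to be reviewed by hand.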


In [None]:
#YOUR CODE HERE

In [None]:
#Show data once again
data.head()