# SEISMIC INVERSION USING MACHINE LEARNING

This notebook builds an end-to-end Long Short-Term Memory (LSTM) model with TensorFlow, alongside an XGBoost baseline.

## Problem
Geophysics in the Cloud was a competition whose goal was to invert seismic data for rock attributes with the help of well logs.

## Data
Data used in this project come from an open dataset (the Poseidon 3D survey, Australia). The seismic was acquired in 2009 by ConocoPhillips.

Data given:
- Near, Mid, Far offset seismic
- Migration Velocity
- Sonic Logs
- Gamma
- Porosity
- Resistivity Logs

## Goal
Invert for P-impedance, S-impedance, and density.

## Evaluation

* MAE (mean absolute error): $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} |y_i - x_i|$
* $R^2 = 1 - \frac{RSS}{TSS}$ \\
$R^2$ = coefficient of determination \\
$RSS$ = sum of squares of residuals \\
$TSS$ = total sum of squares
* ME: maximum residual error
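As a quick check on how the three metrics behave, here is a minimal sketch that computes them with the scikit-learn functions this notebook imports and again by hand from the definitions above. The arrays `y_true` and `y_pred` are made-up illustration values, not competition data.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, r2_score, max_error

# Hypothetical values, for illustration only (not competition data)
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.5, 2.0, 2.5, 5.0])

# Library versions
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
me = max_error(y_true, y_pred)

# The same quantities written out from the definitions
residuals = y_true - y_pred
mae_manual = np.mean(np.abs(residuals))
rss = np.sum(residuals ** 2)                 # sum of squares of residuals
tss = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
r2_manual = 1 - rss / tss
me_manual = np.max(np.abs(residuals))

print(mae, r2, me)
```

MAE and ME are in the units of the target, while $R^2$ is dimensionless, so the three are best read together rather than ranked against each other.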

## Features

1. Well logs with DTC (transit-time of compressional wave), DTS (transit-time of shear wave) and RHOB (bulk density) are used for training and evaluation. 
2. Data is from two blind wells.
3. The provided logs contain a large amount of missing data.

Inversion Info:

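Equations: \\
$Z_p = V_p \cdot \mathrm{Rhob}$ \\
$Z_s = V_s \cdot \mathrm{Rhob}$

Here $Z_p$ and $Z_s$ are the P- and S-impedance, $V_p$ and $V_s$ the compressional and shear velocities, and Rhob the bulk density.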
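The two equations translate directly into column arithmetic. The sketch below assumes a DataFrame with `vp`, `vs`, and `rhob` columns (the names used in the dataset loaded later in this notebook); the output column names `zp` and `zs` and the numeric values are illustrative assumptions.

```python
import pandas as pd

# Illustrative values only; column names follow the dataset used later on
logs = pd.DataFrame(
    {
        "vp": [4535.7, 4677.8],
        "vs": [2274.2, 2298.0],
        "rhob": [2.583, 2.591],
    }
)

# Zp = Vp * Rhob and Zs = Vs * Rhob, applied row by row
logs["zp"] = logs["vp"] * logs["rhob"]  # "zp" is a hypothetical column name
logs["zs"] = logs["vs"] * logs["rhob"]  # "zs" is a hypothetical column name

print(logs)
```

The impedances inherit their units from the velocity and density columns, so it is worth confirming those units before comparing against other results.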

### 1. Get workspace ready

In [2]:
from time import time
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import scipy.integrate as integrate
from scipy.misc import derivative  # deprecated in SciPy 1.10 and removed in 1.12
from scipy.interpolate import interp1d
from scipy.signal import hilbert, chirp

from smooth import *

# Scaler
from sklearn.preprocessing import RobustScaler

# Baseline
from sklearn.metrics import mean_absolute_error, r2_score, max_error
from xgboost import XGBRegressor, plot_importance

# Models
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dropout, Dense, LSTM
from tensorflow.keras.layers import Conv1D, MaxPooling1D, AveragePooling1D
from tensorflow.keras import layers
from tensorflow import keras
import tensorflow.keras as k

### 2. Cleaning and imputing data
Given the size of the SEG-Y files and limited computing power, I skipped the data cleaning and imputation step in this notebook.

### 3. Data Engineering and augmentation
Create new features from provided data.

In [13]:
cleaned_data = pd.read_csv('../Datasets/Seismic Inversion/input_label_04-22_3.csv').set_index('well_id')
cleaned_data.head()

| well_id | well_enc | twt | rhob | vp | vs | formation | seis_near | seis_mid | seis_far | bg_vel |
|---|---|---|---|---|---|---|---|---|---|---|
| well_01 | 0 | 2180.5 | 2.583062 | 4535.708695 | 2274.160045 | 2 | -6749.200079 | -4651.793791 | -6010.635571 | 4819.458925 |
| well_01 | 0 | 2181.0 | 2.590539 | 4677.809167 | 2297.976146 | 2 | -4500.335374 | -4519.01192 | -6720.620094 | 4824.308579 |
| well_01 | 0 | 2181.5 | 2.576754 | 4603.527613 | 2344.096821 | 2 | -2231.685365 | -4388.921156 | -7496.04429 | 4829.038659 |
| well_01 | 0 | 2182.0 | 2.575342 | 4599.998286 | 2301.914494 | 2 | -30.777578 | -4282.545106 | -8325.698097 | 4833.651971 |
| well_01 | 0 | 2182.5 | 2.569139 | 4419.725343 | 2204.455407 | 2 | 2014.860464 | -4220.907376 | -9198.371453 | 4838.151318 |


In [14]:
cleaned_data.shape

(12019, 10)

In [23]:
cleaned_data.groupby('well_id').count()

| well_id | well_enc | twt | rhob | vp | vs | formation | seis_near | seis_mid | seis_far | bg_vel |
|---|---|---|---|---|---|---|---|---|---|---|
| well_01 | 2311 | 2311 | 2311 | 2311 | 2311 | 2311 | 2311 | 2311 | 2311 | 2311 |
| well_11 | 1688 | 1688 | 1688 | 1688 | 1688 | 1688 | 1688 | 1688 | 1688 | 1688 |
| well_21 | 2549 | 2549 | 2549 | 2549 | 2549 | 2549 | 2549 | 2549 | 2549 | 2549 |
| well_25 | 2031 | 2031 | 2031 | 2031 | 2031 | 2031 | 2031 | 2031 | 2031 | 2031 |
| well_27 | 2776 | 2776 | 2776 | 2776 | 2776 | 2776 | 2776 | 2776 | 2776 | 2776 |
| well_33 | 664 | 664 | 664 | 664 | 664 | 664 | 664 | 664 | 664 | 664 |


In [29]:
cleaned_data.columns.values

array(['well_enc', 'twt', 'rhob', 'vp', 'vs', 'formation', 'seis_near',
       'seis_mid', 'seis_far', 'bg_vel'], dtype=object)
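Given the three seismic columns listed above, one possible engineered feature is the amplitude envelope of each trace, computed with the `hilbert` function imported in the workspace cell. This is only a sketch of the idea, not necessarily the feature set used in this project. The helper `add_envelope`, the `_env` suffix, and the synthetic `demo` frame are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from scipy.signal import hilbert


def add_envelope(df, cols=("seis_near", "seis_mid", "seis_far")):
    """Return a copy of df with a '<col>_env' amplitude-envelope column per trace.

    Hypothetical helper: assumes df is indexed by well_id and that the rows
    of each well are ordered by two-way time.
    """
    out = df.copy()
    well_ids = out.index.to_numpy()
    for col in cols:
        values = out[col].to_numpy()
        env = np.empty(len(out))
        # Process each well separately so samples from different wells
        # are never treated as one continuous trace
        for well in pd.unique(well_ids):
            mask = well_ids == well
            env[mask] = np.abs(hilbert(values[mask]))
        out[col + "_env"] = env
    return out


# Synthetic stand-in for cleaned_data (random numbers, two made-up wells)
rng = np.random.default_rng(0)
demo = pd.DataFrame(
    {
        "seis_near": rng.normal(size=200),
        "seis_mid": rng.normal(size=200),
        "seis_far": rng.normal(size=200),
    },
    index=pd.Index(["well_a"] * 120 + ["well_b"] * 80, name="well_id"),
)
demo_env = add_envelope(demo)
print(demo_env.head())
```

Grouping by well before the transform matters because the Hilbert transform is not local: concatenating wells would let each well's samples leak into the envelope of the others.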