In [1]:
Plot = require 'itorch.Plot'
require 'nn'

# Load the data
We first load the data from files that were converted to Torch's t7 format. The conversion step is in the <a href="data_loading.ipynb">data loading notebook</a>.

In [2]:
file = torch.DiskFile('dat/facies_vectors.t7', 'r')
facies = file:readObject()
file:close()
file = torch.DiskFile('dat/validation_data_nofacies.t7', 'r')
validate = file:readObject()
file:close()

# Clean the data
### Extract the useful feature vectors
Let's pick out the well logs and the "geologic constraining variables" by simply dropping the facies and the depth data.

In [3]:
print("facies size: ", facies:size()[1], "x", facies:size()[2])
print("validate size: ", validate:size()[1], "x", validate:size()[2])

facies size: 	4149	x	9	
validate size: 	830	x	8	


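I've decided to treat the wells as a third dimension of the data, so each well gets its own 2D tensor with depth samples along dimension 1 and logs along dimension 2. I'm storing these in a table keyed by well name rather than in a single 3D tensor, since the well lengths are variable.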

In [4]:
-- initialize
training_data = {}
testing_data = {}
depth = {}

-- build the training wells into the table
training_data["shrimplin"] = facies[{{1,471},{3,9}}]
training_data["alexander"] = facies[{{472,937},{3,9}}]
training_data["shankle"] = facies[{{938,1386},{3,9}}]
training_data["luke"] = facies[{{1387,1847},{3,9}}]
training_data["kimzey"] = facies[{{1848,2286},{3,9}}]
training_data["cross"] = facies[{{2287,2787},{3,9}}]
training_data["nolan"] = facies[{{2788,3202},{3,9}}]
training_data["recruit"] = facies[{{3203,3282},{3,9}}]
training_data["newby"] = facies[{{3283,3745},{3,9}}]
training_data["churchman"] = facies[{{3746,4149},{3,9}}]

-- build the testing wells into the table
testing_data["stuart"] = validate[{{1,474},{2,8}}]
testing_data["crawford"] = validate[{{475,830},{2,8}}]

-- build a depth log for plotting
depth["shrimplin"] = facies[{{1,471},{2}}]
depth["alexander"] = facies[{{472,937},{2}}]
depth["shankle"] = facies[{{938,1386},{2}}]
depth["luke"] = facies[{{1387,1847},{2}}]
depth["kimzey"] = facies[{{1848,2286},{2}}]
depth["cross"] = facies[{{2287,2787},{2}}]
depth["nolan"] = facies[{{2788,3202},{2}}]
depth["recruit"] = facies[{{3203,3282},{2}}]
depth["newby"] = facies[{{3283,3745},{2}}]
depth["churchman"] = facies[{{3746,4149},{2}}]
depth["stuart"] = validate[{{1,474},{1}}]
depth["crawford"] = validate[{{475,830},{1}}]

-- QC
print(training_data)
print(testing_data)
print(depth)

{
  nolan : DoubleTensor - size: 415x7
  luke : DoubleTensor - size: 461x7
  shrimplin : DoubleTensor - size: 471x7
  kimzey : DoubleTensor - size: 439x7
  cross : DoubleTensor - size: 501x7
  newby : DoubleTensor - size: 463x7
  churchman : DoubleTensor - size: 404x7
  shankle : DoubleTensor - size: 449x7
  alexander : DoubleTensor - size: 466x7
  recruit : DoubleTensor - size: 80x7
}
{
  stuart : DoubleTensor - size: 474x7
  crawford : DoubleTensor - size: 356x7
}
{
  nolan : DoubleTensor - size: 415x1
  shankle : DoubleTensor - size: 449x1
  luke : DoubleTensor - size: 461x1
  stuart : DoubleTensor - size: 474x1
  shrimplin : DoubleTensor - size: 471x1
  kimzey : DoubleTensor - size: 439x1
  cross : DoubleTensor - size: 501x1
  newby : DoubleTensor - size: 463x1
  churchman : DoubleTensor - size: 404x1
  crawford : DoubleTensor - size: 356x1
  alexander : DoubleTensor - size: 466x1
  recruit : DoubleTensor - size: 80x1
}
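The hard-coded row ranges above are easy to mistype, so here is a small plain-Lua sketch that keeps them in one place and checks that they tile the 4149 rows without gaps or overlaps. The `wells` table and the `check_ranges` helper are illustrative names, not part of the notebook.

```lua
-- Row ranges copied from the slicing cell above
wells = {
    {"shrimplin", 1, 471},
    {"alexander", 472, 937},
    {"shankle", 938, 1386},
    {"luke", 1387, 1847},
    {"kimzey", 1848, 2286},
    {"cross", 2287, 2787},
    {"nolan", 2788, 3202},
    {"recruit", 3203, 3282},
    {"newby", 3283, 3745},
    {"churchman", 3746, 4149},
}

-- Return the per-well row counts and their total, failing loudly if two
-- consecutive ranges leave a gap or overlap.
function check_ranges(ranges)
    local sizes, total, expected_first = {}, 0, 1
    for _, r in ipairs(ranges) do
        local name, first, last = r[1], r[2], r[3]
        assert(first == expected_first, "gap or overlap before " .. name)
        sizes[name] = last - first + 1
        total = total + sizes[name]
        expected_first = last + 1
    end
    return sizes, total
end

sizes, total = check_ranges(wells)
print(total, sizes.recruit)  -- 4149  80
```

The same table could drive the Torch slicing in a loop, e.g. `facies[{{first,last},{3,9}}]` for each entry, instead of one line per well.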


### Normalize the data
As per the literature and Brandon's suggestion, we now normalize each log in each well to have zero mean and unit variance.
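As a rough illustration of what this normalization does to a single log, here is a plain-Lua sketch on a made-up list of readings. The `zscore` helper is a hypothetical name, and it assumes the sample (n - 1) form of the standard deviation.

```lua
-- Standardize a list of numbers to zero mean and unit standard deviation.
-- Assumption: the standard deviation divides by (n - 1).
function zscore(xs)
    local n = #xs
    local sum = 0
    for i = 1, n do sum = sum + xs[i] end
    local mean = sum / n

    local sq = 0
    for i = 1, n do sq = sq + (xs[i] - mean) ^ 2 end
    local std = math.sqrt(sq / (n - 1))

    local out = {}
    for i = 1, n do out[i] = (xs[i] - mean) / std end
    return out, mean, std
end

-- Made-up gamma-ray readings, for illustration only
z, gr_mean, gr_std = zscore({60, 70, 80})
print(gr_mean, gr_std, z[1], z[2], z[3])
```

Here the mean is 70 and the standard deviation is 10, so the three readings map to -1, 0 and 1.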

In [5]:
mean = {}
stdv = {}

for key,value in pairs(training_data) do --over each well
    mean[key] = torch.Tensor(7)
    stdv[key] = torch.Tensor(7)
    print('Well: ', key)
    for i = 1, 7 do --over each log
        mean[key][i] = training_data[key][{{},{i}}]:mean()
        print('Log ' .. i .. ', Mean: ' .. mean[key][i])
        training_data[key][{{},{i}}]:add(-mean[key][i])
        
        stdv[key][i] = training_data[key][{{},{i}}]:std()
        print('Log ' .. i .. ', Standard Deviation: ' .. stdv[key][i])
        training_data[key][{{},{i}}]:div(stdv[key][i])
    end
    print("\n")
end

Well: 	nolan	
Log 1, Mean: 68.693939759036	
Log 1, Standard Deviation: 32.730215642043	
Log 2, Mean: 0.5924	
Log 2, Standard Deviation: 0.20888527235023	
Log 3, Mean: 3.1340698795181	
Log 3, Standard Deviation: 2.4304927280085	
Log 4, Mean: 12.197361445783	
Log 4, Standard Deviation: 4.8411695655865	
Log 5, Mean: 3.8579469879518	
Log 5, Standard Deviation: 0.87767775335189	
Log 6, Mean: 1.5277108433735	
Log 6, Standard Deviation: 0.49983409155943	
Log 7, Mean: 0.54900722891566	
Log 7, Standard Deviation: 0.28556345258955	

	
Well: 	luke	
Log 1, Mean: 64.777223427332	
Log 1, Standard Deviation: 27.427174260686	
Log 2, Mean: 0.63968980477223	
Log 2, Standard Deviation: 0.22662107159636	
Log 3, Mean: 4.2184381778742	
Log 3, Standard Deviation: 4.8162210685864	
Log 4, Mean: 12.953904555315	
Log 4, Standard Deviation: 6.3295181442742	
Log 5, Mean: 3.660704989154	
Log 5, Standard Deviation: 0.72724200247972	
Log 6, Mean: 1.4663774403471	
Log 6, Standard Deviation: 0.49941019631722	
Log 7, Mean: 0.51409544468547	
Log 7, Standard Deviation: 0.28690198220341	

	
Well: 	shrimplin	
Log 1, Mean: 69.40889596603	
Log 1, Standard Deviation: 37.299535899968	
Log 2, Mean: 0.65967940552017	
Log 2, Standard Deviation: 0.23289356801289	
Log 3, Mean: 7.7084925690021	
Log 3, Standard Deviation: 4.2794526468766	
Log 4, Mean: 12.173704883227	
Log 4, Standard Deviation: 5.3171523453616	
Log 5, Mean: 4.219957537155	
Log 5, Standard Deviation: 0.902823466

Log 6, Standard Deviation: 0.48699101424912	
Log 7, Mean: 0.52813822894168	
Log 7, Standard Deviation: 0.28509597819355	

	
Well: 	churchman	
Log 1, Mean: 63.683452970297	
Log 1, Standard Deviation: 33.333991721881	
Log 2, Mean: 0.75875742574257	
Log 2, Standard Deviation: 0.24136665388419	
Log 3, Mean: 1.56	
Log 3, Standard Deviation: 3.6546989040692	
Log 4, Mean: 14.011116336634	
Log 4, Standard Deviation: 10.579451257837	
Log 5, Mean: 3.7899257425743	
Log 5, Standard Deviation: 1.0927138953098	
Log 6, Mean: 1.7227722772277	
Log 6, Standard Deviation: 0.44818491135137	
Log 7, Mean: 0.54431683168317	
Log 7, Standard Deviation: 0.29364704867391	

	
Well: 	shankle	
Log 1, Mean: 65.431180400891	
Log 1, Standard Deviation: 25.69641831952	
Log 2, Mean: 0.63083073496659	
Log 2, Standard Deviation: 0.24129325637305	
Log 3, Mean: 2.3489977728285	
Log 3, Standard Deviation: 6.1135428647692	
Log 4, Mean: 15.741124721604	
Log 4, Standard Deviation: 9.0804665940009	
Log 5, Mean: 3.2249443207127	


Here's the same normalization, applied to the validation data.

In [6]:
mean = {}
stdv = {}

for key,value in pairs(testing_data) do --over each well
    mean[key] = torch.Tensor(7)
    stdv[key] = torch.Tensor(7)
    print('Well: ', key)
    for i = 1, 7 do --over each log
        mean[key][i] = testing_data[key][{{},{i}}]:mean()
        print('Log ' .. i .. ', Mean: ' .. mean[key][i])
        testing_data[key][{{},{i}}]:add(-mean[key][i])
        
        stdv[key][i] = testing_data[key][{{},{i}}]:std()
        print('Log ' .. i .. ', Standard Deviation: ' .. stdv[key][i])
        testing_data[key][{{},{i}}]:div(stdv[key][i])
    end
    print("\n")
end

Well: 	stuart	
Log 1, Mean: 56.819900843882	
Log 1, Standard Deviation: 28.600303331762	
Log 2, Mean: 0.71246624472574	
Log 2, Standard Deviation: 0.21809118076801	
Log 3, Mean: 3.2905063291139	
Log 3, Standard Deviation: 4.0506328371708	
Log 4, Mean: 11.373945147679	
Log 4, Standard Deviation: 4.8097374633476	
Log 5, Mean: 3.7441476793249	
Log 5, Standard Deviation: 0.58620027327333	
Log 6, Mean: 1.6308016877637	
Log 6, Standard Deviation: 0.48309759705951	
Log 7, Mean: 0.543	
Log 7, Standard Deviation: 0.27979586336928	

	
Well: 	crawford	
Log 1, Mean: 58.666019662921	
Log 1, Standard Deviation: 26.033590787314	
Log 2, Mean: 0.6048595505618	
Log 2, Standard Deviation: 0.3523933340864	
Log 3, Mean: 2.2680617977528	
Log 3, Standard Deviation: 2.2819218565829	
Log 4, Mean: 12.029859550562	
Log 4, Standard Deviation: 5.6424975847849	
Log 5, Mean: 3.5343876404494	
Log 5, Standard Deviation: 0.70918906775178	
Log 6, Mean: 1.7415730337079	
Log 6, Standard Deviation: 0.43838602546969	
Log 7,

### See what we've done

In [7]:
-- Plot the normalized Shrimplin logs against depth; each track is shifted
-- right by a multiple of 4 so the curves sit side by side
local n = training_data["shrimplin"]:size(1)
local d = depth["shrimplin"]:reshape(n)
local function track(i) -- log i as a 1D tensor
    return training_data["shrimplin"][{{},{i}}]:reshape(n)
end

plot = Plot():line(track(1), d, 'red', 'gamma'):draw()

plot:line(track(2)+4, d, 'blue', 'ILD')
plot:line(track(3)+8, d, 'green', 'dPhi')
plot:line(track(4)+12, d, 'brown', 'nPor')
plot:line(track(5)+16, d, 'grey', 'PE')
-- the two remaining tracks are disabled; offsets continue the +4 spacing
--plot:line(track(6)+20, d, 'black', 'NM_M')
--plot:line(track(7)+24, d, 'yellow', 'RelPos')

plot:legend(true)
plot:title("Shrimplin Logs")
plot:redraw()

## Facies labels
To train and test the model we'll need the correct facies labels, so we store them here, one tensor per well.

In [8]:
facies_labels = {}

facies_labels["shrimplin"] = facies[{{1,471},{1}}]
facies_labels["alexander"] = facies[{{472,937},{1}}]
facies_labels["shankle"] = facies[{{938,1386},{1}}]
facies_labels["luke"] = facies[{{1387,1847},{1}}]
facies_labels["kimzey"] = facies[{{1848,2286},{1}}]
facies_labels["cross"] = facies[{{2287,2787},{1}}]
facies_labels["nolan"] = facies[{{2788,3202},{1}}]
facies_labels["recruit"] = facies[{{3203,3282},{1}}]
facies_labels["newby"] = facies[{{3283,3745},{1}}]
facies_labels["churchman"] = facies[{{3746,4149},{1}}]

## Extract blind well
Next we separate the Newby well for blind testing.

In [9]:
-- create blind data
blind_well = {}

blind_well["newby"] = training_data["newby"][{{},{}}]

-- remove blind data from training data
training_data["newby"] = nil

## Build a net
I'll naively build a neural net here and run the data through. I have no idea what I'm doing or what type of layers I need, but here goes!

In [38]:
net = nn.Sequential()
net:add(nn.TemporalConvolution(1,3,2))
net:add(nn.ReLU())
net:add(nn.TemporalMaxPooling(2))
net:add(nn.TemporalConvolution(3,5,2))
net:add(nn.ReLU())
net:add(nn.TemporalMaxPooling(2))
net:add(nn.View(5*1))
net:add(nn.Linear(5*1,20))
net:add(nn.ReLU())
net:add(nn.Linear(20,10))
net:add(nn.ReLU())
net:add(nn.Linear(10,9))
net:add(nn.LogSoftMax())
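To see why the `nn.View(5*1)` layer fits, it helps to track how many frames survive each convolution and pooling step. The sketch below does that arithmetic in plain Lua, assuming the default steps (1 for the convolutions, the window width for the pooling layers). `conv_len` and `pool_len` are illustrative helper names.

```lua
-- Output length of a 1D convolution with kernel width kW and step dW
function conv_len(n, kW, dW)
    dW = dW or 1
    return math.floor((n - kW) / dW) + 1
end

-- Output length of 1D max pooling; the step defaults to the window width
function pool_len(n, kW, dW)
    dW = dW or kW
    return math.floor((n - kW) / dW) + 1
end

frames = 7                      -- seven logs per depth sample, one feature each
frames = conv_len(frames, 2)    -- TemporalConvolution(1,3,2): 6 frames x 3 features
frames = pool_len(frames, 2)    -- TemporalMaxPooling(2):      3 frames x 3 features
frames = conv_len(frames, 2)    -- TemporalConvolution(3,5,2): 2 frames x 5 features
frames = pool_len(frames, 2)    -- TemporalMaxPooling(2):      1 frame  x 5 features
print(frames)                   -- 1
```

One frame of five features is left, which matches the `5*1` expected by `nn.View` and the first `nn.Linear`.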

## See if the net goes
The input to a Torch net has to be a `torch.Tensor`. Less obviously, data loaded as above inherits the top-level properties of `readObject`, and `nn.TemporalConvolution` expects a 2D input (frames x features). So to feed one depth sample into the net, we first copy its seven log values into a fresh 7x1 `torch.Tensor`.

In [11]:
temp = torch.Tensor(7,1)
for i = 1,7 do
    temp[i] = training_data["shrimplin"][1][i]
end
input = temp

In [12]:
output = net:forward(input)

In [13]:
output

-2.2578
-2.2554
-2.2294
-2.2501
-2.1246
-2.1108
-2.2168
-2.1881
-2.1550
[torch.DoubleTensor of size 9]
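Since the last layer is `nn.LogSoftMax`, these nine numbers should be log-probabilities, one per facies class. A quick plain-Lua check on the printed (rounded) values: exponentiating and summing should give roughly 1, and the largest entry marks the class the untrained net currently favors. `summarize` is an illustrative helper name.

```lua
-- Values copied from the printed output above
logp = {-2.2578, -2.2554, -2.2294, -2.2501, -2.1246,
        -2.1108, -2.2168, -2.1881, -2.1550}

-- Return the sum of exp(log p) and the index of the largest entry
function summarize(lp)
    local total, best = 0, 1
    for i = 1, #lp do
        total = total + math.exp(lp[i])
        if lp[i] > lp[best] then best = i end
    end
    return total, best
end

prob_sum, best_class = summarize(logp)
print(string.format("%.3f", prob_sum), best_class)  -- 1.000  6
```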



In [14]:
net:zeroGradParameters() -- zero the internal gradient buffers of the network

In [15]:
gradInput = net:backward(input, torch.rand(9))

In [16]:
gradInput

0.01 *
  0.0695
  0.0000
 -1.0845
  0.0000
  0.0000
  0.0000
  0.0000
[torch.DoubleTensor of size 7x1]



## Define Loss Function
On the suggestion of the internet, I'm going to use negative log-likelihood as the error criterion. Apparently it works well for classification problems. (Again, no idea what I'm doing.)

In [17]:
criterion = nn.ClassNLLCriterion()

In [18]:
criterion:forward(output,facies_labels["shrimplin"][1])

2.2294142782347	1	


In [19]:
gradients = criterion:backward(output, facies_labels["shrimplin"][1])

In [20]:
gradients

 0
 0
-1
 0
 0
 0
 0
 0
 0
[torch.DoubleTensor of size 9]
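The printed gradient has its -1 in the third slot, which suggests the first Shrimplin sample is labelled facies 3, and the loss of 2.2294 is just minus the third entry of `output`. The plain-Lua sketch below reproduces both from the rounded output values; `nll` and `nll_grad` are illustrative names, not Torch functions.

```lua
-- Log-probabilities copied from the forward pass printed earlier
log_probs = {-2.2578, -2.2554, -2.2294, -2.2501, -2.1246,
             -2.1108, -2.2168, -2.1881, -2.1550}

-- Negative log-likelihood of one sample: minus the log-probability
-- assigned to the correct class
function nll(lp, target)
    return -lp[target]
end

-- Its gradient with respect to the log-probabilities: -1 at the target
-- class, 0 everywhere else
function nll_grad(lp, target)
    local g = {}
    for i = 1, #lp do g[i] = 0 end
    g[target] = -1
    return g
end

loss = nll(log_probs, 3)
grad = nll_grad(log_probs, 3)
print(loss, grad[3])
```

For scale, a uniform guess over nine classes would give a loss of `math.log(9)`, about 2.197, so a value near 2.23 is roughly what an untrained net would be expected to produce.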



In [21]:
gradInput = net:backward(input, gradients)

## Condition the data
`nn.StochasticGradient` expects its input in the form outlined in the <a href="https://github.com/torch/nn/blob/master/doc/training.md#traindataset">docs</a>. Also, as mentioned above, my data isn't in the cleverest of forms to begin with. So let's fix all this malarkey.

In [22]:
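trainset = {}

-- the data
-- NOTE: sized for every row of `facies` (4149), but Newby was removed from
-- training_data above, so the trailing rows are never written
trainset["data"] = torch.Tensor(facies:size()[1],7,1) 
--[[ the extra 1 dimension here is, as was shown above, 
    because the net requires a 2D input tensor]]--

idx = 0 -- running row offset into trainset["data"]
for key,value in pairs(training_data) do
    for i = 1,training_data[key]:size()[1] do
        trainset["data"][i + idx] = training_data[key][i]
    end
    idx = idx + training_data[key]:size()[1]
end

-- the answers
-- NOTE: facies_labels still contains Newby, and pairs() does not promise the
-- same well order as the loop above, so labels may not line up with the data
trainset["facies"] = torch.Tensor(facies:size()[1])

idx = 0 -- running row offset into trainset["facies"]
for key,value in pairs(facies_labels) do
    for i = 1, facies_labels[key]:size()[1] do
        trainset["facies"][i + idx] = facies_labels[key][i]
    end
    idx = idx + facies_labels[key]:size()[1]
end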

In [23]:
trainset

{
  data : DoubleTensor - size: 4149x7x1
  facies : DoubleTensor - size: 4149
}
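One thing to watch when packing wells this way: Lua does not guarantee the traversal order of `pairs`, and the data and label loops walk two different tables (`training_data` no longer contains Newby, while `facies_labels` still does). A possible alternative, sketched here in plain Lua on made-up toy tables, is to fill rows and labels in a single pass over one sorted key list. `pack`, `toy_data` and `toy_labels` are hypothetical names.

```lua
-- Toy stand-ins for training_data and facies_labels (values are made up).
-- "newby" has labels but no training rows, as after the blind-well split.
toy_data   = { shrimplin = {0.1, 0.2}, luke = {0.3}, cross = {0.4, 0.5, 0.6} }
toy_labels = { shrimplin = {3, 3}, luke = {2}, cross = {8, 8, 1}, newby = {5, 5} }

-- Pack rows and labels in one pass over a sorted key list, so that
-- row i and label i always come from the same well and depth.
function pack(data, labels)
    local keys = {}
    for key in pairs(data) do keys[#keys + 1] = key end
    table.sort(keys)  -- fixed, reproducible order

    local rows, answers = {}, {}
    for _, key in ipairs(keys) do
        assert(#data[key] == #labels[key], "row/label count mismatch in " .. key)
        for i = 1, #data[key] do
            rows[#rows + 1] = data[key][i]
            answers[#answers + 1] = labels[key][i]
        end
    end
    return rows, answers
end

rows, answers = pack(toy_data, toy_labels)
print(#rows, #answers)  -- 6  6
```

Because only the keys of the data table are walked, the held-out well is skipped automatically and the packed size equals the number of training rows.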


In [24]:
#trainset.data

 4149
    7
    1
[torch.LongStorage of size 3]



In [25]:
#trainset.facies

 4149
[torch.LongStorage of size 1]




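`StochasticGradient` asks two things of a dataset:

1. First, `dataset[i]` must return the i-th example as a table `{input, target}`, where the input is a `torch.Tensor`. We get this index function by giving `trainset` an `__index` metamethod.
2. Second, the data must have a `:size()` function that returns the number of examples.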

In [26]:
setmetatable(trainset, 
    {__index = function(t, i) 
                    return {t.data[i], t.facies[i]} 
                end}
);

function trainset:size() 
    return self.data:size(1) 
end

In [27]:
trainset:size()

4149	


In [28]:
trainset

{
  data : DoubleTensor - size: 4149x7x1
  size : function: 0x410eb800
  facies : DoubleTensor - size: 4149
}
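To make the indexing trick a little more concrete, here is the same pattern on a tiny made-up dataset in plain Lua, with nested tables standing in for tensors. `toyset` is a hypothetical name used only for this sketch.

```lua
-- Toy dataset (made-up numbers) with the same layout as trainset
toyset = {
    data   = { {0.1, 0.2}, {0.3, 0.4}, {0.5, 0.6} },
    facies = { 3, 2, 8 },
}

-- Numeric keys are not stored in the table itself, so toyset[i] falls
-- through to __index, which builds the {input, target} pair on demand.
setmetatable(toyset, {
    __index = function(t, i)
        return { t.data[i], t.facies[i] }
    end
})

-- Stored as a regular field, so it is found before __index is consulted
function toyset:size()
    return #self.data
end

example = toyset[2]
print(toyset:size(), example[1][1], example[2])  -- 3  0.3  2
```

Note that with this metamethod any missing key, not only a numeric one, returns a two-element table, so a typo in a field name will not raise an error on lookup.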


## Train the net

In [29]:
trainer = nn.StochasticGradient(net, criterion)
trainer.learningRate = 0.001
trainer.maxIteration = 5

In [39]:
timer = torch.Timer()
trainer:train(trainset)
print(timer:time().real)

# StochasticGradient: training	


# current error = 2.0218501810146	


# current error = 2.0218536981833	


# current error = 2.0218645171743	


# current error = 2.0218661067947	


# current error = 2.0218651808945	
# StochasticGradient: you have reached the maximum number of iterations	
# training error = 2.0218651808945	


9.1088891029358	


## Predict using the net
Again we'll have to organize the data and define an index and size function for the test set.
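Once predictions exist for a well with known labels (such as the blind Newby well), a simple way to score them is the fraction of depth samples whose predicted facies matches the true one. The sketch below is plain Lua on made-up label lists; `accuracy` is an illustrative helper, not something defined in this notebook.

```lua
-- Fraction of positions where the predicted label equals the true label
function accuracy(predicted, truth)
    assert(#predicted == #truth, "length mismatch")
    local hits = 0
    for i = 1, #truth do
        if predicted[i] == truth[i] then hits = hits + 1 end
    end
    return hits / #truth
end

-- Made-up facies labels, for illustration only
acc = accuracy({3, 3, 2, 8}, {3, 2, 2, 8})
print(acc)  -- 0.75
```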

In [43]:
predicted = net:forward(testset.data[1])

[string "predicted = net:forward(testset.data[1])..."]:1: attempt to index global 'testset' (a nil value)
stack traceback:
	[string "predicted = net:forward(testset.data[1])..."]:1: in main chunk
	[C]: in function 'xpcall'
	/home/gram/torch/install/share/lua/5.1/itorch/main.lua:210: in function </home/gram/torch/install/share/lua/5.1/itorch/main.lua:174>
	/home/gram/torch/install/share/lua/5.1/lzmq/poller.lua:75: in function 'poll'
	/home/gram/torch/install/share/lua/5.1/lzmq/impl/loop.lua:307: in function 'poll'
	/home/gram/torch/install/share/lua/5.1/lzmq/impl/loop.lua:325: in function 'sleep_ex'
	/home/gram/torch/install/share/lua/5.1/lzmq/impl/loop.lua:370: in function 'start'
	/home/gram/torch/install/share/lua/5.1/itorch/main.lua:389: in main chunk
	[C]: in function 'require'
	(command line):1: in main chunk
	[C]: at 0x00405d50: 