# Core-to-log matching

## Contents

* [Problem description](#Problem-description)
* [Dataset](#Dataset)
* [Matching pipeline](#Matching-pipeline)
* [Conclusion](#Conclusion)

## Problem description

Perform core-to-log matching by shifting core samples in depth so as to maximize the correlation between well logs and core logs.
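As a rough illustration of the idea for a single interval, the sketch below grid-searches a depth shift that maximizes the Pearson correlation between a well log and a core log. It is not the algorithm implemented in `well_logs`: the function name `best_shift`, its parameters, the 0.1 m step, and the synthetic data are all assumptions made for this example, and the interaction between several intervals is ignored.

```python
import numpy as np


def best_shift(well_depth, well_log, core_depth, core_log, max_shift=5.0, step=0.1):
    """Grid-search the depth shift that maximizes the Pearson correlation.

    Hypothetical helper for illustration only; not part of well_logs.
    """
    well_depth = np.asarray(well_depth, dtype=float)
    well_log = np.asarray(well_log, dtype=float)
    core_depth = np.asarray(core_depth, dtype=float)
    core_log = np.asarray(core_log, dtype=float)

    best_delta, best_corr = 0.0, -np.inf
    for delta in np.arange(-max_shift, max_shift + step / 2, step):
        # Well log values at the shifted core depths (linear interpolation).
        sampled = np.interp(core_depth + delta, well_depth, well_log)
        # Correlation is undefined for a constant signal, so skip such shifts.
        if np.std(sampled) == 0 or np.std(core_log) == 0:
            continue
        corr = np.corrcoef(sampled, core_log)[0, 1]
        if corr > best_corr:
            best_delta, best_corr = float(delta), float(corr)
    return best_delta, best_corr


# Synthetic example: the core log is the well log observed 1.5 m shallower.
rng = np.random.default_rng(0)
well_depth = np.arange(1000.0, 1020.0, 0.1)
well_log = np.cumsum(rng.normal(size=well_depth.size))
core_depth = np.arange(1005.0, 1010.0, 0.1)
core_log = np.interp(core_depth + 1.5, well_depth, well_log)

delta, corr = best_shift(well_depth, well_log, core_depth, core_log)
print(f"best shift: {delta:.1f} m, correlation: {corr:.3f}")
```

An exhaustive grid search is used here only because it is easy to read; it says nothing about how the library searches for shifts.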

Shifted core samples must satisfy the following constraints:
* Boring interval constraints:
    * boring intervals must be shifted by no more than 5 meters
    * boring intervals must not overlap
    * the order of boring intervals must remain unchanged
    * if several boring intervals are extracted one after another, they must be shifted by the same delta
* Lithology interval constraints (if defined for a well):
    * lithology intervals can be moved only inside the corresponding boring interval
    * lithology intervals must not overlap
    * the order of lithology intervals must remain unchanged
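The boring interval constraints above can be expressed as a simple validity check on a candidate set of shifts. The following is a minimal sketch under stated assumptions: `check_boring_shifts` is a hypothetical helper, intervals are `(depth_from, depth_to)` pairs sorted by depth, and "extracted one after another" is interpreted as one interval ending exactly where the next begins. The library may define these notions differently.

```python
def check_boring_shifts(intervals, deltas, max_shift=5.0, tol=1e-9):
    """Return True if the shifts satisfy the boring interval constraints.

    Hypothetical helper for illustration only; not part of well_logs.
    intervals: list of (depth_from, depth_to) pairs sorted by depth.
    deltas: one shift in meters per interval.
    """
    if len(intervals) != len(deltas):
        raise ValueError("intervals and deltas must have the same length")

    # Each interval may be shifted by no more than max_shift meters.
    if any(abs(delta) > max_shift + tol for delta in deltas):
        return False

    shifted = [(top + d, bottom + d) for (top, bottom), d in zip(intervals, deltas)]
    for i in range(len(intervals) - 1):
        bottom = intervals[i][1]
        next_top = intervals[i + 1][0]
        # Assumption: contiguous intervals count as "extracted one after
        # another" and therefore must share the same delta.
        if abs(next_top - bottom) <= tol and abs(deltas[i] - deltas[i + 1]) > tol:
            return False
        # A shifted interval must end before the next one starts, which
        # enforces both the no-overlap and the unchanged-order constraints.
        if shifted[i][1] > shifted[i + 1][0] + tol:
            return False
    return True


intervals = [(1000.0, 1005.0), (1005.0, 1010.0), (1020.0, 1025.0)]
print(check_boring_shifts(intervals, [1.0, 1.0, -2.0]))
print(check_boring_shifts(intervals, [1.0, 2.0, 0.0]))
```

The lithology interval constraints could be checked the same way, with the additional requirement that every shifted lithology interval stays within its boring interval.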

## Dataset

The algorithm was tested on a dataset of 147 wells:

In [1]:
import os
import sys

sys.path.insert(0, os.path.join("..", "..", ".."))
from well_logs import WellDataset
from well_logs.batchflow import Pipeline

In [2]:
DATASET_PATH = "/Raw_dataset/*"
well_ds = WellDataset(path=DATASET_PATH, dirs=True, sort=True)

## Matching pipeline

Matched wells are saved in the `MATCHED_DATASET_PATH` directory:

In [3]:
MATCHED_DATASET_PATH = "/Matched_dataset/"

Matching modes are listed below in order of precedence, from highest to lowest. Each mode is specified as `<well_log> ~ <core_attr>.<core_log>`, where:
* `well_log` - mnemonic of a well log to use
* `core_attr` - an attribute of a well to get core data from
* `core_log` - mnemonic of a core log or property to use

In [4]:
matching_modes = [
    "GK ~ core_logs.GK",
    "DENSITY ~ core_logs.DENSITY",
    "DENSITY ~ core_properties.DENSITY",
    "DENSITY ~ core_properties.POROSITY",
    "DT ~ core_properties.POROSITY",
    "NKTD ~ core_properties.POROSITY",
]
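To make the mode format concrete, the sketch below splits a mode string into its three components. `MatchingMode` and `parse_mode` are hypothetical names introduced for this example; how `well_logs` parses modes internally is not shown in this notebook.

```python
from typing import NamedTuple


class MatchingMode(NamedTuple):
    """Components of a "<well_log> ~ <core_attr>.<core_log>" mode string."""
    well_log: str
    core_attr: str
    core_log: str


def parse_mode(mode):
    """Split a mode string into well log, core attribute and core log.

    Hypothetical helper for illustration only; not part of well_logs.
    """
    well_log, tilde, core_part = mode.partition("~")
    core_attr, dot, core_log = core_part.strip().partition(".")
    parts = (well_log.strip(), core_attr.strip(), core_log.strip())
    # Both separators and all three components must be present.
    if not tilde or not dot or not all(parts):
        raise ValueError(f"Invalid matching mode: {mode!r}")
    return MatchingMode(*parts)


parsed = parse_mode("DENSITY ~ core_properties.POROSITY")
print(parsed.well_log, parsed.core_attr, parsed.core_log)
```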

The matching pipeline consists of three actions:
* check that well data is consistent
* perform core-to-log matching with the given modes and save matching reports for each well
* dump matched wells in the `MATCHED_DATASET_PATH` directory

In [5]:
matching_pipeline = (Pipeline()
    .check_regularity()
    .match_core_logs(mode=matching_modes, save_report=True)
    .dump(MATCHED_DATASET_PATH)
    .run(batch_size=1, n_epochs=1, shuffle=False, drop_last=False, bar=True, lazy=True)
)

In [6]:
(well_ds >> matching_pipeline).run()

 34%|███▍      | 50/147 [29:47<43:04, 26.65s/it]   


<well_logs.batchflow.batchflow.pipeline.Pipeline at 0x7fd15346fb10>

## Conclusion

Only 50 of the 147 wells were matched by the algorithm; all other wells have inconsistent data.