Auto-populate a blank IC database (SQL Server) with live well data from the NPD FactPages.

<b>Part 1. Download the following data types, reformat for IC and export to .csv</b>

Exploration well headers<br>
Development well headers<br>
Core intervals<br>
Core photos<br>
Thin sections<br>
CO2<br>
Oil samples<br>
Lithostratigraphy<br>
Drill stem tests<br>
Casing and leak-off tests<br>
Drilling mud<br>
References and Documents<br>

In [1]:
import numpy as np
import pandas as pd
from pandas import ExcelFile
from pandas import ExcelWriter
import pprint
import copy
import os

# %pprint
# pp = pprint.PrettyPrinter(indent=10)

In [2]:
pd.__version__

'1.0.1'

In [3]:
# Change Pandas display settings to show all columns
pd.set_option('display.max_columns', None)  
pd.set_option('display.expand_frame_repr', False)
#pd.set_option('max_colwidth', None)
pd.set_option('display.max_rows', 500)

In [4]:
# IC database folder
dbdir = 'C:\ICData\Test3'

# Data folder
outdir = '{}\output_data'.format(dbdir)

# Create data folder within IC database folder
if not os.path.exists(outdir):
    os.mkdir(outdir)

In [5]:
# Create function to save dataframe to oudir and return header

def output_to_csv(outname, df):

    filepath = '{}\{}.csv'.format(outdir, outname)
    print('Saved to:', filepath)
    
    df.to_csv(filepath, index=False, encoding='utf-8-sig')

    return pd.read_csv(filepath).head(3)

# Example: output_to_csv(outname='IC_wellbore_exploration_all', df=df_explo)

In [6]:
# Uncomment your chosen data source -
    # web: select to download data live from NPD FactPages using parameterized query strings (see https://factpages.npd.no/factpages/Parameters.aspx)
    # file: select if you have manually downloaded data in Excel format and saved to 'input data' folder 

#data_source = 'web'
data_source = 'file'

# Download data, reformat for IC and output to .csv
    
## Well Headers

In [7]:
# Download the latest NPD well headers in Excel format
# Navigate to NPD Factpages > Wellbore > Table View > Exploration/Development > All - Long List> Export Excel.
# Assign to two dataframes, one for Exploraion wells and one for Development wells

if data_source == 'web':
    df_explo = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_exploration_all&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.169&CultureCode=en')
    df_dev = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_development_all&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.169&CultureCode=en')

if data_source == 'file':
    # Navigate to NPD Factpages > Wellbore > Table View > Exploration/Development > All - Long List> Export Excel.
    df_explo = pd.read_excel('input data/wellbore_exploration_all.xlsx')
    df_dev = pd.read_excel('input data/wellbore_development_all.xlsx')

# Print the original column titles in each dataframe.
print("\nExploration well header column titles:")
print(list(df_explo.columns))
print("\nDevelopment well header column titles:")
print(list(df_dev.columns))


Exploration well header column titles:
['Wellbore name', 'Well name', 'Drilling operator', 'Drilled in production licence', 'Purpose', 'Status', 'Content', 'Type', 'Subsea', 'Entered date', 'Completed date', 'Field', 'Drill permit', 'Discovery', 'Discovery wellbore', 'Bottom hole temperature [°C]', 'Sitesurvey', 'Seismic location', 'Maximum inclination [°]', 'Kelly bushing elevation [m]', 'Final vertical depth (TVD) [m RKB]', 'Total depth (MD) [m RKB]', 'Water depth [m]', 'Kick off  point [m RKB]', 'Oldest penetrated age', 'Oldest penetrated formation', 'Main area', 'Drilling facility', 'Drilling facility type', 'Drilling facility category', 'Licensing activity awarded in', 'Multilateral', 'Purpose - planned', 'Entry year', 'Completed year', 'Reclassified from/to wellbore', 'Reentry activity', 'Plot symbol', '1st level with HC, formation', '1st level with HC, age', '2nd level with HC, formation', '2nd level with HC, age', '3rd level with HC, formation', '3rd level with HC, age', 'Dril

In [8]:
(num_explo_rows, num_explo_cols) = df_explo.shape
(num_dev_rows, num_dev_cols) = df_dev.shape

print('{} rows and {} columns in Exploration wells.'.format(num_explo_rows, num_explo_cols))
print('{} rows and {} columns in Development wells.'.format(num_dev_rows, num_dev_cols))

1930 rows and 87 columns in Exploration wells.
5152 rows and 75 columns in Development wells.


In [9]:
print('Exploration well headers:')
df_explo.head(3)

Exploration well headers:


Unnamed: 0,Wellbore name,Well name,Drilling operator,Drilled in production licence,Purpose,Status,Content,Type,Subsea,Entered date,Completed date,Field,Drill permit,Discovery,Discovery wellbore,Bottom hole temperature [°C],Sitesurvey,Seismic location,Maximum inclination [°],Kelly bushing elevation [m],Final vertical depth (TVD) [m RKB],Total depth (MD) [m RKB],Water depth [m],Kick off point [m RKB],Oldest penetrated age,Oldest penetrated formation,Main area,Drilling facility,Drilling facility type,Drilling facility category,Licensing activity awarded in,Multilateral,Purpose - planned,Entry year,Completed year,Reclassified from/to wellbore,Reentry activity,Plot symbol,"1st level with HC, formation","1st level with HC, age","2nd level with HC, formation","2nd level with HC, age","3rd level with HC, formation","3rd level with HC, age",Drilling days,Reentry,Prod. licence for drilling target,Plugged and abondon date,Plugged date,Geodetic datum,NS degrees,NS minutes,NS seconds,NS code,EW degrees,EW minutes,EW seconds,EW code,NS decimal degrees,EW decimal degrees,NS UTM [m],EW UTM [m],UTM zone,"Wellbore name, part 1","Wellbore name, part 2","Wellbore name, part 3","Wellbore name, part 4","Wellbore name, part 5","Wellbore name, part 6",Pressrelease url,FactPage url,Factmaps,DISKOS Well Type,DISKOS Wellbore Parent,Publication date,Release date,Reclassified date,NPDID wellbore,NPDID discovery,NPDID field,NPDID drilling facility,NPDID wellbore reclassified from,NPDID production licence drilled in,NPDID site survey,Date main level updated,Date all updated,Date sync NPD
0,1/2-1,1/2-1,Phillips Petroleum Norsk AS,143,WILDCAT,P&A,OIL,EXPLORATION,NO,1989-03-20,1989-06-04,BLANE,604-L,1/2-1 Blane,YES,147.0,,PW 8303A - 10 SP. 290,2.0,22.0,,3574.0,72.0,,CAMPANIAN,TOR FM,NORTH SEA,ROSS ISLE,SEMISUB STEEL,MOVEABLE,12,NO,WILDCAT,1989,1989,,,5,FORTIES FM,PALEOCENE,,,,,77,NO,,NaT,NaT,ED50,56,53,15.07,N,2,28,35.7,E,56.887519,2.476583,6305128.26,468106.29,31,1,2,,1,,,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,2007-12-19,1991-06-04,NaT,1382,43814.0,3437650.0,296245.0,0,21956.0,,2019-10-03,2019-10-03,09.02.2020
1,1/2-2,1/2-2,Paladin Resources Norge AS,143 CS,WILDCAT,P&A,OIL SHOWS,EXPLORATION,NO,2005-12-14,2006-02-02,,1103-L,,NO,138.0,,inline 7429-trace 4824 Survey PGS CGMNOR,4.9,40.0,3432.0,3434.0,74.0,,PALEOCENE,EKOFISK FM,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,12,NO,WILDCAT,2005,2006,,,12,,,,,,,51,NO,,NaT,NaT,ED50,56,59,32.0,N,2,29,47.66,E,56.992222,2.496572,6316774.33,469410.1,31,1,2,,2,,,https://www.npd.no/fakta/nyheter/Resultat-av-l...,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,2008-08-15,2008-02-02,NaT,5192,,,278245.0,0,2424919.0,,2019-10-03,2019-10-03,09.02.2020
2,1/3-1,1/3-1,A/S Norske Shell,011,WILDCAT,P&A,GAS,EXPLORATION,NO,1968-07-06,1968-11-11,,15-L,1/3-1,YES,182.0,,LINE 5651 SP. E165,18.0,26.0,,4877.0,71.0,,LATE PERMIAN,ZECHSTEIN GP,NORTH SEA,ORION,JACK-UP 3 LEGS,MOVEABLE,1-A,NO,WILDCAT,1968,1968,,,9,TOR FM,LATE CRETACEOUS,CROMER KNOLL GP,EARLY CRETACEOUS,,,129,NO,,NaT,NaT,ED50,56,51,21.0,N,2,51,5.0,E,56.855833,2.851389,6301488.86,490936.87,31,1,3,,1,,,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,2010-04-30,1970-11-11,NaT,154,43820.0,,288604.0,0,20844.0,,2019-10-03,2019-10-03,09.02.2020


In [10]:
print('Development well headers:')
df_dev.head(3)

Development well headers:


Unnamed: 0,Wellbore name,Well name,Drilling operator,Drilled in production licence,Status,Purpose,Purpose - planned,Content,Type,Subsea,Entered date,Completed date,Predrilled entry date,Predrilled completion date,Field,Drill permit,Discovery,Discovery wellbore,Kelly bushing elevation [m],Final vertical depth (TVD) [m RKB],Total depth (MD) [m RKB],Water depth [m],Kick off point [m RKB],Main area,Drilling facility,Drilling facility type,Drilling facility category,Production facility,Licensing activity awarded in,Multilateral,Content - planned,Entry year,Completed year,Reclassified from/to wellbore,Plugged and abondon date,Plugged date,Prod. licence for drilling target,Plot symbol,Geodetic datum,NS degrees,NS minutes,NS seconds,NS code,EW degrees,EW minutes,EW seconds,EW code,NS decimal degrees,EW decimal degrees,NS UTM [m],EW UTM [m],UTM zone,"Wellbore name, part 1","Wellbore name, part 2","Wellbore name, part 3","Wellbore name, part 4","Wellbore name, part 5","Wellbore name, part 6",FactPage url,Factmaps,DISKOS Well Type,DISKOS Wellbore Parent,NPDID wellbore,NPDID discovery,NPDID field,Publication date,Release date,NPDID production licence drilled in,NPDID production licence target,NPDID drilling facility,NPDID production facility,NPDID wellbore reclassified from,Date main level updated,Date all updated,Date sync NPD
0,1/3-A-1 H,1/3-A-1,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2011-07-22,2011-09-21,NaT,NaT,OSELVAR,3365-P,1/3-6 Oselvar,NO,45.0,3163.0,5927.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2011,2011,,NaT,NaT,,50,ED50,56,55,55.06,N,2,40,16.66,E,56.931961,2.671294,6310001.5,479994.47,31,1,3,A,1,,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,6612,43832.0,5506919.0,NaT,2013-09-21,2060266,,278245.0,410592.0,0,2019-12-09,2015-10-06,09.02.2020
1,1/3-A-2 H,1/3-A-2,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2011-11-18,2012-01-19,2011-06-19,2011-07-04,OSELVAR,3366-P,1/3-6 Oselvar,NO,45.0,3170.0,5882.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2011,2012,,NaT,NaT,,50,ED50,56,55,54.89,N,2,40,16.67,E,56.931914,2.671297,6309996.24,479994.61,31,1,3,A,2,,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,6613,43832.0,5506919.0,NaT,2014-01-19,2060266,,278245.0,410592.0,0,2019-12-09,2015-10-06,09.02.2020
2,1/3-A-3 H,1/3-A-3,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2012-03-04,2012-05-14,2011-07-05,2011-07-21,OSELVAR,3367-P,1/3-6 Oselvar,NO,45.0,3171.0,6665.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2012,2012,,NaT,NaT,,50,ED50,56,55,55.07,N,2,40,17.32,E,56.931964,2.671478,6310001.76,480005.63,31,1,3,A,3,,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,6614,43832.0,5506919.0,NaT,2014-05-14,2060266,,278245.0,410592.0,0,2019-12-09,2015-10-06,09.02.2020


### Column headers unique to Explo & Dev wells

In [11]:
explo_columns = df_explo.columns.tolist()
dev_columns = df_dev.columns.tolist()

# List well headers unqiue to each dataframe
print('Attributes unique to Exploration wells:\n', sorted(set(explo_columns) - set(dev_columns)))
print('\nAttributes unique to Development wells:\n', sorted(set(dev_columns) - set(explo_columns)))

Attributes unique to Exploration wells:
 ['1st level with HC, age', '1st level with HC, formation', '2nd level with HC, age', '2nd level with HC, formation', '3rd level with HC, age', '3rd level with HC, formation', 'Bottom hole temperature [°C]', 'Drilling days', 'Maximum inclination [°]', 'NPDID site survey', 'Oldest penetrated age', 'Oldest penetrated formation', 'Pressrelease url', 'Reclassified date', 'Reentry', 'Reentry activity', 'Seismic location', 'Sitesurvey']

Attributes unique to Development wells:
 ['Content - planned', 'NPDID production facility', 'NPDID production licence target', 'Predrilled completion date', 'Predrilled entry date', 'Production facility']


### Rename attributes for IC

In [12]:
# These are IC's default well header attributes (when matching columns in Import Well Header File)
# Try to use as many of these as possible when renaming below.
# Any other columns will need to be added to IC as Well Attributes.

ic_default_attributes = {'Name', 'Code', 'Alternate 1', 'Alternate 2', 'API number', 'UWI number', 'Comment', 'Geodatum', 
                         'Longitude', 'Latitude', 'Grid system', 'Surface X', 'Surface Y', 'Elevation Reference',
                         'Elevation', 'KBE', 'RTE', 'DFE', 'GLE', 'SPUD date', 'Completion date', 'Status', 
                         'Quadrant', 'Block', 'Sub block', 'Field', 'Location', 'Operator', 'Country',
                         'Basin', 'Province', 'County', 'State', 'Section', 'Township', 'Range', 'Terminal depth',
                         'Water depth', 'Facility', 'Discovery name', 'Seismic line', 'Intent', 'Licence number'}

# Rename columns from/to. 
# Check spelling and capitalisation carefully when renaming to match IC's default attributes.

attributes_to_rename = {'Wellbore name' : 'Name',
                        'Well name' : 'Alternate 1',
                        'Drilling operator' : 'Operator',
                        'Drilled in production licence' : 'Licence number',
                        'Purpose' : 'Intent',
                        'Purpose - planned' : 'Intent - planned',
                        'Status' : 'Well status',
                        'Content' : 'Well content',
                        'Entered date' : 'SPUD date',
                        'Completed date' : 'Completion date',
                        'Discovery' : 'Discovery name',
                        'Seismic location' : 'Seismic line',
                        'Kelly bushing elevation [m]' : 'KBE',
                        'Total depth (MD) [m RKB]' : 'Terminal depth',
                        'Water depth [m]' : 'Water depth',
                        'Kick off  point [m RKB]' : 'Kick off point [m RKB]', #remove extra space
                        'Main area' : 'Location',
                        'Drilling facility' : 'Facility',
                        '1st level with HC, formation' : '1st level with HC formation', #remove commas to be csv friendly
                        '1st level with HC, age' : '1st level with HC age',
                        '2nd level with HC, formation' : '2nd level with HC formation',
                        '2nd level with HC, age' : '2nd level with HC age',
                        '3rd level with HC, formation' : '3rd level with HC formation',
                        '3rd level with HC, age' : '3rd level with HC age',
                        'Geodetic datum' : 'Geodatum',
                        'NS decimal degrees' : 'Latitude',
                        'EW decimal degrees' : 'Longitude',
                        'NS UTM [m]' : 'Surface Y',
                        'EW UTM [m]' : 'Surface X',
                        'Wellbore name, part 1' : 'Quadrant',
                        'Wellbore name, part 2' : 'Block', 
                        'Pressrelease url' : 'Press Release URL',
                        'FactPage url' : 'FactPage URL',
                        'Factmaps' : 'FactMaps URL'}

# Apply renaming to each of the dataframes
df_explo.rename(columns=attributes_to_rename, inplace=True)
df_dev.rename(columns=attributes_to_rename, inplace=True)

# QC only renamed columns
print("Renamed attributes only:")
renamed_columns = list(attributes_to_rename.values())
df_explo[renamed_columns].head(3)
#df_dev[renamed_columns].head(3)

Renamed attributes only:


Unnamed: 0,Name,Alternate 1,Operator,Licence number,Intent,Intent - planned,Well status,Well content,SPUD date,Completion date,Discovery name,Seismic line,KBE,Terminal depth,Water depth,Kick off point [m RKB],Location,Facility,1st level with HC formation,1st level with HC age,2nd level with HC formation,2nd level with HC age,3rd level with HC formation,3rd level with HC age,Geodatum,Latitude,Longitude,Surface Y,Surface X,Quadrant,Block,Press Release URL,FactPage URL,FactMaps URL
0,1/2-1,1/2-1,Phillips Petroleum Norsk AS,143,WILDCAT,WILDCAT,P&A,OIL,1989-03-20,1989-06-04,1/2-1 Blane,PW 8303A - 10 SP. 290,22.0,3574.0,72.0,,NORTH SEA,ROSS ISLE,FORTIES FM,PALEOCENE,,,,,ED50,56.887519,2.476583,6305128.26,468106.29,1,2,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...
1,1/2-2,1/2-2,Paladin Resources Norge AS,143 CS,WILDCAT,WILDCAT,P&A,OIL SHOWS,2005-12-14,2006-02-02,,inline 7429-trace 4824 Survey PGS CGMNOR,40.0,3434.0,74.0,,NORTH SEA,MÆRSK GIANT,,,,,,,ED50,56.992222,2.496572,6316774.33,469410.1,1,2,https://www.npd.no/fakta/nyheter/Resultat-av-l...,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...
2,1/3-1,1/3-1,A/S Norske Shell,011,WILDCAT,WILDCAT,P&A,GAS,1968-07-06,1968-11-11,1/3-1,LINE 5651 SP. E165,26.0,4877.0,71.0,,NORTH SEA,ORION,TOR FM,LATE CRETACEOUS,CROMER KNOLL GP,EARLY CRETACEOUS,,,ED50,56.855833,2.851389,6301488.86,490936.87,1,3,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...


### Delete attributes containing duplicate data

In [13]:
# Coordinates are repeated elsewhere so we can delete the component parts from the dataframes.
# And we've renamed Wellbore name parts 1 and 2 to Quadrant and Block, and do not need the other parts.

attributes_to_drop = ['Plot symbol', 'NS degrees', 'NS minutes', 'NS seconds', 'NS code', 'EW degrees', 'EW minutes', 'EW seconds', 'EW code', 
                      'Wellbore name, part 3', 'Wellbore name, part 4', 'Wellbore name, part 5', 'Wellbore name, part 6']

df_explo.drop(attributes_to_drop, axis=1, inplace=True)
df_dev.drop(attributes_to_drop, axis=1, inplace=True)

print('Prove we still have well names and coordinates:')
df_explo[['Name', 'Latitude', 'Longitude', 'Surface Y', 'Surface X']].head(3)

Prove we still have well names and coordinates:


Unnamed: 0,Name,Latitude,Longitude,Surface Y,Surface X
0,1/2-1,56.887519,2.476583,6305128.26,468106.29
1,1/2-2,56.992222,2.496572,6316774.33,469410.1
2,1/3-1,56.855833,2.851389,6301488.86,490936.87


### Truncate well list based on column and value(s)

In [14]:
# Enter the column and values you want to return, e.g. Location: BARENTS SEA, or Quadrant: 6204, 6205.
fltr_column = 'Location'

# List the names you want to *KEEP*!
fltr_value = ['NORTH SEA', 'NORWEGIAN SEA', 'BARENTS SEA']

# Apply the filter to the dataframes
indexNames = df_explo[~df_explo[fltr_column].isin(fltr_value)].index
df_explo.drop(indexNames , inplace=True)

indexNames = df_dev[~df_dev[fltr_column].isin(fltr_value)].index
df_dev.drop(indexNames , inplace=True)

# Get dataframe shape and unpack tuples
(exploRows, exploCols) = df_explo.shape
(devRows, devCols) = df_dev.shape

# Print out the results
print("After filtering on {}: {}, you are left with:\n {} rows for Exploration wells, and {} rows for Development wells."
      .format(fltr_column, fltr_value, exploRows, devRows))
print('The first and last rows are:')

# Print the first and last rows of the Exploration dataframe to check that the filter has worked
df_explo.iloc[[0, -1]]

After filtering on Location: ['NORTH SEA', 'NORWEGIAN SEA', 'BARENTS SEA'], you are left with:
 1930 rows for Exploration wells, and 5152 rows for Development wells.
The first and last rows are:


Unnamed: 0,Name,Alternate 1,Operator,Licence number,Intent,Well status,Well content,Type,Subsea,SPUD date,Completion date,Field,Drill permit,Discovery name,Discovery wellbore,Bottom hole temperature [°C],Sitesurvey,Seismic line,Maximum inclination [°],KBE,Final vertical depth (TVD) [m RKB],Terminal depth,Water depth,Kick off point [m RKB],Oldest penetrated age,Oldest penetrated formation,Location,Facility,Drilling facility type,Drilling facility category,Licensing activity awarded in,Multilateral,Intent - planned,Entry year,Completed year,Reclassified from/to wellbore,Reentry activity,1st level with HC formation,1st level with HC age,2nd level with HC formation,2nd level with HC age,3rd level with HC formation,3rd level with HC age,Drilling days,Reentry,Prod. licence for drilling target,Plugged and abondon date,Plugged date,Geodatum,Latitude,Longitude,Surface Y,Surface X,UTM zone,Quadrant,Block,Press Release URL,FactPage URL,FactMaps URL,DISKOS Well Type,DISKOS Wellbore Parent,Publication date,Release date,Reclassified date,NPDID wellbore,NPDID discovery,NPDID field,NPDID drilling facility,NPDID wellbore reclassified from,NPDID production licence drilled in,NPDID site survey,Date main level updated,Date all updated,Date sync NPD
0,1/2-1,1/2-1,Phillips Petroleum Norsk AS,143,WILDCAT,P&A,OIL,EXPLORATION,NO,1989-03-20,1989-06-04,BLANE,604-L,1/2-1 Blane,YES,147.0,,PW 8303A - 10 SP. 290,2.0,22.0,,3574.0,72.0,,CAMPANIAN,TOR FM,NORTH SEA,ROSS ISLE,SEMISUB STEEL,MOVEABLE,12,NO,WILDCAT,1989,1989,,,FORTIES FM,PALEOCENE,,,,,77,NO,,NaT,NaT,ED50,56.887519,2.476583,6305128.26,468106.29,31,1,2,,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,2007-12-19,1991-06-04,NaT,1382,43814.0,3437650.0,296245.0,0,21956.0,,2019-10-03,2019-10-03,09.02.2020
1929,7435/12-1,7435/12-1,Statoil Petroleum AS,859,WILDCAT,P&A,GAS,EXPLORATION,NO,2017-08-09,2017-09-01,,1667-L,7435/12-1 (Korpfjell),YES,54.0,,ST14005T15: Inline 7833. X-Line 7829,3.1,32.0,1539.0,1540.0,253.0,,MIDDLE TRIASSIC,KOBBE FM,BARENTS SEA,SONGA ENABLER,SEMISUB STEEL,MOVEABLE,23,NO,WILDCAT,2017,2017,,,STØ FM,MIDDLE JURASSIC,KOBBE FM,MIDDLE TRIASSIC,,,24,NO,,NaT,NaT,ED50,74.071725,35.808628,8222886.96,402277.88,37,7435,12,https://www.npd.no/fakta/nyheter/Resultat-av-l...,https://factpages.npd.no/factpages/default.asp...,https://factmaps.npd.no/factmaps/3_0/?run=Well...,initial,,2019-09-01,2019-09-01,NaT,8228,29491696.0,,439972.0,0,28169055.0,,2019-12-07,2019-12-07,09.02.2020


### CREATE FILES - create Reference files for IC containing URLs for Explo and Dev wells

In [15]:
# Converts three URL columns into three rows. Adds a Title column and sorts by Well and Title.
df_explo_references = df_explo[['Name', 'NPDID wellbore', 'Press Release URL', 'FactPage URL', 'FactMaps URL']]
df_explo_references = pd.melt(df_explo_references, id_vars=['Name', 'NPDID wellbore'], value_vars=['Press Release URL', 'FactPage URL', 'FactMaps URL'], var_name='Title', value_name='URL')
df_explo_references.sort_values(['Name', 'Title'], inplace=True)

# Remove empty rows, specifically where no 'Press Release URL' for Exploration references
df_explo_references['URL'].replace(' ', np.nan, inplace=True)
df_explo_references.dropna(subset=['URL'], inplace=True)

# Name and create file for Exploration wells
output_to_csv(outname='IC_explo_references', df=df_explo_references)

Saved to: C:\ICData\Test3\output_data\IC_explo_references.csv


Unnamed: 0,Name,NPDID wellbore,Title,URL
0,1/2-1,1382,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...
1,1/2-1,1382,FactPage URL,https://factpages.npd.no/factpages/default.asp...
2,1/2-2,5192,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...


In [16]:
# As above, but creates 'Reference' file for Development Wells (minus the Press Release URL)
df_dev_references = df_dev[['Name', 'NPDID wellbore', 'FactPage URL', 'FactMaps URL']]
df_dev_references = pd.melt(df_dev_references, id_vars=['Name', 'NPDID wellbore'], 
                            value_vars=['FactPage URL', 'FactMaps URL'], 
                            var_name='Title', value_name='URL')
df_dev_references.sort_values(['Name', 'Title'], inplace=True)

# Name and create file for Development wells
output_to_csv(outname='IC_dev_references', df=df_dev_references)

Saved to: C:\ICData\Test3\output_data\IC_dev_references.csv


Unnamed: 0,Name,NPDID wellbore,Title,URL
0,1/3-A-1 H,6612,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...
1,1/3-A-1 H,6612,FactPage URL,https://factpages.npd.no/factpages/default.asp...
2,1/3-A-2 H,6613,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...


In [17]:
# Drop URL attributes
# Now that we've output the URLs to separate files, we no longer need them in the Exploration and Development dataframes.
df_explo.drop(['Press Release URL', 'FactPage URL', 'FactMaps URL'], axis=1, inplace=True)
df_dev.drop(['FactPage URL', 'FactMaps URL'], axis=1, inplace=True)

df_explo.head(3)

Unnamed: 0,Name,Alternate 1,Operator,Licence number,Intent,Well status,Well content,Type,Subsea,SPUD date,Completion date,Field,Drill permit,Discovery name,Discovery wellbore,Bottom hole temperature [°C],Sitesurvey,Seismic line,Maximum inclination [°],KBE,Final vertical depth (TVD) [m RKB],Terminal depth,Water depth,Kick off point [m RKB],Oldest penetrated age,Oldest penetrated formation,Location,Facility,Drilling facility type,Drilling facility category,Licensing activity awarded in,Multilateral,Intent - planned,Entry year,Completed year,Reclassified from/to wellbore,Reentry activity,1st level with HC formation,1st level with HC age,2nd level with HC formation,2nd level with HC age,3rd level with HC formation,3rd level with HC age,Drilling days,Reentry,Prod. licence for drilling target,Plugged and abondon date,Plugged date,Geodatum,Latitude,Longitude,Surface Y,Surface X,UTM zone,Quadrant,Block,DISKOS Well Type,DISKOS Wellbore Parent,Publication date,Release date,Reclassified date,NPDID wellbore,NPDID discovery,NPDID field,NPDID drilling facility,NPDID wellbore reclassified from,NPDID production licence drilled in,NPDID site survey,Date main level updated,Date all updated,Date sync NPD
0,1/2-1,1/2-1,Phillips Petroleum Norsk AS,143,WILDCAT,P&A,OIL,EXPLORATION,NO,1989-03-20,1989-06-04,BLANE,604-L,1/2-1 Blane,YES,147.0,,PW 8303A - 10 SP. 290,2.0,22.0,,3574.0,72.0,,CAMPANIAN,TOR FM,NORTH SEA,ROSS ISLE,SEMISUB STEEL,MOVEABLE,12,NO,WILDCAT,1989,1989,,,FORTIES FM,PALEOCENE,,,,,77,NO,,NaT,NaT,ED50,56.887519,2.476583,6305128.26,468106.29,31,1,2,initial,,2007-12-19,1991-06-04,NaT,1382,43814.0,3437650.0,296245.0,0,21956.0,,2019-10-03,2019-10-03,09.02.2020
1,1/2-2,1/2-2,Paladin Resources Norge AS,143 CS,WILDCAT,P&A,OIL SHOWS,EXPLORATION,NO,2005-12-14,2006-02-02,,1103-L,,NO,138.0,,inline 7429-trace 4824 Survey PGS CGMNOR,4.9,40.0,3432.0,3434.0,74.0,,PALEOCENE,EKOFISK FM,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,12,NO,WILDCAT,2005,2006,,,,,,,,,51,NO,,NaT,NaT,ED50,56.992222,2.496572,6316774.33,469410.1,31,1,2,initial,,2008-08-15,2008-02-02,NaT,5192,,,278245.0,0,2424919.0,,2019-10-03,2019-10-03,09.02.2020
2,1/3-1,1/3-1,A/S Norske Shell,011,WILDCAT,P&A,GAS,EXPLORATION,NO,1968-07-06,1968-11-11,,15-L,1/3-1,YES,182.0,,LINE 5651 SP. E165,18.0,26.0,,4877.0,71.0,,LATE PERMIAN,ZECHSTEIN GP,NORTH SEA,ORION,JACK-UP 3 LEGS,MOVEABLE,1-A,NO,WILDCAT,1968,1968,,,TOR FM,LATE CRETACEOUS,CROMER KNOLL GP,EARLY CRETACEOUS,,,129,NO,,NaT,NaT,ED50,56.855833,2.851389,6301488.86,490936.87,31,1,3,initial,,2010-04-30,1970-11-11,NaT,154,43820.0,,288604.0,0,20844.0,,2019-10-03,2019-10-03,09.02.2020


In [18]:
# Add new column(s) and assign constant value, e.g. Country: NORWAY.
df_explo['Country'] = 'NORWAY' 
df_dev['Country'] = 'NORWAY'

# IC Version 4.3.1 and earlier only. Fixed in 4.3.2.
# First lets rename an extraordinarily long string in column 'Seismic line' to avoid an error in IC.
#df_explo['Seismic line'] = df_explo['Seismic line'].replace('TUN15M01 3D bin datasett: Inline reference: 12688 Croslline reference: between 12383 and 12384', 'TUN15M01 3D bin: Inline 12688 Crossline 12383-12384')

# Remove decimal places introduced to the 'NPDIP' columns
df_explo['NPDID discovery'] = df_explo['NPDID discovery'].fillna(0).astype(int)
df_dev['NPDID discovery'] = df_dev['NPDID discovery'].fillna(0).astype(int)

df_explo['NPDID drilling facility'] = df_explo['NPDID drilling facility'].fillna(0).astype(int)
df_dev['NPDID drilling facility'] = df_dev['NPDID drilling facility'].fillna(0).astype(int)

df_explo['NPDID field'] = df_explo['NPDID field'].fillna(0).astype(int)
df_dev['NPDID field'] = df_dev['NPDID field'].fillna(0).astype(int)

# Copy data from one column to another, preserving the original.
df_explo['UWI number'] = df_explo['NPDID wellbore']
df_dev['UWI number'] = df_dev['NPDID wellbore']

# Check the result
df_explo[['Name', 'Country', 'NPDID drilling facility', 'NPDID wellbore']].head(3)

Unnamed: 0,Name,Country,NPDID drilling facility,NPDID wellbore
0,1/2-1,NORWAY,296245,1382
1,1/2-2,NORWAY,278245,5192
2,1/3-1,NORWAY,288604,154


### Concatenate Well Status & Well Content to match IC's Well Symbols dictionary

In [19]:
# This cell creates a new column called 'Status', combining 'Well Status' and 'Well Content'
# For example, to create a string like "P & A Oil Shows"

# Change 'P&A' to 'P & A'.
df_explo['Well status'] = df_explo['Well status'].replace(to_replace='P&A', value='P & A')

# First letter of each word capitalised
df_explo['Status'] = df_explo['Well status'].str.title() + ' ' + df_explo['Well content'].str.title()

# Check the results
df_explo[['Name', 'Status']].head(n=10)

Unnamed: 0,Name,Status
0,1/2-1,P & A Oil
1,1/2-2,P & A Oil Shows
2,1/3-1,P & A Gas
3,1/3-2,P & A Dry
4,1/3-3,P & A Oil
5,1/3-4,P & A Oil Shows
6,1/3-5,P & A Dry
7,1/3-6,P & A Gas Condensate
8,1/3-7,P & A Oil
9,1/3-8,P & A Shows


In [20]:
# List all unique entries under Status for all wells.
# In IC, open Database > Graphic Dictionaries > Well Symbols, and ensure you have dictionary entries for each.

lst_explo_status = sorted(set(df_explo['Status'].astype(str)))
lst_dev_status = sorted(set(df_dev['Status'].astype(str)))

lst_all_status = lst_explo_status + lst_dev_status

print("{} unique status values to include in IC 'Well Symbols' graphic dictionary:".format(len(lst_all_status)))
print('')
lst_unique_status = sorted(set(lst_all_status))
print(', '.join(lst_unique_status))

94 unique status values to include in IC 'Well Symbols' graphic dictionary:

Blowout Gas Shows, Closed, Closed Cuttings, Closed Gas, Closed Gas Condensate, Closed Oil, Closed Oil Gas, Closed Oil Gas Condensate, Closed Water, Closed Water Gas, Injecting Co2, Injecting Cuttings, Injecting Gas, Injecting Gas Condensate, Injecting Oil, Injecting Oil Gas, Injecting Water, Injecting Water Gas, Junked, Junked Dry, Junked Oil, Junked Oil Gas, Junked Oil Gas Shows, Junked Shows, Junked Water, P & A, P & A Cuttings, P & A Dry, P & A Gas, P & A Gas Condensate, P & A Gas Shows, P & A Oil, P & A Oil Gas, P & A Oil Gas Condensate, P & A Oil Gas Shows, P & A Oil Shows, P & A Shows, P & A Water, Plugged, Plugged Cuttings, Plugged Dry, Plugged Gas, Plugged Gas Condensate, Plugged Oil, Plugged Oil Gas, Plugged Oil Gas Condensate, Plugged Water, Plugged Water Gas, Predrilled, Predrilled Gas, Producing Gas, Producing Gas Condensate, Producing Oil, Producing Oil Gas, Producing Oil Gas Condensate, Producing

### Concatenate cells to create 'Grid system' in IC format

In [21]:
# At time of writing, there are several problems with 'Geodatum' in the NPD datasets, including:
#  - trailing spaces ('ED50 ') in all Explo wells
#  - erroneous '56ED50', '60ED50' and '61ED50' values in Dev wells
#  - missing 'ED50' values in two explo wells
# Luckily, we can just force 'ED50' on all these cells!

df_explo['Geodatum'] = 'ED50'
df_dev['Geodatum'] = 'ED50'

# Concatenate cells to create a new column 'Grid system' in IC format (e.g. "ED50 / UTM Zone 31N")

df_explo['Grid system'] = df_explo['Geodatum'] + ' / ' + 'UTM zone ' + df_explo['UTM zone'].map(str) + 'N'
df_dev['Grid system'] = df_dev['Geodatum'] + ' / ' + 'UTM zone ' + df_dev['UTM zone'].map(str) + 'N'

print('Geodatum and Grid systems for IC:')
df_explo[['Name', 'Geodatum', 'Grid system']].head(3)

Geodatum and Grid systems for IC:


Unnamed: 0,Name,Geodatum,Grid system
0,1/2-1,ED50,ED50 / UTM zone 31N
1,1/2-2,ED50,ED50 / UTM zone 31N
2,1/3-1,ED50,ED50 / UTM zone 31N


### QC Well Headers

In [22]:
# Print out attributes lists, reflecting all the changes above.
# Use these lists to check the current order of your columns in each, and consider how you might like to re-order them.
# Any columns created above (including: Country, Status, Grid system) currently appear at the end of the lists.

print('--- BEFORE RE-ORDERING ---\n')
print(len(df_explo.columns), 'Exploration attributes:\n', list(df_explo.columns), '\n')
print(len(df_dev.columns), 'Development attributes:\n', list(df_dev.columns))

--- BEFORE RE-ORDERING ---

75 Exploration attributes:
 ['Name', 'Alternate 1', 'Operator', 'Licence number', 'Intent', 'Well status', 'Well content', 'Type', 'Subsea', 'SPUD date', 'Completion date', 'Field', 'Drill permit', 'Discovery name', 'Discovery wellbore', 'Bottom hole temperature [°C]', 'Sitesurvey', 'Seismic line', 'Maximum inclination [°]', 'KBE', 'Final vertical depth (TVD) [m RKB]', 'Terminal depth', 'Water depth', 'Kick off point [m RKB]', 'Oldest penetrated age', 'Oldest penetrated formation', 'Location', 'Facility', 'Drilling facility type', 'Drilling facility category', 'Licensing activity awarded in', 'Multilateral', 'Intent - planned', 'Entry year', 'Completed year', 'Reclassified from/to wellbore', 'Reentry activity', '1st level with HC formation', '1st level with HC age', '2nd level with HC formation', '2nd level with HC age', '3rd level with HC formation', '3rd level with HC age', 'Drilling days', 'Reentry', 'Prod. licence for drilling target', 'Plugged and abond

### Re-order all columns (OPTIONAL)

In [23]:
# # Specifies the order of columns for Exploration and Development wells in the final outputs.
# # It's not compulsory to re-order columns, as IC lists all non-default attributes alphabetically.

# explo_order = ['Name', 'Alternate 1', 'UWI number', 'Quadrant', 'Block', 'Operator', 'Licence number', 'Intent', 
#                 'Intent - planned', 'Well status', 'Well content', 'Status', 'Type', 'Subsea', 'SPUD date', 
#                 'Completion date', 'Field', 'Drill permit', 'Discovery name', 'Discovery wellbore', 
#                 'Bottom hole temperature [°C]', 'Seismic line', 'Maximum inclination [°]', 'KBE', 
#                 'Final vertical depth (TVD) [m RKB]', 'Terminal depth', 'Water depth', 'Kick off point [m RKB]', 
#                 'Oldest penetrated age', 'Oldest penetrated formation', 'Location', 'Country', 'Facility', 
#                 'Drilling facility type', 'Drilling facility category', 'Licensing activity awarded in', 
#                 'Multilateral', 'Entry year', 'Completed year', 'Reclassified from/to wellbore', 'Reentry activity', 
#                 'Plot symbol', '1st level with HC formation', '1st level with HC age', '2nd level with HC formation', 
#                 '2nd level with HC age', '3rd level with HC formation', '3rd level with HC age', 'Drilling days', 
#                 'Reentry', 'Geodatum', 'Latitude', 'Longitude', 'Surface X', 'Surface Y', 'UTM zone', 'Grid system', 
#                 'DISKOS Well Type', 'DISKOS Wellbore Parent', 
#                 'Publication date', 'Release date', 'NPDID wellbore', 'NPDID discovery', 'NPDID field', 
#                 'NPDID drilling facility', 'NPDID wellbore reclassified from', 'NPDID production licence drilled in', 
#                 'Date main level updated', 'Date all updated', 'Date sync NPD']

# dev_order = ['Name', 'Alternate 1', 'UWI number', 'Quadrant', 'Block', 'Operator', 'Licence number', 'Intent', 
#               'Intent - planned', 'Well status', 'Well content',  'Status', 'Content - planned', 'Type', 'Subsea',
#               'SPUD date', 'Completion date', 'Field', 'Predrilled entry date','Predrilled completion date', 
#               'Drill permit', 'Discovery name', 'Discovery wellbore', 'KBE', 'Final vertical depth (TVD) [m RKB]',
#               'Terminal depth', 'Water depth', 'Kick off point [m RKB]', 'Location', 'Country', 'Facility', 
#               'Drilling facility type', 'Drilling facility category', 'Licensing activity awarded in', 
#               'Production facility', 'Multilateral', 'Entry year', 'Completed year','Reclassified from/to wellbore', 
#               'Plot symbol', 'Geodatum', 'Latitude', 'Longitude', 'Surface Y', 'Surface X', 'UTM zone',  'Grid system', 
#               'DISKOS Well Type', 'DISKOS Wellbore Parent', 'NPDID wellbore', 
#               'NPDID discovery', 'NPDID field', 'Publication date', 'Release date', 'NPDID production licence drilled in', 
#               'NPDID drilling facility', 'NPDID production facility','NPDID wellbore reclassified from', 
#               'Date main level updated', 'Date all updated', 'Date sync NPD']

In [24]:
# # Check if your list of re-ordered attributes is complete.
# missing_explo = set(df_explo.columns).difference(explo_order)
# missing_dev = set(df_dev.columns).difference(dev_order)

# if len(missing_explo) > 0:
#     print('Your re-ordered list of Exploration attributes is incomplete. You must include:\n {}.\n'.format(missing_explo))
# else:
#     print('Your re-ordered list of Exploration attributes is complete.\n')
    
# if len(missing_dev) > 0:
#     print('Your re-ordered list of Development attributes is incomplete. You must include:\n {}.'.format(missing_dev))
# else:
#     print('Your re-ordered list of Development attributes is complete.')

In [25]:
# # Only when your re-ordered lists of Exploration and Development attributes are complete should you run this cell,
# # Otherwise these will not be included in the output file!
# # Applies the re-ordering to the dataframes

# df_explo = df_explo.reindex(columns=explo_order)
# df_dev = df_dev.reindex(columns=dev_order)

### QC column values

In [26]:
# Print out all unique values for selected attributes (example: Operator and Field)

def lstheaderfields (*args):
    for arg in args:
        print('---' , arg, '---')
        print('')
        words = [x for x in df_explo[arg].unique()]
        print('Exploration wells:')
        print(words)
        print('')
        words = [x for x in df_dev[arg].unique()]
        print('Development wells:')
        print(words)
        print("")
        
# Enter the names of columns you would like to check
lstheaderfields('Operator', 'Field')

--- Operator ---

Exploration wells:
['Phillips Petroleum Norsk AS', 'Paladin Resources Norge AS', 'A/S Norske Shell', 'Elf Petroleum Norge AS', 'Amoco Norway Oil Company', 'BP Norway Limited U.A.', 'DONG E&P Norge AS', 'BG Norge AS', 'Phillips Petroleum Company Norway', 'Conoco Norway Inc.', 'Amerada Hess Norge AS', 'Total E&P Norge AS', 'Den norske stats oljeselskap a.s', 'BP Petroleum Dev. of Norway AS', 'Talisman Energy Norge AS', 'Det norske oljeselskap ASA', 'Aker BP ASA', 'Saga Petroleum ASA', 'Norske Murphy Oil Company', 'Norwegian Gulf Exploration Company AS', 'Norsk Hydro Produksjon AS', 'ConocoPhillips Skandinavia AS', 'Statoil Petroleum AS', 'Norsk Agip AS', 'StatoilHydro Petroleum AS', 'Lundin Norway AS', 'MOL Norge AS', 'Faroe Petroleum Norge AS', 'Edison Norge AS', 'Elf Norge A/S', 'Premier Oil Norge AS', 'Repsol Exploration Norge AS', 'LOTOS Exploration and Production Norge AS', 'Esso Exploration and Production Norway A/S', 'Unocal Norge A/S', 'Centrica Resources (Norge

### CREATE FILES - create well header files for explo and dev wells

In [27]:
# Output exploration well headers

print('{} exploration wells from {} to {}.'.format(len(df_explo), 
                                                   df_explo['Name'][df_explo.index[0]], 
                                                   df_explo['Name'][df_explo.index[-1]]))

output_to_csv(outname='IC_wellbore_exploration_all', df=df_explo)

1930 exploration wells from 1/2-1 to 7435/12-1.
Saved to: C:\ICData\Test3\output_data\IC_wellbore_exploration_all.csv


Unnamed: 0,Name,Alternate 1,Operator,Licence number,Intent,Well status,Well content,Type,Subsea,SPUD date,Completion date,Field,Drill permit,Discovery name,Discovery wellbore,Bottom hole temperature [°C],Sitesurvey,Seismic line,Maximum inclination [°],KBE,Final vertical depth (TVD) [m RKB],Terminal depth,Water depth,Kick off point [m RKB],Oldest penetrated age,Oldest penetrated formation,Location,Facility,Drilling facility type,Drilling facility category,Licensing activity awarded in,Multilateral,Intent - planned,Entry year,Completed year,Reclassified from/to wellbore,Reentry activity,1st level with HC formation,1st level with HC age,2nd level with HC formation,2nd level with HC age,3rd level with HC formation,3rd level with HC age,Drilling days,Reentry,Prod. licence for drilling target,Plugged and abondon date,Plugged date,Geodatum,Latitude,Longitude,Surface Y,Surface X,UTM zone,Quadrant,Block,DISKOS Well Type,DISKOS Wellbore Parent,Publication date,Release date,Reclassified date,NPDID wellbore,NPDID discovery,NPDID field,NPDID drilling facility,NPDID wellbore reclassified from,NPDID production licence drilled in,NPDID site survey,Date main level updated,Date all updated,Date sync NPD,Country,UWI number,Status,Grid system
0,1/2-1,1/2-1,Phillips Petroleum Norsk AS,143,WILDCAT,P & A,OIL,EXPLORATION,NO,1989-03-20,1989-06-04,BLANE,604-L,1/2-1 Blane,YES,147.0,,PW 8303A - 10 SP. 290,2.0,22.0,,3574.0,72.0,,CAMPANIAN,TOR FM,NORTH SEA,ROSS ISLE,SEMISUB STEEL,MOVEABLE,12,NO,WILDCAT,1989,1989,,,FORTIES FM,PALEOCENE,,,,,77,NO,,,,ED50,56.887519,2.476583,6305128.26,468106.29,31,1,2,initial,,2007-12-19,1991-06-04,,1382,43814,3437650,296245,0,21956.0,,2019-10-03,2019-10-03,09.02.2020,NORWAY,1382,P & A Oil,ED50 / UTM zone 31N
1,1/2-2,1/2-2,Paladin Resources Norge AS,143 CS,WILDCAT,P & A,OIL SHOWS,EXPLORATION,NO,2005-12-14,2006-02-02,,1103-L,,NO,138.0,,inline 7429-trace 4824 Survey PGS CGMNOR,4.9,40.0,3432.0,3434.0,74.0,,PALEOCENE,EKOFISK FM,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,12,NO,WILDCAT,2005,2006,,,,,,,,,51,NO,,,,ED50,56.992222,2.496572,6316774.33,469410.1,31,1,2,initial,,2008-08-15,2008-02-02,,5192,0,0,278245,0,2424919.0,,2019-10-03,2019-10-03,09.02.2020,NORWAY,5192,P & A Oil Shows,ED50 / UTM zone 31N
2,1/3-1,1/3-1,A/S Norske Shell,011,WILDCAT,P & A,GAS,EXPLORATION,NO,1968-07-06,1968-11-11,,15-L,1/3-1,YES,182.0,,LINE 5651 SP. E165,18.0,26.0,,4877.0,71.0,,LATE PERMIAN,ZECHSTEIN GP,NORTH SEA,ORION,JACK-UP 3 LEGS,MOVEABLE,1-A,NO,WILDCAT,1968,1968,,,TOR FM,LATE CRETACEOUS,CROMER KNOLL GP,EARLY CRETACEOUS,,,129,NO,,,,ED50,56.855833,2.851389,6301488.86,490936.87,31,1,3,initial,,2010-04-30,1970-11-11,,154,43820,0,288604,0,20844.0,,2019-10-03,2019-10-03,09.02.2020,NORWAY,154,P & A Gas,ED50 / UTM zone 31N


In [28]:
# Output development well headers

print('{} development wells {} to {}.'.format(len(df_dev), 
                                                   df_dev['Name'][df_dev.index[0]], 
                                                   df_dev['Name'][df_dev.index[-1]]))

output_to_csv(outname='IC_wellbore_development_all', df=df_dev)

5152 development wells 1/3-A-1 H to 7122/10-I-4 H.
Saved to: C:\ICData\Test3\output_data\IC_wellbore_development_all.csv


Unnamed: 0,Name,Alternate 1,Operator,Licence number,Well status,Intent,Intent - planned,Well content,Type,Subsea,SPUD date,Completion date,Predrilled entry date,Predrilled completion date,Field,Drill permit,Discovery name,Discovery wellbore,KBE,Final vertical depth (TVD) [m RKB],Terminal depth,Water depth,Kick off point [m RKB],Location,Facility,Drilling facility type,Drilling facility category,Production facility,Licensing activity awarded in,Multilateral,Content - planned,Entry year,Completed year,Reclassified from/to wellbore,Plugged and abondon date,Plugged date,Prod. licence for drilling target,Geodatum,Latitude,Longitude,Surface Y,Surface X,UTM zone,Quadrant,Block,DISKOS Well Type,DISKOS Wellbore Parent,NPDID wellbore,NPDID discovery,NPDID field,Publication date,Release date,NPDID production licence drilled in,NPDID production licence target,NPDID drilling facility,NPDID production facility,NPDID wellbore reclassified from,Date main level updated,Date all updated,Date sync NPD,Country,UWI number,Status,Grid system
0,1/3-A-1 H,1/3-A-1,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2011-07-22 00:00:00.000,2011-09-21 00:00:00.000,,,OSELVAR,3365-P,1/3-6 Oselvar,NO,45.0,3163.0,5927.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2011,2011,,,,,ED50,56.931961,2.671294,6310001.5,479994.47,31,1,3,initial,,6612,43832,5506919,,2013-09-21 00:00:00.000,2060266,,278245,410592.0,0,2019-12-09 00:00:00.000,2015-10-06 00:00:00.000,09.02.2020,NORWAY,6612,Closed Oil,ED50 / UTM zone 31N
1,1/3-A-2 H,1/3-A-2,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2011-11-18 00:00:00.000,2012-01-19 00:00:00.000,2011-06-19 00:00:00.000,2011-07-04 00:00:00.000,OSELVAR,3366-P,1/3-6 Oselvar,NO,45.0,3170.0,5882.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2011,2012,,,,,ED50,56.931914,2.671297,6309996.24,479994.61,31,1,3,initial,,6613,43832,5506919,,2014-01-19 00:00:00.000,2060266,,278245,410592.0,0,2019-12-09 00:00:00.000,2015-10-06 00:00:00.000,09.02.2020,NORWAY,6613,Closed Oil,ED50 / UTM zone 31N
2,1/3-A-3 H,1/3-A-3,DONG E&P Norge AS,274,CLOSED,PRODUCTION,PRODUCTION,OIL,DEVELOPMENT,YES,2012-03-04 00:00:00.000,2012-05-14 00:00:00.000,2011-07-05 00:00:00.000,2011-07-21 00:00:00.000,OSELVAR,3367-P,1/3-6 Oselvar,NO,45.0,3171.0,6665.0,72.0,,NORTH SEA,MÆRSK GIANT,JACK-UP 3 LEGS,MOVEABLE,OSELVAR,NST2001,NO,OIL,2012,2012,,,,,ED50,56.931964,2.671478,6310001.76,480005.63,31,1,3,initial,,6614,43832,5506919,,2014-05-14 00:00:00.000,2060266,,278245,410592.0,0,2019-12-09 00:00:00.000,2015-10-06 00:00:00.000,09.02.2020,NORWAY,6614,Closed Oil,ED50 / UTM zone 31N


### Well Attributes to create in IC

In [29]:
# The following attributes are not IC defaults and need to be created under Wells > Attributes.
# Alternatively, use the SQL code produced in the next cell to create these rows in SSMS. 

# Find the full list of attributes after all the editing you've done above
all_attributes = set(list(df_explo.columns) + list(df_dev.columns))

# Find and count those attributes you'll need to create in IC
non_default_attributes = list(set(all_attributes).difference(ic_default_attributes))
non_default_attributes.sort()
num_non_default_attributes = len(non_default_attributes)

print('The following {} attributes are not IC defaults and must be added to IC:\n'.format(num_non_default_attributes))
print(list(non_default_attributes))

The following 55 attributes are not IC defaults and must be added to IC:

['1st level with HC age', '1st level with HC formation', '2nd level with HC age', '2nd level with HC formation', '3rd level with HC age', '3rd level with HC formation', 'Bottom hole temperature [°C]', 'Completed year', 'Content - planned', 'DISKOS Well Type', 'DISKOS Wellbore Parent', 'Date all updated', 'Date main level updated', 'Date sync NPD', 'Discovery wellbore', 'Drill permit', 'Drilling days', 'Drilling facility category', 'Drilling facility type', 'Entry year', 'Final vertical depth (TVD) [m RKB]', 'Intent - planned', 'Kick off point [m RKB]', 'Licensing activity awarded in', 'Maximum inclination [°]', 'Multilateral', 'NPDID discovery', 'NPDID drilling facility', 'NPDID field', 'NPDID production facility', 'NPDID production licence drilled in', 'NPDID production licence target', 'NPDID site survey', 'NPDID wellbore', 'NPDID wellbore reclassified from', 'Oldest penetrated age', 'Oldest penetrated formatio

In [30]:
# If you have database administration privileges, you can use this cell 
# to generate the SQL Query code that will create Well Attributes in IC in the format:
    
    #INSERT INTO t_WellsUserFields (f_FieldId, f_FieldName, f_IsInputUsed, f_InputID, f_Description, f_Origin, f_SortOrder)
        #VALUES (1, 'Attribute', 'False', 0, 'Description of attribute', 0, 1);

        # This assumes you have no yet created any Well Attributes in IC. 
        # If you have already, you'll need to tweak the 3 variables below.

pk_index = 0 #Enter one less than your highest pk_index
original_pk_index = 0 #Enter the same number as above (this one we won't change)
f_sortorder = 0 #Enter the next appropriate f_sortorder

print("INSERT INTO t_WellsUserFields")
print("  (f_FieldId, f_FieldName, f_IsInputUsed, f_InputID, f_Description, f_Origin, f_SortOrder)")
print("VALUES")

for i in non_default_attributes:
    pk_index += 1
    f_sortorder += 1
    if pk_index < (num_non_default_attributes + original_pk_index):
        print("  ({x}, '{y}', 'False', 0, 'Userfield {y}', 0, {z}),".format(x = pk_index, y = i, z = f_sortorder))
    else:
        print("  ({x}, '{y}', 'False', 0, 'Userfield {y}', 0, {z});".format(x = pk_index, y = i, z = f_sortorder))

# Follow these steps:
    # 1. Open your IC database in SQL Server Management Studio. IC must be closed/computer restarted to open a LocalDB in SSMS.
    # 2. Expand 'Tables', scroll down to 't_WellsUserFields' and right-click 'Edit Top 200 Rows'.
    # 3. Press Ctrl+N to create a new query, copy and paste the following SQL code to the blank query and hit F5.

INSERT INTO t_WellsUserFields
  (f_FieldId, f_FieldName, f_IsInputUsed, f_InputID, f_Description, f_Origin, f_SortOrder)
VALUES
  (1, '1st level with HC age', 'False', 0, 'Userfield 1st level with HC age', 0, 1),
  (2, '1st level with HC formation', 'False', 0, 'Userfield 1st level with HC formation', 0, 2),
  (3, '2nd level with HC age', 'False', 0, 'Userfield 2nd level with HC age', 0, 3),
  (4, '2nd level with HC formation', 'False', 0, 'Userfield 2nd level with HC formation', 0, 4),
  (5, '3rd level with HC age', 'False', 0, 'Userfield 3rd level with HC age', 0, 5),
  (6, '3rd level with HC formation', 'False', 0, 'Userfield 3rd level with HC formation', 0, 6),
  (7, 'Bottom hole temperature [°C]', 'False', 0, 'Userfield Bottom hole temperature [°C]', 0, 7),
  (8, 'Completed year', 'False', 0, 'Userfield Completed year', 0, 8),
  (9, 'Content - planned', 'False', 0, 'Userfield Content - planned', 0, 9),
  (10, 'DISKOS Well Type', 'False', 0, 'Userfield DISKOS Well Type', 0, 10),
  

### Install correct co-ordinate systems to IC database

In [31]:
# In IC, open Project > Properties > Coords > Coordinate Systems
# Ensure each of the following co-ordinate system are installed **before importing well headers**
# Or use the cell below to write to file

lst_geodatum = sorted(set(df_explo['Geodatum'].astype(str)))
print('Geodatum:', ', '.join(lst_geodatum))

lst_gridsystem = sorted(set(df_explo['Grid system'].astype(str)))
print('Grid systems:', ', '.join(lst_gridsystem))

Geodatum: ED50
Grid systems: ED50 / UTM zone 31N, ED50 / UTM zone 32N, ED50 / UTM zone 33N, ED50 / UTM zone 34N, ED50 / UTM zone 35N, ED50 / UTM zone 36N, ED50 / UTM zone 37N


In [32]:
# Write Norwegian UTM Zones to Projections.def
# Note that Projections.def already contains geodatum ED50.

f = open(dbdir + '\Support\Projections.def', 'w+')

f.write('''# ED50
<4230> +proj=longlat +ellps=intl +no_defs <>
# ED50 / UTM zone 31N
<23031> +proj=utm +zone=31 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 32N
<23032> +proj=utm +zone=32 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 33N
<23033> +proj=utm +zone=33 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 34N
<23034> +proj=utm +zone=34 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 35N
<23035> +proj=utm +zone=35 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 36N
<23036> +proj=utm +zone=36 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 37N
<23037> +proj=utm +zone=37 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
''')

f.seek(0)
contents = f.read()
print(contents)

f.close()

# ED50
<4230> +proj=longlat +ellps=intl +no_defs <>
# ED50 / UTM zone 31N
<23031> +proj=utm +zone=31 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 32N
<23032> +proj=utm +zone=32 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 33N
<23033> +proj=utm +zone=33 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 34N
<23034> +proj=utm +zone=34 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 35N
<23035> +proj=utm +zone=35 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 36N
<23036> +proj=utm +zone=36 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>
# ED50 / UTM zone 37N
<23037> +proj=utm +zone=37 +ellps=intl +towgs84=-87,-98,-121,0,0,0,0 +units=m +no_defs <>



### Import the data to IC

In [33]:
# Before importing data to IC, ensure you have followed the last few steps to:
# - Create the appropriate Well Attributes in your IC Database.
# - Add the correct coordinate systems to your IC Project.

# Import reference files via Import > Well References
# Import well headers via Import > Headers.

# Note that, while the well header data imports very quickly, IC is a bit slow to create the wells if they don't already exist. Patience!

## Well Data

<h3>Core (Core Interval)</h3>

In [34]:
if data_source == 'web':
    df_core = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_core&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=165.225.81.99&CultureCode=en')

if data_source == 'file':
    df_core = pd.read_excel('input data/wellbore_core.xlsx')
    
df_core.head(3)

Unnamed: 0,Wellbore,Core sample number,Core sample - top depth,Core sample - bottom depth,Core sample depth - uom,Total core sample length [m],Number of cores samples,Cores available for sampling?,NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,1,10208.0,10208.4,[ft ],56.2,8,YES,1382,2016-05-31,08.02.2020
1,1/2-1,2,10216.0,10233.0,[ft ],56.2,8,YES,1382,2016-05-31,08.02.2020
2,1/2-1,3,0.0,0.0,,56.2,8,YES,1382,2016-05-31,08.02.2020


In [35]:
df_core = df_core.replace(0, np.nan)
df_core.head(3)

Unnamed: 0,Wellbore,Core sample number,Core sample - top depth,Core sample - bottom depth,Core sample depth - uom,Total core sample length [m],Number of cores samples,Cores available for sampling?,NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,1,10208.0,10208.4,[ft ],56.2,8.0,YES,1382,2016-05-31,08.02.2020
1,1/2-1,2,10216.0,10233.0,[ft ],56.2,8.0,YES,1382,2016-05-31,08.02.2020
2,1/2-1,3,,,,56.2,8.0,YES,1382,2016-05-31,08.02.2020


In [36]:
df_core.isnull().sum()

Wellbore                           0
Core sample number                 0
Core sample - top depth          136
Core sample -  bottom depth      136
Core sample depth - uom          130
Total core sample length [m]       5
Number of cores samples            6
Cores available for sampling?      0
NPDID wellbore                     0
Date updated                       0
Date sync NPD                      0
dtype: int64

In [37]:
filt = (df_core['Core sample - top depth'].isnull()) | (df_core['Core sample -  bottom depth'].isnull()) | (df_core['Core sample depth - uom'].isnull())

# (df_core['Core sample - top depth'] == 0.0) | (df_core['Core sample -  bottom depth'] == 0.0) |

df_core[filt].count()
df_core[filt]

Unnamed: 0,Wellbore,Core sample number,Core sample - top depth,Core sample - bottom depth,Core sample depth - uom,Total core sample length [m],Number of cores samples,Cores available for sampling?,NPDID wellbore,Date updated,Date sync NPD
2,1/2-1,3,,,,56.2,8.0,YES,1382,2016-05-31,08.02.2020
341,15/9-11,8,,,,104.3,11.0,YES,329,2016-05-31,08.02.2020
531,16/1-1,9,,,,171.0,18.0,NO,147,2018-03-16,08.02.2020
889,2/11-9,2,,,,33.25,4.0,YES,2153,2016-05-31,08.02.2020
903,2/11-A-2,6,,,,126.14,10.0,YES,1616,2016-05-31,08.02.2020
1015,2/4-2,1,,,,32.31,8.0,YES,172,2016-05-31,08.02.2020
1026,2/4-3,1,,,,87.48,18.0,YES,97,2016-05-31,08.02.2020
1136,2/4-A-4 A,5,,,,131.85,7.0,YES,1376,2016-05-31,08.02.2020
1196,2/4-B-12,13,,,,304.0,40.0,YES,616,2016-05-31,08.02.2020
1198,2/4-B-12,15,,,,304.0,40.0,YES,616,2016-05-31,08.02.2020


In [38]:
df_core = df_core[~filt]
df_core

Unnamed: 0,Wellbore,Core sample number,Core sample - top depth,Core sample - bottom depth,Core sample depth - uom,Total core sample length [m],Number of cores samples,Cores available for sampling?,NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,1,10208.0,10208.40,[ft ],56.20,8.0,YES,1382,2016-05-31,08.02.2020
1,1/2-1,2,10216.0,10233.00,[ft ],56.20,8.0,YES,1382,2016-05-31,08.02.2020
3,1/2-1,4,10256.0,10286.00,[ft ],56.20,8.0,YES,1382,2016-05-31,08.02.2020
4,1/2-1,5,10286.0,10358.00,[ft ],56.20,8.0,YES,1382,2016-05-31,08.02.2020
5,1/2-1,6,10358.0,10364.60,[ft ],56.20,8.0,YES,1382,2016-05-31,08.02.2020
...,...,...,...,...,...,...,...,...,...,...,...
8273,9/2-A-4,2,4648.5,4676.70,[m ],68.00,3.0,YES,2828,2016-05-31,08.02.2020
8274,9/2-A-4,3,4676.7,4704.36,[m ],68.00,3.0,YES,2828,2016-05-31,08.02.2020
8275,9/3-1,1,1798.0,1805.40,[m ],7.40,1.0,YES,921,2016-05-31,08.02.2020
8276,9/8-1,1,1926.0,1933.35,[m ],23.77,2.0,YES,145,2016-05-31,08.02.2020


In [39]:
#df_core.isnull().sum()

In [40]:
df_core['Core sample depth - uom'].unique()

array(['[ft  ]', '[m   ]'], dtype=object)

In [41]:
for index, row in df_core.iterrows():
    if row['Core sample depth - uom'] == '[ft  ]':
        df_core.loc[index, 'Top depth'] = (row['Core sample - top depth'] * 0.3048)
    else:
        df_core.loc[index, 'Top depth'] = row['Core sample - top depth']

In [42]:
for index, row in df_core.iterrows():
    if row['Core sample depth - uom'] == '[ft  ]':
        df_core.loc[index, 'Base depth'] = (row['Core sample -  bottom depth'] * 0.3048)
    else:
        df_core.loc[index, 'Base depth'] = row['Core sample -  bottom depth']

In [43]:
df_core.isnull().sum()

Wellbore                         0
Core sample number               0
Core sample - top depth          0
Core sample -  bottom depth      0
Core sample depth - uom          0
Total core sample length [m]     0
Number of cores samples          1
Cores available for sampling?    0
NPDID wellbore                   0
Date updated                     0
Date sync NPD                    0
Top depth                        0
Base depth                       0
dtype: int64

In [44]:
df_core.dtypes
# Note extra space in 'Core sample -  bottom depth'

Wellbore                                 object
Core sample number                        int64
Core sample - top depth                 float64
Core sample -  bottom depth             float64
Core sample depth - uom                  object
Total core sample length [m]            float64
Number of cores samples                 float64
Cores available for sampling?            object
NPDID wellbore                            int64
Date updated                     datetime64[ns]
Date sync NPD                            object
Top depth                               float64
Base depth                              float64
dtype: object

In [45]:
df_core = df_core[['Wellbore', 'NPDID wellbore', 'Top depth', 'Base depth', 'Core sample number', ]].round(2)
df_core

Unnamed: 0,Wellbore,NPDID wellbore,Top depth,Base depth,Core sample number
0,1/2-1,1382,3111.40,3111.52,1
1,1/2-1,1382,3113.84,3119.02,2
3,1/2-1,1382,3126.03,3135.17,4
4,1/2-1,1382,3135.17,3157.12,5
5,1/2-1,1382,3157.12,3159.13,6
...,...,...,...,...,...
8273,9/2-A-4,2828,4648.50,4676.70,2
8274,9/2-A-4,2828,4676.70,4704.36,3
8275,9/3-1,921,1798.00,1805.40,1
8276,9/8-1,145,1926.00,1933.35,1


In [46]:
# Rename columns
rename_cols = {'Wellbore' : 'Well',
               'Core sample number': 'Legend'
               }
    
# Apply renaming
df_core.rename(columns=rename_cols, inplace=True)
df_core.head(3)

Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend
0,1/2-1,1382,3111.4,3111.52,1
1,1/2-1,1382,3113.84,3119.02,2
3,1/2-1,1382,3126.03,3135.17,4


In [47]:
# Output file
output_to_csv(outname='wellbore_core', df=df_core)

Saved to: C:\ICData\Test3\output_data\wellbore_core.csv


Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend
0,1/2-1,1382,3111.4,3111.52,1
1,1/2-1,1382,3113.84,3119.02,2
2,1/2-1,1382,3126.03,3135.17,4


<h3>Core Photos</h3>

In [48]:
# Outputs three files:
    # wellbore_core_photo_ERRONEOUS_withURL.csv (erroneous 'Core photo title' columns)
    # wellbore_core_photo_withURL.csv
    # wellbore_core_photo.csv

In [49]:
if data_source == 'web':
    df_core_photo = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_core_photo&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.129.189&CultureCode=en')

if data_source == 'file':
    df_core_photo = pd.read_excel('input data/wellbore_core_photo.xlsx')

df_core_photo.head(3)

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Date updated
0,1/2-1,1,10208-10228ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,2019-04-25
1,1/2-1,2,19228-10262ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,2019-04-25
2,1/2-1,3,10262-10277ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,2019-04-25


In [50]:
df_core_photo = df_core_photo.drop('Date updated', 1)

In [51]:
# See https://pythex.org/

# Match pattern:
# 10208-10228ft
# 1802-1805m

pat = '\d{3,5}-\d{3,5}\D{1,2}'
    
#filt = df_core_photo['Core photo title'].str.extract(pat)
filt = df_core_photo['Core photo title'].str.contains(pat)

# Check rows that match pattern
df_core_photo[filt].head(3)

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
0,1/2-1,1,10208-10228ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
1,1/2-1,2,19228-10262ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
2,1/2-1,3,10262-10277ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382


In [52]:
# Check rows that do not match pattern and make corrections
df_core_photo[~filt]

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
1586,16/1-5,6,2044,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3279
2121,2/4-X-47,1,15736-15751,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2122,2/4-X-47,2,15751-15766,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2123,2/4-X-47,3,15766-15781,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2124,2/4-X-47,4,15781-15796,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2125,2/4-X-47,5,15796-15811,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2126,2/4-X-47,6,15811-15826,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2127,2/4-X-47,8,15826-15837,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2128,2/4-X-47,9,15837-15852,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2129,2/4-X-47,10,15852-15865,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157


In [53]:
# Note 95 rows with erroneous values
# Apply obvious corrections then drop the rest.

# Values for well 2/4-X-47 are obviously in ft.
filt_correction = df_core_photo['Wellbore'] == '2/4-X-47'
df_core_photo.loc[filt_correction, 'Core photo title'] = (df_core_photo['Core photo title'] + 'm')
df_core_photo.loc[filt_correction]

# There are other obvious corrections to be made, but leave for now.
# Example below, but don't do this on .loc as index may as more wells added.

#['Core photo title'].loc[14234] = '2482-2483m'

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
2121,2/4-X-47,1,15736-15751m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2122,2/4-X-47,2,15751-15766m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2123,2/4-X-47,3,15766-15781m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2124,2/4-X-47,4,15781-15796m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2125,2/4-X-47,5,15796-15811m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2126,2/4-X-47,6,15811-15826m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2127,2/4-X-47,8,15826-15837m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2128,2/4-X-47,9,15837-15852m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2129,2/4-X-47,10,15852-15865m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157


In [54]:
# Assign rows that do not match pattern to new dataframe

df_core_photo_ERRONEOUS = df_core_photo[~filt]
df_core_photo_ERRONEOUS

# Output data
output_to_csv(outname='wellbore_core_photo_ERRONEOUS_withURL', df=df_core_photo_ERRONEOUS)

Saved to: C:\ICData\Test3\output_data\wellbore_core_photo_ERRONEOUS_withURL.csv


Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
0,16/1-5,6,2044,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3279
1,2/4-X-47,1,15736-15751m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157
2,2/4-X-47,2,15751-15766m,https://factpages.npd.no/pbl/core_photo_jpgs/3...,3157


In [55]:
# Keep only rows that do match pattern
# Dumps the rest (e.g. '2044', 'Core 2')

df_core_photo = df_core_photo[filt]
print(df_core_photo.shape)
df_core_photo

(20849, 5)


Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
0,1/2-1,1,10208-10228ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
1,1/2-1,2,19228-10262ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
2,1/2-1,3,10262-10277ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
3,1/2-1,4,10277-10292ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
4,1/2-1,5,10292-10307ft,https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
...,...,...,...,...,...
20939,9/2-A-4,13,4691-4696m,https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20940,9/2-A-4,14,4696-4701m,https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20941,9/2-A-4,15,4701-4704m,https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20942,9/3-1,1,1798-1802m,https://factpages.npd.no/pbl/core_photo_jpgs/9...,921


In [56]:
# Check for null values
df_core_photo.isna().sum()

Wellbore              0
Core sample number    0
Core photo title      0
Core photo URL        0
NPDID wellbore        0
dtype: int64

In [57]:
# Check datatypes
df_core_photo.dtypes

Wellbore              object
Core sample number     int64
Core photo title      object
Core photo URL        object
NPDID wellbore         int64
dtype: object

In [58]:
df_core_photo['Core photo title'].replace({'mj': 'm', #one erroneous 'mj' value
                                           'n': ',m', #one erroneous 'n' value
                                           'm': ',m', #then replace all 'm'
                                           'M': ',m',
                                           'ft': ',ft',
                                           'FT': ',ft'
                                          }, regex=True, inplace=True)

df_core_photo['Core photo title'].replace({'-': ','}, regex=True, inplace=True)

df_core_photo.head(3)

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382


In [59]:
df_core_photo.tail()

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore
20939,9/2-A-4,13,"4691,4696,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20940,9/2-A-4,14,"4696,4701,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20941,9/2-A-4,15,"4701,4704,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828
20942,9/3-1,1,"1798,1802,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921
20943,9/3-1,2,"1802,1805,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921


In [60]:
df_core_photo[['Top depth', 'Base depth', 'Unit']] = df_core_photo['Core photo title'].str.split(pat=',', n=2, expand=True)
df_core_photo

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10208,10228,ft
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,19228,10262,ft
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10262,10277,ft
3,1/2-1,4,"10277,10292,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10277,10292,ft
4,1/2-1,5,"10292,10307,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10292,10307,ft
...,...,...,...,...,...,...,...,...
20939,9/2-A-4,13,"4691,4696,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4691,4696,m
20940,9/2-A-4,14,"4696,4701,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4696,4701,m
20941,9/2-A-4,15,"4701,4704,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4701,4704,m
20942,9/3-1,1,"1798,1802,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921,1798,1802,m


In [61]:
df_core_photo['Unit'].unique()

array(['ft', 'm'], dtype=object)

In [62]:
# Drop columns that contain nulls
df_core_photo.isna().sum()

Wellbore              0
Core sample number    0
Core photo title      0
Core photo URL        0
NPDID wellbore        0
Top depth             0
Base depth            0
Unit                  0
dtype: int64

In [63]:
df_core_photo

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10208,10228,ft
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,19228,10262,ft
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10262,10277,ft
3,1/2-1,4,"10277,10292,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10277,10292,ft
4,1/2-1,5,"10292,10307,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,10292,10307,ft
...,...,...,...,...,...,...,...,...
20939,9/2-A-4,13,"4691,4696,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4691,4696,m
20940,9/2-A-4,14,"4696,4701,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4696,4701,m
20941,9/2-A-4,15,"4701,4704,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4701,4704,m
20942,9/3-1,1,"1798,1802,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921,1798,1802,m


In [64]:
df_core_photo.dtypes

Wellbore              object
Core sample number     int64
Core photo title      object
Core photo URL        object
NPDID wellbore         int64
Top depth             object
Base depth            object
Unit                  object
dtype: object

In [65]:
# Convert all top depths to metres

for index, row in df_core_photo.iterrows():
    if row['Unit'] == 'ft':
        df_core_photo.loc[index, 'Top depth'] = int(row['Top depth']) * 0.3048
    else:
        df_core_photo.loc[index, 'Top depth'] = int(row['Top depth'])

In [66]:
# Convert all base depths to metres

for index, row in df_core_photo.iterrows():
    if row['Unit'] == 'ft':
        df_core_photo.loc[index, 'Base depth'] = int(row['Base depth']) * 0.3048
    else:
        df_core_photo.loc[index, 'Base depth'] = int(row['Base depth'])

In [67]:
df_core_photo.round(2).head(3)

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3111.4,3117.49,ft
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,5860.69,3127.86,ft
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3127.86,3132.43,ft


In [68]:
# Add a new column with the filepath, e.g. '.\core_photo_jpgs\9_3-1\921_01_1798-1802m.jpg'
# Where . represents the current directory

df_core_photo['Folder'] = df_core_photo['Wellbore'].str.replace('/', '_')
df_core_photo['Legend'] = '.\\' + 'core_photo_jpgs\\' + df_core_photo['Folder'] + '\\'+ df_core_photo['Core photo URL'].str.split('/').str[-1]

df_core_photo

# If you only want file name use:
# df_core_photo['Legend'] = '.\\' + df_core_photo['Core photo URL'].str.split('/').str[-1]
# df_core_photo

Unnamed: 0,Wellbore,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit,Folder,Legend
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3111.4,3117.49,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_01_10208-10228ft.jpg
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,5860.69,3127.86,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_02_19228-10262ft.jpg
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3127.86,3132.43,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_03_10262-10277ft.jpg
3,1/2-1,4,"10277,10292,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3132.43,3137,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_04_10277-10292ft.jpg
4,1/2-1,5,"10292,10307,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3137,3141.57,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_05_10292-10307ft.jpg
...,...,...,...,...,...,...,...,...,...,...
20939,9/2-A-4,13,"4691,4696,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4691,4696,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_13_4691-4696m.jpg
20940,9/2-A-4,14,"4696,4701,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4696,4701,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_14_4696-4701m.jpg
20941,9/2-A-4,15,"4701,4704,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4701,4704,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_15_4701-4704m.jpg
20942,9/3-1,1,"1798,1802,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921,1798,1802,m,9_3-1,.\core_photo_jpgs\9_3-1\921_01_1798-1802m.jpg


In [69]:
# Rename columns
rename_cols = {'Wellbore' : 'Well'}

# Apply renaming
df_core_photo.rename(columns=rename_cols, inplace=True)
df_core_photo

Unnamed: 0,Well,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit,Folder,Legend
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3111.4,3117.49,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_01_10208-10228ft.jpg
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,5860.69,3127.86,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_02_19228-10262ft.jpg
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3127.86,3132.43,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_03_10262-10277ft.jpg
3,1/2-1,4,"10277,10292,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3132.43,3137,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_04_10277-10292ft.jpg
4,1/2-1,5,"10292,10307,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3137,3141.57,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_05_10292-10307ft.jpg
...,...,...,...,...,...,...,...,...,...,...
20939,9/2-A-4,13,"4691,4696,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4691,4696,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_13_4691-4696m.jpg
20940,9/2-A-4,14,"4696,4701,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4696,4701,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_14_4696-4701m.jpg
20941,9/2-A-4,15,"4701,4704,m",https://factpages.npd.no/pbl/core_photo_jpgs/2...,2828,4701,4704,m,9_2-A-4,.\core_photo_jpgs\9_2-A-4\2828_15_4701-4704m.jpg
20942,9/3-1,1,"1798,1802,m",https://factpages.npd.no/pbl/core_photo_jpgs/9...,921,1798,1802,m,9_3-1,.\core_photo_jpgs\9_3-1\921_01_1798-1802m.jpg


In [70]:
# Output file that includes URLs before going on to generate file for IC
output_to_csv(outname='wellbore_core_photo_withURL', df=df_core_photo)

Saved to: C:\ICData\Test3\output_data\wellbore_core_photo_withURL.csv


Unnamed: 0,Well,Core sample number,Core photo title,Core photo URL,NPDID wellbore,Top depth,Base depth,Unit,Folder,Legend
0,1/2-1,1,"10208,10228,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3111.3984,3117.4944,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_01_10208-10228ft.jpg
1,1/2-1,2,"19228,10262,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,5860.6944,3127.8576,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_02_19228-10262ft.jpg
2,1/2-1,3,"10262,10277,ft",https://factpages.npd.no/pbl/core_photo_jpgs/1...,1382,3127.8576,3132.4296,ft,1_2-1,.\core_photo_jpgs\1_2-1\1382_03_10262-10277ft.jpg


In [71]:
df_core_photo = df_core_photo[['Well', 'NPDID wellbore', 'Top depth', 'Base depth', 'Legend']]
df_core_photo.head(3)

Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend
0,1/2-1,1382,3111.4,3117.49,.\core_photo_jpgs\1_2-1\1382_01_10208-10228ft.jpg
1,1/2-1,1382,5860.69,3127.86,.\core_photo_jpgs\1_2-1\1382_02_19228-10262ft.jpg
2,1/2-1,1382,3127.86,3132.43,.\core_photo_jpgs\1_2-1\1382_03_10262-10277ft.jpg


In [72]:
# Output file
output_to_csv(outname='wellbore_core_photo', df=df_core_photo)

Saved to: C:\ICData\Test3\output_data\wellbore_core_photo.csv


Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend
0,1/2-1,1382,3111.3984,3117.4944,.\core_photo_jpgs\1_2-1\1382_01_10208-10228ft.jpg
1,1/2-1,1382,5860.6944,3127.8576,.\core_photo_jpgs\1_2-1\1382_02_19228-10262ft.jpg
2,1/2-1,1382,3127.8576,3132.4296,.\core_photo_jpgs\1_2-1\1382_03_10262-10277ft.jpg


<h3>Thin Section</h3>

In [73]:
if data_source == 'web':
    df_thin_section = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_thin_section&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')

if data_source == 'file':
    df_thin_section = pd.read_excel('input data/wellbore_thin_section.xlsx')

In [74]:
print(df_thin_section.shape)
df_thin_section.head(3)

(2136, 7)


Unnamed: 0,Wellbore,Number,Depth,Unit,NPDID wellbore,Date updated,Date sync NPD
0,1/3-1,117,11221.0,[ft ],154,2019-04-25,09.02.2020
1,1/3-1,118,11221.0,[ft ],154,2019-04-25,09.02.2020
2,1/3-5,121,4809.0,[m ],223,2018-12-20,09.02.2020


In [75]:
df_thin_section['Unit'].unique()

array(['[ft  ]', '[m   ]'], dtype=object)

In [76]:
df_thin_section.isna().sum()

Wellbore          0
Number            0
Depth             0
Unit              0
NPDID wellbore    0
Date updated      0
Date sync NPD     0
dtype: int64

In [77]:
for index, row in df_thin_section.iterrows():
    if row['Unit'] == '[ft  ]':
        df_thin_section.loc[index, 'Depth'] = row['Depth'] * 0.3048
    else:
        df_thin_section.loc[index, 'Depth'] = row['Depth']
        
df_thin_section.drop(columns='Unit', inplace=True)

In [78]:
df_thin_section.head(3)

Unnamed: 0,Wellbore,Number,Depth,NPDID wellbore,Date updated,Date sync NPD
0,1/3-1,117,3420.1608,154,2019-04-25,09.02.2020
1,1/3-1,118,3420.1608,154,2019-04-25,09.02.2020
2,1/3-5,121,4809.0,223,2018-12-20,09.02.2020


In [79]:
for index, row in df_thin_section.iterrows():
    df_thin_section.loc[index, 'Legend'] = 'Thin section no. ' + str(row['Number'])
    
df_thin_section.head(3)

Unnamed: 0,Wellbore,Number,Depth,NPDID wellbore,Date updated,Date sync NPD,Legend
0,1/3-1,117,3420.1608,154,2019-04-25,09.02.2020,Thin section no. 117
1,1/3-1,118,3420.1608,154,2019-04-25,09.02.2020,Thin section no. 118
2,1/3-5,121,4809.0,223,2018-12-20,09.02.2020,Thin section no. 121


In [80]:
# Rename columns
rename_cols = {'Wellbore' : 'Well'}
    
# Apply renaming
df_thin_section.rename(columns=rename_cols, inplace=True)
df_thin_section

Unnamed: 0,Well,Number,Depth,NPDID wellbore,Date updated,Date sync NPD,Legend
0,1/3-1,117,3420.1608,154,2019-04-25,09.02.2020,Thin section no. 117
1,1/3-1,118,3420.1608,154,2019-04-25,09.02.2020,Thin section no. 118
2,1/3-5,121,4809.0000,223,2018-12-20,09.02.2020,Thin section no. 121
3,1/3-5,122,4812.0000,223,2018-12-20,09.02.2020,Thin section no. 122
4,1/3-5,2319,4805.3500,223,2019-03-27,09.02.2020,Thin section no. 2319
...,...,...,...,...,...,...,...
2131,8/3-1,790,3010.0000,142,2018-12-20,09.02.2020,Thin section no. 790
2132,9/4-3,793,3820.0000,152,2018-12-20,09.02.2020,Thin section no. 793
2133,9/4-3,794,3840.0000,152,2018-12-20,09.02.2020,Thin section no. 794
2134,9/4-3,795,3860.0000,152,2018-12-20,09.02.2020,Thin section no. 795


In [81]:
df_thin_section = df_thin_section[['Well', 'NPDID wellbore', 'Depth', 'Legend']]
df_thin_section = df_thin_section.round(2)
df_thin_section

Unnamed: 0,Well,NPDID wellbore,Depth,Legend
0,1/3-1,154,3420.16,Thin section no. 117
1,1/3-1,154,3420.16,Thin section no. 118
2,1/3-5,223,4809.00,Thin section no. 121
3,1/3-5,223,4812.00,Thin section no. 122
4,1/3-5,223,4805.35,Thin section no. 2319
...,...,...,...,...
2131,8/3-1,142,3010.00,Thin section no. 790
2132,9/4-3,152,3820.00,Thin section no. 793
2133,9/4-3,152,3840.00,Thin section no. 794
2134,9/4-3,152,3860.00,Thin section no. 795


In [82]:
# Output file
output_to_csv(outname='wellbore_thin_section', df=df_thin_section)

Saved to: C:\ICData\Test3\output_data\wellbore_thin_section.csv


Unnamed: 0,Well,NPDID wellbore,Depth,Legend
0,1/3-1,154,3420.16,Thin section no. 117
1,1/3-1,154,3420.16,Thin section no. 118
2,1/3-5,223,4809.0,Thin section no. 121


In [83]:
# Point - comment
# No new IC data types

<h3>CO2</h3>

In [84]:
if data_source == 'web':
    df_co2 = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_co2&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en',
                     skiprows=[0])

if data_source == 'file':
    df_co2 = pd.read_excel('input data/wellbore_co2.xlsx', sheet_name='wellbore_co2', 
                           skiprows=[0])

In [85]:
print(df_co2.shape)
df_co2.head(3)

(94, 10)


Unnamed: 0.1,Unnamed: 0,Wellbore name,Sample sequence number,Sample top depth [m],Sample bottom depth [m],CO2 [vol %],Sample method,Sample type,NPDID wellbore,Date sync NPD
0,,15/3-7,1,4073.8,4610.1,18900,,MDT,4055,09.02.2020
1,,17/3-1,1,2406.1,2406.1,3020,,MDT,2576,09.02.2020
2,,2/7-22,1,4489.0,4496.0,4500,,DST,1495,09.02.2020


In [86]:
df_co2.drop(labels='Unnamed: 0', axis=1, inplace=True)
df_co2

Unnamed: 0,Wellbore name,Sample sequence number,Sample top depth [m],Sample bottom depth [m],CO2 [vol %],Sample method,Sample type,NPDID wellbore,Date sync NPD
0,15/3-7,1,4073.8,4610.1,18900,,MDT,4055,09.02.2020
1,17/3-1,1,2406.1,2406.1,3020,,MDT,2576,09.02.2020
2,2/7-22,1,4489.0,4496.0,4500,,DST,1495,09.02.2020
3,3/7-4,1,3440.0,3537.0,4620,,DST,1467,09.02.2020
4,30/3-4,1,3079.0,3096.0,5300,,DST,460,09.02.2020
5,30/3-7 B,1,5903.0,5903.0,4900,,MDT,3229,09.02.2020
6,34/11-4,1,4151.0,4151.0,4300,,MDT,3314,09.02.2020
7,34/11-4,1,4194.8,4194.8,4400,,MDT,3314,09.02.2020
8,34/7-12,1,2276.2,2282.5,3600,,DST,1187,09.02.2020
9,34/7-16 R,1,2821.0,2837.0,3900,,DST,1677,09.02.2020


In [87]:
df_co2.isna().sum()

Wellbore name               0
Sample sequence number      0
Sample top depth [m]        0
Sample bottom depth [m]     0
CO2 [vol %]                 0
Sample method              94
Sample type                 0
NPDID wellbore              0
Date sync NPD               0
dtype: int64

In [88]:
df_co2.drop(labels=['Sample method', 'Date sync NPD'], axis=1, inplace=True)

In [89]:
df_co2.columns

Index(['Wellbore name', 'Sample sequence number', 'Sample top depth [m]',
       'Sample bottom depth [m]', 'CO2 [vol %]', 'Sample type',
       'NPDID wellbore'],
      dtype='object')

In [90]:
# Rename columns
rename_cols = {'Wellbore name' : 'Well',
               'Sample top depth [m]' : 'Top depth',
               'Sample bottom depth [m]' : 'Base depth'
               }
    
# Apply renaming
df_co2.rename(columns=rename_cols, inplace=True)
df_co2

Unnamed: 0,Well,Sample sequence number,Top depth,Base depth,CO2 [vol %],Sample type,NPDID wellbore
0,15/3-7,1,4073.8,4610.1,18900,MDT,4055
1,17/3-1,1,2406.1,2406.1,3020,MDT,2576
2,2/7-22,1,4489.0,4496.0,4500,DST,1495
3,3/7-4,1,3440.0,3537.0,4620,DST,1467
4,30/3-4,1,3079.0,3096.0,5300,DST,460
5,30/3-7 B,1,5903.0,5903.0,4900,MDT,3229
6,34/11-4,1,4151.0,4151.0,4300,MDT,3314
7,34/11-4,1,4194.8,4194.8,4400,MDT,3314
8,34/7-12,1,2276.2,2282.5,3600,DST,1187
9,34/7-16 R,1,2821.0,2837.0,3900,DST,1677


In [91]:
# Output file
output_to_csv(outname='wellbore_co2', df=df_co2)

Saved to: C:\ICData\Test3\output_data\wellbore_co2.csv


Unnamed: 0,Well,Sample sequence number,Top depth,Base depth,CO2 [vol %],Sample type,NPDID wellbore
0,15/3-7,1,4073.8,4610.1,18900,MDT,4055
1,17/3-1,1,2406.1,2406.1,3020,MDT,2576
2,2/7-22,1,4489.0,4496.0,4500,DST,1495


<h3>Oil Samples</h3>

In [92]:
if data_source == 'web':
    df_oil_sample = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_oil_sample&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')

if data_source == 'file':
    df_oil_sample = pd.read_excel('input data/wellbore_oil_sample.xlsx')

In [93]:
print(df_oil_sample.shape)
df_oil_sample.head(3)

(1005, 11)


Unnamed: 0,Wellbore,Test type,Bottle number,Top depth MD [m],Bottom depth MD [m],Fluid type,Test time,Received date,NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,DST,DST1,3122.3,3137.0,,1989-05-27 06:00:00,1991-09-09,1382,2016-05-19,09.02.2020
1,1/3-10,DST,,0.0,0.0,,NaT,2011-08-06,5614,2016-05-19,09.02.2020
2,1/3-11,DST,,0.0,0.0,,NaT,NaT,5806,2016-05-19,09.02.2020


In [94]:
df_oil_sample.isna().sum()

Wellbore                 0
Test type                0
Bottle number            0
Top depth MD [m]         0
Bottom depth MD [m]      0
Fluid type               0
Test time              196
Received date           62
NPDID wellbore           0
Date updated             0
Date sync NPD            0
dtype: int64

In [95]:
df_oil_sample.drop(labels=['Date updated', 'Date sync NPD'], axis=1, inplace=True)

In [96]:
df_oil_sample.columns

Index(['Wellbore', 'Test type', 'Bottle number', 'Top depth MD [m]',
       'Bottom depth MD [m]', 'Fluid type', 'Test time ', 'Received date',
       'NPDID wellbore'],
      dtype='object')

In [97]:
# Rename columns
rename_cols = {'Wellbore' : 'Well',
               'Top depth MD [m]' : 'Top depth',
               'Bottom depth MD [m]' : 'Base depth'
               }
    
# Apply renaming
df_oil_sample.rename(columns=rename_cols, inplace=True)
df_oil_sample

Unnamed: 0,Well,Test type,Bottle number,Top depth,Base depth,Fluid type,Test time,Received date,NPDID wellbore
0,1/2-1,DST,DST1,3122.30,3137.00,,1989-05-27 06:00:00,1991-09-09,1382
1,1/3-10,DST,,0.00,0.00,,NaT,2011-08-06,5614
2,1/3-11,DST,,0.00,0.00,,NaT,NaT,5806
3,1/3-11,MDT,,3294.50,0.00,OIL,NaT,2014-04-03,5806
4,1/3-3,DST,DST3A,4202.00,4208.00,WATER,1983-03-06 00:00:00,1991-10-28,87
...,...,...,...,...,...,...,...,...,...
1000,7324/7-3 S,DST,,2232.49,1771.03,OIL,NaT,2016-08-11,7875
1001,7324/8-1,MDT,,664.00,0.00,OIL,NaT,2015-07-20,7221
1002,7324/8-1,MDT,,678.00,0.00,OIL,NaT,2015-07-20,7221
1003,9/2-1,DST,TEST3,3177.00,3210.00,,1987-04-20 00:00:00,1991-10-01,1038


In [98]:
df_oil_sample

Unnamed: 0,Well,Test type,Bottle number,Top depth,Base depth,Fluid type,Test time,Received date,NPDID wellbore
0,1/2-1,DST,DST1,3122.30,3137.00,,1989-05-27 06:00:00,1991-09-09,1382
1,1/3-10,DST,,0.00,0.00,,NaT,2011-08-06,5614
2,1/3-11,DST,,0.00,0.00,,NaT,NaT,5806
3,1/3-11,MDT,,3294.50,0.00,OIL,NaT,2014-04-03,5806
4,1/3-3,DST,DST3A,4202.00,4208.00,WATER,1983-03-06 00:00:00,1991-10-28,87
...,...,...,...,...,...,...,...,...,...
1000,7324/7-3 S,DST,,2232.49,1771.03,OIL,NaT,2016-08-11,7875
1001,7324/8-1,MDT,,664.00,0.00,OIL,NaT,2015-07-20,7221
1002,7324/8-1,MDT,,678.00,0.00,OIL,NaT,2015-07-20,7221
1003,9/2-1,DST,TEST3,3177.00,3210.00,,1987-04-20 00:00:00,1991-10-01,1038


In [99]:
# Output file
output_to_csv(outname='wellbore_oil_sample', df=df_oil_sample)

Saved to: C:\ICData\Test3\output_data\wellbore_oil_sample.csv


Unnamed: 0,Well,Test type,Bottle number,Top depth,Base depth,Fluid type,Test time,Received date,NPDID wellbore
0,1/2-1,DST,DST1,3122.3,3137.0,,1989-05-27 06:00:00,1991-09-09 00:00:00,1382
1,1/3-10,DST,,0.0,0.0,,,2011-08-06 00:00:00,5614
2,1/3-11,DST,,0.0,0.0,,,,5806


In [100]:
df_oil_sample.columns

# What to do about rows with only with 0/Nan values?

Index(['Well', 'Test type', 'Bottle number', 'Top depth', 'Base depth',
       'Fluid type', 'Test time ', 'Received date', 'NPDID wellbore'],
      dtype='object')

<h3>Lithostratigraphy</h3>

In [101]:
# Lithostrat available in two places. 
# Compare the length, and number of unique wells in both sources.

# (A) NPD FactPages > Wellbore > Table View > With > Lithostratigraphy
    # File: wellbore_formation_top.xlsx
    # Sheet: wellbore_formation_top
    # Link: https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_formation_top&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en

df_a = pd.read_excel('input data/wellbore_formation_top.xlsx', sheet_name='wellbore_formation_top')
print('Source A:', df_a.shape)
print(df_a['Wellbore name'].nunique())

# (B) NPD FactPages > Stratigraphy > Table View > Wellbores
    # File: strat_litho_wellbore.xlsx
    # Sheet: strat_litho_wellbore
    # Link: https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/strat_litho_wellbore&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en

df_b = pd.read_excel('input data/strat_litho_wellbore.xlsx', sheet_name='strat_litho_wellbore')
print('Source B:', df_b.shape)
print(df_b['Wellbore name'].nunique())

# Both contain the same number of rows.
# Source A is preferrable as it has an exra column, 'Lithostrat. unit, parent'
# which will come in handy assigning parents to each text dictionary entry.

Source A: (36035, 11)
1777
Source B: (36037, 10)
1778


In [102]:
# Use Source A

if data_source == 'web':
    df_lithostrat = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_formation_top&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')
    
if data_source == 'file':
    df_lithostrat = pd.read_excel('input data/wellbore_formation_top.xlsx')

# Print column titles
print("Lithostratigraphy wellbore well header column titles:")
print(list(df_lithostrat.columns))

Lithostratigraphy wellbore well header column titles:
['Wellbore name', 'Top depth [m]', 'Bottom depth [m]', 'Lithostrat. unit', 'Level', 'Lithostrat. unit, parent', 'NPDID wellbore', 'NPDID lithostrat. unit', 'NPDID parent lithostrat. unit', 'Date updated', 'Date sync NPD']


In [103]:
(num_lithostrat_rows, num_lithostrat_cols) = df_lithostrat.shape
print('{} rows and {} columns in Exploration wells.'.format(num_lithostrat_rows, num_lithostrat_cols))

36035 rows and 11 columns in Exploration wells.


In [104]:
df_lithostrat.head(3)

Unnamed: 0,Wellbore name,Top depth [m],Bottom depth [m],Lithostrat. unit,Level,"Lithostrat. unit, parent",NPDID wellbore,NPDID lithostrat. unit,NPDID parent lithostrat. unit,Date updated,Date sync NPD
0,9/8-1,1597.0,1625.0,BLODØKS FM,FORMATION,SHETLAND GP,145,13,143.0,2019-10-03,09.02.2020
1,9/8-1,1922.0,2109.0,SANDNES FM,FORMATION,VESTLAND GP,145,139,186.0,2019-10-03,09.02.2020
2,9/8-1,1275.0,1625.0,SHETLAND GP,GROUP,,145,143,,2019-10-03,09.02.2020


In [105]:
df_lithostrat.tail()

Unnamed: 0,Wellbore name,Top depth [m],Bottom depth [m],Lithostrat. unit,Level,"Lithostrat. unit, parent",NPDID wellbore,NPDID lithostrat. unit,NPDID parent lithostrat. unit,Date updated,Date sync NPD
36030,1/2-1,3335.0,3407.0,MAUREEN FM,FORMATION,ROGALAND GP,1382,102,131.0,2019-10-03,09.02.2020
36031,1/2-1,94.0,1777.0,NORDLAND GP,GROUP,,1382,113,,2019-10-03,09.02.2020
36032,1/2-1,3059.0,3407.0,ROGALAND GP,GROUP,,1382,131,,2019-10-03,09.02.2020
36033,1/2-1,3407.0,3574.0,SHETLAND GP,GROUP,,1382,143,,2019-10-03,09.02.2020
36034,1/2-1,3514.0,3574.0,TOR FM,FORMATION,SHETLAND GP,1382,171,143.0,2019-10-03,09.02.2020


In [106]:
#Rename columns for csv
rename_stratcols = {'Wellbore name' : 'Well',
                    'Top depth [m]' : 'Top depth',
                    'Bottom depth [m]' : 'Base depth',
                    'Lithostrat. unit' : 'Legend'
                    }

#Apply renaming to dataframe
df_lithostrat.rename(columns=rename_stratcols, inplace=True)

# Create new dataframe called "df_formation_top"
# Need to keep other columns df_lithostrat for later when writing to database

df_formation_top = df_lithostrat[['Well', 'NPDID wellbore', 'Top depth', 'Base depth', 'Legend', 'Level']]
df_formation_top.head(3)

Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend,Level
0,9/8-1,145,1597.0,1625.0,BLODØKS FM,FORMATION
1,9/8-1,145,1922.0,2109.0,SANDNES FM,FORMATION
2,9/8-1,145,1275.0,1625.0,SHETLAND GP,GROUP


In [107]:
# Output file
output_to_csv(outname='wellbore_formation_top', df=df_formation_top)

Saved to: C:\ICData\Test3\output_data\wellbore_formation_top.csv


Unnamed: 0,Well,NPDID wellbore,Top depth,Base depth,Legend,Level
0,9/8-1,145,1597.0,1625.0,BLODØKS FM,FORMATION
1,9/8-1,145,1922.0,2109.0,SANDNES FM,FORMATION
2,9/8-1,145,1275.0,1625.0,SHETLAND GP,GROUP


<h3>Drill stem tests</h3>

In [108]:
if data_source == 'web':
    df_dst = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_dst&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')

if data_source == 'file':
    df_dst = pd.read_excel('input data/wellbore_dst.xlsx')

In [109]:
print(df_dst.shape)
df_dst.head(20)

(1167, 17)


Unnamed: 0,Wellbore,Test number,From depth MD [m],To depth MD [m],Choke size [mm],Final shut-in pressure [MPa],Final flow pressure [MPa],Bottom hole pressure [MPa],Oil [Sm3/day],Gas [Sm3/day],Oil density [g/cm3],Gas grav. rel.air,GOR [m3/m3],Downhole temperature [°C],NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,1.0,3122.3,3137.0,25.4,0.0,0.0,0.0,859,57000,0.81,0.0,66,0,1382,2016-05-19,09.02.2020
1,1/3-1,1.0,4585.0,4602.0,12.5,80.0,61.0,0.0,0,1132,0.0,0.0,0,0,154,2016-05-19,09.02.2020
2,1/3-1,2.0,4565.0,4602.0,12.5,52.0,40.0,0.0,0,6626,0.0,0.0,0,0,154,2016-05-19,09.02.2020
3,1/3-1,3.0,3356.0,3361.0,12.5,70.0,50.0,0.0,0,28317,0.0,0.0,0,0,154,2016-05-19,09.02.2020
4,1/3-10,1.0,3158.0,3288.0,19.0,9.0,0.0,38.0,457,212453,0.791,0.855,465,0,5614,2016-05-19,09.02.2020
5,1/3-3,1.0,4529.0,4552.0,25.4,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87,2016-05-19,09.02.2020
6,1/3-3,2.0,4233.0,4240.0,9.5,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87,2016-05-19,09.02.2020
7,1/3-3,3.1,4202.0,4208.0,50.8,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87,2016-05-19,09.02.2020
8,1/3-3,3.3,4211.0,4214.0,6.4,0.0,0.0,0.0,143,28000,0.829,0.82,196,165,87,2016-05-19,09.02.2020
9,1/3-6,1.0,2960.5,2977.0,17.5,0.0,3.0,0.0,78,93300,0.78,0.78,1196,107,1521,2016-05-19,09.02.2020


In [110]:
df_dst.isna().sum()

Wellbore                        0
Test number                     0
From depth MD [m]               0
To depth MD [m]                 0
Choke size [mm]                 0
Final shut-in pressure [MPa]    0
Final flow pressure [MPa]       0
Bottom hole pressure [MPa]      0
Oil [Sm3/day]                   0
Gas [Sm3/day]                   0
Oil density [g/cm3]             0
Gas grav. rel.air               0
GOR [m3/m3]                     0
Downhole temperature [°C]       0
NPDID wellbore                  0
Date updated                    0
Date sync NPD                   0
dtype: int64

In [111]:
df_dst.drop(labels=['Date updated', 'Date sync NPD'], axis=1, inplace=True)

In [112]:
df_dst.columns

Index(['Wellbore', 'Test number', 'From depth MD [m]', 'To depth MD [m]',
       'Choke size [mm]', 'Final shut-in pressure [MPa]',
       'Final flow pressure [MPa]', 'Bottom hole pressure [MPa]',
       'Oil [Sm3/day]', 'Gas [Sm3/day]', 'Oil density [g/cm3]',
       'Gas grav. rel.air', 'GOR [m3/m3]', 'Downhole temperature [°C]',
       'NPDID wellbore'],
      dtype='object')

In [113]:
#Rename well header columns to match dbo.WELLS (does capitalisation matter?)
rename_cols = {'Wellbore' : 'Well',
               'From depth MD [m]' : 'Top depth',
               'To depth MD [m]' : 'Base depth'
               }
    
#Apply renaming
df_dst.rename(columns=rename_cols, inplace=True)
df_dst.head(20)

Unnamed: 0,Well,Test number,Top depth,Base depth,Choke size [mm],Final shut-in pressure [MPa],Final flow pressure [MPa],Bottom hole pressure [MPa],Oil [Sm3/day],Gas [Sm3/day],Oil density [g/cm3],Gas grav. rel.air,GOR [m3/m3],Downhole temperature [°C],NPDID wellbore
0,1/2-1,1.0,3122.3,3137.0,25.4,0.0,0.0,0.0,859,57000,0.81,0.0,66,0,1382
1,1/3-1,1.0,4585.0,4602.0,12.5,80.0,61.0,0.0,0,1132,0.0,0.0,0,0,154
2,1/3-1,2.0,4565.0,4602.0,12.5,52.0,40.0,0.0,0,6626,0.0,0.0,0,0,154
3,1/3-1,3.0,3356.0,3361.0,12.5,70.0,50.0,0.0,0,28317,0.0,0.0,0,0,154
4,1/3-10,1.0,3158.0,3288.0,19.0,9.0,0.0,38.0,457,212453,0.791,0.855,465,0,5614
5,1/3-3,1.0,4529.0,4552.0,25.4,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87
6,1/3-3,2.0,4233.0,4240.0,9.5,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87
7,1/3-3,3.1,4202.0,4208.0,50.8,0.0,0.0,0.0,0,0,0.0,0.0,0,0,87
8,1/3-3,3.3,4211.0,4214.0,6.4,0.0,0.0,0.0,143,28000,0.829,0.82,196,165,87
9,1/3-6,1.0,2960.5,2977.0,17.5,0.0,3.0,0.0,78,93300,0.78,0.78,1196,107,1521


In [114]:
# Output file
output_to_csv(outname='wellbore_dst', df=df_dst)

Saved to: C:\ICData\Test3\output_data\wellbore_dst.csv


Unnamed: 0,Well,Test number,Top depth,Base depth,Choke size [mm],Final shut-in pressure [MPa],Final flow pressure [MPa],Bottom hole pressure [MPa],Oil [Sm3/day],Gas [Sm3/day],Oil density [g/cm3],Gas grav. rel.air,GOR [m3/m3],Downhole temperature [°C],NPDID wellbore
0,1/2-1,1.0,3122.3,3137.0,25.4,0.0,0.0,0.0,859,57000,0.81,0.0,66,0,1382
1,1/3-1,1.0,4585.0,4602.0,12.5,80.0,61.0,0.0,0,1132,0.0,0.0,0,0,154
2,1/3-1,2.0,4565.0,4602.0,12.5,52.0,40.0,0.0,0,6626,0.0,0.0,0,0,154


<h3>Casing and leak-off tests</h3>

In [115]:
if data_source == 'web':
    df_casinglot = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_casing_and_lot&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')

if data_source == 'file':
    df_casinglot = pd.read_excel('input data/wellbore_casing_and_lot.xlsx')

In [116]:
print(df_casinglot.shape)
df_casinglot.head(3)

(7561, 11)


Unnamed: 0,Wellbore,Casing type,Casing diam. [inch],Casing depth [m],Hole diam. [inch],Hole depth[m],LOT mud eqv. [g/cm3],Formation test type,NPDID wellbore,Date updated,Date sync NPD
0,6507/6-4 S,CONDUCTOR,30,30.0,36,0.0,0.0,LOT,6725,2017-04-11,09.02.2020
1,6406/9-3,CONDUCTOR,36,45.0,42,0.0,0.0,,7141,2016-10-10,09.02.2020
2,2/5-4,CONDUCTOR,30,45.0,36,46.0,0.0,LOT,259,2017-04-11,09.02.2020


In [117]:
df_casinglot.isna().sum()

Wellbore                0
Casing type             0
Casing diam. [inch]     0
Casing depth [m]        0
Hole diam. [inch]       0
Hole depth[m]           0
LOT mud eqv. [g/cm3]    0
Formation test type     1
NPDID wellbore          0
Date updated            0
Date sync NPD           0
dtype: int64

In [118]:
df_casinglot.drop(labels=['Date updated', 'Date sync NPD'], axis=1, inplace=True)

In [119]:
df_casinglot.columns

Index(['Wellbore', 'Casing type', 'Casing diam. [inch]', 'Casing depth [m]',
       'Hole diam. [inch]', 'Hole depth[m]', 'LOT mud eqv. [g/cm3]',
       'Formation test type', 'NPDID wellbore'],
      dtype='object')

In [120]:
# Rename columns
rename_cols = {'Wellbore' : 'Well',
               'Casing depth [m]' : 'Depth'
               }
    
# Apply renaming
df_casinglot.rename(columns=rename_cols, inplace=True)
df_casinglot

Unnamed: 0,Well,Casing type,Casing diam. [inch],Depth,Hole diam. [inch],Hole depth[m],LOT mud eqv. [g/cm3],Formation test type,NPDID wellbore
0,6507/6-4 S,CONDUCTOR,30,30.0,36,0.0,0.0,LOT,6725
1,6406/9-3,CONDUCTOR,36,45.0,42,0.0,0.0,,7141
2,2/5-4,CONDUCTOR,30,45.0,36,46.0,0.0,LOT,259
3,7/9-1,CONDUCTOR,30,108.0,36,110.0,0.0,LOT,191
4,8/12-1,CONDUCTOR,30,114.0,36,114.0,0.0,LOT,193
...,...,...,...,...,...,...,...,...,...
7556,34/10-48 S,OPEN HOLE,,7393.0,8 1/2,7393.0,0.0,LOT,4902
7557,34/10-45 S,LINER,7,7594.0,8 1/2,7594.0,0.0,LOT,4450
7558,34/10-46 B,LINER,7,7725.0,8 1/2,7725.0,0.0,LOT,4527
7559,34/10-55 S,OPEN HOLE,,7811.0,8 1/2,7811.0,0.0,,8102


In [121]:
# Output file
output_to_csv(outname='wellbore_casing_and_lot', df=df_casinglot)

Saved to: C:\ICData\Test3\output_data\wellbore_casing_and_lot.csv


Unnamed: 0,Well,Casing type,Casing diam. [inch],Depth,Hole diam. [inch],Hole depth[m],LOT mud eqv. [g/cm3],Formation test type,NPDID wellbore
0,6507/6-4 S,CONDUCTOR,30,30.0,36,0.0,0.0,LOT,6725
1,6406/9-3,CONDUCTOR,36,45.0,42,0.0,0.0,,7141
2,2/5-4,CONDUCTOR,30,45.0,36,46.0,0.0,LOT,259


<h3>Drilling mud</h3>

In [122]:
if data_source == 'web':
    df_mud = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_mud&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.189&CultureCode=en')
    
if data_source == 'file':
    df_mud = pd.read_excel('input data/wellbore_mud.xlsx')

In [123]:
print(df_mud.shape)
df_mud.head(3)

(33843, 11)


Unnamed: 0.1,Unnamed: 0,Wellbore,Depth MD [m],Mud weight [g/cm3],Visc. [mPa.s],Yield point [Pa],Mud type,Date measured,NPDID wellbore,Date updated,Date sync NPD
0,,1/2-1,171.0,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020
1,,1/2-1,279.2,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020
2,,1/2-1,545.6,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020


In [124]:
# Drop blank first column
df_mud.drop(labels='Unnamed: 0', axis=1, inplace=True)
df_mud.head(3)

Unnamed: 0,Wellbore,Depth MD [m],Mud weight [g/cm3],Visc. [mPa.s],Yield point [Pa],Mud type,Date measured,NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,171.0,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020
1,1/2-1,279.2,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020
2,1/2-1,545.6,1.04,0.0,0.0,WATER BASED,1989-03-29,1382,2017-04-11,09.02.2020


In [125]:
df_mud.isna().sum()

Wellbore                  0
Depth MD [m]              0
Mud weight [g/cm3]        0
Visc. [mPa.s]             0
Yield point [Pa]          0
Mud type                  1
Date measured         15684
NPDID wellbore            0
Date updated              0
Date sync NPD             0
dtype: int64

In [126]:
df_mud.drop(labels=['Date updated', 'Date sync NPD'], axis=1, inplace=True)

In [127]:
df_mud.columns

Index(['Wellbore', 'Depth MD [m]', 'Mud weight [g/cm3]', 'Visc. [mPa.s]',
       'Yield point [Pa]', 'Mud type', 'Date measured', 'NPDID wellbore'],
      dtype='object')

In [128]:
# Rename columns
rename_cols = {'Wellbore' : 'Well',
               'Depth MD [m]' : 'Depth'
               }
    
# Apply renaming
df_mud.rename(columns=rename_cols, inplace=True)
df_mud

Unnamed: 0,Well,Depth,Mud weight [g/cm3],Visc. [mPa.s],Yield point [Pa],Mud type,Date measured,NPDID wellbore
0,1/2-1,171.0,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
1,1/2-1,279.2,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
2,1/2-1,545.6,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
3,1/2-1,563.9,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
4,1/2-1,648.3,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
...,...,...,...,...,...,...,...,...
33838,9/4-5,5881.1,0.00,0.0,0.0,DUMMY,NaT,5134
33839,9/8-1,137.0,0.00,0.0,0.0,seawater,NaT,145
33840,9/8-1,360.0,1.11,0.0,0.0,seawater,NaT,145
33841,9/8-1,1219.0,1.13,0.0,0.0,waterbased,NaT,145


In [129]:
# Output file
output_to_csv(outname='wellbore_mud', df=df_mud)

Saved to: C:\ICData\Test3\output_data\wellbore_mud.csv


Unnamed: 0,Well,Depth,Mud weight [g/cm3],Visc. [mPa.s],Yield point [Pa],Mud type,Date measured,NPDID wellbore
0,1/2-1,171.0,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
1,1/2-1,279.2,1.04,0.0,0.0,WATER BASED,1989-03-29,1382
2,1/2-1,545.6,1.04,0.0,0.0,WATER BASED,1989-03-29,1382


<h3>Documents</h3>

In [130]:
if data_source == 'web':
    df_document = pd.read_excel('https://factpages.npd.no/ReportServer_npdpublic?/FactPages/TableView/wellbore_document&rs:Command=Render&rc:Toolbar=false&rc:Parameters=f&rs:Format=EXCEL&Top100=false&IpAddress=108.171.128.188&CultureCode=en')
    
if data_source == 'file':
    df_document = pd.read_excel('input data/wellbore_document.xlsx')

df_document

Unnamed: 0,Wellbore,Document type,Document name,Document URL,Document format,Document size [KB],NPDID wellbore,Date updated,Date sync NPD
0,1/2-1,GEOCHEMICAL INFORMATION,1382_1,https://factpages.npd.no/pbl/geochemical_pdfs/...,pdf,174,1382,2019-04-25,08.02.2020
1,1/2-1,GEOCHEMICAL INFORMATION,1382_2,https://factpages.npd.no/pbl/geochemical_pdfs/...,pdf,3408,1382,2019-04-25,08.02.2020
2,1/2-1,OLD NPD WDSS,1382_01_WDSS_General_Information,https://factpages.npd.no/pbl/wdss_old/1382_01_...,pdf,258,1382,2019-04-25,08.02.2020
3,1/2-1,OLD NPD WDSS,1382_02_WDSS_completion_log,https://factpages.npd.no/pbl/wdss_old/1382_02_...,pdf,168,1382,2019-04-25,08.02.2020
4,1/2-1,PRESSURE PLOT,1382_Formation_pressure_(Formasjonstrykk),https://factpages.npd.no/pbl/wellbore_pressure...,pdf,206,1382,2019-05-02,08.02.2020
...,...,...,...,...,...,...,...,...,...
8545,9/8-1,NPD PAPER,145_01_NPD_Paper_No.5_Lithology__Well_9_8_1,https://factpages.npd.no/pbl/NPD_papers/145_01...,pdf,10850,145,2019-04-25,08.02.2020
8546,9/8-1,NPD PAPER,145_02_NPD_Paper_No.5_Interpreted_Lithology_lo...,https://factpages.npd.no/pbl/NPD_papers/145_02...,pdf,40257,145,2019-04-25,08.02.2020
8547,9/8-1,OLD NPD WDSS,145_01_WDSS_General_Information,https://factpages.npd.no/pbl/wdss_old/145_01_W...,pdf,192,145,2019-04-25,08.02.2020
8548,9/8-1,PRESSURE PLOT,145_Formation_pressure_(Formasjonstrykk),https://factpages.npd.no/pbl/wellbore_pressure...,pdf,217,145,2019-05-02,08.02.2020


In [131]:
df_document['Title'] = df_document['Wellbore'] + ' ' + df_document['Document type'] + ': ' + df_document['Document name'] + ' (' + df_document['Document format'] + ')'
df_document[['Wellbore', 'Title']].tail(10)

Unnamed: 0,Wellbore,Title
8540,9/4-5,9/4-5 GEOCHEMICAL INFORMATION: 5134_02_9_4_5_g...
8541,9/4-5,9/4-5 GEOCHEMICAL INFORMATION: 5134_1 (pdf)
8542,9/8-1,9/8-1 COMPOSITE LOG: 145 (pdf)
8543,9/8-1,9/8-1 GEOCHEMICAL INFORMATION: 145_1 (pdf)
8544,9/8-1,9/8-1 GEOCHEMICAL INFORMATION: 145_2 (pdf)
8545,9/8-1,9/8-1 NPD PAPER: 145_01_NPD_Paper_No.5_Litholo...
8546,9/8-1,9/8-1 NPD PAPER: 145_02_NPD_Paper_No.5_Interpr...
8547,9/8-1,9/8-1 OLD NPD WDSS: 145_01_WDSS_General_Inform...
8548,9/8-1,9/8-1 PRESSURE PLOT: 145_Formation_pressure_(F...
8549,9/8-1,9/8-1 REPORTED BY LICENSEE: 145_01_Completion_...


In [132]:
df_document = df_document[['Wellbore', 'NPDID wellbore', 'Title', 'Document URL']]
df_document

Unnamed: 0,Wellbore,NPDID wellbore,Title,Document URL
0,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_1 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
1,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_2 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
2,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_01_WDSS_General_Infor...,https://factpages.npd.no/pbl/wdss_old/1382_01_...
3,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_02_WDSS_completion_lo...,https://factpages.npd.no/pbl/wdss_old/1382_02_...
4,1/2-1,1382,1/2-1 PRESSURE PLOT: 1382_Formation_pressure_(...,https://factpages.npd.no/pbl/wellbore_pressure...
...,...,...,...,...
8545,9/8-1,145,9/8-1 NPD PAPER: 145_01_NPD_Paper_No.5_Litholo...,https://factpages.npd.no/pbl/NPD_papers/145_01...
8546,9/8-1,145,9/8-1 NPD PAPER: 145_02_NPD_Paper_No.5_Interpr...,https://factpages.npd.no/pbl/NPD_papers/145_02...
8547,9/8-1,145,9/8-1 OLD NPD WDSS: 145_01_WDSS_General_Inform...,https://factpages.npd.no/pbl/wdss_old/145_01_W...
8548,9/8-1,145,9/8-1 PRESSURE PLOT: 145_Formation_pressure_(F...,https://factpages.npd.no/pbl/wellbore_pressure...


In [133]:
# Output file
output_to_csv(outname='wellbore_document', df=df_document)

Saved to: C:\ICData\Test3\output_data\wellbore_document.csv


Unnamed: 0,Wellbore,NPDID wellbore,Title,Document URL
0,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_1 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
1,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_2 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
2,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_01_WDSS_General_Infor...,https://factpages.npd.no/pbl/wdss_old/1382_01_...


<h3>References and Documents combined</h3>

In [134]:
# Rename columns in both df_document and df_explo_references before merge

# Rename columns in df_document
rename_cols = {'Wellbore' : 'Well',
               'Document URL' : 'URL'}
    
#Apply renaming
df_document.rename(columns=rename_cols, inplace=True)
df_document

A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  errors=errors,


Unnamed: 0,Well,NPDID wellbore,Title,URL
0,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_1 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
1,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_2 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
2,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_01_WDSS_General_Infor...,https://factpages.npd.no/pbl/wdss_old/1382_01_...
3,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_02_WDSS_completion_lo...,https://factpages.npd.no/pbl/wdss_old/1382_02_...
4,1/2-1,1382,1/2-1 PRESSURE PLOT: 1382_Formation_pressure_(...,https://factpages.npd.no/pbl/wellbore_pressure...
...,...,...,...,...
8545,9/8-1,145,9/8-1 NPD PAPER: 145_01_NPD_Paper_No.5_Litholo...,https://factpages.npd.no/pbl/NPD_papers/145_01...
8546,9/8-1,145,9/8-1 NPD PAPER: 145_02_NPD_Paper_No.5_Interpr...,https://factpages.npd.no/pbl/NPD_papers/145_02...
8547,9/8-1,145,9/8-1 OLD NPD WDSS: 145_01_WDSS_General_Inform...,https://factpages.npd.no/pbl/wdss_old/145_01_W...
8548,9/8-1,145,9/8-1 PRESSURE PLOT: 145_Formation_pressure_(F...,https://factpages.npd.no/pbl/wellbore_pressure...


In [135]:
# Rename columns in df_explo_references
rename_cols = {'Name' : 'Well'}
    
#Apply renaming
df_explo_references.rename(columns=rename_cols, inplace=True)
df_explo_references.head(3)

Unnamed: 0,Well,NPDID wellbore,Title,URL
3860,1/2-1,1382,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...
1930,1/2-1,1382,FactPage URL,https://factpages.npd.no/factpages/default.asp...
3861,1/2-2,5192,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...


In [136]:
# Combine References and Documents dataframes
df_refs_and_docs = df_explo_references.append(df_document) 
df_refs_and_docs.sort_values(['Well', 'Title'], ascending=[True, False], ignore_index=True, inplace=True)
df_refs_and_docs.head(20)

Unnamed: 0,Well,NPDID wellbore,Title,URL
0,1/2-1,1382,FactPage URL,https://factpages.npd.no/factpages/default.asp...
1,1/2-1,1382,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...
2,1/2-1,1382,1/2-1 REPORTED BY LICENSEE: 1382_1_2_1_COMPLET...,https://factpages.npd.no/pbl/wellbore_document...
3,1/2-1,1382,1/2-1 REPORTED BY LICENSEE: 1382_1_2_1_COMPLET...,https://factpages.npd.no/pbl/wellbore_document...
4,1/2-1,1382,1/2-1 PRESSURE PLOT: 1382_Formation_pressure_(...,https://factpages.npd.no/pbl/wellbore_pressure...
5,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_02_WDSS_completion_lo...,https://factpages.npd.no/pbl/wdss_old/1382_02_...
6,1/2-1,1382,1/2-1 OLD NPD WDSS: 1382_01_WDSS_General_Infor...,https://factpages.npd.no/pbl/wdss_old/1382_01_...
7,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_2 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
8,1/2-1,1382,1/2-1 GEOCHEMICAL INFORMATION: 1382_1 (pdf),https://factpages.npd.no/pbl/geochemical_pdfs/...
9,1/2-2,5192,Press Release URL,https://www.npd.no/fakta/nyheter/Resultat-av-l...


In [137]:
# Output file
output_to_csv(outname='wellbore_references_and_documents', df=df_refs_and_docs)

Saved to: C:\ICData\Test3\output_data\wellbore_references_and_documents.csv


Unnamed: 0,Well,NPDID wellbore,Title,URL
0,1/2-1,1382,FactPage URL,https://factpages.npd.no/factpages/default.asp...
1,1/2-1,1382,FactMaps URL,https://factmaps.npd.no/factmaps/3_0/?run=Well...
2,1/2-1,1382,1/2-1 REPORTED BY LICENSEE: 1382_1_2_1_COMPLET...,https://factpages.npd.no/pbl/wellbore_document...


# Overview

## Data summary

In [138]:
df_summary = pd.DataFrame({'Data type':
                        ['Exploration well header',
                         'Development well header',
                         'Exploration reference',
                         'Development reference',
                         'Core',
                         'Core photo',
                         'Thin section',
                         'CO2',
                         'Oil sample',
                         'Lithostratigraphy',
                         'Drill stem test',
                         'Casing and leak-off test',
                         'Drilling mud',
                         'Document',
                         'Document & Reference combined'
                        ], 
                        'No. unique wells':
                        [df_explo['Name'].nunique(),
                         df_dev['Name'].nunique(),
                         df_explo_references['Well'].nunique(),
                         df_dev_references['Name'].nunique(),
                         df_core['Well'].nunique(),
                         df_core_photo['Well'].nunique(),
                         df_thin_section['Well'].nunique(),
                         df_co2['Well'].nunique(),
                         df_oil_sample['Well'].nunique(),
                         df_formation_top['Well'].nunique(),
                         df_dst['Well'].nunique(),
                         df_casinglot['Well'].nunique(),
                         df_mud['Well'].nunique(),
                         df_document['Well'].nunique(),
                         df_refs_and_docs['Well'].nunique()
                        ],
                        'No. records':
                        [df_explo['Name'].shape[0],
                         df_dev['Name'].shape[0],
                         df_explo_references['Well'].shape[0],
                         df_dev_references['Name'].shape[0],
                         df_core['Well'].shape[0],
                         df_core_photo['Well'].shape[0],
                         df_thin_section['Well'].shape[0],
                         df_co2['Well'].shape[0],
                         df_oil_sample['Well'].shape[0],
                         df_formation_top['Well'].shape[0],
                         df_dst['Well'].shape[0],
                         df_casinglot['Well'].shape[0],
                         df_mud['Well'].shape[0],
                         df_document['Well'].shape[0],
                         df_refs_and_docs['Well'].shape[0]]
                       })

print('Data for', (df_explo['Name'].nunique()+df_dev['Name'].nunique()), 'wells in total.')

#Add thousands separators
df_summary['No. unique wells'] = df_summary['No. unique wells'].apply(lambda x : "{:,}".format(x))
df_summary['No. records'] = df_summary['No. records'].apply(lambda x : "{:,}".format(x))
df_summary = df_summary.style.hide_index()
df_summary

Data for 7082 wells in total.


Data type,No. unique wells,No. records
Exploration well header,1930,1930
Development well header,5152,5152
Exploration reference,1930,4594
Development reference,5152,10304
Core,1691,8141
Core photo,814,20849
Thin section,200,2136
CO2,52,94
Oil sample,598,1005
Lithostratigraphy,1777,36035


## Dataframes in project

In [139]:
# Show all the dataframes used in this notebook
%whos DataFrame

Variable                  Type         Data/Info
------------------------------------------------
df_a                      DataFrame          Wellbore name  Top <...>[36035 rows x 11 columns]
df_b                      DataFrame          Wellbore name  Top <...>[36037 rows x 10 columns]
df_casinglot              DataFrame                Well Casing t<...>\n[7561 rows x 9 columns]
df_co2                    DataFrame                Well  Sample <...>     DST             1038
df_core                   DataFrame             Well  NPDID well<...>\n[8141 rows x 5 columns]
df_core_photo             DataFrame              Well  NPDID wel<...>n[20849 rows x 5 columns]
df_core_photo_ERRONEOUS   DataFrame            Wellbore  Core sa<...>jpgs/1...            1135
df_dev                    DataFrame                   Name  Alte<...>n[5152 rows x 64 columns]
df_dev_references         DataFrame               Name  NPDID we<...>n[10304 rows x 4 columns]
df_document               DataFrame           W