# Geomechanical Injection Scenario Toolkit (GIST)

# Disclaimer
GIST aims to give the _gist_ of a wide range of potential scenarios and aid collective decision making when responding to seismicity.

The results of GIST are entirely dependent upon the inputs provided, which may be incomplete or inaccurate.

Other plausible inducement scenarios are not considered, including fluid migration into the basement,
out-of-zone poroelastic stressing, and hydraulic fracturing.

No individual model produced by GIST accurately represents what happens in the subsurface, and none can be credibly used
to assign liability or responsibility for seismicity.

"All models are wrong, but some are useful" - George Box, 1976

## Prerequisites

This notebook assumes that InjectionSQLScheduled completed successfully and that injection data are sampled uniformly in time.
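
The uniform-sampling assumption can be checked before running GIST. The sketch below is one way to do so with pandas. The column names `ID` and `Date` and the helper `isUniformlySampled` are hypothetical; they are not part of the GIST libraries or the injection schema.

```python
import pandas as pd

def isUniformlySampled(injDF, idCol='ID', dateCol='Date'):
    """Return True if every well shares one constant time step.

    idCol and dateCol are assumed column names, not the GIST schema.
    """
    df = injDF[[idCol, dateCol]].copy()
    df[dateCol] = pd.to_datetime(df[dateCol])
    df = df.sort_values([idCol, dateCol])
    # Time step between consecutive records of the same well
    steps = df.groupby(idCol)[dateCol].diff().dropna()
    # Uniform sampling means exactly one distinct step across all wells
    return bool(steps.nunique() == 1)

# Two wells reported weekly on the same dates
demo = pd.DataFrame({
    'ID': [1, 1, 1, 2, 2, 2],
    'Date': ['2024-01-01', '2024-01-08', '2024-01-15',
             '2024-01-01', '2024-01-08', '2024-01-15'],
})
uniform = isUniformlySampled(demo)
# Dropping one record leaves a 14-day gap in well 1
gappy = isUniformlySampled(demo.drop(index=1))
```

Here `uniform` is true and `gappy` is false, because the dropped record makes the time steps inconsistent.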

## Install Dependencies
- geopandas
- gistMC.py
- eqSQL.py
- gistPlots.py
- numpy
- scipy
- pandas


In [0]:
%restart_python

In [0]:
%run "/Workspace/_utils/Utility_Functions"

In [0]:
!pip install geopandas
!pip install geodatasets
!pip install contextily
#! pip install folium matplotlib mapclassify contextily

Collecting geopandas
  Downloading geopandas-1.0.1-py3-none-any.whl (323 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 323.6/323.6 kB 5.1 MB/s eta 0:00:00
Collecting shapely>=2.0.0
  Downloading shapely-2.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 123.3 MB/s eta 0:00:00
Collecting pyproj>=3.3.0
  Downloading pyproj-3.7.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.3 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.3/9.3 MB 99.4 MB/s eta 0:00:00
Collecting pyogrio>=0.7.2
  Downloading pyogrio-0.10.0-cp310-cp310-manylinux_2_28_x86_64.whl (23.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.9/23.9 MB 55.5 MB/s eta 0:00:00
Installing collected packages: shapely, pyproj, pyogrio, geopandas
Successfully installed geopandas-1.0.1 pyogrio-0.10.0 pyproj-3.7.1 shapely-2.1.0
Note: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use

## Paths

In [0]:
# Paths
homePath='/Workspace/Users/bill.curry@exxonmobil.com/'
# Injection data path 
injPath=homePath+'injection/WeeklyRun/ScheduledOutput/'
# GIST library path
gistPath=homePath+'GIST/'

## Libraries

- numpy
- scipy
- pandas
- matplotlib
- geopandas
- pyspark


In [0]:
import sys
sys.path.append(gistPath+'lib')

In [0]:
#Databricks-specific
#import dataBricksConfig as db
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("eqSQL").getOrCreate()

In [0]:
import numpy as np
import pandas as pd
import os
import gistMC as gi
import eqSQL as es
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas
#import contextily as cx



In [0]:
import gistPlots as gp

# 1. Select Event

In [0]:
eventID='texnet2025agkw'
forecastYears=3.

In [0]:
runPath=gistPath+'runs/'+eventID+'/'

In [0]:
EQ=pd.read_csv(runPath+'EQ.csv').loc[0]
print(EQ)

Unnamed: 0                                               0
SeismicEventId                                     2743652
DataSource                       TexNet Earthquake Catalog
DataSourceUrl                                          NaN
EventID                                     texnet2025agkw
EventTimeUtc                           2025-01-04 12:16:53
EventTimeInLocalTimeZone              2025-01-04T06:16:53Z
EventTimeZone                                          CST
EventType                                       Earthquake
DepthKm                                           7.202148
DepthErrorKm                                      0.593583
Magnitude                                         1.401685
MagnitudeError                                         NaN
MagnitudeType                                   ml(texnet)
Location                                     Western Texas
Status                                               final
Latitude                                         32.2897

# 2. Initial Run

## 2.1 Subsurface Model Parameters

### Inputs:

In [0]:
# Binary deep/shallow parameter
deepOrShallow='Deep'
# Depth from surface (ft)
# Distinguishes deep/shallow where we don't have a horizon 
depthCutoff=8000.

In [0]:
nRealizations=500

In [0]:
# Water density minimum/maximum (kg/m3)
WaterDensityMin=1015.
WaterDensityMax=1025.
# Water viscosity minimum/maximum (Pa.s)
WaterViscosityMin=0.000799
WaterViscosityMax=0.000801
# Fluid Compressibility minimum/maximum (1/Pa)
FluidCompressibilityMin=0.000000000359
FluidCompressibilityMax=0.000000000361

In [0]:
# Porosity in percent
PorosityPercentMin=3.
PorosityPercentMax=15.

# Permeability in millidarcies
PermMDMin=3.
PermMDMax=1000.

# Thickness in feet
ThicknessFTMin=700.
ThicknessFTMax=2000.

# Vertical compressibility minimum/maximum (1/Pa)
VerticalCompressibilityMin=0.00000000005
VerticalCompressibilityMax=0.00000000107
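
To illustrate how min/max ranges like those above can become Monte Carlo realizations, here is a minimal sketch that draws each parameter independently from a uniform distribution. Uniform, independent sampling is an assumption made for illustration; the distributions actually used inside `gistMC` are not shown in this notebook, and `sampleUniform` is a hypothetical helper.

```python
import numpy as np

def sampleUniform(ranges, nReal, seed=0):
    """Draw nReal independent uniform samples for each (min, max) pair."""
    rng = np.random.default_rng(seed)
    samples = {}
    for name, (low, high) in ranges.items():
        samples[name] = rng.uniform(low, high, size=nReal)
    return samples

# Ranges copied from the inputs above
ranges = {
    'PorosityPercent': (3., 15.),
    'PermMD': (3., 1000.),
    'ThicknessFT': (700., 2000.),
}
realizations = sampleUniform(ranges, nReal=500)
```

Each entry of `realizations` is an array with one value per realization. A parameter that spans several orders of magnitude, such as permeability, might be better served by a log-uniform draw; that choice is left open here.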

In [0]:
# Verbosity - 0=silent, 1=some, 2=lots
verb=0
# Minimum Pressure change to care about in PSI
dPCutoff=0.5
# Maximum Number of wells to plot
nWells=50
# Earliest year to plot
minYear=1983

### Set Interval-Specific Paths...


In [0]:
# Set an output directory for this earthquake and this interval
runIntervalPath=runPath+deepOrShallow+'/'
initialRunIntervalPath=runIntervalPath+'initialRun/'
updatedRunIntervalPath=runIntervalPath+'updatedRun/'
forecastRunIntervalPath=runIntervalPath+'forecastRun/'
disposalPath=runIntervalPath+'updatedDisposal/'
################################################
# Output prefix for realizations of parameters #
################################################
RealizationPrefix=runIntervalPath+'MC'
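
Later cells write CSV files into these directories, and `DataFrame.to_csv` does not create missing parent directories. A minimal sketch of creating them up front is shown below. The helper `ensureDirs` is hypothetical, and a temporary directory stands in for the real run paths so the sketch can run anywhere.

```python
import os
import tempfile

def ensureDirs(*paths):
    """Create each directory (and any parents) if it does not already exist."""
    for path in paths:
        # exist_ok=True makes the cell safe to re-run
        os.makedirs(path, exist_ok=True)

# Stand-in for runIntervalPath
demoRoot = os.path.join(tempfile.mkdtemp(), 'Deep')
demoDirs = [
    os.path.join(demoRoot, 'initialRun'),
    os.path.join(demoRoot, 'initialRun', 'perWell'),
    os.path.join(demoRoot, 'forecastRun'),
]
ensureDirs(*demoDirs)
```

In this notebook the call might look like `ensureDirs(initialRunIntervalPath, updatedRunIntervalPath+'perWell/', forecastRunIntervalPath, disposalPath)`, since the per-well output loop in section 4.4 writes into a `perWell` subdirectory.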

In [0]:
# Point to appropriate well and injection files from initial database
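if deepOrShallow=='Deep':
  defaultWellFile=injPath+'deep.csv'
  defaultInjFile=injPath+'deepReg.csv'
elif deepOrShallow=='Shallow':
  defaultWellFile=injPath+'shallow.csv'
  defaultInjFile=injPath+'shallowReg.csv'
else:
  # Fail early instead of leaving the file names undefined
  raise ValueError("deepOrShallow must be 'Deep' or 'Shallow', got "+repr(deepOrShallow))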

## 2.2 Results

### Compute

### Output

# 3. Correct Data

## 3.1 Export Disposal Data

## 3.2 Import Updated Disposal Data

In [0]:

#updatedSelectedWellsDF=pd.read_csv(disposalPath+'selectedWells.csv')
#updatedIgnoredWellsDF=pd.read_csv(disposalPath+'ignoredWells.csv')
updatedWellsFile=disposalPath+'allInZoneWells.csv'
updatedInjFile=disposalPath+'updatedInj.csv'

In [0]:
# TODO: merge the updated disposal information with the prior data
# Load updated injection file
# Well comparisons
# Get list of well IDs in the updated injection file
# Get list of well IDs in updated well file
# Check to see overlap in well IDs
#    updated vs. prior selected wells
#    updated vs. prior ignored wells
# FilteredIgnored = prior ignored wells - (wells in both the updated file and the prior ignored wells)
# Merge FilteredIgnored with updated wells
# Load injection
#    Check for time sampling of injection - all wells must be the same
#    Get overall time vector - convert to days
#    Loop over wells in well file
#      Get all disposal data for well ID
#      Convert date values to days
#      
#mergedInjFile=
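
The well-list portion of the plan above can be sketched with pandas as follows. This is an illustration, not the GIST implementation: the helper `mergeWellLists` is hypothetical, the `ID` column is assumed to match the one used for well information elsewhere in this notebook, and the injection-resampling steps are left out.

```python
import pandas as pd

def mergeWellLists(updatedWellsDF, priorIgnoredDF, idCol='ID'):
    """Combine updated wells with prior ignored wells that were not updated."""
    updatedIDs = set(updatedWellsDF[idCol])
    # FilteredIgnored = prior ignored wells minus those present in the update
    filteredIgnoredDF = priorIgnoredDF[~priorIgnoredDF[idCol].isin(updatedIDs)]
    # Updated rows take precedence; remaining ignored wells are appended
    return pd.concat([updatedWellsDF, filteredIgnoredDF], ignore_index=True)

# Well 2 appears in both tables, so only its updated row is kept
updatedDemo = pd.DataFrame({'ID': [1, 2], 'Source': ['updated', 'updated']})
ignoredDemo = pd.DataFrame({'ID': [2, 3], 'Source': ['prior', 'prior']})
mergedDemo = mergeWellLists(updatedDemo, ignoredDemo)
```

The merged table holds wells 1, 2 and 3, with well 3 carried over from the prior ignored list.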

# 4. Updated Analysis

## 4.1 Refined Parameters

In [0]:
UpdatednRealizations=400

In [0]:
# Porosity in percent
UpdatedPorosityPercentMin=4.
UpdatedPorosityPercentMax=10.

# Permeability in millidarcies
UpdatedPermMDMin=3.
UpdatedPermMDMax=1000.

# Thickness in feet
UpdatedThicknessFTMin=800.
UpdatedThicknessFTMax=1200.
# Young's modulus from Stanton model = 11.25 GPa - somewhere between 9.5 and 12 GPa
# Poisson's ratio = 0.303 - somewhere between 0.30 and 0.31
# Compressibility = 3(1-2v)/E
# Soft end (higher compressibility) = 3(1 - 2 x 0.30)/9500000000 = 1.2/9500000000 = 0.000000000126
# Stiff end (lower compressibility) = 3(1 - 2 x 0.31)/12000000000 = 1.14/12000000000 = 0.000000000095
# Vertical compressibility minimum/maximum (1/Pa)
# Increase the high side to match what is in FSP - I might be off by 10x but it doesn't matter!
UpdatedVerticalCompressibilityMin=0.000000000095
UpdatedVerticalCompressibilityMax=0.00000000126
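
The arithmetic in the comments above can be reproduced directly. The sketch below evaluates 3(1 - 2v)/E, the reciprocal of the bulk modulus, at both ends of the quoted ranges. The function name `compressibilityFromElastic` is hypothetical, and the notebook's own inputs remain the hard-coded values above.

```python
def compressibilityFromElastic(youngsModulusPa, poissonsRatio):
    """Compressibility (1/Pa) as 3(1 - 2v)/E, i.e. 1/K for bulk modulus K."""
    return 3.0 * (1.0 - 2.0 * poissonsRatio) / youngsModulusPa

# Soft end: E = 9.5 GPa, v = 0.30
softEnd = compressibilityFromElastic(9.5e9, 0.30)
# Stiff end: E = 12 GPa, v = 0.31
stiffEnd = compressibilityFromElastic(12.0e9, 0.31)
print(f'{softEnd:.3g} {stiffEnd:.3g}')
```

The softer rock gives the larger value, about 1.26e-10 1/Pa, and the stiffer rock the smaller value, 9.5e-11 1/Pa. The maximum actually assigned above is ten times the computed soft-end value, consistent with the comment about matching FSP.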

In [0]:
# Verbosity - 0=silent, 1=some, 2=lots
verb=0
# Minimum Pressure change to care about in PSI
dPCutoff=1.
# Maximum Number of wells to plot
nWells=50

In [0]:
gist2=gi.gistMC(nReal=UpdatednRealizations)
gist2.initPP(rho0_min=WaterDensityMin,
             rho0_max=WaterDensityMax,
             phi_min=UpdatedPorosityPercentMin,
             phi_max=UpdatedPorosityPercentMax,
             kMD_min=UpdatedPermMDMin,
             kMD_max=UpdatedPermMDMax,
             h_min=UpdatedThicknessFTMin,
             h_max=UpdatedThicknessFTMax,
             alphav_min=UpdatedVerticalCompressibilityMin,
             alphav_max=UpdatedVerticalCompressibilityMax,
             beta_min=FluidCompressibilityMin,
             beta_max=FluidCompressibilityMax)
gist2.writeRealizations(updatedRunIntervalPath+'PorePressureRealizations.csv')

## 4.2 Updated Disposal Rates

In [0]:
gist2.addWells(userWellFile=updatedWellsFile,userInjFile=updatedInjFile,verbose=verb)

In [0]:
updatedSelectedWellsDF,updatedIgnoredWellsDF,updatedInjDF=gist2.findWells(EQ,PE=False,responseYears=forecastYears,verbose=1)

gistMC.findWells:  Selecting  2250  and excluding  613  wells
 gistMC.findWells included:  2250 1258 2250
 gistMC.findWells excluded:  613 76 613
 gistMC.findWells:  2250  wells considered
 gistMC.findWells:  1258  wells with reported volumes, with  892875  injection values


## 4.3 Updated Run

In [0]:
updatedScenarioDF=gist2.runPressureScenarios(EQ,updatedSelectedWellsDF,updatedInjDF,verbose=verb)
updatedScenarioDF.to_csv(updatedRunIntervalPath+'scenarios.csv')

In [0]:
#updatedScenarioTestDF=gist2.runPressureScenariosVectorized(EQ,updatedSelectedWellsDF,updatedInjDF,verbose=verb)

In [0]:
updatedFilteredDF,updatedOrderedWellIDList=gi.summarizePPResults(updatedScenarioDF,updatedSelectedWellsDF,threshold=dPCutoff,nOrder=nWells,verbose=verb)
pd.Series(data=updatedOrderedWellIDList).to_csv(updatedRunIntervalPath+'wellIDOrder.csv')
updatedFilteredDF.to_csv(updatedRunIntervalPath+'filteredScenarios.csv')

#
updatedDisaggregationDF=gi.prepDisaggregationPlot(updatedFilteredDF,updatedOrderedWellIDList,jitter=0.1,verbose=0)
updatedDisaggregationDF.to_csv(updatedRunIntervalPath+'disaggregation.csv')

#
updatedDiffRange=(min(gist2.diffPPVec),max(gist2.diffPPVec))
updatedrtDF,updatedMergedWellsDF = gi.prepRTPlot(updatedSelectedWellsDF,updatedIgnoredWellsDF,str(minYear),updatedDiffRange,EQ,clipYear=False)
updatedMergedWellsDF.to_csv(updatedRunIntervalPath+'RTwells.csv')
updatedrtDF.to_csv(updatedRunIntervalPath+'RTDF.csv')

# Generate input for time series code
updatedWinWellsDF,updatedWinInjDF=gi.getWinWells(updatedFilteredDF,updatedSelectedWellsDF,updatedInjDF)

In [0]:
updatedScenarioTSRDF,updatedDPTimeSeriesR,updatedWellIDsR,updatedDayVecR = gist2.runPressureScenariosTimeSeries(EQ,updatedWinWellsDF,updatedWinInjDF,verbose=verb)

In [0]:
updatedTotalPPQuantilesDF=gi.prepTotalPressureTimeSeriesQuantilesPlot(updatedDPTimeSeriesR,updatedDayVecR,nQuantiles=11,epoch=pd.to_datetime('1970-01-01'),verbose=1)

prepTotalPressureTimeSeriesPlot: deltaPP.shape= (89, 400, 1503)  dayVec.shape= (1503,)
prepTotalPressureTimeSeriesPlot: totalDeltaPP.shape= (400, 1503)
prepTotalPressureTimeSeriesPlot: quantiles: [0.0, 10.0, 20.1, 30.1, 40.1, 50.1, 59.9, 69.9, 79.9, 90.0, 100.0]
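
The shapes printed above suggest that the per-well array, ordered as (wells, realizations, time steps), is summed over wells and then summarized across realizations. A minimal NumPy sketch of that reduction follows. It is an interpretation of the log output, not the `gistMC` source: the small random array stands in for `updatedDPTimeSeriesR`, and evenly spaced percentile levels are an assumption (the logged levels, such as 20.1, differ slightly).

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for the (wells, realizations, time steps) pressure array
deltaPP = rng.random((5, 40, 12))

# Sum over wells: total pressure change per realization and time step
totalDeltaPP = deltaPP.sum(axis=0)

# 11 evenly spaced percentile levels from 0 to 100
levels = np.linspace(0., 100., 11)
# Percentiles across realizations at each time step
quantiles = np.percentile(totalDeltaPP, levels, axis=0)
```

The result has one row per percentile level and one column per time step, so the first and last rows are the minimum and maximum total pressure change across realizations.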


In [0]:
updatedTotalPPSpaghettiDF=gi.prepTotalPressureTimeSeriesSpaghettiPlot(updatedDPTimeSeriesR,updatedDayVecR,gist2.diffPPVec,epoch=pd.to_datetime('1970-01-01'),verbose=1)

prepTotalPressureTimeSeriesSpaghettiPlot: deltaPP.shape= (89, 400, 1503)  dayVec.shape= (1503,)
prepTotalPressureTimeSeriesSpaghettiPlot: totalDeltaPP.shape= (400, 1503)


In [0]:
updatedAllPPQuantilesDF=gi.getPerWellPressureTimeSeriesQuantiles(updatedDPTimeSeriesR,updatedDayVecR,updatedWellIDsR,nQuantiles=11,epoch=pd.to_datetime('01-01-1970'),verbose=2)

getPerWellPressureTimeSeriesQuantiles - sizes:  400 1503 89
getPerWellPressureTimeSeriesQuantiles - after argsort:  400 1503 89
getPerWellPressureTimeSeriesQuantiles - quantiles:  [0.0, 10.0, 20.1, 30.1, 40.1, 50.1, 59.9, 69.9, 79.9, 90.0, 100.0]
getPerWellPressureTimeSeriesQuantiles - well:  0  of  89
getPerWellPressureTimeSeriesQuantiles - array sizes:  (400, 1503) (400, 1503) (400, 1503)
getPerWellPressureTimeSeriesQuantiles - well:  1  of  89
getPerWellPressureTimeSeriesQuantiles - array sizes:  (400, 1503) (400, 1503) (400, 1503)
getPerWellPressureTimeSeriesQuantiles - well:  2  of  89
getPerWellPressureTimeSeriesQuantiles - array sizes:  (400, 1503) (400, 1503) (400, 1503)
getPerWellPressureTimeSeriesQuantiles - well:  3  of  89
getPerWellPressureTimeSeriesQuantiles - array sizes:  (400, 1503) (400, 1503) (400, 1503)
getPerWellPressureTimeSeriesQuantiles - well:  4  of  89
getPerWellPressureTimeSeriesQuantiles - array sizes:  (400, 1503) (400, 1503) (400, 1503)
getPerWellPressure

In [0]:
updatedAllPPSpaghettiDF=gi.getPerWellPressureTimeSeriesSpaghetti(updatedDPTimeSeriesR,updatedDayVecR,gist2.diffPPVec,updatedWellIDsR,epoch=pd.to_datetime('01-01-1970'),verbose=2)

In [0]:
#updatedAllPPQuantilesDF,updatedAllPPSpaghettiDF=gi.getPerWellPressureTimeSeriesSpaghettiAndQuantiles(updatedDPTimeSeriesR,updatedDayVecR,gist2.diffPPVec,updatedWellIDsR,nQuantiles=11,epoch=pd.to_datetime('01-01-1970'),verbose=2)

In [0]:
updatedWellPressureDict=gi.prepPressureAndDisposalTimeSeriesPlots(updatedAllPPQuantilesDF,updatedAllPPSpaghettiDF,updatedWinWellsDF,updatedWinInjDF,updatedOrderedWellIDList[:-1],verbose=0)

In [0]:
updatedSensitivityDF,updatedSensitivitySumDF = gist2.getPressureSensitivity(updatedWinInjDF,updatedWinWellsDF,EQ)

## 4.4 Updated Run Output

In [0]:
#gist2.writeRealizations(updatedRunIntervalPath+'PorePressureRealizations.csv')
#updatedScenarioDF.to_csv(updatedRunIntervalPath+'scenarios.csv')
#updatedScenarioTestDF.to_csv(updatedRunIntervalPath+'scenariosTest.csv')
#updatedFilteredDF.to_csv(updatedRunIntervalPath+'filteredScenarios.csv')
#pd.Series(data=updatedOrderedWellList).to_csv(updatedRunIntervalPath+'wellOrder.csv')
#updatedMergedWellsDF.to_csv(updatedRunIntervalPath+'RTwells.csv')
#updatedrtDF.to_csv(updatedRunIntervalPath+'RTDF.csv')
#updatedDisaggregationDF.to_csv(updatedRunIntervalPath+'disaggregation.csv')
updatedTotalPPQuantilesDF.to_csv(updatedRunIntervalPath+'totalPPQuantiles.csv')
updatedTotalPPSpaghettiDF.to_csv(updatedRunIntervalPath+'totalPPSpaghetti.csv')

In [0]:
# Write one set of CSV files per well, numbered in dictionary order
for i, wellDictValue in enumerate(updatedWellPressureDict.values()):
  wellID=wellDictValue['WellInfo']['ID'].to_list()[0]
  wellFilePrefix='perWell/well_'+str(i)+'_'
  wellDictValue['PPQuantiles'].to_csv(updatedRunIntervalPath+wellFilePrefix+'PPQuantiles.csv')
  wellDictValue['Disposal'].to_csv(updatedRunIntervalPath+wellFilePrefix+'Disposal.csv')
  wellDictValue['WellInfo'].to_csv(updatedRunIntervalPath+wellFilePrefix+'WellInfo.csv')
  wellDictValue['Spaghetti'].to_csv(updatedRunIntervalPath+wellFilePrefix+'Spaghetti.csv')
  print('well',i,', ID:',wellID,' completed')

well 0 , ID: 2116501  completed
well 1 , ID: 2117633  completed
well 2 , ID: 2121203  completed
well 3 , ID: 2121195  completed
well 4 , ID: 2123953  completed
well 5 , ID: 2112104  completed
well 6 , ID: 2117616  completed
well 7 , ID: 2118517  completed
well 8 , ID: 2110592  completed
well 9 , ID: 2121715  completed
well 10 , ID: 2111946  completed
well 11 , ID: 2117806  completed
well 12 , ID: 2081404  completed
well 13 , ID: 2123938  completed
well 14 , ID: 2112058  completed
well 15 , ID: 2112198  completed
well 16 , ID: 2113042  completed
well 17 , ID: 2121097  completed
well 18 , ID: 2119704  completed
well 19 , ID: 2110718  completed
well 20 , ID: 2120980  completed
well 21 , ID: 2112288  completed
well 22 , ID: 2121271  completed
well 23 , ID: 2111922  completed
well 24 , ID: 2120964  completed
well 25 , ID: 2120827  completed
well 26 , ID: 2120615  completed
well 27 , ID: 2116381  completed
well 28 , ID: 2111358  completed
well 29 , ID: 2121230  completed
well 30 , ID: 210868

In [0]:
updatedScenarioTSRDF.to_csv(updatedRunIntervalPath+'materialScenariosR.csv')
np.savez_compressed(updatedRunIntervalPath+'timeSeriesR.npz', deltaPP=updatedDPTimeSeriesR,dayVec=updatedDayVecR,wellIDs=updatedWellIDsR)
updatedSensitivityDF.to_csv(updatedRunIntervalPath+'sensitivity.csv')
updatedSensitivitySumDF.to_csv(updatedRunIntervalPath+'sensitivitySum.csv')
#rateDF.to_csv(updatedRunIntervalPath+'rates.csv')
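
The compressed archive can be read back with `np.load`, using the same keyword names passed to `np.savez_compressed` (`deltaPP`, `dayVec`, `wellIDs`). The sketch below round-trips small stand-in arrays through a temporary file rather than the real run directory; the `demo` and `loaded` names are illustrative only.

```python
import os
import tempfile
import numpy as np

# Small stand-ins for the real arrays
demoDeltaPP = np.zeros((2, 3, 4))
demoDayVec = np.arange(4.)
demoWellIDs = np.array([2116501, 2117633])

demoFile = os.path.join(tempfile.mkdtemp(), 'timeSeriesR.npz')
np.savez_compressed(demoFile, deltaPP=demoDeltaPP, dayVec=demoDayVec, wellIDs=demoWellIDs)

# np.load returns a lazy archive; index it by the saved keyword names
with np.load(demoFile) as archive:
    loadedDeltaPP = archive['deltaPP']
    loadedDayVec = archive['dayVec']
    loadedWellIDs = archive['wellIDs']
```

If the saved `wellIDs` were an object-dtype array, `np.load` would also need `allow_pickle=True`.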

In [0]:
# Write out gist object to disk
import pickle
with open(updatedRunIntervalPath+'gist.pkl', 'wb') as file:
  pickle.dump(gist2, file)