# Geomechanical Injection Scenario Toolkit (GIST)

#Disclaimer
GIST aims to give the _gist_ of a wide range of potential scenarios and aid collective decision making when responding to seismicity.

The results of GIST are entirely dependent upon the inputs provided, which may be incomplete or inaccurate.

There are other potentially plausible inducement scenarios that are not considered, including fluid migration into the basement, 
out-of-zone poroelastic stressing, or hydraulic fracturing.

None of the individual models produced by GIST accurately represent what happens in the subsurface and cannot be credibly used 
to accurately assign liability or responsibility for seismicity.

"All models are wrong, but some are useful" - George Box, 1976

## Prerequisites

Assumes InjectionSQLScheduled completed successfully and injection data are sampled uniformly in time

##Install Dependencies
- geopandas
- gistMC.py
- eqSQL.py
- gistPlots.py
- numpy
- scipy
- pandas


In [0]:
%restart_python

In [0]:
%run "/Workspace/_utils/Utility_Functions"

In [0]:
!pip install geopandas
!pip install geodatasets
!pip install contextily
#! pip install folium matplotlib mapclassify contextily

Collecting geopandas
  Downloading geopandas-1.0.1-py3-none-any.whl (323 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 323.6/323.6 kB 6.0 MB/s eta 0:00:00
Collecting shapely>=2.0.0
  Downloading shapely-2.0.7-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.5/2.5 MB 17.3 MB/s eta 0:00:00
Collecting pyogrio>=0.7.2
  Downloading pyogrio-0.10.0-cp310-cp310-manylinux_2_28_x86_64.whl (23.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 23.9/23.9 MB 66.5 MB/s eta 0:00:00
Collecting pyproj>=3.3.0
  Downloading pyproj-3.7.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.2 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.2/9.2 MB 129.1 MB/s eta 0:00:00
Installing collected packages: shapely, pyproj, pyogrio, geopandas
Successfully installed geopandas-1.0.1 pyogrio-0.10.0 pyproj-3.7.0 shapely-2.0.7
[43mNote: you may need to restart the kernel using %restart_python or dbutils.library.restartPython() to use

##Paths

In [0]:
# Paths
homePath='/Workspace/Users/bill.curry@exxonmobil.com/'
# Injection data path 
injPath=homePath+'injection/WeeklyRun/ScheduledOutput/'
# GIST library path
gistPath=homePath+'GIST/'

##Libraries

- numpy
- scipy
- pandas
- matplotlib
- geopandas
- pyspark


In [0]:
import sys
sys.path.append(gistPath+'lib')

In [0]:
#Databricks-specific
#import dataBricksConfig as db
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("eqSQL").getOrCreate()

In [0]:
import numpy as np
import pandas as pd
import os
import gistMC as gi
import eqSQL as es
import matplotlib.pyplot as plt
import seaborn as sns
import geopandas
#import contextily as cx



In [0]:
import gistPlots as gp

#1. Select Event

In [0]:
eventID='texnet2024oqfb'

In [0]:
runPath=gistPath+'/runs/'+eventID+'/'
os.makedirs(runPath, exist_ok=True)

In [0]:
eqs=es.eqSQL()
EQDF=eqs.getEarthquake(eventID)
EQDF=EQDF.rename(columns={'LatitudeErrorKm':'LatitudeError','LongitudeErrorKm':'LongitudeError','EventId':'EventID'})
# What about the other fault plane solution? - 251 / 36 / -97
EQDF['Strike']=80.
EQDF['Dip']=55.
EQDF['Rake']=-85.
# We could make use of the Rake
EQDF.info()
eq=EQDF.loc[0]

getEarthquake:     SeismicEventId  ... B3RecordDeletedUTCDateTime
0         2481836  ...                        NaT

[1 rows x 32 columns] 1
<class 'geopandas.geodataframe.GeoDataFrame'>
RangeIndex: 1 entries, 0 to 0
Data columns (total 38 columns):
 #   Column                      Non-Null Count  Dtype         
---  ------                      --------------  -----         
 0   SeismicEventId              1 non-null      int64         
 1   DataSource                  1 non-null      object        
 2   DataSourceUrl               0 non-null      object        
 3   EventID                     1 non-null      object        
 4   EventTimeUtc                1 non-null      datetime64[ns]
 5   EventTimeInLocalTimeZone    1 non-null      object        
 6   EventTimeZone               1 non-null      object        
 7   EventType                   1 non-null      object        
 8   DepthKm                     1 non-null      float64       
 9   DepthErrorKm                1 non-null   

In [0]:
EQDF.to_csv(runPath+'EQ.csv')

#2. Initial Run

##2.1 Subsurface Model Parameters

###Inputs:

In [0]:
# Binary deep/shallow parameter
deepOrShallow='Deep'
# Depth from surface (ft)
# Distinguishes deep/shallow where we don't have a horizon 
depthCutoff=8000.

In [0]:
nRealizations=500

In [0]:
# Water density minimum/maximum (kg/m3)
WaterDensityMin=1015.
WaterDensityMax=1025.
# Water viscosity minimum/maximum (Pa.s)
WaterViscosityMin=0.000799
WaterViscosityMax=0.000801
# Fluid Compressibility minimum/maximum (1/Pa)
FluidCompressibilityMin=0.000000000359
FluidCompressibilityMax=0.000000000361

In [0]:
# Porosity in percent
PorosityPercentMin=3.
PorosityPercentMax=15.

# Permeability in millidarcies
PermMDMin=5.
PermMDMax=500.

# Thickness in feet
ThicknessFTMin=200.
ThicknessFTMax=2000.

# Vertical compressibility minimum/maximum (1/Pa)
VerticalCompressibilityMin=0.00000000107
VerticalCompressibilityMax=0.00000000109

In [0]:
# Verbosity - 0=silent, 1=some, 2=lots
verb=0
# Minimum Pressure change to care about in PSI
dPCutoff=0.5
# Maximum Number of wells to plot
nWells=50
# How far back in time to plot in years
minYear=-40

### Set Interval-Specific Paths...


In [0]:
# Set an output directory for this earthquake and this interval
runIntervalPath=runPath+deepOrShallow+'/'
initialRunIntervalPath=runIntervalPath+'initialRun/'
updatedRunIntervalPath=runIntervalPath+'udpatedRun/'
forecastRunIntervalPath=runIntervalPath+'forecastRun/'
disposalPath=runIntervalPath+'updatedDisposal/'
# Make directory if it doesn't exist
os.makedirs(runIntervalPath, exist_ok=True)
os.makedirs(initialRunIntervalPath, exist_ok=True)
os.makedirs(updatedRunIntervalPath, exist_ok=True)
os.makedirs(forecastRunIntervalPath, exist_ok=True)
os.makedirs(disposalPath, exist_ok=True)
os.makedirs(initialRunIntervalPath+'perWell', exist_ok=True)
os.makedirs(updatedRunIntervalPath+'perWell', exist_ok=True)
os.makedirs(forecastRunIntervalPath+'perWell', exist_ok=True)
################################################
# Output prefix for realizations of parameters #
################################################
RealizationPrefix=runIntervalPath+'MC'

In [0]:
# Point to appropriate well and injection files from initial database
if deepOrShallow=='Deep':
  WellFile=injPath+'/deep.csv'
  InjFile=injPath+'/deepReg.csv'
elif deepOrShallow=='Shallow':
  WellFile=injPath+'/shallow.csv'
  InjFile=injPath+'/shallowReg.csv'

In [0]:
# Oversampling of time axis (poroelastic-only)
nTimeBins=21
# Poroelastic parameters - for in-zone poroelasticity in v2
ShearModulusMin=4e9
ShearModulusMax=6e9
PoissonsRatioDrainedMin=0.295
PoissonsRatioDrainedMax=0.305
PoissonsRatioUndrainedMin=0.305
PoissonsRatioUndrainedMax=0.315
BiotsCoefficientMin=0.26
BiotsCoefficientMax=0.36
# Fault parameters (poroelastic-only)
FaultFrictionCoeffMin=0.55
FaultFrictionCoeffMax=0.65
RockFrictionCoeffMin=0.55
RockFrictionCoeffMax=0.65

##2.2 Results

###Compute

In [0]:
gist=gi.gistMC(nReal=nRealizations,
                ntBin=nTimeBins)
gist.initPP(rho0_min=WaterDensityMin,
             rho0_max=WaterDensityMax,
             phi_min=PorosityPercentMin,
             phi_max=PorosityPercentMax,
             kMD_min=PermMDMin,
             kMD_max=PermMDMax,
             h_min=ThicknessFTMin,
             h_max=ThicknessFTMax,
             alphav_min=VerticalCompressibilityMin,
             alphav_max=VerticalCompressibilityMax,
             beta_min=FluidCompressibilityMin,
             beta_max=FluidCompressibilityMax)

In [0]:
gist.addWells(WellFile,InjFile,verbose=verb)
if verb>0:
  gist.wellDF.info()
  EQDF.info()

In [0]:
selectedWellsDF,ignoredWellsDF,injDF=gist.findWells(eq,PE=False,verbose=verb)

In [0]:
scenarioDF=gist.runPressureScenarios(eq,selectedWellsDF,injDF,verbose=verb)

In [0]:
filteredDF,orderedWellList=gi.summarizePPResults(scenarioDF,selectedWellsDF,threshold=dPCutoff,nOrder=nWells,verbose=verb)

In [0]:
disaggregationDF=gi.prepDisaggregationPlot(filteredDF,orderedWellList,jitter=0.1,verbose=0)

In [0]:
diffRange=(min(gist.diffPPVec),max(gist.diffPPVec))
rtDF,mergedWellsDF = gi.prepRTPlot(selectedWellsDF,ignoredWellsDF,minYear,diffRange,clipYear=False)

In [0]:
winWellsDF,winInjDF=gi.getWinWells(filteredDF,selectedWellsDF,injDF)

In [0]:

scenarioTSRDF,dPTimeSeriesR,wellIDsR,dayVecR = gist.runPressureScenariosTimeSeries(eq,winWellsDF,winInjDF,verbose=verb)

In [0]:
totalPPQuantilesDF=gi.prepTotalPressureTimeSeriesPlot(dPTimeSeriesR,dayVecR,nQuantiles=11,epoch=pd.to_datetime('1970-01-01'),verbose=1)
print(totalPPQuantilesDF)

prepTotalPressureTimeSeriesPlot: deltaPP.shape= (39, 500, 1469)  dayVec.shape= (1469,)
prepTotalPressureTimeSeriesPlot: totalDeltaPP.shape= (500, 1469)
prepTotalPressureTimeSeriesPlot: quantiles: [0.0, 10.0, 20.0, 30.1, 40.1, 50.1, 59.9, 69.9, 80.0, 90.0, 100.0]
        DeltaPressure     Days  Realization  Percentile  Ordering       Date
0            0.000000   5450.0            0         0.0       0.0 1984-12-03
1            0.000000   5460.0            0         0.0       0.0 1984-12-13
2            0.000000   5470.0            0         0.0       0.0 1984-12-23
3            0.000000   5480.0            0         0.0       0.0 1985-01-02
4            0.000000   5490.0            0         0.0       0.0 1985-01-12
...               ...      ...          ...         ...       ...        ...
733601       0.224215  11150.0          499        20.0     100.0 2000-07-12
733602       0.224149  11160.0          499        20.0     100.0 2000-07-22
733994       0.305871  15080.0          499 

In [0]:
allPPQuantilesDF=gi.getPerWellPressureTimeSeriesQuantiles(dPTimeSeriesR,dayVecR,wellIDsR,nQuantiles=11,epoch=pd.to_datetime('01-01-1970'))

In [0]:

wellPressureDict=gi.prepPressureAndDisposalTimeSeriesPlots(allPPQuantilesDF,winWellsDF,winInjDF,orderedWellList[:-1],verbose=0)

In [0]:
# Calculate sensitivities to different parameters
sensitivityDF,sensitivitySumDF = gist.getPressureSensitivity(winInjDF,winWellsDF,eq,verbose=2)
# I'm not getting negative pressures anymore
print(sensitivitySumDF)

getPressureSensitivity: ntaS min/max:  0.0009 0.0011
getPressureSensitivity: phiS min/max:  3.0 15.0
getPressureSensitivity: hMS min/max:  60.96 609.6
getPressureSensitivity: alphavS min/max:  1.07e-09 1.08e-09
getPressureSensitivity: betaS min/max:  3.59e-10 3.61e-10
getPressureSensitivity: kMDS min/max:  5.0 500.0
getPressureSensitivity: SS:  [[0.00371367]
 [0.00373197]
 [0.00375026]
 [0.00373197]
 [0.00373197]
 [0.00373197]
 [0.0036595 ]
 [0.00373197]
 [0.00380443]
 [0.00067854]
 [0.00373197]
 [0.00678539]
 [0.00369842]
 [0.00373197]
 [0.00373197]
 [0.00373167]
 [0.00373197]
 [0.00373227]
 [0.00373197]
 [0.00373197]
 [0.00373197]]
getPressureSensitivity: TS:  [[8.34524852e-04]
 [8.38635812e-04]
 [8.42746772e-04]
 [9.31817569e-04]
 [8.38635812e-04]
 [7.62396193e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [1.52479239e-04]
 [8.38635812e-04]
 [1.52479239e-03]
 [8.38635812e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [8.38635812e-04]
 [1.6

In [0]:
futureEQ=eq.copy()
futureEQ.loc['Origin Date']=pd.to_datetime('2026-9-1')
print(eq,futureEQ)
rateDF = gist.getTHdPdT0(winWellsDF,winInjDF,eq,futureEQ,2)

SeismicEventId                                      2481836
DataSource                        TexNet Earthquake Catalog
DataSourceUrl                                          None
EventID                                      texnet2024oqfb
EventTimeUtc                            2024-07-27 02:11:07
EventTimeInLocalTimeZone               2024-07-26T21:11:07Z
EventTimeZone                                           CST
EventType                                        Earthquake
DepthKm                                            8.837891
DepthErrorKm                                       0.636678
Magnitude                                           1.64016
MagnitudeError                                          NaN
MagnitudeType                                    ml(texnet)
Location                                      Western Texas
Status                                                final
Latitude                                          32.144165
LatitudeError                           

  Q_new=(dpdt * futureInjSec / OneOver4piKappaH) / expTerm


###Output

In [0]:
gist.writeRealizations(initialRunIntervalPath+'PorePressureRealizations.csv')

In [0]:
scenarioDF.to_csv(initialRunIntervalPath+'scenarios.csv')

In [0]:
filteredDF.to_csv(initialRunIntervalPath+'filteredScenarios.csv')
pd.Series(data=orderedWellList).to_csv(initialRunIntervalPath+'wellOrder.csv')

In [0]:
mergedWellsDF.to_csv(initialRunIntervalPath+'RTwells.csv')
rtDF.to_csv(initialRunIntervalPath+'RTDF.csv')

In [0]:
disaggregationDF.to_csv(initialRunIntervalPath+'disaggregation.csv')

In [0]:
totalPPQuantilesDF.to_csv(initialRunIntervalPath+'totalPPQuantiles.csv')

In [0]:
i=0
for wellDictKey, wellDictValue in wellPressureDict.items():
  wellID=wellDictValue['WellInfo']['ID'].to_list()[0]
  wellFilePrefix='/perWell/well_'+str(i)+'_'
  wellDictValue['PPQuantiles'].to_csv(initialRunIntervalPath+wellFilePrefix+'PPQuantiles.csv')
  wellDictValue['Disposal'].to_csv(initialRunIntervalPath+wellFilePrefix+'Disposal.csv')
  wellDictValue['WellInfo'].to_csv(initialRunIntervalPath+wellFilePrefix+'WellInfo.csv')
  print('well',i,', ID:',wellID,' completed')
  i=i+1

well 0 , ID: 2121094  completed
well 1 , ID: 2121271  completed
well 2 , ID: 2123938  completed
well 3 , ID: 2117885  completed
well 4 , ID: 2111946  completed
well 5 , ID: 2109094  completed
well 6 , ID: 2116501  completed
well 7 , ID: 2112288  completed
well 8 , ID: 2081404  completed
well 9 , ID: 2114817  completed
well 10 , ID: 2112104  completed
well 11 , ID: 2111358  completed
well 12 , ID: 2121191  completed
well 13 , ID: 2056720  completed
well 14 , ID: 2113898  completed
well 15 , ID: 2111690  completed
well 16 , ID: 2118993  completed
well 17 , ID: 2121195  completed
well 18 , ID: 2120454  completed
well 19 , ID: 2117633  completed
well 20 , ID: 2112172  completed
well 21 , ID: 2081279  completed
well 22 , ID: 2121203  completed
well 23 , ID: 2112058  completed
well 24 , ID: 2105036  completed
well 25 , ID: 2120859  completed
well 26 , ID: 2064982  completed
well 27 , ID: 2119049  completed
well 28 , ID: 2120980  completed
well 29 , ID: 2123953  completed
well 30 , ID: 205182

In [0]:
scenarioTSRDF.to_csv(initialRunIntervalPath+'materialScenariosR.csv')
np.savez_compressed(initialRunIntervalPath+'timeSeriesR.npz', deltaPP=dPTimeSeriesR,dayVec=dayVecR,wellIDs=wellIDsR)

In [0]:
sensitivityDF.to_csv(initialRunIntervalPath+'sensitivity.csv')
sensitivitySumDF.to_csv(initialRunIntervalPath+'sensitivitySum.csv')

In [0]:
rateDF.to_csv(initialRunIntervalPath+'rates.csv')

#3. Correct Data

##3.1 Export Disposal Data

In [0]:
selectedWellsDF.to_csv(disposalPath+'selectedWells.csv')
ignoredWellsDF.to_csv(disposalPath+'ignoredWells.csv')
allWellsDF=pd.concat([selectedWellsDF,ignoredWellsDF])
allWellsDF.to_csv(disposalPath+'allInZoneWells.csv')
injDF.to_csv(disposalPath+'inj.csv')

##3.2 Import Updated Disposal Data

In [0]:

udpatedSelectedWellsDF=pd.read_csv(disposalPath+'selectedWells.csv')
updatedIgnoredWellsDF=pd.read_csv(disposalPath+'ignoredWells.csv')
updatedWellsFile=disposalPath+'allInZoneWells.csv'
updatedInjFile=disposalPath+'inj.csv'

#4. Updated Analysis

##4.1 Refined Parameters

In [0]:
UpdatednRealizations=2000

In [0]:
# Porosity in percent
UpdatedPorosityPercentMin=3.
UpdatedPorosityPercentMax=15.

# Permeability in millidarcies
UpdatedPermMDMin=10.
UpdatedPermMDMax=500.

# Thickness in feet
UpdatedThicknessFTMin=500.
UpdatedThicknessFTMax=1200.

# Vertical compressibility minimum/maximum (1/Pa)
UpdatedVerticalCompressibilityMin=0.00000000107
UpdatedVerticalCompressibilityMax=0.00000000109

In [0]:
# Verbosity - 0=silent, 1=some, 2=lots
verb=0
# Minimum Pressure change to care about in PSI
dPCutoff=0.5
# Maximum Number of wells to plot
nWells=50
# How far back in time to plot in years
minYear=-40

In [0]:
gist2=gi.gistMC(nReal=UpdatednRealizations,
                ntBin=nTimeBins)
gist2.initPP(rho0_min=WaterDensityMin,
             rho0_max=WaterDensityMax,
             phi_min=UpdatedPorosityPercentMin,
             phi_max=UpdatedPorosityPercentMax,
             kMD_min=UpdatedPermMDMin,
             kMD_max=UpdatedPermMDMax,
             h_min=UpdatedThicknessFTMin,
             h_max=UpdatedThicknessFTMax,
             alphav_min=VerticalCompressibilityMin,
             alphav_max=VerticalCompressibilityMax,
             beta_min=FluidCompressibilityMin,
             beta_max=FluidCompressibilityMax)

##4.2 Updated Disposal Rates

In [0]:
gist2.addWells(updatedWellsFile,updatedInjFile,verbose=verb)

In [0]:
updatedSelectedWellsDF,updatedIgnoredWellsDF,updatedInjDF=gist2.findWells(eq,PE=False,verbose=verb)

##4.3 Updated Run

In [0]:
updatedScenarioDF=gist2.runPressureScenarios(eq,updatedSelectedWellsDF,updatedInjDF,verbose=verb)

In [0]:
updatedFilteredDF,updatedOrderedWellList=gi.summarizePPResults(updatedScenarioDF,updatedSelectedWellsDF,threshold=dPCutoff,nOrder=nWells,verbose=verb)
updatedDisaggregationDF=gi.prepDisaggregationPlot(updatedFilteredDF,updatedOrderedWellList,jitter=0.1,verbose=0)
updatedDiffRange=(min(gist2.diffPPVec),max(gist2.diffPPVec))
updatedrtDF,updatedMergedWellsDF = gi.prepRTPlot(updatedSelectedWellsDF,updatedIgnoredWellsDF,minYear,updatedDiffRange,clipYear=False)
updatedWinWellsDF,updatedWinInjDF=gi.getWinWells(updatedFilteredDF,updatedSelectedWellsDF,updatedInjDF)

In [0]:
updatedScenarioTSRDF,updatedDPTimeSeriesR,updatedWellIDsR,updatedDayVecR = gist2.runPressureScenariosTimeSeries(eq,updatedWinWellsDF,updatedWinInjDF,verbose=verb)

In [0]:
updatedTotalPPQuantilesDF=gi.prepTotalPressureTimeSeriesPlot(updatedDPTimeSeriesR,updatedDayVecR,nQuantiles=11,epoch=pd.to_datetime('1970-01-01'),verbose=1)
updatedAllPPQuantilesDF=gi.getPerWellPressureTimeSeriesQuantiles(updatedDPTimeSeriesR,updatedDayVecR,updatedWellIDsR,nQuantiles=11,epoch=pd.to_datetime('01-01-1970'))
updatedWellPressureDict=gi.prepPressureAndDisposalTimeSeriesPlots(updatedAllPPQuantilesDF,winWellsDF,winInjDF,orderedWellList[:-1],verbose=0)

prepTotalPressureTimeSeriesPlot: deltaPP.shape= (18, 2000, 1414)  dayVec.shape= (1414,)
prepTotalPressureTimeSeriesPlot: totalDeltaPP.shape= (2000, 1414)
prepTotalPressureTimeSeriesPlot: quantiles: [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]


In [0]:
updatedSensitivityDF,updatedSensitivitySumDF = gist2.getPressureSensitivity(updatedWinInjDF,updatedWinWellsDF,eq)

##4.4 Updated Run Output

In [0]:
gist2.writeRealizations(updatedRunIntervalPath+'PorePressureRealizations.csv')
updatedScenarioDF.to_csv(updatedRunIntervalPath+'scenarios.csv')
updatedFilteredDF.to_csv(updatedRunIntervalPath+'filteredScenarios.csv')
pd.Series(data=updatedOrderedWellList).to_csv(updatedRunIntervalPath+'wellOrder.csv')
updatedMergedWellsDF.to_csv(updatedRunIntervalPath+'RTwells.csv')
updatedrtDF.to_csv(updatedRunIntervalPath+'RTDF.csv')
updatedDisaggregationDF.to_csv(updatedRunIntervalPath+'disaggregation.csv')
updatedTotalPPQuantilesDF.to_csv(updatedRunIntervalPath+'totalPPQuantiles.csv')

In [0]:
i=0
for wellDictKey, wellDictValue in updatedWellPressureDict.items():
  wellID=wellDictValue['WellInfo']['ID'].to_list()[0]
  wellFilePrefix='/perWell/well_'+str(i)+'_'
  wellDictValue['PPQuantiles'].to_csv(updatedRunIntervalPath+wellFilePrefix+'PPQuantiles.csv')
  wellDictValue['Disposal'].to_csv(updatedRunIntervalPath+wellFilePrefix+'Disposal.csv')
  wellDictValue['WellInfo'].to_csv(updatedRunIntervalPath+wellFilePrefix+'WellInfo.csv')
  print('well',i,', ID:',wellID,' completed')
  i=i+1

well 0 , ID: 2121094  completed
well 1 , ID: 2121271  completed
well 2 , ID: 2123938  completed
well 3 , ID: 2117885  completed
well 4 , ID: 2111946  completed
well 5 , ID: 2109094  completed
well 6 , ID: 2116501  completed
well 7 , ID: 2112288  completed
well 8 , ID: 2081404  completed
well 9 , ID: 2114817  completed
well 10 , ID: 2112104  completed
well 11 , ID: 2111358  completed
well 12 , ID: 2121191  completed
well 13 , ID: 2056720  completed
well 14 , ID: 2113898  completed
well 15 , ID: 2111690  completed
well 16 , ID: 2118993  completed
well 17 , ID: 2121195  completed
well 18 , ID: 2120454  completed
well 19 , ID: 2117633  completed
well 20 , ID: 2112172  completed
well 21 , ID: 2081279  completed
well 22 , ID: 2121203  completed
well 23 , ID: 2112058  completed
well 24 , ID: 2105036  completed
well 25 , ID: 2120859  completed
well 26 , ID: 2064982  completed
well 27 , ID: 2119049  completed
well 28 , ID: 2120980  completed
well 29 , ID: 2123953  completed
well 30 , ID: 205182

In [0]:
updatedScenarioTSRDF.to_csv(updatedRunIntervalPath+'materialScenariosR.csv')
np.savez_compressed(updatedRunIntervalPath+'timeSeriesR.npz', deltaPP=updatedDPTimeSeriesR,dayVec=updatedDayVecR,wellIDs=updatedWellIDsR)
updatedSensitivityDF.to_csv(updatedRunIntervalPath+'sensitivity.csv')
updatedSensitivitySumDF.to_csv(updatedRunIntervalPath+'sensitivitySum.csv')
#rateDF.to_csv(updatedRunIntervalPath+'rates.csv')

#5. Forecast

##5.1 Input Forecast Times

In [0]:
# End Year of time series forecast
EndYear=2030
# Time to calculate rate distributions


##5.2 Distributions of rates for each well

In [0]:
# Rerun select wells with an updated earthquake time set to the last day of EndYear
future=eq.copy()
future['Origin Date']=pd.to_datetime('12-31-'+str(EndYear))
forecastSelectedWellsDF,forecastIgnoredWellsDF,forecastInitialInjDF=gist2.findWells(future,PE=False,verbose=verb)
# Merge udpated well dataframe with prior unselected wells
addedWellsDF=forecastSelectedWellsDF[~forecastSelectedWellsDF['InjectionWellId'].isin(updatedSelectedWellsDF['InjectionWellId'])]
print(addedWellsDF)
forecastSelectedWellsDF['Added']=False
forecastSelectedWellsDF.loc[forecastSelectedWellsDF['InjectionWellId'].isin(addedWellsDF['InjectionWellId']),'Added']=True
# Re-generate an R-T plot here. Should it have two sets of curves?
# We need a column for wells that are newly added - 


forecastRTDF,forecastMergedWellsDF = gi.prepRTPlot(forecastSelectedWellsDF,forecastIgnoredWellsDF,minYear,diffRange,clipYear=False)


     Unnamed: 0       ID  ...         EventID  TotalBBL
278         186  1007808  ...  texnet2024oqfb       0.0
279         224  1004456  ...  texnet2024oqfb       0.0
280         229  1004722  ...  texnet2024oqfb       0.0
282         333  1004337  ...  texnet2024oqfb       0.0
283         334  1004455  ...  texnet2024oqfb       0.0
..          ...      ...  ...             ...       ...
454        2304  2112378  ...  texnet2024oqfb       0.0
455        2305  2120653  ...  texnet2024oqfb       0.0
456        2306  2120087  ...  texnet2024oqfb       0.0
457        2307  2120302  ...  texnet2024oqfb       0.0
458        2313  2118758  ...  texnet2024oqfb       0.0

[162 rows x 71 columns]


In [0]:
forecastMergedWellsDF.to_csv(forecastRunIntervalPath+'forecastRTwells.csv')
forecastRTDF.to_csv(forecastRunIntervalPath+'forecastRTDF.csv')
forecastInitialInjDF.to_csv(forecastRunIntervalPath+'forecastInitialInjDF.csv')

In [0]:
# I need new code here that calculates the time derivatives at a future point
# TDTD0DF= gist.getTHTD0(newSelectedWellsDF,injDF,future,verb)

In [0]:
# Get ordered well IDs from orderedWellList[:-1]
print(updatedOrderedWellList[:-1])
forecastWellIDList=[]
for wellName in updatedOrderedWellList[:-1]:
  wellID=forecastMergedWellsDF[forecastMergedWellsDF['WellName']==wellName]['ID'].to_list()[0]
  print(wellName,' ID:',wellID)
  forecastWellIDList.append(wellID)

Index(['MARIENFELD 13 1D', 'ANNALEA SWD 1', 'PALO VERDE 1', 'FAUDREE 1D',
       'NEPTUNE SWD 1', 'DICKENSON 20 8D', 'CORFU SWD 1', 'SALE RANCH 27 1DD',
       'PAT, K. 3', 'SALT LAKE SWD 1', 'LIMEQUEST 6 SWD 1D',
       'MCMORRIES 18 SWD 2D', 'PENROSE-OLDHAM SWD 2', 'BERRY SWD 1',
       'DAGGER LAKE SWD 08SD', 'MARBILL SWD 1', 'NAIL RANCH \"36\" 1D',
       'BROWN SWD 1'],
      dtype='object', name='Name')
MARIENFELD 13 1D  ID: 2121094
ANNALEA SWD 1  ID: 2121271
PALO VERDE 1  ID: 2123938
FAUDREE 1D  ID: 2117885
NEPTUNE SWD 1  ID: 2111946
DICKENSON 20 8D  ID: 2109094
CORFU SWD 1  ID: 2116501
SALE RANCH 27 1DD  ID: 2112288
PAT, K. 3  ID: 2081404
SALT LAKE SWD 1  ID: 2114817
LIMEQUEST 6 SWD 1D  ID: 2112104
MCMORRIES 18 SWD 2D  ID: 2111358
PENROSE-OLDHAM SWD 2  ID: 2056720
BERRY SWD 1  ID: 2121191
DAGGER LAKE SWD 08SD  ID: 2113898
MARBILL SWD 1  ID: 2111690
NAIL RANCH \"36\" 1D  ID: 2118993
BROWN SWD 1  ID: 2093925


In [0]:
# Put in new rates
rateDict=dict(zip(forecastWellIDList,[10000.]*len(forecastWellIDList)))
print(rateDict)

{2121094: 10000.0, 2121271: 10000.0, 2123938: 10000.0, 2117885: 10000.0, 2111946: 10000.0, 2109094: 10000.0, 2116501: 10000.0, 2112288: 10000.0, 2081404: 10000.0, 2114817: 10000.0, 2112104: 10000.0, 2111358: 10000.0, 2056720: 10000.0, 2121191: 10000.0, 2113898: 10000.0, 2111690: 10000.0, 2118993: 10000.0, 2093925: 10000.0}


In [0]:
startDate=pd.to_datetime('2025-01-15')
endDate=pd.to_datetime('2030-12-31')
startDay=int((startDate-pd.to_datetime('1970-01-01')).days)
endDay=int((endDate-pd.to_datetime('1970-01-01')).days)
# Pull injection DF for updated well selection from forecastInjDF
# Stop time at "startDate" or startDay
forecastPastInjDF=forecastInitialInjDF[forecastInitialInjDF['Days']<startDay]
# Get last disposal value for all wells, rate and time
# Assume time is regularized for all wells
lastDay=int(forecastPastInjDF['Days'].max())
print('lastDay: ',lastDay)
lastInjDF=forecastInitialInjDF[forecastInitialInjDF['Days']==lastDay]
# generate future time series for each well
# Here is where the input well rates come into play
dDay=10
futureDays=np.arange(lastDay+dDay,endDay+dDay,float(dDay))
futureDates=[pd.to_datetime('1970-01-01')+pd.to_timedelta(d,unit='D') for d in futureDays]
#print(list(zip(futureDates,futureDays)))
futureInjDF=pd.DataFrame(columns=['ID','Days','BPD','Date','Type'])
print(forecastSelectedWellsDF.info())
for wellID in forecastSelectedWellsDF['ID'].to_list():
  IDs=[wellID]*len(futureDays)
  print(wellID)
  # If this well has a prescribed rate, set it
  if wellID in rateDict.keys():
    BPD=rateDict[wellID]
    print('set ',wellID,rateDict[wellID])
    rateType=['Set'] * len(futureDays)
  elif len(lastInjDF[lastInjDF['ID']==wellID])>0:
    BPD=lastInjDF[lastInjDF['ID']==wellID]['BPD'].to_list()[0]
    print('extrapolated ',BPD)
    rateType=['Extrapolated'] * len(futureDays)
  else:
    # No data case
    BPD=0.
    print('no data - zeroes')
    rateType=['No Data'] * len(futureDays)
  BPDs=np.ones(len(futureDays))*BPD
  futureWellInjDF=pd.DataFrame({'ID':IDs,'Days':futureDays,'BPD':BPDs,'Date':futureDates,'Type':rateType})
  futureInjDF=pd.concat([futureInjDF,futureWellInjDF])
  # If not, extrapolate given the last known value
# Create updated dataframe with new wells
forecastPastInjDF['Type']='Original'
forecastInjDF=pd.concat([forecastPastInjDF,futureInjDF])
print(forecastInjDF)

lastDay:  20100
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 459 entries, 0 to 458
Data columns (total 71 columns):
 #   Column                                                   Non-Null Count  Dtype  
---  ------                                                   --------------  -----  
 0   Unnamed: 0                                               459 non-null    int64  
 1   ID                                                       459 non-null    int64  
 2   InjectionWellId                                          459 non-null    int64  
 3   UniqueWellIdentifier                                     459 non-null    int64  
 4   UICNumber                                                433 non-null    object 
 5   APINumber                                                459 non-null    object 
 6   LeaseName                                                437 non-null    object 
 7   Operator                                                 458 non-null    object 
 8   OperatorType  

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  forecastPastInjDF['Type']='Original'


In [0]:
forecastScenarioDF=gist2.runPressureScenarios(future,forecastSelectedWellsDF,forecastInjDF,verbose=verb)
forecastFilteredDF,forecastOrderedWellList=gi.summarizePPResults(forecastScenarioDF,forecastSelectedWellsDF,threshold=dPCutoff,nOrder=nWells,verbose=verb)
forecastDisaggregationDF=gi.prepDisaggregationPlot(forecastFilteredDF,forecastOrderedWellList,jitter=0.1,verbose=1)
forecastWinWellsDF,forecastWinInjDF=gi.getWinWells(forecastFilteredDF,forecastSelectedWellsDF,forecastInjDF)

 prepDisaggregationPlot:  36  wells in disaggregation plot with  2000  realizations
 prepDisaggregationPlot:  2000  rows for  MARIENFELD 13 1D
 prepDisaggregationPlot:  2000  rows for  ANNALEA SWD 1
 prepDisaggregationPlot:  2000  rows for  PALO VERDE 1
 prepDisaggregationPlot:  2000  rows for  SALT LAKE SWD 1
 prepDisaggregationPlot:  2000  rows for  FAUDREE 1D
 prepDisaggregationPlot:  2000  rows for  SALE RANCH 27 1DD
 prepDisaggregationPlot:  2000  rows for  DICKENSON 20 8D
 prepDisaggregationPlot:  2000  rows for  NEPTUNE SWD 1
 prepDisaggregationPlot:  2000  rows for  CORFU SWD 1
 prepDisaggregationPlot:  2000  rows for  LIMEQUEST 6 SWD 1D
 prepDisaggregationPlot:  2000  rows for  DAGGER LAKE SWD 08SD
 prepDisaggregationPlot:  2000  rows for  PENROSE-OLDHAM SWD 2
 prepDisaggregationPlot:  2000  rows for  MCMORRIES 18 SWD 2D
 prepDisaggregationPlot:  2000  rows for  BERRY SWD 1
 prepDisaggregationPlot:  2000  rows for  MARBILL SWD 1
 prepDisaggregationPlot:  2000  rows for  BROWN 

In [0]:
forecastScenarioTSRDF,forecastdPTimeSeriesR,forecastWellIDsR,forecastDayVecR = gist2.runPressureScenariosTimeSeries(future,forecastWinWellsDF,forecastWinInjDF,verbose=verb)

In [0]:
forecastTotalPPQuantilesDF=gi.prepTotalPressureTimeSeriesPlot(forecastdPTimeSeriesR,forecastDayVecR,nQuantiles=11,epoch=pd.to_datetime('1970-01-01'),verbose=1)

prepTotalPressureTimeSeriesPlot: deltaPP.shape= (35, 2000, 1630)  dayVec.shape= (1630,)
prepTotalPressureTimeSeriesPlot: totalDeltaPP.shape= (2000, 1630)
prepTotalPressureTimeSeriesPlot: quantiles: [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]


In [0]:
forecastAllPPQuantilesDF=gi.getPerWellPressureTimeSeriesQuantiles(forecastdPTimeSeriesR,forecastDayVecR,forecastWellIDsR,nQuantiles=11,epoch=pd.to_datetime('01-01-1970'),verbose=2)

getPerWellPressureTimeSeriesQuantiles - sizes:  2000 1630 35
getPerWellPressureTimeSeriesQuantiles - quantiles:  [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
getPerWellPressureTimeSeriesQuantiles - well:  0  of  35
getPerWellPressureTimeSeriesQuantiles - array sizes:  (2000, 1630) (2000, 1630) (2000, 1630)
getPerWellPressureTimeSeriesQuantiles - well:  1  of  35
getPerWellPressureTimeSeriesQuantiles - array sizes:  (2000, 1630) (2000, 1630) (2000, 1630)
getPerWellPressureTimeSeriesQuantiles - well:  2  of  35
getPerWellPressureTimeSeriesQuantiles - array sizes:  (2000, 1630) (2000, 1630) (2000, 1630)
getPerWellPressureTimeSeriesQuantiles - well:  3  of  35
getPerWellPressureTimeSeriesQuantiles - array sizes:  (2000, 1630) (2000, 1630) (2000, 1630)
getPerWellPressureTimeSeriesQuantiles - well:  4  of  35
getPerWellPressureTimeSeriesQuantiles - array sizes:  (2000, 1630) (2000, 1630) (2000, 1630)
getPerWellPressureTimeSeriesQuantiles - well:  5  of  35
getPerWellPre

In [0]:
print(forecastAllPPQuantilesDF.info())

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1141000 entries, 0 to 1140999
Data columns (total 7 columns):
 #   Column         Non-Null Count    Dtype         
---  ------         --------------    -----         
 0   DeltaPressure  1141000 non-null  float64       
 1   Days           1141000 non-null  float64       
 2   Realization    1141000 non-null  object        
 3   Order          1141000 non-null  float64       
 4   WellID         1141000 non-null  object        
 5   Percentile     1141000 non-null  float64       
 6   Date           1141000 non-null  datetime64[ns]
dtypes: datetime64[ns](1), float64(4), object(2)
memory usage: 60.9+ MB
None


In [0]:
forecastWellPressureDict=gi.prepPressureAndDisposalTimeSeriesPlots(forecastAllPPQuantilesDF,forecastWinWellsDF,forecastWinInjDF,forecastOrderedWellList[:-1],verbose=0)

In [0]:
forecastMergedWellsDF.to_csv(forecastRunIntervalPath+'RTwells.csv')
forecastRTDF.to_csv(forecastRunIntervalPath+'RTDF.csv')
forecastInjDF.to_csv(forecastRunIntervalPath+'forecastInjDF.csv')
forecastSelectedWellsDF.to_csv(forecastRunIntervalPath+'updatedSelectedWells.csv')
forecastIgnoredWellsDF.to_csv(forecastRunIntervalPath+'updatedIgnoredWells.csv')
futureInjDF.to_csv(forecastRunIntervalPath+'futureInj.csv')
forecastScenarioDF.to_csv(forecastRunIntervalPath+'fullScenarios.csv')
forecastFilteredDF.to_csv(forecastRunIntervalPath+'filteredScenarios.csv')
pd.Series(data=forecastOrderedWellList).to_csv(forecastRunIntervalPath+'forecastWellOrder.csv')
forecastDisaggregationDF.to_csv(forecastRunIntervalPath+'forecastDisaggregation.csv')
forecastTotalPPQuantilesDF.to_csv(forecastRunIntervalPath+'forecastTotalPPQuantiles.csv')

In [0]:
i=0
for wellDictKey, wellDictValue in forecastWellPressureDict.items():
  wellID=wellDictValue['WellInfo']['ID'].to_list()[0]
  futureWellFilePrefix='/perWell/well_'+str(i)+'_'
  wellDictValue['PPQuantiles'].to_csv(forecastRunIntervalPath+futureWellFilePrefix+'PPQuantiles.csv')
  wellDictValue['Disposal'].to_csv(forecastRunIntervalPath+futureWellFilePrefix+'Disposal.csv')
  wellDictValue['WellInfo'].to_csv(forecastRunIntervalPath+futureWellFilePrefix+'WellInfo.csv')
  print('well',i,', ID:',wellID,' completed')
  i=i+1

well 0 , ID: 2121094  completed
well 1 , ID: 2121271  completed
well 2 , ID: 2123938  completed
well 3 , ID: 2114817  completed
well 4 , ID: 2117885  completed
well 5 , ID: 2112288  completed
well 6 , ID: 2109094  completed
well 7 , ID: 2111946  completed
well 8 , ID: 2116501  completed
well 9 , ID: 2112104  completed
well 10 , ID: 2113898  completed
well 11 , ID: 2056720  completed
well 12 , ID: 2111358  completed
well 13 , ID: 2121191  completed
well 14 , ID: 2111690  completed
well 15 , ID: 2121195  completed
well 16 , ID: 2118993  completed
well 17 , ID: 2117633  completed
well 18 , ID: 2081404  completed
well 19 , ID: 2120980  completed
well 20 , ID: 2123953  completed
well 21 , ID: 2112198  completed
well 22 , ID: 2112172  completed
well 23 , ID: 2105036  completed
well 24 , ID: 2120454  completed
well 25 , ID: 2111988  completed
well 26 , ID: 2119704  completed
well 27 , ID: 2112058  completed
well 28 , ID: 2111922  completed
well 29 , ID: 2121203  completed
well 30 , ID: 211851

In [0]:
forecastScenarioTSRDF.to_csv(forecastRunIntervalPath+'materialScenariosR.csv')
np.savez_compressed(forecastRunIntervalPath+'forecastTimeSeriesR.npz', deltaPP=forecastdPTimeSeriesR,dayVec=forecastDayVecR,wellIDs=forecastWellIDsR)