# Interpret Topics

In [1]:
# library("oildata")
library("incidentmodels")
suppressMessages(library("tidyverse"))
library("tidytext")
suppressMessages(library("magrittr"))
suppressMessages(library("glue"))
suppressMessages(library("here"))

data_folder <- purrr::partial(here, "data-raw", ".temp", "data")

In [2]:
gammas <- readRDS(data_folder("gammas.rds"))
# Placeholder columns; they are filled in topic by topic below
gammas$label <- NA_character_
gammas$cause_topic <- NA
head(gammas)

incident_ID,ID,year,commodity,on_offshore,topic,gamma,label,cause_topic
<chr>,<chr>,<dbl>,<chr>,<chr>,<int>,<dbl>,<chr>,<lgl>
20040015,19237,2004,rpp,onshore,1,0.03150599,,
20040015,19237,2004,rpp,onshore,2,0.03150599,,
20040015,19237,2004,rpp,onshore,3,0.04599874,,
20040015,19237,2004,rpp,onshore,4,0.17643352,,
20040015,19237,2004,rpp,onshore,5,0.03150599,,
20040015,19237,2004,rpp,onshore,6,0.03150599,,


In [3]:
# `betas` is not created in this notebook; it is assumed to be available in the session already
head(betas)

topic,term,beta
<int>,<chr>,<dbl>
1,line,7.913214e-05
2,line,7.982247e-06
3,line,0.004370248
4,line,8.421904e-06
5,line,6.856923e-06
6,line,0.3538048


## Functions

In [4]:
# Keep the 8 highest-beta terms for each topic
top_terms <- betas %>%
    arrange(desc(beta)) %>%
    group_by(topic) %>%
    slice_head(n = 8) %>%
    ungroup() %>%
    arrange(topic, desc(beta))

head(top_terms)

topic,term,beta
<int>,<chr>,<dbl>
1,nrc,0.06489555
1,determin,0.05266603
1,report,0.05065176
1,cost,0.04022071
1,estim,0.04014877
1,releas,0.03669573
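
The top-terms step above can also be sketched with `slice_max()`, which does the per-group sort and the cut in one call. This is a minimal sketch on a toy table: `toy_betas` and its values are made up for illustration and are not the fitted model's output.

```r
library(dplyr)

# Toy stand-in for `betas` (hypothetical values)
toy_betas <- tibble::tibble(
    topic = c(1L, 1L, 1L, 2L, 2L, 2L),
    term  = c("nrc", "report", "cost", "oper", "local", "facil"),
    beta  = c(0.06, 0.05, 0.04, 0.24, 0.07, 0.05)
)

# Keep the n highest-beta terms per topic; rows come back ordered
# by topic, then by descending beta
toy_top_terms <- toy_betas %>%
    group_by(topic) %>%
    slice_max(beta, n = 2, with_ties = FALSE) %>%
    ungroup()

print(toy_top_terms)
```

`with_ties = FALSE` keeps exactly `n` rows per topic; the default would keep every term tied at the cutoff.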


In [5]:
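# Top terms for one topic; `{{ topic }}` injects the function argument,
# so it is not confused with the `topic` column
get_terms <- function(topic) {
    filter(top_terms, topic == {{ topic }})
}

# Narratives keyed by incident ID (as character, to match `gammas`)
narratives <- select(incidents, incident_ID, narrative) %>%
    mutate(incident_ID = as.character(incident_ID))

# Return the IDs and narrative of the incident with the highest gamma for a topic
get_narrative <- function(topic) {
    gammas %>%
        filter(topic == {{ topic }}) %>%
        filter(gamma == max(gamma)) %>%
        left_join(narratives, by = "incident_ID") %$%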
        glue("Incident ID: {incident_ID}
              Operator ID: {ID}

              {narrative}")
}
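
Each topic in the sections that follow is labelled with the same pair of assignments. One way to cut the repetition is a small helper; the sketch below uses a hypothetical function `label_topic()` and a toy data frame, neither of which is part of the original notebook.

```r
# Hypothetical helper: set `label` and `cause_topic` for every row of one topic
label_topic <- function(data, topic_id, label, cause_topic) {
    rows <- data$topic == topic_id
    data$label[rows] <- label
    data$cause_topic[rows] <- cause_topic
    data
}

# Toy stand-in for `gammas` with the same placeholder columns
toy_gammas <- data.frame(
    topic = c(1L, 2L, 1L),
    gamma = c(0.7, 0.3, 0.4),
    label = NA_character_,
    cause_topic = NA,
    stringsAsFactors = FALSE
)

toy_gammas <- label_topic(toy_gammas, 1, "report_mngt", FALSE)
toy_gammas <- label_topic(toy_gammas, 2, "spill_mngt", FALSE)
print(toy_gammas)
```

Returning the modified data frame, rather than modifying a global, keeps each labelling step explicit and easy to rerun.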

## Topics 1-5

### Topic 1

In [6]:
get_terms(topic = 1)

topic,term,beta
<int>,<chr>,<dbl>
1,nrc,0.06489555
1,determin,0.05266603
1,report,0.05065176
1,cost,0.04022071
1,estim,0.04014877
1,releas,0.03669573
1,notif,0.03086873
1,time,0.02712794


In [7]:
print(get_narrative(1))

Incident ID: 20190167
Operator ID: 32551

During the night of wednesday may 8, 2019, the cushing area was inundated by a torrential rainfall event.  This event produced over 5 inches of rainfall in a matter of hours.  The roof drain system was unable to relieve the volume of water due to buildup of blast media in the check valve, preventing full actuation.  This caused the roof to draft to oil level in tank, resulting in a manway failure, which released oil onto the roof exiting via the drain.  At the time the release was discovered, the winds were steady at over 30 mph with gusts exceeding 45 mph.  The high winds pushed the oil into one corner of the containment and gave the appearance of a smaller release.  When the wind direction shifted, the oil circumvented the containment booms and covered the surface of the rainwater throughout the containment area.  On thursday, once the amount of oil in the containment area became clear, bkep estimated the cost of the response to be below $50,

**This topic seems to be related to the management of incident reports**

In [8]:
gammas[gammas$topic == 1, ]$label <- "report_mngt"
gammas[gammas$topic == 1, ]$cause_topic <- FALSE

### Topic 2

In [9]:
get_terms(topic = 2)

topic,term,beta
<int>,<chr>,<dbl>
2,oper,0.23787896
2,personnel,0.09036702
2,local,0.06929389
2,facil,0.05237153
2,investig,0.04031833
2,immedi,0.0316975
2,discov,0.02714762
2,isol,0.02076183


In [10]:
print(get_narrative(2))

Incident ID: 20120331
Operator ID: 32602

Operator making routine rounds on facility discovered crude oil coming up from the ground in the right of way for the dot pipeline. Upon discovery of the leak, operator notified facility operations supervisor (fos) of leak.  Fos contacted assistant facility operations supervisor who then proceeded to site 3 to isolate shipping pump and block valve for line (shipping pump was not shipping oil at time of leak).  Operator who discovered leak then proceeded to site 11 to close next block valve on pipeline to isolate leak.  Operators contained leak and called for clean up contractors.


**This topic is related to the management of spills**

In [11]:
gammas[gammas$topic == 2, ]$label <- "spill_mngt"
gammas[gammas$topic == 2, ]$cause_topic <- FALSE

### Topic 3

In [12]:
get_terms(topic = 3)

topic,term,beta
<int>,<chr>,<dbl>
3,pipelin,0.11802786
3,contractor,0.04947495
3,excav,0.04464508
3,call,0.03973732
3,locat,0.030545
3,parti,0.02883117
3,damag,0.02197588
3,feet,0.0156659


In [13]:
print(get_narrative(3))

Incident ID: 20130246
Operator ID: 32011

Tech con trenching struck the pipeline while trenching on june 12, 2013.  Tech con had failed to (1) send proper one-call notification; (2) accurately identify or white line the area of excavation; or (3) notify holly energy prior to excavating and/or exposing (pot-holing) the pipeline so that holly energy could re-mark the pipeline and witness this excavation process as requested in accordance with texas utility code subchapter d sec. 251.151(c).      Tech con trenching never notified holly energy of its intent to excavate across holly energy's pipeline. Holly energy received a one-call notice from jd king on the evening of june 6. Holly energy does not fully understand the connection between tech con trenching and jd king, but both entities appeared to have been working on the same project for which tech con trenching was conducting trenching on june 13.  On the morning of june 7, john osborne of holly energy called ricky knelsen of jd king 

**This topic is related to contractors and excavation**

In [14]:
gammas[gammas$topic == 3, ]$label <- "excavation"
gammas[gammas$topic == 3, ]$cause_topic <- TRUE

### Topic 4

In [15]:
get_terms(topic = 4)

topic,term,beta
<int>,<chr>,<dbl>
4,drain,0.11858883
4,water,0.06620458
4,roof,0.0459078
4,product,0.03605417
4,caus,0.03024306
4,system,0.02502148
4,contain,0.02316866
4,allow,0.02038943


In [16]:
print(get_narrative(4))

Incident ID: 20150231
Operator ID: 22610

From the evening of may 25 through the morning of may 26, 2015 an excessive rainfall event occurred resulting in at least 6 inches of rain in a very short period of time.  This rainfall subsequently resulted in houston being declared a national disaster area.  Even though the roof drain was open during the rain event, enough rainfall collected on the external floating roof of tank 1222 resulting in more weight being on the roof than the roof had the buoyancy to support.   As the rainwater accumulated on the roof, the added weight of the water caused the roof to tilt to one side forcing product up through the mechanical vacuum breaker(s) and the leg sleeves onto the external floating roof.  Over the subsequent days, product was transferred out of tank 1222 and an attempt to land the roof on the floor was made.  During the landing of the roof, a roof leg gouged a 3" by 6" hole in the floor that caused the remaining product bottoms to escape the t

**This topic is related to storms, water and related damages**

In [17]:
gammas[gammas$topic == 4, ]$label <- "water"
gammas[gammas$topic == 4, ]$cause_topic <- TRUE

### Topic 5

In [18]:
get_terms(topic = 5)

topic,term,beta
<int>,<chr>,<dbl>
5,control,0.12494
5,station,0.10759199
5,center,0.07426734
5,shut,0.04409687
5,receiv,0.03922846
5,arriv,0.03634855
5,notifi,0.0324401
5,technician,0.03079444


In [19]:
print(get_narrative(5))

Incident ID: 20140244
Operator ID: 2731

On may 24th 2014 cpl control center sent a shutdown command to p2 at belridge. Control    center confirms that p2 has shut down and both suction/discharge valves are closed and the motor is off.    Control center receives a pump seal alarm on p2 and calls local i&e specialist to respond to belridge station.     Control center receives a call from a third party stating that a pump is on fire in the belridge station. Control center    immediately begins shutting down all flow to and from the station and notifies local supervisor. At 1654 kern county  fire department arrived at belridge station and monitored the fire and did not attempt to extinguish. At 1745 local    i&e specialist arrived on site and observes a small fire on p2 and notices that the pump is still spinning. He immediately     throws the main breaker that feeds all power to the station which stops the pump from spinning and the fire went out.    After the station was safely isolated

**This topic is related to remote operation of pipelines**

In [20]:
gammas[gammas$topic == 5, ]$label <- "control_center"
gammas[gammas$topic == 5, ]$cause_topic <- FALSE

## Topics 6-10

### Topic 6

In [21]:
get_terms(topic = 6)

topic,term,beta
<int>,<chr>,<dbl>
6,line,0.35380484
6,repair,0.06750294
6,leak,0.04448779
6,instal,0.04073355
6,servic,0.04040709
6,clamp,0.02653271
6,complet,0.02628787
6,perman,0.01543321


In [22]:
print(get_narrative(6))

Incident ID: 20110381
Operator ID: 31618

On 09/12/2011 at 13:45 the beaumont terminal operations supervisor received a call from a beaumont area pipeline technician reporting a possible leak on one of the five pipelines that connect motiva par to the enterprise npa refined product terminal. The enterprise pipeline technician reported to the beaumont terminal operation supervisor that a valero pipeline technician had spotted an area of brown grass and could smell hydro carbon odor at the location. The beaumont terminal operations supervisor reported the possible leak to the beaumont area manager and an investigation got under way. Upon arriving at the reported site the beaumont area operations team discovered gasoline product just under the ground service in the p108, p109, p110, p, 111 and p112 pipelines right of way. Samples were taken to the enterprise npa lab and tested,  the product analysis  to concluded that the gravity and n point match that of the product being delivered from 

**This topic is related to service and repair**

In [23]:
gammas[gammas$topic == 6, ]$label <- "service"
gammas[gammas$topic == 6, ]$cause_topic <- FALSE

### Topic 7

In [24]:
get_terms(topic = 7)

topic,term,beta
<int>,<chr>,<dbl>
7,pump,0.21992089
7,seal,0.10250132
7,unit,0.04184313
7,failur,0.04125493
7,station,0.0376522
7,fail,0.037064
7,mixer,0.03199077
7,replac,0.02860861


In [25]:
print(get_narrative(7))

Incident ID: 20140441
Operator ID: 31684

Spill caused by colex#1 evacuation pump seal failure. Approximately 0.5bbls leaked onto rock/gravel in pasadena's transfer header area while performing valve line-up for product delivery. The evacuation pump was not running at the time of seal failure. Operator discovered seal leak while verifying delivery piping's valve line-up.     Terminal personnel secured area by isolating and depressurizing colex#1's evacuation pump. The pump seal stopped leaking once the pump and piping were secured.     The pasadena colex #1 evacuation pump is not a mainline pump and is used exclusively for evacuating the colex #1 pump and associated piping for service or maintenance. Due to the infrequent use and potential for seal failures, the pump was not repaired and will not be placed back into service at this time.     Overall, phillips 66 tracks mechanical seal failures in our sap database and the manufacturer performs a seal failure report for every seal that c

**This topic is related to pumps and their components**

In [26]:
gammas[gammas$topic == 7, ]$label <- "pumps"
gammas[gammas$topic == 7, ]$cause_topic <- TRUE

### Topic 8

In [27]:
get_terms(topic = 8)

topic,term,beta
<int>,<chr>,<dbl>
8,incid,0.04470011
8,procedur,0.03270759
8,employe,0.02551208
8,action,0.02415293
8,result,0.02351333
8,activ,0.02303363
8,mainten,0.02247398
8,review,0.02231408


In [28]:
print(get_narrative(8))

Incident ID: 20130036
Operator ID: 22855

Scheduled maintenance work was being performed to replace the pump 2 discharge flange gasket at the intermediate hugo pump station on minnesota pipe line (mpl). Mpl lines 1, 2 & 3 were isolated from the hugo pump station by closing manual suction & discharge 16" station gate valves, allowing the station to be drained up.  The mechanical field technician ("mft") performed station lockout, then opened drain valve on the pump header to bleed pressure and crude oil to the sump.  There was a temporary obstruction located at the bottom of mpl 1, above the ground suction 16" station gate valve, which stopped the valve from fully closing. The above ground header drain valve also had an obstruction, which released after being fully opened, sending a surge through the drain line.  This resulted in a backup of crude oil in connecting drain lines which sprayed out from under the 4 pump inboard and outboard seal box covers onto the pump base and ground.    

**This topic is related to procedures**

In [29]:
gammas[gammas$topic == 8, ]$label <- "procedures"
gammas[gammas$topic == 8, ]$cause_topic <- TRUE

### Topic 9

In [30]:
get_terms(topic = 9)

topic,term,beta
<int>,<chr>,<dbl>
9,product,0.10367648
9,gallon,0.05547347
9,recov,0.0324493
9,volum,0.02389219
9,estim,0.02376907
9,manag,0.02124503
9,excav,0.0203216
9,approxim,0.01989067


In [31]:
print(get_narrative(9))

Incident ID: 20090288
Operator ID: 2552

The release was discovered by an operator at 0715 on 9/23/09. During a routine check of the stingwater system, the operator observed free product (gasoline) inside the concrete wall surrounding the stingwater oil water seperators and a small amount on the adjacent spillway. Incident was reported at 0730 to the operations manager. Cleanup began immediately and continued through september 28th.  A light sheen was found on a portion of pond 2.  No notifications to the state or nrc were made since the pond is not considered "waters of the state". Groundwater was not impacted as a result of the release and no product or sheen was observed at the outfall of pond 2.  Approximately 45 gallons were recovered using a pneumatic pump inside the concrete berm. Diapers in the area collected another 4 gallons.  A very light sheen was observed on an approximately 15' x 80' area of the pond. Hard boom was used to contain the sheen and consolidate it to the area 

**This topic is related to the commodity transported**

In [32]:
gammas[gammas$topic == 9, ]$label <- "commodity"
gammas[gammas$topic == 9, ]$cause_topic <- FALSE

### Topic 10

In [33]:
get_terms(topic = 10)

topic,term,beta
<int>,<chr>,<dbl>
10,test,0.04275311
10,identifi,0.04038685
10,pressur,0.02931882
10,sourc,0.02870817
10,time,0.02519693
10,locat,0.02191469
10,repair,0.0204644
10,monitor,0.02031174


In [34]:
print(get_narrative(10))

Incident ID: 20160073
Operator ID: 39191

February 16, 2016-fuel oil pipeline (fopl)id#39191 was in normal operation, transferring 109 gpm #6 fuel oil (product) from 74th street station (manhattan) to ravenswood station (queens) at 23.3 psig at ravenswood station and 25.5 psig at 74th street at the time of product release discovery. Discovery of product was from an electrical facility underground vault structure id mh-16261, located 28 feet east of pipeline. Pipeline was immediately shutdown and notifications made. Site area cleanup commenced immediately with the onsite ravenswood environmental response contractor. Pipeline was excavated following utility mark out . Leak location was discovered on february 27, 2016. Leak was characterized as 2- 1/8" diameter pinholes in the extrados of a buried 90 degree pipeline elbow. The foam glass insulation and corrosion prevention external coating were observed as displaced and disbonded.   April 19- new elbow section was bench pressure tested an

**This topic is related to testing and monitoring**

In [35]:
gammas[gammas$topic == 10, ]$label <- "monitoring"
gammas[gammas$topic == 10, ]$cause_topic <- TRUE

## Topics 11-15

### Topic 11

In [36]:
get_terms(topic = 11)

topic,term,beta
<int>,<chr>,<dbl>
11,sump,0.08913927
11,pressur,0.07114799
11,relief,0.04678481
11,flow,0.03808903
11,meter,0.03104244
11,system,0.02871857
11,caus,0.02362104
11,thermal,0.02159703


In [37]:
print(get_narrative(11))

Incident ID: 20190288
Operator ID: 39785

On 8/18/2019, the customer delivery point downstream of the robstown to ingleside 20" valve/meter site located inside the robstown terminal unexpectedly shut in with no notice given to epic operations prior to the shut in. When the delivery point shut in it caused a back-pressure on the pipeline system that pressured the pipeline system above 1000 psig (but did not exceed the mop of 1440) and a thermal relief on the system set at 1000 psig  actuated and vented into a sump at the site, and the sump reached capacity and overflowed into a construction ditch surrounding the sump. The thermal relief failed to close after the pressure fell below the 1000 psig set-point. The accident took place in the early morning hours on a sunday and no operations personnel were on site at the time. One of the project managers was performing a station check on his unrelated project, and found the sump overflowing and notified operations and the thermal relief was s

**This topic is related to flow, pressure and relief**

In [38]:
gammas[gammas$topic == 11, ]$label <- "pressure"
gammas[gammas$topic == 11, ]$cause_topic <- TRUE

### Topic 12

In [39]:
get_terms(topic = 12)

topic,term,beta
<int>,<chr>,<dbl>
12,pipe,0.22118051
12,corros,0.08081936
12,section,0.05401054
12,intern,0.04622982
12,pinhol,0.02581505
12,extern,0.02350394
12,remov,0.02057654
12,replac,0.02011432


In [40]:
print(get_narrative(12))

Incident ID: 20150393
Operator ID: 300

Approximately 4 bbls of crude oil were released from dead leg piping on the suction piping from tank 1204 as a result of internal corrosion. There is a station 'dead leg' pipe removal project budgeted for cy-2016, which included this 'dead leg' end of the pipe. As a result of the internal corrosion this 'dead leg' end of the pipe has been removed.


**This topic is related to corrosion of pipes**

In [41]:
gammas[gammas$topic == 12, ]$label <- "corrosion"
gammas[gammas$topic == 12, ]$cause_topic <- TRUE

### Topic 13

In [42]:
get_terms(topic = 13)

topic,term,beta
<int>,<chr>,<dbl>
13,flang,0.06248347
13,instal,0.04835218
13,connect,0.04446813
13,gasket,0.04182368
13,fit,0.04041882
13,tube,0.03661741
13,plug,0.03232018
13,nippl,0.02934517


In [43]:
print(get_narrative(13))

Incident ID: 20130289
Operator ID: 31189

On july 26, 2013 a bp pipeline technician upon exiting the pipeline office identified a loss of primary containment of crude oil in the blake pump station of the bp #1 crude system. The technician contacted the control center which immediately shut down the pipeline.  Analysis has determined that a partially buried valve experienced a mechanical failure of a threaded connection in its cast body.  It is believed that a pre-existing flaw in the 1950's vintage valve body casting reduced the threaded engagement length, diminishing the frictional forces retaining the ½" plug allowing it to back out. The impacted area was remediated and the valve was replaced.    The metallurgical cause of the leak could not be conclusively determined primarily because the plug was not recovered from the leak site. Five possible failure scenarios were considered and are discussed below. Failure scenarios 1, 4, and 5 can be ruled out.     Failure scenario 1 - the plug

**This topic is related to gaskets and related components**

In [44]:
gammas[gammas$topic == 13, ]$label <- "gaskets"
gammas[gammas$topic == 13, ]$cause_topic <- TRUE

### Topic 14

In [45]:
get_terms(topic = 14)

topic,term,beta
<int>,<chr>,<dbl>
14,report,0.10358096
14,updat,0.08974287
14,supplement,0.04801678
14,final,0.0403211
14,phmsa,0.03248422
14,time,0.02796566
14,submit,0.02457674
14,accid,0.02168203


In [46]:
print(get_narrative(14))

Incident ID: 20180202
Operator ID: 31888

On november 21, 2017, at 5:15pm, a contract inspector performing an in-service tank inspection at midland station near midland, texas, while working for centurion pipeline, l.p.(Cpl), discovered a pinhole leak on the 12'' isolation valve to tank 1966's suction/fill line. The contract employee reported the leak to cpl's station operator, who initiated a prompt and effective response. However, approximately 40 barrels of crude oil leaked into the tank containment berm with all 40 barrels of crude oil later being recovered by vacuum truck and soil remediation.    Upon investigation, cpl determined the cause of the leak, located in the 6 o'clock position of the valve body, to be internal corrosion due to the presence of water and microbiologically induced corrosion (mic). Both a visual inspection of the corroded area and lab analysis of the solids removed from the corrosion pits provided the information necessary to determine the cause of the leak.

**This topic is related to reporting of the spill**

In [47]:
gammas[gammas$topic == 14, ]$label <- "report"
gammas[gammas$topic == 14, ]$cause_topic <- FALSE

### Topic 15

In [48]:
get_terms(topic = 15)

topic,term,beta
<int>,<chr>,<dbl>
15,pipelin,0.15527622
15,personnel,0.03103073
15,site,0.02980513
15,leak,0.02773692
15,locat,0.02344731
15,creek,0.01387229
15,shutdown,0.01387229
15,shut,0.01356589


In [49]:
print(get_narrative(15))

Incident ID: 20140177
Operator ID: 30782

The black bay oil pipeline is an unregulated gathering line. This accident is being reported because the pipeline is located in an inlet to the gulf of mexico.3/16/14 - t. Baker smith(tbs) was mobilized to venice, la by eh&s consulting services inc. To provide survey services to assist uscg in identifying existing pipelines in the area of the leak.    3/17/14 - tbs crew met with es&h in the morning and traveled to the work site but wind and sea conditions prevented any survey work.  3/18/14 - the crew returned to the site with eh&s, uscg and j&j diving and performed the field work.  3/19/14 - tbs provided gis information for the pipeline located in the area. Identifying there was only one pipeline existing near the structure that ran north-south direction.  Leak was located just north of the structure.  Harvest pipeline company (hpc) was contacted by uscg about a small discharge in black bay located at 29 28' 00.97 n -89 30' 53.43" w.    3/20/1

**This topic is related to the personnel's response on site**

In [50]:
gammas[gammas$topic == 15, ]$label <- "response"
gammas[gammas$topic == 15, ]$cause_topic <- FALSE

## Topics 16-20

### Topic 16

In [51]:
get_terms(topic = 16)

topic,term,beta
<int>,<chr>,<dbl>
16,valv,0.36457309
16,close,0.05783387
16,check,0.04355586
16,block,0.02499445
16,oper,0.02285275
16,replac,0.02229749
16,bodi,0.02102833
16,pack,0.02031443


In [52]:
print(get_narrative(16))

Incident ID: 20100248
Operator ID: 30829

Valve would not operate several months ago and operations left the valve partially opened while after attempting to operate it, they did not realize they left it partially opened.  Two months later, a lighter crude was put through the valve, this product ate away at the plug of heavier crude that had clogged up the valve causing the lighter product to flow through the valve that was partially left opened due to the clog and to subsequently leak; then to overfill the sump.  Operated the valve to fully closed.    Valve appeared to be closed because when operations tried to open the valve which was underground, no product went through it and continued to hold product back for several months.  Operations concluded that the valve was closed and did not realize it was very minutely open until the lighter crude washed out the heavier crude plug.      This was the ball valve drains the terminal piping into the sump.    8-22-2011 - the drain ball valve 

**This topic is related to valves and related equipment**

In [53]:
gammas[gammas$topic == 16, ]$label <- "valve"
gammas[gammas$topic == 16, ]$cause_topic <- TRUE

### Topic 17

In [54]:
get_terms(topic = 17)

topic,term,beta
<int>,<chr>,<dbl>
17,leak,0.24252745
17,found,0.05218967
17,trap,0.03494195
17,clean,0.03187764
17,ring,0.03038926
17,pig,0.02802536
17,discov,0.02636187
17,replac,0.0255739


In [55]:
print(get_narrative(17))

Incident ID: 20150038
Operator ID: 32147

A technician found oil had leaked out the enclosure door of a strainer during a routine facility inspection.  Three strainers at st. James capline went through annual preventative maintenance on thursday january 16, 2015. While cleaning the strainers, a technician found the o-rings protruded over the lip of the groove and was crushed flat. The o-ring size could not be confirmed due to damage and age. Manufacturer specifications were unknown at the time. Based on field measurements it was determined that the o-rings needed to be 1/4 inch and a 1/4 inch o-ring was installed on each strainer. On january 19 a technician found oil had leaked out of the enclosure door of strainer 212. Thirty-one gallons were found below the strainer on the ground. The manufacturer was contacted and confirmed that the specification was for a 1/4 inch o-ring. It was determined from in-depth measurements that the groove had irregularities, which would not allow the corr

**This topic is related to the leak and clean up**

In [56]:
gammas[gammas$topic == 17, ]$label <- "leak"
gammas[gammas$topic == 17, ]$cause_topic <- FALSE

### Topic 18

In [57]:
get_terms(topic = 18)

topic,term,beta
<int>,<chr>,<dbl>
18,soil,0.15333931
18,remov,0.08312348
18,contamin,0.05840504
18,remedi,0.05817331
18,site,0.05137574
18,impact,0.05091226
18,dispos,0.0336866
18,sampl,0.02936087


In [58]:
print(get_narrative(18))

Incident ID: 20180375
Operator ID: 39410

Valve manually operated by livestock. The livestock entered through an open gate.   We determined the cause based on cattle footprints observed around the valve. All other logical explanations were eliminated. Completing excavation and blending remediation currently. Re-seed after completion per university lands requirements.     Stingray environmental and construction began excavation on september 21,  2018, to bring all soi.ls above one percent total petroleum hydrocarbons (tph) to the  surface as per swr rule 91(c) 3. The spill was excavated to below rrc regulations,  which yielded approximately 5,861 cubic yards of spill-impacted soil over 1 % tph.  Confirmation soil samples of the excavation bottom were collected for laboratory  analysis oftph using method tx1005 in accordance with rrc guidelines.    Approximately 5,861 cubic yards of spoil material was blended with  approximately 2,500 cubic yards of clean soil from the surrounding area t

**This topic is related to soil, contamination and cleanup**

In [59]:
gammas[gammas$topic == 18, ]$label <- "contamination"
gammas[gammas$topic == 18, ]$cause_topic <- FALSE

### Topic 19

In [60]:
get_terms(topic = 19)

topic,term,beta
<int>,<chr>,<dbl>
19,releas,0.32112348
19,approxim,0.11096079
19,barrel,0.04765673
19,contain,0.04187087
19,result,0.02732115
19,properti,0.0263852
19,station,0.02612994
19,immedi,0.02366245


In [61]:
print(get_narrative(19))

Incident ID: 20190084
Operator ID: 300

Release occurred due to an insulating flange gasket failure on the fill line of tank 1704, resulting in a release of crude oil.  Oil released was confined within secondary containment on paa property.  The date and time (02/14/19, 18:00) provided for the release is reflective of the last time the area of the incident had been visited, and is therefore the earliest possible time that the release could have started.


**This topic is related to the release of oil**

In [62]:
gammas[gammas$topic == 19, ]$label <- "release"
gammas[gammas$topic == 19, ]$cause_topic <- FALSE

### Topic 20

In [63]:
get_terms(topic = 20)

topic,term,beta
<int>,<chr>,<dbl>
20,oil,0.24599413
20,crude,0.15974581
20,spill,0.04922203
20,approxim,0.04408524
20,bbl,0.04002552
20,truck,0.03944556
20,barrel,0.03182323
20,ground,0.02262672


In [64]:
print(get_narrative(20))

Incident ID: 20180323
Operator ID: 31871

Markwest operation's personnel had loaded a pig into the launcher barrel to be pushed with nitrogen to displace the crude oil in the pipeline for abandonment.  A 2" drain valve on the launcher barrel was accidentally left partially open allowing crude oil to fill the sump.  Eight (8) barrels of crude oil spilled out onto the ground, but was contained on our launcher property.  A vacuum truck was on-site for the operation and it was used to sucked up the spilled crude oil that was returned the pipeline system.  The contaminated soil was sucked up separately and placed in a contained vessel for testing and non-hazardous disposal.


**This topic contains general words that describe a spill**

In [65]:
gammas[gammas$topic == 20, ]$label <- "spill"
gammas[gammas$topic == 20, ]$cause_topic <- FALSE

## Topics 21-23

### Topic 21

In [66]:
get_terms(topic = 21)

topic,term,beta
<int>,<chr>,<dbl>
21,fire,0.03924397
21,respons,0.03652385
21,emerg,0.03479286
21,gasolin,0.02918775
21,hour,0.02440693
21,fuel,0.02119224
21,depart,0.02045039
21,employe,0.02028553


In [67]:
print(get_narrative(21))

Incident ID: 20090367
Operator ID: 1845

On december 29, 2009 at 10:00 p.m., buckeye partners’ (buckeye) control center in breinigsville, pa received a call from the aston, pa fire marshall reporting gasoline odors in the area of clearview lane in aston, pa. Buckeye’s 8” pipeline in the area (ct553jp) had been shut down earlier in the day as part of a normal, scheduled shut down and the pipeline pressure was being monitored. After notification from buckeye‘s control center, field personnel arrived at the site and confirmed the presence of gasoline odors.     At 12:45 a.m. On december 30, 2009, a representative of the pennsylvania department of environmental protection (padep) measured gas odors in several adjacent sanitary sewer manholes  as well as a few nearby residences. The fire department requested the evacuation of four residences as a precaution until the vapor levels subsided.  At 1:40 a.m. On december 30, 2009 a unified command was established by buckeye’s local operations man

**This topic is related to fire and emergencies**

In [68]:
gammas[gammas$topic == 21, ]$label <- "fire"
gammas[gammas$topic == 21, ]$cause_topic <- TRUE

### Topic 22

In [69]:
get_terms(topic = 22)

topic,term,beta
<int>,<chr>,<dbl>
22,tank,0.30374284
22,inspect,0.04784685
22,product,0.0452715
22,servic,0.03863803
22,dike,0.02162512
22,floor,0.02045451
22,fill,0.01943998
22,farm,0.01780112


In [70]:
print(get_narrative(22))

Incident ID: 20110009
Operator ID: 4805

Seventy-five gallons of free-phase jet fuel was identified next to a storage tank at explorer pipelines port arthur, texas tank farm. The release was completely contained inside the tank dike. The contents of the tank were transferred to another tank to allow the tank to be cleaned and inspected.    Following cleaning the tank, the tank valve was removed and inspected, a visual inspection of the tank floor was completed, a vacuum box test was completed, a helium test that included probing under the tank through the helium test ports was completed, as well as a magnetic particle testing of the corner welds and the dollar plate was completed. In addition, a guided wave test of the entire suction and fill line as well as a hydrostatic test of the suction and fill line under the tank and to the manifold was completed. However, none of the tests have identified the source of the product that was located by the tank.       A may 30, 2012 metallurgical

**This topic is related to tanks and related facilities and equipment**

In [71]:
gammas[gammas$topic == 22, ]$label <- "tanks"
gammas[gammas$topic == 22, ]$cause_topic <- TRUE

### Topic 23

In [72]:
get_terms(topic = 23)

topic,term,beta
<int>,<chr>,<dbl>
23,analysi,0.06005783
23,weld,0.053316
23,failur,0.05192594
23,crack,0.05067488
23,metallurg,0.02377709
23,result,0.01627073
23,inch,0.01606222
23,sleev,0.01481116


In [73]:
print(get_narrative(23))

Incident ID: 20050073
Operator ID: 12470

Failure occurred in weld located 50 ft from the bank of the kentucky river.  The failure is believed to have been caused by soil subsidence which was caused by high water.  The soil subsidence exerted force on the weld, causing it to crack.  Failed section of pipe has been sent to cc technologies for failure analysis.  The ops failure analysis protocol is being used by cc technologies.  Leak volumes are still estimates subject to upward revision.   5-11-05 the failed section of pipe was taken by cc technologies (cct) to their facility for a failure analysis.  The result of the failure analysis confirmed the presence of a through-wall crack in the 22" girth weld.  The crack extended for 2:30 to the 8:00 orientation, looking downstream, on the bottom of the pipe.  Cct concluded that there were three contributing factors to the pipeline failure as follows: "in conclusion, the three contributing factors in the pipeline failure were a bending stress

**This topic is related to cracks, weld failures, and the metallurgical analyses that follow them**

In [74]:
gammas[gammas$topic == 23, ]$label <- "crack"
gammas[gammas$topic == 23, ]$cause_topic <- TRUE
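
The per-topic cells above repeat the same two assignments for every topic. As a sketch of an alternative (not what this notebook does), the labels could be kept in a single lookup table and joined onto the document-topic table in one step. `topic_labels` and `gammas_demo` are hypothetical names, and the toy values only stand in for the real `gammas`.

```r
library(dplyr)
library(tibble)

# Hypothetical lookup table: one row per topic (only topics 21-23 shown).
topic_labels <- tribble(
  ~topic, ~label,  ~cause_topic,
  21L,    "fire",  TRUE,
  22L,    "tanks", TRUE,
  23L,    "crack", TRUE
)

# Toy stand-in for `gammas`; the gamma values are illustrative, not real data.
gammas_demo <- tibble(
  incident_ID = c("A", "A", "A", "B", "B", "B"),
  topic       = rep(21:23, times = 2),
  gamma       = c(0.2, 0.5, 0.3, 0.6, 0.1, 0.3)
)

# Attach label and cause_topic to every document-topic row at once.
labelled <- left_join(gammas_demo, topic_labels, by = "topic")
print(labelled)
```

With this layout, a topic that was never labelled shows up as `NA` after the join, which makes omissions easy to spot.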

## Save results

In [75]:
head(gammas)

incident_ID,ID,year,commodity,on_offshore,topic,gamma,label,cause_topic
<chr>,<chr>,<dbl>,<chr>,<chr>,<int>,<dbl>,<chr>,<lgl>
20040015,19237,2004,rpp,onshore,1,0.03150599,report_mngt,False
20040015,19237,2004,rpp,onshore,2,0.03150599,spill_mngt,False
20040015,19237,2004,rpp,onshore,3,0.04599874,excavation,True
20040015,19237,2004,rpp,onshore,4,0.17643352,water,True
20040015,19237,2004,rpp,onshore,5,0.03150599,control_center,False
20040015,19237,2004,rpp,onshore,6,0.03150599,service,False


In [76]:
table(unique(paste(gammas$topic, gammas$label)))


   1 report_mngt    10 monitoring      11 pressure     12 corrosion 
               1                1                1                1 
      13 gaskets        14 report      15 response         16 valve 
               1                1                1                1 
         17 leak 18 contamination       19 release     2 spill_mngt 
               1                1                1                1 
        20 spill          21 fire         22 tanks         23 crack 
               1                1                1                1 
    3 excavation          4 water 5 control_center        6 service 
               1                1                1                1 
         7 pumps     8 procedures      9 commodity 
               1                1                1 
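
The table above is a visual check that each topic carries exactly one label (the entries are sorted as character strings, which is why topic 10 follows topic 1). The same check can be sketched as an explicit assertion. `gammas_demo` is a hypothetical toy table standing in for `gammas`.

```r
library(dplyr)
library(tibble)

# Toy stand-in for `gammas` with two labelled topics (illustrative only).
gammas_demo <- tibble(
  topic = c(1L, 1L, 2L, 2L),
  label = c("report_mngt", "report_mngt", "spill_mngt", "spill_mngt")
)

# Count the distinct labels attached to each topic.
labels_per_topic <- gammas_demo %>%
  distinct(topic, label) %>%
  count(topic, name = "n_labels")

# Fail loudly if a topic is unlabelled or has more than one label.
stopifnot(
  !anyNA(gammas_demo$label),
  all(labels_per_topic$n_labels == 1)
)
print(labels_per_topic)
```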

In [77]:
labels <- distinct(select(gammas, topic, label, cause_topic))
labels$topic <- as.character(labels$topic)
head(labels)

topic,label,cause_topic
<chr>,<chr>,<lgl>
1,report_mngt,False
2,spill_mngt,False
3,excavation,True
4,water,True
5,control_center,False
6,service,False


In [78]:
head(gammas)

incident_ID,ID,year,commodity,on_offshore,topic,gamma,label,cause_topic
<chr>,<chr>,<dbl>,<chr>,<chr>,<int>,<dbl>,<chr>,<lgl>
20040015,19237,2004,rpp,onshore,1,0.03150599,report_mngt,False
20040015,19237,2004,rpp,onshore,2,0.03150599,spill_mngt,False
20040015,19237,2004,rpp,onshore,3,0.04599874,excavation,True
20040015,19237,2004,rpp,onshore,4,0.17643352,water,True
20040015,19237,2004,rpp,onshore,5,0.03150599,control_center,False
20040015,19237,2004,rpp,onshore,6,0.03150599,service,False


In [79]:
write_rds(labels, data_folder("labels.rds"))

In [80]:
incidents <- readRDS(data_folder("incidents_merged.rds"))
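incidents_gammas <- gammas %>%
    select(incident_ID, topic, gamma) %>%
    # One row per incident, one topic_<k> column per topic; id_cols is named
    # explicitly because recent tidyr versions do not accept it by position
    pivot_wider(
        id_cols = incident_ID,
        names_from = topic,
        values_from = gamma,
        names_prefix = "topic_"
    ) %>%
    right_join(incidents) %>% # Keep every incident, including any without gammas
    select(-starts_with("topic_"), starts_with("topic_")) # Move topic columns to back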
head(incidents_gammas)

Joining, by = "incident_ID"



incident_ID,DATAFILE_AS_OF,significant,serious,ID,name,state,on_offshore,system,item,⋯,topic_14,topic_15,topic_16,topic_17,topic_18,topic_19,topic_20,topic_21,topic_22,topic_23
<chr>,<dttm>,<lgl>,<lgl>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,⋯,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>
20040015,2020-09-30 05:25:45,False,False,19237,Te Products Pipeline,AR,onshore,Above Ground Storage Tank,Other,⋯,0.03150599,0.03150599,0.04599874,0.03150599,0.03150599,0.03150599,0.03150599,0.03150599,0.06049149,0.03150599
20040018,2020-09-30 05:25:45,True,False,300,All American Pipeline,TX,onshore,"Pump/Meter Station; Terminal/Tank Farm Piping And Equipment, Including Sumps",Pump,⋯,0.02587992,0.02587992,0.04968944,0.03778468,0.02587992,0.03778468,0.02587992,0.02587992,0.02587992,0.10921325
20040025,2020-09-30 05:25:45,True,False,2371,Ciniza Pipeline,NM,onshore,"Onshore Pipeline, Including Valve Sites",Body Of Pipe,⋯,0.03952569,0.05770751,0.03952569,0.03952569,0.03952569,0.03952569,0.03952569,0.03952569,0.03952569,0.03952569
20040026,2020-09-30 05:25:45,False,False,9175,Jayhawk Pipeline,KS,onshore,"Pump/Meter Station; Terminal/Tank Farm Piping And Equipment, Including Sumps",Pump,⋯,0.03396739,0.03396739,0.03396739,0.03396739,0.03396739,0.03396739,0.04959239,0.03396739,0.04959239,0.03396739
20040027,2020-09-30 05:25:45,False,False,2552,Colonial Pipeline,NJ,onshore,"Pump/Meter Station; Terminal/Tank Farm Piping And Equipment, Including Sumps",Valve,⋯,0.02885375,0.03794466,0.09249012,0.02885375,0.04703557,0.01976285,0.01976285,0.02885375,0.02885375,0.01976285
20040028,2020-09-30 05:25:45,False,False,31174,Shell Pipeline,CA,onshore,"Pump/Meter Station; Terminal/Tank Farm Piping And Equipment, Including Sumps",Pump,⋯,0.03606719,0.04743083,0.03606719,0.03606719,0.05879447,0.03606719,0.03606719,0.11561265,0.03606719,0.02470356
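
The reshaping step above can be sketched on toy data to show what the `right_join` does: every incident is kept, and an incident with no rows in the long table ends up with `NA` in its topic columns. `gammas_demo` and `incidents_demo` are hypothetical stand-ins with illustrative values, not the real data.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Toy long-format document-topic table (illustrative values only).
gammas_demo <- tibble(
  incident_ID = c("A", "A", "B", "B"),
  topic       = c(1L, 2L, 1L, 2L),
  gamma       = c(0.7, 0.3, 0.4, 0.6)
)

# Toy incident table; incident "C" has no topic proportions.
incidents_demo <- tibble(
  incident_ID = c("A", "B", "C"),
  state       = c("TX", "KS", "NJ")
)

wide <- gammas_demo %>%
  pivot_wider(
    id_cols = incident_ID,
    names_from = topic,
    values_from = gamma,
    names_prefix = "topic_"
  ) %>%
  right_join(incidents_demo, by = "incident_ID") %>%
  select(-starts_with("topic_"), starts_with("topic_")) # Topic columns last

print(wide)
```

Passing `by = "incident_ID"` explicitly also suppresses the "Joining, by" message that an unkeyed join prints.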


In [81]:
write_rds(incidents_gammas, data_folder("incidents_topics.rds"))