In [None]:
import sys
sys.path.insert(0,'c:/MyDocs/integrated/') # adjust to your setup

%run "catalog_support.py" 
showHeader('Operator Index')

In [None]:
# fetch data set
df = fh.get_df(os.path.join(hndl.sandbox_dir,'workdf.parquet'))

In [None]:
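# keep only disclosures in the standard filtered set; copy so that columns
# added later (e.g. 'year') are not assigned to a view of the original frame
df = df[df.in_std_filtered].copy()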

|Explanation of columns in the index|
| :---: |

| Column      | Description |
| :----: | :-------- |
|*names*| Names used for this Operator. The version in '[]' is the 'bgOperatorName'.  Click on the link for details on that company's fracking record in FracFocus.|
|*num_fracks*| Total number of disclosures in which this company is named as Operator|
|*years*| the years recorded in the disclosures (and the number of disclosures in each year)|
|*states*| the states recorded in the disclosures (and the number of disclosures for each state)|
|*Water, median (gal)*| the median volume of water (gallons) used as carrier|
|*Water, max (gal)*| the maximum volume of water (gallons) used as carrier|
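
As a rough illustration of how these columns relate to the underlying disclosures, the sketch below builds a tiny, made-up table with one row per disclosure and derives the same kinds of summaries. The operator names and water volumes are hypothetical, and the output column names `water_median` and `water_max` are shorthand chosen for this example only.

```python
import pandas as pd

# Hypothetical toy data: one row per disclosure (all values are invented)
toy = pd.DataFrame({
    "DisclosureId": ["d1", "d2", "d3", "d4"],
    "bgOperatorName": ["acme", "acme", "acme", "zenith"],
    "year": ["2019", "2019", "2020", "2020"],
    "TotalBaseWaterVolume": [100.0, 300.0, 500.0, 200.0],
})

# num_fracks, median water and max water for each operator
summary = toy.groupby("bgOperatorName").agg(
    num_fracks=("DisclosureId", "count"),
    water_median=("TotalBaseWaterVolume", "median"),
    water_max=("TotalBaseWaterVolume", "max"),
)

# years column: each year followed by its disclosure count, e.g. "2019 (2)"
counts = toy.groupby(["bgOperatorName", "year"]).size().reset_index(name="n")
counts["year_cnt"] = counts["year"] + " (" + counts["n"].astype(str) + ")"
years = counts.groupby("bgOperatorName")["year_cnt"].apply(list)

print(summary)
print(years)
```

The *states* column follows the same pattern as *years*, with the state name in place of the year.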

## Operators

Companies in the table below are lumped by the curated field 'bgOperatorName'.  This is our attempt to treat similar names as the same company.  The actual names used in FracFocus as 'OperatorName' are listed here above the 'bgOperatorName' (which is in brackets). 

For Operators, there is typically only one name because it must be an entity registered with FracFocus.  Nevertheless, please let us know if you suspect we made a mistake in lumping.
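
One way to review the lumping yourself is to list every reported 'OperatorName' that falls under each 'bgOperatorName' and look at the groups with more than one variant. The sketch below does this on a small, invented table. The company names are hypothetical, and a real check would start from the full disclosure data rather than this toy frame.

```python
import pandas as pd

# Hypothetical toy data: reported names and the curated name they are lumped under
toy = pd.DataFrame({
    "OperatorName": ["Acme Oil LLC", "ACME OIL, LLC", "Zenith Energy", "Acme Oil LLC"],
    "bgOperatorName": ["acme oil", "acme oil", "zenith energy", "acme oil"],
})

# Collect the distinct reported names for each curated name
variants = toy.groupby("bgOperatorName")["OperatorName"].apply(
    lambda s: sorted(set(s))
)

# Curated names that lump together more than one reported name
multi = variants[variants.map(len) > 1]

print(multi)
```

Groups that combine names you believe belong to unrelated companies are the ones worth reporting.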

If you are interested in Operator activity in a particular state, try the Operator tables in the [State pages](Open-FF_States_and_Counties.html).

### Company ownership

Within the fracking industry, company ownership is remarkably fluid.  Acquisitions and mergers are quite common.  However, the FracFocus names used may not reflect changes in ownership.  **We do not attempt to adjust the Operator Names to reflect actual ownership** at the time of disclosure.  All values of "OperatorName" are as the operator reported them.

### Contacting operators
Operator names do not always identify a company uniquely; separate companies may share the same name. Because of that, it can be difficult to contact a company with questions or comments about its disclosures.  While FracFocus has contact information for the companies that submit disclosures, it does NOT release that information.  In their own words, from a 2023 email:

>**FracFocus data submission is a state agency requirement for many states and as such we are only authorized to release information provided directly on the individual disclosures.  We cannot release any additional information than what is already supplied on the website.**

When we asked how we might contact the Operator when we found disclosure errors, FracFocus Support responded:

>**If you wish to supply us with the information you have compiled, we will review and attempt to relay it to any active operator logins when possible.**



In [None]:
def make_water(row):
    s = str(th.round_sig(row.TotalBaseWaterVolume,3))
    s += '<br>'
    s += str(th.round_sig(row.TBWV90,3))
    return s

df['year'] = df.date.dt.year.astype(str)
gbOp = df.groupby(['DisclosureId','bgStateName','bgOperatorName','year'],as_index=False)['TotalBaseWaterVolume'].first()
gbOp.bgStateName = gbOp.bgStateName.str.title()

gbOp1 = gbOp.groupby('bgOperatorName',as_index=False)['DisclosureId'].count().rename({'DisclosureId':'num_fracks'},axis=1)

# gbOp2 = gbOp.groupby('bgOperatorName',as_index=False)['year'].agg(['min','max'])
# gbOp2.rename({'min':'yr_min','max':'yr_max'},axis=1,inplace=True)
# gbOp2['years'] = gbOp2.apply(lambda x:make_years(x),axis=1)

gbOp2 = gbOp.groupby(['bgOperatorName','year'],as_index=False)['DisclosureId'].count()
gbOp2['year_cnt'] = gbOp2.year + ' (' + gbOp2.DisclosureId.astype(str) + ')'

gbOpY = gbOp2.groupby('bgOperatorName')['year_cnt'].apply(set).reset_index()
gbOpY['years'] = gbOpY.year_cnt.map(lambda x: th.xlate_to_str(x,sep='<br>'))

gbOp3 = gbOp.groupby(['bgOperatorName','bgStateName'],as_index=False)['DisclosureId'].count()
gbOp3['states_cnt'] = gbOp3.bgStateName + ' (' + gbOp3.DisclosureId.astype(str) + ')'

gbOp4 = gbOp3.groupby('bgOperatorName')['states_cnt'].apply(set).reset_index()
gbOp4['states'] = gbOp4.states_cnt.map(lambda x: th.xlate_to_str(x,sep='<br>'))

# gbOp5 = df.groupby('bgOperatorName')['OperatorName'].agg(lambda x: x.value_counts().index[0])
gbOp5 = df.groupby('bgOperatorName')['OperatorName'].apply(set).reset_index()
gbOp5['names'] = gbOp5.OperatorName.map(lambda x: th.xlate_to_str(x,sep='<br>'))

gbOp6 = gbOp.groupby('bgOperatorName',as_index=False)['TotalBaseWaterVolume'].median()
gbOp6.rename({'TotalBaseWaterVolume':'Water, median (gal)'},axis=1,inplace=True)
# gbOp7 = gbOp.groupby('bgOperatorName',as_index=False)['TotalBaseWaterVolume'].agg(lambda x: np.percentile(x,90))
gbOp7 = gbOp.groupby('bgOperatorName',as_index=False)['TotalBaseWaterVolume'].max()
gbOp7.rename({'TotalBaseWaterVolume':'Water, max (gal)'},axis=1,inplace=True)
mg = pd.merge(gbOp6,gbOp7,on='bgOperatorName')
# mg.fillna(0,inplace=True)
# mg['TBWV'] = mg.apply(lambda x: make_water(x),axis=1)
mg = pd.merge(mg,gbOp1,on='bgOperatorName')
mg = pd.merge(mg,gbOpY,on='bgOperatorName')
mg = pd.merge(mg,gbOp4[['bgOperatorName','states']],on='bgOperatorName')
mg = pd.merge(mg,gbOp5,on='bgOperatorName').sort_values('num_fracks',ascending=False)
mg['link'] = mg.bgOperatorName.map(lambda x: th.getOpLink(x,x))
mg.names = '<center><h3>'+mg.names+'<br><br>'+mg.link+'</h3></center>'

iShow(mg[['names','num_fracks','years','states','Water, median (gal)','Water, max (gal)']].reset_index(drop=True),
      maxBytes=0,columnDefs=[{"width": "150px", "targets": 0}])