Replication files for Covert and Sweeney (2019)

Code and Data for Covert and Sweeney (2019)

The data can be downloaded at this Dropbox link.

To recreate the results and analysis datasets from the raw files, create a data.txt file in the root code directory. This file should contain a single line: the path to where you placed the data folder on your hard drive, such as C:/Users/JaneDoe/Dropbox/texas/public. Then navigate to the code folder on your command line and type make.

  • Mac users should already have make.
  • Windows users can install Chocolatey and then type choco install make on the command line.
    • Note that make cannot be run on Windows computers if there is a space in the file path. In this circumstance, the user can create a junction.
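The setup steps above can be sketched as a short shell session. This is only an illustration: the DATA_DIR variable and its default path are hypothetical placeholders, and the sketch assumes it is run from the root code directory and that the Makefile lives there.

```shell
# Hypothetical location of the downloaded data folder; replace with your own.
DATA_DIR="${DATA_DIR:-$HOME/Dropbox/texas/public}"

# Write that single path, and nothing else, into data.txt in the current directory.
printf '%s\n' "$DATA_DIR" > data.txt

# Run the build only if a Makefile is actually present here.
if [ -f Makefile ]; then
  make
else
  echo "No Makefile found in $(pwd); change to the code folder before running make."
fi
```

On Windows, the same commands should work from a Unix-style shell such as Git Bash, provided the path contains no spaces.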

To run the code, make sure the following packages are installed in R:

  • data shaping packages: tidyverse, lubridate, readxl, pdftools, sf, fst, tictoc, furrr, raster, lwgeom
  • tables and figures: knitr, grid, gridExtra, gtable, tigris, broom, kableExtra
  • regressions: lmtest, sandwich, splines, lfe, modelr, Formula, grf
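Before running make, it can help to check which of these packages are missing. The sketch below is one way to do that from the command line, not part of the replication code. The PKGS and R_VEC variable names are made up for this example, and it assumes Rscript is on your PATH. It only reports missing packages and installs nothing. Note that grid and splines ship with R itself, so they should never appear as missing.

```shell
# Packages listed in this README (variable names here are illustrative only).
PKGS="tidyverse lubridate readxl pdftools sf fst tictoc furrr raster lwgeom
knitr grid gridExtra gtable tigris broom kableExtra
lmtest sandwich splines lfe modelr Formula grf"

# Turn the list into an R character vector: c("tidyverse","lubridate",...)
R_VEC=$(printf '"%s",' $PKGS)
R_VEC="c(${R_VEC%,})"

# Print the packages that are not yet installed (prints nothing if all are present).
if command -v Rscript >/dev/null 2>&1; then
  Rscript -e "pkgs <- $R_VEC; missing <- setdiff(pkgs, rownames(installed.packages())); cat(missing, sep = '\n')"
else
  echo "Rscript not found on PATH; install R first."
fi
```

Any names it prints can then be passed to install.packages() in an R session.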

Raw Data:

  • leases/Active: The last lease transaction on active state-owned leases in Texas. This was downloaded in January of 2017 from the General Land Office (GLO) GIS database.
  • leases/Inactive: The last lease transaction on inactive state-owned leases in Texas. The data is not available online but was sent by the GLO in January of 2017.
  • LeaseLandFile: Sent by GLO in February of 2017. This file consists of Control Numbers (administrative number used by the GLO) for each lease. The first two digits of the control number indicate the lease type.
  • tablMineralLeaseAllInfo: Sent by GLO in February of 2017. This dataset contains the information in the active and inactive lease files, plus leases that were not mapped by the GIS department (usually because they are old and irrelevant for the purposes of GLO).
  • bids/final_bids: GLO bids are available online as PDFs. The PDFs were manually transcribed and saved as Excel sheets.
  • payments/rentals.xlsx: Obtained through a public information request in June of 2017.
  • payments/royalties_2019.01.02: Royalty data for leases going back to 1/1/2005. Obtained through a public information request on 1/2/2019.
  • prices: Historical spot prices for production were downloaded in November 2018 from the EIA. Oil prices can be found here, and gas prices can be found here.

Shapefiles:


  • Basins and Shale_Plays: Downloaded from the EIA. We selected the major shale plays: Barnett, Haynesville, Spraberry, Delaware, and Eagle Ford.
  • Land_Cover: Downloaded in November 2017 from here. The original dataset includes the entire US. The raster file was clipped in ArcGIS to just Texas.
  • infrastructure/txdot-roads_tx: Road data was downloaded in August 2017 from the Transportation Planning and Programming (TPP) Division of the Texas Department of Transportation (TxDOT), which maintains a spatial dataset of roadway polylines for planning and asset inventory purposes, found here. This includes data associated with On-System highways, County Roads, Functional Classified City Streets, Toll Roads and Local Streets.
  • infrastructure/usgs-rivers_tx: Downloaded in September 2018 from the U.S. Geological Survey. This National Hydrography Dataset is a comprehensive set of digital spatial data that encodes information about naturally occurring and constructed bodies of water, paths through which water flows, and related entities.
  • us_county: Census county shapefiles downloaded in January 2017 from the Census website.

Intermediate Data

This folder contains datasets that involved manual entry to create. We have only included the final .Rda files.

  • addenda: While GLO state leases are uniform, RAL leases often have additional addenda clauses. These RAL lease contract terms can be found in the public contracts online, and were entered and categorized manually in August 2018.
  • assignments: Obtained on December 5th, 2017 through a public information request. We manually fixed around 200 assignment dates that were entered incorrectly, where the assignment date was earlier than the effective date.
  • coversheets: All RAL/GLO contracts are available online, and all RAL leases should have a review sheet that contains information on the proposed terms of a lease. The terms from the review sheets were manually entered in July 2017.
  • recommended_rentals: Also manually entered from the coversheets. Shows what the GLO recommended the delay rentals should be for a particular lease.
  • company_names: Dictionary of firm names used to standardize lessee names.
  • texas_grids: Grids created using the bounding box of our parcels data.
  • glo_notices: GLO notices are available from the same source as the bids. We used a combination of R and manual input to read in the PDFs.
  • river_streams.Rda: .Rda version of river streams, from the source for infrastructure/usgs-rivers_tx, listed under Shapefiles. The original river streams .shp file was too large to upload for public use.
  • leases: Manual corrections to lease variables.

Private Data

  • parcels: Public School Land parcel-level data was purchased from P2E in December 2017. While we cannot publish the raw data, we have included the code that cleans and analyzes it: Data_Cleaning/clean_parcels.R, Analysis/parcel_selection.R, and Analysis/parcel_stats.R.
  • DI wells: We also have data from DrillingInfo on the location of wells, which we use in the parcels cleaning process. If a user has access to DI data and would like to replicate how we use this dataset, we can email the user our code.