# Overview
Regulations.gov hosts the publicly submitted comments that are part of Sec. Zinke's monuments review. You can browse the comments there and download a CSV that lists the comment numbers. This project downloads the comments themselves into a CSV dataset using the Regulations.gov API.

[Docket Browser](https://www.regulations.gov/docketBrowser?rpp=25&so=DESC&sb=commentDueDate&po=0&dct=PS&D=DOI-2017-0002)

[Docket Description](https://www.regulations.gov/document?D=DOI-2017-0002-0001)

# Instructions
To use this code yourself, you'll need to register for a free API key here: [API Key Request](https://regulationsgov.github.io/developers/key/)

Set your API key as the `API_KEY` environment variable, or update the first code cell below with it.

## Environment
* Python 3

## Example Document
```python
{'agencyAcronym': 'DOI',
 'allowLateComment': False,
 'attachmentCount': 0,
 'commentDueDate': '2017-07-10T23:59:59-04:00',
 'commentStartDate': '2017-05-11T00:00:00-04:00',
 'commentText': "National Monuments are an important part of America's history and we need to protect more land for future generations, not less. ",
 'docketId': 'DOI-2017-0002',
 'docketTitle': 'Review of Certain National Monuments Established Since 1996; Notice of Opportunity for Public Comment',
 'docketType': 'Nonrulemaking',
 'documentId': 'DOI-2017-0002-0007',
 'documentStatus': 'Posted',
 'documentType': 'Public Submission',
 'numberOfCommentsReceived': 1,
 'openForComment': True,
 'postedDate': '2017-05-11T00:00:00-04:00',
 'title': 'Comment on FR Doc # 2017-09490'}
```
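
The date fields in a document are ISO 8601 strings with a UTC offset. As a minimal sketch, assuming Python 3.7 or later (for `datetime.fromisoformat`), they can be parsed with the standard library. The `doc` dictionary here simply copies two fields from the example above.

```python
from datetime import datetime

# Two fields copied from the example document
doc = {
    'commentStartDate': '2017-05-11T00:00:00-04:00',
    'commentDueDate': '2017-07-10T23:59:59-04:00',
}

# fromisoformat keeps the -04:00 offset, so the results are timezone-aware
start = datetime.fromisoformat(doc['commentStartDate'])
due = datetime.fromisoformat(doc['commentDueDate'])

# Length of the comment period as a timedelta
comment_period = due - start
print(comment_period.days)
```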

In [None]:
import os
import requests
import json
import csv

YOUR_API_KEY = 'set your API key as an environment variable, or here'

API_KEY = os.getenv('API_KEY', YOUR_API_KEY)

In [None]:
doc_count_url = 'https://api.data.gov:443/regulations/v3/documents.json?api_key=%s&countsOnly=1&dct=PS&dktid=DOI-2017-0002' % API_KEY
r = requests.get(doc_count_url)
# Fail loudly on a bad response, since later cells depend on `records`
r.raise_for_status()
result = r.json()
records = result['totalNumRecords']
print('Records available:', records)
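
The count URL above is assembled by string formatting. As an alternative sketch, the same query string can be built from a dictionary with the standard library's `urlencode`, which keeps each parameter visible and handles escaping. The names `BASE_URL` and `build_count_url` are hypothetical helpers, not part of the original notebook, and `'YOUR_KEY'` is a placeholder.

```python
from urllib.parse import urlencode

BASE_URL = 'https://api.data.gov:443/regulations/v3/documents.json'


def build_count_url(api_key, docket_id='DOI-2017-0002'):
    """Build the counts-only query URL for public submissions in a docket."""
    params = {
        'api_key': api_key,
        'countsOnly': 1,      # return only the record count
        'dct': 'PS',          # document type: public submission
        'dktid': docket_id,   # docket to query
    }
    return BASE_URL + '?' + urlencode(params)


count_url = build_count_url('YOUR_KEY')
print(count_url)
```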

In [None]:
# This will break if the number of posted comments exceeds 1 million,
# since there is a rate limit of 1,000 queries per hour and each query
# returns at most 1,000 comments.
# If it breaks, ask Regulations.gov to increase the cap, or modify the
# code to run fewer than 1,000 queries per hour.

documents_url = 'https://api.data.gov:443/regulations/v3/documents.json?api_key=%s&dct=PS&dktid=DOI-2017-0002&rpp=1000&po=%s&sb=docId&so=ASC'
documents = list()

offset = 0
while offset < records:
    print('Downloading comments {0}-{1} ({2:.1f}%)'.format(offset, offset + 999, float(offset) / float(records) * 100))
    r = requests.get(documents_url % (API_KEY, offset))
    if r.status_code == 200:
        result = r.json()
        # Append this page of comments to the running list
        documents.extend(result['documents'])
        offset += 1000
    else:
        raise Exception('non-200 response code: {0}'.format(r.status_code))

print('Done!')
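
The download loop sends its requests back to back. One way to stay under the 1,000-queries-per-hour limit noted in the comments is to pause between pages. The sketch below is an illustration under stated assumptions, not part of the original notebook: `fetch_all`, `fetch_page`, and `min_interval` are hypothetical names, and `fetch_page` stands in for the `requests.get` call so the logic can run without network access.

```python
import time

PAGE_SIZE = 1000                         # rpp value used in the notebook
QUERIES_PER_HOUR = 1000                  # rate limit stated in the notebook comments
MIN_INTERVAL = 3600 / QUERIES_PER_HOUR   # seconds to wait between requests


def fetch_all(fetch_page, total_records, min_interval=MIN_INTERVAL, sleep=time.sleep):
    """Collect every page of documents, pausing between requests.

    fetch_page(offset) is a hypothetical callable that returns the list
    of documents for the page starting at `offset`.
    """
    documents = []
    for i, offset in enumerate(range(0, total_records, PAGE_SIZE)):
        if i > 0:
            sleep(min_interval)  # wait before every request after the first
        documents.extend(fetch_page(offset))
    return documents


# Stand-in page fetcher so the sketch runs offline
def fake_fetch_page(offset):
    return [{'documentId': 'doc-%d' % offset}]


pauses = []  # record the requested pauses instead of actually sleeping
docs = fetch_all(fake_fetch_page, total_records=2500, sleep=pauses.append)
print(len(docs), pauses)
```

With the real API, `fetch_page` would wrap `requests.get(documents_url % (API_KEY, offset))` and return `r.json()['documents']`.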

In [None]:
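import csv
import os

# Create the output directory if needed; newline='' avoids blank rows on Windows
os.makedirs('dataset', exist_ok=True)
with open('dataset/comments.csv', 'w', newline='', encoding='utf-8') as f:
    field_names = ['documentId', 'postedDate', 'attachmentCount', 'commentText']
    # extrasaction='ignore' drops document fields not listed in field_names
    writer = csv.DictWriter(f, field_names, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(documents)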
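
Each document returned by the API carries more fields than the four written to the CSV. The sketch below shows how `extrasaction='ignore'` handles that, using an in-memory buffer in place of `dataset/comments.csv`. The `sample` record is a subset of the example document shown earlier; `buffer` and `rows` are illustrative names.

```python
import csv
import io

# Subset of the example document, including one field that is not exported
sample = {
    'documentId': 'DOI-2017-0002-0007',
    'postedDate': '2017-05-11T00:00:00-04:00',
    'attachmentCount': 0,
    'commentText': "National Monuments are an important part of America's "
                   "history and we need to protect more land for future "
                   "generations, not less. ",
    'docketType': 'Nonrulemaking',
}
field_names = ['documentId', 'postedDate', 'attachmentCount', 'commentText']

# Write to an in-memory buffer instead of a file on disk
buffer = io.StringIO()
writer = csv.DictWriter(buffer, field_names, extrasaction='ignore')
writer.writeheader()
writer.writerow(sample)

# Read the data back to confirm which columns were kept
buffer.seek(0)
rows = list(csv.DictReader(buffer))
print(list(rows[0].keys()))
```

Values read back from a CSV are always strings, so `attachmentCount` comes back as `'0'` and needs converting with `int()` before numeric use.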