## Using the Requests module to get a file

Documentation for Requests is available at https://requests.readthedocs.io/en/latest/

This demonstration simply requests a file from the HHS Open Data portal: https://healthdata.gov/dataset/electronic-health-record-ehr-incentive-program-payments-eligible-hospitals

In this example, we get the file from HHS, inspect some interesting information about it, and then write the data to a local file.

In [None]:
import requests

In [None]:
r = requests.get('http://dhcs-chhsagency.opendata.arcgis.com/datasets/8e4f3a0c75b9424d888d11c1f949cc32_0.csv?outSR={%22latestWkid%22:3857,%22wkid%22:102100}')

In [None]:
r.status_code
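
Inspecting `r.status_code` by hand works in a notebook, but a script usually wants the failure raised automatically. The following is a minimal sketch, not part of the original demonstration: `fetch_text` is a hypothetical helper name and the 30-second timeout is an assumed value. It relies only on `requests.get`, its `timeout` argument, and `Response.raise_for_status()`.

```python
import requests


def fetch_text(url, timeout=30):
    """Return the body of `url` as text, raising on HTTP error statuses.

    `fetch_text` and the 30-second default are illustrative assumptions.
    """
    response = requests.get(url, timeout=timeout)
    # Raises requests.HTTPError for 4xx and 5xx status codes
    response.raise_for_status()
    return response.text
```

Calling `fetch_text(...)` with the dataset URL used above would return the same string as `r.text`, or raise if the server reports an error.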

In [None]:
len(r.text)

In [None]:
r.text[0:1000]

In [None]:
# newline='' writes the text exactly as received, without translating line endings
with open('nadac.csv', 'w', newline='') as f:
    f.write(r.text)

In [None]:
import json
print(json.dumps(dict(r.headers), indent=4))

### Total payments in this file?

In [None]:
import csv

In [None]:
total = 0

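# newline='' is what the csv module recommends when opening files for csv.reader
with open('nadac.csv', newline='') as f:
    reader = csv.reader(f)

    # The first row holds the column names
    header = next(reader)
    print(header)
    payments_idx = header.index('Total_payments')

    for record in reader:
        # float() handles both whole-dollar and decimal amounts
        total += float(record[payments_idx])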



In [None]:
print("CA hospitals have received ${:,.2f} in payments.".format(total))
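
The loop above assumes every row has a parseable value in `Total_payments`. The sketch below shows one possible defensive variant using `csv.DictReader`, which looks columns up by name instead of by index. The sample rows and the `Hospital` column are made up for illustration and stand in for the downloaded file; only `Total_payments` comes from the real data.

```python
import csv
import io

# Made-up sample rows standing in for the downloaded file (an assumption).
sample = io.StringIO(
    "Hospital,Total_payments\n"
    "A,1000\n"
    "B,\n"
    "C,2500.50\n"
)

total = 0.0
skipped = 0

for row in csv.DictReader(sample):
    value = (row["Total_payments"] or "").strip()
    if not value:
        # Blank amount: count the row instead of crashing on it
        skipped += 1
        continue
    total += float(value)

print("Total: ${:,.2f} ({} rows skipped)".format(total, skipped))
```

Whether skipping blank amounts is appropriate depends on the dataset; counting the skipped rows at least makes the gap visible.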

## Reading internet files with Pandas

When `pd.read_csv` is given an HTTP URL instead of a local file path, pandas downloads the data from the internet and parses it directly.

Documentation for the pandas I/O tools is available at https://pandas.pydata.org/pandas-docs/version/0.23.4/io.html


In [None]:
import pandas as pd

In [None]:
df = pd.read_csv('http://dhcs-chhsagency.opendata.arcgis.com/datasets/8e4f3a0c75b9424d888d11c1f949cc32_0.csv')

In [None]:
df.head()

In [None]:
df.shape

In [None]:
print("CA hospitals have received ${:,.2f} in payments.".format(df['Total_payments'].sum()))
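
`pd.read_csv` also accepts file-like objects, so text that has already been fetched with Requests can be parsed without a second download. This is a minimal sketch under stated assumptions: `csv_text` is made-up sample data standing in for `r.text`, and `df_local` and the `Hospital` column are hypothetical names.

```python
import io

import pandas as pd

# Made-up CSV text standing in for r.text from the Requests example (an assumption).
csv_text = "Hospital,Total_payments\nA,1000\nB,2500\n"

# io.StringIO wraps the string in a file-like object that read_csv can consume
df_local = pd.read_csv(io.StringIO(csv_text))

total = float(df_local["Total_payments"].sum())
print("Total: ${:,.2f}".format(total))
```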