Purpose:
    1. Download all necessary data from the Fire and Aviation Management Web Application (FAMWEB) website
    2. Unpack all data
    3. Select necessary dataframes from each Access Database to be fed into the cleaning routine.

Import all necessary packages

In [1]:
import os.path as op
import os
import matplotlib.pyplot as plt
from urllib.request import urlretrieve
import subprocess
import fnmatch
import stat
import pypyodbc
import csv

plt.style.use('ggplot')

Create a list of all web URLs to pull the data from

In [4]:
# Build one download URL per year, 1999 through 2017
data_urls = []
for i in range(1999, 2018):
    data_urls.append('https://fam.nwcg.gov/fam-web/sit/sit_' + str(i) + '.exe')

Create a directory where the raw ICS-209 data will live.

In [5]:
download_dirs = './raw'
if not op.exists(download_dirs):
    os.makedirs(download_dirs)

In [6]:
# For every URL in the list
for url in data_urls:
    # Split on the rightmost / and take everything on the right side of that
    name = url.rsplit('/', 1)[-1]

    # Combine the name and the downloads directory to get the local filename
    filename = os.path.join(download_dirs, os.path.basename(name))

    # Download the file if it does not exist
    if not os.path.isfile(filename):
        urlretrieve(url, filename)
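
One weakness of the loop above is that an interrupted download leaves a partial file behind, and the `os.path.isfile` check will then skip it on the next run. A minimal sketch of one way around this follows. The helper name `download_if_missing` and the `.part` suffix are hypothetical and not part of the original notebook.

```python
import os
from urllib.request import urlretrieve


def download_if_missing(url, dest_dir):
    """Download url into dest_dir unless the file is already there.

    Hypothetical helper: the name and the '.part' suffix are assumptions
    made for this sketch.
    """
    os.makedirs(dest_dir, exist_ok=True)

    # Same naming rule as the loop above: everything after the last '/'
    filename = os.path.join(dest_dir, url.rsplit('/', 1)[-1])
    if os.path.isfile(filename):
        return filename

    # Download under a temporary name first, so an interrupted transfer
    # is not mistaken for a complete file on the next run
    partial = filename + '.part'
    urlretrieve(url, partial)
    os.replace(partial, filename)
    return filename
```

With this helper the download loop would reduce to calling `download_if_missing(url, download_dirs)` for each entry in `data_urls`.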

Unpack all of the exe datafiles

In [9]:
def filetree(top):
    for dirpath, dirnames, fnames in os.walk(top):
        for fname in fnames:
            if fnmatch.fnmatch(fname, '*.exe'):
                yield os.path.join(dirpath, fname)
                        
files_to_unpack = list(filetree(download_dirs))

# Add the owner-execute bit to each file's existing permissions
for i in files_to_unpack:
    st = os.stat(i)
    os.chmod(i, st.st_mode | stat.S_IEXEC)

Note (on April 5, 2018):

The *.exe* files are not unpacking cleanly. Running them raises `OSError: [Errno 8] Exec format error: `. I suspect this is because each .exe file is not a typical program but instead behaves more like a zipped archive. For now I have manually extracted the Access databases so I can code up the layer extractions.

Below is the code that was not working and the associated error. An alternative to `subprocess.call` is `os.system`. With `os.system` no error is raised, but nothing happens either.

In [10]:
for i in files_to_unpack:
    subprocess.call(i)

OSError: [Errno 8] Exec format error: './raw_ics209/sit_2008.exe'
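
If the hunch in the note is right and each .exe is a ZIP-based self-extracting archive, Python's standard `zipfile` module may be able to read it directly without executing anything. The sketch below is untested against the actual FAMWEB files, and the ZIP assumption is unconfirmed. The function name `extract_self_extracting` is hypothetical.

```python
import os
import zipfile


def extract_self_extracting(exe_path, out_dir):
    """Extract exe_path into out_dir if it can be read as a ZIP archive.

    Returns the list of extracted member names, or an empty list when
    the file is not readable as a ZIP archive.
    """
    # ASSUMPTION: the .exe is a self-extracting ZIP. This has not been
    # confirmed for the FAMWEB downloads; other archive formats will
    # simply be skipped here.
    if not zipfile.is_zipfile(exe_path):
        return []

    os.makedirs(out_dir, exist_ok=True)
    with zipfile.ZipFile(exe_path) as archive:
        archive.extractall(out_dir)
        return archive.namelist()
```

A possible use is to loop over `files_to_unpack` and call `extract_self_extracting(i, './unpacked')` for each file. An empty return value would indicate that the file uses some other format and still needs manual extraction.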

Assuming you either extracted each .exe file manually or got the code snippet above to work, we can now start extracting the needed layers from the Microsoft Access database for each year.

In [None]:
unpacked_dir = './unpacked'
if not op.exists(unpacked_dir):
    os.makedirs(unpacked_dir)

In [None]:
# MS ACCESS DB CONNECTION
pypyodbc.lowercase = False
conn = pypyodbc.connect(
    r"Driver={Microsoft Access Driver (*.mdb, *.accdb)};" +
    r"Dbq=./unpacked_ics209/sit_1999.mdb;")

# OPEN CURSOR AND EXECUTE SQL
cur = conn.cursor()
cur.execute("SELECT * FROM Table1")

# OPEN CSV AND ITERATE THROUGH RESULTS
with open('Output.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    for row in cur.fetchall():
        writer.writerow(row)

cur.close()
conn.close()
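
The export above writes rows only, so the resulting CSV has no column names. A minimal sketch of a reusable export that adds a header row follows. It relies on the `description` attribute that the Python DB-API (PEP 249) defines for cursors. The helper name `cursor_to_csv` is hypothetical, and the sketch has not been run against an Access database.

```python
import csv


def cursor_to_csv(cur, csv_path):
    """Write the results of an already-executed cursor to csv_path.

    The first line of the file holds the column names. Returns the
    number of data rows written.
    """
    # Each entry of cur.description describes one column; item 0 is its name
    header = [col[0] for col in cur.description]

    rows = cur.fetchall()
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)
```

After `cur.execute(...)`, a call such as `cursor_to_csv(cur, 'Output.csv')` would stand in for the `with open(...)` block. Looping over table names and years would then require changing only the query and the output path.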