Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automate the gathering of county limit data #108

Merged
merged 12 commits into from
Feb 2, 2017
Merged

Conversation

higs4281
Copy link
Member

@higs4281 higs4281 commented Jan 25, 2017

This annual task involves downloading fixed-width files from HUD,
formatting them, combining them and producing a final file that can be
loaded into the owning-a-home API. Included here are a script to
automate that process and a management command to run the script.

Testing

  • Run the new manage.py command for 2017:
python manage.py gather_limit_data --year 2017

It will fetch, process and save CSVs, but git status should show no changes, because the data files have already been updated for this year.

If you run the command without a year, it will default to seeking next year's data files, which are not yet available. So you should see the failure message beginning with "Script failed to process all files," and no files should be saved.

python manage.py gather_limit_data

@mthibos If stars align, we should be able to run the bare command next December and grab 2018 values.

This annual task involves downloading fixed-width files from HUD,
formatting them, combining them and producing a final file that can be
loaded into the owning-a-home API. Included here are a script to
automate that process and a management command to run the script.
@higs4281 higs4281 requested a review from chosak January 25, 2017 13:52
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 99.245% when pulling 42463a4 on data-automation2 into 8a92993 on master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.1%) to 99.245% when pulling fb1edda on data-automation2 into 8a92993 on master.

…nd-alone instance with csvkit installed or in a cfgov environment with no csvkit
@coveralls
Copy link

Coverage Status

Coverage increased (+0.003%) to 99.365% when pulling 1fb03ad on data-automation2 into 8a92993 on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.004%) to 99.366% when pulling 34d2541 on data-automation2 into 8a92993 on master.

return final_data


def get_chums_data(year=(datetime.date.today().year + 1)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be an edge case, but be careful assigning default values like this. The datetime.date.today().year call happens only once, when the module gets imported, not every time the method gets called with no year argument. It's unlikely to happen, but I have done something similar in the path with

def do_something_based_on_date(date=datetime.date.today()):
    ...

and then spent a bunch of time debugging why my jobs that happened to run around midnight gave strange results. A safer pattern might be to do something like

def get_chums_data(year=None):
    year = year or datetime.date.today().year
    ...

"Data can be loaded with this command: \n"
"`python manage.py load_county_limits "
"data/county_limit_data_latest.csv --confirm=y`")
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From PEP8:

A bare except: clause will catch SystemExit and KeyboardInterrupt exceptions, making it harder to interrupt a program with Control-C, and can disguise other problems. If you want to catch all exceptions that signal program errors, use except Exception: (bare except is equivalent to except BaseException: ).

@coveralls
Copy link

Coverage Status

Coverage increased (+0.06%) to 99.424% when pulling 03c9dd9 on data-automation2 into 8a92993 on master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.06%) to 99.424% when pulling d12c6e1 on data-automation2 into 8a92993 on master.

@higs4281
Copy link
Member Author

higs4281 commented Feb 2, 2017

Thank you @chosak for the suggestions.

@higs4281 higs4281 merged commit e3ca5d6 into master Feb 2, 2017
@higs4281 higs4281 deleted the data-automation2 branch February 2, 2017 20:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants