Figure out the proper timezones for DateTimeField values #1457
After inspecting the raw .csv files (using agate), I've found seven DateTimeFields with values where the time part is something other than 12:00:00 AM. In all but one case, though, these real times are a small percentage of all non-null values in each column:
To properly preserve this data, we need to know the timezone that goes along with each of these time values. But the raw data doesn't include any timezone info, and I haven't yet found anything about timezones for time values in the official documentation.
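For anyone who wants to repeat the check described above without agate, here is a rough sketch using only the standard library. The raw value format (`7/27/2000 12:00:00 AM` style), the helper name `count_real_times`, and the sample rows are all assumptions made up for illustration, not taken from the actual CAL-ACCESS files.

```python
import csv
import io
from datetime import datetime, time

# Assumed raw format, e.g. "7/27/2000 12:00:00 AM" (not confirmed by the docs)
RAW_FORMAT = "%m/%d/%Y %I:%M:%S %p"


def count_real_times(rows, column, fmt=RAW_FORMAT):
    """Return (values with a non-midnight time, non-null values) for a column."""
    non_null = 0
    real = 0
    for row in rows:
        value = (row.get(column) or "").strip()
        if not value:
            continue  # treat empty strings as NULL
        non_null += 1
        if datetime.strptime(value, fmt).time() != time(0, 0):
            real += 1
    return real, non_null


# Made-up sample standing in for one of the raw .csv files
sample = io.StringIO(
    "FILING_ID,RPT_DATE\n"
    "1,7/27/2000 12:00:00 AM\n"
    "2,3/1/2016 2:45:10 PM\n"
    "3,\n"
)
real, non_null = count_real_times(csv.DictReader(sample), "RPT_DATE")
print(f"{real} of {non_null} non-null values have a real time")
```

Dividing the first number by the second gives the share of real times per column.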
There are two possibilities I can imagine:
I suppose it's also possible that not all of these fields fall under the same scenario.
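To make it concrete why the timezone matters, here is a small sketch of how the same naive value lands on different instants depending on how it is read. The two readings shown (California local time vs. UTC) are only hypothetical examples I picked for illustration, and the sample value is made up.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

# Made-up naive value, as it would come out of the raw data
naive = datetime(2016, 3, 1, 14, 45, 10)

# Hypothetical reading 1: the value is California local time
as_pacific = naive.replace(tzinfo=ZoneInfo("America/Los_Angeles"))

# Hypothetical reading 2: the value is already UTC
as_utc = naive.replace(tzinfo=timezone.utc)

print(as_pacific.astimezone(timezone.utc))  # same wall time, read as Pacific
print(as_utc)                               # same wall time, read as UTC

# The two readings disagree by the Pacific UTC offset on that date
gap = as_pacific - as_utc
print(gap)
```

A gap of several hours is enough to push a late-evening value onto the next calendar day, which is why guessing wrong would matter even for fields we mostly treat as dates.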
Short of asking one of the CAL-ACCESS admins about this, I started looking into each of these fields:
CVR_CAMPAIGN_DISCLOSURE_CD.RPT_DATE: All of the values with real times in this column are on rows where
I was able to pull up some of these filings via the CAL-ACCESS search tool, but the results contain only date info, no time. I checked the Excel export too.
EFS_FILING_LOG_CD.FILING_DATE: This seems like an internal system table for logging electronic filings from third-party vendors. Not sure it's still in use since the most recent
F501_502_CD.RPT_DATE and F501_502_CD.EXECUTE_DT: Both of these fields contain values with real times that are as recent as March 2016. Form 501 is the Candidate Intention Statement and Form 502 is the Campaign Bank Account Statement. But these forms don't seem to be available via the CAL-ACCESS PDFgen program, which throws a CGI error like this one.
FILER_ETHICS_CLASS_CD.ETHICS_DATE: This table contains a record for each lobbyist training class, and the
FILER_ADDRESS_CD.EFFECT_DT: This is a table of address history for filers, and the
FILER_INTERESTS_CD.EFFECT_DATE: This is the table that links filers to their interests (e.g., "AGRICULTURE", "FINANCE/INSURANCE"). The
tl;dr: I believe there are at most four DateTimeFields where times are actively being collected:
And the values with real times are a very small percentage of the non-NULL values in these fields.