@chinandrew
Contributor

Since nowcasting uses sensor values computed as_of certain dates, the issue dates need to be those as_of dates rather than datetime.today() as in other acquisitions. The as_of dates are specified during sensorization (which requests covidcast indicator data as_of those dates to generate the sensors) and are added as a new column to the files to be ingested.

cc @mariajahja

@chinandrew chinandrew requested a review from krivard January 26, 2021 21:13
@krivard
Contributor

krivard commented Jan 27, 2021

Just to confirm, when running in production you expect each line of the CSV might have a different issue date?

@chinandrew
Contributor Author

> Just to confirm, when running in production you expect each line of the CSV might have a different issue date?

That's supported here, though as currently implemented only one issue gets produced at a time, so the CSV files should be homogeneous in that field.
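
For illustration, a minimal sketch of checking that a file is homogeneous in its issue field. The column name `issue` and the helper `single_issue` are hypothetical and not taken from this PR; the sample rows are made up.

```python
import csv
import io


def single_issue(csv_text, column="issue"):
    """Return the one issue value shared by every row, or raise ValueError.

    `column` is a hypothetical name for the per-row issue column.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    issues = {row[column] for row in reader}
    if len(issues) != 1:
        raise ValueError(f"expected exactly one issue value, found {sorted(issues)}")
    return issues.pop()


# Made-up sample data: every row carries the same issue value.
sample = "geo_id,value,issue\npa,1.5,20210126\nny,2.0,20210126\n"
print(single_issue(sample))
```

A file that mixes issue values (or has no rows) raises `ValueError` rather than silently picking one.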

@krivard
Contributor

krivard commented Jan 27, 2021

If you don't have a use case that needs a different issue per line, I'd prefer putting the issue into the filename instead of the file content -- saves a ton on disk space for the archived csvs.

@chinandrew
Contributor Author

> If you don't have a use case that needs a different issue per line, I'd prefer putting the issue into the filename instead of the file content -- saves a ton on disk space for the archived csvs.

That makes sense to me. Since I'm sharing the covidcast CsvImporter that auto-detects source/geo/signal from the filename, would you recommend embedding the issue into the signal name and parsing it out later (e.g. make the signal name _), adjusting CsvImporter, writing new code to parse, or something else?

@krivard
Contributor

krivard commented Jan 27, 2021

Borrow the issue-specific importer instead 😄

def find_issue_specific_csv_files(scan_dir, glob=glob):

Format is issue_{issue}/{source}/{normal covidcast filename}
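
As a rough sketch of how that layout can be read, the snippet below pulls the issue date, source, and filename out of a path shaped like issue_{issue}/{source}/{filename}. It is not the body of find_issue_specific_csv_files; `parse_issue_path` is a hypothetical helper, and it assumes the issue is written as YYYYMMDD. The example path and filename are made up.

```python
import re
from datetime import date, datetime
from pathlib import PurePosixPath

# Assumption: the issue directory is named issue_YYYYMMDD.
ISSUE_DIR = re.compile(r"^issue_(\d{8})$")


def parse_issue_path(path):
    """Return (issue_date, source, filename), or None if the path does not match.

    Only the last three path components are examined:
    issue_{issue}/{source}/{filename}
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        return None
    issue_dir, source, filename = parts[-3:]
    match = ISSUE_DIR.match(issue_dir)
    if match is None:
        return None
    try:
        issue = datetime.strptime(match.group(1), "%Y%m%d").date()
    except ValueError:
        # Eight digits that do not form a real calendar date.
        return None
    return issue, source, filename


# Made-up example path.
print(parse_issue_path("receiving/issue_20210126/nowcast/20210120_state_sensor.csv"))
```

Because the issue lives in the directory name, every file under that directory shares one issue and the CSV rows do not need to repeat it.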

@chinandrew
Contributor Author

> Borrow the issue-specific importer instead
>
> def find_issue_specific_csv_files(scan_dir, glob=glob):
>
> Format is issue_{issue}/{source}/{normal covidcast filename}

Oh cool, didn't know about that. Thanks for pointing it out, I'll work on integrating it now.

@chinandrew
Contributor Author

@krivard added the issue-specific CSV importer, ready for review again

Co-authored-by: Katie Mazaitis <krivard@cs.cmu.edu>
Contributor

@krivard krivard left a comment

xlnt

@krivard krivard merged commit 92742d5 into cmu-delphi:main Jan 27, 2021