Skip to content

Conversation

@krivard
Copy link
Contributor

@krivard krivard commented Jan 6, 2022

Description

We now get an additional file in the drop for flu inpatient.

Changelog

Itemize code/test/documentation changes and files added/removed.

  • download_ftp_files.py: update expected file count

Fixes

We now get an additional file for flu inpatient.
@krivard krivard added the bug Something isn't working label Jan 6, 2022
@jingjtang
Copy link
Contributor

@krivard Was the inpatient Flu data added for all the dates before 2021-12-27 at once? The first inpatient Flu data that I can see is for 2021-12-22 which is not 2021-12-27 or 2021-12-28(the problematic date).

@krivard
Copy link
Contributor Author

krivard commented Jan 6, 2022

The new files were added one at a time during the normal upload cycle (no backfill).

The first inpatient flu file was uploaded on 2021-12-23 at 5pm, and the CHNG pipeline would have run at 10:30pm that day. According to the logs, 2021-12-23 is the first day that the exception in #1440 occurred, and it occurred each day after that as well (including the 27th and 28th).

The most recent CHNG issue that we have in covidcast is 20211222.

I'm not sure whether that answers your question -- was there something else I'm missing?

@krivard
Copy link
Contributor Author

krivard commented Jan 6, 2022

@jingjtang Updated implementation & fixed tests

@jingjtang
Copy link
Contributor

The new files were added one at a time during the normal upload cycle (no backfill).

The first inpatient flu file was uploaded on 2021-12-23 at 5pm, and the CHNG pipeline would have run at 10:30pm that day. According to the logs, 2021-12-23 is the first day that the exception in #1440 occurred, and it occurred each day after that as well (including the 27th and 28th).

The most recent CHNG issue that we have in covidcast is 20211222.

I'm not sure whether that answers your question -- was there something else I'm missing?

Yeah, that makes sense. Thanks!

# When too many new files are present, AssertionError
file_batch = [
self.FileAttr(dt.timestamp(dt.now()), f"00001122_foo{i}")
for i in range(EXPECTED_FILES_PER_DROP + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there should be for i in range(1, EXPECTED_FILES_PER_DROP + 1)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider the following: if EXPECTED_FILES_PER_DROP=7, then for this test we want to construct a list of 8 files.

>>> len(list(range(7+1)))
8
>>> len(list(range(1, 7+1)))
7

Then we should use the former.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Thanks.

Copy link
Contributor

@jingjtang jingjtang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

# When too many new files are present, AssertionError
file_batch = [
self.FileAttr(dt.timestamp(dt.now()), f"00001122_foo{i}")
for i in range(EXPECTED_FILES_PER_DROP + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. Thanks.

@krivard krivard merged commit 3703b60 into main Jan 6, 2022
@krivard krivard deleted the krivard/chng-flu-inpatient branch January 6, 2022 18:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

chng failing with assert len(filepaths_to_download) <= 6, "more files dropped than expected" AssertionError: more files dropped than expected

3 participants