You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I really like your idea, but I thought one thing that could be improved was getting data that corrects for changing testing volume. The site www.rt.live provides this, and the code below, specifically get_dates_positives_totals_and_deaths_rt_live(), can be simply substituted for get_DNC(). The new predictions are similar, but slightly lower, and the regression is nicer.
Also,
Thank you for sharing your code!
You might want to refactor it a bit and perhaps upgrade to Python 3 as well. It was difficult to work with.
Thanks again, and hopefully this is helpful.
-Mike
import csv
tmp = sys.argv
try:
state = tmp[1]
except IndexError:
state = "FL"
print 'Entering debug mode with state' + state
def date_to_datetime(my_date):
if isinstance(my_date, (str,)):
my_date = int(my_date.replace('-',''))
if isinstance(my_date, (int,)) :
return datetime(int(my_date/10000), int(my_date/100) % 100, int(my_date) % 100)
else:
raise NotImplementedError
def rt_live_str_to_float(my_str):
if my_str == '':
return 0.0
else:
return float(my_str)
def get_dates_positives_totals_and_deaths_rt_live(my_state, max_days=99,
num_days_to_use_for_aligning_positive_adjusted=7):
# Download data from rt.live, which corrects cases for testing volume
with requests.Session() as s:
download = s.get('https://d14wlfuexuxgcm.cloudfront.net/covid/rt.csv')
decoded_content = download.content.decode('utf-8')
rt_live_csv_reader = csv.reader(decoded_content.splitlines(), delimiter=',')
rt_live_rows = list(rt_live_csv_reader)
my_state_index = rt_live_rows[0].index('region')
my_date_index = rt_live_rows[0].index('date')
my_positive_index = rt_live_rows[0].index('positive_test_adjusted_raw')
my_positive_no_adjustment_index = rt_live_rows[0].index('positive')
my_test_index = rt_live_rows[0].index('new_tests')
my_death_index = rt_live_rows[0].index('new_deaths')
my_dates = list()
my_positives = list()
my_totals = list()
my_deaths = list()
my_positive_no_adjustment = list()
rt_live_rows_sorted_body = rt_live_rows.copy()
rt_live_rows_sorted_body = rt_live_rows_sorted_body[1:]
rt_live_rows_sorted_body.sort()
for rt_live_row in rt_live_rows_sorted_body: #Make sure to skip the header row.
this_date = date_to_datetime(rt_live_row[my_date_index])
if rt_live_row[my_state_index] == my_state or (my_state == "US" and this_date not in my_dates):
my_dates.append(this_date)
my_positives.append(rt_live_str_to_float(rt_live_row[my_positive_index]))
my_positive_no_adjustment.append(rt_live_str_to_float(rt_live_row[my_positive_no_adjustment_index]))
my_deaths.append(rt_live_str_to_float(rt_live_row[my_death_index]))
my_totals.append(rt_live_str_to_float(rt_live_row[my_test_index]))
#TODO: fix this
elif my_state == "US": # And we haven't seen this date before
my_index = my_dates.index(this_date)
my_positives[my_index] += rt_live_str_to_float(rt_live_row[my_positive_index])
my_positive_no_adjustment[my_index] += rt_live_str_to_float(rt_live_row[my_positive_no_adjustment_index])
my_deaths[my_index] += rt_live_str_to_float(rt_live_row[my_death_index])
my_totals[my_index] += rt_live_str_to_float(rt_live_row[my_test_index])
my_cumulative_deaths = [0.0] * len(my_deaths)
for my_cd_i, daily_new_deaths in enumerate(my_deaths):
my_cumulative_deaths[my_cd_i] = my_cumulative_deaths[max(my_cd_i - 1, 0)] + daily_new_deaths
for my_list in my_dates, my_positives, my_totals, my_cumulative_deaths, my_positive_no_adjustment:
my_list.reverse()
num_to_pop = max(0, len(my_list) - max_days)
for popper in range(num_to_pop):
my_list.pop()
# Align most recent x days
positives_total = 0
positives_no_adjustment_total = 0
for jj in range(num_days_to_use_for_aligning_positive_adjusted):
positives_total += my_positives[jj]
positives_no_adjustment_total += my_positive_no_adjustment[jj]
factor_to_multiply_adjusted_positives_so_they_approximate_tested_positives = positives_no_adjustment_total / positives_total
adjusted_positives = [int(x * factor_to_multiply_adjusted_positives_so_they_approximate_tested_positives) for x in my_positives]
adjusted_totals = [int(x * factor_to_multiply_adjusted_positives_so_they_approximate_tested_positives) for x in my_totals]
return my_dates, adjusted_positives, adjusted_totals, my_cumulative_deaths```
The text was updated successfully, but these errors were encountered:
Mike: Thanks for your suggestion! I do not agree with them that this correction is valid.
As I understand, this approach of P/T (P:positive, T:total test) is valid only when sampling is random. This is not true in reality, and it could lead to unreasonable conclusions. For example, their correction suggests that current positive number is smaller than that in April in CA. I do not agree. Another counter example is that imagine a state can manipulate and reduce P/T by forcefully increasing T, while the real positive population is independent of T. In other words, imagine we have T and then intentionally expand it by another T. In the second T, people are less likely to show symptoms, tend to be relatively healthy, so P/T becomes smaller.
I believe a reasonable correction is P/(T^a), where 0<a<1. We don't know the real value, but a<1. In my model a=0.25. At least this effect is partially taken into account in my model.
I really like your idea, but I thought one thing that could be improved was getting data that corrects for changing testing volume. The site www.rt.live provides this, and the code below, specifically get_dates_positives_totals_and_deaths_rt_live(), can be simply substituted for get_DNC(). The new predictions are similar, but slightly lower, and the regression is nicer.
Also,
-Mike
The text was updated successfully, but these errors were encountered: