This repository has been archived by the owner on Mar 10, 2023. It is now read-only.

Apply proper daily_reports filenames for 08-15-2020.csv files #3038

Open
wants to merge 1 commit into
base: master
from

@lorint lorint commented Aug 16, 2020

Why?
The automated update for 08-15-2020 overwrote the 08-14-2020.csv files
instead of creating new 08-15-2020.csv files.

@Wilkus

Wilkus commented Aug 16, 2020

As a user of the daily_reports, I confirm that the problem with the 08-15-2020.csv file remains unresolved, and the total deaths for all regions (global) for 08-14-2020 appear about double the expected values, viz.:
image

image

I look forward to the resolution of this problem including the repair of the 08-14-2020.csv file.
Stephen Wilkus
Stephen.Wilkus@SpectrumFinancialPartners.com
+1 732 533 3286

@lorint
Author

lorint commented Aug 16, 2020

... total deaths for all regions (global) for 08-14-2020 appear about double the expected values ...

Hi @Wilkus !

Note that this is a PR which holds the updated stats for the 14th and 15th. To confirm things are in order, I've just written this quick Ruby script, which parses all files in csse_covid_19_data/csse_covid_19_daily_reports/ and gathers a series of daily values for each location (country/region plus province/state) as well as daily grand totals:

# Read in all daily files
require 'date'
require 'csv'
require 'json'
prefix = 'Covid/csse_covid_19_data/csse_covid_19_daily_reports/'

@total = {}
dates = []
all_conf = []
all_death = []

# Keep only MM-DD-YYYY.csv files, in chronological order
# (Dir.children makes no guarantee about ordering)
daily_files = Dir.children(prefix).select { |f| f.match?(/\A\d{2}-\d{2}-\d{4}\.csv\z/) }
daily_files.sort_by! { |f| Date.strptime(f, '%m-%d-%Y.csv') }

daily_files.each do |filename|
  # Print the file's date in ISO 8601 form (YYYY-MM-DD)
  date = Date.strptime(filename, '%m-%d-%Y.csv')
  puts date
  dates << date

  data = CSV.read(prefix + filename)
  header = data.shift
  # Column names changed over time, and a header may start with a UTF-8 BOM
  cr_idx = header.index('Country/Region') || header.index('Country_Region')
  ps_idx = header.index('Province/State') || header.index('Province_State') || header.index("\uFEFFProvince/State")
  conf_idx = header.index('Confirmed')
  death_idx = header.index('Deaths')
  this_conf = 0
  this_death = 0
  data.each do |row|
    crps = "#{row[cr_idx]}, #{row[ps_idx]}"
    row.each_with_index do |cell, idx|
      next unless [conf_idx, death_idx].include?(idx)

      location = @total[crps] ||= {}
      series = location[header[idx]] ||= []
      unless cell.nil?
        # Convert clean numeric strings to Integer or Float; leave anything else as a string
        cell = Integer(cell, 10, exception: false) || Float(cell, exception: false) || cell
        # Add numeric values to the daily grand totals; skipping non-numeric
        # cells avoids a TypeError on the += below
        if cell.is_a?(Numeric)
          case idx
          when conf_idx
            this_conf += cell
          when death_idx
            this_death += cell
          end
        end
      end
      series << cell
    end
  end
  all_conf << this_conf
  all_death << this_death
end

@total['All'] = { 'Confirmed' => all_conf, 'Deaths' => all_death }

# Emit a single JSON object, one location per line
total_jsons = @total.map { |k, v| "#{k.to_json}:#{v.to_json}" }
puts '{'
puts total_jsons.join(",\n")
puts '}'
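
For spot-checking a single day without running the whole script, the per-file summation can be pulled out into a small helper. This is only a sketch: `daily_totals` is a hypothetical name, not part of the script above, and it assumes the file has `Confirmed` and `Deaths` columns holding plain integers (blank or non-integer cells are skipped).

```ruby
require 'csv'

# Hypothetical helper (not part of the script above): sum the Confirmed and
# Deaths columns of one daily report. Cells that are blank or not plain
# base-10 integers are skipped rather than raising.
def daily_totals(path)
  totals = { 'Confirmed' => 0, 'Deaths' => 0 }
  CSV.foreach(path, headers: true) do |row|
    totals.each_key do |column|
      value = Integer(row[column].to_s, 10, exception: false)
      totals[column] += value if value
    end
  end
  totals
end
```

Calling `daily_totals` with the path to a checked-out 08-14-2020.csv would give that day's two grand totals, which can then be compared against the spreadsheet row for the same date.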

My results from the script above match what you show in your Total Cases and Total Deaths columns, except that you're missing an entry for Sat 15 Aug 2020 entirely. The totals up to the 13th are just what you show, with the 13th properly having 20905995 confirmed cases and 755589 deaths. What's missing is a proper entry for the 14th, which should be 21159927 and 764689. What your spreadsheet shows for the 14th is really the totals for the 15th, the same as on the master branch.
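
A missing day like this can also be caught mechanically by checking the daily report filenames for gaps. The following is a rough sketch under the assumption that reports are named MM-DD-YYYY.csv; `missing_report_dates` is a hypothetical helper name. It only finds missing files, so it would flag the absent 08-15-2020.csv on master but would not by itself show that 08-14-2020.csv holds the wrong day's data.

```ruby
require 'date'

# Hypothetical helper: given daily-report filenames in MM-DD-YYYY.csv form,
# return the dates with no file between the earliest and latest report.
# Names that do not match the pattern (README.md and so on) are ignored.
def missing_report_dates(filenames)
  dates = filenames
          .select { |name| name.match?(/\A\d{2}-\d{2}-\d{4}\.csv\z/) }
          .map { |name| Date.strptime(name, '%m-%d-%Y.csv') }
  return [] if dates.empty?

  (dates.min..dates.max).to_a - dates
end
```

Passing it `Dir.children` of the daily reports directory should return an empty array when every day in the range has its own file.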

So -- long story short -- use the CSVs from this PR (I will email them to you also for convenience) and then you'll get accurate results.
