# Are Millions of Indians at Risk?
> ICMR finds 2% prevalence rate for Coronavirus in India's SARI patients

- toc: true 
- badges: false
- comments: true
- categories: [official-data-only]

In [1]:
#hide
import json
import random
from pathlib import Path

import pandas as pd

%load_ext autoreload
%autoreload 2
Path.ls = lambda x: list(x.iterdir())

In [2]:
# hide
filepath = Path("../_pdfs/SARI_Covid2020.pdf")
assert filepath.exists()

> twitter: https://twitter.com/NirantK/status/1248132432617861121

Yesterday, I made a somewhat preposterous claim when I shared that *up to* 1 Million Indians might be at risk of infection, or might even already be infected, by the 2nd week of April. See: https://bit.ly/undertestingindia

This is ~0.1% of India's population

**Now**, ICMR [1] reports that roughly 1.8% of the patients it tested with Severe Acute Respiratory Illness (SARI) were Covid19 positive.

This is a weak validation of my assumption. Since I assume ~0.1% of the population to be infected, a ~2% incidence of Covid19 in SARI patients appears to be in line with what we would expect.

**More importantly**, this is actual data suggesting that reality might be better than my models.

Let's assume only `x%` of Indians suffer from SARI. Then only ~2% of that `x%`, i.e. `0.02x%` of Indians, are symptomatic because of Covid.

This lowers the symptomatic population that will require Covid-specific medical care by 50x, relative to the full SARI population!
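
The arithmetic behind that 50x figure can be sketched in a few lines of Python. The study does not tell us the SARI share `x`, so the value below is a purely hypothetical placeholder; the point is that the reduction factor does not depend on it.

```python
# Share of SARI patients who tested positive, rounded to ~2% as in the text.
covid_share_of_sari = 0.02

# Hypothetical placeholder for x: the share of Indians with SARI.
# This number is NOT from the ICMR study; any positive value gives the same factor.
sari_share_of_population = 0.01

# Share of the whole population that has SARI because of Covid.
covid_sari_share = sari_share_of_population * covid_share_of_sari

# How much smaller that group is than the full SARI population.
reduction_factor = sari_share_of_population / covid_sari_share

print(f"Covid-positive SARI share of population: {covid_sari_share:.4%}")
print(f"Reduction relative to all SARI patients: {reduction_factor:.0f}x")
```

Changing `sari_share_of_population` changes the first printed number but leaves the reduction factor untouched, since it is just `1 / covid_share_of_sari`.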

## Here is what I want you to know

ICMR tested ~6K people with Severe Acute Respiratory Illness across 20 States and Union Territories.

- 2 out of 100 patients tested positive
    - Exact: 104 of the 5911 patients were Coronavirus positive
- About 2 out of 5 positive patients did not report any history of contact with a known case or international travel
    - Exact: 40 out of 104 patients
- 2 out of 100 men and 1 out of 100 women tested positive
    - Exact: 85 men out of 3676 men tested positive. 17 out of 2047 women tested positive
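
The rounded figures above can be checked against the exact counts. This is a minimal sketch that only uses the numbers quoted in the list; the dictionary labels are my own shorthand.

```python
# (positives, total) pairs, copied from the "Exact" lines above.
counts = {
    "all SARI patients": (104, 5911),
    "positives with no known contact or travel": (40, 104),
    "men": (85, 3676),
    "women": (17, 2047),
}

# Convert each pair into a rate between 0 and 1.
rates = {label: positive / total for label, (positive, total) in counts.items()}

for label, rate in rates.items():
    print(f"{label}: {rate:.1%}")
```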

## FAQ
Q. How many tests were done in all? 5911, out of which 104 tested positive. 

Q. Where were these tests done? They were done across 20 States and Union Territories in India.

Q. Which states had the highest prevalence rates? Tripura (11.1%), Chandigarh (4.2%), Telangana (4.2%), Maharashtra (3.8%), West Bengal (3.5%)

Q. What was the prevalence rate in known hotspots? Kerala (0.2%), Rajasthan (0%), Tamil Nadu (0.9%)

Q. What was the age distribution of tests and cases? See the table below:

Age groups (yr) | Number of Covid19 cases (% of total) | Number of SARI patients (% of total) | Percent positivity
---|---|---|---
0‑9| 2 (2.0) | 386 (6.8)| 0.5 
10‑19| 0 (0.0) | 371 (6.5)| 0.0 
20‑29| 9 (8.8) | 1419 (25.0)| 0.6 
30‑39| 8 (7.8) | 971 (17.1)| 0.8 
40‑49| 16 (15.7) | 634 (11.2)| 2.5
50‑59| 31 (30.4) | 637 (11.2)| 4.9 
60‑69| 26 (25.5) | 672 (11.8)| 3.9 
70‑79| 8 (7.8) | 405 (7.1)| 2.0 
≥80| 2 (2.0) | 187 (3.3)| 1.1
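
As a sanity check, the percent positivity column can be recomputed from the case and patient counts in the table. This is a small sketch using only the table's numbers; the totals should come out to the 102 cases and 5682 patients for which the paper reports an age.

```python
# (age group, Covid19 cases, SARI patients tested), copied from the table above.
age_rows = [
    ("0-9", 2, 386),
    ("10-19", 0, 371),
    ("20-29", 9, 1419),
    ("30-39", 8, 971),
    ("40-49", 16, 634),
    ("50-59", 31, 637),
    ("60-69", 26, 672),
    ("70-79", 8, 405),
    (">=80", 2, 187),
]

total_cases = sum(cases for _, cases, _ in age_rows)
total_tested = sum(tested for _, _, tested in age_rows)

# Percent positivity per age group, rounded to one decimal like the table.
positivity = {age: round(100 * cases / tested, 1) for age, cases, tested in age_rows}

for age, pct in positivity.items():
    print(f"{age:>6}: {pct}%")
print(f"Totals: {total_cases} cases out of {total_tested} patients")
```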

Q. What does SARI mean? Severe Acute Respiratory Illness

Q. What does this tell us? For now, it tells us that up to 2% of India's SARI patients have SARI because of Covid19, the disease caused by the novel coronavirus (SARS-CoV-2).

The study's sample of ~6K SARI patients gives only a loose approximation. It is obviously not a satisfactory sample of the 1 Billion+ Indians, yet I think this is the best and most reliable data I've seen on this topic so far.

I'm also happy with the distribution of their testing, which they've shared in detail.
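
To get a feel for how much uncertainty a ~6K sample leaves around the ~2% figure, here is a rough sketch of a 95% Wilson score interval for 104 positives out of 5911 tests. This is my own back-of-the-envelope check, not something from the ICMR paper, and it assumes the tested patients behave like a random sample of SARI patients, which the study design does not guarantee.

```python
import math


def wilson_interval(positives, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    p = positives / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Counts from the study: 104 positives among 5911 SARI patients.
low, high = wilson_interval(104, 5911)
print(f"95% interval for SARI positivity: {low:.2%} to {high:.2%}")
```

The interval only describes positivity among SARI patients; it says nothing directly about the general population.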

[1] Official Source for this information is ICMR: [Severe acute respiratory illness surveillance for coronavirus disease 2019, India, 2020](http://www.ijmr.org.in/aheadofprint_cv.asp), Gupta et al. 
DOI:10.4103/ijmr.IJMR_1035_20

In [3]:
# hide
# !pip install pdfplumber
import pdfplumber

In [4]:
#hide
pagewise_df = []

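# Extract tables from the third page (index 2), which holds the
# gender and age-group breakdown.
with pdfplumber.open(filepath) as pdf:
    pages = pdf.pages
    page = pages[2]
    page_df = pd.DataFrame(
        # The "text" strategies infer cell edges from word alignment,
        # which helps when a table has no ruling lines.
        page.extract_tables(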
            {
                "vertical_strategy": "text",
                "horizontal_strategy": "text",
                "keep_blank_chars": True,
                #                     "min_words_horizontal": 6,
                #                     "text_tolerance":15
            }
        ),
    )
    print(page_df)

                                                  0   \
0  [Characteristics Number of COVID-19 cases  Num...   
1  [Nadu  (577),  Maharashtra  (553)  and  Kerala...   

                                                  1   \
0  [(per cent of total), (per cent of total), pos...   
1  [with COVID-19 positivity of 1.6, 0.9, 3.8 and...   

                                                  2   \
0                           [Gender n=102, n=5723, ]   
1  [per cent, respectively (Table III). COVID-19 ...   

                                                  3   \
0                 [Male 85 (83.3), 3676 (64.2), 2.3]   
1  [SARI patients were detected from eight distri...   

                                                  4   \
0               [Female 17 (16.7), 2047 (35.8), 0.8]   
1  [Maharashtra, six in West Bengal and five each...   

                                  5                              6   \
0  [Age groups (yr) n=102, n=5682, ]  [0-9 2 (2.0), 386 (6.8), 0.5]   
1      [Nadu 