# Proposal for Data Collection

## AI-Powered Antimicrobial Resistance (AMR) Surveillance Support System in Nigeria

### 1. Background and Rationale

Antimicrobial resistance (AMR) poses a growing public health threat in Nigeria, with increasing reports of multidrug-resistant pathogens such as _Escherichia coli_, _Klebsiella pneumoniae_, and _Staphylococcus aureus_. Although microbiology laboratories across Nigeria routinely generate antimicrobial susceptibility testing (AST) data, these data are often fragmented, non-standardized, and under-utilized for predictive surveillance and early warning.

This project seeks to collect anonymized laboratory AMR data to develop machine learning and AI-based tools that can:

* Detect resistance patterns,
* Predict emerging resistance trends,
* Support national AMR surveillance and policy decisions.

This project aligns with:

* Nigeria's [National Action Plan on AMR (NAP 2024-2028)](https://ncdc.gov.ng/news/524/press-release%3A-national-amr-stakeholders-hold-first-quarter-meeting)
* World Health Organization's [GLASS surveillance framework](https://www.who.int/initiatives/glass)

### 2. Purpose of Data Collection

The purpose of this data collection is to obtain retrospective and prospective laboratory AMR data. The data will be used strictly for research and public-health surveillance, not for commercial purposes or diagnostic decision-making.

The collected data will enable:

* Training ML models to predict resistance patterns

* Identifying high-risk pathogen–antibiotic combinations

* Generating localized AMR insights for Nigeria


### 3. Scope of Data Requested

### 3.1 Types of Laboratories

We request collaboration from:

* NCDC-affiliated surveillance laboratories
* Federal and State medical centres
* University teaching hospital microbiology laboratories, and similar facilities

### 3.2 Data Types Required (No Personal Identifiers)

To protect patient privacy, only data that cannot be traced to an individual will be collected.

### A. Sample Metadata

* Sample ID (laboratory-generated, anonymized)
* Date of sample collection
* Sample type (blood, urine, sputum, wound swab, stool, etc.)
* Healthcare setting (outpatient / inpatient / ICU)
* Facility location (state only; no patient address)

### B. Pathogen Information

* Identified organism (species level where available)
* Method of identification (culture, biochemical, automated system)

### C. Antimicrobial Susceptibility Testing (AST)

* Antibiotics tested
* Susceptibility result (S / I / R)
* Testing standard used (CLSI / EUCAST)
* Method (disk diffusion, MIC, automated)
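
The fields in sections A to C could be captured as one flat record per organism-antibiotic test. The sketch below is one possible layout, not a format prescribed by this proposal: the class name `ASTRecord`, the field names, and the `validate` helper are hypothetical. Only the allowed values for healthcare setting, result, testing standard, and method come from the lists above.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date

# Allowed values copied from the field lists in sections A and C.
ALLOWED_SETTINGS = {"outpatient", "inpatient", "ICU"}
ALLOWED_RESULTS = {"S", "I", "R"}
ALLOWED_STANDARDS = {"CLSI", "EUCAST"}
ALLOWED_METHODS = {"disk diffusion", "MIC", "automated"}


@dataclass(frozen=True)
class ASTRecord:
    """One susceptibility result; all field names are illustrative assumptions."""
    sample_id: str            # laboratory-generated, anonymized
    collection_date: date
    sample_type: str          # blood, urine, sputum, ...
    healthcare_setting: str
    state: str                # facility location, state only
    organism: str
    identification_method: str
    antibiotic: str
    result: str
    standard: str
    method: str


def validate(record: ASTRecord) -> list[str]:
    """Return a list of problems; an empty list means the record passed."""
    checks = [
        ("healthcare_setting", record.healthcare_setting, ALLOWED_SETTINGS),
        ("result", record.result, ALLOWED_RESULTS),
        ("standard", record.standard, ALLOWED_STANDARDS),
        ("method", record.method, ALLOWED_METHODS),
    ]
    return [
        f"{name}: unexpected value {value!r}"
        for name, value, allowed in checks
        if value not in allowed
    ]


example = ASTRecord(
    "LAB-0001", date(2024, 3, 1), "urine", "outpatient", "Lagos",
    "Escherichia coli", "culture", "ciprofloxacin", "R", "CLSI", "disk diffusion",
)
print(validate(example))  # prints []
```

Returning a list of problems rather than raising an error lets a submission be reviewed in full before anything is rejected.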

**The following information must be excluded from the data:**

* **Patient names**
* **Hospital numbers**
* **Addresses**
* **Phone numbers**
* **Clinician notes**
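
As a rough illustration of how the exclusion list above might be enforced before a file leaves the laboratory, the sketch below drops prohibited columns from a CSV export. The column names in `PROHIBITED_COLUMNS` and the function `strip_prohibited` are assumptions; real exports will use each laboratory's own headers, and a header check of this kind cannot detect identifiers hidden in free-text fields or mislabelled columns, so it would complement manual review rather than replace it.

```python
from __future__ import annotations

import csv
import io

# Hypothetical header names for the excluded fields listed above.
PROHIBITED_COLUMNS = {
    "patient_name", "hospital_number", "address", "phone_number", "clinician_notes",
}


def strip_prohibited(csv_text: str) -> tuple[str, list[str]]:
    """Return the CSV without prohibited columns, plus the names that were dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    # Match headers case-insensitively, ignoring surrounding whitespace.
    dropped = [n for n in fieldnames if n.strip().lower() in PROHIBITED_COLUMNS]
    kept = [n for n in fieldnames if n not in dropped]

    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # keys outside `kept` are ignored
    return out.getvalue(), dropped


raw = (
    "sample_id,patient_name,organism,Phone_Number\n"
    "LAB-1,Jane Doe,Escherichia coli,0800\n"
)
cleaned, dropped = strip_prohibited(raw)
print(dropped)   # prints ['patient_name', 'Phone_Number']
print(cleaned)
```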


### 4. Microorganisms/Pathogens of Interest

Initial analysis will focus on high-burden pathogens in Nigeria:

* **_Escherichia coli_**
* **_Staphylococcus aureus_ (including MRSA)**
* **_Klebsiella pneumoniae_**
* **Carbapenem-resistant _Enterobacteriaceae_ (CRE)**
* **_Pseudomonas aeruginosa_**

This prioritization reflects national surveillance data and is intended to ensure sufficient data volume for ML modeling.



### 5. Data Collection Methodology and Format

### 5.1 Data Sources

* Laboratory Information Systems (LIS)
* Excel / CSV laboratory records
* WHONET exports (where available)

### 5.2 Collection Approach

* Retrospective data: last 3–5 years (where available)
* Prospective data: monthly or quarterly uploads (optional)
* Secure file transfer (encrypted email or secure cloud folder)

### 5.3 Data Standardization

* Data will be harmonized using WHO GLASS-aligned formats
* Antibiotic and organism names will be normalized
* Inconsistent entries will be flagged, not altered
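
The "flag, not alter" rule above can be sketched as follows: the raw entry is always preserved, a normalized name is added alongside it when one is known, and anything unrecognized is flagged for review. The alias table and the function `normalize_antibiotic` are hypothetical; an actual deployment would need a far more complete mapping aligned with the chosen GLASS format.

```python
# Hypothetical alias table; entries are illustrative only.
ANTIBIOTIC_ALIASES = {
    "cip": "ciprofloxacin",
    "cipro": "ciprofloxacin",
    "ciprofloxacin": "ciprofloxacin",
    "amc": "amoxicillin-clavulanic acid",
    "augmentin": "amoxicillin-clavulanic acid",
    "gen": "gentamicin",
    "gentamicin": "gentamicin",
}


def normalize_antibiotic(raw: str) -> dict:
    """Keep the raw entry, add a normalized name, and flag unknown entries."""
    # Lowercase and collapse whitespace only for the lookup key;
    # the original string is returned unchanged.
    key = " ".join(raw.strip().lower().split())
    normalized = ANTIBIOTIC_ALIASES.get(key)
    return {"raw": raw, "normalized": normalized, "flagged": normalized is None}


print(normalize_antibiotic("  Cipro "))
# prints {'raw': '  Cipro ', 'normalized': 'ciprofloxacin', 'flagged': False}
print(normalize_antibiotic("Ciproflox??"))
# prints {'raw': 'Ciproflox??', 'normalized': None, 'flagged': True}
```

Keeping the raw value next to the normalized one means a flagged entry can be traced back to the contributing laboratory and corrected at the source. The same pattern would apply to organism names.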

### 6. Data Use and AI/ML Application

The data collected will be used for the following:

**Machine Learning Models**

* Resistance prediction models (pathogen × antibiotic)
* Trend forecasting models
* Anomaly detection for unusual resistance spikes
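
To make the anomaly-detection idea concrete, the sketch below computes the resistant fraction per pathogen-antibiotic pair and period, then flags a period whose rate sits far above the historical mean. This is a deliberately simple z-score baseline chosen for illustration, not the project's modelling approach; the function names, the tuple layout, and the threshold of 3 standard deviations are all assumptions.

```python
from collections import defaultdict
from statistics import mean, stdev


def resistance_rates(records):
    """records: iterable of (period, organism, antibiotic, result) tuples.

    Returns {(organism, antibiotic): {period: fraction of results that are "R"}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))  # [resistant, total]
    for period, organism, antibiotic, result in records:
        cell = counts[(organism, antibiotic)][period]
        cell[0] += int(result == "R")
        cell[1] += 1
    return {
        pair: {period: r / n for period, (r, n) in sorted(by_period.items())}
        for pair, by_period in counts.items()
    }


def flag_spike(history, latest, z_threshold=3.0):
    """Flag `latest` if it exceeds the historical mean by more than
    z_threshold sample standard deviations."""
    if len(history) < 2:
        return False  # too little history to estimate spread
    spread = stdev(history)
    if spread == 0:
        return latest > mean(history)  # flat history: any increase is unusual
    return (latest - mean(history)) / spread > z_threshold


records = [
    ("2024-01", "Escherichia coli", "ciprofloxacin", "R"),
    ("2024-01", "Escherichia coli", "ciprofloxacin", "S"),
    ("2024-02", "Escherichia coli", "ciprofloxacin", "R"),
]
print(resistance_rates(records))
print(flag_spike([0.20, 0.22, 0.21, 0.19], 0.60))  # prints True
```

A rate computed from very few isolates is noisy, so in practice a minimum sample size per period would likely be needed before a flag is trusted.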

**AI Components**

* RAG-based knowledge system combining lab data with AMR guidelines
* Agent-based workflows for automated trend summaries
* Decision-support dashboards (non-clinical)

**Important:**
_Outputs are surveillance and research tools, not treatment recommendations._

### 7. Data Security, Governance, and Ethical Considerations

* All data will be stored in encrypted storage

* Access limited to authorized research personnel

* No attempt will be made to re-identify individuals

* Data will not be shared with third parties without written permission

* Results will be shared in aggregated form only

* This project uses secondary, anonymized data

* No direct patient contact is involved

* Ethical approval will be sought where required

* Participating laboratories will be acknowledged in reports
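
The commitment to share results in aggregated form only could be implemented along the following lines. This is a sketch under stated assumptions: the function `aggregate_for_sharing`, the grouping by state, organism, and antibiotic, and the minimum group size of 30 are illustrative choices, not thresholds defined by this proposal.

```python
from collections import Counter

MIN_ISOLATES = 30  # illustrative assumption, not a value set by this proposal


def aggregate_for_sharing(records, min_isolates=MIN_ISOLATES):
    """records: iterable of (state, organism, antibiotic, result) tuples.

    Returns one summary row per (state, organism, antibiotic). Groups with
    fewer than `min_isolates` results get percent_resistant=None.
    """
    totals = Counter()
    resistant = Counter()
    for state, organism, antibiotic, result in records:
        key = (state, organism, antibiotic)
        totals[key] += 1
        if result == "R":
            resistant[key] += 1

    summary = []
    for key in sorted(totals):
        n = totals[key]
        pct = round(100 * resistant[key] / n, 1) if n >= min_isolates else None
        summary.append({"group": key, "n_tested": n, "percent_resistant": pct})
    return summary


records = (
    [("Lagos", "Escherichia coli", "ciprofloxacin", "R")] * 3
    + [("Lagos", "Escherichia coli", "ciprofloxacin", "S")]
    + [("Kano", "Klebsiella pneumoniae", "meropenem", "R")]
)
for row in aggregate_for_sharing(records, min_isolates=4):
    print(row)
```

Suppressing percentages for small groups reduces the chance that a shared figure reflects only a handful of samples from one facility.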

### 8. Benefits to Participating Institutions

Participating laboratories will receive:

* Summary AMR trend reports
* Facility-level resistance insights (if requested)
* Capacity-building exposure to AI-driven surveillance
* Recognition in project outputs and publications

## Conclusion

This data collection initiative aims to strengthen Nigeria’s AMR surveillance ecosystem by transforming routine laboratory data into actionable intelligence using **AI and machine learning**. Collaboration with laboratories is essential to ensure evidence-based, locally relevant solutions to antimicrobial resistance.