In [1]:
import { parse } from 'csv-parse/sync';
import { readFileSync } from 'fs';
import { analysisGraph } from 'agentic-data-analysis';
import type { GraphState, Data } from 'agentic-data-analysis';

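/**
 * Parses a CSV file into row objects and runs the analysis graph on them.
 * Resolves with the final graph state; failures are logged and rethrown.
 */
async function analyzeCSVData(filepath: string): Promise<GraphState> {
  try {
    // Read the whole file synchronously, then parse it into one object per row
    const fileContent = readFileSync(filepath, 'utf-8');
    const records = parse(fileContent, {
      columns: true,          // use the header row as object keys
      skip_empty_lines: true, // ignore blank lines
      cast: true,             // try to convert values to native types such as numbers
    }) as Data;

    // Run the graph with the parsed rows as its initial state
    const result = await analysisGraph.invoke({
      data: records
    });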

    return result;
  } catch (error) {
    console.error('Error analyzing CSV data:', error);
    throw error;
  }
}
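
With `cast: true`, values that do not look numeric stay strings, so a column such as `Age` can end up holding mixed types. A quick check on the parsed rows before invoking the graph might look like the sketch below. The helper `summarizeColumnTypes` and the sample rows are hypothetical; they are not part of `csv-parse` or `agentic-data-analysis`.

```typescript
// Hypothetical helper, shown for illustration only.
type ParsedRow = Record<string, unknown>;

// For each column, count how many values of each JavaScript type appear.
function summarizeColumnTypes(
  rows: ParsedRow[],
): Record<string, Record<string, number>> {
  const summary: Record<string, Record<string, number>> = {};
  for (const row of rows) {
    for (const [column, value] of Object.entries(row)) {
      // Group empty strings, null, and undefined as "missing" so gaps stand out.
      const isMissing = value === '' || value === null || value === undefined;
      const kind = isMissing ? 'missing' : typeof value;
      const counts = summary[column] ?? (summary[column] = {});
      counts[kind] = (counts[kind] ?? 0) + 1;
    }
  }
  return summary;
}

// Made-up rows shaped like parser output with header names as keys.
const sampleRows: ParsedRow[] = [
  { Age: 34, Gender: 'Female' },
  { Age: 'unknown', Gender: 'Male' },
  { Age: 51, Gender: '' },
];

const typeSummary = summarizeColumnTypes(sampleRows);
console.log(typeSummary);
```

A column that reports more than one type, or any `missing` entries, is worth inspecting before trusting the generated field descriptions.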

In [2]:
const filepath = './healthcare_dataset.csv';
const analysis = await analyzeCSVData(filepath);

console.log('\nDataset Summary:');
console.log(analysis.metadata?.summary);

console.log('\nField Analysis:');
Object.entries(analysis.metadata?.fields ?? {}).forEach(([field, meta]) => {
  console.log(`\n${field}:`);
  console.log(meta.description);
});

console.log('\nData Quality Issues:');
console.log(analysis.metadata?.dataQualityIssues);


Dataset Summary:
### Dataset Summary

#### 1. Dataset Purpose and Content
This dataset appears to represent a comprehensive collection of healthcare-related records, likely from a hospital or a network of healthcare facilities. It captures various aspects of patient demographics, medical conditions, healthcare services, and financial transactions. The primary purpose of this dataset could be to facilitate healthcare management, research, and analysis, focusing on patient care, resource allocation, and financial operations.

Key fields include:
- **Name, Age, Gender**: Demographic information that helps in understanding patient profiles.
- **Blood Type, Medical Condition, Medication, Test Results**: Health-related data that provides insights into patient health status and treatment.
- **Date of Admission, Discharge Date, Admission Type**: Temporal data that tracks patient flow and hospital operations.
- **Doctor, Hospital, Insurance Provider**: Institutional data that links patients to

In [3]:
console.log('\nVisualization Questions:');
analysis.metadata?.questions?.forEach((q, index) => {
  console.log(`\n${index + 1}. ${q.question}`);
  console.log(`   Chart Type: ${q.type}`);
  console.log(`   Fields: ${q.fields.join(', ')}`);
  console.log(`   Rationale: ${q.description}`);
});


Visualization Questions:

1. What is the distribution of patient ages?
   Chart Type: histogram
   Fields: Age
   Rationale: A histogram of the 'Age' field will show the distribution of patient ages, revealing the age demographics of the patient population.

2. What is the gender distribution among patients?
   Chart Type: pie
   Fields: Gender
   Rationale: A pie chart of the 'Gender' field will visualize the gender distribution among patients, revealing male-female proportions.

3. What are the most common medical conditions?
   Chart Type: bar
   Fields: Medical Condition
   Rationale: A bar chart of the 'Medical Condition' field will show the frequency of each condition, revealing the most common health issues among patients.

4. How does billing amount vary with age?
   Chart Type: scatter
   Fields: Billing Amount, Age
   Rationale: A scatter plot of 'Billing Amount' against 'Age' might reveal patterns or trends regarding healthcare costs across different age groups.

5. What is
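
The histogram, pie, and bar questions listed above all reduce to counting rows per bin or per category. A minimal sketch of those two aggregations follows, assuming rows keyed by header name as produced by `columns: true`. The helpers `countByField` and `histogramBins`, the 10-year bin width, and the sample rows are hypothetical and are not taken from `agentic-data-analysis`.

```typescript
type ChartRow = Record<string, unknown>;

// Frequency of each distinct value in a categorical field (bar or pie input).
function countByField(rows: ChartRow[], field: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (const row of rows) {
    const key = String(row[field]);
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return counts;
}

// Fixed-width bins for a numeric field (histogram input).
// Keys are the lower edge of each bin; non-numeric values are skipped.
function histogramBins(
  rows: ChartRow[],
  field: string,
  width: number,
): Map<number, number> {
  const bins = new Map<number, number>();
  for (const row of rows) {
    const value = row[field];
    if (typeof value !== 'number' || !Number.isFinite(value)) continue;
    const start = Math.floor(value / width) * width;
    bins.set(start, (bins.get(start) ?? 0) + 1);
  }
  return bins;
}

// Made-up rows for illustration; not drawn from the dataset.
const chartRows: ChartRow[] = [
  { Age: 34, 'Medical Condition': 'Asthma' },
  { Age: 37, 'Medical Condition': 'Diabetes' },
  { Age: 62, 'Medical Condition': 'Asthma' },
];

const conditionCounts = countByField(chartRows, 'Medical Condition');
const ageBins = histogramBins(chartRows, 'Age', 10);
console.log(conditionCounts, ageBins);
```

The resulting maps can be handed to whichever charting library renders the suggested chart type.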