In [None]:
from ploomber.spec import DAGSpec
from unicef_cpe.config import PROJ_ROOT
from unicef_cpe.utils import generate_output_excel
import subprocess
from datetime import datetime

In [None]:
spec = DAGSpec(PROJ_ROOT / 'pipeline.yaml')
dag = spec.to_dag()

country = spec.env.get('COUNTRY')

### Generating the AI Report  

Once you have completed the **README** setup instructions, you can execute this notebook to process the data and generate the **AI report**.

**1. Configure the Report Parameters**
- Open the **`env.yaml`** file.  
- Set the **year** for which you want to generate the report.  
- Specify the **country code** (e.g., for Georgia, use `GEO`).  
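
A quick sanity check on the country code can catch typos before the pipeline runs. The sketch below assumes that codes follow the three-letter, upper-case pattern of the `GEO` example (ISO 3166-1 alpha-3); the helper name `looks_like_country_code` is hypothetical and not part of the project.

```python
import re

# Assumption: country codes are three upper-case ASCII letters, like `GEO`.
COUNTRY_CODE_PATTERN = re.compile(r"[A-Z]{3}")


def looks_like_country_code(value):
    """Return True if `value` is a string of exactly three upper-case letters."""
    return isinstance(value, str) and COUNTRY_CODE_PATTERN.fullmatch(value) is not None


# Hypothetical usage: check a few candidate values before editing env.yaml.
for candidate in ("GEO", "geo", "Georgia"):
    print(candidate, looks_like_country_code(candidate))
```

The check only validates the shape of the code; it does not confirm that data for that country is available.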

**2. Understand the Processing Pipeline**
- The **`pipeline.yaml`** file defines the different processing steps executed in this workflow.  
- The corresponding notebooks for each step are located in the **`/pipelines`** directory.  

**3. Modify the Pipeline Execution (Optional)**
- If you **do not** want to execute a specific step, **comment out** the corresponding section in the **`pipeline.yaml`** file.  

**4. Start Processing** 
- Run the next cell in the notebook to begin data processing.  

> ⚠ **Note:**  
> - By default, this version assumes that the **LLM** is provided by **OpenAI**.  
> - If you **do not** have an OpenAI account, you can use **Ollama**, but you must **manually set** the model name and version in the relevant notebooks.  

In [None]:
build = dag.build()

### Handling Execution Failures  

If the previous cell **fails**, the build output names the notebook that raised the error.  

**Locating the Error**  
- The error details can be found in the **respective notebook** within the **`notebooks/pipeline_output/`** folder.  
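
To find the failing cell without opening each executed notebook by hand, the stored outputs can be scanned for error entries. This is a minimal sketch, assuming the executed notebooks are standard `.ipynb` JSON files (nbformat v4) in which failed cells carry an output of type `error`; the helper name `find_error_cells` is hypothetical.

```python
import json
from pathlib import Path


def find_error_cells(notebook_path):
    """Return (cell_index, error_name, error_message) for each error output."""
    notebook = json.loads(Path(notebook_path).read_text(encoding="utf-8"))
    errors = []
    for index, cell in enumerate(notebook.get("cells", [])):
        # Markdown cells have no "outputs" key, so they are skipped here.
        for output in cell.get("outputs", []):
            if output.get("output_type") == "error":
                errors.append((index, output.get("ename"), output.get("evalue")))
    return errors


# Folder named in the text above; adjust if the notebook runs from elsewhere.
output_dir = Path("notebooks/pipeline_output")
if output_dir.is_dir():
    for path in sorted(output_dir.rglob("*.ipynb")):
        for index, name, message in find_error_cells(path):
            print(f"{path.name}: cell {index}: {name}: {message}")
```

The cell index counts all cells from zero, including markdown cells, so it matches the position in the executed notebook rather than the execution counter.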

**Fixing the Issue**  
- To resolve the issue, edit the notebook located in **`/pipelines`**, **not** in `notebooks/pipeline_output`, as the latter stores executed notebooks.  

**Re-executing the Process**  
- Once the issue is fixed, **rerun the previous cell**.  
- Successfully executed notebooks **will not be reprocessed**.  

**Forcing a Full Re-execution**  
- If you want to **reprocess all notebooks**, modify the execution command:  
  ```python
  build = dag.build(force=True)
  ```

In [None]:
# Generate Excel Output file 
generate_output_excel(PROJ_ROOT, country)

### Locating the Generated Data

Once the previous cell has executed successfully, the output datasets are available in the following locations.

**Processed datasets**:
```plaintext
data/processed/{country}/
```
**Excel file containing all generated datasets in separate sheets**: 
```plaintext
data/outputs/{country}/cpe_evaluation_data.xlsx
```
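
The two locations above can be checked programmatically. The sketch below assumes both paths are relative to the project root and uses `GEO` as an example country; the helper name `expected_outputs` is hypothetical.

```python
from pathlib import Path


def expected_outputs(project_root, country):
    """Build the output locations described above for one country."""
    root = Path(project_root)
    return {
        "processed_dir": root / "data" / "processed" / country,
        "excel_file": root / "data" / "outputs" / country / "cpe_evaluation_data.xlsx",
    }


# Hypothetical usage: "." stands in for the project root.
for label, path in expected_outputs(".", "GEO").items():
    status = "found" if path.exists() else "missing"
    print(f"{label}: {path} ({status})")
```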

### Generating the AI Report with Quarto

Once the data processing pipeline has successfully executed, you can generate the **AI-assisted country report** using **Quarto**.  

**How the Report Generation Works**
1. **Quarto renders the report template** (`report.qmd`) located in `reports/notebooks/`.
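2. The **report is generated as an HTML file** and moved to the country's output folder, where `YYYYMMDD` is the date on which the report was rendered: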
```plaintext
data/outputs/{country}/unicef-ecaro-cpe-report-{country}-vYYYYMMDD.html
```
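
The file name pattern above can be reproduced for any date. This is a small sketch, with the helper name `report_filename` being hypothetical and `GEO` used only as an example.

```python
from datetime import date


def report_filename(country, day):
    """Build the report file name for a country code and a rendering date."""
    return f"unicef-ecaro-cpe-report-{country}-v{day.strftime('%Y%m%d')}.html"


# Hypothetical usage with a fixed date so the result is reproducible.
print(report_filename("GEO", date(2025, 1, 31)))
# prints: unicef-ecaro-cpe-report-GEO-v20250131.html
```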

In [None]:
import shutil
from pathlib import Path

date_version = datetime.today().strftime('%Y%m%d')  # Generate date-based version
report_name = f'unicef-ecaro-cpe-report-{country}-v{date_version}.html'
render_dir = Path('reports/notebooks')  # Folder that contains report.qmd

# Define command as an argument list (-o takes a bare file name, so Quarto
# writes the report next to report.qmd)
quarto_cmd = [
    'quarto', 'render', 'report.qmd',
    '-o', report_name,
    '-P', f'COUNTRY={country}',
    '-M', 'subtitle=Country Report',
]

# Print the command for reference
print(f"Running: {' '.join(quarto_cmd)}")

# Execute the command from the report directory
subprocess.run(quarto_cmd, cwd=render_dir, check=True)

# Move the output file to the desired directory, creating it if needed
output_dest = Path('data/outputs') / country / report_name
output_dest.parent.mkdir(parents=True, exist_ok=True)
shutil.move(str(render_dir / report_name), str(output_dest))

print(f"Report successfully saved to {output_dest}")