
[Feature Request]: Enhance LLM usage logging in indexing workflows #2103

@june616

Description


Do you need to file an issue?

  • I have searched the existing issues and this feature is not already filed.
  • My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
  • I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

Hi team, I’ve been applying GraphRAG in a production use case (with Azure OpenAI) and noticed that while the query phase has detailed LLM usage logging, the indexing workflows currently lack similar observability. This makes it hard to track LLM call counts and token consumption during indexing, especially when optimizing for cost and latency.

Describe the solution you'd like

I’m implementing this enhancement by adding logging for LLM calls during the indexing phase, aligned with the existing logging structure (e.g., via --verbose). Here is an example of the resulting stats.json file after running the index command:

{
    "total_runtime": 98.39666604995728,
    "num_documents": 1,
    "update_documents": 0,
    "input_load_time": 0,
    "workflows": {
        "load_input_documents": {
            "overall": 0.05307793617248535
        },
        "create_base_text_units": {
            "overall": 0.05640888214111328
        },
        "create_final_documents": {
            "overall": 0.054161787033081055
        },
        "extract_graph": {
            "overall": 44.75299096107483
        },
        "finalize_graph": {
            "overall": 17.086341857910156
        },
        "extract_covariates": {
            "overall": 0.0010209083557128906
        },
        "create_communities": {
            "overall": 0.12611031532287598
        },
        "create_final_text_units": {
            "overall": 0.06224370002746582
        },
        "create_community_reports": {
            "overall": 14.221134662628174
        },
        "generate_text_embeddings": {
            "overall": 21.974961280822754
        }
    },
    # newly added fields (comment is illustrative only; not valid JSON)
    "total_llm_calls": 20,
    "total_prompt_tokens": 104652,
    "total_completion_tokens": 9691,
    "llm_usage_by_workflow": {
        "extract_graph": {
            "llm_calls": 5,
            "prompt_tokens": 66766,
            "completion_tokens": 5757
        },
        "create_community_reports": {
            "llm_calls": 10,
            "prompt_tokens": 29149,
            "completion_tokens": 3934
        },
        "generate_text_embeddings": {
            "llm_calls": 5,
            "prompt_tokens": 8737,
            "completion_tokens": 0
        }
    }
}
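
To make the direction more concrete, here is a rough sketch of how the per-workflow counters could be accumulated and then merged into the stats output. Everything below is my own illustration; names like LLMUsageTracker and record_usage are placeholders, not existing GraphRAG APIs:

from collections import defaultdict
from dataclasses import dataclass


@dataclass
class WorkflowUsage:
    llm_calls: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0


class LLMUsageTracker:
    """Accumulates LLM usage per indexing workflow (hypothetical sketch)."""

    def __init__(self) -> None:
        self._by_workflow: dict[str, WorkflowUsage] = defaultdict(WorkflowUsage)

    def record_usage(self, workflow: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Called once per LLM (or embedding) request issued by a workflow.
        usage = self._by_workflow[workflow]
        usage.llm_calls += 1
        usage.prompt_tokens += prompt_tokens
        usage.completion_tokens += completion_tokens

    def as_stats(self) -> dict:
        # Produces the extra top-level fields shown in the stats.json example above.
        workflows = {name: vars(u) for name, u in self._by_workflow.items()}
        return {
            "total_llm_calls": sum(u.llm_calls for u in self._by_workflow.values()),
            "total_prompt_tokens": sum(u.prompt_tokens for u in self._by_workflow.values()),
            "total_completion_tokens": sum(u.completion_tokens for u in self._by_workflow.values()),
            "llm_usage_by_workflow": workflows,
        }

The idea is that each workflow's LLM and embedding calls report their token counts to a shared tracker, and as_stats() is merged into the existing stats dictionary right before it is written to stats.json.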

If anyone has tips, design thoughts, or prior work in this area, I’d love to hear your feedback. Thanks!

Additional context

No response

Labels: enhancement (New feature or request)