[BUG] LiteLLMModel fails on structured output and uses blocking async calls #402

@himanshushukla31

Description

Checks

  • I have updated to the latest minor and patch version of Strands
  • I have checked the documentation and this is not expected behavior
  • I have searched ./issues and there are no duplicates of my issue

Strands Version

0.2.1

Python Version

3.11

Operating System

macOS Sonoma 14.5

Installation Method

pip

Steps to Reproduce

Run the script below:

import asyncio
import os
from pydantic import BaseModel
from strands import Agent
from strands.models.litellm import LiteLLMModel


# 1. Define your Pydantic model
class PersonInfo(BaseModel):
    name: str
    age: int
    occupation: str

async def main():
    # 2. Create a LiteLLMModel instance
    #    Make sure your API key is set as an environment variable
    litellm_model = LiteLLMModel(
        model_id="gemini/gemini",
        client_args={"api_key": os.getenv("GEMINI_API_KEY")}
    )

    # 3. Pass the model to the Agent during initialization
    agent = Agent(model=litellm_model)

    # 4. Call structured_output (it's an async generator)
    async for result in agent.structured_output(
        PersonInfo,
        "John Smith is a 30-year-old software engineer"
    ):
        # The result is wrapped in a dictionary with the key "output"
        person = result["output"]
        print(f"Name: {person.name}")
        print(f"Age: {person.age}")
        print(f"Occupation: {person.occupation}")

if __name__ == "__main__":
    asyncio.run(main())

Expected Behavior

Name: John Smith
Age: 30
Occupation: software engineer

Actual Behavior

File "/opt/anaconda3/envs/llms/lib/python3.11/site-packages/strands/models/litellm.py", line 146, in structured_output
raise ValueError("No tool_calls found in response")
ValueError: No tool_calls found in response
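
For context, the failing check presumably fires because the request is sent without tools, so the finish reason comes back as stop rather than tool_calls (see the debug log under Possible Solution). A hypothetical reconstruction of that code path, with names and structure assumed rather than taken from the actual Strands source:

import litellm

def failing_path_sketch(model_id: str, messages: list[dict]):
    # The request carries no tools or tool_choice, so the model answers
    # in plain text and finish_reason comes back as "stop".
    response = litellm.completion(model=model_id, messages=messages)
    choice = response.choices[0]
    if choice.finish_reason != "tool_calls":
        # Always taken under the current setup, producing the error above.
        raise ValueError("No tool_calls found in response")
    return choice.message.tool_calls[0]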

Additional Context

I have tried this with both Gemini and Azure OpenAI models.
There is a second, underlying issue: the method is declared async, but it calls litellm's blocking API instead of the async one, so it blocks the event loop (see the sketch below).
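
A minimal sketch of that mismatch, assuming the current implementation calls the blocking entry point:

import litellm

# Assumed current pattern: a blocking network call inside an async
# method, which stalls the event loop for the duration of the request.
async def blocking_variant(model_id: str, messages: list[dict]):
    return litellm.completion(model=model_id, messages=messages)

# Consistent pattern: litellm's native async entry point, which yields
# control back to the event loop while waiting on the API.
async def async_variant(model_id: str, messages: list[dict]):
    return await litellm.acompletion(model=model_id, messages=messages)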

Possible Solution

The issue appears to be in the structured_output method setup in litellm.py.
The code expects a response whose finish reason is tool_calls, but because the request is not configured to use tools, the LLM returns a standard text answer with a finish reason of stop every time, and the method then falls back to raising the ValueError.
Error while debugging locally:
ERROR - Error during insight generation: No tool_calls found in response. Available choices: ['stop']

Passing a tool spec built from the output model and forcing tool_choice fixes it:

# Convert the output model into a tool spec and force the model to call it.
tool = _pydantic_to_tool_spec(output_model)

# Use litellm's async API so the coroutine no longer blocks the event loop.
response = await litellm.acompletion(
    model=self.get_config()["model_id"],
    messages=super().format_request(prompt)["messages"],
    tools=[tool],
    tool_choice={"type": "function", "function": {"name": tool["function"]["name"]}},
    **self.client_args,
    **(self.get_config().get("params", {})),
)

This works locally (I have tested it), and I also updated the method to use litellm's async API. I would like to work on a PR for this. Is it okay if I take this on?
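
For completeness, a sketch of how the forced tool call could then be parsed back into the caller's Pydantic model; the helper name and the "output" wrapping are illustrative, not the actual Strands code:

import json
from pydantic import BaseModel

def parse_forced_tool_call(response, output_model: type[BaseModel]) -> dict:
    # With tool_choice forcing the function, the model returns its answer
    # as JSON arguments on a tool call rather than as plain text.
    tool_call = response.choices[0].message.tool_calls[0]
    arguments = json.loads(tool_call.function.arguments)
    # Validate against the Pydantic model and wrap it under the "output"
    # key, matching how the result is surfaced in the reproduction script.
    return {"output": output_model(**arguments)}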

Related Issues

No response
