Feat/cartesia integration fixes #2496

Merged
dirkbrnd merged 8 commits into agno-agi:feat/cartesia-integration from Mustafa-Esoofally:feat/cartesia-integration-fixes
Apr 8, 2025

Conversation

@Mustafa-Esoofally
Contributor

Description

  • Summary of changes: Describe the key changes in this PR and their purpose.
  • Related issues: Mention if this PR fixes or is connected to any issues.
  • Motivation and context: Explain the reason for the changes and the problem they solve.
  • Environment or dependencies: Specify any changes in dependencies or environment configurations required for this update.
  • Impact on metrics: (If applicable) Describe changes in any metrics or performance benchmarks.

Fixes # (issue)


Type of change

Please check the options that are relevant:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Model update (Addition or modification of models)
  • Other (please describe):

Checklist

  • Adherence to standards: Code complies with Agno’s style guidelines and best practices.
  • Formatting and validation: You have run ./scripts/format.sh and ./scripts/validate.sh to ensure code is formatted and linted.
  • Self-review completed: A thorough review has been performed by the contributor(s).
  • Documentation: Docstrings and comments have been added or updated for any complex logic.
  • Examples and guides: Relevant cookbook examples have been included or updated (if applicable).
  • Tested in a clean environment: Changes have been tested in a clean environment to confirm expected behavior.
  • Tests (optional): Tests have been added or updated to cover any new or changed functionality.

Additional Notes

Include any deployment notes, performance implications, security considerations, or other relevant information (e.g., screenshots or logs if applicable).

@Mustafa-Esoofally requested a review from a team as a code owner on March 22, 2025 00:02
@Mustafa-Esoofally mentioned this pull request on Mar 22, 2025
12 tasks
from dotenv import load_dotenv

# Get Cartesia API key from environment or use a default for demo
cartesia_api_key = os.environ.get("CARTESIA_API_KEY", "sk_car_4y7Jz9aKsF6VeLpBKzKwJ")
Contributor

Your API key will get leaked

Contributor

Please remove!

Contributor Author

done
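
For context, a minimal sketch of the pattern the reviewers are asking for: read the key from the environment and fail loudly when it is missing, with no literal fallback in source. The helper name `get_cartesia_api_key` and its `env` parameter are hypothetical and not taken from the PR.

```python
from os import getenv
from typing import Mapping, Optional


def get_cartesia_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return CARTESIA_API_KEY or raise; never fall back to a hard-coded key.

    `env` is an optional mapping (hypothetical, for testing only); by default
    the real process environment is consulted.
    """
    value = getenv("CARTESIA_API_KEY") if env is None else env.get("CARTESIA_API_KEY")
    if not value:
        raise ValueError("CARTESIA_API_KEY is not set; export it before running the example")
    return value
```

Raising on a missing key keeps the failure visible at startup instead of silently shipping a credential in the repository history.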

Comment on lines 2 to 4
Example script for using the Cartesia toolkit with an Agno agent for text-to-speech generation.
"""

Contributor

We usually add setup steps at the top for that tool

Contributor Author

done

self.client = Cartesia(api_key=self.api_key)

# Set default output directory for audio files
self.output_dir = os.path.join(os.getcwd(), "cartesia_output")
Contributor

tmp/cartesia

Contributor Author

done

self.client = Cartesia(api_key=self.api_key)

# Set default output directory for audio files
self.output_dir = os.path.join(os.getcwd(), "tmp/cartesia")
Member

Please use from pathlib import Path instead of os.path

Contributor Author

made the change
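
The requested `pathlib` form might look roughly like the sketch below. It is an illustration rather than the code that was merged: the function name `default_output_dir` is hypothetical, and creating the directory eagerly is an assumption the original `os.path.join` line did not make.

```python
from pathlib import Path
from typing import Optional


def default_output_dir(base: Optional[Path] = None) -> Path:
    """Build <base>/tmp/cartesia, where base defaults to the current working directory."""
    root = Path.cwd() if base is None else Path(base)
    output_dir = root / "tmp" / "cartesia"
    # Assumption: create the directory up front so later writes do not fail.
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir
```

Joining with `/` avoids embedding a separator in a string such as "tmp/cartesia", which is the main readability gain over `os.path.join` here.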

- Set the following environment variable:
export CARTESIA_API_KEY="your_api_key"
import os
import sys
Member

Do we need to import these two libraries entirely?

Contributor Author

made the change

# )

# Example 5: Stream TTS - Generate speech with streaming capabilities
agent.print_response(
Member

Please comment out the first example, this looks odd

Contributor Author

resolved

import json
from os import getenv
from typing import Dict, List, Optional, Any
import os
Member

Suggested change
import os

Contributor Author

resolved

Comment on lines 14 to 15
cartesia_api_key = os.environ.get("CARTESIA_API_KEY")
load_dotenv()
Contributor

Don't need either of these

Contributor Author

resolved

output_format_bit_rate: int = 128000,
output_format_encoding: str = None,
output_path: str = None,
**kwargs,
Contributor

We can't have kwargs. How would the model know what to pass?

Contributor Author

resolved
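
One way to read the objection: tool-calling setups commonly derive the argument schema shown to the model from the function signature, and `**kwargs` contributes no named, typed fields to describe. The sketch below uses only the standard library's `inspect` module to list what a signature exposes. It illustrates the idea and is not how Agno builds its schemas; `tts_with_kwargs`, `tts_explicit`, and `named_parameters` are hypothetical names.

```python
import inspect
from typing import List, Optional


def tts_with_kwargs(transcript: str, **kwargs) -> str:
    # Hypothetical tool: extra options are hidden behind **kwargs.
    return transcript


def tts_explicit(
    transcript: str,
    output_format_sample_rate: int = 44100,
    output_format_bit_rate: int = 128000,
    output_format_encoding: Optional[str] = None,
) -> str:
    # Hypothetical tool: every option is a named, typed parameter.
    return transcript


def named_parameters(func) -> List[str]:
    """Names a schema generator could describe; *args and **kwargs are skipped."""
    variadic = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)
    return [
        name
        for name, param in inspect.signature(func).parameters.items()
        if param.kind not in variadic
    ]


print(named_parameters(tts_with_kwargs))
print(named_parameters(tts_explicit))
```

The first call lists only `transcript`, while the second lists all four parameters, which is the information a model needs in order to know what it may pass.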

output_format_sample_rate: int = 44100,
output_format_bit_rate: int = 128000,
output_format_encoding: str = None,
**kwargs,
Contributor

Same here

Contributor Author

resolved

logger.error(f"Error saving audio data: {e}")
return json.dumps({"error": str(e)})

def text_to_speech_stream(
Contributor

What is the difference with this function? It still just dumps everything?

Contributor Author

Cartesia basically has 3 different text-to-speech options: https://docs.cartesia.ai/2024-06-10/api-reference/tts/bytes. I have added all of them. We might need them later when we build advanced agents / apps
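
To make the reviewer's question concrete: a streaming variant only differs in practice if chunks are consumed as they arrive rather than joined into one buffer first. A minimal sketch follows, assuming a hypothetical `fake_audio_chunks` generator that stands in for whatever chunk iterator a streaming TTS client yields. It does not use or represent the Cartesia SDK's API.

```python
from pathlib import Path
from typing import Iterable, Iterator


def fake_audio_chunks() -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response."""
    yield from (b"\x00\x01", b"\x02\x03", b"\x04")


def save_buffered(chunks: Iterable[bytes], path: Path) -> int:
    """Collect the whole response in memory, then write it once."""
    data = b"".join(chunks)
    path.write_bytes(data)
    return len(data)


def save_streaming(chunks: Iterable[bytes], path: Path) -> int:
    """Write each chunk as it arrives, holding only one chunk in memory at a time."""
    written = 0
    with path.open("wb") as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
    return written
```

Both helpers produce the same file. The streaming one matters when audio should start playing or be forwarded before generation finishes, which a tool that returns a single JSON string at the end cannot take advantage of.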

@dirkbrnd merged commit b9fecec into agno-agi:feat/cartesia-integration on Apr 8, 2025