I used Cohere's Command R+ model with web browser tool calling in [Hugging Face Chat](https://huggingface.co/docs/chat-ui/en/index) to summarize Ulysses.

Chat UI is a fantastic resource for exploring and experimenting with open-source tools and models at no cost. The app uses MongoDB and SvelteKit behind the scenes. Tool calling allows users to define a schema, which the model then uses to generate output and invoke external tools. The process simplifies the selection of tools and their parameters.

This approach makes it straightforward to connect a large language model to external tools and get structured output back, which is useful in many applications.
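To make the schema idea concrete, here is a minimal sketch of tool calling in plain Python. It is not the Chat UI or Cohere API: the `web_search` tool, its schema fields, and the JSON shape of the model's tool call are all assumptions made for illustration.

```python
import json

# Hypothetical tool schema: the name, description, and parameter names are
# illustrative and are not taken from Chat UI or Cohere's API.
WEB_SEARCH_TOOL = {
    "name": "web_search",
    "description": "Search the web and return short text snippets.",
    "parameters": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Search terms."},
        },
        "required": ["query"],
    },
}


def web_search(query: str) -> str:
    """Stand-in for a real search backend; returns a canned string."""
    return f"(results for: {query})"


# Registry mapping each tool name to its schema and Python callable.
TOOLS = {"web_search": (WEB_SEARCH_TOOL, web_search)}


def dispatch(model_output: str) -> str:
    """Parse a JSON tool call emitted by the model and run the matching tool."""
    call = json.loads(model_output)
    schema, func = TOOLS[call["name"]]
    args = call.get("arguments", {})
    # Minimal validation: every required parameter must be present.
    missing = [p for p in schema["parameters"]["required"] if p not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return func(**args)


# Assumed example of what a model might emit after reading the schema.
example_call = '{"name": "web_search", "arguments": {"query": "Ulysses summary"}}'
print(dispatch(example_call))
```

The model only ever sees the schema and produces the JSON; choosing the tool and filling in its parameters is the part that tool calling hands over to the model.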

The conversation is available on [Hugging Face Chat](https://hf.co/chat/r/wWlL4Df).

In [None]:
!pip install git+https://github.com/huggingface/parler-tts.git


In [None]:
import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf

# Use the GPU when one is available; otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the Parler-TTS Mini v0.1 checkpoint and its matching tokenizer.
model = ParlerTTSForConditionalGeneration.from_pretrained("parler-tts/parler_tts_mini_v0.1").to(device)
tokenizer = AutoTokenizer.from_pretrained("parler-tts/parler_tts_mini_v0.1")

In [None]:
description = """
A female speaker with a low-pitched voice delivers 
her words quite expressively, 
in a very confined sounding environment with clear audio quality. 
She speaks at a normal speed.
Ensure people understand her clearly.
The tone is confident, enthusiastic, personal, informative, casual and friendly.
Reverberation time is between 1.0 and 1.5 s.
Normal speaking pitch is between 165 hertz and 255 hertz.
"""

prompt = """

James Joyce's Ulysses is a challenging and innovative novel.

It takes place over the course of a single day in Dublin, Ireland. 
The book follows three main characters: Stephen Dedalus, Leopold Bloom, and Molly Bloom.

It offers an in-depth exploration of their inner lives through 
a stream-of-consciousness narrative style. 

"""

In [None]:
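# `description` controls how the voice sounds; `prompt` is the text to be spoken.
input_ids = tokenizer(description, return_tensors="pt").input_ids.to(device)
prompt_input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)

# Generate the waveform, then move it to the CPU as a 1-D NumPy array.
generation = model.generate(input_ids=input_ids, prompt_input_ids=prompt_input_ids)
audio_arr = generation.cpu().numpy().squeeze()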

In [None]:
from IPython.display import Audio
Audio(audio_arr, rate=model.config.sampling_rate, autoplay=True)

A hosted demo of the model is available at https://huggingface.co/spaces/parler-tts/parler_tts_mini

* What is a normal speaking pitch?
  Typical male voices range in pitch from 85 hertz to 180 hertz; typical female voices range from 165 hertz to 255 hertz. [Source: DPA Microphones](https://www.dpamicrophones.com/mic-university/facts-about-speech-intelligibility#:~:text=In%20general%2C%20the%20fundamental%20frequency,f0%20is%20around%20300%20Hz.)

The model did not perform well in this test. It is not good at reading numbers aloud, and it stopped before finishing the sentence.
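One possible workaround, untested here and offered only as a sketch, is to preprocess the text before synthesis: spell out small integers as words and feed the model one sentence at a time. All helper names below (`number_to_words`, `spell_out_numbers`, `split_sentences`, `synthesize_in_chunks`) are hypothetical, and `synthesize` stands for any callable that turns a string into a list of audio samples, for example a wrapper around the tokenizer and `model.generate` calls above.

```python
import re
from typing import Callable, List

_ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
         "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
         "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
_TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
         "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 999 in English."""
    if not 0 <= n <= 999:
        raise ValueError("only 0-999 is supported in this sketch")
    if n < 20:
        return _ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return _TENS[tens] + ("-" + _ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = _ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def spell_out_numbers(text: str) -> str:
    """Replace standalone integers 0-999 with words; leave decimals and larger numbers alone."""
    def repl(match: re.Match) -> str:
        n = int(match.group())
        return number_to_words(n) if n <= 999 else match.group()
    # Skip digits that are part of a decimal (1.5) or glued to letters (v0).
    return re.sub(r"(?<![\w.])\d+(?!\.\d|\w)", repl, text)


def split_sentences(text: str) -> List[str]:
    """Collapse whitespace, then split after sentence-ending punctuation."""
    flat = " ".join(text.split())
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]


def synthesize_in_chunks(text: str,
                         synthesize: Callable[[str], List[float]]) -> List[float]:
    """Run `synthesize` on each sentence and concatenate the audio samples."""
    audio: List[float] = []
    for sentence in split_sentences(spell_out_numbers(text)):
        audio.extend(synthesize(sentence))
    return audio
```

Shorter inputs may give the model less room to trail off mid-sentence, but that is an assumption to verify by listening to the output, not a measured result.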
