<a target="_blank" href="https://colab.research.google.com/github.com/SylphAI-Inc/AdalFlow/blob/main/notebooks/adalflow_dataclasses.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>


# 🤗 Welcome to AdalFlow!
## The PyTorch library to auto-optimize any LLM task pipeline

Thanks for trying us out! We're here to provide you with the best LLM application development experience you can dream of 😊 If you have any questions or concerns, [come talk to us on Discord](https://discord.gg/ezzszrRZvT); we're always here to help! ⭐ <i>Star us on <a href="https://github.com/SylphAI-Inc/AdalFlow">GitHub</a> </i> ⭐


# Quick Links

GitHub repo: https://github.com/SylphAI-Inc/AdalFlow

Full Tutorials: https://adalflow.sylph.ai/index.html#.

Deep dive on each API: check out the [developer notes](https://adalflow.sylph.ai/tutorials/index.html).

Common use cases along with the auto-optimization:  check out [Use cases](https://adalflow.sylph.ai/use_cases/index.html).

# Outline

This is a quick introduction to what AdalFlow is capable of. We will cover:

* How to use the AdalFlow `DataClass`
* How to build nested `DataClass` structures with optional fields

**Next: Try our [auto-optimization](https://colab.research.google.com/drive/1n3mHUWekTEYHiBdYBTw43TKlPN41A9za?usp=sharing)**


# Installation

1. Use `pip` to install the `adalflow` Python package. We will need `openai`, `groq`, and `faiss` (CPU version) from the extra packages.

  ```bash
  pip install adalflow[openai,groq,faiss-cpu]
  ```
2. Set up the `openai` and `groq` API keys in the environment variables.

### Install adalflow

In [30]:
# Install adalflow with necessary dependencies
from IPython.display import clear_output

!pip install -U adalflow[openai,groq,faiss-cpu]

clear_output()

### Set Environment Variables

Note: Enter your API keys in the cell below.

In [None]:
%%writefile .env

OPENAI_API_KEY="PASTE-OPENAI_API_KEY_HERE"
GROQ_API_KEY="PASTE-GROQ_API_KEY-HERE"

Writing .env


### Import necessary libraries

In [None]:
# Import required libraries
from IPython.display import clear_output
from dataclasses import dataclass, field
from typing import List, Dict, Optional
import adalflow as adal
from adalflow.components.model_client import GroqAPIClient
from adalflow.utils import setup_env

In [None]:
# Load environment variables. Make sure .env is in the current folder and contains GROQ_API_KEY (and OPENAI_API_KEY if you use OpenAI models)
setup_env(".env")
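
Before calling a model, it can help to confirm that the keys were actually loaded and that the `PASTE-...` placeholders from the `.env` cell were replaced. The helper below is a minimal sketch using only the standard library; `missing_api_keys` is a hypothetical name and not part of AdalFlow.

```python
import os
from typing import List, Mapping


def missing_api_keys(names: List[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset, empty, or still hold a 'PASTE-' placeholder."""
    missing = []
    for name in names:
        # Strip whitespace and stray quotes that may survive from the .env file
        value = env.get(name, "").strip().strip('"')
        if not value or value.startswith("PASTE-"):
            missing.append(name)
    return missing


# Demonstration with an explicit mapping instead of the real environment
demo_env = {
    "OPENAI_API_KEY": "PASTE-OPENAI_API_KEY_HERE",
    "GROQ_API_KEY": "gsk_example",  # made-up value for illustration
}
print(missing_api_keys(["OPENAI_API_KEY", "GROQ_API_KEY"], demo_env))
```

In the notebook itself, calling `missing_api_keys(["GROQ_API_KEY"])` after `setup_env(".env")` would check the real environment.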

### Basic Vanilla Example

In [None]:
# Define the output structure using dataclass
@dataclass
class BasicQAOutput(adal.DataClass):
    explanation: str = field(
        metadata={"desc": "A brief explanation of the concept in one sentence."}
    )
    example: str = field(
        metadata={"desc": "An example of the concept in a sentence."}
    )
    # Control output fields order
    __output_fields__ = ["explanation", "example"]

# Define the template using jinja2 syntax
qa_template = r"""<SYS>
You are a helpful assistant.
<OUTPUT_FORMAT>
{{output_format_str}}
</OUTPUT_FORMAT>
</SYS>
User: {{input_str}}"""
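
To make the two placeholders in the template concrete, the sketch below builds a simple format description from the field `desc` metadata and substitutes it into a shortened copy of the template. It is an illustration only: `DemoOutput`, `describe_fields`, and `fill` are hypothetical names, the regex substitution is a stand-in for jinja2 rendering, and the format string AdalFlow actually generates will look different.

```python
import re
from dataclasses import dataclass, field, fields


@dataclass
class DemoOutput:  # hypothetical plain-dataclass stand-in for BasicQAOutput
    explanation: str = field(
        metadata={"desc": "A brief explanation of the concept in one sentence."}
    )
    example: str = field(
        metadata={"desc": "An example of the concept in a sentence."}
    )


def describe_fields(cls) -> str:
    """List each field name with its 'desc' metadata, one per line."""
    return "\n".join(f"{f.name}: {f.metadata.get('desc', '')}" for f in fields(cls))


def fill(template: str, values: dict) -> str:
    """Replace {{name}} placeholders; unknown names are left untouched."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda m: str(values.get(m.group(1), m.group(0))),
        template,
    )


demo_template = "<OUTPUT_FORMAT>\n{{output_format_str}}\n</OUTPUT_FORMAT>\nUser: {{input_str}}"
prompt = fill(
    demo_template,
    {"output_format_str": describe_fields(DemoOutput), "input_str": "What is LLM?"},
)
print(prompt)
```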

In [None]:
# Define the QA component
class QA(adal.Component):
    def __init__(self, model_client: adal.ModelClient, model_kwargs: Dict):
        super().__init__()

        # Initialize the parser with the output dataclass
        parser = adal.DataClassParser(data_class=BasicQAOutput, return_data_class=True)

        # Set up the generator with model, template, and parser
        self.generator = adal.Generator(
            model_client=model_client,
            model_kwargs=model_kwargs,
            template=qa_template,
            prompt_kwargs={"output_format_str": parser.get_output_format_str()},
            output_processors=parser,
        )

    def call(self, query: str):
        """Synchronous call to generate response"""
        return self.generator.call({"input_str": query})

    async def acall(self, query: str):
        """Asynchronous call to generate response"""
        return await self.generator.acall({"input_str": query})
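
An asynchronous `acall` is mainly useful when several queries should run concurrently. The sketch below shows that pattern with `asyncio.gather`; `fake_acall` is a hypothetical stand-in that sleeps instead of contacting a model, so the example runs without API keys.

```python
import asyncio
from typing import List


async def fake_acall(query: str) -> str:
    """Hypothetical stand-in for QA.acall; sleeps instead of calling a model."""
    await asyncio.sleep(0.01)
    return f"answer to: {query}"


async def ask_all(queries: List[str]) -> List[str]:
    # gather runs the coroutines concurrently and preserves the input order
    return await asyncio.gather(*(fake_acall(q) for q in queries))


answers = asyncio.run(ask_all(["What is LLM?", "What is RAG?"]))
print(answers)
```

In a notebook, where an event loop is already running, `await ask_all(...)` would replace `asyncio.run(...)`.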


In [None]:
# Example usage
def run_basic_example():
    # Instantiate the QA class with Groq model
    qa = QA(
        model_client=GroqAPIClient(),
        model_kwargs={"model": "llama3-8b-8192"},
    )

    # Print the QA instance details
    print(qa)

    # Test the QA system
    response = qa("What is LLM?")
    print("\nResponse:")
    print(response)
    print(f"Explanation: {response.data.explanation}")
    print(f"Example: {response.data.example}")

### Example 1 - Movie analysis data class

In [None]:
# 1. Basic DataClass with different field types
@dataclass
class MovieReview(adal.DataClass):
    title: str = field(
        metadata={"desc": "The title of the movie"}
    )
    rating: float = field(
        metadata={
            "desc": "Rating from 1.0 to 10.0",
            "min": 1.0,
            "max": 10.0
        }
    )
    pros: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of positive points about the movie"}
    )
    cons: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of negative points about the movie"}
    )

    __output_fields__ = ["title", "rating", "pros", "cons"]
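
Python does not validate dataclass `metadata`, so the `min` and `max` entries above describe the expected range rather than enforce it. If a range check is wanted after parsing, one option is a small helper like the sketch below; `DemoReview` and `range_errors` are hypothetical names, and this is not something the notebook shows AdalFlow doing on its own.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class DemoReview:  # hypothetical plain-dataclass stand-in for MovieReview
    title: str
    rating: float = field(metadata={"min": 1.0, "max": 10.0})


def range_errors(obj) -> List[str]:
    """Compare each field value against optional 'min'/'max' metadata."""
    errors = []
    for f in fields(obj):
        value = getattr(obj, f.name)
        lo, hi = f.metadata.get("min"), f.metadata.get("max")
        if lo is not None and value < lo:
            errors.append(f"{f.name}={value} is below min {lo}")
        if hi is not None and value > hi:
            errors.append(f"{f.name}={value} is above max {hi}")
    return errors


print(range_errors(DemoReview(title="The Matrix", rating=8.7)))
print(range_errors(DemoReview(title="The Matrix", rating=11.0)))
```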


In [None]:

@dataclass
class Actor(adal.DataClass):
    name: str = field(metadata={"desc": "Actor's full name"})
    role: str = field(metadata={"desc": "Character name in the movie"})

In [None]:
from adalflow.core.functional import custom_asdict, dataclass_obj_from_dict

In [None]:
# 2. Nested DataClass example

@dataclass
class DetailedMovieReview(adal.DataClass):
    basic_review: MovieReview
    cast: List[Actor] = field(
        default_factory=list,
        metadata={"desc": "List of main actors in the movie"}
    )
    genre: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of genres for the movie"}
    )
    recommend: bool = field(
        default=False,
        metadata={"desc": "Whether you would recommend this movie"}
    )

    __output_fields__ = ["basic_review", "cast", "genre", "recommend"]

In [None]:
# 3. DataClass with optional fields
@dataclass
class MovieAnalysis(adal.DataClass):
    review: DetailedMovieReview
    box_office: Optional[float] = field(
        default=None,
        metadata={"desc": "Box office earnings in millions of dollars"}
    )
    awards: Optional[Dict[str, int]] = field(
        default=None,
        metadata={"desc": "Dictionary of award categories and number of wins"}
    )

    __output_fields__ = ["review", "box_office", "awards"]
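
The combination of nested and optional fields can be seen in isolation with plain dataclasses. In the sketch below, the optional fields fall back to `None` when the input dict omits them, and the nested list is rebuilt element by element. `DemoActor`, `DemoAnalysis`, and `analysis_from_dict` are hypothetical names, and the conversion is hand-written rather than AdalFlow's own.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DemoActor:
    name: str
    role: str


@dataclass
class DemoAnalysis:
    title: str
    cast: List[DemoActor] = field(default_factory=list)
    box_office: Optional[float] = None  # optional: absent means None
    awards: Optional[Dict[str, int]] = None


def analysis_from_dict(raw: dict) -> DemoAnalysis:
    """Build the nested structure, leaving missing optional fields as None."""
    return DemoAnalysis(
        title=raw["title"],
        cast=[DemoActor(**a) for a in raw.get("cast", [])],
        box_office=raw.get("box_office"),
        awards=raw.get("awards"),
    )


parsed = analysis_from_dict(
    {
        "title": "The Matrix",
        "cast": [{"name": "Keanu Reeves", "role": "Neo"}],
        "box_office": 463.5,
    }
)
print(parsed.cast[0].role, parsed.box_office, parsed.awards)
```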

In [None]:
# Example template for movie review
movie_review_template = r"""<SYS>
You are a professional movie critic. Analyze the given movie and provide a detailed review.
<OUTPUT_FORMAT>
{{output_format_str}}
</OUTPUT_FORMAT>
</SYS>
User: Review this movie: {{movie_title}}"""


In [None]:
# Create the MovieReviewer component with MovieAnalysis data class
class MovieReviewer(adal.Component):
    def __init__(self, model_client: adal.ModelClient, model_kwargs: Dict):
        super().__init__()
        self.additional_structure_prompt = "Dont use 'type' and 'properties' in output directly give as dict"
        parser = adal.DataClassParser(
            data_class=MovieAnalysis,
            return_data_class=True
        )
        self.generator = adal.Generator(
            model_client=model_client,
            model_kwargs=model_kwargs,
            template=movie_review_template,
            prompt_kwargs={"output_format_str": parser.get_output_format_str() + self.additional_structure_prompt},
            output_processors=parser,
        )

    def call(self, movie_title: str):
        return self.generator.call({"movie_title": movie_title})

In [None]:
# Use the MovieReviewer class to analyze a movie
def run_movie_analysis_example():
    reviewer = MovieReviewer(
        model_client=GroqAPIClient(),
        model_kwargs={"model": "llama3-8b-8192"},
    )

    # Get a movie review
    analysis = reviewer.call("The Matrix")

    # Access nested data
    print(f"Movie Title: {analysis.data.review['basic_review']['title']}")
    print(f"Rating: {analysis.data.review['basic_review']['rating']}")
    print("\nPros:")
    for pro in analysis.data.review["basic_review"]["pros"]:
        print(f"- {pro}")

    print("\nCast:")
    for actor in analysis.data.review["cast"]:
        print(f"- {actor['name']} as {actor['role']}")

    if analysis.data.box_office:
        print(f"\nBox Office: ${analysis.data.box_office} million")

    if analysis.data.awards:
        print("\nAwards:")
        for category, count in analysis.data.awards.items():
            print(f"- {category}: {count}")

In [None]:
run_movie_analysis_example()

Movie Title: The Matrix
Rating: 8.7

Pros:
- Groundbreaking special effects
- Thought-provoking storyline
- Innovative action sequences

Cast:
- Keanu Reeves as Neo
- Laurence Fishburne as Morpheus
- Carrie-Anne Moss as Trinity

Box Office: $463.5 million
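
The example reads nested values with chained `[...]` lookups, which raise an error if the model omits a key. A defensive alternative is a small lookup helper, sketched below; `dig` is a hypothetical name, and the `demo` dict only mirrors the shape of the output above.

```python
from typing import Any


def dig(data: Any, *keys: Any, default: Any = None) -> Any:
    """Follow keys/indexes through nested dicts and lists, returning default on a miss."""
    current = data
    for key in keys:
        try:
            current = current[key]
        except (KeyError, IndexError, TypeError):
            return default
    return current


demo = {
    "review": {
        "basic_review": {"title": "The Matrix", "rating": 8.7},
        "cast": [{"name": "Keanu Reeves", "role": "Neo"}],
    }
}
print(dig(demo, "review", "basic_review", "title"))
print(dig(demo, "review", "cast", 0, "role"))
print(dig(demo, "review", "box_office", default="n/a"))
```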


### Example 2: Song Review
Note: This example adapts Example 1 (movie review) to songs, showing how the same DataClass patterns apply to a similar task.

In [None]:
# 1. Basic DataClass with different field types
@dataclass
class SongReview(adal.DataClass):
    title: str = field(
        metadata={"desc": "The title of the song"}
    )
    album: str = field(
        metadata={"desc": "The album of the song"}
    )
    ranking: int = field(
        metadata={
            "desc": "Billboard peak ranking from 1 to 200",
            "min": 1,
            "max": 200
        }
    )
    streaming: Dict[str, int] = field(
        default_factory=dict,
        metadata={"desc": "Dict of latest approximate streaming counts on Spotify and YouTube, in millions"}
    )
    pros: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of positive points about the song"}
    )
    cons: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of negative points about the song"}
    )

    __output_fields__ = ["title", "album", "ranking", "streaming", "pros", "cons"]


In [None]:

@dataclass
class Artist(adal.DataClass):
    name: str = field(metadata={"desc": "Artist's full name"})
    role: str = field(metadata={"desc": "Artist's role in the song"})

In [None]:
# 2. Nested DataClass example

@dataclass
class DetailedSongReview(adal.DataClass):
    basic_review: SongReview = field(
        metadata={"desc": "Basic song review details"}
    )
    cast: List[Artist] = field(
        default_factory=list,
        metadata={"desc": "List of main singers, lyricists, and musicians in the song"}
    )
    genre: List[str] = field(
        default_factory=list,
        metadata={"desc": "List of genres for the song"}
    )
    recommend: bool = field(
        default=False,
        metadata={"desc": "Whether you would recommend this song"}
    )

    __output_fields__ = ["basic_review", "cast", "genre", "recommend"]

In [None]:
# 3. DataClass with optional fields
@dataclass
class SongAnalysis(adal.DataClass):
    review: DetailedSongReview = field(
        metadata={"desc": "Song review details"}
    )
    duration: Optional[float] = field(
        default=None,
        metadata={"desc": "Duration of the song"}
    )
    awards: Optional[Dict[str, int]] = field(
        default=None,
        metadata={"desc": "Dictionary of award categories and number of wins"}
    )

    __output_fields__ = ["review", "duration", "awards"]

In [None]:
# Example template for song review
song_review_template = r"""<SYS>
You are a professional song critic. Analyze the given song and provide a detailed review.
<OUTPUT_FORMAT>
{{output_format_str}}
</OUTPUT_FORMAT>
</SYS>
User: Review this song: {{song_title}}"""


In [None]:
# Create the SongReviewer component with SongAnalysis data class
class SongReviewer(adal.Component):
    def __init__(self, model_client: adal.ModelClient, model_kwargs: Dict):
        super().__init__()
        self.additional_structure_prompt = "Dont use 'type' and 'properties' in output directly give as dict"
        parser = adal.DataClassParser(
            data_class=SongAnalysis,
            return_data_class=False,
            format_type="json"
        )
        self.generator = adal.Generator(
            model_client=model_client,
            model_kwargs=model_kwargs,
            template=song_review_template,
            prompt_kwargs={"output_format_str": parser.get_output_format_str() + self.additional_structure_prompt },
            output_processors=parser,
        )

    def call(self, song_title: str):
        return self.generator.call({"song_title": song_title})

In [None]:
# Use SongReviewer Class for QA
def run_song_analysis_example():
    reviewer = SongReviewer(
        model_client=GroqAPIClient(),
        model_kwargs={"model": "llama3-8b-8192"},
    )

    # Get a song review
    analysis = reviewer.call("A Thousand Years")
    print(analysis)
    # Access nested data
    print(f"Song Title: {analysis.data['review']['basic_review']['title']}")
    print(f"Album: {analysis.data['review']['basic_review']['album']}")
    print(f"Ranking: {analysis.data['review']['basic_review']['ranking']}")

    for platform, views in analysis.data['review']['basic_review']['streaming'].items():
        print(f"- {platform} - {views} million views")
    print("\nPros:")
    for pro in analysis.data['review']["basic_review"]["pros"]:
        print(f"- {pro}")

    print("\nArtists:")
    for artist in analysis.data['review']["cast"]:
        print(f"- {artist['name']} as {artist['role']}")

    if analysis.data['review']['genre']:
        print("\nGenre:")
        for genre in analysis.data['review']['genre']:
            print(f" {genre} ")

    if analysis.data['duration']:
        print(f"\nDuration: {analysis.data['duration']} minutes")

    if analysis.data['awards']:
        print("\nAwards:")
        for category, count in analysis.data['awards'].items():
            print(f"- {category}: {count}")

In [None]:
run_song_analysis_example()

GeneratorOutput(id=None, data={'review': {'basic_review': {'title': 'A Thousand Years', 'album': 'The Twilight Saga: Breaking Dawn - Part 1', 'ranking': 17, 'streaming': {'spotify': 1.2, 'youtube': 1.5}, 'pros': ['timeless lyrics', 'catchy melody', 'beautiful violin solo'], 'cons': ['can be overplayed', 'some may find it cheesy']}, 'cast': [{'name': 'Christina Perri', 'role': 'lead vocals, songwriting'}], 'genre': ['pop', 'rock'], 'recommend': True}, 'duration': 4.45, 'awards': {'Teen Choice Awards': 1, 'MTV Video Music Awards': 1, "People's Choice Awards": 1}}, error=None, usage=CompletionUsage(completion_tokens=201, prompt_tokens=589, total_tokens=790), raw_response='```\n{\n    "review": {\n        "basic_review": {\n            "title": "A Thousand Years",\n            "album": "The Twilight Saga: Breaking Dawn - Part 1",\n            "ranking": 17,\n            "streaming": {"spotify": 1.2, "youtube": 1.5},\n            "pros": ["timeless lyrics", "catchy melody", "beautiful violi
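
The `raw_response` above shows the model wrapping its JSON in a fenced block, while `data` holds the parsed dict. The sketch below illustrates one way such a response could be unwrapped with the standard library; it is not AdalFlow's parser, `extract_json` is a hypothetical name, and the sample string is a shortened, hand-written imitation of the response.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def extract_json(raw: str) -> dict:
    """Parse JSON from a response that may or may not be wrapped in a code fence."""
    match = FENCED_BLOCK.search(raw)
    payload = match.group(1) if match else raw
    return json.loads(payload)


# Shortened imitation of a fenced model response
raw = (
    FENCE
    + '\n{\n    "review": {"basic_review": {"title": "A Thousand Years", "ranking": 17}},'
    + '\n    "duration": 4.45\n}\n'
    + FENCE
)
data = extract_json(raw)
print(data["review"]["basic_review"]["title"], data["duration"])
```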

# Issues and feedback

If you encounter any issues, please report them here: [GitHub Issues](https://github.com/SylphAI-Inc/LightRAG/issues).

For feedback, you can use either the [GitHub discussions](https://github.com/SylphAI-Inc/LightRAG/discussions) or [Discord](https://discord.gg/ezzszrRZvT).