<a href="https://colab.research.google.com/github/Ashish-Soni08/Playground/blob/main/haystack/Advent_of_Haystack_Create_A_Recentness_Ranker(Ashish_Soni).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advent of Haystack - Day 8

Here we have some documents that contain meeting notes (generated with ChatGPT 🤝), and the date of the meeting in the `meta` field.

🚀 Your task is to create a custom component that ranks these documents from newest to oldest based on the date field in `meta`. Do this in **Step 3**.

We have prepared a pipeline that needs to use this component. The pipeline already has a component added with `pipe.add_component("recentness", date_ranker)`.

## Installation
**Note:** There is a known issue on Colab: a version conflict related to `llmx`, which comes with Colab. You might get an `llmx` error. You can safely ignore it, or run `pip uninstall -y llmx`.

In [1]:
%%capture

!pip install haystack-ai

### Enabling Telemetry

Knowing that you’re running this challenge tells us whether Advent of Haystack is helping people learn about Haystack 2.0-Beta. You can always opt out by commenting out the following line.

In [2]:
from haystack.telemetry import tutorial_running

tutorial_running("challenge_8")

## 1. Set up our Meeting Notes

In [3]:
import datetime
from haystack.dataclasses import Document

documents = [Document(content="Decision: Prioritize Project A over Project B for the upcoming quarter. Rationale: Project A has a more immediate impact on client satisfaction. Action Items: Project teams to reallocate resources accordingly.", meta={"date": datetime.datetime(2023, 11, 10)}),
             Document(content="Decision: Revert back to the original plan, prioritizing Project B. Rationale: Client feedback and market analysis indicate higher long-term potential for Project B. Action Items: Project teams to readjust resources, and communicate changes to stakeholders.", meta={"date": datetime.datetime(2023, 11, 12)}),
             Document(content="Decision: Allocate 20% of the training budget to online courses. Rationale: Online courses offer cost-effective and flexible learning options. Action Items: HR to update the budget and communicate the changes to employees.", meta={"date": datetime.datetime(2023, 11, 11)}),]

## 2. Create a prompt template and Generator
Here, we've created a prompt template that asks for a summary of meeting notes.

In [4]:
from getpass import getpass

api_key = getpass("OpenAI Key: ")

OpenAI Key: ··········


In [5]:
from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import GPTGenerator

prompt_template = """
You will be provided meeting notes in order. The order is from newest to oldest. Create
a summary of the decisions, indicating the progression.

Meeting notes in order of recency:
{% for document in documents %}
  "Meeting Notes:"
  {{document.content}}
{% endfor %}
"""

prompt_builder = PromptBuilder(template=prompt_template)
llm = GPTGenerator(model_name="gpt-3.5-turbo-1106", api_key=api_key)

In [7]:
llm.run("HI")

{'replies': ['Hello! How can I help you today?'],
 'metadata': [{'model': 'gpt-3.5-turbo-1106',
   'index': 0,
   'finish_reason': 'stop',
   'usage': {'prompt_tokens': 8, 'completion_tokens': 9, 'total_tokens': 17}}]}

## 3. Create a custom `DateRanker`

Complete the custom component below so that it ranks a list of Documents by date, newest first.

In [8]:
type(documents)

list

In [9]:
documents[0].content

'Decision: Prioritize Project A over Project B for the upcoming quarter. Rationale: Project A has a more immediate impact on client satisfaction. Action Items: Project teams to reallocate resources accordingly.'

In [10]:
type(documents[0].content)

str

In [11]:
documents[0].meta.get('date')

datetime.datetime(2023, 11, 10, 0, 0)

In [12]:
type(documents[0].meta.get('date'))

datetime.datetime
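
Because the `date` values are `datetime.datetime` objects, they can be compared and sorted directly, with no string parsing. A minimal sketch using only the standard library (the `dates` list just repeats the three meeting dates from Step 1):

```python
import datetime

# The three meeting dates from Step 1, in their original order.
dates = [
    datetime.datetime(2023, 11, 10),
    datetime.datetime(2023, 11, 12),
    datetime.datetime(2023, 11, 11),
]

# datetime objects support ordering comparisons out of the box.
assert dates[1] > dates[2] > dates[0]

# reverse=True puts the largest (most recent) value first.
newest_first = sorted(dates, reverse=True)
print([d.day for d in newest_first])  # prints [12, 11, 10]
```

This is the same ordering the ranker needs to apply, only with the date read from each document's `meta`.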

In [13]:
from typing import List
from haystack import component

@component
class DateRanker:
    def __init__(self, date_field: str = "date"):
        self.date_field = date_field

    @component.output_types(documents=List[Document])
    def run(self, documents: List[Document]) -> dict:
        """
        Rank a list of documents based on the specified date field.

        Args:
            documents (List[Document]): List of input documents.

        Returns:
            dict: Dictionary with the ranked documents.
        """
        # Sort newest first; assumes every document has the date field in its meta
        ranked_documents = sorted(documents, key=lambda doc: doc.meta.get(self.date_field), reverse=True)

        return {"documents": ranked_documents}

In [14]:
date_ranker = DateRanker()
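
The `sorted` call in `DateRanker.run` assumes every document carries the date field. If one does not, `doc.meta.get(...)` returns `None`, and comparing `None` with a `datetime` raises a `TypeError`. One possible guard is sketched below with a stand-in `Doc` dataclass instead of Haystack's `Document`, so the snippet runs without dependencies. `Doc` and `rank_by_date` are hypothetical names, and placing undated documents last is an assumption rather than part of the challenge.

```python
import datetime
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for Haystack's Document (content and meta only)."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def rank_by_date(docs: List[Doc], date_field: str = "date") -> List[Doc]:
    """Return docs newest first; docs without a date go to the end."""
    dated = [d for d in docs if d.meta.get(date_field) is not None]
    undated = [d for d in docs if d.meta.get(date_field) is None]
    dated.sort(key=lambda d: d.meta[date_field], reverse=True)
    return dated + undated


docs = [
    Doc("Project A first", {"date": datetime.datetime(2023, 11, 10)}),
    Doc("No date recorded"),
    Doc("Back to Project B", {"date": datetime.datetime(2023, 11, 12)}),
]
ranked = rank_by_date(docs)
print([d.content for d in ranked])
# prints ['Back to Project B', 'Project A first', 'No date recorded']
```

The same split-then-sort logic could be moved into the body of `run` if the real documents might lack a date.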

## 4. Create and run the RAG pipeline

Below is the pipeline that we would like to run to create a summary of the meeting notes. It uses the `date_ranker` component that you created in the section above.

In [15]:
pipe = Pipeline()
pipe.add_component("recentness", date_ranker)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

In [16]:
pipe.connect("recentness.documents", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")
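
Conceptually, the two connections mean that the ranker's `documents` output becomes the prompt builder's `documents` input, and the built prompt is then passed to the generator. The dependency-free sketch below mimics only the first two stages with plain functions. `rank` and `build_prompt` are hypothetical stand-ins, not Haystack APIs; the shortened note texts are illustrative, and the LLM call is omitted.

```python
import datetime

# Hypothetical stand-in data: (content, date) pairs instead of Document objects.
notes = [
    ("Prioritize Project A", datetime.datetime(2023, 11, 10)),
    ("Revert to Project B", datetime.datetime(2023, 11, 12)),
    ("20% of training budget to online courses", datetime.datetime(2023, 11, 11)),
]


def rank(notes):
    """Stage 1 ('recentness'): order notes from newest to oldest."""
    return sorted(notes, key=lambda note: note[1], reverse=True)


def build_prompt(ranked_notes):
    """Stage 2 ('prompt_builder'): list the note contents in ranked order."""
    lines = ["Meeting notes in order of recency:"]
    for content, _date in ranked_notes:
        lines.append("Meeting Notes:")
        lines.append(content)
    return "\n".join(lines)


# The output of one stage is the input of the next, as with pipe.connect().
prompt = build_prompt(rank(notes))
print(prompt)
```

In the real pipeline, Haystack handles this hand-off itself once the components are connected.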

In [17]:
pipe.draw("/content/pipeline_day_8.png")

In [18]:
result = pipe.run(data={"recentness":{"documents": documents}})


print(result['llm']["replies"][0])

Summary of Decisions:

1. Reverting back to the original plan and prioritizing Project B due to higher long-term potential.
2. Allocating 20% of the training budget to online courses for cost-effective and flexible learning options.
3. Prioritizing Project A over Project B for the upcoming quarter due to its immediate impact on client satisfaction.
