<a href="https://colab.research.google.com/github/nickprock/appunti_data_science/blob/master/semantic-search/advent-of-haystack/Advent_of_Haystack_Create_A_Recentness_Ranker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Advent of Haystack - Day 8

Here we have some documents that contain meeting notes (generated with ChatGPT 🤝), and the date of the meeting in the `meta` field.

🚀 Your task is to create a custom component that can rank these documents from newest to oldest based on the date field in `meta`. Do this in **Step 3**.

We have prepared a pipeline that needs to use this component. The pipeline already has a component added with `pipe.add_component("recentness", date_ranker)`.

# Installation
**Note:** There is a known issue with Colab caused by a version conflict with `llmx`, which comes preinstalled on Colab. You might get an `llmx` error. You can safely ignore it, or run `pip uninstall -y llmx`.

In [None]:
!pip install haystack-ai



### Enabling Telemetry

Knowing you’re running this challenge helps us understand whether Advent of Haystack is helping people learn about Haystack 2.0-Beta. You can always opt out by commenting out the following line.

In [None]:
from haystack.telemetry import tutorial_running

tutorial_running("challenge_8")

## 1. Set up our Meeting Notes

In [None]:
import datetime
from haystack.dataclasses import Document

documents = [Document(content="Decision: Prioritize Project A over Project B for the upcoming quarter. Rationale: Project A has a more immediate impact on client satisfaction. Action Items: Project teams to reallocate resources accordingly.", meta={"date": datetime.datetime(2023, 11, 10)}),
             Document(content="Decision: Revert back to the original plan, prioritizing Project B. Rationale: Client feedback and market analysis indicate higher long-term potential for Project B. Action Items: Project teams to readjust resources, and communicate changes to stakeholders.", meta={"date": datetime.datetime(2023, 11, 12)}),
             Document(content="Decision: Allocate 20% of the training budget to online courses. Rationale: Online courses offer cost-effective and flexible learning options. Action Items: HR to update the budget and communicate the changes to employees.", meta={"date": datetime.datetime(2023, 11, 11)}),]

## 2. Create a prompt template and Generator
Here, we've created a prompt template that asks for a summary of meeting notes.

In [None]:
from getpass import getpass

api_key = getpass("OpenAI Key: ")


In [None]:
from haystack import Pipeline
from haystack.components.builders.prompt_builder import PromptBuilder
from haystack.components.generators import GPTGenerator

prompt_template = """
You will be provided meeting notes in order. The order is from newest to oldest. Create
a summary of the decisions, indicating the progression.

Meeting notes in order of recency:
{% for document in documents %}
  "Meeting Notes:"
  {{document.content}}
{% endfor %}
"""

prompt_builder = PromptBuilder(template=prompt_template)
llm = GPTGenerator(model_name="gpt-4", api_key=api_key)

## 3. Create a custom `DateRanker`

Complete the custom component below so that it ranks a list of Documents by date, newest first.

In [None]:
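from typing import List
from haystack import component

@component
class DateRanker:
  """Ranks Documents from newest to oldest using a date stored in `meta`."""

  def __init__(self, date_field: str = "date"):
    # Name of the `meta` key that holds each document's date
    self.date_field = date_field

  @component.output_types(documents=List[Document])
  def run(self, documents: List[Document]):
    # reverse=True puts the most recent date first
    ranked_documents = sorted(documents, key=lambda d: d.meta[self.date_field], reverse=True)

    return {"documents": ranked_documents}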
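
The `DateRanker` above relies on every document having a comparable value under the date field; a missing key raises a `KeyError`. Below is a minimal sketch of one way to relax that, assuming dates may also arrive as ISO 8601 strings or be absent. `Doc`, `to_datetime`, and `rank_by_date` are hypothetical names used only for illustration, and `Doc` is a plain-Python stand-in rather than Haystack's `Document`, so the snippet runs without Haystack installed.

```python
import datetime
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Doc:
    """Hypothetical stand-in for Haystack's Document, used only in this sketch."""
    content: str
    meta: Dict[str, Any] = field(default_factory=dict)


def to_datetime(value: Any) -> datetime.datetime:
    """Return a datetime unchanged, or parse an ISO 8601 string into one."""
    if isinstance(value, datetime.datetime):
        return value
    return datetime.datetime.fromisoformat(value)


def rank_by_date(docs: List[Doc], date_field: str = "date") -> List[Doc]:
    """Return docs newest first; docs without a date go last in their original order."""
    dated = [d for d in docs if d.meta.get(date_field) is not None]
    undated = [d for d in docs if d.meta.get(date_field) is None]
    dated.sort(key=lambda d: to_datetime(d.meta[date_field]), reverse=True)
    return dated + undated


docs = [
    Doc("Project A first", {"date": datetime.datetime(2023, 11, 10)}),
    Doc("No date recorded"),
    Doc("Back to Project B", {"date": "2023-11-12"}),
    Doc("Training budget", {"date": datetime.datetime(2023, 11, 11)}),
]
ranked = rank_by_date(docs)
print([d.content for d in ranked])
```

Placing undated documents last is a design choice for this sketch; dropping them or raising an error would be equally reasonable. Note that the sketch assumes all dates are timezone-naive, since Python refuses to compare naive and timezone-aware datetimes.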

In [None]:
date_ranker = DateRanker()

## 4. Create and run the RAG pipeline

Below is the pipeline that we would like to run to create a summary of the meeting notes. This pipeline uses the `date_ranker` component that you created in the section above.

In [None]:
pipe = Pipeline()
pipe.add_component("recentness", date_ranker)
pipe.add_component("prompt_builder", prompt_builder)
pipe.add_component("llm", llm)

pipe.connect("recentness.documents", "prompt_builder.documents")
pipe.connect("prompt_builder", "llm")

In [None]:
result = pipe.run(data={"recentness":{"documents": documents}})
print(result["llm"]["replies"][0])

Summary of decisions:

- The latest decision was to revert back to the original plan, i.e., prioritizing Project B over Project A. This decision stemmed from client feedback and market analysis that indicated higher long-term potential for Project B. The project teams need to re-adjust resources and communicate changes to stakeholders. 

- The meeting before that decided to allocate 20% of the training budget to online courses, due to their cost-effectiveness and flexibility in learning options. The HR department was tasked with updating the budget and communicating these changes to the employees.

- The earliest meeting decided to prioritize Project A over Project B for the upcoming quarter due to its more immediate impact on client satisfaction. At this point, Project teams were asked to reallocate resources accordingly. However, this decision got reversed in the latest meeting due to client feedback and market analysis favoring Project B.


In [None]:
result

{'llm': {'replies': ['Summary of decisions:\n\n- The latest decision was to revert back to the original plan, i.e., prioritizing Project B over Project A. This decision stemmed from client feedback and market analysis that indicated higher long-term potential for Project B. The project teams need to re-adjust resources and communicate changes to stakeholders. \n\n- The meeting before that decided to allocate 20% of the training budget to online courses, due to their cost-effectiveness and flexibility in learning options. The HR department was tasked with updating the budget and communicating these changes to the employees.\n\n- The earliest meeting decided to prioritize Project A over Project B for the upcoming quarter due to its more immediate impact on client satisfaction. At this point, Project teams were asked to reallocate resources accordingly. However, this decision got reversed in the latest meeting due to client feedback and market analysis favoring Project B.'],
  'metadata':

In [None]:
date_ranker.run(documents=documents)

{'documents': [Document(id=0a758bf69bcda18d52d880b54634a9ed4222b7121fd6d7479f42f392ab82d665, content: 'Decision: Revert back to the original plan, prioritizing Project B. Rationale: Client feedback and m...', meta: {'date': datetime.datetime(2023, 11, 12, 0, 0)}),
  Document(id=784e5a552e623d66f61028c83381760a9260a9a950ce489791049521902b0015, content: 'Decision: Allocate 20% of the training budget to online courses. Rationale: Online courses offer cos...', meta: {'date': datetime.datetime(2023, 11, 11, 0, 0)}),
  Document(id=a678b8c411d6d0c63e6884362fb7ad5fb1c7b0176564e8d9804f8f4d8e7a002c, content: 'Decision: Prioritize Project A over Project B for the upcoming quarter. Rationale: Project A has a m...', meta: {'date': datetime.datetime(2023, 11, 10, 0, 0)})]}
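
A quick way to confirm an ordering like the one above is to check that no date is older than the one after it. The following is a small sketch using only the standard library; `is_newest_first` is a hypothetical helper name, and the dates are copied by hand from the output above.

```python
import datetime
from typing import List


def is_newest_first(dates: List[datetime.datetime]) -> bool:
    """True when each date is at least as recent as the one that follows it."""
    return all(current >= following for current, following in zip(dates, dates[1:]))


# Dates copied from the ranked output above.
ranked_dates = [
    datetime.datetime(2023, 11, 12),
    datetime.datetime(2023, 11, 11),
    datetime.datetime(2023, 11, 10),
]
print(is_newest_first(ranked_dates))
```

In the notebook itself, the same list could be built from the ranker's output with `[d.meta["date"] for d in date_ranker.run(documents=documents)["documents"]]`.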