# Meta (previously Facebook) Key Highlights 2024 Report System [❗️ Work In Progress]

> Disclaimer 1 💡: This project is for educational, academic and research purposes. Please use this technology ethically, legally and responsibly.

> Disclaimer 2 💡: This project is not officially affiliated with Meta or any of their partner organizations. It's entirely independent and built from the ground up by [AdiPat](https://www.github.com/AdiPat) at [The Hackers Playbook](https://www.thehackersplaybook.substack.com).

Welcome! Before we begin, let's set some context. This will help you better understand the underlying motivation and intent of the project. Hopefully, it will motivate and inspire you to improve your programming skills and upgrade your abilities. I've designed this notebook to help you at every step, so if you ever get stuck, spend some time reading each line and following each step exactly as prescribed and you will definitely find a solution. I can't accurately articulate how that works, but it works.

## Personal Note

- Firstly, this section is subjective. Feel free to ignore it. Whether you choose to believe or disbelieve the contents of this section, in either situation, the outcome of this project is independent of your most likely, well-intentioned judgement.

- I've been following Mark Zuckerberg since I was in school, which is approximately since 2009/2010. I don't know the exact year but it was around this time when I learnt about him through the Internet.

- We share several similarities in terms of personality, attitude towards life, and most importantly Programming. I don't know him personally so this is based on his public appearances and whatever I have heard or read about him online and from people. Assuming that all the information I gathered is true, then if the world were devoid of biological and cultural nuances, and we were all judged purely as Programmers, it would be fair to say that I'm like Mark Zuckerberg's younger "programming brother".

- Most importantly, he was and is still a "hacker" at heart. If you don't believe me, observe every public announcement Meta makes.

- I have never officially worked at Meta but I like to consider myself an invisible and unofficial contributor (not employee) at Meta. I came up with this idea to motivate myself to keep up with the standards of the ever-evolving tech world and to maintain a cultural identity that closely resembles "The Hacker Culture".

- There's a lot to add, but as time progresses, I'll continue talking about "Mark Zuckerberg's influence on AdiPat's life" in greater detail!

If you ever have any questions, email us at `thehackersplaybook0@gmail.com`.

## Naming Convention

The file name and title include Facebook for backward compatibility with search engines and crawlers. Since this research project is Free & Open Source, it's important for the Hacker Culture that it reaches the maximum number of people so that humanity can benefit from the effort and energy directed into this endeavour.

## Goals & Objectives

**Note:** Goals are broad, long-term aspirations that provide direction, while objectives are specific, measurable, and time-bound steps to achieve those goals.

| Goals                                                                                                  | Objectives                                                                                                                                                    |
| ------------------------------------------------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| To demonstrate the power of Llama 3.1 in AI-powered automated report generation for enterprises.       | To create a valuable content product for The Hackers Playbook which will help our students upskill.                                                           |
| Open a discussion for official partnerships with Meta and their partner organizations.                 | Successfully utilize FireCrawl for information retrieval with measurable markers that can be used by FireCrawl (Mendable) to improve its product experience.  |
| Contribute to Meta's Free & Open Source repository of software products. React inspires all of us.     | Contribute to the Python ecosystem. Create an executable experiment for Python Programmers to learn from.                                                     |
| Extract key learnings from Meta's 2024 highlights to drive strategic insights at The Hackers Playbook. | Highlight AI-safety, AI-security, Ethical AI and Responsible AI practices in action.                                                                          |
| Encourage people to "build in open".                                                                   |                                                                                                                                                               |
| Further developments in Generative AI and march towards AGI (Artificial General Intelligence).         |                                                                                                                                                               |

**Enough talk, let's start hacking!**


In [None]:
# Boilerplate: This block goes into every notebook.
# It sets up the environment, installs the requirements, and checks for the required environment variables.

import os
from IPython.display import clear_output

requirements_installed = False
max_retries = 3
retries = 0
REQUIRED_ENV_VARS = [
    "GROQ_API_KEY",
    "FIRECRAWL_API_KEY",
    "SERPER_API_KEY",
    "GEMINI_API_KEY",
]


def install_requirements():
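    """Installs the requirements from requirements.txt file"""
    # Both names are reassigned in this function, so both must be declared
    # global; otherwise using `retries` raises UnboundLocalError.
    global requirements_installed, retries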
    if requirements_installed:
        print("Requirements already installed.")
        return

    print("Installing requirements...")
    install_status = os.system("pip install -r requirements.txt")
    if install_status == 0:
        print("Requirements installed successfully.")
        requirements_installed = True
    else:
        print("Failed to install requirements.")
        if retries < max_retries:
            print("Retrying...")
            retries += 1
            return install_requirements()
        exit(1)
    return


def setup_env():
    """Sets up the environment variables"""
    # Imported here, after install_requirements() has run, in case
    # python-dotenv is only available once the requirements are installed.
    from dotenv import load_dotenv

    def check_env(env_var):
        value = os.getenv(env_var)
        if value is None:
            print(f"Please set the {env_var} environment variable.")
            exit(1)
        else:
            print(f"{env_var} is set.")

    load_dotenv()

    variables_to_check = REQUIRED_ENV_VARS

    for var in variables_to_check:
        check_env(var)


install_requirements()
setup_env()
clear_output()
print("🚀 Setup complete. Continue to the next cell.")
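
The environment check in the setup cell stops at the first missing variable. A minimal sketch of an alternative that reports every missing variable in one pass is shown here. `find_missing_env_vars` is a hypothetical helper, not part of this notebook, and it takes the environment as a plain mapping so it can be tried without real API keys. Unlike the setup cell, this sketch also treats an empty string as missing, which is an assumption about what is useful.

```python
from typing import Iterable, List, Mapping

# Same names as REQUIRED_ENV_VARS in the setup cell.
REQUIRED = ["GROQ_API_KEY", "FIRECRAWL_API_KEY", "SERPER_API_KEY", "GEMINI_API_KEY"]


def find_missing_env_vars(
    required: Iterable[str], environ: Mapping[str, str]
) -> List[str]:
    """Returns the required names that are unset or empty (hypothetical helper)."""
    return [name for name in required if not environ.get(name)]


# Example with a fake environment instead of os.environ.
fake_env = {"GROQ_API_KEY": "abc", "SERPER_API_KEY": ""}
missing = find_missing_env_vars(REQUIRED, fake_env)
print(missing)
```

In a real run you would pass `os.environ` as the second argument and print one message listing everything that still needs to be set.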

In [119]:
import json
from typing import Dict, List

DEFAULT_LINKS_CACHE_FILE = "outputs/links_cache.json"


class LinksCache:
    """A simple JSON-file-backed cache for storing links per query."""

    def __init__(self, links_cache_file=DEFAULT_LINKS_CACHE_FILE):
        """Initializes the links cache."""
        self.links_cache_file = links_cache_file
        self.links_cache = self.__load_links_cache()

    def __load_links_cache(self) -> Dict[str, list]:
        """Loads the links cache from the file, creating the file if it is missing."""
        if not os.path.exists(self.links_cache_file):
            # Create the parent directory first; open() does not create it.
            directory = os.path.dirname(self.links_cache_file)
            if directory:
                os.makedirs(directory, exist_ok=True)
            with open(self.links_cache_file, "w") as f:
                json.dump({}, f)
            return {}

        with open(self.links_cache_file, "r") as f:
            return json.load(f)

    def __save_links_cache(self) -> None:
        """Saves the links cache to the file"""
        with open(self.links_cache_file, "w") as f:
            json.dump(self.links_cache, f)

    def add_links_to_cache(self, query: str, links: List[str]) -> None:
        """Adds the links to the cache"""
        self.links_cache[query] = links
        self.__dedupe_links_cache()
        # self.save_links_cache() # Not required because dedupe links cache already saves the cache.

    def has(self, query: str) -> bool:
        """Checks if the query is in the cache"""
        return query in self.links_cache

    def get(self, query: str) -> List[str]:
        """Gets the links from the cache"""
        return self.links_cache.get(query)

    def __dedupe_links_cache(self) -> None:
        """Dedupes the links cache"""
        for query, links in self.links_cache.items():
            self.links_cache[query] = self.__dedupe_links(links)
        self.__save_links_cache()

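    def __dedupe_links(self, links: List[str]) -> List[str]:
        """Dedupes the links while preserving their original order."""
        # dict.fromkeys keeps first-seen order; set() does not guarantee any order.
        return list(dict.fromkeys(links))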
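
To see the caching pattern in isolation, here is a minimal, self-contained sketch of a JSON-file-backed links cache with order-preserving deduplication. It writes to a temporary directory rather than `outputs/links_cache.json`, and `dedupe`, `save_links` and `load_links` are hypothetical names used only for illustration.

```python
import json
import os
import tempfile
from typing import Dict, List


def dedupe(links: List[str]) -> List[str]:
    """Removes duplicate links, keeping the first occurrence of each."""
    return list(dict.fromkeys(links))


def save_links(path: str, cache: Dict[str, List[str]]) -> None:
    """Writes the whole cache to disk as JSON (hypothetical helper)."""
    with open(path, "w") as f:
        json.dump(cache, f)


def load_links(path: str) -> Dict[str, List[str]]:
    """Reads the cache back, or returns an empty cache if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, "r") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = os.path.join(tmp_dir, "links_cache.json")
    cache = load_links(cache_path)  # Empty on first use.
    cache["Meta 2024 News"] = dedupe(
        ["https://example.com/a", "https://example.com/b", "https://example.com/a"]
    )
    save_links(cache_path, cache)
    reloaded = load_links(cache_path)

print(reloaded["Meta 2024 News"])
```

Keeping the first-seen order matters here because search results arrive ranked, and the ranking is worth keeping after duplicates are dropped.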

In [120]:
import re


class Utils:
    """Common utility functions for text processing and analysis."""

    @staticmethod
    def count_tokens(text: str) -> int:
        """
        Approximate the number of tokens in a text input for an LLM.

        This is a regex heuristic, not a real tokenizer: each run of word
        characters and each punctuation mark counts as one token.

        Args:
            text (str): The input text to calculate tokens for.

        Returns:
            int: Approximate number of tokens in the input text.
        """
        # Clean text and normalize spaces
        text = re.sub(r"\s+", " ", text.strip())

        # Approximate tokenization: words and individual punctuation marks
        tokens = re.findall(r"\w+|[^\w\s]", text, re.UNICODE)

        # Return the token count
        return len(tokens)
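
A quick way to get a feel for this approximation is to run the same regular expression on a short sentence. The sketch below mirrors the heuristic in a standalone function (`approximate_token_count` is a hypothetical name). Real LLM tokenizers split text differently, so treat the number as a rough estimate only.

```python
import re


def approximate_token_count(text: str) -> int:
    """Counts word runs and punctuation marks, mirroring the regex heuristic."""
    normalized = re.sub(r"\s+", " ", text.strip())
    return len(re.findall(r"\w+|[^\w\s]", normalized))


sample = "Hello,   world! It's 2024."
# Each word run and each punctuation mark becomes one item.
print(re.findall(r"\w+|[^\w\s]", sample))
print(approximate_token_count(sample))
```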

In [121]:
from firecrawl import FirecrawlApp
from langchain_community.utilities import GoogleSerperAPIWrapper
from typing import List, Any, Union
import traceback
from pprint import pp
from datetime import datetime
import time
from typing import Dict

DEFAULT_SEARCH_RETRY_COUNT = 3
EXPONENTIAL_BACKOFF_FACTOR = 2


def get_exponential_backoff_delay(
    retry_count: int, backoff_factor=EXPONENTIAL_BACKOFF_FACTOR
) -> int:
    """Gets the exponential backoff delay for retries."""
    return backoff_factor**retry_count


def exponential_backofff_retry(
    retry_count: int, backoff_factor=EXPONENTIAL_BACKOFF_FACTOR
):
    """Retries the function using exponential backoff."""

    def exponential_backoff_decorator(func):
        """Decorator function for exponential backoff."""

        def exponential_backoff_wrapper(*args, **kwargs):
            # Attempt 0 is the first call; attempts 1..retry_count are retries.
            # A bounded loop is used because a recursive retry that never
            # decrements its counter would retry forever.
            for attempt in range(retry_count + 1):
                try:
                    result = func(*args, **kwargs)
                    error = result.get("error") if isinstance(result, dict) else None

                    if error and "page is not supported" in error:
                        print("Page is not supported, skipped retry.")
                        return result

                    if not result or error:
                        raise Exception(f"{func.__name__} failed.")
                    return result
                except Exception as e:
                    if "This website is no longer supported" in str(e):
                        print("Page is not supported, skipped retry.")
                        return {
                            "url": args[1] if len(args) > 1 else None,
                            "markdown": "",
                            "error": "This page is not supported by FireCrawl",
                            "tokens_fetched": 0,
                        }
                    if attempt >= retry_count:
                        print(f"Failed to execute function: {func.__name__}")
                        return None
                    print("Adding exponential backoff delay.")
                    # The delay grows with each retry: factor ** 1, factor ** 2, ...
                    delay_time = get_exponential_backoff_delay(
                        attempt + 1, backoff_factor=backoff_factor
                    )
                    time.sleep(delay_time)
                    print(f"Retrying function: {func.__name__}.")

        return exponential_backoff_wrapper

    return exponential_backoff_decorator


class MARCSearchEngine:
    """MARC (Machine Automated Retrieval & Curation) Search Engine:: A simple search engine built on FireCrawl Python SDK."""

    def __init__(self):
        """Initializes the MARCSearchEngine class."""

        self.firecrawl = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))
        self.cache = {}
        self.links_cache = LinksCache()
        self.docs_cache = {}
        print("🔥 MARC Search Engine initialized. 🕷️")

    @exponential_backofff_retry(DEFAULT_SEARCH_RETRY_COUNT)
    def __serper_search(self, query: str) -> Union[Dict[str, Any], None]:
        """Searches a search engine, currently Google via Serper API."""
        try:
            print(f"Running Serper search for query: {query}")
            search = GoogleSerperAPIWrapper()
            response = search.results(query)
            print(f"Got response for query: {query}", response)
            return response
        except Exception as e:
            print(f"Failed to search for query: {query}")
            traceback.print_exc()
            return None

    def search(self, query: str) -> Union[Dict[str, Any], None]:
        """Searches the query using the search engine."""
        try:
            start_time = datetime.now()
            print(f"📡 Searching for query: {query}")
            results = {}
            processed_pages = []
            initial_links = self.get_links_for_query(query)
            print(f"Got {len(initial_links)} links for query: {query}")
            for link in initial_links:
                try:
                    print(f"Getting page content in markdown from URL: {link}")
                    result = {"url": link}
                    page_crawl_result = self.__get_page_markdown(link)
                    page_markdown = page_crawl_result.get("markdown")
                    tokens_fetched = page_crawl_result.get("tokens_fetched")
                    error = page_crawl_result.get("error")
                    result["markdown"] = page_markdown
                    result["tokens_fetched"] = tokens_fetched
                    result["error"] = error
                    processed_pages.append(result)
                except Exception as e:
                    print(f"Failed to get page content in markdown from URL: {link}")
                    traceback.print_exc()
                    result = {
                        "url": link,
                        "markdown": "",
                        "error": str(e),
                        "tokens_fetched": 0,
                    }
                    processed_pages.append(result)
            results["processed_pages"] = processed_pages
            results["total_tokens_fetched"] = sum(
                [page["tokens_fetched"] for page in processed_pages]
            )
            end_time = datetime.now()
            time_difference_seconds = str((end_time - start_time).total_seconds()) + "s"
            print(f"🕒 Search completed in {time_difference_seconds}")
            return results
        except Exception as e:
            print(f"Search failed with unexpected error: {query}")
            traceback.print_exc()
            return None

    def get_links_for_query(self, query: str) -> List[str]:
        """Gets the links for a given query."""

        def collect_links(response: Dict[str, Any]) -> List[str]:
            """Collects the links from the response"""
            organic = response.get("organic")

            if not organic:
                print("No organic results found.")
                return []

            links = []
            for item in organic:
                sitelinks = item.get("sitelinks")
                if sitelinks:
                    for link in sitelinks:
                        links.append(link["link"])
                links.append(item["link"])
            return links

        if self.links_cache.has(query):
            print(f"Using cached links for query: {query}")
            return self.links_cache.get(query)

        response = self.__serper_search(query)
        if not response:
            print(f"Failed to get links for query: {query}")
            return []
        links = collect_links(response)
        self.links_cache.add_links_to_cache(query, links)
        return self.links_cache.get(query)

    def get_links(self, input_url: str) -> List[str]:
        """Gets the links from the given URL."""
        try:
            print(f"Getting links from URL: {input_url}")
            cache_key = f"{input_url}_links"
            cached_links = self.cache.get(cache_key)
            if cached_links:
                print(f"Using cached links for URL: {input_url}")
                return cached_links

            app = self.firecrawl
            crawl_result = app.map_url(input_url)

            success = crawl_result["success"]

            if not success:
                raise RuntimeError(f"Failed to get links from URL: {input_url}")

            links = crawl_result["links"]
            print(f"Got {len(links)} links from URL: {input_url}")
            self.cache[cache_key] = links

            return links
        except Exception as e:
            print(f"Failed to get links from URL: {input_url}")
            traceback.print_exc()
            return []

    @exponential_backofff_retry(DEFAULT_SEARCH_RETRY_COUNT)
    def __get_page_markdown(self, link: str) -> Any:
        """Gets the page content as markdown from the given link."""
        try:
            cached_doc = self.docs_cache.get(link)
            if cached_doc:
                print(f"Using cached docs for URL: {link}")
                # A cache hit is a success, so no error is reported (an error
                # here would trigger a retry). Nothing was fetched over the
                # network, so tokens_fetched stays 0.
                return {
                    "url": link,
                    "markdown": cached_doc,
                    "error": "",
                    "tokens_fetched": 0,
                }

            app = self.firecrawl
            print(f"Getting page content in markdown from URL: {link}")
            scrape_result = app.scrape_url(link, params={"formats": ["markdown"]})

            if not scrape_result:
                return {
                    "url": link,
                    "markdown": "",
                    "error": "Fetching markdown page failed due to no response.",
                    "tokens_fetched": 0,
                }

            success = scrape_result["metadata"]["statusCode"] == 200

            if not success:
                print(f"Failed to get docs from URL: {link}")
                return {
                    "url": link,
                    "markdown": "",
                    "error": "Fetching markdown page failed because the status code was not 200.",
                    "tokens_fetched": 0,
                }

            print(f"Got page content in markdown from URL: {link}")

            markdown = scrape_result.get("markdown")

            if not markdown:
                print("No markdown found in scrape result.")
                return {
                    "url": link,
                    "markdown": "",
                    "error": "Fetching markdown page failed due to no markdown.",
                    "tokens_fetched": 0,
                }
            self.docs_cache[link] = markdown

            token_count = Utils.count_tokens(markdown)
            tokens_fetched = token_count
            print(f"Token count: {token_count}")

            print(f"Markdown size: {len(markdown)}")

            return {
                "url": link,
                "markdown": markdown,
                "error": "",
                "tokens_fetched": tokens_fetched,
            }
        except Exception as e:
            is_page_not_supported = "This website is no longer supported" in str(e)
            print(f"Failed to get page content in markdown from URL: {link}")
            traceback.print_exc()
            return {
                "url": link,
                "markdown": "",
                "error": (
                    "This page is not supported by FireCrawl"
                    if is_page_not_supported
                    else str(e)
                ),
                "tokens_fetched": 0,
            }
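
The retry behaviour in this cell is easiest to reason about as a plain loop. The sketch below is an illustration, not the notebook's decorator: `retry_with_backoff`, `backoff_delay` and the `sleep` parameter are hypothetical, and the sleep function is injectable so the delay schedule can be inspected without actually waiting.

```python
import time
from typing import Callable, List, Optional, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, backoff_factor: int = 2) -> int:
    """Delay in seconds before retry number `attempt` (1-based): factor ** attempt."""
    return backoff_factor**attempt


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    backoff_factor: int = 2,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Calls func until it succeeds or max_retries retries are used up."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            if attempt == max_retries:
                return None  # Out of retries; give up.
            sleep(backoff_delay(attempt + 1, backoff_factor))
    return None


# Demo: fail twice, then succeed, recording delays instead of sleeping.
calls = {"count": 0}
recorded_delays: List[float] = []


def flaky() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"


result = retry_with_backoff(flaky, sleep=recorded_delays.append)
print(result, recorded_delays)
```

With a factor of 2 and three retries, the waits before the retries are 2, 4 and 8 seconds, so a call that never succeeds gives up after roughly 14 seconds of waiting.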

In [122]:
from pprint import pp
from IPython.display import clear_output
import json
from typing import List, Dict, Any
import os

CACHED_FILE_THRESHOLD_SIZE = 2048


def ensure_file_and_directory_exists(file_path: str) -> None:
    """
    Ensures that the directory and file at the given path exist.
    If the directory does not exist, it is created recursively.
    If the file does not exist, it is created with an empty JSON object.

    Args:
        file_path (str): The path to the file to check and create.
    """
    # Extract the directory from the file path
    directory = os.path.dirname(file_path)

    # Create the directory recursively if needed; a bare file name has no directory part
    if directory and not os.path.exists(directory):
        os.makedirs(directory, exist_ok=True)
        print(f"Directory created: {directory}")

    # Check if the file exists, if not, create it holding an empty JSON object
    if not os.path.exists(file_path):
        with open(file_path, "w") as file:
            json.dump({}, file)
        print(f"File created: {file_path}")
    else:
        print(f"File already exists: {file_path}")


def query_to_alphanumeric(query: str) -> str:
    """
    Converts a query string into an alphanumeric string with underscores.
    Non-alphanumeric characters are replaced with underscores.

    Args:
        query (str): The input query string.

    Returns:
        str: The sanitized alphanumeric string with underscores.
    """
    # Replace non-alphanumeric characters with underscores
    sanitized = re.sub(r"\W+", "_", query)
    # Remove leading or trailing underscores
    sanitized = sanitized.strip("_")
    return sanitized


def get_query_output_file_path(query: str) -> str:
    """Gets the output file path for the query."""
    file_name = query_to_alphanumeric(query)
    return f"outputs/search_results_{file_name}.json"


def write_results_to_file(results, query) -> None:
    """Writes the search results to a file."""
    print("Writing search results to file...")
    file_path = get_query_output_file_path(query)
    # Creates the directory and file if missing, so no further existence checks are needed.
    ensure_file_and_directory_exists(file_path)
    with open(file_path, "w") as f:
        json.dump(results, f)
    print("Search results written to file.")


def read_results_from_file(query: str) -> Dict[str, Any]:
    """Reads the search results from the file."""
    try:
        file_path = get_query_output_file_path(query)
        # Creates the file with an empty JSON object if it does not exist yet.
        ensure_file_and_directory_exists(file_path)
        with open(file_path, "r") as f:
            results = json.load(f)
        # A file holding JSON null (for example after a failed search) counts as no results.
        return results or {}
    except Exception as e:
        print(f"Failed to read search results from file: {query}")
        traceback.print_exc()
        return {}


def is_search_results_cached(query: str) -> bool:
    """Checks if the search results are cached."""
    content = read_results_from_file(query)
    content_json = json.dumps(content)
    return content is not None and len(content_json) > CACHED_FILE_THRESHOLD_SIZE


def search_internal(query: str) -> Any:
    """Searches the query using the FireCrawl search engine."""
    if is_search_results_cached(query):
        print(f"Search results for {query} are cached.")
        return read_results_from_file(query)
    search_engine = MARCSearchEngine()
    response = search_engine.search(query)
    print(f"Search results for {query} computed.")
    return response


def run_search_engine(query: str, write_output=True) -> Any:
    """Runs the FireCrawl search engine with the given query."""
    print(f"Running search engine for query: {query}")
    response = search_internal(query)
    print("Search engine completed.")
    if write_output and response is not None:
        # Skip the write when the search failed, so a failure never overwrites the file with null.
        write_results_to_file(response, query)
    return response


def run_search_engine_batch(queries: List[str]) -> Dict[str, Any]:
    """Runs the FireCrawl search engine with the given queries."""
    start_time = datetime.now()
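    # The batch name is timestamped, so each run reads and writes its own file
    # and existing_results is normally empty at this point.
    batch_query = f"meta_report_batch_{start_time.timestamp()}"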
    existing_results = read_results_from_file(batch_query)
    print(f"Running search engine in batch mode for {len(queries)} queries!")
    results = {}
    for query in queries:
        cached_query = existing_results.get(query)
        if (
            cached_query
            and cached_query.get("processed_pages")
            and len(cached_query.get("processed_pages")) > 0
        ):
            print(f"Skipping query: {query} as it is already cached.")
            results[query] = existing_results[query]
            continue  # Skip if the query is already cached
        print(f"Running search engine for query: {query}")
        results[query] = run_search_engine(query, write_output=False)
        # results[query] = {}
        print(f"Search engine completed for query: {query}")
        # Persist after every query so partial progress survives an interruption.
        # write_results_to_file prints its own progress messages.
        write_results_to_file(results, batch_query)
    write_results_to_file(results, batch_query)
    end_time = datetime.now()
    time_difference_seconds = str((end_time - start_time).total_seconds()) + "s"
    print(f"Batch search engine completed in {time_difference_seconds}")
    return results
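
As a worked example of the file-naming and cache-check rules in this cell, the sketch below applies the same sanitization regex and the same size threshold to sample inputs. `query_to_file_path` and `looks_cached` are hypothetical stand-ins that only compute values and never touch the disk.

```python
import json
import re

CACHED_FILE_THRESHOLD_SIZE = 2048  # Same threshold as in the cell above.


def query_to_file_path(query: str) -> str:
    """Maps a query to its output path using the same sanitization rule."""
    sanitized = re.sub(r"\W+", "_", query).strip("_")
    return f"outputs/search_results_{sanitized}.json"


def looks_cached(content: dict) -> bool:
    """True when the serialized content is larger than the threshold."""
    return len(json.dumps(content)) > CACHED_FILE_THRESHOLD_SIZE


path = query_to_file_path("Meta's Performance in AR/VR and Metaverse Projects in 2024")
print(path)
print(looks_cached({}))
print(looks_cached({"processed_pages": [{"markdown": "x" * 4096}]}))
```

The size threshold is a cheap proxy for "this file holds real results": a freshly created file contains only `{}`, which serializes to two characters and therefore never counts as cached.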

In [None]:
query = "Meta's Performance in AR/VR and Metaverse Projects in 2024"
results = run_search_engine(query=query)
pp(results)
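
The dictionary returned by the search holds a `processed_pages` list and a `total_tokens_fetched` count. As a rough sketch of how to inspect it, the hypothetical `summarize_results` helper below works on hand-written sample data (the URLs and numbers are made up for illustration) rather than on a live search.

```python
from typing import Any, Dict, List


def summarize_results(results: Dict[str, Any]) -> Dict[str, Any]:
    """Condenses a search result into page count, failed URLs, and token total."""
    pages: List[Dict[str, Any]] = results.get("processed_pages") or []
    failed_urls = [page["url"] for page in pages if page.get("error")]
    total_tokens = sum(page.get("tokens_fetched") or 0 for page in pages)
    return {
        "pages": len(pages),
        "failed_urls": failed_urls,
        "total_tokens": total_tokens,
    }


# Hand-written sample in the same shape as the search output (values are made up).
sample_results = {
    "processed_pages": [
        {
            "url": "https://example.com/article",
            "markdown": "Hello, world!",
            "error": "",
            "tokens_fetched": 4,
        },
        {
            "url": "https://example.com/video",
            "markdown": "",
            "error": "This page is not supported by FireCrawl",
            "tokens_fetched": 0,
        },
    ],
    "total_tokens_fetched": 4,
}
summary = summarize_results(sample_results)
print(summary)
```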

In [None]:
queries = [
    "Big Tech Meta Key Highlights 2024",
    "Meta 2024 News",
    "Meta 2024 Updates",
    "Meta 2024 Business Model Changes and Revenue Streams",
    "Meta Hiring Trends and Job Market Analysis 2024",
    "Meta Key Strategic Decisions and Initiatives 2024",
    "Meta's Approach to AI and Emerging Technologies in 2024",
    "Meta Quarterly Financial Reports 2024: Analysis and Insights",
    "Meta's Investments and Acquisitions in 2024",
    "Meta's Impact on the Social Media Ecosystem in 2024",
    "Meta's Policy and Compliance Updates 2024: User Privacy and Regulation",
    "Meta's Performance in AR/VR and Metaverse Projects in 2024",
    "Meta's Global Expansion and Regional Strategies in 2024",
]

results = run_search_engine_batch(queries=queries)
pp(results)

In [None]:
# Unsupported URL

from firecrawl import FirecrawlApp

app = FirecrawlApp(api_key=os.environ.get("FIRECRAWL_API_KEY"))

url = "https://www.youtube.com/watch?v=1lr0Fi4ZP34"
try:
    response = app.scrape_url(url, params={"formats": ["markdown"]})
    print(response)
except Exception as e:
    error_message = str(e)
    print(f"Failed to scrape URL: {url}")
    if "This website is no longer supported" in error_message:
        print("Error: This website is no longer supported.")

In [125]:
import google.generativeai as genai
import os

gemini_api_key = os.getenv("GEMINI_API_KEY")
DEFAULT_GEMINI_MODEL = "gemini-1.5-flash"


def get_gemini_client(model=DEFAULT_GEMINI_MODEL):
    """Configures the Gemini SDK and returns a GenerativeModel for the given model name."""
    genai.configure(api_key=gemini_api_key)
    return genai.GenerativeModel(model_name=model)


def gemini(query, model=DEFAULT_GEMINI_MODEL):
    """Sends the query to Gemini and returns the response text."""
    client = get_gemini_client(model)
    response = client.generate_content(query)
    return response.text

In [126]:
from IPython.display import clear_output, Markdown, display


def read_data_from_file(file_path: str) -> Dict[str, Any]:
    """Reads the data from the given file path."""
    with open(file_path, "r") as f:
        data = json.load(f)
    return data


def extract_markdown(file_path: str) -> str:
    """Builds one markdown document from the raw report data, one section per query."""
    data = read_data_from_file(file_path)
    result_markdown = ""
    for title, title_data in data.items():
        title_markdown = f"# {title}\n\n"
        processed_pages = (title_data or {}).get("processed_pages")
        if processed_pages:
            print(f"Processing {len(processed_pages)} pages for {title}.")
            # Failed pages may carry no markdown; treat those as empty strings.
            body_markdown = "\n\n".join(
                [page.get("markdown") or "" for page in processed_pages]
            )
            print(f"Processed pages for {title}: {len(processed_pages)}")
            result_markdown += title_markdown + body_markdown + "\n----\n"
        else:
            print(f"No processed pages found for {title}.")
    return result_markdown


def show_raw_report(file_path: str) -> None:
    markdown = extract_markdown(file_path)
    clear_output()
    display(Markdown(markdown))


def show_cleaned_report(
    file_path: str, output_file_path: str = "outputs/meta_key_highlights_2024.md"
) -> None:
    start_time = datetime.now()
    print(f"Generating cleaned report from raw data at: {file_path}")
    markdown = extract_markdown(file_path)
    prompt = f"""
        Generate a cleaned report titled "Meta Key Highlights 2024" from the raw report data.

        - Add an Introduction and Conclusion in addition to the sections.
        - Number each page. It should be a 5-page report.
        - Each page should have at least 4 sections.
        - Each section should have at least 2 paragraphs or 10 bullet points + 1 paragraph introduction.
        - Each section should have a title.
        - This report is for enterprise use, so make sure the language is technical and professional.
        - Don't hallucinate or make up factual information.
        - Add a 'References' section at the end with links to the original content. Include at least 10 references.

        Raw Report: '''{markdown}'''
    """
    print("Generating cleaned report.")
    cleaned_report = gemini(prompt)
    print("Generated cleaned report.")
    print("Writing cleaned report to file.")
    with open(output_file_path, "w") as f:
        f.write(cleaned_report)
    print(f"Cleaned report written to file: {output_file_path}")
    clear_output()
    end_time = datetime.now()
    time_difference_seconds = str((end_time - start_time).total_seconds()) + "s"
    print(f"Generated cleaned report in {time_difference_seconds}.")
    display(Markdown(cleaned_report))
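
To make the shape of the raw report concrete, here is a small sketch of the assembly step on in-memory sample data. It assumes the same input layout as the batch output (query title mapped to a dict with `processed_pages`). `assemble_markdown` is a hypothetical stand-in that skips file reading and printing, and the sample text is made up.

```python
from typing import Any, Dict


def assemble_markdown(data: Dict[str, Any]) -> str:
    """Joins page markdown per query under a heading, ending each section with '----'."""
    sections = []
    for title, query_results in data.items():
        pages = (query_results or {}).get("processed_pages") or []
        if not pages:
            continue  # Queries without pages contribute nothing to the report.
        body = "\n\n".join(page.get("markdown") or "" for page in pages)
        sections.append(f"# {title}\n\n{body}\n----\n")
    return "".join(sections)


sample_data = {
    "Meta 2024 News": {
        "processed_pages": [
            {"markdown": "First page."},
            {"markdown": "Second page."},
        ]
    },
    "Meta 2024 Updates": {"processed_pages": []},
}
report = assemble_markdown(sample_data)
print(report)
```

The combined text is what gets embedded in the prompt, so its size grows with every scraped page. Checking the approximate token count of the result before sending it to the model is a sensible precaution.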

In [None]:
data_file_path = "data/meta_report/raw_report_dump_v1.json"

show_raw_report(data_file_path)

In [None]:
from datetime import datetime

data_file_path = "data/meta_report/raw_report_dump_v1.json"

timestamp = datetime.now().timestamp()

output_file_path = f"outputs/meta_key_highlights_2024_{timestamp}.md"

show_cleaned_report(data_file_path, output_file_path=output_file_path)