# Welcome to Lumigator Foxfooding from [Mozilla.ai](https://www.mozilla.ai/)! 🐊 🦊

## Agenda

+ Working with Jupyter Notebooks
+ Lumigator Platform Overview  🐊
+ Understanding Machine Learning Workflows 
+ Thunderbird Dataset Walkthrough
+ Explanation and Examination of Thunderbird Ground Truth
+ Model Selection (1 encoder-decoder model, 2 decoder-only models), evaluated against GPT-4
+ Run experiment and show results
+ Evaluate results and discuss

## Jupyter Walkthrough

[Jupyter Notebooks](https://jupyter-notebook.readthedocs.io/en/stable/) are an executable code/text environment for (usually) Python code. Our Jupyter environment is in JupyterHub. To work with Jupyter, click "run cell" to run the code and see results below the cell you're currently running. Cells are executed sequentially. 

Some cells contain placeholder variables that you need to fill in before running the cell. They look like this, and the code will not work until you replace the placeholder:

```python
# suggestion: "lumigator_enthusiasts"
team_name = TEAM_NAME_HERE
```
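
As an illustration, a filled-in version of that cell might look like the sketch below. The name is just the suggestion from the comment; any non-empty string should do. The sanity check is an addition for illustration and is not part of the original notebook.

```python
# Replace the placeholder with a quoted string of your choice.
team_name = "lumigator_enthusiasts"

# Optional sanity check (added for illustration only):
# fail early if the name is empty or not a string.
if not isinstance(team_name, str) or not team_name.strip():
    raise ValueError("team_name must be a non-empty string")

print(team_name)
```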

# Running cells 
To run a cell, press the "play" icon in the top bar (you can also hit Shift+Enter to run and proceed to the following cell).


<img src="running.png" alt="drawing" width="400"/>

Your files are located on the left-hand side. They'll be saved for the duration of our session, but if you'd like to keep them, make sure to download them. 

<img src="files.png" alt="drawing" width="400"/>


In [1]:
## Let's try running some code!

print("Welcome to Lumigator!🐊")

# You can see the output below!

Welcome to Lumigator!🐊


For more on notebooks and how cells work, check out this demo. [You can click links in cells.](https://github.com/nbgallery/Jupyter4Analysts/blob/main/J4A%20Notebook%201%20-%20Jupyter%20Syntax%20and%20Other%20Things.ipynb)

## Glossary of terms 

Some terms you'll hear us using throughout the session: 

+ **Machine learning** - The process of creating a model that learns from data
+ **Dataset** - Data used to train or evaluate models
+ **LLM** - Large language model, [a text-based model that performs next-word predictions](https://www.nvidia.com/en-us/glossary/large-language-models/) 
+ **Tokens** - Words broken up into pieces to be used in an LLM 
+ **Inference** - The process of getting a prediction from a large language model 
+ **Embeddings** - Numerical representations of text generated by modeling 
+ **Encoder-decoder models** - a neural network architecture comprised of two neural networks, an encoder that takes the input vectors from our data and creates an embedding of a fixed length, and a decoder, also a neural network, which takes the embeddings encoded as input and generates a static set of outputs such as translated text or a text summary
+ **Decoder-only models** - Receive a prompt of text directly and predict the next word
+ **Task** - Machine learning tasks to fit a specific model type, including translation, summarization, completion, etc. 
+ **Ground truth** - A dataset that humans (or LLMs, in some cases) have judged to be correct, which we can use as a point of comparison for our model.

# Machine Learning Workflows

In machine learning, we are looking to generate a model artifact from data. We have several stages we care about: the data preprocessing, model training, model generation, inference, and evaluation. 

<img src="ml_workflow.png" alt="drawing" width="400"/>

Within the universe of modeling approaches, there are supervised and unsupervised approaches, as well as reinforcement learning. When we think of language modeling, that falls in the realm of neural network approaches. 


<img src="ml_family.png" alt="drawing" width="400"/>


Lumigator focuses on **inference** and **evaluation** for large language models: we want to be able to take our own dataset, perform inference on it, and evaluate the results to see if the model we would like to use produces good results for our use cases, including ones that are specific to our business.


In order to select an LLM, we need the following stages: 

1. Generate ground truth for our business use-case
2. Pick several models we'd like to use to evaluate
3. Run an evaluation loop consisting of looking at the ground truth in comparison to model results
4. Analyze our evaluations. 

These are the steps that Lumigator completes. Here's a platform overview:


<img src="platform.png" alt="drawing" width="400"/>
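
To make the four selection stages listed above concrete, here is a minimal sketch of the loop in plain Python. Everything in it is an assumption for illustration: the summaries are made up, the model names are placeholders, and the word-overlap F1 score is a simple stand-in rather than the metric Lumigator itself computes.

```python
from collections import Counter


def unigram_f1(reference: str, candidate: str) -> float:
    """Word-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Stage 1: ground truth summaries (made-up examples).
ground_truth = [
    "the beta release was removed from the public listing",
    "add-on authors should test against the beta channel",
]

# Stage 2: candidate models, represented here by their (made-up) outputs.
model_outputs = {
    "model_a": [
        "the beta release was removed from the listing",
        "authors should test add-ons on beta",
    ],
    "model_b": [
        "a user asked a question",
        "there is a new release",
    ],
}

# Stage 3: evaluation loop comparing each output to the ground truth.
scores = {}
for name, outputs in model_outputs.items():
    per_sample = [unigram_f1(ref, out) for ref, out in zip(ground_truth, outputs)]
    scores[name] = sum(per_sample) / len(per_sample)

# Stage 4: analyze the evaluations by ranking models on their mean score.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

A single overlap score is a blunt instrument, which is why the final stage still involves people reading the summaries.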


## Machine learning is alchemy

When we think of traditional software application workflows, we think of an example such as adding a button. We can clearly test that we've added a blue button to our application, and that it works correctly. Machine learning is not like this! It involves a lot of experimentation: tweaking hyperparameters and prompts, and trying different models. Expect the process to be imperfect, with many iterative loops. Luckily, Lumigator helps take away the uncertainty of at least model selection :)

> There’s a self-congratulatory feeling in the air. We say things like “machine learning is the new electricity”. I’d like to offer an alternative metaphor: machine learning has become alchemy. - [Ben Recht and Ali Rahimi](https://archives.argmin.net/2017/12/05/kitchen-sinks/)

Ultimately, the final test of whether a model is good is whether humans think it's good.

With that in mind, let's dive into setting up experiments with Lumigator to test our models!

In [12]:
# We have a library of utility functions that will help us connect to the Lumigator API
# Let's take a second to walk through them 

import lumigator_demo as ld


In [8]:
# Importing packages we need to work with data 
# python standard libraries
import os
import time

# third-party libraries
import matplotlib.pyplot as plt
import pandas as pd
from datasets import load_dataset
from IPython.display import clear_output

# wrap columns for inspection
pd.set_option('display.max_colwidth', 0)
# stylesheet for visibility
plt.style.use("fast")

# Understanding the Lumigator App and API 

The app itself consists of an API. You can browse its methods and try them out through the [OpenAPI spec](https://swagger.io/specification/), available at the platform URL under `docs`.

<img src="openapi.png" alt="drawing" width="200"/>

[Here](https://lumigator.mzai.dev/docs) are the docs for the Lumigator API. In looking at them, we can see that we have 7 endpoints. 
The application is split up into `jobs`, `deployments`, `datasets`, `experiments`, `completions`, and `health`.

+ `Datasets` - Data that we add to the platform for evaluation. We can upload, delete, and save different data in the platform. 
+ `Experiments` - Our actual evaluation experiments. We can list all previous evaluations, create new ones, and get their results.
+ `Jobs` - Check running status of lm-buddy evaluation jobs
+ `Deployments` - Running Ray-serve deployments with locally-hosted models
+ `Completions` - Access to external APIs such as Mistral and OpenAI
+ `Health` - Status of the application, jobs and deployments. 
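
As a rough sketch of how these groups map to URLs, the helper below builds request paths. It rests on an assumption: the only path this notebook actually shows is `/api/v1/completions/mistral` (in an error message further down), so the `/api/v1/<group>` pattern for the other groups is a guess to be checked against the API docs. `endpoint_url` is a hypothetical helper, not part of `lumigator_demo`.

```python
# Base URL as it appears in this notebook's error output; adjust it to
# wherever your Lumigator instance is running.
BASE_URL = "http://127.0.0.1"

# Assumption: every group lives under /api/v1/<group>. Only the
# completions path is confirmed by this notebook.
API_GROUPS = ("datasets", "experiments", "jobs", "deployments", "completions", "health")


def endpoint_url(group: str, *parts: str, base_url: str = BASE_URL) -> str:
    """Build a URL such as http://127.0.0.1/api/v1/completions/mistral."""
    if group not in API_GROUPS:
        raise ValueError(f"unknown API group: {group!r}")
    path = "/".join(("api", "v1", group, *parts))
    return f"{base_url.rstrip('/')}/{path}"


print(endpoint_url("completions", "mistral"))
```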




# Experiments
Let's start by creating a team name for our experiments to organize our data. Pick a team name below and run the cell. 

In [None]:
# suggestion: "lumigator_enthusiasts"
team_name = TEAM_NAME_HERE

## Model Task

The task we'll be working with is summarization: we want to generate a summary of our text, which in this particular case means emails. Finding a good model for summarization is a daunting task, as the typical intuition that larger parameter models generally perform better goes out the window. For summarization, we need to consider the input, which will likely be of a longer context size, and finding models that efficiently deal with those longer contexts is of paramount importance.

In our business case, which is to create summaries of conversation threads, much as you might see in Slack or an email chain, the models need to be able to extract key information from those threads while still being able to accept a large context window to capture the entire conversation history.

We identified that, for our use cases, it is far more valuable to produce abstractive summaries, which identify important sections in the text and generate highlights, than extractive ones, which pick a subset of sentences and add them together. The final interface will be natural language, and we don't want readers to have to interpret the often incoherent text snippets that extractive summaries produce.
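
To see why extractive output can read as stitched-together snippets, here is a deliberately naive extractive summarizer. It is an illustrative sketch only, not a method used by Lumigator or by any of the models in this session: it scores sentences by word frequency and returns the top ones verbatim.

```python
import re
from collections import Counter


def extractive_summary(text: str, num_sentences: int = 2) -> str:
    """Return the highest-scoring sentences, kept in their original order."""
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Score each sentence by the summed frequency of its words in the text.
    word_counts = Counter(re.findall(r"[a-z']+", text.lower()))
    scored = [
        (sum(word_counts[w] for w in re.findall(r"[a-z']+", s.lower())), i)
        for i, s in enumerate(sentences)
    ]
    top = sorted(scored, reverse=True)[:num_sentences]
    keep = sorted(i for _, i in top)
    return " ".join(sentences[i] for i in keep)


thread = (
    "The beta release appeared in the public listing. "
    "Thanks for the report. "
    "Remove the beta release from the listing and host the beta file yourself. "
    "Thanks!"
)
print(extractive_summary(thread))
```

The result contains only sentences copied from the thread, with no connective text between them. An abstractive model would instead write new sentences describing who asked what and how it was resolved.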

For more on summarization as a use-case, [see our blog post here.](https://blog.mozilla.ai/on-model-selection-for-text-summarization/)

## Ground Truth for Models

To evaluate a model, we first need ground truth: a human-generated baseline that we can compare the model against to see how good it is. The term ground truth comes from geology and geospatial sciences, where actual information was collected on the ground to validate data acquired through remote sensing, such as satellite imagery or aerial photography. Since then, the concept has been adopted in other fields, particularly in machine learning and artificial intelligence, to refer to the accurate, real-world data used for training and testing models. 

We'll do this by performing inference against existing models that are trained for summarization. Let's take a look at the data we'll be using first. [Generating GT](https://thunderbird.topicbox.com/groups/addons/T18e84db141355abd-M4cca8e3f9e4fee9ae14b9dbb/self-hosted-version-of-extension-is-incorrectly-appearing-in-atn)


## Generating Data for Ground Truth Evaluation

In order to generate a ground truth summary for our data, we first need an input dataset. In this case we use threads from the [Thunderbird public mailing list.](https://thunderbird.topicbox.com/latest)  To generate the ground truth and then later evaluate the model, we need at least 100 samples to start with, where a sample is a single email or single email conversation.

Our selection criteria: 

+ Collect 100 recent and "complete" email threads for evaluation
+ Clean them of email formatting such as `>`
+ Keep each thread within the model's input limit: BART, the baseline model we're using, accepts a context window of up to 1,024 tokens. To stay on the conservative side, we keep input email threads to roughly 1,000 words or fewer, and shorter is safer because a single word can be split into several tokens.
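
The cleaning and length criteria above could be sketched as follows. This is an assumption-laden illustration, not the script used to build the workshop dataset: `strip_quote_markers` and `within_budget` are hypothetical helpers, and counting whitespace-separated words is only a rough proxy for the model's token count.

```python
import re

MAX_WORDS = 1000  # conservative word budget taken from the criteria above


def strip_quote_markers(text: str) -> str:
    """Remove leading '>' reply markers from each line of an email."""
    return "\n".join(re.sub(r"^\s*(>\s*)+", "", line) for line in text.splitlines())


def within_budget(text: str, max_words: int = MAX_WORDS) -> bool:
    """True if the whitespace-delimited word count fits the budget."""
    return len(text.split()) <= max_words


raw = "> > Is PDF.js still needed?\n> Ship your own version.\nThanks!"
clean = strip_quote_markers(raw)
print(clean)
print(within_budget(clean))
```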

Once we've collected them, we'd like to take a look at the data before we generate summaries. 

## How do LLMs work? 

The goal of a transformer model is to take a piece of multimodal content and learn the latent relationships by creating multiple views of groups of words in the input corpus (multiple context windows). The self-attention mechanism, implemented as scaled dot-product attention in the Transformer paper, creates different context windows of the data a number of times through the six encoder and six decoder layers of the original architecture. The output is the result of the specific machine learning task (a translated sentence, a summarized paragraph), and the next-to-last layer is the model’s embeddings, which we can use for downstream work.
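
For readers who want to see the mechanism itself, here is a small pure-Python sketch of scaled dot-product attention, i.e. softmax(QKᵀ / √d_k)·V. It is a teaching toy under stated assumptions: a real transformer first passes the input through learned projection matrices to get Q, K, and V and runs many attention heads in parallel, whereas this example feeds the same toy vectors in directly.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Compute softmax(Q K^T / sqrt(d_k)) V for lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # Each output is a weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(len(values[0]))]
        )
    return outputs


# Three toy "token" vectors, used directly as Q, K, and V (no learned projections).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
print(attended)
```

Each output row is a blend of all three input vectors, weighted by how similar that token is to the others. That blending is what lets each position take its context into account.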

There are many different kinds of LLMs and many different kinds of architectures. For our stack, we use two different kinds:

+ Encoder/Decoder - BART model
+ Decoder-only - most GPT-style models, such as Mistral and GPT, and others we'll be working with. 

In [13]:
# show information about the Thunderbird dataset
dataset_id = "db7ff8c2-a255-4d75-915d-77ba73affc53"
r = ld.dataset_info(dataset_id)

{
  "id": "db7ff8c2-a255-4d75-915d-77ba73affc53",
  "filename": "thunderbird_samples.csv",
  "format": "experiment",
  "size": 157184,
  "created_at": "2024-07-17T16:25:37.673229Z"
}


In [14]:
# download the dataset into a pandas dataframe
df = ld.dataset_download(dataset_id)
df.head()

Unnamed: 0,examples
0,"I recently added a beta release, version 7.0b1, of my extension, Clippings for Thunderbird, and selected the option to self-host it so that I can make it available to testers separately while regular users continue to see version 6.3.5, the current release version. However, the beta release is now incorrectly appearing in the Add-ons for Thunderbird public listing. URL to ATN listing: https://addons.thunderbird.net/en-US/thunderbird/addon/clippings-tb/. Thanks for the report, it is best to file an issue with the addons-server: https://github.com/thunderbird/addons-server I have not looked at it in detail, but self-hosted add-ons need an update_url entry in their manifest and that should prevent it from being accepted on ATN. Since we do not sign add-ons, self-hosted add-ons do not need to be submitted to ATN at all.My best-practice advice is:\n- remove the beta version from ATN,\n- create a dedicated branch/repo which holds the update information (either a branch in each add-on repo you want to self-host, or a single update-repo with the information for all your add-ons)\n- host the XPI files either as an ""beta"" asset in a ""github release"", or directly as a file in the repo (I think the asset is the better choice),\n- make sure that the manifest of your self-hosted XPI points to the correct update.json. John I've removed the beta version of my extension from ATN, and the listing for my extension now shows the current stable release. Thanks for your help!"
1,"the Thunderbird team is preparing the next big release: Thunderbird 128 ESR. Now is a good time to check if your add-ons are compatible. Thunderbird 128 is currently being shipped through the beta release channel. If you have not already installed Thunderbird Beta, you can get it from Thunderbird's download page [1]: Select the desired language, your operating system, and switch the ""release channel"" selection to ""Beta"".\n\nA list of known required changes to make add-ons compatible with Thunderbird 128 can be found on developer.thunderbird.net [2]. These changes mostly affect Experiment add-ons, which directly interact with Thunderbird code. WebExtensions usually do not need updates, but the add-ons team had to introduce the messagesUpdate permission, and browser.messages.update() will stop working, if the new permission has not been requested.\nAnother notably change is the official support of Manifest Version 3 in Thunderbird 128. The add-ons team removed deprecated elements and made additional changes to resolve inconsistencies in the APIs. The full list of changes can be found on webextension-api.thunderbird.net [3].\n\nStarting with Thunderbird 128, the API documentation on webextension-api.thunderbird.net [4] not only includes the WebExtension APIs added by Thunderbird, but also those inherited from Firefox (Thunderbird and Firefox share a significant amount of code). We are listing only methods, which are actually supported and working.\n\nDo not hesitate to reach out for help [5]. Looking forward to see your add-ons running in Thunderbird 128!\n\nJohn\n\n\n[1] : https://www.thunderbird.net/thunderbird/all/\n[2] : https://developer.thunderbird.net/add-ons/updating/tb128\n[3] : https://webextension-api.thunderbird.net/en/128-esr-mv3/changes/esr128.html\n[4] : https://webextension-api.thunderbird.net/en/stable/\n[5] : https://developer.thunderbird.net/add-ons/community"
2,"Hi all, Well I think everything is in the title. We have an extension that shows pdf thumbs in the message pane (faster than openning them). We used to provide PDF.js. Is this still needed? Regards, We usually do not encourage add-ons to depend on files shipped with Thunderbird, because the file could be moved/renamed, which will break your add-on. Ship your own version, so your add-on always uses a known version and stays compatible, even if Thunderbird updates its internal file, which might include API changes which could break your add-on as well. John"
3,"I'm studying how to develop addons. So, I'm sorry if my question is noob.\n\nMy code is here: https://github.com/gersonjferreira/Zulip-Thunderbird\n\nThe extension adds an icon to open the Zulip team chat on a Thunderbird tab. The purpose is to concentrate all chat/email apps into a single window. It is a simple extension and it is working fine... but...\n\nWithin the Zulip chats (same for Slack or other examples), there are external links, or links to PDF files, etc... when I left-click on these, the Thunderbird internal browser breaks and stays black. Probably because it cannot show the ""save as"" dialog or ""open with"", and so on...\n\nBut I if right-click and select ""open in browser"", it works fine. It takes me to Firefox and opens the desired link, or downloads the PDF, and so on...\n\nIs there a way to fix this, so that the left-click takes me to Firefox? Or to make it all work within Thunderbird?\n\nPS: my extension is quite simple and I'm still learning a lot of details. Please feel free to give feedback and tips on what I should improve. I have no idea how many people would be interested in using it.\n\nBest regards,\nGerson\n\nHm,\n\nI do not know if it is related: Do all those ""broken"" a-tags have a target=""_blank"" attribute? It appears those do not work as expected in Thunderbird. Filed a bug for it:\nhttps://bugzilla.mozilla.org/show_bug.cgi?id=1905616\n\nThe link handler of Thunderbird content tabs will always open links of the same site in the same Thunderbird tab. Links to other sites will open in the default browser.\n\nTo change this behaviour, you can use a content script. Add this to your manifest:\n\n""content_scripts"": [ { ""matches"": [""<all_urls>""], ""js"": [""content-script.js""] } ],\n\nYou may limit the content script to a more strict match, if you know which pages you are going to use with your extension. 
In the content script define a global clickhandler:\n\nwindow.addEventListener(""click"", clickhandler);\n\nfunction clickhandler(event) {\nevent.preventDefault();\nevent.stopPropagation();\n\nconst anchor = event.target.closest(""a"");\nif (!anchor) return;\nbrowser.windows.openDefaultBrowser(anchor.getAttribute('href'));\n}\n\nDoes this help?\n\nJohn\n\nYes, at the moment I confirm that all links that fail have target=""_blank"", so it seems related to the bug you are reporting.\n\nRegarding you suggestion, it sounds promising, but I have a busy day of work today. I'll test this changes later tonight and I'll reply here if it works.\n\nI've just tested your suggestion, and it helped me identify some details:\n\nFirst, indeed the links that need to be fixed are the ones with target=""_blank"" for sure, but not all of then. In zulip, there are links that start with /user_uploads, like /user_uploads/bla_bla.pdf, and these are the broken ones in my addon. To make these work, I need to add the Zulip prefix ""https://my_org.zulipchat.com/"" + ""/user_uploads/bla_bla.pdf"".\n\nTo make this work, I had to split your suggestion into a background and a content-script code like this:\n// background.js\nbrowser.runtime.onMessage.addListener(function (message) {\n prefix = browser.storage.sync.get('zulip_url');\n prefix.then(function(result) {\n let prefix = result.zulip_url;\n href = message.data;\n if (href.startsWith(""/user_uploads"")) {\n href = prefix + href;\n }\n browser.windows.openDefaultBrowser(href);\n });\n});\nand\n// content-script.js\nwindow.addEventListener(""click"", clickhandler);\n\nfunction clickhandler(event) {\n event.preventDefault();\n event.stopPropagation();\n\n const anchor = event.target.closest(""a"");\n if (!anchor) return;\n browser.runtime.sendMessage({\n data: anchor.getAttribute('href')\n });\n};\n\nI'll try this for a couple of days to see if I spot other issues. 
If you have any suggestion on how I can improve this code, please let me know.\n\nThanks for the help,\nBest regards,\nGerson"
4,"I think this is going to require some expertise from someone who knows how Thunderbird works internally, but I thought I would ask anyway.\n\nWhen you set mail.tabs.drawInTitlebar=true in the Config Editor, the OS-supplied titlebar -- with the window title and the minimize/maximize/restore/close buttons -- is hidden, and Thunderbird-supplied equivalents for the buttons are added instead. This is great.\n\nAnother thing that happens is that you can DRAG in the toolbox/toolbars at the top of the window to move the Thunderbird window around, as the titlebar is no longer available for this purpose. This is also great.\n\nI have been trying to implement the same behavior for an extension I have been working on. I can hide the OS-supplied titlebar and add my own buttons, and that is working well, but I cannot figure out how to make it so I can drag on my extension's toolbox/toolbars to make the windows for my extension's move.\n\nI seem to have isolated what triggers this behavior for Thunderbird's main window (and the message view and compose windows.) Using the Developer Toolbox, I just add Attribute chromemargin=""0,2,2,2"" (or chromemargin with ANY values, actually) to the <html> tag in the HTML/XUL for the Thunderbird window. It's like magic. Add the chromemargin Attribute, and the titlebar disappears, and you can drag the window around using the toolbox/toolbars at the top of the window. Remove the chromemargin Attribute and the titlebar re-appears and you CANNOT drag the window around using the toolbox/toolbars at the top of the window.\n\nThis does not work for the windows for my extension. Dragging on the toolbox/toolbars does nothing.\n\nI have looked to see what event listeners might be attached to the Toolbox/toolbars that would respond to mouse drags and I cannot see anything. Perhaps it's something in the source code for the app itself? Or some other value stored somewhere, some toolbox or toolbar ID, or CSS class, or attribute or something. 
I just don't know.\n\nI have tried looking at the Thunderbird source code, and I understand it a bit, but not well enough to figure this out.\n\nIf you have any knowledge of this, could you please give me some help?\n\nMany thanks in advance,\n---Mark\nJust in case anyone is interested in this, user morat provided the answer in this article on mozillaZine:\n\nhttp://forums.mozillazine.org/viewtopic.php?t=3121128&sid=d068573e2643604e638528cdb09a5a63"


In [30]:
# We'd like to make sure our data is clean for LLM input
# This is often not necessary since the model is trained on internet-formatted data
# But we'll be careful here

import re
from string import punctuation

def preprocess_text(text: str) -> str:
    """Lowercase, strip punctuation, collapse whitespace, and drop standalone numbers."""
    text = text.lower()  # Lowercase text
    text = re.sub(f"[{re.escape(punctuation)}]", "", text)  # Remove punctuation
    text = " ".join(text.split())  # Remove extra spaces, tabs, and new lines
    text = re.sub(r"\b[0-9]+\b\s*", "", text)  # Remove standalone numbers
    return text

df["examples"].map(preprocess_text)

0     i recently added a beta release version 70b1 of my extension clippings for thunderbird and selected the option to selfhost it so that i can make it available to testers separately while regular users continue to see version the current release version however the beta release is now incorrectly appearing in the addons for thunderbird public listing url to atn listing httpsaddonsthunderbirdnetenusthunderbirdaddonclippingstb thanks for the report it is best to file an issue with the addonsserver httpsgithubcomthunderbirdaddonsserver i have not looked at it in detail but selfhosted addons need an updateurl entry in their manifest and that should prevent it from being accepted on atn since we do not sign addons selfhosted addons do not need to be submitted to atn at allmy bestpractice advice is remove the beta version from atn create a dedicated branchrepo which holds the update information either a branch in each addon repo you want to selfhost or a single updaterepo with the inform

In [31]:
# Examine a single sample 
# we define the data with examples
df['examples'].iloc[0]

'I recently added a beta release, version 7.0b1, of my extension, Clippings for Thunderbird, and selected the option to self-host it so that I can make it available to testers separately while regular users continue to see version 6.3.5, the current release version. However, the beta release is now incorrectly appearing in the Add-ons for Thunderbird public listing. URL to ATN listing: https://addons.thunderbird.net/en-US/thunderbird/addon/clippings-tb/. Thanks for the report, it is best to file an issue with the addons-server: https://github.com/thunderbird/addons-server I have not looked at it in detail, but self-hosted add-ons need an update_url entry in their manifest and that should prevent it from being accepted on ATN. Since we do not sign add-ons, self-hosted add-ons do not need to be submitted to ATN at all.My best-practice advice is:\n- remove the beta version from ATN,\n- create a dedicated branch/repo which holds the update information (either a branch in each add-on repo y

In [32]:
# Add a column with simple character counts for model input
df['char_count'] = df['examples'].str.len()

In [33]:
df.head()

In [None]:
# Show statistics about characters count
df['char_count'].describe()

In [None]:
fig, ax = plt.subplots(figsize=(12, 6))
ax.hist(df['char_count'], bins=30)
ax.set_xlabel('Character Count')
ax.set_ylabel('Frequency')

stats = df['char_count'].describe().apply(lambda x: f"{x:.0f}")

# Add text boxes for statistics
plt.text(1.05, 0.95, stats.to_string(), 
         transform=ax.transAxes, verticalalignment='top')

# Adjust layout
plt.tight_layout()
fig.subplots_adjust(right=0.75)

plt.show()

In [108]:
## Perform Ground Truth Generation with Mistral 

mistral_responses = []

for sample in df['examples'][0:10]:
    res = ld.get_mistral_ground_truth(sample)
    print("Mistral Summary:", res)
    mistral_responses.append((sample, res['text']))

Request failed: 500 Server Error: Internal Server Error for url: http://127.0.0.1/api/v1/completions/mistral


HTTPError: 500 Server Error: Internal Server Error for url: http://127.0.0.1/api/v1/completions/mistral
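
Transient server errors like the one above are common when calling hosted models. One way to cope, sketched here under assumptions, is to wrap the call in a small retry helper. `call_with_retries` is a hypothetical function, not part of `lumigator_demo`, and the flaky function below only simulates a failing call; in the notebook you would pass `ld.get_mistral_ground_truth` and a sample instead.

```python
import time


def call_with_retries(func, *args, attempts=3, delay_seconds=1.0, **kwargs):
    """Call func, retrying on any exception with a growing pause between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return func(*args, **kwargs)
        except Exception as error:  # in real use, narrow this to the HTTP error type
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({error}); retrying")
            time.sleep(delay_seconds * attempt)


# Stand-in for a flaky remote call: it fails twice, then succeeds.
calls = {"count": 0}


def flaky_summary(sample):
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("500 Server Error")
    return {"text": f"summary of: {sample}"}


result = call_with_retries(flaky_summary, "example thread", delay_seconds=0)
print(result["text"])
```

Retrying only helps with intermittent failures. If the error persists across attempts, the helper re-raises it so the underlying problem stays visible.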

In [42]:
# Let's create a result set we can look at
mistral_results_df = pd.DataFrame(mistral_responses, columns=['original', 'mistral_response'])

mistral_results_df

Unnamed: 0,original,mistral_response
0,"I recently added a beta release, version 7.0b1, of my extension, Clippings for Thunderbird, and selected the option to self-host it so that I can make it available to testers separately while regular users continue to see version 6.3.5, the current release version. However, the beta release is now incorrectly appearing in the Add-ons for Thunderbird public listing. URL to ATN listing: https://addons.thunderbird.net/en-US/thunderbird/addon/clippings-tb/. Thanks for the report, it is best to file an issue with the addons-server: https://github.com/thunderbird/addons-server I have not looked at it in detail, but self-hosted add-ons need an update_url entry in their manifest and that should prevent it from being accepted on ATN. Since we do not sign add-ons, self-hosted add-ons do not need to be submitted to ATN at all.My best-practice advice is:\n- remove the beta version from ATN,\n- create a dedicated branch/repo which holds the update information (either a branch in each add-on repo you want to self-host, or a single update-repo with the information for all your add-ons)\n- host the XPI files either as an ""beta"" asset in a ""github release"", or directly as a file in the repo (I think the asset is the better choice),\n- make sure that the manifest of your self-hosted XPI points to the correct update.json. John I've removed the beta version of my extension from ATN, and the listing for my extension now shows the current stable release. Thanks for your help!","The user has released a beta version 7.0b1 of their extension, Clippings for Thunderbird, and self-hosted it for testing purposes. However, the beta release appeared in the Add-ons for Thunderbird public listing, which is incorrect. The user was advised to file an issue with the addons-server, as self-hosted add-ons need an 'update_url' entry in their manifest to prevent this. 
The best practice is to remove the beta version from the ATN listing, create a dedicated branch/repo for update information, host the XPI files as a GitHub release asset or repo file, and ensure the manifest points to the correct update.json. The user has since removed the beta version from ATN, correcting the issue."
1,"the Thunderbird team is preparing the next big release: Thunderbird 128 ESR. Now is a good time to check if your add-ons are compatible. Thunderbird 128 is currently being shipped through the beta release channel. If you have not already installed Thunderbird Beta, you can get it from Thunderbird's download page [1]: Select the desired language, your operating system, and switch the ""release channel"" selection to ""Beta"".\n\nA list of known required changes to make add-ons compatible with Thunderbird 128 can be found on developer.thunderbird.net [2]. These changes mostly affect Experiment add-ons, which directly interact with Thunderbird code. WebExtensions usually do not need updates, but the add-ons team had to introduce the messagesUpdate permission, and browser.messages.update() will stop working, if the new permission has not been requested.\nAnother notably change is the official support of Manifest Version 3 in Thunderbird 128. The add-ons team removed deprecated elements and made additional changes to resolve inconsistencies in the APIs. The full list of changes can be found on webextension-api.thunderbird.net [3].\n\nStarting with Thunderbird 128, the API documentation on webextension-api.thunderbird.net [4] not only includes the WebExtension APIs added by Thunderbird, but also those inherited from Firefox (Thunderbird and Firefox share a significant amount of code). We are listing only methods, which are actually supported and working.\n\nDo not hesitate to reach out for help [5]. 
Looking forward to see your add-ons running in Thunderbird 128!\n\nJohn\n\n\n[1] : https://www.thunderbird.net/thunderbird/all/\n[2] : https://developer.thunderbird.net/add-ons/updating/tb128\n[3] : https://webextension-api.thunderbird.net/en/128-esr-mv3/changes/esr128.html\n[4] : https://webextension-api.thunderbird.net/en/stable/\n[5] : https://developer.thunderbird.net/add-ons/community","Thunderbird is preparing to release Thunderbird 128 ESR, and add-on developers should check their compatibility. Thunderbird 128 is currently available through the beta release channel on Thunderbird's download page. The main changes affecting add-ons are updates to Experiment add-ons that interact directly with Thunderbird code, the introduction of the messagesUpdate permission for WebExtensions, and the official support of Manifest Version 3. The full list of changes can be found on the appropriate Thunderbird developer websites. Additionally, the API documentation now includes both Thunderbird and Firefox WebExtension APIs, focusing on those that are actually supported and working. Developers are encouraged to reach out for help if needed."
2,"Hi all, Well I think everything is in the title. We have an extension that shows pdf thumbs in the message pane (faster than openning them). We used to provide PDF.js. Is this still needed? Regards, We usually do not encourage add-ons to depend on files shipped with Thunderbird, because the file could be moved/renamed, which will break your add-on. Ship your own version, so your add-on always uses a known version and stays compatible, even if Thunderbird updates its internal file, which might include API changes which could break your add-on as well. John","The message is about an extension that displays PDF thumbnails in the message pane for faster access than opening them. Initially, this extension used PDF.js, but the author is questioning if it's still necessary due to potential issues with depending on files shipped with Thunderbird, which could be moved or renamed, breaking the add-on. The author suggests shipping their own version of PDF.js to ensure compatibility, even if Thunderbird updates its internal file, which may include API changes that could also break the add-on. John is likely encouraging developers to handle their dependencies independently to maintain compatibility with different Thunderbird versions."
3,"I'm studying how to develop addons. So, I'm sorry if my question is noob.\n\nMy code is here: https://github.com/gersonjferreira/Zulip-Thunderbird\n\nThe extension adds an icon to open the Zulip team chat on a Thunderbird tab. The purpose is to concentrate all chat/email apps into a single window. It is a simple extension and it is working fine... but...\n\nWithin the Zulip chats (same for Slack or other examples), there are external links, or links to PDF files, etc... when I left-click on these, the Thunderbird internal browser breaks and stays black. Probably because it cannot show the ""save as"" dialog or ""open with"", and so on...\n\nBut I if right-click and select ""open in browser"", it works fine. It takes me to Firefox and opens the desired link, or downloads the PDF, and so on...\n\nIs there a way to fix this, so that the left-click takes me to Firefox? Or to make it all work within Thunderbird?\n\nPS: my extension is quite simple and I'm still learning a lot of details. Please feel free to give feedback and tips on what I should improve. I have no idea how many people would be interested in using it.\n\nBest regards,\nGerson\n\nHm,\n\nI do not know if it is related: Do all those ""broken"" a-tags have a target=""_blank"" attribute? It appears those do not work as expected in Thunderbird. Filed a bug for it:\nhttps://bugzilla.mozilla.org/show_bug.cgi?id=1905616\n\nThe link handler of Thunderbird content tabs will always open links of the same site in the same Thunderbird tab. Links to other sites will open in the default browser.\n\nTo change this behaviour, you can use a content script. Add this to your manifest:\n\n""content_scripts"": [ { ""matches"": [""<all_urls>""], ""js"": [""content-script.js""] } ],\n\nYou may limit the content script to a more strict match, if you know which pages you are going to use with your extension. 
In the content script define a global clickhandler:\n\nwindow.addEventListener(""click"", clickhandler);\n\nfunction clickhandler(event) {\nevent.preventDefault();\nevent.stopPropagation();\n\nconst anchor = event.target.closest(""a"");\nif (!anchor) return;\nbrowser.windows.openDefaultBrowser(anchor.getAttribute('href'));\n}\n\nDoes this help?\n\nJohn\n\nYes, at the moment I confirm that all links that fail have target=""_blank"", so it seems related to the bug you are reporting.\n\nRegarding you suggestion, it sounds promising, but I have a busy day of work today. I'll test this changes later tonight and I'll reply here if it works.\n\nI've just tested your suggestion, and it helped me identify some details:\n\nFirst, indeed the links that need to be fixed are the ones with target=""_blank"" for sure, but not all of then. In zulip, there are links that start with /user_uploads, like /user_uploads/bla_bla.pdf, and these are the broken ones in my addon. To make these work, I need to add the Zulip prefix ""https://my_org.zulipchat.com/"" + ""/user_uploads/bla_bla.pdf"".\n\nTo make this work, I had to split your suggestion into a background and a content-script code like this:\n// background.js\nbrowser.runtime.onMessage.addListener(function (message) {\n prefix = browser.storage.sync.get('zulip_url');\n prefix.then(function(result) {\n let prefix = result.zulip_url;\n href = message.data;\n if (href.startsWith(""/user_uploads"")) {\n href = prefix + href;\n }\n browser.windows.openDefaultBrowser(href);\n });\n});\nand\n// content-script.js\nwindow.addEventListener(""click"", clickhandler);\n\nfunction clickhandler(event) {\n event.preventDefault();\n event.stopPropagation();\n\n const anchor = event.target.closest(""a"");\n if (!anchor) return;\n browser.runtime.sendMessage({\n data: anchor.getAttribute('href')\n });\n};\n\nI'll try this for a couple of days to see if I spot other issues. 
If you have any suggestion on how I can improve this code, please let me know.\n\nThanks for the help,\nBest regards,\nGerson","Gerson is developing an addon for Thunderbird that adds an icon to open Zulip team chat. The extension works fine, but when clicking on external links or PDF files within Zulip, Thunderbird's internal browser breaks and stays black. Right-clicking on these links and selecting ""open in browser"" works fine. Gerson is looking for a way to make left-click work as well, or to make it all work within Thunderbird.\n\nJohn suggested using a content script, adding an event listener for clicks and overriding the default behavior to open links in the default browser. Gerson tested the suggestion and identified that not all links with target=""_blank"" are broken, only some of them that start with ""/user_uploads"". Gerson then split John's suggestion into a background and a content-script code, using browser.runtime.sendMessage to add the Zulip prefix to the broken links and open them in the default browser. Gerson will test this solution for a couple of days to see if any other issues arise."
4,"I think this is going to require some expertise from someone who knows how Thunderbird works internally, but I thought I would ask anyway.\n\nWhen you set mail.tabs.drawInTitlebar=true in the Config Editor, the OS-supplied titlebar -- with the window title and the minimize/maximize/restore/close buttons -- is hidden, and Thunderbird-supplied equivalents for the buttons are added instead. This is great.\n\nAnother thing that happens is that you can DRAG in the toolbox/toolbars at the top of the window to move the Thunderbird window around, as the titlebar is no longer available for this purpose. This is also great.\n\nI have been trying to implement the same behavior for an extension I have been working on. I can hide the OS-supplied titlebar and add my own buttons, and that is working well, but I cannot figure out how to make it so I can drag on my extension's toolbox/toolbars to make the windows for my extension's move.\n\nI seem to have isolated what triggers this behavior for Thunderbird's main window (and the message view and compose windows.) Using the Developer Toolbox, I just add Attribute chromemargin=""0,2,2,2"" (or chromemargin with ANY values, actually) to the <html> tag in the HTML/XUL for the Thunderbird window. It's like magic. Add the chromemargin Attribute, and the titlebar disappears, and you can drag the window around using the toolbox/toolbars at the top of the window. Remove the chromemargin Attribute and the titlebar re-appears and you CANNOT drag the window around using the toolbox/toolbars at the top of the window.\n\nThis does not work for the windows for my extension. Dragging on the toolbox/toolbars does nothing.\n\nI have looked to see what event listeners might be attached to the Toolbox/toolbars that would respond to mouse drags and I cannot see anything. Perhaps it's something in the source code for the app itself? Or some other value stored somewhere, some toolbox or toolbar ID, or CSS class, or attribute or something. 
I just don't know.\n\nI have tried looking at the Thunderbird source code, and I understand it a bit, but not well enough to figure this out.\n\nIf you have any knowledge of this, could you please give me some help?\n\nMany thanks in advance,\n---Mark\nJust in case anyone is interested in this, user morat provided the answer in this article on mozillaZine:\n\nhttp://forums.mozillazine.org/viewtopic.php?t=3121128&sid=d068573e2643604e638528cdb09a5a63","The user, Mark, is trying to replicate the behavior in Thunderbird where the OS-supplied titlebar with the minimize/maximize/restore/close buttons is hidden and replaced with Thunderbird-supplied equivalents, and where you can drag the window around using the toolbox/toolbars at the top of the window. He has managed to hide the OS-supplied titlebar and add his own buttons in an extension, but cannot make the window dragable with his extension's toolbox/toolbars.\n\nHe has found that adding the attribute ""chromemargin='0,2,2,2'"" to the <html> tag in the HTML/XUL for Thunderbird's window makes it work, but this does not work for his extension's windows. He has tried looking for event listeners and can't find anything, and has also looked into Thunderbird's source code but finds it too complex to figure out. He is asking for help to understand this behavior.\n\nA solution was provided by a user named morat in a MozillaZine forum thread: https://forums.mozillazine.org/viewtopic.php?t=312"
5,"My extension uses a third-party script which offers the option of using web workers during some phases of processing. I would like to attempt to use this option since they may offer performance improvements; however, when constructing the Worker object a content security policy exception is generated to the effect of:\n\nContent-Security-Policy: The page’s settings blocked the loading of a resource at blob:moz-extension://95dccb93-c83d-4a14-a837-e4c30420784f/74bc237c-be9c-4ff4-b846-1a87ee995454 (“script-src”).\n\nI am guessing that a meta tag needs to be added in the popup page, but I am not sure of the exact content of the tag, assuming I am on the right track and any of this is permissible in the first place.\n\nThe relevant documentation that I could locate is below.\n\ndeveloper.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy\ndeveloper.mozilla.org/en-US/docs/Web/API/Web_Workers_API/Using_web_workers#content_security_policy\n\nThank you.\nCan you provide a link to the add-on, so we could have a more detailed look?\n\nJohn\nHere is a link to the repository:\n\nhttps://github.com/tmccoid-tech/extract-em\n\nI checked in a change that produces the error (main branch).\n\nThe background on this is that I am switching from the JSZip library for producing .zip files to zip.js since the former was not reliably able to handle larger (~750 MB) output files.\n\nzip.js offers Web Workers for generating the .zip files, but it is not mandatory. No error is produced if the ww option isn't used.\n\nIf I am not mistaken, the error is incurred in the getWebWorker method of the code file zip.js (line 2276) where the Worker object is constructed. 
It is difficult to trace though presumably due to async execution and the fact that this is not a trappable error.\n\nPlease see the screenshot of the error message presented in the debugger:\n\nhttps://github.com/tmccoid-tech/extract-em/issues/5\n\nThanks,\nTom\nPlease ignore this for the time being -- the developer of the zip library reached out to me with a possible solution.","The user is experiencing a Content Security Policy (CSP) error while attempting to use web workers in their extension, which utilizes a third-party script for processing. The error message suggests that the ""script-src"" directive in the CSP header is blocking the loading of a resource from ""blob:moz-extension://"". The user believes that adding a meta tag in the popup page might solve the issue but is unsure of the exact content.\n\nThe issue arises when attempting to create a Worker object in the zip.js library, which offers Web Workers for generating .zip files, and the user is switching from another library due to its inability to handle larger files. The error occurs in the getWebWorker method of zip.js (line 2276). However, the developer of the zip library has suggested a possible solution, which the user has ignored for the time being. The user has provided a link to their repository for further examination: https://github.com/tmccoid-tech/extract-em."
6,"So, I have a content script that I'm writing, but calls to console.log() do not appear on the debug console launched with the inspect button. Messages from the background script and a popup.js I registered via the manifest appear fine. If it matters, the script is a message display script that is invoked by injecting the script the same way it is done in the message-content-script example - not via manifest. Do I need to set something in the debug console or manifest to see these messages?\nThe script has registered an onMessage listener, but on sending a message from the background script I get ""Error: Could not establish connection. Receiving end does not exist"" thrown form the sending (background) code. The script definitely runs when I select the message, as I can see the redboxed text that was used in the message-content-script example. I just can't get it to react to a message form the background. I understand that the script will be sleeping/inactive normally, but shouldn't it wake on receipt of an onMessage? Is it even possible to listen to messages from the background in a content script? Or should I be injecting the code freshly somehow?","It seems like you're having issues with a content script not receiving messages from the background script and not showing console logs in the debug console. Here's a summary of your problem and some potential solutions:\n\n1. Your content script's console.log() calls are not appearing on the debug console launched with the inspect button. Although messages from the background script and a popup.js registered via the manifest are showing up fine, your content script is not.\n\n Possible causes and solutions:\n - Ensure you're opening the correct tab or frame in the debug console.\n - Check that Chrome's `--no-sandbox` flag is not being used, as it can prevent scripts from communicating effectively.\n - Verify that the content script is loaded in the correct context.\n\n2. 
When sending a message from the background script, you're encountering an ""Error: Could not establish connection. Receiving end does not exist"" error. The content script runs when you select the message, but it does not react to messages from the background.\n\n Possible causes and solutions:\n - Ensure both the background script and content script are correctly specified in the manifest file."
7,"I'm a long time TB user, and programmer by trade, but have never done any TB or add-on coding.\nI'm intested to get an idea of if what I would like to develop as an extension is possible, as I don't know what the limitations of add-on capabilities are.\nMy thought was to create a basic CRM - client relationship managment add-on.\nPlease let me know if any of this is not possible with an add-on so that I'm not wasting my time coding until I hit a wall.\n\nNew button on received email (like where delete/spam buttons are) for create folder - this would create a new mail folder under some designated existing folder. Prompt for Folder Name, and creates a mail filter which will store sent emails and incoming emails in the new folder based on domain name (or other entered text).\nIf a matching config exists already for the received email (using the mail filter), the button would be to send this email there instead (like when done reviewing in inbox).\nAnother button for schedule response - when clicked, would prompt for a date and title/desc (can schedule be in TB directly or utilize calendar features?). Once added and the date/time arrives, TB will pop up the text entered. ""Check back with client regarding..."" Most likely, an upcoming list of scheduled items could appear where the current 'task' list would be (when using the calendar in TB).\n\nSo, that is the jist of it. Super basic way to manage when the user wants to respond/check back to a clients email. 
Probably more stuff later, but this would improve my process enough I would not need the hassle of an entirely different CRM software.\n\nThanks, Wes\nHi Wes,\n\nI have seen a few CRM add-ons passing through review, for example\n\n* https://addons.thunderbird.net/addon/rt-archive-emails-to-crm/\n* https://addons.thunderbird.net/addon/kundenmeister-mail-export/\n\nThose seem to work only with specific systems, but they might give an impression of what is possible.\n\nYou can:\n* have a button in the message header area\n* you can create folders\n* you can prompt the user in popus or in the options page for configuration data\n* you cannot yet create a ""real"" email filter (one which shows up in Thunderbirds filters), but you can react on incoming email and do something with new mail\n* you can implement a reminder functionality, or use https://addons.thunderbird.net/addon/mailmindr/ (or cooperate with that add-on: add-ons can communicate with other add-ons if they implement a public API)\n\nHope that helps,\nJohn","Wes is a long-time Thunderbird user and programmer who is interested in developing a CRM add-on for Thunderbird. He wants to know the capabilities and limitations of adding an extension to Thunderbird. His proposed features include:\n\n1. A new button on received emails for creating a new mail folder under an existing folder. The button will prompt the user to provide a folder name and create a mail filter that will store sent and incoming emails based on the domain name or entered text. If a matching configuration exists, the button will direct the email to that folder.\n2. Another button for scheduling responses. When clicked, it will prompt the user to enter a date and title/description. Once the date/time arrives, Thunderbird will pop up the entered text as a reminder to check back with the client. 
The upcoming scheduled items can appear in the place of the current 'task' list.\n\nJohn, a reviewer, acknowledges that there are existing CRM add-ons for Thunderbird but notes that they only work with specific systems. He confirms that Wes can create a button in the message header area, create folders, prompt the user for configuration data, and react to incoming emails."
8,"Hello Mozillians and friends of Thunderbird,\n\nThe next Thunderbird Council election is coming up. As in previous years, people eligible to vote and to stand as candidates need to be on an electoral roll. Contributors are eligible to be an elector if they have contributed 10 or more hours per year of involvement in the Thunderbird project. To illustrate this, if a contributor first became involved three years ago then the expected contribution is at least 30 hours, if two years ago then at least 20 hours, and if one year ago or less then at least 10 hours. Contributors may self-nominate, and on request must submit examples that illustrate their level of involvement.\n\nContribution can be any of the following:\n\ntriaging bugs;\nfixing, or reviewing code changes;\nproviding support on SUMO or other forums;\nlocalizing Thunderbird, related websites, or extensions;\ntesting, writing, or reviewing add-ons;\nfurthering the Thunderbird cause by constructive contributions to tb-planning;\nor public relations, including writing blog posts, posting on social media, artwork, etc.\nPeople working for Mozilla who dedicate time to Thunderbird are also eligible.\n\nThis simple rule will spare us any complicated metric, such as “x number of patches submitted”, “x bugs submitted” or “x SUMO comments written”, etc. If we have missed a group of contributors, or if you are unsure if your contribution qualifies, please reach out to the Council (at council@thunderbird.net), or me personally.\n\nSince we already know some of the people who are dedicating time, their most precious asset, to Thunderbird, we have prepared a preliminary electoral roll based on previous years. The roll is available on GItHub. 
New additions are listed at the bottom under ""(new for 2022)"".\nWe ask voters who were previously added to the electoral roll to check for themselves whether they still fulfill the criteria mentioned above and inform us if they are no longer contributing to the Thunderbird project at the required level or do not wish to participate in the election. We also kindly ask you to notify others who may be eligible to self-nominate if you notice someone is missing from the electoral roll.\nTo reiterate, people who are not on the preliminary electoral roll can contact any member of the Council (at council@thunderbird.net), mentioning their contribution and asking to be added to the electoral roll. The idea is not to make it hard to be on the electoral roll.\n\nIn general, the election process will work as in past years (as documented in the bylaws), follow-up emails will include more instructions on the exact timeline, but you will have until at least November 28th, 2022 to be added to the electoral roll.\nThe election itself will be run by neutral 3rd parties: Peter Saint-Andre, with G. Matthew Rice helping to moderate the list.\n\nThanks,\nAndrei Hajdukewycz\nThunderbird Council Secretary","The Thunderbird Council election is approaching, and to be eligible to vote or run as a candidate, individuals must be on an electoral roll. Contributors who have at least 10 hours of involvement per year in the Thunderbird project during the past year are eligible. Contributions can include triaging bugs, fixing or reviewing code changes, providing support, localizing Thunderbird, testing, writing, or reviewing add-ons, furthering the Thunderbird cause, or public relations work. Mozilla employees dedicating time to Thunderbird are also eligible. A simple rule is used instead of complex metrics to determine eligibility. The preliminary electoral roll is available on GitHub, and those who are not on it can request to be added. New additions and changes are listed at the bottom. 
Voters and eligible individuals should check their status and inform the Council if they no longer meet the eligibility criteria. The election process will follow past years' bylaws, and the election will be run by neutral 3rd parties."
9,"Notes:\n* FIXED indicates resolved in DAILY development builds in last 24 hours. May take a week or more to be fixed beta. https://www.thunderbird.net/notes/beta lists issues definitely resolved in BETA.\n* A bug may change from FIXED to some other resolution in the time period, so this report might list bugs whose current resolution is something other than fixed.\n* Excludes bugs whose version field is 115 at the time this bug list was generated.\n* Includes bugs reported as version 128 and future versions, as these may also exist in beta.\n* PINNED 📌 posts at https://thunderbird.topicbox.com/groups/beta has links to more bug lists.\n* Report runs at 11am UTC and includes the following, but may be empty if no activity in the 24 hour period:\n** New bugs in last 24 hours\n** Resolved bugs in last 24 hours\n\nThis search was scheduled by vseerror@fastmail.com.\n\nNew bugs in last 24 hours\nID\tType\tSev\tPri\tPlt\tAssignee\tStatus\tResolution\tSummary\n1907113\tdefect\t--\t--\tx86_64\tishikawa@yk.rim.or.jp\tREOPENED\t---\tMany C-C TB xpcshell tests fail, Hit MOZ_CRASH(assertion `left != right` failed: src and dst must not alias ) at mozilla/netwerk/base/idna_glue/src/lib.rs:66\n1907115\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tNEW\t---\tTwo Factor OAuth (for text and email) not working - Office365\n1907245\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tExceptionCode: c0000005 (Access violation)\n1907248\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tSending format ""Only Plain Text"" should switch off HTML editor\n1907249\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tRefresh Calendar drop-down items not rendered while editing remote calendars\n1907255\tenhancement\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tAllow calendar invitations to be encrypted\n1907262\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tScrolling hides date headers in Calendar Week 
view\n1907270\tdefect\tS3\t--\tDesktop\tnobody@mozilla.org\tNEW\t---\tCannot delete emails using the DEL button while word search mode is activated\n1907282\tdefect\t--\t--\tUnspecified\tnobody@mozilla.org\tUNCONFIRMED\t---\tRNP failed to parse signature in certificate breaking OpenPGP support\nResolved bugs in last 24 hours\nID\tType\tSev\tPri\tPlt\tAssignee\tStatus\tResolution\tSummary\n1906835\tdefect\t--\t--\tUnspecified\tgeoff@thunderbird.net\tRESOLVED\tFIXED\tnsMsgDBFolder backup database isn't closed if the normal database is already closed",This report summarizes the new and resolved bugs found in Thunderbird's Beta version in the last 24 hours followed by a note that the report was generated by vseerror@fastmail.com.\n\nNew bugs include:\n1. A C-C TB xpcshell test failure issue with the assertion failing at `mozilla/netwerk/base/idna_glue/src/lib.rs:66` (ID: 1907113)\n2. Two Factor OAuth not working with Office365 (ID: 1907115)\n3. An exception with an Access violation (ID: 1907245)\n4. Unconfirmed issue with switch off HTML editor (ID: 1907248)\n5. Unrendered drop-down items in Calendar while editing remote calendars (ID: 1907249)\n6. Unconfirmed issue to Allow calendar invitations to be encrypted (ID: 1907255)\n7. Scrolling hiding date headers in Calendar Week view (ID


Large language models today are consumed in one of several ways:

+ As API endpoints for proprietary models hosted by OpenAI, Anthropic, or major cloud providers
+ As model artifacts downloaded from HuggingFace’s Model Hub and/or trained/fine-tuned using HuggingFace libraries and hosted on local storage
+ As model artifacts available in a format optimized for local inference, typically GGUF, and accessed via applications like llama.cpp or ollama
+ As ONNX artifacts, an open format designed for exchanging models between backend ML frameworks

We use API endpoints and local storage in Lumigator. 

In [15]:
# Let's take a look at all available deployments
ld.get_deployments()

{
  "total": 44,
  "items": [
    {
      "id": "bd6e1a72-037e-4b76-ab5c-53adac282b1b",
      "name": "summarizer",
      "description": "Text summarization model",
      "status": "created",
      "created_at": "2024-07-29T20:23:49.330389Z",
      "updated_at": null
    },
    {
      "id": "3052d787-0773-4649-9e4a-b93fd31b5056",
      "name": "summarizer",
      "description": "Text summarization model",
      "status": "created",
      "created_at": "2024-07-31T16:55:17.888442Z",
      "updated_at": null
    },
    {
      "id": "6dc6f7d8-19cc-4c64-ab2b-40d21b686b36",
      "name": "summarizer",
      "description": "Text summarization model",
      "status": "created",
      "created_at": "2024-07-31T17:01:09.772139Z",
      "updated_at": null
    },
    {
      "id": "6ae90573-4555-4c0a-8158-b14eb58665bf",
      "name": "summarizer",
      "description": "Text summarization model",
      "status": "created",
      "created_at": "2024-07-31T17:03:12.373845Z",
      "updated_at": nu

<Response [200]>

In [107]:
## Perform Ground Truth Generation with BART

# UPDATE DEPLOYMENT ID 
deployment_id = "540b155e-40c1-4330-a17f-0eb5b0cc7771"

bart_responses = []

for prompt in df['examples'][0:10]:
    response = ld.get_bart_ground_truth(deployment_id, prompt)
    response_dict = json.loads(response.text)
    results = response_dict.get('deployment_response', {}).get('result', 'No result found')
    # list.append takes a single argument, so store each (prompt, summary) pair as a tuple
    bart_responses.append((prompt, results))

Request failed: HTTPConnectionPool(host='127.0.0.1', port=80): Read timed out. (read timeout=10)


ReadTimeout: HTTPConnectionPool(host='127.0.0.1', port=80): Read timed out. (read timeout=10)
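
The read timeout above is typical of a model endpoint that is still warming up or busy. One common way to cope is to retry the call with an increasing delay. The helper below is a minimal sketch of that idea, not part of the Lumigator client: `call_with_retries` and `flaky_summarize` are hypothetical names, and the flaky function only stands in for a slow endpoint.

```python
import time


def call_with_retries(fn, *args, attempts=3, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call fn(*args), retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return fn(*args)
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            # wait base_delay, then 2x, then 4x ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical stand-in for a slow endpoint: fails twice, then succeeds.
calls = {"count": 0}


def flaky_summarize(text):
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("read timed out")
    return f"summary of: {text}"


result = call_with_retries(flaky_summarize, "hello",
                           base_delay=0.01, retry_on=(TimeoutError,))
print(result)
```

In the cell above, the same wrapper could be placed around `ld.get_bart_ground_truth(deployment_id, prompt)` with `retry_on=(requests.exceptions.ReadTimeout,)`, assuming the client lets that exception propagate as the traceback suggests.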

In [None]:
bart_results_df = pd.DataFrame(bart_responses, columns=['original', 'bart_response'])
bart_results_df

In [None]:
# Combine results and examine multiple versions of ground truth
merged_df = pd.merge(bart_results_df, mistral_results_df, on='original', how='outer')
merged_df.to_csv('ground_truth.csv', index=False)

In [None]:
merged_df 

In [None]:
# Now that we have the data, let's save it to the cluster so we can use it later on
ld.dataset_upload("ground_truth.csv")

In [None]:
# And let's check that data
ld.get_datasets()

## Loading Data

The following dataset is already in the format that we need as input: 
- one field called `examples` containing the text to summarize
- one field called `ground_truth` containing the reference summaries to compare the models' outputs against

Note that you can load many different types of file formats in a similar way (see https://huggingface.co/docs/datasets/loading#local-and-remote-files)
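
Before uploading a file, it can help to confirm that it really has the two fields described above. The following is a small sketch using only the standard library; `check_dataset` and the sample row are made up for illustration and are not part of Lumigator.

```python
import csv
import io

# The two fields the evaluation expects, as described above
REQUIRED_COLUMNS = {"examples", "ground_truth"}


def check_dataset(csv_text):
    """Return the parsed rows, or raise ValueError if a required column is missing."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    return list(reader)


# Tiny made-up sample in the same two-column shape
sample = (
    "examples,ground_truth\n"
    '"Hi all, is PDF.js still bundled?","A question about PDF.js."\n'
)
rows = check_dataset(sample)
print(len(rows), rows[0]["ground_truth"])
```

For a file on disk, the same check works on the text returned by `open(path, newline="").read()`.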

In [None]:
# TODO: HuggingFace datasets versus CSV, and lm-buddy prefixes

## Dataset Upload

In [None]:
dataset_name = "thunderbird.csv"
dataset_id = "f5d54efa-247d-4910-9393-f6003da9fb68" # thunderbird pre-saved dataset HuggingFace

r = ld.dataset_info(dataset_id)

## Model Selection

What you see below are different lists of models we have already tested for the summarization task.
The `models` variable at the end provides you with a selection, but you can choose any combination of them.

In [None]:
enc_dec_models = [
    'hf://facebook/bart-large-cnn',
    'hf://mikeadimech/longformer-qmsum-meeting-summarization', 
    'hf://mrm8488/t5-base-finetuned-summarize-news',
    'hf://Falconsai/text_summarization',
]

dec_models = [
    'mistral://open-mistral-7b',
]

gpts = [
    "oai://gpt-4o-mini",
    "oai://gpt-4-turbo",
    "oai://gpt-3.5-turbo-0125"  
]

# TODO: add llamafile

models = [
    enc_dec_models[0], # bart-large-cnn
    dec_models[0], # open-mistral-7b
    gpts[0] # gpt-4o-mini
]

# show selected models

models
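
Each entry in these lists is a string of the form `prefix://model-name`. The sketch below shows one way to split such a string and label where the model would be served from. The labels are our reading of the prefixes (`hf` for the HuggingFace Hub, `mistral` and `oai` for hosted APIs) rather than anything Lumigator documents here, and `split_model_uri` is a hypothetical helper.

```python
def split_model_uri(uri):
    """Split 'scheme://name' into (scheme, name)."""
    scheme, sep, name = uri.partition("://")
    if not sep or not scheme or not name:
        raise ValueError(f"not a model URI: {uri!r}")
    return scheme, name


# Assumed meaning of each prefix, for illustration only
SCHEME_LABELS = {
    "hf": "HuggingFace Hub model",
    "mistral": "Mistral API endpoint",
    "oai": "OpenAI API endpoint",
}

selected = [
    "hf://facebook/bart-large-cnn",
    "mistral://open-mistral-7b",
    "oai://gpt-4o-mini",
]
for uri in selected:
    scheme, name = split_model_uri(uri)
    print(f"{name}: {SCHEME_LABELS.get(scheme, 'unknown scheme')}")
```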

In [None]:
# TODO: introduce metrics
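
This notebook does not yet say which metrics the evaluation reports. To give a feel for how a generated summary can be scored against a ground-truth summary, here is a rough sketch of a unigram-overlap F1 score, similar in spirit to ROUGE-1. It splits on whitespace only and is an illustration, not the implementation Lumigator uses; `unigram_f1` and the two example sentences are made up.

```python
from collections import Counter


def unigram_f1(reference, candidate):
    """F1 of word overlap between a reference summary and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Counter intersection keeps the minimum count of each shared word
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())  # share of candidate words that match
    recall = overlap / sum(ref_counts.values())      # share of reference words recovered
    return 2 * precision * recall / (precision + recall)


reference = "the add-on shows pdf thumbnails"
candidate = "the add-on shows thumbnails"
print(round(unigram_f1(reference, candidate), 3))
```

A score of 1.0 means the two texts use exactly the same words, and 0.0 means they share none. Word overlap says nothing about whether the summary is faithful, which is one reason to read the outputs as well as the scores.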

## Run Evaluations

In [None]:
# change the following to 0 to use all samples in the dataset
max_samples = 10

responses = []
for model in models:
    descr = f"Testing {model} summarization model on {dataset_name}"
    responses.append(ld.experiments_submit(model, team_name, descr, dataset_id, max_samples))

In [None]:
# TODO: discuss the Ray dashboard / show the dashboard

### Track evaluation jobs

Run the following to track your evaluation jobs.

- *NOTE*: you won't be able to run other cells while this one is running. However, you can interrupt it whenever you want by clicking on the "stop" button above and run it at a later time.

In [None]:
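import time
from IPython.display import clear_output

job_ids = [ld.get_resource_id(r) for r in responses]

# keep polling while any job is still in progress
wip = ld.show_experiment_statuses(job_ids)
while wip:
    time.sleep(5)   # wait 5 seconds between checks
    clear_output()  # redraw the status display in place
    wip = ld.show_experiment_statuses(job_ids)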
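
The loop above runs until every job finishes or you press "stop". If you would rather have it give up on its own, a polling helper with a deadline is one option. The sketch below assumes only what the loop implies, namely that the status call returns a truthy value while jobs are in progress. `wait_until_done` is a hypothetical helper and the lambda stands in for `ld.show_experiment_statuses(job_ids)`.

```python
import time


def wait_until_done(check_in_progress, poll_seconds=5.0, timeout_seconds=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until check_in_progress() is falsy. Return True if done, False on timeout."""
    deadline = clock() + timeout_seconds
    while check_in_progress():
        if clock() >= deadline:
            return False  # still running when the deadline passed
        sleep(poll_seconds)
    return True


# Stand-in status check: reports "in progress" twice, then "done".
remaining = [True, True, False]
finished = wait_until_done(lambda: remaining.pop(0),
                           poll_seconds=0.01, timeout_seconds=5)
print(finished)
```

A `False` return only means the helper stopped waiting. The jobs keep running on the cluster and can be checked again later.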

## Show evaluation results

In [None]:
# after the jobs complete, gather evaluation results
eval_results = []
for job_id in job_ids:
    eval_results.append(ld.experiments_result_download(job_id))

# convert results into a pandas dataframe
eval_table = ld.eval_results_to_table(models, eval_results)

In [None]:
eval_table

In [None]:
eval_results[0]

## Evaluation Results

In [None]:
# TODO: add eval discussion