# wordslab-notebooks-lib.jupyterlab

> Access wordslab-notebooks Jupyterlab extension version, current notebook path, json content and cell id, and create or update cells.

## Work together with AI in a Jupyterlab notebook - the Solveit method

A Jupyter notebook is a convenient way to build context for an LLM one cell at a time: you work in a fully editable conversation while interacting with both AI and code.

**Jeremy Howard** and his team at **Answer.ai** explored how to work efficiently in this kind of conversation: they developed a method and platform called Solveit.

https://solve.it.com/

We would like to replicate this approach to working with AI in a wordslab-notebooks environment.

Here is how we chose to do it:
- in a jupyterlab notebook, there are two types of cells: markdown and code
- we want to simulate a third type of cell: a "prompt" cell
- the content of this cell is a prompt (text in markdown format) which is sent to an LLM when the cell is executed, along with the text of all the cells situated above it in the notebook (the context)
- the LLM response is streamed just below and formatted as markdown.

To simulate this "prompt" cell we need to develop a **Jupyterlab frontend extension** which implements the following behaviors:
- three buttons are added to the cell toolbar: "note", "prompt", "code"
- a click on one of these buttons changes the type of the cell
  - "note" selects a classic markdown cell
  - "prompt" selects a code cell, modified with the special "prompt behavior" defined below
  - "code" selects a classic code cell
- a "prompt" cell is distinguished from a regular code cell by a metadata property registered in the ipynb file
- each cell type is visualized by a specific color in the left border of the cell
  - "note" cell has a green border
  - "prompt" cell has a red border
  - "code" cell has a blue border
- the "prompt" cell is a code cell with the following modified behaviors
  - the code syntax highlighting is replaced by markdown syntax highlighting when the user types text in this cell
  - when the user executes this cell, the frontend extension does the following
    - calls the kernel to inject the following variables
      - __notebook_path with the path and name of the notebook in the workspace
      - __notebook_content with the full json representation of the notebook
      - __cell_id with the id of the current cell
    - then calls the kernel to execute a specific chat(message) python function
      - where the message parameter is the content of the cell
      - and the content of the notebook above the current cell is included as context
  - the python chat() function streams the response tokens from the LLM to the output section of the code cell, with markdown rendering
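
The context rule described above (every cell situated above the executing cell, and nothing below it) can be sketched in plain Python. This is a minimal illustration, not the extension's code: it assumes the notebook JSON gives each cell an `id`, and the metadata key `wordslab_cell_type` used to mark prompt cells is a hypothetical name, since the actual property name is not given here.

```python
def cells_above(notebook_content, cell_id):
    """Return the cells located strictly above the cell with the given id."""
    above = []
    for cell in notebook_content["cells"]:
        if cell.get("id") == cell_id:
            return above
        above.append(cell)
    # Unknown cell id: this sketch falls back to the whole notebook
    return above

def is_prompt_cell(cell, key="wordslab_cell_type"):
    """A prompt cell is a code cell flagged in its metadata (key name is hypothetical)."""
    return cell["cell_type"] == "code" and cell.get("metadata", {}).get(key) == "prompt"

example_notebook = {
    "cells": [
        {"id": "a1", "cell_type": "markdown", "metadata": {}, "source": "# Title"},
        {"id": "b2", "cell_type": "code", "metadata": {}, "source": "x = 1"},
        {"id": "c3", "cell_type": "code",
         "metadata": {"wordslab_cell_type": "prompt"}, "source": "Explain x"},
        {"id": "d4", "cell_type": "markdown", "metadata": {}, "source": "Below"},
    ]
}

# Only the two cells above the prompt cell "c3" are part of the context
context_ids = [cell["id"] for cell in cells_above(example_notebook, "c3")]
print(context_ids)
```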

See the section "Develop a Jupyterlab frontend extension" at the bottom of this page to understand how the extension was developed.

## Install the Jupyterlab extension - wordslab-notebooks-lib

If you want to use "prompt" cells, you will first need to install the Jupyterlab frontend extension:
- activate your Jupyterlab python virtual environment
- **pip install wordslab-notebooks-lib**
- restart your Jupyterlab server

The extension is **already pre-installed in the wordslab-notebooks environment**.

To be clear: the wordslab-notebooks-lib package contains both the Javascript Jupyterlab frontend extension AND the python library which is loaded in the python kernel.

The Jupyterlab frontend extension is reloaded and re-initialized each time you refresh your browser page: 
- to check if the extension is installed and running, open the browser console and look for the message 'Wordslab notebooks extension vx.y.z activated'
- hit the refresh button if you encounter a bug and the extension stops working
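
Besides the browser console, the kernel side offers a second check. The notebook outputs further down show a global named `__jupyterlab_extension_version`, so a minimal sketch can look it up in a namespace. When exactly the extension injects this variable is an assumption here, and `extension_version` is a hypothetical helper, not part of the library.

```python
def extension_version(namespace):
    """Return the injected extension version, or None if it was never injected."""
    return namespace.get("__jupyterlab_extension_version")

# In a notebook you would pass globals(); two stand-in namespaces are used here
print(extension_version({"__jupyterlab_extension_version": "0.0.11"}))
print(extension_version({}))
```

A `None` result suggests that the frontend extension is not installed or has not run yet in this kernel session.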

## Communicate with the Jupyterlab extension

In [3]:
#| export
import nbformat

In [116]:
class JupyterlabNotebook:
    @property
    def path(self):
        return globals()["__notebook_path"]

    @property
    def content(self):
        return nbformat.from_dict(globals()["__notebook_content"])
    
    @property
    def cell_id(self):
        return globals()["__cell_id"]

In [117]:
notebook = JupyterlabNotebook()

In [118]:
notebook.path

'wordslab-notebooks-lib/nbs/02_jupyterlab.ipynb'

In [121]:
notebook.content.metadata

{'kernelspec': {'display_name': 'wordslab-notebooks-lib',
  'language': 'python',
  'name': 'wordslab-notebooks-lib'},
 'language_info': {'codemirror_mode': {'name': 'ipython', 'version': 3},
  'file_extension': '.py',
  'mimetype': 'text/x-python',
  'name': 'python',
  'nbconvert_exporter': 'python',
  'pygments_lexer': 'ipython3',
  'version': '3.12.12'}}

In [120]:
notebook.cell_id

'd16ad869-d651-40bd-af2c-623d82b4edf0'

In [97]:
__cell_id

'59347d6e-d1f7-4d23-b041-143e42887f6d'

In [None]:
6+6

In [98]:
__jupyterlab_extension_version

'0.0.11'

In [99]:
__notebook_path

'wordslab-notebooks-lib/nbs/01_context.ipynb'

In [100]:
__notebook_content['metadata']

{'kernelspec': {'display_name': 'wordslab-notebooks-lib',
  'language': 'python',
  'name': 'wordslab-notebooks-lib'},
 'language_info': {'codemirror_mode': {'name': 'ipython', 'version': 3},
  'file_extension': '.py',
  'mimetype': 'text/x-python',
  'name': 'python',
  'nbconvert_exporter': 'python',
  'pygments_lexer': 'ipython3',
  'version': '3.12.12'}}

In [101]:
__cell_id

'bd99d9c8-a4b6-4b80-b4b7-a43535b2566d'

In [2]:
#| default_exp context

## Develop a Jupyterlab frontend extension

### Understand Jupyterlab kernels and frontend extensions

Jupyter kernels technical implementation details

https://chatgpt.com/share/692bea08-4510-8004-b9ab-c02feeb97c08

Jupyterlab extension development tutorial

https://jupyterlab.readthedocs.io/en/latest/extension/extension_tutorial.html

### Initialize the components of a frontend extension

The source code of the Jupyterlab frontend extension can be found in the following files:

Typescript source code, dependencies, and compilation config:

- `src/index.ts`
- `package.json`
- `tsconfig.json`
- `.yarnrc.yml`

Extension manifest and Javascript compiled code:

- `wordslab_notebooks_lib/labextension`
  - `package.json`
  - `static/remoteEntry.97d57e417eaf8ebadeb6.js`

This is how the extension files are included in the python package:

- `MANIFEST.in` 

```
include install.json
include package.json
recursive-include wordslab_notebooks_lib/labextension *

graft wordslab_notebooks_lib/labextension
graft src
```

This is how the extension files are installed in the Jupyterlab extensions directory when the python package is installed:

- `pyproject.toml`

```toml
[tool.setuptools]
include-package-data = true 

[tool.setuptools.data-files]
"share/jupyter/labextensions/wordslab-notebooks-lib" = [
  "wordslab_notebooks_lib/labextension/package.json",
  "install.json"
]
"share/jupyter/labextensions/wordslab-notebooks-lib/static" = [
  "wordslab_notebooks_lib/labextension/static/*"
]
```
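
To check where these files end up, here is a minimal sketch. It assumes that setuptools data-files are installed relative to the environment prefix (`sys.prefix`), which is the usual behavior in a virtual environment. `labextension_dir` and `describe_install` are hypothetical helpers written for this illustration.

```python
import sys
from pathlib import Path

def labextension_dir(prefix=sys.prefix, name="wordslab-notebooks-lib"):
    """Directory where the data-files mapping above places the prebuilt extension."""
    return Path(prefix) / "share" / "jupyter" / "labextensions" / name

def describe_install(prefix=sys.prefix):
    """Report whether the manifest and the static bundle directory are present."""
    root = labextension_dir(prefix)
    return {
        "root": str(root),
        "has_manifest": (root / "package.json").is_file(),
        "has_static": (root / "static").is_dir(),
    }

# Inspect the currently active python environment
print(describe_install())
```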

This is how the command `jupyter labextension develop` finds the directory where the extension files live:

- `wordslab_notebooks_lib/__init__.py`

```python
def _jupyter_labextension_paths():
    return [{
        "src": "labextension",
        "dest": "wordslab-notebooks-lib"
    }]
```
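
As far as we understand, `jupyter labextension develop` imports the package, calls `_jupyter_labextension_paths()`, and resolves each `src` entry relative to the package directory. The sketch below illustrates that resolution step only; `resolve_labextension_paths` is a hypothetical helper and not part of Jupyterlab.

```python
from pathlib import Path

def resolve_labextension_paths(package_dir, declared_paths):
    """Turn each relative 'src' entry into a path inside the package directory."""
    return [(Path(package_dir) / entry["src"], entry["dest"]) for entry in declared_paths]

# Same declaration as returned by _jupyter_labextension_paths() above
declared = [{"src": "labextension", "dest": "wordslab-notebooks-lib"}]
for src, dest in resolve_labextension_paths("wordslab_notebooks_lib", declared):
    print(src.as_posix(), "->", dest)
```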

This is how the python package is identified as a Jupyterlab extension on PyPI:

- `pyproject.toml`

```toml
classifiers = [ "Framework :: Jupyter :: JupyterLab :: Extensions :: Prebuilt" ]
```

### Install the Jupyterlab frontend extension in development mode

Open a Terminal

```bash
cd $WORDSLAB_WORKSPACE/wordslab-notebooks-lib
source $JUPYTERLAB_ENV/.venv/bin/activate

# Install Javascript dependencies
jlpm install

# Build TypeScript extension
jlpm build

# Register the extension with JupyterLab during development
# jupyter labextension develop . --overwrite
# Replace any installed copy (directory or symlink) with a symlink to the build output
rm -rf $JUPYTERLAB_ENV/.venv/share/jupyter/labextensions/wordslab-notebooks-lib
ln -s $WORDSLAB_WORKSPACE/wordslab-notebooks-lib/wordslab_notebooks_lib/labextension/ $JUPYTERLAB_ENV/.venv/share/jupyter/labextensions/wordslab-notebooks-lib

# Verify extension is found
jupyter labextension list
```
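
If `jupyter labextension list` does not show the extension, it can help to confirm that the entry in the labextensions directory really is a symlink to the build output. A minimal sketch, assuming the directory layout used in the commands above; `dev_link_target` is a hypothetical helper.

```python
import os
from pathlib import Path

def dev_link_target(labextensions_dir, name="wordslab-notebooks-lib"):
    """Return the resolved target if the extension entry is a symlink, else None."""
    entry = Path(labextensions_dir) / name
    if entry.is_symlink():
        return entry.resolve()
    return None

# JUPYTERLAB_ENV is the variable used in the commands above; it may be unset elsewhere
env_root = os.environ.get("JUPYTERLAB_ENV")
if env_root:
    print(dev_link_target(Path(env_root) / ".venv" / "share" / "jupyter" / "labextensions"))
```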

### Test the Jupyterlab frontend extension 

After installing the extension in development mode once, you can iterate fast:
- update the code in `src/index.ts`
- build the extension with `jlpm build`

```bash
cd $WORDSLAB_WORKSPACE/wordslab-notebooks-lib
source $JUPYTERLAB_ENV/.venv/bin/activate

# Build TypeScript extension
jlpm build
```
- **refresh** the Jupyterlab single page app in your browser
- test the updated extension

No need to reinstall the extension or to restart Jupyterlab itself, just refresh your browser page.

#### Install the python client library in development mode

```bash
cd $WORDSLAB_WORKSPACE/wordslab-notebooks-lib
source .venv/bin/activate

# Install nbdev and twine
# Install the wordslab-notebooks-lib python library in editable mode
uv sync --dev
```

#### Generate the python library from the source notebooks

```bash
cd $WORDSLAB_WORKSPACE/wordslab-notebooks-lib
source .venv/bin/activate

# Export notebooks to Python modules
nbdev_export

# Clean the notebooks before committing to git
nbdev_clean
```

#### Test the python client library

After installing the client library in development mode once, you can iterate fast:
- create a notebook using the kernel "wordslab-notebooks-lib"
- restart the kernel if needed
- import wordslab_notebooks_lib
- use the functions defined in the library

#### Publish the extension to pypi when ready

Create a file called ~/.pypirc with your token details. It should have these contents:

```ini
[pypi]
username = __token__
password = your_pypi_token
```

Then execute the following commands:

```bash
cd $WORDSLAB_WORKSPACE/wordslab-notebooks-lib
source .venv/bin/activate

# Bump the version number
nbdev_bump_version

# Publish to PyPI
nbdev_pypi
```
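
Before publishing, the `~/.pypirc` file can be sanity-checked with the standard library, since it uses the INI format. This is a minimal sketch: `check_pypirc` is a hypothetical helper, and the `pypi-` prefix test relies on the usual shape of PyPI API tokens.

```python
import configparser

def check_pypirc(text):
    """Return a list of problems found in the [pypi] section of a .pypirc file."""
    # RawConfigParser avoids interpolation errors if the token contains '%'
    parser = configparser.RawConfigParser()
    parser.read_string(text)
    if not parser.has_section("pypi"):
        return ["missing [pypi] section"]
    problems = []
    if parser.get("pypi", "username", fallback=None) != "__token__":
        problems.append("username should be __token__ for token authentication")
    if not parser.get("pypi", "password", fallback="").startswith("pypi-"):
        problems.append("password does not look like a PyPI API token")
    return problems

# Stand-in content; in practice read the text from ~/.pypirc
sample = "[pypi]\nusername = __token__\npassword = pypi-EXAMPLE\n"
print(check_pypirc(sample))
```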

### Develop a python client for the extension

In [None]:
import asyncio
from ipykernel.comm import Comm
from fastcore.utils import *
from fastcore.xml import to_xml, Src, Source, Out, Outs, Cell

#### Get notebook cells

In [4]:
#| export
def _notebook_data():
    future = asyncio.Future()
    
    def on_msg(msg):
        if not future.done():
            future.set_result(msg['content']['data'])
    
    comm = Comm(target_name='wordslab_notebook_comm', show_warning=False)
    comm.on_msg(on_msg)
    comm.send({'request': 'get_notebook_data'})

    return future

async def get_notebook_data(timeout=1):
    future = _notebook_data()
    try:
        return await asyncio.wait_for(future, timeout=timeout)
    except asyncio.TimeoutError:
        try:
            future = _notebook_data()
            return await asyncio.wait_for(future, timeout=timeout)
        except asyncio.TimeoutError:   
            raise TimeoutError("Failed to receive notebook context from Jupyterlab frontend: install wordslab-notebooks-lib extension, or increase the timeout parameter in seconds, or try to refresh the web page.")
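
The retry logic above can be tried outside a notebook. The sketch below generalizes it to a configurable number of attempts and replaces the comm round trip with a stand-in, so `request_with_retry` and `fake_frontend` are hypothetical names and not part of the library.

```python
import asyncio

async def request_with_retry(make_future, timeout=1, attempts=2):
    """Await a fresh future per attempt; raise TimeoutError after the last one."""
    for attempt in range(attempts):
        try:
            return await asyncio.wait_for(make_future(), timeout=timeout)
        except asyncio.TimeoutError:
            if attempt == attempts - 1:
                raise TimeoutError(f"No reply after {attempts} attempts of {timeout}s each")

def fake_frontend(reply_on_call):
    """Stand-in for the comm round trip: requests are answered from the Nth call on."""
    calls = {"count": 0}
    def make_future():
        calls["count"] += 1
        future = asyncio.get_running_loop().create_future()
        if calls["count"] >= reply_on_call:
            future.set_result({"cell_id": "abc", "notebook": {"cells": []}})
        return future
    return make_future

# The first request times out, the second one is answered.
# Inside a notebook, use `await request_with_retry(...)` instead of asyncio.run.
result = asyncio.run(request_with_retry(fake_frontend(reply_on_call=2), timeout=0.05))
print(result["cell_id"])
```

Creating a new future for each attempt matters: a future that has already timed out is cancelled by `asyncio.wait_for` and cannot be awaited again.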

In [5]:
data = await get_notebook_data()
data["cell_id"]

TimeoutError: Failed to receive notebook context from Jupyterlab frontend: install wordslab-notebooks-lib extension, or increase the timeout parameter in seconds, or try to refresh the web page.

In [None]:
data = await get_notebook_data()
data["cell_id"]

In [None]:
data = await get_notebook_data()
notebook_content_dict = data["notebook"]
executing_cell_id = data["cell_id"]
executing_cell_id

#### Explore the notebook format

https://nbformat.readthedocs.io/en/latest/format_description.html

In [None]:
nb = nbformat.from_dict(notebook_content_dict)

code_language = nb.metadata.language_info.name
print("> " + code_language + " notebook")

for cell in nb.cells:
    if cell.id == executing_cell_id: break
        
    is_markdown = cell.cell_type == "markdown"
    is_code = cell.cell_type == "code"
    is_raw = cell.cell_type == "raw"

    print("---------------------")
    print("cell", cell.id, cell.cell_type)
    print("---------------------")
    if is_markdown:
        print(cell.source[:100])
    elif is_code:
        print(f"```{code_language}\n" + cell.source[:100] + "\n```")
    elif is_raw:
        print(cell.source[:100])
    # execution_count is None for cells that were never executed
    if is_code and cell.execution_count and len(cell.outputs)>0:
        print("---------------------")
        print("cell outputs", cell.id, cell.execution_count)
        print("---------------------")
        for output in cell.outputs:
            if output.output_type == "stream":
                print(f"<{output.name}>")
                print(output.text[:100])
                print(f"</{output.name}>")
            elif output.output_type == "display_data":
                print("<display>")
                if "data" in output:
                    print("  <data>")
                    print(repr(output.data))
                    print("  </data>")
                if "metadata" in output and len(output.metadata)>0:
                    print("  <metadata>")
                    print(repr(output.metadata))
                    print("  </metadata>")
                print("</display>")
            elif output.output_type == "execute_result":
                print("<result>")
                if "data" in output:
                    print("  <data>")
                    print(output.data)
                    print("  </data>")
                if "metadata" in output and len(output.metadata)>0:
                    print("  <metadata>")
                    print(output.metadata)
                    print("  </metadata>")
                print("</result>")
            elif output.output_type == "error":
                print("<error>")
                print(output.ename)
                print(output.evalue)
                for frame in output.traceback:
                    print(frame)
                print("</error>")
        print("---------------------")

#### Format the notebook cells for LLMs

Convert notebook contents to compact XML - code and format copied from **toolslm by AnswerDotAI**:

https://github.com/AnswerDotAI/toolslm/blob/main/00_xml.ipynb

In [None]:
#| exports
def get_mime_text(data):
    "Get text from MIME bundle, preferring markdown over plain"
    if 'text/markdown' in data: return ''.join(list(data['text/markdown']))
    if 'text/plain' in data: return ''.join(list(data['text/plain']))

In [None]:
#| exports
def cell2out(o):
    "Convert single notebook output to XML format"
    if hasattr(o, 'data'): 
        txt = get_mime_text(o.data)
        if txt: return Out(txt, mime='markdown' if 'text/markdown' in o.data else 'plain')
    if hasattr(o, 'text'):
        txt = o.text if isinstance(o.text, str) else ''.join(o.text)
        return Out(txt, type='stream', name=o.get('name', 'stdout'))
    if hasattr(o, 'ename'): return Out(f"{o.ename}: {o.evalue}", type='error')

In [None]:
#| exports
def cell2xml(cell):
    "Convert notebook cell to concise XML format"
    cts = Source(''.join(cell.source)) if hasattr(cell, 'source') and cell.source else None
    out_items = L(getattr(cell,'outputs',[])).map(cell2out).filter()
    outs = []
    if out_items: outs = Outs(*out_items)
    parts = [p for p in [cts, outs] if p]
    return Cell(*parts, type=cell.cell_type)

In [None]:
#| exports
def nb2xml(nb, until_cell_id):
    cells_xml = []
    for c in nb.cells:
        if c.id == until_cell_id: break
        if c.cell_type in ('code','markdown'):
            cells_xml.append(to_xml(cell2xml(c), do_escape=False))
    return '\n'.join(cells_xml)     
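
To see the shape of the result without installing fastcore, here is a dependency-free sketch of the same idea. It works on plain dicts, its tag names only approximate the toolslm format, and like the code above it does no escaping. `render_cell` and `render_context` are hypothetical helpers.

```python
def render_cell(cell):
    """Render a cell dict as a small XML-like string (tag names are illustrative)."""
    source = "".join(cell.get("source", ""))
    body = f"<source>{source}</source>" if source else ""
    texts = []
    for output in cell.get("outputs", []):
        if output.get("output_type") == "stream":
            texts.append("".join(output.get("text", "")))
        elif output.get("output_type") == "error":
            texts.append(f"{output.get('ename')}: {output.get('evalue')}")
        elif "data" in output:
            data = output["data"]
            # Prefer markdown over plain text, like get_mime_text
            text = data.get("text/markdown") or data.get("text/plain") or ""
            texts.append("".join(text))
    if texts:
        body += "<outs>" + "".join(f"<out>{t}</out>" for t in texts) + "</outs>"
    return f'<cell type="{cell["cell_type"]}">{body}</cell>'

def render_context(notebook, until_cell_id):
    """Join the rendered code and markdown cells located above until_cell_id."""
    rendered = []
    for cell in notebook["cells"]:
        if cell.get("id") == until_cell_id:
            break
        if cell["cell_type"] in ("code", "markdown"):
            rendered.append(render_cell(cell))
    return "\n".join(rendered)

demo = {"cells": [
    {"id": "1", "cell_type": "markdown", "source": ["# Demo"]},
    {"id": "2", "cell_type": "raw", "source": "ignored"},
    {"id": "3", "cell_type": "code", "source": "print(2)",
     "outputs": [{"output_type": "stream", "name": "stdout", "text": ["2\n"]}]},
    {"id": "4", "cell_type": "code", "source": "later", "outputs": []},
]}

# The raw cell is skipped and everything from cell "4" onward is left out
xml_context = render_context(demo, "4")
print(xml_context)
```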

In [None]:
nb2xml(nb, executing_cell_id)[:3000]

In [None]:
#| exports
async def get_notebook_context(timeout=1):
    data = await get_notebook_data(timeout=timeout)
    notebook_content = data["notebook"]
    nb = nbformat.from_dict(notebook_content)
    cell_id = data["cell_id"]
    return nb2xml(nb, cell_id)

In [None]:
await get_notebook_context(timeout=1)

You can see that the content of this cell, which is below the call to get_notebook_context(), doesn't appear in the context.